(-: (-: SMILES :-) :-)


(-: (-: SMILES :-) :-)
A Multi-purpose Learning System

Vicent Estruch, Cèsar Ferri, José Hernández-Orallo, M. José Ramírez-Quintana
{vestruch, cferri, jorallo, mramirez}@dsic.upv.es
Dep. de Sistemes Informàtics i Computació, Universitat Politècnica de València, Valencia, Spain

8th European Conference on Logics in Artificial Intelligence (JELIA'02), System Presentation Session
Cosenza, Italy, September 23-26, 2002

Introduction

SMILES:
- integrates many different and innovative features in machine learning techniques.
- extends classical decision tree learners in many ways:
  - new splitting criteria
  - non-greedy search
  - new partitions
  - extraction of several different solutions
  - anytime handling of resources
  - sophisticated and quite effective handling of costs

Motivation

Some hindrances to a wider applicability of Machine Learning:
- Generation:
  - Computational costs: powerful methods in ML systems require huge amounts of memory and time to generate accurate hypotheses.
- Application:
  - Prediction error costs: not all errors have the same consequences, so cost matrices and ROC analysis are necessary.
  - Test costs: not all attributes can be tested economically, especially in medical applications.
  - Intelligibility: the comprehensibility of the extracted models is critical for their validation, acceptance, diffusion and ultimate use.
  - Throughput (response time): complex models are difficult to apply efficiently in real-time applications such as fraud detection.
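The point about prediction error costs can be illustrated with a small sketch. It assumes a classifier that outputs class probabilities and a user-supplied cost matrix; the function name `min_cost_class`, the class meanings and the cost values are hypothetical and not taken from SMILES.

```python
def min_cost_class(probs, cost):
    """Return the class index with the lowest expected misclassification cost.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n = len(probs)
    # Expected cost of each possible prediction j, averaged over the true class i.
    expected = [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]
    return min(range(n), key=lambda j: expected[j])


# Hypothetical example: class 0 = "healthy", class 1 = "ill".
# Missing an illness (true 1, predicted 0) costs 10; a false alarm costs 1.
cost = [[0, 1],
        [10, 0]]
probs = [0.8, 0.2]

# The most probable class is 0, but the cheapest prediction is 1,
# because 0.2 * 10 = 2.0 exceeds 0.8 * 1 = 0.8.
print(min_cost_class(probs, cost))
```

With a uniform cost matrix the same function falls back to predicting the most probable class, which is why accuracy alone is not enough when error costs differ.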

Ensemble Methods (1/2)

Ensemble methods (multi-classifier or hybrid systems):
- Aim at obtaining higher accuracy than single methods.
- Generate multiple and possibly heterogeneous models and then combine them through voting or other fusion methods.
- Good results are related to the number and variety of classifiers.
- Different topologies: simple, stacking, cascading, ...

[Figure: two ensemble topologies. Simple Combination: the attributes a1, a2, ..., am of the data are fed to the classifiers C1 (Decision Tree), C2 (Neural Net), ..., Cn (SVM), and a Fusion step produces the Combined Prediction. Stacking: the same classifiers C1, ..., Cn feed a further Decision Tree, MC, which produces the Combined Prediction.]
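The two topologies named on this slide can be sketched as follows. This is a minimal illustration, assuming classifiers are plain functions from an instance to a class label; the three toy classifiers and the helper names are hypothetical.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Simple combination: every classifier votes and the most common label wins."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


def stacked_predict(classifiers, meta_classifier, x):
    """Stacking: a meta-classifier decides from the base classifiers' predictions."""
    return meta_classifier([clf(x) for clf in classifiers])


# Three hypothetical base classifiers over attributes a1 and a2.
def c1(x):
    return "pos" if x["a1"] > 0.5 else "neg"


def c2(x):
    return "pos" if x["a2"] > 0.5 else "neg"


def c3(x):
    return "pos" if x["a1"] + x["a2"] > 1.2 else "neg"


# Hypothetical meta-classifier: trusts c1 unless both other classifiers disagree with it.
def meta(preds):
    return preds[0] if preds[0] in preds[1:] else preds[1]


x = {"a1": 0.9, "a2": 0.2}
print(majority_vote([c1, c2, c3], x))          # c1 says "pos", c2 and c3 say "neg"
print(stacked_predict([c1, c2, c3], meta, x))
```

In both cases every base classifier is evaluated for every instance, which is the source of the throughput and test-cost concerns raised for ensembles.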

Ensemble Methods (2/2)

Main drawbacks of ensemble methods:
- Computational costs: lots of memory and time are required to obtain and store the set of hypotheses (ensemble).
- Prediction error costs: most ensemble methods are based on the maximisation of accuracy and not other cost-sensitive measures.
- Test costs: the use of several (and diverse) hypotheses forces the evaluation of (almost) all the attributes.
- Intelligibility: the combined model is a black box.
- Throughput: the application of the combined model is slow.

The resolution of these drawbacks would boost the applicability of ensemble methods in machine learning applications.

Addressing Computational Costs

- Many ensemble solutions have common parts.
- Traditional ensemble methods repeat those parts, which costs memory and time.
- SMILES is based on the construction of a shared ensemble:
  - Common parts are shared in an AND/OR tree structure: the DECISION MULTI-TREE.
- Throughput is also improved by this technique.
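A shared ensemble of this kind can be sketched as an AND/OR structure. The sketch below is an assumption-laden illustration, not the SMILES data structure: the class names `Leaf`, `Split` and `Alternatives`, the threshold tests and the averaging of class distributions are all hypothetical choices.

```python
class Leaf:
    """A leaf holding a class distribution (class label -> probability)."""

    def __init__(self, dist):
        self.dist = dist

    def predict(self, x):
        return dict(self.dist)

    def count_trees(self):
        return 1


class Split:
    """AND node: one test whose two children both belong to the same tree."""

    def __init__(self, attr, threshold, low, high):
        self.attr, self.threshold, self.low, self.high = attr, threshold, low, high

    def predict(self, x):
        child = self.low if x[self.attr] <= self.threshold else self.high
        return child.predict(x)

    def count_trees(self):
        return self.low.count_trees() * self.high.count_trees()


class Alternatives:
    """OR node: several alternative subtrees grown for the same subset of the data."""

    def __init__(self, options):
        self.options = options

    def predict(self, x):
        # Assumed fusion method: average the distributions of all alternatives.
        dists = [option.predict(x) for option in self.options]
        classes = {c for d in dists for c in d}
        return {c: sum(d.get(c, 0.0) for d in dists) / len(dists) for c in classes}

    def count_trees(self):
        return sum(option.count_trees() for option in self.options)


# One shared root split with two alternatives under each branch.
left = Alternatives([
    Split("a2", 0.5, Leaf({"pos": 0.9, "neg": 0.1}), Leaf({"pos": 0.2, "neg": 0.8})),
    Leaf({"pos": 0.6, "neg": 0.4}),
])
right = Alternatives([
    Leaf({"pos": 0.1, "neg": 0.9}),
    Split("a3", 0.5, Leaf({"pos": 0.3, "neg": 0.7}), Leaf({"pos": 0.0, "neg": 1.0})),
])
root = Split("a1", 0.5, left, right)

print(root.count_trees())  # number of distinct trees encoded by the structure
print(root.predict({"a1": 0.3, "a2": 0.4, "a3": 0.0}))
```

The root test is stored and evaluated once, although it belongs to every tree of the ensemble; this is the kind of saving in memory and prediction time that sharing common parts is meant to provide.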

Addressing Misclassification & Test Costs (1/2)

- Many ensemble methods aim at increasing accuracy.
- AUC (Area Under the ROC Curve):
  - is a better measure when classification costs may vary.
  - can be used as a metric for comparing classifiers: prefer the classifier with the greatest AUC.
- MAUC: multi-class extension of the AUC measure (Hand & Till 2001).

[Figure: ROC diagram, with FPR on the horizontal axis and TPR on the vertical axis, both ranging from 0 to 1; the AUC is the area under the curve.]
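Both measures can be computed directly from classifier scores. The sketch below uses the rank-based form of the AUC (the fraction of positive/negative pairs that are ordered correctly) and the pairwise averaging of Hand & Till (2001) for the multi-class case. The function names and the input layout (one dictionary of class probabilities per instance) are assumptions made for this illustration.

```python
from itertools import combinations


def auc(pos_scores, neg_scores):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def mauc(labels, probs):
    """Multi-class AUC: average of the pairwise AUCs over all pairs of classes.

    labels : true class of each instance
    probs  : one dict per instance mapping class -> estimated probability
    """
    classes = sorted(set(labels))
    total = 0.0
    for i, j in combinations(classes, 2):
        # Separate class i from class j using the estimated probability of class i ...
        a_ij = auc([p[i] for p, y in zip(probs, labels) if y == i],
                   [p[i] for p, y in zip(probs, labels) if y == j])
        # ... and class j from class i using the estimated probability of class j.
        a_ji = auc([p[j] for p, y in zip(probs, labels) if y == j],
                   [p[j] for p, y in zip(probs, labels) if y == i])
        total += (a_ij + a_ji) / 2
    return 2 * total / (len(classes) * (len(classes) - 1))


# Two-class example: 8 of the 9 positive/negative pairs are ranked correctly.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

Unlike accuracy, neither measure depends on a decision threshold, which is why they remain meaningful when misclassification costs are not known in advance.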

Addressing Misclassification & Test Costs (2/2)

- SMILES has splitting criteria based on the maximisation of the AUC:
  - MAUCsplit: adaptation of the multi-class extension of AUC.
  - MSEsplit: adaptation of Minimum Squared Error as a splitting criterion.
- Splitting criteria can also be modified to minimise the test cost.
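The slide does not give the formulas of these criteria, so the following is only a sketch of what an AUC-based splitting criterion can look like in the two-class case. It assumes that each branch of a candidate split is scored by its proportion of positive examples and that the split is valued by the AUC of the resulting ranking; the function name `split_auc` and the example counts are hypothetical, and this is not claimed to be the MAUCsplit definition.

```python
def split_auc(children):
    """AUC of the ranking induced by a candidate split on two-class data.

    children : list of (n_pos, n_neg) counts, one pair per non-empty branch.
    Assumes the node contains at least one positive and one negative example.
    """
    # Rank branches from the most to the least positive.
    ordered = sorted(children, key=lambda c: c[0] / (c[0] + c[1]), reverse=True)
    total_pos = sum(p for p, _ in ordered)
    total_neg = sum(n for _, n in ordered)
    area = 0.0
    pos_seen = 0
    for p, n in ordered:
        # Each branch adds one ROC segment: n steps along FPR, p steps along TPR.
        # The area under that segment is a trapezoid.
        area += n * (pos_seen + p / 2)
        pos_seen += p
    return area / (total_pos * total_neg)


# Hypothetical candidate splits of a node with 10 positive and 10 negative examples.
candidates = {
    "a1 <= 0.5": [(8, 2), (2, 8)],
    "a2 <= 0.5": [(5, 5), (5, 5)],
}
best = max(candidates, key=lambda name: split_auc(candidates[name]))
print(best, split_auc(candidates[best]))
```

A split that separates the classes perfectly scores 1.0 and a split that leaves the class proportions unchanged scores 0.5, so the criterion prefers the split whose branches rank the examples best.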

Addressing Test Cost and Intelligibility

Ensemble methods (and many other ML methods) are:
- Black boxes: the model gives no insight (ensembles, ANN, SVM, ...).
- Attribute exhaustive: all or nearly all the attributes must be examined (ensembles, ANN, SVM, Bayes, ...).
- Slow in real-time applications: all the classifiers must be evaluated.

The multi-tree structure (our shared ensemble) also has these problems.
SMILES introduces the notion of the ARCHETYPE of the ensemble.

Archetype

The archetype is the single representative hypothesis that is closest to the combined hypothesis.

[Figure legend: H: hypothesis space; h_i: hypotheses in the ensemble; F: combined hypothesis; h_c: archetype.]

- SMILES extracts the archetype from the multi-tree structure without the need for a validation dataset.
- This solves the comprehensibility, test cost and throughput problems.
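The idea of choosing the hypothesis closest to the combined one can be sketched as follows. The slide does not say how SMILES measures closeness, so this illustration makes its own assumptions: hypotheses are plain functions, the combined hypothesis is a majority vote, and closeness is the agreement rate on randomly generated unlabelled instances (which needs no validation data). All names are hypothetical.

```python
import random
from collections import Counter


def combined(hypotheses, x):
    """Combined hypothesis F: majority vote of all hypotheses in the ensemble."""
    return Counter(h(x) for h in hypotheses).most_common(1)[0][0]


def archetype(hypotheses, sample):
    """Return the single hypothesis that agrees most often with the combined one."""
    def agreement(h):
        return sum(h(x) == combined(hypotheses, x) for x in sample)
    return max(hypotheses, key=agreement)


# Three hypothetical hypotheses over a single numeric attribute.
def h1(x):
    return "pos" if x > 0.3 else "neg"


def h2(x):
    return "pos" if x > 0.5 else "neg"


def h3(x):
    return "pos" if x > 0.7 else "neg"


# Unlabelled random instances: no class labels are required to measure agreement.
rng = random.Random(0)
sample = [rng.random() for _ in range(200)]

chosen = archetype([h1, h2, h3], sample)
print(chosen.__name__)
```

Once selected, only the archetype has to be evaluated at prediction time, so a single comprehensible model stands in for the whole ensemble.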

Some Experiments (1/4)

Combination accuracy compared to other ensemble methods:

Some Experiments (2/4)

Combination resources compared to other ensemble methods:

Some Experiments (3/4)

Evaluation of splitting criteria with respect to accuracy, AUC and number of rules.

25 two-class datasets from the UCI repository. Pruning enabled.

GEOMEANS   GAINRATIO   MAUCSPLIT   MSESPLIT
Accuracy   87.45       87.19       87.05
M-AUC      87.42       88.08       87.98
Rules      23.27       21.19       22.99

14 multi-class datasets from the UCI repository. Pruning enabled.

GEOMEANS   GAINRATIO   MAUCSPLIT   MSESPLIT
Accuracy   80.90       80.29       83.12
M-AUC      89.30       90.18       90.09
Rules      74.49       75.62       68.26

Some Experiments (4/4)

Evaluation of the archetype: its accuracy gets close to that of the combined solution and is much better than that of the first single tree.

Availability

SMILES is freely available at: http://www.dsic.upv.es/~flip/smiles/
- C++ sources.
- UNIX (Linux) and Windows versions.
- Many examples (more than 30 datasets) adapted to the SMILES format.
- Complete user manual (90 pages).

Additional Applications

SMILES can be used as a by-pass for non-comprehensible ML methods:

[Figure labels: Training set, Unlabelled random dataset, Labelled random dataset.]

- It is different from stacking.
- The resulting model is semantically similar to the ANN, but it is a comprehensible DT defined in terms of the original attributes.
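The by-pass idea can be sketched in a few lines: generate unlabelled random instances, let the incomprehensible model label them, and learn a comprehensible model from the result. Everything concrete below is an assumption made for illustration: `black_box` is a stand-in for a trained ANN, and `learn_stump` is a one-split decision tree learner standing in for a full decision tree learner such as SMILES.

```python
import random


def black_box(x):
    """Hypothetical stand-in for a non-comprehensible model such as an ANN."""
    return "pos" if 0.7 * x[0] + 0.1 * x[1] > 0.4 else "neg"


def learn_stump(data):
    """Learn a one-split decision tree: (attribute, threshold, low label, high label)."""
    labels = sorted({y for _, y in data})
    best = None
    for a in range(len(data[0][0])):
        for t in sorted({x[a] for x, _ in data}):
            for lo in labels:
                for hi in labels:
                    # Number of instances on which this stump reproduces the label.
                    correct = sum((lo if x[a] <= t else hi) == y for x, y in data)
                    if best is None or correct > best[0]:
                        best = (correct, a, t, lo, hi)
    return best[1:]


# 1. Unlabelled random dataset over the original attributes.
rng = random.Random(0)
random_inputs = [(rng.random(), rng.random()) for _ in range(300)]

# 2. Labelled random dataset: the black box provides the class labels.
labelled = [(x, black_box(x)) for x in random_inputs]

# 3. Learn a comprehensible model from the labelled random dataset.
attr, thr, lo, hi = learn_stump(labelled)
fidelity = sum((lo if x[attr] <= thr else hi) == y for x, y in labelled) / len(labelled)
print(f"if a{attr + 1} <= {thr:.2f} then {lo} else {hi} (fidelity {fidelity:.2f})")
```

The learned rule is expressed directly in terms of the original attributes, and its fidelity measures how closely it mimics the black box; the original training set could be added to the labelled random data before learning.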

Conclusions and Future Work

SMILES:
- combines and improves hypothesis combination and cost-sensitive learning (ROC analysis, AUC, test cost).
- The archetyping technique provides a novel way to take advantage of classifier ensembles, especially shared ensembles.
- Well suited for applications requiring high accuracy/AUC, low cost and high comprehensibility, with flexible handling of resources.

Future work:
- Inputs and outputs in XML (PMML standard).
- Graphical interface.
- Incremental extension.
- Expressiveness extension (functional-logic, higher-order, ...).