Multi-objective Evolutionary Approaches for ROC Performance Maximization


Ke Tang, USTC-Birmingham Joint Research Institute in Intelligent Computation and Its Applications (UBRI), School of Computer Science and Technology, University of Science and Technology of China. July 2014 @ USTC

Outline: Introduction to ROC analysis; Related work; A Multi-Objective Evolutionary Approach to ROCCH Maximization (CH-MOEA); Conclusions

Introduction to ROC Analysis: Many real-world classification problems are cost-sensitive or have imbalanced class distributions. In such situations, a classifier with high classification accuracy may be of little practical use, so an alternative performance metric is needed. In the Big Data era, misclassification costs and class distributions may even change over time.
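To make the point about accuracy concrete, here is a minimal sketch on made-up data (the 990/10 class split is an assumption chosen only for illustration). A classifier that always predicts the negative class reaches 99% accuracy while detecting no positives at all.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def true_positive_rate(y_true, y_pred):
    """Fraction of actual positives that are predicted positive."""
    predictions_on_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(predictions_on_positives) / len(predictions_on_positives)


# Hypothetical imbalanced data: 990 negatives (0) and 10 positives (1).
y_true = [0] * 990 + [1] * 10
always_negative = [0] * 1000  # trivial classifier that never predicts positive

acc = accuracy(y_true, always_negative)
tpr = true_positive_rate(y_true, always_negative)
print(f"accuracy = {acc:.2f}, true positive rate = {tpr:.2f}")
```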

Introduction to ROC Analysis: Confusion Matrix

                    Predicted Positive     Predicted Negative
Actual Positive     True Positive rate     False Negative rate
Actual Negative     False Positive rate    True Negative rate
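The four rates in the confusion matrix can be computed as in the sketch below. The function name and the toy labels are illustrative assumptions, not part of the slides.

```python
def confusion_rates(y_true, y_pred):
    """Return the four confusion-matrix rates for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    n_pos, n_neg = tp + fn, fp + tn
    return {
        "TPR": tp / n_pos,  # true positive rate
        "FNR": fn / n_pos,  # false negative rate
        "FPR": fp / n_neg,  # false positive rate
        "TNR": tn / n_neg,  # true negative rate
    }


# Hypothetical labels: 4 positives followed by 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
rates = confusion_rates(y_true, y_pred)
print(rates)
```

Each row of the matrix sums to one (TPR + FNR = 1 and FPR + TNR = 1), which is why the pair (FPR, TPR) is enough to describe a classifier in ROC space.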

Introduction to ROC Analysis: Receiver Operating Characteristic (ROC)

Introduction to ROC Analysis: ROC Curve. A curve in ROC space, generated by varying the decision threshold of a classifier, e.g., a linear scoring function $f(x) = w^{T}x + b$.
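A minimal sketch of how a ROC curve arises from threshold tuning: each distinct score is used as a threshold, and every threshold yields one (FPR, TPR) point. The scores below are made up; in practice they could be the outputs f(x) of a scoring function.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # A threshold above every score predicts nothing positive: the point (0, 0).
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical classifier scores and true labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
for fpr, tpr in curve:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```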

Introduction to ROC Analysis: From ROC analysis to performance measures. Simple version: Area Under the ROC Curve (AUC). More involved version: ROC Convex Hull (ROCCH).
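As a sketch of the simpler measure, the AUC of a piecewise-linear ROC curve can be computed with the trapezoidal rule. The ROC points below are invented for illustration.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve given as (FPR, TPR) points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # area of one trapezoid
    return area


# Hypothetical ROC curve from (0, 0) to (1, 1).
roc = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
area = auc_trapezoid(roc)
print(f"AUC = {area:.2f}")
```

The diagonal from (0, 0) to (1, 1), which corresponds to random guessing, gives an area of 0.5 under the same function.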

Introduction to ROC Analysis: An important characteristic of the ROCCH: for any target misclassification costs and class distribution, the best classifier for those conditions lies at a vertex or on an edge of the convex hull of all classifiers.
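The sketch below illustrates this property on invented ROC points. It builds the upper convex hull with a standard monotone-chain scan, then picks the hull vertex with the lowest expected cost for two hypothetical cost settings. The cost formula (prior times cost times error rate, summed over both classes) and all parameter names are assumptions made for the example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rocch(points):
    """Upper convex hull of (FPR, TPR) points plus the trivial classifiers."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or below the line to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def expected_cost(point, p_pos, c_fn, c_fp):
    """Expected misclassification cost of a classifier at (FPR, TPR)."""
    fpr, tpr = point
    return p_pos * c_fn * (1.0 - tpr) + (1.0 - p_pos) * c_fp * fpr


# Hypothetical classifiers in ROC space.
classifiers = [(0.1, 0.5), (0.3, 0.6), (0.5, 0.9)]
hull = rocch(classifiers)

# Two hypothetical operating conditions with balanced classes.
best_if_fn_costly = min(hull, key=lambda p: expected_cost(p, 0.5, 3.0, 1.0))
best_if_fp_costly = min(hull, key=lambda p: expected_cost(p, 0.5, 1.0, 3.0))
print(hull, best_if_fn_costly, best_if_fp_costly)
```

Note that (0.3, 0.6) is dropped from the hull: under this linear cost model, no setting of costs and class prior makes it strictly better than both of its hull neighbours.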

Related Work: Both AUC and ROCCH can be used as objective functions for training a classifier/learner. When seeking a (soft) classifier with maximum AUC or ROCCH, we actually seek a set of (hard) classifiers, e.g., classifiers with different thresholds. More intuitively, we try to find a classifier that is reasonably good overall (robust) and can be easily adapted to different misclassification costs or class distributions.

Related Work: AUC maximization is (in some circumstances) equivalent to a bipartite ranking problem and can be addressed with learning-to-rank approaches, e.g., Rank-SVM (Joachims, 2005) and RankBoost (Freund, 2003). ROCCH maximization is more challenging than AUC maximization and can only be tackled with heuristic approaches, e.g., PRIE (Fawcett, 2008).
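The link between AUC and bipartite ranking can be seen in a short sketch: the AUC equals the fraction of (positive, negative) pairs that the scores order correctly, with ties counted as half (the Wilcoxon-Mann-Whitney statistic). The scores and labels are made up for illustration.

```python
def auc_pairwise(scores, labels):
    """AUC as the fraction of correctly ranked (positive, negative) pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = 0.0
    for s_pos in pos:
        for s_neg in neg:
            if s_pos > s_neg:
                correct += 1.0  # positive ranked above negative
            elif s_pos == s_neg:
                correct += 0.5  # tie counts as half
    return correct / (len(pos) * len(neg))


# Hypothetical scores and labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_pairwise(scores, labels)
print(f"AUC = {auc:.3f}")
```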

CH-MOEA: Existing approaches try to obtain a set of homogeneous classifiers, in the sense that the classifiers differ only in their thresholds. Question: why must the classifiers be homogeneous? Heterogeneous classifiers might spread better in ROC space. Whether the classifiers are homogeneous or heterogeneous makes little difference in practical implementation.

CH-MOEA: Our target: train a set of (heterogeneous) classifiers such that the ROCCH is maximized. Such a set-based optimization problem can hardly be solved with existing mathematical programming tools. Evolutionary Algorithms provide a natural way to search for the desired classifier set.

CH-MOEA: In particular, multi-objective evolutionary algorithms are off-the-shelf tools for this problem, with two objectives: maximize TP and minimize FP.
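With these two objectives, classifiers are compared by Pareto dominance. A minimal sketch, assuming each classifier is represented by its (FP rate, TP rate) pair and using invented points:

```python
def dominates(a, b):
    """True if classifier a = (FPR, TPR) Pareto-dominates classifier b."""
    fpr_a, tpr_a = a
    fpr_b, tpr_b = b
    no_worse = fpr_a <= fpr_b and tpr_a >= tpr_b
    strictly_better = fpr_a < fpr_b or tpr_a > tpr_b
    return no_worse and strictly_better


def non_dominated(points):
    """Classifiers that no other classifier in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical classifiers in ROC space.
classifiers = [(0.1, 0.5), (0.3, 0.6), (0.3, 0.4), (0.5, 0.9), (0.6, 0.7)]
front = non_dominated(classifiers)
print(front)
```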

CH-MOEA: General framework of EAs
1. Generate the initial population P(0) at random, and set i ← 0;
2. REPEAT
   (a) Evaluate the fitness of each individual in P(i);
   (b) Select parents from P(i) based on their fitness in P(i);
   (c) Generate offspring from the parents using crossover and mutation to form P(i + 1);
   (d) i ← i + 1;
3. UNTIL halting criteria are satisfied
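The framework can be sketched as a small runnable program. Everything specific here is an assumption made for illustration: bit-string individuals, binary tournament selection, one-point crossover, bit-flip mutation, a fixed generation budget as the halting criterion, and the toy fitness function (number of ones).

```python
import random


def tournament(scored, rng):
    """Binary tournament: return the fitter of two randomly drawn individuals."""
    a, b = rng.choice(scored), rng.choice(scored)
    return a[1] if a[0] >= b[0] else b[1]


def evolve(fitness, genome_len=20, pop_size=30, generations=40, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population P(0).
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    # Steps 2-3: repeat until the generation budget is exhausted.
    for _ in range(generations):
        # (a) evaluate the fitness of each individual in P(i)
        scored = [(fitness(ind), ind) for ind in population]
        offspring = []
        while len(offspring) < pop_size:
            # (b) select parents based on fitness
            mother, father = tournament(scored, rng), tournament(scored, rng)
            # (c) one-point crossover followed by bit-flip mutation
            cut = rng.randrange(1, genome_len)
            child = mother[:cut] + father[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            offspring.append(child)
        # (d) the offspring form P(i + 1)
        population = offspring
        best = max(population + [best], key=fitness)
    return best, population


best, final_population = evolve(fitness=sum)
print(sum(best), best)
```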

CH-MOEA: What is the most famous MOEA so far? Probably NSGA-II (Kalyanmoy Deb, 2002), known mainly for its selection scheme.
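NSGA-II's selection ranks the population into successive non-dominated fronts and breaks ties within a front by crowding distance. The sketch below is a simplified illustration, not the reference implementation: it uses a plain quadratic-time front peeling instead of the fast non-dominated sort, assumes all points are distinct, and treats each individual as an (FP rate, TP rate) pair.

```python
def dominates(a, b):
    """a dominates b: lower-or-equal FPR, higher-or-equal TPR, and not identical."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def sort_into_fronts(points):
    """Peel off successive non-dominated fronts (assumes distinct points)."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts


def crowding_distance(front):
    """Crowding distance of each point in one front (larger = less crowded)."""
    distance = {p: 0.0 for p in front}
    for k in range(2):  # one pass per objective
        ordered = sorted(front, key=lambda p: p[k])
        low, high = ordered[0][k], ordered[-1][k]
        # Boundary points are always kept.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (nxt[k] - prev[k]) / (high - low)
    return distance


# Hypothetical population in ROC space.
population = [(0.1, 0.5), (0.3, 0.6), (0.3, 0.4), (0.5, 0.9), (0.6, 0.7)]
fronts = sort_into_fronts(population)
crowding = crowding_distance(fronts[0])
print(fronts)
print(crowding)
```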

CH-MOEA: However, direct application of NSGA-II (or any other MOEA) might be inappropriate because: (1) a non-dominated (Pareto-optimal) solution is not necessarily on the convex hull; (2) the objective space of the problem is essentially discrete, which may cause redundant solutions.
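A small worked example of the first issue, using three invented classifiers A, B and C. B is not dominated by A or C, yet it lies below the straight line joining them, so it cannot be a vertex of the convex hull. (Any point on the segment A-C can be realised by choosing randomly between classifiers A and C.)

```python
def dominates(a, b):
    """a dominates b: lower-or-equal FPR, higher-or-equal TPR, and not identical."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def tpr_on_segment(a, b, fpr):
    """TPR of the line through ROC points a and b at the given FPR."""
    (x0, y0), (x1, y1) = a, b
    return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)


# Hypothetical classifiers as (FPR, TPR).
A, B, C = (0.1, 0.5), (0.3, 0.6), (0.5, 0.9)

b_is_non_dominated = not dominates(A, B) and not dominates(C, B)
tpr_of_mix = tpr_on_segment(A, C, B[0])  # TPR reachable at B's FPR by mixing A and C
b_is_below_hull = B[1] < tpr_of_mix

print(b_is_non_dominated, round(tpr_of_mix, 3), b_is_below_hull)
```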

CH-MOEA: Our approach: Convex Hull-based MOEA (CH-MOEA). New features of CH-MOEA: redundancy elimination; a new sorting scheme dedicated to ROCCH maximization.

CH-MOEA: Redundancy Elimination
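The transcript does not preserve the details of this slide, so the following is only one plausible reading, not the authors' procedure: because the objective space is discrete, several individuals can map to exactly the same (FP rate, TP rate) point, and all but one of them can be set aside as redundant. The function and field names are hypothetical.

```python
def eliminate_redundancy(population, objectives):
    """Split a population into individuals with unique objective vectors
    and redundant ones whose objective vector was already seen."""
    seen = set()
    unique, redundant = [], []
    for individual in population:
        key = objectives(individual)
        if key in seen:
            redundant.append(individual)
        else:
            seen.add(key)
            unique.append(individual)
    return unique, redundant


# Hypothetical population: c1 and c2 occupy the same point in ROC space.
population = [
    {"name": "c1", "fpr": 0.1, "tpr": 0.5},
    {"name": "c2", "fpr": 0.1, "tpr": 0.5},
    {"name": "c3", "fpr": 0.5, "tpr": 0.9},
]
unique, redundant = eliminate_redundancy(
    population, objectives=lambda ind: (ind["fpr"], ind["tpr"])
)
print([ind["name"] for ind in unique], [ind["name"] for ind in redundant])
```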

CH-MOEA: New sorting scheme for ROCCH maximization
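The figure for this slide is not in the transcript. As a hedged illustration of what a convex-hull-based sorting could look like, the sketch below ranks individuals by repeatedly peeling off the upper convex hull of the remaining points, in the same way that non-dominated sorting peels off Pareto fronts. This layering is an assumption for illustration and may differ from the scheme actually used in CH-MOEA.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull of (FPR, TPR) points, anchored at (0, 0) and (1, 1)."""
    hull = []
    for p in sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)}):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_layers(points):
    """Rank points into layers by repeatedly removing the current hull."""
    remaining = set(points)
    layers = []
    while remaining:
        layer = [p for p in upper_hull(remaining) if p in remaining]
        if not layer:
            # Nothing left above the diagonal: put the rest in a final layer.
            layer = sorted(remaining)
        layers.append(layer)
        remaining -= set(layer)
    return layers


# Hypothetical population in ROC space.
population = [(0.1, 0.5), (0.3, 0.6), (0.3, 0.4), (0.5, 0.9), (0.6, 0.7)]
layers = hull_layers(population)
print(layers)
```

Compared with Pareto ranking of the same points, (0.3, 0.6) drops from the first rank to the second because it is non-dominated but not on the hull.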

CH-MOEA: CH-MOEA can be combined with any learning model that can be evolved, e.g., neural networks, decision trees, or SVMs. Genetic Programming is adopted in our work, which can be viewed as evolving a decision tree.
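To give a feel for what a Genetic Programming individual looks like, here is a minimal sketch of an expression tree used as a classifier. The tuple encoding, the operator set, the feature names and the rule "predict positive when the output is above zero" are all assumptions for illustration; the slides do not specify the representation used in CH-MOGP.

```python
import operator

# Hypothetical function set for the tree's internal nodes.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, x):
    """Evaluate an expression tree on a sample x (dict of feature values)."""
    if isinstance(tree, tuple):  # internal node: (operator, left, right)
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    if isinstance(tree, str):  # leaf: feature name
        return x[tree]
    return tree  # leaf: numeric constant


def classify(tree, x):
    """Hard classifier: positive (1) when the tree output exceeds zero."""
    return 1 if evaluate(tree, x) > 0 else 0


# Hypothetical individual encoding x1 + 2 * x2 - 1.
tree = ("-", ("+", "x1", ("*", 2.0, "x2")), 1.0)

label_a = classify(tree, {"x1": 0.5, "x2": 0.5})
label_b = classify(tree, {"x1": 0.2, "x2": 0.1})
print(label_a, label_b)
```

Crossover and mutation would then operate on subtrees of such individuals, and each tree contributes one (FP rate, TP rate) point to the population's ROC space.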

CH-MOEA: Pseudo-code of CH-MOGP

CH-MOEA: Dataset for empirical studies

CH-MOEA: Compared methods

CH-MOEA: CH-MOGP outperformed state-of-the-art MOEAs.

CH-MOEA: CH-MOGP outperformed other non-EA methods.


Conclusions: Cost-sensitive and class-imbalanced learning problems are commonly encountered in the real world. ROCCH fits these types of problems very well because it is insensitive to misclassification costs and class distribution. ROCCH maximization is formulated as a special MOP that has not been well addressed by existing MOEAs. A new MOEA, namely CH-MOEA, is proposed to tackle this learning problem. CH-MOEA could be extended to any machine learning model.

References
P. Wang, M. Emmerich, R. Li, K. Tang, T. Baeck and X. Yao, "Convex Hull-Based Multi-objective Genetic Programming for Maximizing Receiver Operating Characteristic Performance," IEEE Transactions on Evolutionary Computation, in press (DOI: 10.1109/TEVC.2014.2305671).
P. Wang, K. Tang, T. Weise, E. P. K. Tsang and X. Yao, "Multiobjective Genetic Programming for Maximizing ROC Performance," Neurocomputing, 125: 102-118, February 2014.

Collaborators: Dr. Pu Wang, Prof. Xin Yao, Prof. Edward Tsang, Dr. Thomas Weise, Dr. Michael Emmerich, Dr. Rui Li, Prof. Thomas Baeck

Thanks for your time! Q&A?