Cross-Validation TOM STEVENSON 24 OCTOBER 2016


Cross-Validation TOM STEVENSON T.J.STEVENSON@QMUL.AC.UK

MOTIVATION AND THE ISSUE

Need confidence that the trained MVA is robust:
- Performance on unseen samples is accurately predicted.
- Performance is reproducible for new data, systematics, etc.
Validation techniques are required for:
- Model selection: methods have at least one free parameter. How are these model parameters optimally selected?
- Performance estimation: how does the chosen model perform? Various figures of merit (FOM) are available: ROC integral, significance, etc.

For an unlimited dataset these issues are trivial: simply iterate through the parameters and pick the model with the lowest error rate. In reality, datasets are smaller than we would like. Naively using the whole dataset both to select and train the classifier and to estimate its error leads to overfitting (overtraining): the classifier learns fluctuations in the dataset and performs worse on unseen data. Overfitting is more pronounced for classifiers with a large number of tuneable parameters, and it also yields an overly optimistic estimate of the FOM.

K-FOLD CROSS-VALIDATION

We may not be able to reserve a large portion of the data for testing, so the hold-out method may not be viable. Instead, use k-fold cross-validation:
- Split the dataset into k randomly sampled, independent subsets (folds): Fold 1, Fold 2, ..., Fold k.
- Train the classifier with k-1 folds and test with the remaining fold.
- Repeat k times, so each fold is used for testing exactly once.
This has the advantage of using the whole dataset for both training and testing. The FOM is then estimated as the average of the per-fold FOMs.

IMPLEMENTATION IN TMVA

Hyper-parameter tuning is simply set up and called with:

    TMVA::HyperParameterOptimisation *HPO =
        new TMVA::HyperParameterOptimisation(dataloader);
    HPO->BookMethod(TMVA::Types::kSVM, "SVM", "");
    HPO->SetNumFolds(3);
    HPO->SetFitter("Minuit");
    HPO->SetFOMType("Separation");
    HPO->Evaluate();
    const TMVA::HyperParameterOptimisationResult &HPOResult = HPO->GetResults();
    HPOResult.Print();

Data splitting is done behind the scenes in the dataloader; specify the numbers of signal and background events first in the usual way. OptimiseTuningParameters runs for each combination of folds and returns one set of hyper-parameters per fold. Work in progress: splitting the training sample so that a validation set can be used to test performance, and integrating cross-validation into OptimiseTuningParameters.

IMPLEMENTATION IN TMVA

Cross-validation is set up and called with:

    TMVA::CrossValidation *CV = new TMVA::CrossValidation(dataloader);
    CV->BookMethod(TMVA::Types::kSVM, "SVM", optionsString);
    CV->SetNumFolds(3);
    CV->Evaluate();
    const TMVA::CrossValidationResult &CVResult = CV->GetResults();
    CVResult.Print();
    CVResult.Draw();

CrossValidationResult currently contains some of the metrics produced by EvaluateAllMethods in the Factory:
- ROC integral
- Separation
- Significance
- Efficiencies at different working points
More metrics are being added.

IMPLEMENTATION IN TMVA - OUTPUT 1

[Output screenshot from the original slide.]

IMPLEMENTATION IN TMVA - OUTPUT 2

[Plot: background rejection vs. signal efficiency (ROC curves) for SVM_fold1, SVM_fold2 and SVM_fold3.]

EXAMPLE

Dataset: example toy dataset with 20000 signal and background events. An out-of-the-box BDT with fixed parameters was run with 100-fold cross-validation. The spread of the per-fold ROC integrals shows the effect of changing the training/testing set.

[Plots: ROC curve for the BDT, and a histogram of the ROC integrals for the 100-fold CV, spanning roughly 0.6 to 0.9.]

SUMMARY

Basic functionality for cross-validation and hyper-parameter optimisation has been integrated into TMVA. More metrics are being added, and other ways to compare the performance of classifiers are being investigated. The code does not currently run in parallel, but that will be a welcome improvement. Example code is attached.