Session 7: Face Detection (cont.)

John Magee, 8 February 2017. Slides courtesy of Diane H. Theriault.

Question of the Day: How can we find faces in images?

Face Detection: Compute features in the image, then apply a classifier. (Viola & Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features")

What do Faces Look Like? The chosen features are the responses of the image to box filters at specific locations.

Using Features for Classification: Features are the gateway between the signal processing world and the machine learning world. For any candidate image, we will compute the response of the image to several different filters at several different locations. These responses will be the inputs to our machine learning algorithm.
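As a concrete illustration, here is a minimal Python sketch of how box-filter responses can be computed efficiently with an integral image, as Viola & Jones do. The function names (integral_image, box_sum, two_rect_feature) and the specific two-rectangle feature are illustrative choices, not the paper's exact code:

```python
import numpy as np

def integral_image(img):
    # Cumulative 2-D sum: ii[y, x] = sum of img[0:y+1, 0:x+1].
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, top, left, height, width):
    # Sum of the pixels in the box [top, top+height) x [left, left+width),
    # read from the integral image with four lookups (O(1) per box).
    bottom, right = top + height - 1, left + width - 1
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

def two_rect_feature(ii, top, left, height, width):
    # Illustrative two-rectangle feature: sum of the left half minus the
    # sum of the right half, i.e. the response to a vertical box filter.
    half = width // 2
    return (box_sum(ii, top, left, height, half)
            - box_sum(ii, top, left + half, height, half))
```

Because each box sum costs only four lookups regardless of its size, a filter response is equally cheap at every location and scale.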

Using Features for Classification: In machine learning, a classifier is a thing that can decide whether a piece of data belongs to a particular class (e.g., "is this image a face or not?"). There are different types of classifiers, and one way to find their parameters is by looking at some training data (i.e., supervised learning). A weak learner is a classifier for which you find the best parameters you can, but that still does a mediocre job.

Using Features for Classification: Weak learner example: apply a threshold to a feature (the response of the image to a filter). How do we find a threshold?

Using Features for Classification: How do we find a threshold? Start with labeled training data: 9,832 face images (positive examples) and 10,000 non-face images (sub-windows cut from other pictures containing no faces; negative examples). Compute some measure of how good a particular threshold is (e.g., accuracy), then find the threshold that gives the best result.

Using Features for Classification: For a particular threshold on a particular feature, compute: true positives (faces identified as faces), true negatives (non-face patches identified as non-faces), false positives (non-faces identified as faces), and false negatives (faces identified as non-faces). Accuracy is the percentage correctly classified; classification error is 1 - accuracy. For each feature, choose the threshold that maximizes the accuracy.

Confusion matrix:

                          Classifier result
                          positive         negative
Known class   positive    true positive    false negative
              negative    false positive   true negative
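A minimal sketch of this threshold search in Python, assuming we already have one feature's response on every training image and a +1/-1 face label per image. Trying both polarities (both directions of the inequality) is a standard detail, since faces may lie on either side of the threshold; the function name best_threshold is made up for illustration:

```python
import numpy as np

def best_threshold(responses, labels):
    # responses: this feature's response on every training image
    # labels: +1 for faces, -1 for non-faces
    # Try every observed response value as a candidate threshold, in both
    # polarities, and keep the one with the highest accuracy.
    best = (None, None, -1.0)            # (threshold, polarity, accuracy)
    for t in np.unique(responses):
        for polarity in (+1, -1):
            pred = np.where(polarity * (responses - t) >= 0, 1, -1)
            acc = np.mean(pred == labels)
            if acc > best[2]:
                best = (t, polarity, acc)
    return best
```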

Using Features for Classification: How do you know which feature to use? Try them all and pick the one that gives the best result. Then choose the next one that does the next-best job, emphasizing the misclassified images. Each threshold on a single feature gives mediocre results, but if you combine them in a clever way, you can get good results. (That's the extremely short version of boosting.)

Classification with AdaBoost (an awesome machine learning algorithm!): Training: given a pool of weak learners and some data, create a boosted classifier by choosing a good combination of K weak learners and associated weights. In our case, training a weak learner = choosing a feature to use and which threshold to apply.

Classification with AdaBoost: Training. Initialization: assign data weights uniformly to each data point. Then, for k = 1:K (a sketch follows this list):
1. Train all of the weak learners.
2. Compute each one's weighted classification error, using the weights assigned to each data point.
3. Choose the weak learner with the lowest weighted error.
4. Compute a classifier weight for that weak learner, based on its classification error.
5. Adjust the weights on the data points to emphasize misclassified points.
(Specifics of how to compute the weights are in the paper.)
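The loop above might look like the following Python sketch. The weight formulas here are the standard (discrete) AdaBoost updates; the Viola-Jones paper uses a slightly different but equivalent parameterization, so treat this as illustrative rather than as the paper's exact procedure. fit_stump and adaboost_train are hypothetical names:

```python
import numpy as np

def fit_stump(responses, labels, w):
    # Weak-learner training with data weights: pick the threshold and
    # polarity with the lowest *weighted* classification error.
    best = None
    for t in np.unique(responses):
        for p in (+1, -1):
            pred = np.where(p * (responses - t) >= 0, 1, -1)
            err = np.sum(w[pred != labels])
            if best is None or err < best[0]:
                best = (err, t, p, pred)
    return best

def adaboost_train(feature_responses, labels, K):
    # feature_responses: (n_features, n_samples) array of filter responses
    # labels: +1 / -1 per training image; K: number of boosting rounds
    n = labels.size
    w = np.full(n, 1.0 / n)          # initialization: uniform data weights
    strong = []                      # chosen (feature, threshold, polarity, alpha)
    for _ in range(K):
        # Train all weak learners; keep the one with the lowest weighted error.
        best = None
        for f in range(feature_responses.shape[0]):
            err, t, p, pred = fit_stump(feature_responses[f], labels, w)
            if best is None or err < best[0]:
                best = (err, f, t, p, pred)
        err, f, t, p, pred = best
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-10))  # classifier weight
        w *= np.exp(-alpha * labels * pred)  # emphasize misclassified points
        w /= w.sum()                         # renormalize the data weights
        strong.append((f, t, p, alpha))
    return strong
```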

Classification with AdaBoost: Classification. Use the boosted classifier (the weak learners and associated weights found during training) to label faces. Evaluate each chosen weak learner on the new data point by computing the response of the image to its filter and applying its threshold to obtain a binary result. Make the final decision by computing a weighted sum of the classification results from the weak learners.
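A matching sketch of the classification step, assuming the (feature, threshold, polarity, alpha) tuples produced by the training sketch above:

```python
def adaboost_predict(strong, window_responses):
    # window_responses[f]: the candidate window's response to feature f.
    # strong: the (feature, threshold, polarity, alpha) list from training.
    score = 0.0
    for f, t, p, alpha in strong:
        h = 1 if p * (window_responses[f] - t) >= 0 else -1  # weak decision
        score += alpha * h                                   # weighted vote
    return 1 if score >= 0 else -1   # final face / non-face label
```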

Classification Cascade: There is a tradeoff between the accuracy of a more complex classifier that uses more features and its computational cost. What is an acceptable error rate? What is an acceptable computational cost? Can we have our cake and eat it too?

Classification Cascade: Solution: use a cascade of increasingly complex classifiers. Create less complex classifiers with fewer weak learners that achieve high detection rates (maybe with extra false positives). Evaluate the more complex, more picky classifiers only after the image passes the early classifiers. Train later classifiers in the cascade using only images that pass the earlier classifiers.
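A sketch of how cascade evaluation could look, using the same illustrative data structures as above. Each stage carries its own pass threshold, which during training would be tuned low enough to keep the stage's detection rate high:

```python
def cascade_classify(stages, window_responses):
    # stages: ordered list of (strong, stage_threshold) pairs, cheapest first.
    # A window must pass every stage to be labeled a face; most non-face
    # windows are rejected cheaply by the early stages.
    for strong, stage_threshold in stages:
        score = sum(alpha * (1 if p * (window_responses[f] - t) >= 0 else -1)
                    for f, t, p, alpha in strong)
        if score < stage_threshold:
            return False   # rejected: stop before computing more features
    return True            # passed every stage
```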

To Detect Faces: Divide large images into overlapping sub-windows and apply the classifier cascade to each sub-window. Apply the cascade to sub-windows of different sizes by scaling the features (using larger box filters).
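Putting the pieces together, a hypothetical top-level scan might look like the sketch below. compute_responses is supplied by the caller, since its body depends on which features were chosen during training; it would scale the box filters with the window rather than resizing the image. The window size, stride, and scale step are arbitrary illustrative values:

```python
def detect_faces(img, stages, compute_responses,
                 base=24, scale_step=1.25, stride=2):
    # Scan overlapping sub-windows at several scales and keep the windows
    # that survive the whole cascade.
    ii = integral_image(img)             # from the earlier sketch
    detections = []
    size = base
    while size <= min(img.shape):
        for top in range(0, img.shape[0] - size + 1, stride):
            for left in range(0, img.shape[1] - size + 1, stride):
                # compute_responses (hypothetical) evaluates the trained
                # features on this window, scaled to the window size.
                responses = compute_responses(ii, top, left, size)
                if cascade_classify(stages, responses):
                    detections.append((top, left, size))
        size = int(size * scale_step)
    return detections
```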

Discussion Questions:
1. What is the relationship between an image feature and the response of an image to a box filter applied at a particular location?
2. If you were given a set of labeled images and a filter response for some particular filter, how would you choose a threshold to use?
3. How would you adjust your procedure for finding the best possible threshold if you wanted the best threshold that recognizes at least 99% of faces, even if it lets through some non-faces (false positives)?
4. Given an image, what are the steps for labeling it as face or non-face?
5. What is a classifier cascade, and why would you want to use one?