Statistical Learning - Classification, STAT 441/841, CM 764


Statistical Learning - Classification, STAT 441/841, CM 764. Ali Ghodsi, Department of Statistics and Actuarial Science, University of Waterloo, aghodsib@uwaterloo.ca

Two Paradigms: Classical statistics infers information from small data sets (not enough data). Machine learning infers information from large data sets (too much data).

"We are drowning in information and starving for knowledge." Rutherford D. Rogers

Fundamental problems: Classification; Regression; Clustering; Representation learning (feature extraction, manifold learning, density estimation).

Applications: Machine learning is most useful when the structure of the task is not well understood but can be characterized by a dataset with strong statistical regularity. Examples: search and recommendation (e.g. Google, Amazon); automatic speech recognition and speaker verification; text parsing; face identification; tracking objects in video; financial prediction and fraud detection (e.g. credit cards); medical diagnosis.

More Applications: More science and technology applications: handwriting identification; drug discovery (identifying the biological activity of chemical compounds using features that describe their chemical structures); gene expression analysis (thousands of features with only dozens of observations).

Classification

Data

Features (X): (6, 4, 4.5), (7, 4.5, 5), (6, 3, 3.5), (4.5, 4, 4.5), (1.5, 8, 2), (1.5, 7, 2.5)

Data Representation

Data Representation (5 × 5 grid of values, row by row): [1 1 1 1 1], [1 0 1 0 1], [1 1 1 1 1], [1 0.5 0.5 0.5 1], [1 1 1 1 1]
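To make the idea of data representation concrete, here is a minimal sketch that turns the 25 values from this slide into a single feature vector. It assumes the values form a 5 × 5 grid of pixel intensities read row by row; the slide does not say so explicitly, and the names `image`, `flatten`, and `x` are illustrative.

```python
# Assumption: the 25 values on the slide form a 5 x 5 grid of pixel
# intensities read row by row. The slide does not state this explicitly.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0.5, 0.5, 0.5, 1],
    [1, 1, 1, 1, 1],
]


def flatten(grid):
    """Concatenate the rows of a 2-D grid into one flat feature vector."""
    return [value for row in grid for value in row]


x = flatten(image)
print(len(x))  # number of features in the vector
print(x)
```

Under this assumption every image of the same size maps to a vector of the same length, so a classifier can treat images like any other list of numeric features.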

Features and labels: (6, 4, 4.5): Green Pepper; (7, 4.5, 5): Green Pepper; (6, 3, 3.5): Red Pepper; (4.5, 4, 4.5): Red Pepper; (1.5, 8, 2): Hot Pepper; (1.5, 7, 2.5): Hot Pepper

Features and labels: Objects, Features (X), Labels (Y)

Classification (new point): (7, 4, 4.5). h(7, 4, 4.5) = ?

Classification (new point): (5, 3, 4.5). h(5, 3, 4.5) = ?

Classification (new point): (6, 4, 4.5). h(6, 4, 4.5) = ?

Classification (new point): (2, 10, 1.2). h(2, 10, 1.2) = ?
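The slides leave the classifier h unspecified. As one possible illustration, the sketch below assumes h is a 1-nearest-neighbor rule with Euclidean distance over the six labeled peppers (k-nearest neighbors is listed among the tentative topics). This is an assumption made for the example, not the method the slides define.

```python
import math

# Training data copied from the "Features and labels" slide.
training = [
    ((6, 4, 4.5), "Green Pepper"),
    ((7, 4.5, 5), "Green Pepper"),
    ((6, 3, 3.5), "Red Pepper"),
    ((4.5, 4, 4.5), "Red Pepper"),
    ((1.5, 8, 2), "Hot Pepper"),
    ((1.5, 7, 2.5), "Hot Pepper"),
]


def h(x):
    """Hypothetical classifier: label of the closest training point.

    Assumes a 1-nearest-neighbor rule with Euclidean distance; the
    slides do not say which classifier h stands for.
    """
    _, label = min(training, key=lambda pair: math.dist(pair[0], x))
    return label


# The four new points from the slides.
for point in [(7, 4, 4.5), (5, 3, 4.5), (6, 4, 4.5), (2, 10, 1.2)]:
    print(point, "->", h(point))
```

With this rule the four points come out as Green Pepper, Red Pepper, Green Pepper, and Hot Pepper. A different choice of h, or a different distance, could label them differently.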

Face Identification

Classification

Digit Recognition

t-SNE: most images of faces were clustered at the bottom; most images of airplanes were clustered on the right.

Example from (Tenenbaum 2000)

Different Features

Glasses vs. No Glasses

Beard vs. No Beard

Beard Distinction

Glasses Distinction

Multiple-Attribute Metric

Textbook: There is no required textbook for the class. Some classic papers will be assigned as readings.

Three recommended books that cover similar material are: Hastie, Tibshirani, and Friedman, Elements of Statistical Learning; Bishop, Pattern Recognition and Machine Learning; Murphy, Machine Learning: A Probabilistic Perspective.

Course Evaluation (tentative): Assignments 50%, Group Project 50%

Project: The final group project (a presentation and a report of up to 7 pages in PDF) is worth 50% of your final grade. The final project is the Right Whale Recognition Kaggle competition.

Project: We will find out whether the computational resources of the teaching environment are adequate. If it turns out that you don't have access to adequate computational resources, you may choose one of the following other types of projects: Another active Kaggle competition.

The basic types of projects: Develop a new algorithm. In this case, you will need to demonstrate (theoretically and/or empirically) why your technique is better (or worse) than other algorithms. Note: A negative result does not lose marks, as long as you followed proper theoretical and/or experimental techniques.

Application of classification to some domain. This could either be your own research problem, or you could try reproducing results of someone else's paper.

The project is a chance to learn more about some sub-area of classification that you might be most interested in. You may benefit more from implementing an algorithm and doing some simulations rather than trying to read and summarize some state-of-the-art papers.

Final project reports will be checked by Turnitin (plagiarism detection software).

Communication: All communication should take place on the Piazza discussion board. Piazza is a good way to discuss and ask questions about the course materials, including assignments, in a public forum. It enables you to learn from the questions of others and to avoid asking questions that have already been asked and answered. Students are expected to read Piazza on a regular basis.

Enrolling in Piazza: You will be sent an invitation to your UW email address. It will include a link to a web page where you may complete the enrollment process.

Piazza Guidelines: In any posts you make, do not give away any details on how to do any of the assignments. This could be construed as cheating, and you will be responsible as the poster.

Course Website: We will mostly use Piazza for communication. Assignments and grades will be handled through Learn. Please log on frequently. You are responsible for being aware of all material, information, and email messages found on Piazza and Learn.

Reading: Journals: Neural Computation, JMLR, ML, IEEE PAMI. Conferences: NIPS, UAI, ICML, AISTATS, IJCAI, IJCNN. Vision: CVPR, ECCV, SIGGRAPH. Speech: EuroSpeech, ICSLP, ICASSP. Online: CiteSeer, Google. Books: Elements of Statistical Learning (Hastie, Tibshirani, Friedman); Pattern Recognition and Machine Learning (Bishop); Pattern Classification (Duda, Hart, Stork); Machine Learning: An Algorithmic Perspective (Marsland).

Important Dates: Oct 6: Final project proposal due (use the link posted on Learn). Nov 17: Presentations begin (tentative).

Prerequisite: Grads: none for STATS/CS/ECE/SYDE grad students; instructor permission otherwise. Undergrads: CM 361/STAT 341 or (STAT 330 and 340).

Tentative topics: Feature extraction; Error rates and the Bayes classifier; Gaussian and linear classifier; Linear regression and logistic regression; Neural networks; Deep learning; Radial basis function networks; Density estimation and Naive Bayes; Trees.

Tentative topics (continued): Assessing error rates and model selection; Support vector machines; Kernel methods; k-nearest neighbors; Bagging; Boosting; Semi-supervised learning for classification; Metric learning for classification.