INTRODUCTION. Pattern Recognition. Slides at https://ekapolc.github.io/slides/l1-intro.pdf

Syllabus

Registration
Graduate students: 12 slots, sec 2. If filled, register as V/W only.
Undergrads: sec 21.
A signup sheet for sit-ins, S/U, and V/W is going around the room.

Tools: Python, Jupyter, NumPy, SciPy, pandas, TensorFlow, Keras

Plagiarism Policy You shall not show other people your code or solution. Copying will result in a score of zero for both parties on the assignment. Many of these algorithms have code available on the internet; do not copy and paste it.

Courseville 2110597.21 (2017/1) https://www.mycourseville.com/?q=courseville/course/register/2110597.21_2017_1&spin=on Password: cattern

Piazza http://piazza.com/chula.ac.th/fall2017/2110597 Requires a chula.ac.th email. 5 points of the participation score come from Piazza.

Office hours: Thursdays 16.30-18.30, starting from Aug 31st. Location TBA

Cloud Gcloud Credit card

Course project
3-4 people (exact number TBA)
Topic of your choice. It can be:
  an implementation of a paper
  an extension of a homework
  a project for another course, with an additional machine learning component
  your current research (with additional scope)
  a new application
You must already have existing data! No data collection!
Topics need to be pre-approved. Details about the procedure TBA

The machine learning trend http://www.gartner.com/newsroom/id/3114217

The machine learning trend http://www.gartner.com/newsroom/id/3412017

The data era 2017 numbers = 400 hours/min http://www.tubefilter.com/2014/12/01/youtube-300-hours-video-per-minute/

Factors for ML Data Compute Algo http://www.kdnuggets.com/2017/06/practical-guide-machine-learning-understand-differentiate-apply.html

The cost of storage http://royal.pingdom.com/2008/04/08/the-history-of-computer-data-storage-in-pictures/ 1980: 250 MB hard disk drive, 250 kg, 100k USD (300k USD in today's dollars) https://www.backblaze.com/blog/farming-hard-drives-2-years-and-1m-later/

The cost of compute http://aiimpacts.org/trends-in-the-cost-of-computing/

Hitting the sweet spot on performance http://recognize-speech.com/acoustic-model/knn/benchmarks-comparison-of-different-architectures

Hitting the sweet spot in performance

Now time for a video https://www.youtube.com/watch?v=wioopo9jtzw

If I were to guess like what our biggest existential threat is, it's probably that. So we need to be very careful with the artificial intelligence. There should be some regulatory oversight maybe at the national and international level, just to make sure that we don't do something very foolish.

I think people who are naysayers and try to drum up these doomsday scenarios, I just, I don't understand it. It's really negative and in some ways I actually think it is pretty irresponsible.

Poll

What is Pattern Recognition? "Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and regularities in data, although it is in some cases considered to be nearly synonymous with machine learning." (Wikipedia) What about data mining, Knowledge Discovery in Databases (KDD), and statistics?

ML vs PR vs DM vs KDD "The short answer is: None. They are concerned with the same question: how do we learn from data?" (Larry Wasserman, CMU Professor) Nearly identical tools and subject matter

History Pattern recognition started in the engineering community (mainly Electrical Engineering and Computer Vision). Machine learning comes out of AI and is mostly considered a Computer Science subject. Data mining started in the database community.

Different community viewpoints
A screw looking for a screwdriver
A screwdriver looking for a screw
Different applications
Different tools

The Screwdriver and the Screw DM PR ML AI

Distinguishing things
DM: data warehouse, ETL
AI: Artificial General Intelligence
PR: signal processing (feature engineering)
http://www.deeplearningbook.org/

Different terminologies http://statweb.stanford.edu/~tibs/stat315a/glossary.pdf

Merging communities and fields With the advent of deep learning, the fields are merging and the differences between them are becoming unclear.

How do we learn from data? The typical workflow: real world observations -> sensors -> feature extraction -> feature vector x = (1, 5, 3.6, 1, 3, -1)

How do we learn from data? Training phase: training set (feature vectors such as (1, 5, 3.6, 1, 3, -1), each with a desired output y) -> learning algorithm -> model h

How do we learn from data? Testing phase: new input X = (1, 5, 3.6, 1, 3, -1) -> h -> predicted output y
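The training and testing phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not name a learning algorithm, so a nearest-centroid rule is swapped in here, and the function name `train`, the labels, and the toy feature vectors are all hypothetical.

```python
# Hypothetical toy example: a nearest-centroid rule stands in for the
# "learning algorithm"; the slides do not prescribe any particular one.

def train(training_set):
    """Training phase: take (feature vector x, desired output y) pairs, return a model h."""
    sums, counts = {}, {}
    for x, y in training_set:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    # One centroid (mean feature vector) per class label.
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def h(x):
        """Testing phase: predict the label whose centroid is closest to x."""
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda y: sq_dist(centroids[y]))

    return h


# Made-up training set: two-dimensional feature vectors with labels "a" and "b".
training_set = [
    ([1.0, 5.0], "a"),
    ([1.5, 4.5], "a"),
    ([6.0, 1.0], "b"),
    ([5.5, 0.5], "b"),
]

h = train(training_set)       # training phase
prediction = h([1.2, 4.8])    # testing phase on a new input
print(prediction)
```

The point of the sketch is the shape of the workflow: the learning algorithm consumes the training set once and returns a function h, and only h is needed at test time.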

A task The raw inputs and the desired output define a machine learning task: data1, data2, data3 -> Magic -> predicted output y. Example: predicting the After You stock price with CCTV images, Facebook posts, and daily temperature

Key concepts Feature extraction Evaluation

Feature extraction: the process of extracting meaningful information related to the goal. A feature is a distinctive characteristic or quality. Example features: data1, data2, data3

Garbage in, garbage out The machine is only as intelligent as the data/features we put in. Data cleaning is often done to reduce unwanted things. https://precisionchiroco.com/garbage-in-garbage-out/

The need for data cleaning However, good models should be able to handle some dirtiness! https://www.linkedin.com/pulse/big-data-conundrum-garbage-out-other-challenges-business-platform

Feature properties The quality of the feature vector is related to its ability to discriminate samples from different classes

Model evaluation How to compare h1 and h2? Testing phase: new input X = (1, 5, 3.6, 1, 3, -1) -> h1, h2 -> predicted output y
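One simple way to compare two models is to run both on the same test set and compare a single number such as accuracy. The sketch below assumes exactly that; the two models `h1` and `h2`, the threshold, and the test data are hypothetical and exist only to make the comparison concrete.

```python
# Hypothetical models and test data, for illustration only.

def accuracy(h, test_set):
    """Fraction of test examples where the model's prediction matches the label."""
    correct = sum(1 for x, y in test_set if h(x) == y)
    return correct / len(test_set)


def h1(x):
    # Predicts class 1 when the first feature exceeds an arbitrary threshold.
    return 1 if x[0] > 2.0 else 0


def h2(x):
    # A trivial baseline that always predicts class 0.
    return 0


# Made-up (feature vector, true label) pairs.
test_set = [
    ([1.0, 0.3], 0),
    ([3.5, 1.2], 1),
    ([2.5, 0.1], 1),
    ([0.5, 2.0], 0),
    ([1.8, 0.9], 1),
]

acc_h1 = accuracy(h1, test_set)
acc_h2 = accuracy(h2, test_set)
print(f"h1 accuracy: {acc_h1:.2f}, h2 accuracy: {acc_h2:.2f}")
```

Including a trivial baseline such as `h2` is a design choice worth keeping: it shows how much of a model's score comes from the class balance alone.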

Metrics Compare the output of the models: errors/failures, accuracy/success. We want to quantify the error/accuracy of the models. How would you measure the error/accuracy of the following?

Ground truths
We usually compare the model's predicted answer with the correct answer. What if there is no real answer?
How would you rate machine translation?
ไปไหน
Model A: Where are you going?
Model B: Where to?
Designing a metric can be tricky, especially when it's subjective.

Metrics consideration 1 Are there several metrics? Use the metric closest to your goal, but never disregard the other metrics. They may help identify possible improvements.

Metrics consideration 2 Are there sub-metrics? http://www.ustar-consortium.com/qws/slot/u50227/research.html

Metrics definition Defining a metric can be tricky when the answer is flexible https://www.cc.gatech.edu/~hays/compvision/proj5/

Be clear about your definition of an error beforehand! Make sure that it can be easily calculated! This will save you a lot of time.

Commonly used metrics: error rate, accuracy rate, precision, true positive, recall, false alarm, F score

A detection problem Identify whether an event occurs. A yes/no question. A binary classifier. Examples: smoke detector, hotdog detector

Evaluating a detection problem
4 possible scenarios:
                Detector: Yes                 Detector: No
  Actual: Yes   True positive                 False negative (Type II error)
  Actual: No    False alarm (Type I error)    True negative
False alarm and true positive carry all the information about the performance:
True positive + False negative = # of actual yes
False alarm + True negative = # of actual no
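The four scenarios can be counted directly from paired lists of actual and detected outcomes. A minimal sketch, assuming boolean labels where `True` means "yes"; the function name `confusion_counts` and the example data are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count the four detection outcomes from paired boolean lists."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1      # actual yes, detector yes
        elif a and not p:
            fn += 1      # actual yes, detector no (Type II error)
        elif not a and p:
            fp += 1      # actual no, detector yes (false alarm, Type I error)
        else:
            tn += 1      # actual no, detector no
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Made-up outcomes for eight events.
actual    = [True, True,  True, False, False, False, False, True]
predicted = [True, False, True, True,  False, False, False, True]

counts = confusion_counts(actual, predicted)
print(counts)

# The two identities from the slide hold by construction.
assert counts["TP"] + counts["FN"] == sum(actual)
assert counts["FP"] + counts["TN"] == len(actual) - sum(actual)
```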

Definitions
True positive rate (Recall, sensitivity) = # true positive / # of actual yes
False positive rate (False alarm rate) = # false positive / # of actual no
False negative rate (Miss rate) = # false negative / # of actual yes
True negative rate (Specificity) = # true negative / # of actual no
Precision = # true positive / # of predicted positive
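These definitions translate directly into code once the four counts are known. The sketch below is illustrative only: the function name `detection_rates` and the counts passed to it are made up, and it does not guard against division by zero when a denominator is empty.

```python
def detection_rates(tp, fn, fp, tn):
    """Compute the rates defined above from the four outcome counts.

    Assumes every denominator is non-zero; a real implementation would
    need to decide what to report when, e.g., nothing is predicted positive.
    """
    actual_yes = tp + fn
    actual_no = fp + tn
    predicted_yes = tp + fp
    return {
        "recall": tp / actual_yes,            # true positive rate, sensitivity
        "false_alarm_rate": fp / actual_no,   # false positive rate
        "miss_rate": fn / actual_yes,         # false negative rate
        "specificity": tn / actual_no,        # true negative rate
        "precision": tp / predicted_yes,
    }


# Hypothetical counts for a detector evaluated on 200 events.
rates = detection_rates(tp=40, fn=10, fp=20, tn=130)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that recall and miss rate share a denominator and sum to 1, as do false alarm rate and specificity, which is why two of the four rates carry all the information.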

Search engine example A recall of 50% means? A precision of 50% means? When do you want high recall? When do you want high precision?
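For a search engine, both quantities can be read off two sets: the documents that are actually relevant and the documents the engine returned. A minimal sketch with hypothetical document IDs; here a recall of 50% means half of the relevant documents were returned, and a precision of 50% means half of the returned documents are relevant.

```python
# Hypothetical document IDs, for illustration only.
relevant = {"d1", "d2", "d3", "d4"}     # documents the user actually wants
retrieved = {"d1", "d2", "d7", "d9"}    # documents the search engine returned

hits = relevant & retrieved             # true positives: relevant AND returned

recall = len(hits) / len(relevant)      # share of relevant documents found
precision = len(hits) / len(retrieved)  # share of returned documents that are relevant

print(f"recall = {recall:.0%}, precision = {precision:.0%}")
```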

Recall/precision
When do you want high recall? When do you want high precision?
Initial screening for cancer
Face recognition system for authentication
Detecting possible suicidal postings on social media
Usually there's a trade-off between precision and recall. We will revisit this later.

Definitions 2 F score (F1 score, F-measure): a single measure that combines both aspects. It is the harmonic mean of precision and recall (an average of rates). Note that precision and recall say nothing about the true negatives.
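As a small sketch, the F1 score can be computed from precision and recall as their harmonic mean, 2PR / (P + R). The function name `f1_score` is chosen here for illustration, and returning 0.0 when both inputs are zero is a convention assumed for this example, not something the slides specify.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall.

    Returns 0.0 when both are zero (an assumed convention to avoid
    dividing by zero).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical detector that flags everything: perfect recall, poor precision.
p, r = 0.5, 1.0
f1 = f1_score(p, r)
arithmetic = (p + r) / 2

print(f"F1 = {f1:.3f}, arithmetic mean = {arithmetic:.3f}")
```

The harmonic mean is pulled toward the smaller of the two values, so a model cannot get a high F1 by doing well on only one of precision or recall.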

Harmonic mean vs Arithmetic mean
You travel for half an hour at 60 km/hr, then half an hour at 40 km/hr. What is your average speed?
Arithmetic mean = 50 km/hr
Harmonic mean = $\frac{n}{\frac{1}{x_1} + \dots + \frac{1}{x_n}} = \frac{2}{\frac{1}{40} + \frac{1}{60}} = 48$ km/hr
Total distance covered in 1 hour = 30 + 20 = 50 km (30 mins at 60 km/hr, 30 mins at 40 km/hr), so the average speed is 50 km/hr, the arithmetic mean.

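Harmonic mean vs Arithmetic mean
You travel a distance X at 60 km/hr, then another X at 40 km/hr. What is your average speed?
Arithmetic mean = 50 km/hr
Harmonic mean = $\frac{n}{\frac{1}{x_1} + \dots + \frac{1}{x_n}} = \frac{2}{\frac{1}{40} + \frac{1}{60}} = 48$ km/hr
Total distance covered = 2X km (X km at 60 km/hr, X km at 40 km/hr) in X/60 + X/40 hours, so the average speed is 48 km/hr, the harmonic mean.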
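Both travel scenarios can be checked numerically. The sketch below uses Python's standard `statistics` module; the leg length of 120 km is a hypothetical value for X (any positive distance gives the same average).

```python
from statistics import harmonic_mean, mean

speeds = [60, 40]  # km/hr

# Case 1: equal TIME at each speed (half an hour each).
hours_each = 0.5
distance_km = sum(v * hours_each for v in speeds)           # 30 km + 20 km
avg_equal_time = distance_km / (hours_each * len(speeds))   # total distance / total time

# Case 2: equal DISTANCE at each speed.
leg_km = 120  # hypothetical value for X
total_hours = sum(leg_km / v for v in speeds)               # 2 h + 3 h
avg_equal_distance = (leg_km * len(speeds)) / total_hours

print(f"equal time:     {avg_equal_time:.1f} km/hr (arithmetic mean = {mean(speeds):.1f})")
print(f"equal distance: {avg_equal_distance:.1f} km/hr (harmonic mean = {harmonic_mean(speeds):.1f})")
```

The average speed matches the arithmetic mean when the time spent at each speed is the same, and the harmonic mean when the distance is the same.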

Harmonic mean vs Arithmetic mean
For the arithmetic mean to be valid, you need to compare over the same number of hours (denominator). For precision and recall, you have different denominators but the same numerator, which fits the harmonic mean.
True positive rate (Recall, sensitivity) = # true positive / # of actual yes
Precision = # true positive / # of predicted positive
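The shared numerator can be made explicit with a short derivation. The symbols below are shorthand introduced here, not notation from the slides: TP, FP, and FN are the counts of true positives, false positives, and false negatives, P is precision, and R is recall.

```latex
\begin{align*}
P &= \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN} \\[4pt]
F_1 &= \frac{2}{\frac{1}{P} + \frac{1}{R}}
     = \frac{2}{\frac{TP + FP}{TP} + \frac{TP + FN}{TP}}
     = \frac{2\,TP}{2\,TP + FP + FN}
     = \frac{2PR}{P + R}
\end{align*}
```

Taking reciprocals puts both rates over the common quantity TP, which is the same role the common distance X plays in the equal-distance travel example.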

Evaluating models We talked about the training set used to learn the model. We use a different data set, the test set, to measure the accuracy/error of models. We can still compute the error and accuracy on the training set: training error vs testing error. We will discuss later how these can help guide us.
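The difference between training error and testing error shows up even on tiny data. The sketch below is an illustration under stated assumptions: a 1-nearest-neighbour rule is swapped in as the model (the slides do not name one), and the one-dimensional data, including one deliberately noisy training label, is made up.

```python
def one_nearest_neighbour(training_set):
    """Return a model h that copies the label of the closest training point."""
    def h(x):
        nearest = min(training_set, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return h


def error_rate(h, data):
    """Fraction of examples the model gets wrong."""
    return sum(1 for x, y in data if h(x) != y) / len(data)


# Hypothetical 1-D data: the true label is 1 when x > 5.
# The training point (4, 1) carries a deliberately noisy label.
training_set = [(1, 0), (2, 0), (3, 0), (4, 1), (6, 1), (7, 1), (8, 1), (9, 1)]
test_set = [(1.5, 0), (3.8, 0), (4.4, 0), (5.5, 1), (8.5, 1)]

h = one_nearest_neighbour(training_set)
training_error = error_rate(h, training_set)
testing_error = error_rate(h, test_set)

print(f"training error = {training_error:.2f}, testing error = {testing_error:.2f}")
```

Because this model memorises its training points, its training error says nothing about how it behaves on new inputs, which is why a separate test set is needed.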

Other considerations when evaluating models: training time, testing time, memory requirement, parallelizability, latency

Course walkthrough

Why anything else besides deep learning? The rise and fall of machine learning algorithms: methods used in bioinformatics papers https://www.ncbi.nlm.nih.gov/pmc/articles/pmc3232371/figure/f1/

What we will not cover: random forests, decision trees, boosting, graphical models

Homework Reading assignment https://hbr.org/cover-story/2017/07/the-business-of-artificial-intelligence