CSC 411: Lecture 01: Introduction


CSC 411: Lecture 01: Introduction Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 1 / 44

Today Administration details Why is machine learning so cool? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 2 / 44

The Team I Instructors: Raquel Urtasun Richard Zemel Email: csc411prof@cs.toronto.edu Offices: Raquel: 290E in Pratt Richard: 290D in Pratt Office hours: TBA Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 3 / 44

The Team II TAs: Siddharth Ancha Azin Asgarian Min Bai Lluis Castrejon Subira Kaustav Kundu Hao-Wei Lee Renjie Liao Shun Liao Wenjie Luo David Madras Seyed Parsa Mirdehghan Mengye Ren Geoffrey Roeder Yulia Rubanova Elias Tragas Eleni Triantafillou Shenlong Wang Ayazhan Zhakhan. Email: csc411ta@cs.toronto.edu Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 4 / 44

Admin Details Liberal with respect to waiving pre-requisites, but it is up to you to determine whether you have the appropriate background. Do I have the appropriate background? Linear algebra: vector/matrix manipulations, properties. Calculus: partial derivatives. Probability: common distributions; Bayes' rule. Statistics: mean/median/mode; maximum likelihood. Suggested reference: Sheldon Ross, A First Course in Probability. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 5 / 44
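As a quick self-check on the probability prerequisite, Bayes' rule (only named on the slide; written out here for reference, in LaTeX notation):

    P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)}, \qquad P(D) = \sum_{h'} P(D \mid h')\, P(h')

i.e., the posterior over a hypothesis h given data D is the likelihood times the prior, normalized over all hypotheses.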

Course Information (Section 1) Class: Mondays, 11am-1pm, in AH 400. Instructor: Raquel Urtasun. Tutorials: Monday, 3-4pm, same classroom. Class Website: http://www.cs.toronto.edu/~urtasun/courses/csc411_fall16/CSC411_Fall16.html The class will use Piazza for announcements and discussions: https://piazza.com/utoronto.ca/fall2016/csc411/home First time? Sign up here: https://piazza.com/utoronto.ca/fall2016/csc411 Your grade will not depend on your participation on Piazza. It's just a good way to ask questions and discuss with your instructor, TAs, and peers. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 6 / 44

Course Information (Section 2) Class: Wednesdays, 11am-1pm, in MS 2170. Instructor: Raquel Urtasun. Tutorials: Wednesday, 3-4pm, BA 1170. Class Website: http://www.cs.toronto.edu/~urtasun/courses/csc411_fall16/CSC411_Fall16.html The class will use Piazza for announcements and discussions: https://piazza.com/utoronto.ca/fall2016/csc411/home First time? Sign up here: https://piazza.com/utoronto.ca/fall2016/csc411/home Your grade will not depend on your participation on Piazza. It's just a good way to ask questions and discuss with your instructor, TAs, and peers. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 7 / 44

Course Information (Section 3) Class: Thursdays, 4-6pm, in KP 108. Instructor: Richard Zemel. Tutorials: Thursday, 6-7pm, same classroom. Class Website: http://www.cs.toronto.edu/~urtasun/courses/csc411_fall16/CSC411_Fall16.html The class will use Piazza for announcements and discussions: https://piazza.com/utoronto.ca/fall2016/csc411/home First time? Sign up here: https://piazza.com/utoronto.ca/fall2016/csc411/home Your grade will not depend on your participation on Piazza. It's just a good way to ask questions and discuss with your instructor, TAs, and peers. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 8 / 44

Course Information (Section 4) Class: Fridays, 11am-1pm, in MS 2172. Instructor: Richard Zemel. Tutorials: Thursday, 3-4pm, same classroom. Class Website: http://www.cs.toronto.edu/~urtasun/courses/csc411_fall16/CSC411_Fall16.html The class will use Piazza for announcements and discussions: https://piazza.com/utoronto.ca/fall2016/csc411/home First time? Sign up here: https://piazza.com/utoronto.ca/fall2016/csc411/home Your grade will not depend on your participation on Piazza. It's just a good way to ask questions and discuss with your instructor, TAs, and peers. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 9 / 44

Textbook(s) Christopher Bishop: Pattern Recognition and Machine Learning, 2006. Other textbooks: Kevin Murphy: Machine Learning: A Probabilistic Perspective. David MacKay: Information Theory, Inference, and Learning Algorithms. Ethem Alpaydin: Introduction to Machine Learning, 2nd edition, 2010. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 10 / 44

Requirements (Undergrads) Do the readings! Assignments: Three assignments, the first two worth 15% each and the last one worth 25%, for a total of 55%. Programming: take code and extend it. Derivations: pen(cil)-and-paper. Mid-term: One-hour exam, worth 20% of the course mark. Final: Focused on the second half of the course, worth 25% of the course mark. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 11 / 44

Requirements (Grads) Do the readings! Assignments: Three assignments, the first two worth 15% each and the last one worth 25%, for a total of 55%. Programming: take code and extend it. Derivations: pen(cil)-and-paper. Mid-term: One-hour exam, worth 20% of the course mark. Final: Focused on the second half of the course, worth 25% of the course mark. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 12 / 44

More on Assignments Collaboration on the assignments is not allowed. Each student is responsible for his/her own work. Discussion of assignments should be limited to clarification of the handout itself, and should not involve any sharing of pseudocode, code, or simulation results. Violation of this policy is grounds for a semester grade of F, in accordance with university regulations. The schedule of assignments is included in the syllabus. Assignments are due at the beginning of class/tutorial on the due date. Assignments handed in late but before 5 pm of that day will be penalized by 5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day will be assessed thereafter. Extensions will be granted only in special situations, and you will need a Student Medical Certificate or a written request approved by the instructor at least one week before the due date. The final assignment is a bake-off: a competition between ML algorithms. We will give you some data for training an ML system, and you will try to develop the best method. We will then determine which system performs best on unseen test data. Grads can do their own project. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 13 / 44

Provisional Calendar (Section 1) Intro + Linear Regression; Linear Classif. + Logistic Regression; Non-parametric + Decision Trees; Multi-class + Prob. Classif. I; Thanksgiving; Prob. Classif. II + NNets I; NNets II + Clustering; Midterm + Mixt. of Gaussians; Reading Week; PCA/Autoencoders + SVM; Kernels + Ensemble I; Ensemble II + RL. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 14 / 44

Provisional Calendar (Sections 2,3,4) Intro + Linear Regression; Linear Classif. + Logistic Regression; Non-parametric + Decision Trees; Multi-class + Prob. Classif. I; Prob. Classif. II + NNets I; NNets II + Clustering; Midterm + Mixt. of Gaussians; PCA/Autoencoders + SVM; Kernels + Ensemble I; Ensemble II + RL. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 15 / 44

What is Machine Learning? How can we solve a specific problem? As computer scientists we write a program that encodes a set of rules that are useful to solve the problem. Figure: How can we make a robot cook? In many cases it is very difficult to specify those rules, e.g., given a picture, determine whether there is a cat in the image. Learning systems are not directly programmed to solve a problem; instead, they develop their own program based on: examples of how they should behave, and trial-and-error experience trying to solve the problem. Different from standard CS: we want to implement an unknown function, and only have access, e.g., to sample input-output pairs (training examples). Learning simply means incorporating information from the training examples into the system. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 16 / 44
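A minimal sketch (in Python, not part of the slides) of this idea of fitting an unknown function from sample input-output pairs; the underlying function, the noise level, and the choice of a linear model are all illustrative assumptions:

    import numpy as np

    # Training examples: inputs x and outputs y from some "unknown" function
    # (here generated from y = 3x + 2 plus noise, purely for illustration).
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, size=50)
    y = 3 * x + 2 + 0.1 * rng.standard_normal(50)

    # "Learning": choose a family of programs (linear functions w*x + b) and
    # pick the member that best fits the training pairs (least squares).
    w, b = np.polyfit(x, y, deg=1)

    # The learned "program" is just two numbers; apply it to a new input.
    print("prediction at x = 0.5:", w * 0.5 + b)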

Tasks that require machine learning: What makes a 2? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 17 / 44

Tasks that benefit from machine learning: cooking! Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 18 / 44

Why use learning? It is very hard to write programs that solve problems like recognizing a handwritten digit. What distinguishes a 2 from a 7? How does our brain do it? Instead of writing a program by hand, we collect examples that specify the correct output for a given input. A machine learning algorithm then takes these examples and produces a program that does the job. The program produced by the learning algorithm may look very different from a typical hand-written program. It may contain millions of numbers. If we do it right, the program works for new cases as well as the ones we trained it on. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 19 / 44
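A hedged illustration (not from the slides) of collecting labelled examples and letting a learning algorithm produce the "program"; it assumes scikit-learn is installed and uses its small built-in handwritten-digits dataset:

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Examples that specify the correct output (digit label) for each input (8x8 image).
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=0)

    # The learning algorithm turns the examples into a "program":
    # here, a learned weight vector per digit class (hundreds of numbers).
    clf = LogisticRegression(max_iter=5000)
    clf.fit(X_train, y_train)

    # If we did it right, the learned program also works on digits it never saw.
    print("accuracy on unseen digits:", clf.score(X_test, y_test))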

Learning algorithms are useful in many tasks 1. Classification: Determine which discrete category the example belongs to Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 20 / 44

Examples of Classification What digit is this? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44

Examples of Classification Is this a dog? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44

Examples of Classification What about this one? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44

Examples of Classification Am I going to pass the exam? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44

Examples of Classification Do I have diabetes? Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 21 / 44

Learning algorithms are useful in many tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 22 / 44

Examples of Recognizing patterns Figure: Siri: https://www.youtube.com/watch?v=8ciaggasro0 Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 23 / 44

Examples of Recognizing patterns Figure: Photomath: https://photomath.net/ Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 23 / 44

Learning algorithms are useful in other tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc 3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon, Netflix). Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 24 / 44

Examples of Recommendation systems Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 25 / 44

Learning algorithms are useful in other tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc 3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon, Netflix). 4. Information retrieval: Find documents or images with similar content Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 26 / 44

Examples of Information Retrieval Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 27 / 44

Learning algorithms are useful in other tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc 3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon, Netflix). 4. Information retrieval: Find documents or images with similar content 5. Computer vision: detection, segmentation, depth estimation, optical flow, etc Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 28 / 44

Computer Vision Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 29 / 44

Computer Vision Figure: Kinect: https://www.youtube.com/watch?v=op82fdrrqsy Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 29 / 44

Computer Vision [Gatys, Ecker, Bethge. A Neural Algorithm of Artistic Style. Arxiv 15.] Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 29 / 44

Learning algorithms are useful in other tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc 3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon, Netflix). 4. Information retrieval: Find documents or images with similar content 5. Computer vision: detection, segmentation, depth estimation, optical flow, etc 6. Robotics: perception, planning, etc Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 30 / 44

Autonomous Driving Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 31 / 44

Flying Robots Figure: Video: https://www.youtube.com/watch?v=yqimgv5vtd4 Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 32 / 44

Learning algorithms are useful in other tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc 3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon, Netflix). 4. Information retrieval: Find documents or images with similar content 5. Computer vision: detection, segmentation, depth estimation, optical flow, etc 6. Robotics: perception, planning, etc 7. Learning to play games Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 33 / 44

Playing Games: Atari Figure: Video: https://www.youtube.com/watch?v=v1eynij0rnk Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 34 / 44

Playing Games: Super Mario Figure: Video: https://www.youtube.com/watch?v=wfl4l_l4u9a Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 35 / 44

Playing Games: AlphaGo Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 36 / 44

Learning algorithms are useful in other tasks 1. Classification: Determine which discrete category the example belongs to 2. Recognizing patterns: Speech Recognition, facial identity, etc 3. Recommender Systems: Noisy data, commercial pay-off (e.g., Amazon, Netflix). 4. Information retrieval: Find documents or images with similar content 5. Computer vision: detection, segmentation, depth estimation, optical flow, etc 6. Robotics: perception, planning, etc 7. Learning to play games 8. Recognizing anomalies: Unusual sequences of credit card transactions, panic situation at an airport 9. Spam filtering, fraud detection: The enemy adapts, so we must adapt too 10. Many more! Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 37 / 44

Human Learning Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 38 / 44

Types of learning tasks Supervised: correct output known for each training example. Learn to predict the output when given an input vector. Classification: 1-of-N output (speech recognition, object recognition, medical diagnosis). Regression: real-valued output (predicting market prices, customer rating). Unsupervised learning: Create an internal representation of the input, capturing regularities/structure in data. Examples: form clusters; extract features. How do we know if a representation is good? Reinforcement learning: Learn actions to maximize payoff. Not much information in a payoff signal; payoff is often delayed. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 39 / 44
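A small sketch (Python, not part of the slides) contrasting these task types on toy data; it assumes scikit-learn and NumPy are available, and the data and label rules are made up purely for illustration:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 2))  # input vectors

    # Supervised, classification: 1-of-N output (here N = 2, from a made-up rule).
    y_class = (X[:, 0] + X[:, 1] > 0).astype(int)
    clf = LogisticRegression().fit(X, y_class)

    # Supervised, regression: real-valued output.
    y_real = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=100)
    reg = LinearRegression().fit(X, y_real)

    # Unsupervised: no outputs given; look for structure (clusters) in X itself.
    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)

    print(clf.predict(X[:3]), reg.predict(X[:3]), clusters[:3])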

Machine Learning vs Data Mining Data-mining: Typically using very simple machine learning techniques on very large databases, because computers are too slow to do anything more interesting with ten billion examples. Previously used in a negative sense: a misguided statistical procedure of looking for all kinds of relationships in the data until you finally find one. Now the lines are blurred: many ML problems involve tons of data. But problems with an AI flavor (e.g., recognition, robot navigation) are still the domain of ML. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 40 / 44

Machine Learning vs Statistics ML uses statistical theory to build models. A lot of ML is rediscovery of things statisticians already knew, often disguised by differences in terminology. But the emphasis is very different: A good piece of statistics: a clever proof that a relatively simple estimation procedure is asymptotically unbiased. A good piece of ML: a demo that a complicated algorithm produces impressive results on a specific task. Can view ML as applying computational techniques to statistical problems. But it goes beyond typical statistics problems, with different aims (speed vs. accuracy). Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 41 / 44

Cultural gap (Tibshirani)
MACHINE LEARNING -> STATISTICS
weights -> parameters
learning -> fitting
generalization -> test set performance
supervised learning -> regression/classification
unsupervised learning -> density estimation, clustering
large grant: $1,000,000 -> large grant: $50,000
conference location: Snowbird, French Alps -> conference location: Las Vegas in August
Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 42 / 44

Course Survey Please complete the following survey this week: https://docs.google.com/forms/d/e/1FAIpQLScd5JwTrh55gW-O-5UKXLidFPvvH-XhVxr36AqfQzsrdDNxGQ/viewform?usp=send_form Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 43 / 44

Initial Case Study What grade will I get in this course? Data: entry survey and marks from this and previous years. Process the data: split into a training set and a test set; determine the representation of the input; determine the representation of the output. Choose the form of model: linear regression. Decide how to evaluate the system's performance: objective function. Set model parameters to optimize performance. Evaluate on the test set: generalization. Zemel, Urtasun, Fidler (UofT) CSC 411: 01-Introduction 44 / 44
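A hedged end-to-end sketch (Python) of this case study; the entry-survey features and the synthetic grades below are invented stand-ins, since the real data is not part of the slides, and scikit-learn is assumed to be available:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    # Made-up stand-in for the entry survey: each row is one past student, the
    # columns are an illustrative input representation (e.g., prior marks,
    # self-rated background), and the output is the final course grade.
    rng = np.random.default_rng(411)
    X = rng.uniform(50, 100, size=(200, 3))
    grade = 0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(0, 5, size=200)

    # Process the data: split into a training set and a test set.
    X_train, X_test, y_train, y_test = train_test_split(X, grade, test_size=0.25, random_state=0)

    # Choose the form of model (linear regression); fitting sets its parameters to
    # optimize the objective function (squared error on the training set).
    model = LinearRegression().fit(X_train, y_train)

    # Evaluate on the test set: generalization to students not seen during training.
    print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))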