CSC321 Lecture 1: Introduction

Roger Grosse

What is machine learning? For many problems, it's difficult to program the correct behavior by hand: recognizing people and objects, understanding human speech. The machine learning approach: program an algorithm to automatically learn from data, or from experience. Some reasons you might want to use a learning algorithm: it is hard to code up a solution by hand (e.g. vision, speech); the system needs to adapt to a changing environment (e.g. spam detection); you want the system to perform better than the human programmers; privacy/fairness (e.g. ranking search results).

What is machine learning? It's similar to statistics: both fields try to uncover patterns in data, both draw heavily on calculus, probability, and linear algebra, and they share many of the same core algorithms. But it's not statistics! Stats is more concerned with helping scientists and policymakers draw good conclusions; ML is more concerned with building autonomous agents. Stats puts more emphasis on interpretability and mathematical rigor; ML puts more emphasis on predictive performance, scalability, and autonomy.

What is machine learning? Types of machine learning: Supervised learning: we have labeled examples of the correct behavior. Reinforcement learning: the learning system receives a reward signal and tries to learn to maximize it. Unsupervised learning: no labeled examples; instead, we look for interesting patterns in the data.

Course information. This course is about machine learning, with a focus on neural networks. It is independent of CSC411 and CSC412, with about 25% overlap in topics. First 2/3: supervised learning. Last 1/3: unsupervised learning and reinforcement learning. There are two sections with equivalent content and the same assignments and exams; both sections are full, so please attend your own.

Course information. Formal prerequisites: Calculus: (MAT136H1 with a minimum mark of 77)/(MAT137Y1 with a minimum mark of 73)/(MAT157Y1 with a minimum mark of 67)/MAT235Y1/MAT237Y1/MAT257Y1. Linear algebra: MAT221H1/MAT223H1/MAT240H1. Probability: STA247H1/STA255H1/STA257H1. Multivariable calculus (recommended): MAT235Y1/MAT237Y1/MAT257Y1. Programming experience (recommended).

Course information. Expectations and marking: Written homeworks (20% of total mark): due Wednesday nights at 11:59pm, starting 1/17; 2-3 short conceptual questions each, using material covered up through Tuesday of the preceding week. Four programming assignments (30% of total mark): Python and PyTorch; 10-15 lines of code each; they may also involve some mathematical derivations and give you a chance to experiment with the algorithms. Exams: midterm (15%), final (35%). See the Course Information handout for detailed policies.

Course information. Textbooks: none, but we link to lots of free online resources (see the syllabus): Professor Geoffrey Hinton's Coursera lectures, the Deep Learning textbook by Goodfellow et al., and Metacademy. I will try to post detailed lecture notes, but I will not have time to write them for every lecture. Tutorials: roughly every week; programming background and worked-through examples.

Course information. Course web page: http://www.cs.toronto.edu/~rgrosse/courses/csc321_2018/ It includes the detailed course information handout.

Supervised learning examples. Supervised learning: we have labeled examples of the correct behavior, e.g. handwritten digit classification with the MNIST dataset. Task: given an image of a handwritten digit, predict the digit class. Input: the image. Target: the digit class. Data: 70,000 images of handwritten digits labeled by humans. Training set: the first 60,000 images, used to train the network. Test set: the last 10,000 images, not available during training and used to evaluate performance. This dataset is the fruit fly of neural net research: neural nets already achieved > 99% accuracy in the 1990s, but we still continue to learn a lot from it.
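To make the train/test split concrete, here is a minimal NumPy sketch; the arrays are random placeholders standing in for the real MNIST images and labels, and the variable names are purely illustrative rather than anything used in the course.

```python
import numpy as np

# Minimal sketch of the MNIST split described above, using placeholder data.
# Assume `images` is a (70000, 28, 28) array and `labels` a (70000,) array.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(70000, 28, 28), dtype=np.uint8)  # placeholder pixels
labels = rng.integers(0, 10, size=70000)                              # placeholder digit labels

train_images, train_labels = images[:60000], labels[:60000]  # used to train the network
test_images, test_labels = images[60000:], labels[60000:]    # held out to evaluate performance

# Typical preprocessing: flatten each image to a vector and rescale pixels to [0, 1].
X_train = train_images.reshape(60000, -1).astype(np.float32) / 255.0
X_test = test_images.reshape(10000, -1).astype(np.float32) / 255.0
```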

Supervised learning examples. What makes a 2?

Supervised learning examples. Object recognition (Krizhevsky et al., 2012). ImageNet dataset: thousands of categories, millions of labeled images, with lots of variability in viewpoint, lighting, etc. The error rate dropped from 25.7% to 5.7% over the course of a few years!

Supervised learning examples. Caption generation. Given: a dataset of Flickr images with captions. More examples at http://deeplearning.cs.toronto.edu/i2t

Unsupervised learning examples. In generative modeling, we want to learn a distribution over some dataset, such as natural images. We can evaluate a generative model by sampling from the model and seeing whether the samples look like the data. These results were considered impressive in 2014: Denton et al., 2014, "Deep generative image models using a Laplacian pyramid of adversarial networks".
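As a toy illustration of the "learn a distribution, then sample from it" recipe (far simpler than the image models above), one can fit a one-dimensional Gaussian to data and draw samples from the fit; the dataset and names below are made up for illustration.

```python
import numpy as np

# Toy generative modeling: fit a 1-D Gaussian to data, then sample from it and
# compare the samples to the data. (Image models follow the same recipe, just
# with far richer model families.)
rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=0.5, size=1000)     # stand-in for a real dataset

mu_hat = data.mean()        # maximum-likelihood estimate of the mean
sigma_hat = data.std()      # maximum-likelihood estimate of the standard deviation

samples = rng.normal(mu_hat, sigma_hat, size=1000)   # "sample from the model"
print(f"data mean/std:    {data.mean():.2f} / {data.std():.2f}")
print(f"samples mean/std: {samples.mean():.2f} / {samples.std():.2f}")
```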

Unsupervised learning examples. New state-of-the-art samples (images shown on the slide).

Unsupervised learning examples. A recent exciting result: a model called the CycleGAN takes lots of images of one category (e.g. horses) and lots of images of another category (e.g. zebras) and learns to translate between them. https://github.com/junyanz/cyclegan You will implement this model for Programming Assignment 4.

Reinforcement learning. An agent interacts with an environment (e.g. a game of Breakout). In each time step, the agent receives observations (e.g. pixels) which give it information about the state (e.g. the positions of the ball and paddle), and it picks an action (e.g. keystrokes) which affects the state. The agent periodically receives a reward (e.g. points). The agent wants to learn a policy, or mapping from observations to actions, which maximizes its average reward over time.
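Here is a bare-bones sketch of the agent-environment loop just described; the environment, policy, and reward are toy stand-ins invented for illustration, not the Breakout/Atari setup from the lecture.

```python
# A minimal agent-environment loop illustrating the terms above
# (observation, action, reward, policy).
class ToyEnv:
    """The agent should walk toward position 5; reward is higher the closer it gets."""
    def __init__(self):
        self.state = 0

    def step(self, action):            # action is -1 or +1
        self.state += action
        reward = -abs(self.state - 5)  # reward signal from the environment
        return self.state, reward      # here the observation is the full state


def policy(observation):
    # A fixed (hand-coded, not yet learned) mapping from observations to actions.
    return 1 if observation < 5 else -1


env = ToyEnv()
obs, total_reward = env.state, 0.0
for t in range(20):                    # one episode of 20 time steps
    action = policy(obs)               # the agent picks an action
    obs, reward = env.step(action)     # the environment returns a new observation and reward
    total_reward += reward

print("average reward per step:", total_reward / 20)
```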

Reinforcement learning. DeepMind trained neural networks to play many different Atari games, given the raw screen as input, plus the score as a reward. A single network architecture was shared between all the games, and in many cases the networks learned to play better than humans (in terms of points in the first minute). https://www.youtube.com/watch?v=v1eynij0rnk

What are neural networks? While neural nets originally drew inspiration from the brain, nowadays we mostly think about math, statistics, etc. Most of the biological details aren't essential, so we use vastly simplified models of neurons. A single neuron computes y = g(b + Σ_i x_i w_i), where the x_i are the inputs, the w_i are the weights, b is the bias, g is a nonlinearity, and y is the output. Neural networks are collections of thousands (or millions) of these simple processing units that together perform useful computations.
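As a minimal sketch of the simplified neuron y = g(b + Σ_i x_i w_i), here is a NumPy version with a sigmoid nonlinearity; the particular numbers and the choice of sigmoid are arbitrary illustrations, not values from the lecture.

```python
import numpy as np

# A single simplified neuron: y = g(b + sum_i w_i * x_i), with a sigmoid g.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # inputs x_1, x_2, x_3
w = np.array([0.1, 0.4, -0.2])   # weights w_1, w_2, w_3
b = 0.3                          # bias

y = sigmoid(b + np.dot(w, x))    # output of the neuron
print(y)
```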

What are neural networks? Why neural nets? They take inspiration from the brain, and they are a proof of concept that a neural architecture can see and hear. They are very effective across a range of applications (vision, text, speech, medicine, robotics, etc.) and are widely used in both academia and the tech industry. Powerful software frameworks (Torch, PyTorch, TensorFlow, Theano) let us quickly implement sophisticated algorithms.

Deep learning. Deep learning means many layers (stages) of processing, e.g. a network that recognizes objects in images (Krizhevsky et al., 2012), shown on the slide. Each of the boxes consists of many neuron-like units similar to the one on the previous slide!
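To show what "many layers of processing" looks like in code, here is a toy two-layer forward pass in NumPy; the layer sizes and random parameters are placeholders, not the architecture from Krizhevsky et al.

```python
import numpy as np

# Stacking layers: each layer applies a neuron-like computation to the
# previous layer's outputs. A toy two-layer network on a fake "image" vector.
rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

x = rng.standard_normal(784)                                   # flattened 28x28 input
W1, b1 = 0.01 * rng.standard_normal((256, 784)), np.zeros(256)  # layer 1 parameters
W2, b2 = 0.01 * rng.standard_normal((10, 256)), np.zeros(10)    # layer 2 parameters

h = relu(W1 @ x + b1)      # first stage of processing (hidden features)
scores = W2 @ h + b2       # second stage: one score per digit class
print(scores.argmax())     # predicted class
```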

Deep learning. You can visualize what a learned feature is responding to by finding an image that excites it. (We'll see how to do this.) Higher layers in the network often learn higher-level, more interpretable representations. https://distill.pub/2017/feature-visualization/

Software frameworks. Array processing (NumPy): vectorize computations (express them in terms of matrix/vector operations) to exploit hardware efficiency. Neural net frameworks (Torch, PyTorch, TensorFlow, Theano): automatic differentiation, compiling computation graphs, libraries of algorithms and network primitives, and support for graphics processing units (GPUs). For this course: Python and NumPy; Autograd, a lightweight automatic differentiation package written by Professor David Duvenaud and colleagues; and PyTorch, a widely used neural net framework.
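A quick NumPy sketch of what vectorization buys you: the loop and the matrix-vector product below compute the same hidden activations, but the vectorized form hands the work to optimized linear-algebra routines (and is the form that packages like Autograd or PyTorch can differentiate efficiently). The sizes are arbitrary.

```python
import numpy as np

# Vectorization: replace an explicit Python loop with a matrix-vector product.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 784))   # weight matrix
x = rng.standard_normal(784)          # input vector

# Loop version (slow): one dot product per output unit.
h_loop = np.array([np.dot(W[i], x) for i in range(W.shape[0])])

# Vectorized version (fast): a single matrix-vector product.
h_vec = W @ x

print(np.allclose(h_loop, h_vec))     # True: same result, computed much faster
```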

Software frameworks. Why take this class, if PyTorch does so much for you? So you know what to do if something goes wrong! Debugging learning algorithms requires sophisticated detective work, which requires understanding what goes on beneath the hood. That's why we derive things by hand in this class!

Next time. Next lecture: linear regression.