CSC 411 MACHINE LEARNING and DATA MINING


Lectures: Monday, Wednesday 12-1 (section 1), 3-4 (section 2)
Lecture Room: MP 134 (section 1); Bahen 1200 (section 2)
Instructor (section 1): Richard Zemel
Instructor (section 2): Raquel Urtasun
Instructor email: <csc411prof@cs.toronto.edu>
Office hours: Tuesday 3-4 Pratt 290E (Urtasun); Thursday 3-4 Pratt 290D (Zemel)
TA email: <csc411ta@cs.toronto.edu>
Tutorials: Fridays 12-1 (section 1); 3-4 (section 2)
Tutorial Room: Same as lecture
Class URL: www.cs.toronto.edu/~zemel/courses/cs411.html

Overview

Machine learning research aims to build computer systems that learn from experience. Learning systems are not directly programmed by a person to solve a problem, but instead they develop their own program based on examples of how they should behave, or from trial-and-error experience trying to solve the problem. These systems require learning algorithms that specify how the system should change its behavior as a result of experience. Researchers in machine learning develop new algorithms, and try to understand which algorithms should be applied in which circumstances.

Machine learning is an exciting interdisciplinary field, with historical roots in computer science, statistics, pattern recognition, and even neuroscience and physics. In the past 10 years, many of these approaches have converged and led to rapid theoretical advances and real-world applications.

This course will focus on the machine learning methods that have proven valuable and successful in practical applications. This course will contrast the various methods, with the aim of explaining the circumstances under which each is most appropriate. We will also discuss basic issues that confront any machine learning method.

Pre-requisites

You should understand basic probability and statistics (STA 107, 250), and college-level algebra and calculus.
For example, it is expected that you know about standard probability distributions (Gaussians, Poisson), and also how to calculate derivatives. Knowledge of linear algebra is also expected, and knowledge of the mathematics underlying probability models (STA 255, 261) will be useful. For the programming assignments, you should have some background in programming (CSC 270), and it would be helpful if you know Matlab or Python. Some introductory material for Matlab will be available on the course website as well as in the first tutorial.

Readings

There is no required textbook for this course. There are several recommended books. On the course webpage we will list readings from Introduction to Machine Learning by Ethem Alpaydin, and from Pattern Recognition and Machine Learning by Chris Bishop. We also provide pointers to other online resources.

Course requirements and grading

The format of the class will be lecture, with some discussion. I strongly encourage interaction and questions. There are assigned readings for each lecture that are intended to prepare you to participate in the class discussion for that day.

The grading in the class will be divided up as follows:

Assignments      50%
Mid-Term Exam    20%
Final Exam       30%

There will be four assignments; each is worth 12.5% of your grade.

Homework assignments

The best way to learn about a machine learning method is to program it yourself and experiment with it. So the assignments will generally involve implementing machine learning algorithms, and experimentation to test your algorithms on some data. You will be asked to summarize your work, and analyze the results, in brief (3-4 page) write-ups. The implementations may be done in any language, but Matlab or Python is recommended. A brief tutorial on Matlab is available from the course website.

Collaboration on the assignments is not allowed. Each student is responsible for his or her own work. Discussion of assignments and programs should be limited to clarification of the handout itself, and should not involve any sharing of pseudocode or code or simulation results.
Violation of this policy is grounds for a semester grade of F, in accordance with university regulations.

The schedule of assignments is included in the syllabus. Assignments are due at the beginning of class/tutorial on the due date. Because they may be discussed in class that day, it is important that you have completed them by that day. Assignments handed in late but before 5 pm of that day will be penalized by 5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day will be assessed thereafter. Extensions will be granted only in special situations, and you will need a Student Medical Certificate or a written request approved by the instructor at least one week before the due date.

For the final assignment, we will have a bake-off: a competition between machine learning algorithms. We will give everyone some data for training a machine learning system, and you will try to develop the best method. We will then determine which system performs best on some unseen test data.

Exams

There will be a mid-term in tutorial on October 24th, which will be a closed book exam on all material covered up to that point in the lectures, tutorials, required readings, and assignments. The final will not be cumulative, except insofar as concepts from the first half of the semester are essential for understanding the later material. The exams will cover material presented in lectures, tutorials, and assignments. You will not be responsible for topics in the reading not covered in any of these.

Attendance

We expect students to attend all classes, and all tutorials. This is especially important because we will cover material in class that is not included in the textbook. Also, the tutorials will not only be for review and answering questions, but new material will also be covered.

Electronic Communication

If you have questions about the assignments, you should send email to the TA account, and cc me on it. You should include your full name in the email, and it will also be useful to include your CDF account name and/or student number.
Feel free to email me with questions or comments about the material covered in the course, or other related questions. For questions about marks on the assignments, please first contact the TA. Questions about the exams should be addressed to me.
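
As a worked illustration of the arithmetic in the grading breakdown and the late policy above, here is a minimal Python sketch. It is not an official grade calculator. The function names `late_multiplier` and `course_grade` are hypothetical, and the reading of the late rule is an assumption: the syllabus does not say whether the 10% per day stacks with the same-day 5%, nor how days are counted, so the sketch simply deducts 10 percentage points per day charged and floors the result at zero.

```python
# Illustrative sketch only. Names and the late-rule interpretation are
# assumptions, not part of the official course policy.

ASSIGNMENT_WEIGHT = 12.5  # percent of course grade per assignment (4 x 12.5 = 50)
MIDTERM_WEIGHT = 20.0     # percent of course grade
FINAL_WEIGHT = 30.0       # percent of course grade


def late_multiplier(days_late=0, same_day_before_5pm=False):
    """Factor applied to an assignment's total points.

    Assumed reading of the policy:
      - on time: 1.0
      - late, but handed in before 5 pm on the due date: 0.95
      - otherwise: 10 percentage points off per day charged, floored at 0
    How `days_late` is counted is itself an assumption left to the caller.
    """
    if same_day_before_5pm:
        return 0.95
    if days_late <= 0:
        return 1.0
    return max(0, 100 - 10 * days_late) / 100


def course_grade(assignment_pcts, midterm_pct, final_pct):
    """Weighted course grade on a 0-100 scale from percentage marks."""
    if len(assignment_pcts) != 4:
        raise ValueError("the syllabus specifies four assignments")
    total = sum(ASSIGNMENT_WEIGHT * a for a in assignment_pcts)
    total += MIDTERM_WEIGHT * midterm_pct + FINAL_WEIGHT * final_pct
    return total / 100


if __name__ == "__main__":
    # Hypothetical marks; the second assignment is charged two late days.
    marks = [80, 90 * late_multiplier(days_late=2), 70, 85]
    print(course_grade(marks, midterm_pct=75, final_pct=82))
```

Keeping the weights as percentages and dividing once at the end avoids most floating-point surprises in the weighted sum.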

CLASS SCHEDULE, Part 1

Shown below are the topics for lectures and tutorials (in italics), as are the dates that each assignment will be handed out and is due. The notes from each lecture and tutorial will be available on the class website the day of the class meeting. The assigned readings are specific sections from the book. All of these are subject to change.

Date       Topic                                       Assignments
Sep 8      Introduction
Sep 10     Linear Regression
Sep 12     Probability for ML & Linear regression
Sep 15     Linear Discrimination
Sep 17     Logistic Regression
Sep 19     Optimization for ML
Sep 22     Decision Trees                              Asst 1 Out
Sep 24     Nonparametric Methods
Sep 26     Parametric vs. Nonparametric
Sep 29     Multi-class Classification
Oct 1      Probabilistic Classifiers                   Asst 1 In
Oct 3      (No tutorial)
Oct 6      Probabilistic Classifiers II                Asst 2 Out
Oct 8      Naive Bayes
Oct 10     Naive Bayes & Gaussian Bayes classifiers
[Oct 13]   Thanksgiving: No class
Oct 15     Neural Networks I
Oct 17     Mid-term review

CLASS SCHEDULE, Part 2

Date                    Topic                                       Assignments
Oct 20                  Neural Networks II
Oct 22                  Clustering                                  Asst 2 In
Oct 24                  Mid-term
Oct 27                  EM & Mixtures of Gaussians
Oct 29                  PCA & Autoencoders                          Asst 3 Out
Oct 31                  Mixture of Gaussians
Nov 3                   Support Vector Machines
Nov 5                   Kernels and Margins
Nov 7                   SVMs
Nov 10                  Ensemble Methods
Nov 12                  Ensemble Methods II                         Asst 3 In
Nov 14                  Bagging & Boosting                          Asst 4 Out
[Nov 17]                Mid-term break: No class
Nov 19                  Bayesian Learning
Nov 21, Nov 24, Nov 26  Computational Learning Theory; Computational Learning Theory II
Nov 28, Dec 1           Reinforcement Learning I                    Asst 4 In