MLD Statistical Machine Learning


Spring 2008 Syllabus
MLD 10-702 Statistical Machine Learning
http://www.stat.cmu.edu/~larry/=sml2008

Statistical Machine Learning is a second graduate-level course in machine learning, assuming students have taken Machine Learning (10-701) and Intermediate Statistics (36-705). The term "statistical" in the title reflects the emphasis on statistical analysis and methodology, which is the predominant approach in modern machine learning. The course combines methodology with theoretical foundations. It is intended for students who want to practice the art of designing good learning algorithms and also understand the science of analyzing an algorithm's statistical properties and performance guarantees. Theorems are presented together with practical aspects of methodology and intuition to help students develop tools for selecting appropriate methods and approaches to problems in their own research. The course includes topics in statistical theory that are now becoming important for researchers in machine learning, including consistency, minimax estimation, and concentration of measure.

Schedule
LECTURES: Mon. and Wed. 1:30-2:50, Wean Hall 4623
OFFICE HOURS: Tuesdays 4:00-5:00, Baker Hall 228A
TA OFFICE HOURS: TBA

Contact Information
Professor: Larry Wasserman, Baker Hall 228A, 268-8727, larry@stat.cmu.edu
Teaching Assistant: Jingrui He, TBA, jingruih@cs.cmu.edu
Secretary: Diane Stidle, Wean Hall 4609, 268-3431, diane@cs.cmu.edu

Prerequisites
You should have taken 10-701 and 36-705. I will assume that you are familiar with the following concepts:
1. convergence in probability
2. central limit theorem
3. maximum likelihood
4. delta method
5. Fisher information
6. Bayesian inference
7. posterior distribution
8. bias, variance, and mean squared error
9. determinants, eigenvalues, eigenvectors
It is essential that you know these topics.
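As a reminder of the assumed background, item 8 refers to the standard decomposition of mean squared error: for an estimator \(\hat{\theta}\) of a parameter \(\theta\),

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathbb{E}\bigl[(\hat{\theta}-\theta)^2\bigr]
  = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^2,
\qquad
\mathrm{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta .
```

You should be comfortable both deriving this identity and using it to compare estimators.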

Text and Reference Materials
There is no required text for the course; however, lecture notes will be regularly distributed. These are draft chapters and sections from a book in progress (also called "Statistical Machine Learning"). Comments, corrections, and other input on the drafts are highly encouraged. The book is intended to be at a more advanced level than current texts such as The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman or Pattern Recognition and Machine Learning by Bishop. But these books are excellent references that may complement many parts of the course. Recommended texts include:

Chris Bishop, Pattern Recognition and Machine Learning, Springer, Information Science and Statistics Series, 2006.

Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Texts in Statistics, Springer-Verlag, New York, 2001.

Larry Wasserman, All of Statistics: A Concise Course in Statistical Inference, Springer Texts in Statistics, Springer-Verlag, New York, 2004.

Larry Wasserman, All of Nonparametric Statistics, Springer Texts in Statistics, Springer-Verlag, New York, 2005.

Assignments, Exams, and Grades
The course will have six (6) assignments, which will include both problem-solving and experimental components. The assignments will be given roughly every two weeks and will be due on Fridays at 3:00 p.m.

Midterm exam. There will be a midterm exam on Monday, March 3.

Project. There will be a final project, similar to the project in 10-701. The project is described later in this syllabus and on the website.

Grading for the class will be as follows:
50% Assignments
25% Midterm exam
25% Project

Programming Language
All computational problems for the course are to be completed using the R programming language. R is an excellent language for statistical computing, which has many advantages over Matlab and other scientific scripting languages. The underlying programming language is elegant and powerful. Students have found it useful, and not difficult, to learn this language even if they primarily use another language in their own research. Free downloads of the language, together with an extensive set of resources, can be found at http://www.r-project.org.
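For students who have not used R before, a small session conveys its flavor. The following is a minimal illustrative sketch (not taken from the course materials): simulating a regression problem and computing a closed-form ridge estimate.

```r
# Illustrative sketch only: ridge regression on simulated data,
# the kind of computational exercise the assignments involve.
set.seed(1)
n <- 100; p <- 5
X <- matrix(rnorm(n * p), n, p)      # design matrix
beta <- c(2, -1, 0, 0, 0.5)          # true coefficients
y <- X %*% beta + rnorm(n)           # responses with Gaussian noise

# Closed-form ridge estimate: (X'X + lambda I)^{-1} X'y
lambda <- 1
beta.hat <- solve(t(X) %*% X + lambda * diag(p), t(X) %*% y)
print(round(beta.hat, 2))
```

Vectorized linear algebra of this kind, with no explicit loops, is idiomatic R.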

Policy on Collaboration
Collaboration on homework assignments with fellow students is encouraged. However, such collaboration should be clearly acknowledged by listing the names of the students with whom you have had discussions concerning your solution. You may not, however, share written work or code; after discussing a problem with others, the solution should be written by yourself.

Topics
The course will follow the outline of the book manuscript, and will include topics from the following:
1. Statistical Theory: Maximum likelihood, Bayes, minimax, parametric versus nonparametric methods, Bayesian versus non-Bayesian approaches, classification, regression, density estimation.
2. Convexity and Optimization: Convexity, conjugate functions, unconstrained and constrained optimization, KKT conditions.
3. Parametric Methods: Linear regression, model selection, generalized linear models, mixture models, classification, graphical models, structured prediction, hidden Markov models.
4. Sparsity: High-dimensional data and the role of sparsity, basis pursuit and the lasso revisited, sparsistency, consistency, persistency, greedy algorithms for sparse linear regression, sparsity in nonparametric regression, sparsity in graphical models, compressed sensing.
5. Nonparametric Methods: Nonparametric regression and density estimation, nonparametric classification, clustering and dimension reduction, manifold methods, spectral methods, the bootstrap and subsampling, nonparametric Bayes.
6. Advanced Theory: Concentration of measure, covering numbers, learning theory, risk minimization, Tsybakov noise, minimax rates for classification and regression, surrogate loss functions.
7. Kernel Methods: Mercer kernels, kernel classification, kernel PCA, kernel tests of independence.
8. Computation: The EM algorithm, simulation, variational methods, regularization path algorithms, graph algorithms.
9. Other Learning Methods: Semi-supervised learning, reinforcement learning, minimum description length, online learning, the PAC model, active learning.

Final Project
The project is similar to the project in 10-701. Here are the rules:
1. You may work by yourself or in teams of 2.
2. Choose an interesting dataset that you have not analyzed before. A good source of data is: http://www.ics.uci.edu/~mlearn/mlrepository.html
3. The goals are (i) to use the methods you have learned in class or, if you wish, to develop a new method, and (ii) to present a theoretical analysis of the methods.

4. You will provide: (i) a proposal, (ii) a progress report, and (iii) a final report.
5. The reports should be well-written. This is a good time to buy a copy of The Elements of Style by Strunk and White.

Proposal. The proposal is due February 15. The length is 1 page. It should contain the following information: project title, team members, description of the data, precise description of the question you are trying to answer with the data, preliminary plan for analysis, and reading list (papers you will need to read).

Progress Report. Due April 4. Three pages. Include: (i) a high-quality introduction, (ii) what you have done so far, and (iii) what remains to be done.

Final Report. Due May 2. The paper should be in NIPS format; however, it can be up to 20 pages long. You should submit a PDF file electronically. It should have the following format:
1. Introduction. A quick summary of the problem, methods, and results.
2. Problem description. Detailed description of the problem. What question are you trying to address?
3. Methods. Description of the methods used.
4. Results. The results of applying the methods to the data set.
5. Theory. This section should contain a cogent discussion of the theoretical properties of the method. It should also discuss under what assumptions the methods should work and under what conditions they will fail.
6. Simulation studies. Results of applying the method to simulated data sets.
7. Conclusions. What is the answer to the question? What did you learn about the methods?

Course Calendar

Week of       Mon            Wed            Friday
January 14
January 21                                  Homework 1
January 28
February 4                                  Homework 2
February 11                                 Project Proposal
February 18                                 Homework 3
February 25
March 3       Midterm Exam   No Class       Spring Break
March 10      Spring Break   Spring Break   Spring Break
March 17                                    Homework 4
March 24
March 31                                    Progress report
April 7                                     Homework 5
April 14
April 21
April 28                     Last Class     Submit Project
May 5                                       Homework 6