Course Outline STAT*6801: FALL 2017

General Information

Course Title: Statistical Learning

Course Description (from Graduate Calendar): Topics include: nonparametric and semiparametric regression; kernel methods; regression splines; local polynomial models; generalized additive models; classification and regression trees; neural networks. This course deals with both the methodology and its application with appropriate software. Areas of application include biology, economics, engineering and medicine.

Credit Weight: 0.5
Academic Department (or campus): Math and Stats
Campus: Guelph
Semester Offering: Fall 2017
Class Schedule and Location: Monday 4:00 to 5:20, MCKN 226

Instructor Information

Instructor Name: Tony Desmond
Instructor Email: tdesmond@uoguelph.ca
Office location and office hours: MCN 523, Friday 4-5 pm

Course Content

This course will deal with a variety of topics in statistical learning and their implementation in R. In lectures I will briefly review recent research in generalized linear models. One focus of the course will be nonparametric and semiparametric versions of these models. An important example is generalized additive models (GAMs), which will be treated in some depth. In addition, modern nonparametric regression via kernels, splines, etc. will be studied. Other topics that will be covered include classification and regression trees, random forests, boosting and neural networks. Time permitting, topics such as wavelets and MARS (Multivariate Adaptive Regression Splines) may also be treated.

In the project component of the course, the student is encouraged to work in areas of his or her own interest (both applied and theoretical), with the instructor's permission. Much of the material in the required and recommended texts relates to research published in the last two decades or so. Areas of application include medicine, finance, agriculture, economics, pharmacokinetics, bioassay and engineering reliability, to name only a few.

Familiarity with R will be assumed. The best way to acquire familiarity is via the manuals (available online). Simply working through the required texts is also of great value.

Learning Outcomes:

1. Understand basic statistical learning concepts such as: generalization, predictive accuracy, overfitting, training, test and validation sets, parsimony, cross-validation.
2. Explore and understand how standard parametric models such as linear and generalized linear models can be viewed from a statistical learning perspective.
3. Explore and understand non-parametric approaches to statistical learning, which extend the flexibility and enhance the predictive accuracy of parametric supervised learning.
4. Explore and understand algorithmic approaches such as neural nets, classification and regression trees.
5. Implement the approaches in 2, 3, and 4 using the software package R on real data from various subject matter areas.

Lecture Content:

1. Statistical Learning: Prediction vs Inference; The two cultures; Algorithmic vs Data models; The bias-variance tradeoff; The prediction accuracy/model-interpretability tradeoff; generalizability and validation; supervised and unsupervised learning; regression vs classification.
2. Linear and Generalized Linear Models from a statistical learning perspective. Difficulties with high-dimensional data. Ridge Regression and the LASSO; The glmnet package.
3. Moving beyond Linearity: Regression splines; smoothing splines; local regression; generalized additive models.
4. Tree-based Methods: Regression and Classification Trees; Trees vs Linear and Generalized Linear Models; Random Forests, Bagging and Boosting.
5. Neural Networks
6. Other topics: Wavelets; MARS (Multivariate Adaptive Regression Splines); Support Vector Machines.

Course Assignments and Term Project

4 assignments, each worth 12.5%. Due dates (all in class):
A1: October 4
A2: October 18
A3: November 1
A4: November 15

Final Term Project, worth 50%. Due date: December 13 before 5 pm. I require both hard copies and e-copies (PDF or Word) of the final project.

Course Resources

Required Texts:
Extending the Linear Model with R, by Julian Faraway, 2nd Edition, Chapman and Hall, 2016.
The Elements of Statistical Learning: Data Mining, Inference and Prediction, by Hastie, Tibshirani, and Friedman, 2nd Edition, Springer, 2009.

Recommended Texts:
An Introduction to Statistical Learning with Applications in R, by James, G. et al., Springer, 2014.
Statistical Learning with Sparsity: The LASSO and its Generalizations, by Hastie et al., Chapman and Hall, 2016.
Modern Applied Statistics with S, 4th Edition, by W.N. Venables and B.D. Ripley, Springer, 2004.
Statistical Learning from a Regression Perspective, by Berk, R., Springer, 2008.
Semiparametric Regression, by Ruppert, Wand and Carroll, Cambridge University Press, 2003.
Generalized Additive Models, by Hastie and Tibshirani, Chapman and Hall, 1990.
Statistical Learning for Biomedical Data, by Malley et al., CUP, 2011.

NB: Copies of each of these texts have been placed on reserve in the library. With the exception of the last two, these are electronic copies.

Course Policies

Late Assignments will not be accepted except under very exceptional circumstances.

Course Policy on Group Work:
Assignment solutions should be your own work, be clear, legible and well organized. You may discuss assignments with other classmates, but the work handed in should be your own.

Course Policy regarding use of electronic devices and recording of lectures:
Electronic recording of classes is expressly forbidden without consent of the instructor. When recordings are permitted they are solely for the use of the authorized student and may not be reproduced, or transmitted to others, without the express written consent of the instructor.

University Policies

Academic Consideration
When you find yourself unable to meet an in-course requirement because of illness or compassionate reasons, please advise the course instructor in writing, with your name, id#, and e-mail contact. See the academic calendar for information on regulations and procedures for Academic Consideration: http://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-ac.shtml

Academic Misconduct
The University of Guelph is committed to upholding the highest standards of academic integrity and it is the responsibility of all members of the University community, faculty, staff, and students to be aware of what constitutes academic misconduct and to do as much as possible to prevent academic offences from occurring. University of Guelph students have the responsibility of abiding by the University's policy on academic misconduct regardless of their location of study; faculty, staff and students have the responsibility of supporting an environment that discourages misconduct. Students need to remain aware that instructors have access to and the right to use electronic and other means of detection.

Please note: Whether or not a student intended to commit academic misconduct is not relevant for a finding of guilt. Hurried or careless submission of assignments does not excuse students from responsibility for verifying the academic integrity of their work before submitting it. Students who are in any doubt as to whether an action on their part could be construed as an academic offence should consult with a faculty member or faculty advisor.

The Academic Misconduct Policy is detailed in the Undergraduate Calendar: http://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-amisconduct.shtml

Accessibility
The University of Guelph is committed to creating a barrier-free environment. Providing services for students is a shared responsibility among students, faculty and administrators. This relationship is based on respect of individual rights, the dignity of the individual and the University community's shared commitment to an open and supportive learning environment. Students requiring service or accommodation, whether due to an identified, ongoing disability or a short-term disability, should contact the Centre for Students with Disabilities as soon as possible. For more information, contact SAS at 519-824-4120 ext. 56208 or email csd@uoguelph.ca or see the website: http://www.uoguelph.ca/csd/

Course Evaluation Information
Please see http://www.mathstat.uoguelph.ca/files/teachevaluationformf10.pdf

Drop date
The last date to drop one-semester courses, without academic penalty, is Friday, November 3, 2017. For regulations and procedures for Dropping Courses, see the Academic Calendar: http://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-drop.shtml

Additional Course Information
Additional Course Information will be provided in class.