Course Outline STAT 841 / 441, CM 763 Statistical Learning-Classification


Fall 2015

Instructor: Ali Ghodsi
Dept. of Statistics & Actuarial Science, University of Waterloo
Office: M3 4208
E-mail: aghodsib@uwaterloo.ca
Office hours: 4:00-5:00 T
Lectures: 11:30-12:50 TTh, RCH 207

Prerequisites:
  Grads: none for STATS/CS/ECE/SYDE grad students, instructor permission otherwise
  Undergrads: STAT 341 or (STAT 330 and 340)

Course Description: Classification, also known as pattern recognition, is the problem of predicting a discrete random variable Y from another random variable X. The random variable X may take many different forms, from digital image libraries and text corpora to gene expression microarrays and financial time series. This course provides a comprehensive introduction to the problem of classification and pattern recognition and reflects recent developments in the field.

Required Textbook: There is no required textbook for the class. Three recommended books that cover similar material are:
  Hastie, Tibshirani, Friedman, Elements of Statistical Learning
  Bishop, Pattern Recognition and Machine Learning
  Murphy, Machine Learning: a Probabilistic Perspective
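As a rough illustration of the problem stated in the course description (predicting a discrete label Y from an observation X), the following is a minimal sketch of a k-nearest-neighbors classifier, one of the tentative topics for the course. It is not taken from the course materials: the function name `knn_predict`, the choice of k, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict a discrete label for `query` by majority vote
    among its k nearest training points (Euclidean distance)."""
    # Sort the labelled training points by distance to the query.
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: X is a point in the plane, Y is a label in {"a", "b"}.
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

prediction_near_origin = knn_predict(train_x, train_y, (0.1, 0.1))
prediction_far = knn_predict(train_x, train_y, (2.0, 2.0))
```

Here X is a pair of real numbers and Y takes one of two values. In the applications named in the description, X could instead be an image, a document, or a time series, which is where feature extraction and more sophisticated classifiers come in.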

Tentative topics:
  Feature selection
  Feature extraction (dimensionality reduction)
  Error rates and the Bayes classifier
  Gaussian and linear classifier
  Linear regression and logistic regression
  Neural networks
  Radial basis function networks
  Naive Bayes
  Trees
  Assessing error rates and model selection
  Support vector machines
  Kernel methods
  k-nearest neighbors
  Deep learning
  Bagging
  Boosting
  Semi-supervised learning for classification
  Metric learning for classification

Evaluation (tentative):
  Assignments: 50% (4 or 5 assignments)
  Final project: 50% (10% presentation, 40% ranking and report)
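To make the tentative weighting concrete (50% assignments, 10% presentation, 40% ranking and report), here is a small sketch of how a final mark could be computed. It assumes that all components are marked out of 100 and that the assignments are weighted equally; neither assumption is stated in the outline, and the marks used are made up.

```python
# Tentative weights from the evaluation scheme, in percent.
WEIGHTS = {"assignments": 50, "presentation": 10, "ranking_and_report": 40}


def final_grade(assignment_marks, presentation, ranking_and_report):
    """Combine component marks (each out of 100) into a final mark out of 100.

    Assumption (not from the outline): assignments are equally weighted.
    """
    assignments_avg = sum(assignment_marks) / len(assignment_marks)
    weighted_sum = (
        WEIGHTS["assignments"] * assignments_avg
        + WEIGHTS["presentation"] * presentation
        + WEIGHTS["ranking_and_report"] * ranking_and_report
    )
    return weighted_sum / 100


# Hypothetical marks for four assignments and the two project components.
example_grade = final_grade([80, 90, 70, 100], presentation=90, ranking_and_report=75)
```

With these made-up marks the assignment average is 85, so the final mark works out to 0.5 × 85 + 0.1 × 90 + 0.4 × 75 = 81.5.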

Project: The final group project (a presentation and a report of up to 7 pages in PDF) is worth 50% of your final grade. You are encouraged to participate in the Right Whale Recognition Kaggle competition as your final project. If you don't have access to adequate computational resources, you may choose one of the following types of project instead:
  Another active Kaggle competition.
  Develop a new algorithm. In this case, you will need to demonstrate (theoretically and/or empirically) why your technique is better (or worse) than other algorithms. (Note: a negative result does not lose marks, as long as you followed proper theoretical and/or experimental techniques.)
  Application of classification to some domain. This could either be your own research problem, or you could try reproducing the results of someone else's paper.
Note that you cannot borrow part of an existing thesis work, nor can you re-use a project from another course as your final project. Final project reports will be checked by Turnitin (plagiarism detection software).

Communication: All communication should take place using the Piazza discussion board. Piazza is a good way to discuss and ask questions about the course materials, including assignments, in a public forum. It enables you to learn from the questions of others, and to avoid asking questions that have already been asked and answered. It also provides a forum for course personnel to make announcements and clarifications about assignments and other course-related topics. Students are expected to read Piazza on a regular basis.

Enrolling in Piazza: You will be sent an invitation to your UW email address. It will include a link to a web page where you may complete the enrollment process.

Piazza Guidelines: Here are some guidelines that you should keep in mind when posting items to Piazza:

1. Please remember that everything you post is public: everyone enrolled in this course will be reading it. As a result, in any posts you make, do not give away any details on how to do any of the assignments. This could be construed as cheating, and you will be responsible as the poster. If you have questions about an assignment that require you to give specific details of your solution, you may still post to Piazza, but check "This is a private post - only visible to class instructors (and TAs)". If the instructors and/or TAs feel that posting it to everyone is appropriate, they will do so.

2. Keep posts related to the course, concise, and topical. As students are all expected to read Piazza on a regular basis, try not to waste the time of readers.

3. Please be diligent about attempting to find the answer before you post a question. Piazza includes excellent search facilities; use them! Scan all of the questions that have already been asked. Better yet, read them along with the answers. You'll learn lots! Please do all you can to avoid duplicates.

4. Make it easy for other students to find your question, just in case they have the same question and want to see the answer. Use a meaningful subject heading. "Help" and even "Help for A3Q2" are not very meaningful. "Clarify parameter order for A3Q2" is much better. Tag your post with all the applicable tags. Start a tag by typing the hash character (#). A drop-down list of tags that are currently in use will appear. Use one of them, if applicable. If not, create a new one. However, any tag you create should be applicable to many posts, not just yours.

5. Please don't post things to the group that provide no useful information to readers. Posts like "I have the same question as this one just posted" or "I agree with this comment" serve no useful purpose, and waste people's time.

6. Keep complaints about the course out of Piazza, or mark them with the "This is a private post - only visible to class instructors" checkbox. If you have a concern about anything to do with the course, the best way to deal with it, and to get results, is to take it to the course instructor. Piazza is not a complaint forum.

Assignments and grades will be handled through Learn. Please log on frequently to Piazza and Learn. You are responsible for being aware of all STAT 341 material, information and email messages found on Learn and Piazza throughout the semester.

Important Dates:
  Oct 6: Final project proposal due (use the link posted on Learn)
  Nov 17: Presentations begin (tentative)

Academic Honesty: In assignments, projects and wikicoursenote, if you use ideas, plots, text or other intellectual property developed by someone else, you have to cite the original source. If you copy a sentence or a paragraph from work done by someone else, then in addition to citing the original source you have to use quotation marks to identify the scope of the copied material. Example: "Plagiarism is an act of using ideas, plots, text and other intellectual property developed by someone else while claiming it is your original work." [1] Evidence of copying or plagiarism will result in a failing mark in the course.

Persons with Disabilities: The Office for Persons with Disabilities (OPD), located in Needles Hall, Room 1132, collaborates with all academic departments to arrange appropriate accommodations for students with disabilities without compromising the academic integrity of the curriculum. If you require academic accommodations to lessen the impact of your disability, please register with OPD at the start of each academic term.

References
[1] Tec Encyclopedia. http://www.answers.com/topic/plagiarism