COLLEGE OF SCIENCE. School of Mathematical Sciences. NEW (or REVISED) COURSE: COS-STAT-747 Principles of Statistical Data Mining.


ROCHESTER INSTITUTE OF TECHNOLOGY
COURSE OUTLINE FORM
COLLEGE OF SCIENCE
School of Mathematical Sciences
NEW (or REVISED) COURSE: COS-STAT-747 Principles of Statistical Data Mining

1.0 Course Designations and Approvals
Required course approvals (approval request date; approval granted date):
  Academic Unit Curriculum Committee:
  College Curriculum Committee:
Optional designations (Is designation desired?; *approval request date; **approval granted date):
  General Education: No
  Writing Intensive: No
  Honors: No

2.0 Course information:
Course title: Principles of Statistical Data Mining
Credit hours: 3
Prerequisite(s): One course in basic statistics
Co-requisite(s): None
Course proposed by: Ernest Fokoué
Effective date: August 2013

Contact hours and maximum students/section:
  Classroom: 3 contact hours, maximum 25 students/section
  Lab: 0
  Studio: 0
  Other (specify): 0

2.a Course Conversion Designation*** (Please check which applies to this course.)
*For more information on Course Conversion Designations please see page four.
  Semester Equivalent (SE). Please indicate which quarter course it is equivalent to:
  Semester Replacement (SR). Please indicate the quarter course(s) this course is replacing: Principles of Statistical Data Mining

2.b Semester(s) offered (check)
September 2010

Fall (online), Spring (campus), Summer, Other
All courses must be offered at least once every 2 years. If course will be offered on a bi-annual basis, please indicate here:

2.c Student Requirements
Students required to take this course (by program and year, as appropriate): None
Students who might elect to take the course: This is an elective for graduate students in the Advanced Certificate and MS programs in Applied Statistics. Graduate students in other programs who are interested in statistical data mining may also elect to take this class.

In the sections that follow, please use sub-numbering as appropriate (e.g., 3.1, 3.2, etc.).

3.0 Goals of the course (including rationale for the course, when appropriate):
3.1 To achieve a practical understanding of modern statistical data mining techniques.
3.2 To develop the ability to correctly apply modern data mining techniques to a variety of real-world case studies involving massive, high-dimensional, complex data.
3.3 To gain hands-on experience with data mining through case studies, including: describing website visitors, market basket analysis, describing customer satisfaction, predicting credit risk of small businesses, predicting e-learning student performance, predicting customer lifetime value, and operational risk management.

4.0 Course description (as it will appear in the RIT Catalog, including pre- and corequisites, and quarters offered). Please use the following format:
COS-STAT-747 Principles of Statistical Data Mining I
This course covers topics such as clustering, classification and regression trees, multiple linear regression under various conditions, logistic regression, PCA and kernel PCA, model-based clustering via mixtures of Gaussians, spectral clustering, text mining, neural networks, support vector machines, multidimensional scaling, variable selection, model selection, k-means clustering, k-nearest neighbors classifiers, statistical tools for modern machine learning and data mining, naïve Bayes classifiers, variance reduction methods (bagging), and ensemble methods for predictive optimality.

5.0 Possible resources (texts, references, computer packages, etc.)
Required texts
5.1 Applied Data Mining for Business and Industry, 2nd ed., Paolo Giudici and Silvia Figini (2009), Wiley, ISBN
Recommended texts
5.2 Statistical Data Mining Using SAS Applications, 2nd ed., George Fernandez (2009), CRC Press, ISBN
5.3 Data Mining Using SAS Enterprise Miner, Randall Matignon (2009), Wiley
5.4 Getting Started with SAS Enterprise Miner (from SAS)
5.5 Applied Analytics Using SAS Enterprise Miner (from SAS)

6.0 Topics (outline):
6.1. Complex data structures and the emergence of Data Mining and Machine Learning
6.2. Measures of location and measures of variability
6.3. Distance measures, similarity measures and dependency measures
6.4. Multiple linear regression and its extensions to Radial Basis Function regression
6.5. Difference of focus between model identification and predictive optimality
6.6. Principles and applications of dimensionality reduction techniques
6.7. Principal Component Analysis and Singular Value Decomposition
6.8. Cluster analysis via Hierarchical and Non-Hierarchical Methods
6.9. Factor Analysis and Mixtures of Factor Analyzers
6.10. Multidimensional scaling and its relationship to other techniques
6.11. Model-Based Clustering via Mixtures of Gaussians
6.12. Logistic regression for Pattern Recognition
6.13. Linear and Quadratic Discriminant Analysis
6.14. Classification and Regression Trees
6.15. Neural networks: Multilayer Perceptron and Kohonen networks
6.16. Support Vector Machines for classification and regression
6.17. Nearest-neighbor models: k-means and k-nearest neighbors
6.18. Variance Reduction Techniques: Bagging Predictors
6.19. Non-parametric modeling and Bayesian modeling
6.20. Generalized linear models and log-linear models
6.21. Graphical models and their applications
6.22. Model evaluation and model selection techniques
6.23. Ensemble Methods for Predictive Optimality: Boosting

7.0 Intended course learning outcomes and associated assessment methods of those outcomes (please include as many Course Learning Outcomes as appropriate, one outcome and assessment method per row).

Assessment methods: Homework, Exams, Projects

Course Objectives

Level 2: Comprehension:
2.1. Understands the central role of model uncertainty in data mining, and maintains a keen awareness of the difference between accurate model identification and optimal prediction.
2.2. Appreciates and takes into account the ever-present bias/variance dilemma in model selection and model building, and strives to find solutions that achieve a good bias/variance trade-off.
2.3. Knows when and how to combine unsupervised learning techniques (e.g., PCA for feature extraction) with supervised learning techniques (e.g., neural networks) to achieve optimality.
2.4. Recognizes when and how to use ensemble methods rather than select a single model, and also knows when to use variance reduction techniques like bagging.
2.5. Understands the profound meaning of the No Free Lunch theorem, refrains from relying solely on a single method of data mining, and always compares various methods before making recommendations.

Level 3: Application:
3.1. Identifies an interesting real-world engineering problem during the course of study and formulates it statistically.
3.2. Recognizes, for each real-world case study, which classes of data mining methods are more appropriate.
3.3. Uses statistical software like SAS Enterprise Miner to perform a thorough data mining analysis of real-world problems.

Level 4: Analysis:
4.1. Determines which statistical model(s) appear to be most appropriate for the task at hand in light of the graphs and descriptive statistics obtained from exploratory data analysis.
4.2. Fits the chosen plausible model(s) using a statistical software package like SAS Enterprise Miner, then extracts and interprets the estimates of the parameters.
4.3. Performs additional statistical hypothesis tests wherever needed.
4.4. Checks all the assumptions underlying each method/technique used.
4.5. Interprets the statistical estimation and prediction results produced by the software package.

Level 5: Synthesis:
5.1. Selects the best model according to some of the usual model selection criteria.
5.2. Provides any needed/required formal prediction or estimation.
5.3. Uses an ensemble (aggregation) of methods wherever the need arises.
5.4. Draws conclusions and interpretations about the original engineering task based on sound formal analysis like confidence intervals and results of hypothesis testing.

Level 6: Evaluation:
6.1. Evaluates several potential statistical models and decides on the most appropriate one for a given purpose.
6.2. Provides any needed/required formal prediction or estimation.
6.3. Makes recommendations in clear and nontechnical language based on a thorough assessment of the statistical findings.

8.0 Program outcomes and/or goals supported by this course
Relationship to Program Outcomes (1 = slightly, 2 = moderately, 3 = significantly)
Program Outcomes and/or Goals for CQAS (Level of Support):

8.1 Advanced Certificate in Lean Six Sigma
  Demonstrates a solid understanding of statistical thinking and Lean Six Sigma methodology in solving real-world problems
  Leads Lean Six Sigma improvement projects.

8.2 Advanced Certificate and Masters of Science in Applied Statistics
  Demonstrates solid understanding of statistical thinking and applied statistics methodology in solving real-world problems
  Designs studies that are efficient and valid
  Analyzes data using appropriate statistical methods
  Communicates the results of statistical analysis with effective reports and presentations.
Note: Students obtaining the Advanced Certificate in Applied Statistics will not be expected to perform at the same level as students obtaining a Master of Science degree.

Not Applicable

General Education Learning Outcome Supported by the Course, if appropriate (Assessment Method):

Communication
  Express themselves effectively in common college-level written forms using standard American English
  Revise and improve written and visual content
  Express themselves effectively in presentations, either in spoken standard American English or sign language (American Sign Language or English-based Signing)
  Comprehend information accessed through reading and discussion

Intellectual Inquiry
  Review, assess, and draw conclusions about hypotheses and theories
  Analyze arguments, in relation to their premises, assumptions, contexts, and conclusions
  Construct logical and reasonable arguments that include anticipation of counterarguments
  Use relevant evidence gathered through accepted scholarly methods and properly acknowledge sources of information

Ethical, Social and Global Awareness
  Analyze similarities and differences in human experiences and consequent perspectives
  Examine connections among the world's populations
  Identify contemporary ethical questions and relevant stakeholder positions

Scientific, Mathematical and Technological Literacy
  Explain basic principles and concepts of one of the natural sciences
  Apply methods of scientific inquiry and problem solving to contemporary issues
  Comprehend and evaluate mathematical and statistical information
  Perform college-level mathematical operations on quantitative data
  Describe the potential and the limitations of technology
  Use appropriate technology to achieve desired outcomes

Creativity, Innovation and Artistic Literacy
  Demonstrate creative/innovative approaches to course-based assignments or projects
  Interpret and evaluate artistic expression considering the cultural context in which it was created

10.0 Other relevant information (such as special classroom, studio, or lab needs, special scheduling, media requirements, etc.)
None

*Optional course designation; approval request date: This is the date that the college curriculum committee forwards this course to the appropriate optional course designation curriculum committee for review. The chair of the college curriculum committee is responsible for filling in this date.

**Optional course designation; approval granted date: This is the date the optional course designation curriculum committee approves a course for the requested optional course designation. The chair of the appropriate optional course designation curriculum committee is responsible for filling in this date.

***Course Conversion Designations: Please use the following definitions to complete table 2.a on page one.
  Semester Equivalent (SE): Closely corresponds to an existing quarter course (e.g., a 4 quarter credit hour (qch) course which becomes a 3 semester credit hour (sch) course). The semester course may develop material in greater depth or length.
  Semester Replacement (SR): A semester course (or courses) taking the place of a previous quarter course(s) by rearranging or combining material from a previous quarter course(s) (e.g., a two-semester sequence that replaces a three-quarter sequence).
  New (N) - No corresponding quarter course(s).

COLLEGE OF SCIENCE. School of Mathematical Sciences. NEW (or REVISED) COURSE: COS-STAT-762 SAS Database Programming. request date: *Approval

COLLEGE OF SCIENCE. School of Mathematical Sciences. NEW (or REVISED) COURSE: COS-STAT-762 SAS Database Programming. request date: *Approval ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences NEW (or REVISED) COURSE: COS-STAT-762 SAS Database Programming 1.0 Course Designations and Approvals

More information

COLLEGE OF SCIENCE. School of Mathematical Sciences. NEW (or REVISED) COURSE: COS-STAT-741 Regression Analysis. request date: *Approval

COLLEGE OF SCIENCE. School of Mathematical Sciences. NEW (or REVISED) COURSE: COS-STAT-741 Regression Analysis. request date: *Approval ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences NEW (or REVISED) COURSE: COS-STAT-741 Regression Analysis 1.0 Course Designations and Approvals

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences. Revised COURSE: COS-MATH-241 Linear Algebra

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences. Revised COURSE: COS-MATH-241 Linear Algebra ! ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences New Revised COURSE: COS-MATH-241 Linear Algebra 1.0 Course designations and approvals: Required

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences ! ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences New Revised COURSE: COS-MATH-185 Mathematics of Graphical Simulation I 1.0 Course designations

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences. Revised COURSE: COS-MATH-461 Topology

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences. Revised COURSE: COS-MATH-461 Topology ! ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences New Revised COURSE: COS-MATH-461 Topology 1.0 Course designations and approvals: Required Course

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences. Revised COURSE: COS-MATH-361 Combinatorics

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences. Revised COURSE: COS-MATH-361 Combinatorics ! ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences New Revised COURSE: COS-MATH-361 Combinatorics 1.0 Course designations and approvals: Required

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. Chester F. Carlson Center for Imaging Science

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. Chester F. Carlson Center for Imaging Science ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE Chester F. Carlson Center for Imaging Science NEW COURSE: COS-IMGS-441-Noise and System Modeling 1.0 Course Designations and Approvals

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences ! ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE School of Mathematical Sciences New Revised COURSE: COS-MATH-411 Numerical Analysis 1.0 Course designations and approvals: Required

More information

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE PROPOSAL FORM COLLEGE OF SCIENCE. Chester F. Carlson Center for Imaging Science

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE PROPOSAL FORM COLLEGE OF SCIENCE. Chester F. Carlson Center for Imaging Science ROCHESTER INSTITUTE OF TECHNOLOGY COURSE PROPOSAL FORM COLLEGE OF SCIENCE Chester F. Carlson Center for Imaging Science REVISED COURSE: COS-IMGS-682-Image Processing and Computer Vision 1.0 Course Designations

More information

Pattern Classification and Clustering Spring 2006

Pattern Classification and Clustering Spring 2006 Pattern Classification and Clustering Time: Spring 2006 Room: Instructor: Yingen Xiong Office: 621 McBryde Office Hours: Phone: 231-4212 Email: yxiong@cs.vt.edu URL: http://www.cs.vt.edu/~yxiong/pcc/ Detailed

More information

Machine Learning L, T, P, J, C 2,0,2,4,4

Machine Learning L, T, P, J, C 2,0,2,4,4 Subject Code: Objective Expected Outcomes Machine Learning L, T, P, J, C 2,0,2,4,4 It introduces theoretical foundations, algorithms, methodologies, and applications of Machine Learning and also provide

More information

BGS Training Requirement in Statistics

BGS Training Requirement in Statistics BGS Training Requirement in Statistics All BGS students are required to have an understanding of statistical methods and their application to biomedical research. Most students take BIOM611, Statistical

More information

36-350: Data Mining. Fall Lectures: Monday, Wednesday and Friday, 10:30 11:20, Porter Hall 226B

36-350: Data Mining. Fall Lectures: Monday, Wednesday and Friday, 10:30 11:20, Porter Hall 226B 36-350: Data Mining Fall 2009 Instructor: Cosma Shalizi, Statistics Dept., Baker Hall 229C, cshalizi@stat.cmu.edu Teaching Assistant: Joseph Richards, jwrichar@stat.cmu.edu Lectures: Monday, Wednesday

More information

CSC 411 MACHINE LEARNING and DATA MINING

CSC 411 MACHINE LEARNING and DATA MINING CSC 411 MACHINE LEARNING and DATA MINING Lectures: Monday, Wednesday 12-1 (section 1), 3-4 (section 2) Lecture Room: MP 134 (section 1); Bahen 1200 (section 2) Instructor (section 1): Richard Zemel Instructor

More information

Statistics and Machine Learning, Master s Programme

Statistics and Machine Learning, Master s Programme DNR LIU-2017-02005 1(9) Statistics and Machine Learning, Master s Programme 120 credits Statistics and Machine Learning, Master s Programme F7MSL Valid from: 2018 Autumn semester Determined by Board of

More information

Statistics Short Courses Faculty of Health, Arts and Design

Statistics Short Courses Faculty of Health, Arts and Design SEMESTER 2, 2017 Statistics Short Courses Faculty of Health, Arts and Design Online quizzes are available for each course. To pass the course you are expected to attend most of the classes and pass the

More information

CPSC 340: Machine Learning and Data Mining. Course Review/Preview Fall 2015

CPSC 340: Machine Learning and Data Mining. Course Review/Preview Fall 2015 CPSC 340: Machine Learning and Data Mining Course Review/Preview Fall 2015 Admin Assignment 6 due now. We will have office hours as usual next week. Final exam details: December 15: 8:30-11 (WESB 100).

More information

Machine Learning with MATLAB Antti Löytynoja Application Engineer

Machine Learning with MATLAB Antti Löytynoja Application Engineer Machine Learning with MATLAB Antti Löytynoja Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB MATLAB as an interactive

More information

LEHMAN COLLEGE OF THE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE CURRICULUM CHANGE

LEHMAN COLLEGE OF THE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE CURRICULUM CHANGE LEHMAN COLLEGE OF THE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE CURRICULUM CHANGE Name of Program and Degree Award: Mathematics, BA Hegis Number: 1701.00 Program Code:

More information

Ensembles. CS Ensembles 1

Ensembles. CS Ensembles 1 Ensembles CS 478 - Ensembles 1 A Holy Grail of Machine Learning Outputs Just a Data Set or just an explanation of the problem Automated Learner Hypothesis Input Features CS 478 - Ensembles 2 Ensembles

More information

Department of Biostatistics

Department of Biostatistics The University of Kansas 1 Department of Biostatistics The mission of the Department of Biostatistics is to provide an infrastructure of biostatistical and informatics expertise to support and enhance

More information

10701: Intro to Machine Learning. Instructors: Pradeep Ravikumar, Manuela Veloso, Teaching Assistants:

10701: Intro to Machine Learning. Instructors: Pradeep Ravikumar, Manuela Veloso, Teaching Assistants: 10701: Intro to Machine Instructors: Pradeep Ravikumar, pradeepr@cs.cmu.edu Manuela Veloso, mmv@cs.cmu.edu Teaching Assistants: Shaojie Bai shaojieb@andrew.cmu.edu Adarsh Prasad adarshp@andrew.cmu.edu

More information

A Review on Classification Techniques in Machine Learning

A Review on Classification Techniques in Machine Learning A Review on Classification Techniques in Machine Learning R. Vijaya Kumar Reddy 1, Dr. U. Ravi Babu 2 1 Research Scholar, Dept. of. CSE, Acharya Nagarjuna University, Guntur, (India) 2 Principal, DRK College

More information

Machine Learning Algorithms: A Review

Machine Learning Algorithms: A Review Machine Learning Algorithms: A Review Ayon Dey Department of CSE, Gautam Buddha University, Greater Noida, Uttar Pradesh, India Abstract In this paper, various machine learning algorithms have been discussed.

More information

University of California, Berkeley Department of Statistics Statistics Undergraduate Major Information 2018

University of California, Berkeley Department of Statistics Statistics Undergraduate Major Information 2018 University of California, Berkeley Department of Statistics Statistics Undergraduate Major Information 2018 OVERVIEW and LEARNING OUTCOMES of the STATISTICS MAJOR Statisticians help design data collection

More information

Syllabus Data Mining for Business Analytics - Managerial INFO-GB.3336, Spring 2018

Syllabus Data Mining for Business Analytics - Managerial INFO-GB.3336, Spring 2018 Syllabus Data Mining for Business Analytics - Managerial INFO-GB.3336, Spring 2018 Course information When: Mondays and Wednesdays 3-4:20pm Where: KMEC 3-65 Professor Manuel Arriaga Email: marriaga@stern.nyu.edu

More information

ST 562: Data Mining with SAS Enterprise Miner

ST 562: Data Mining with SAS Enterprise Miner ST 562: Data Mining with SAS Enterprise Miner In Workflow 1. 17ST GR Director of Curriculum (demarti4@ncsu.edu; bondell@stat.ncsu.edu) 2. 17ST Grad Head (demarti4@ncsu.edu; bondell@stat.ncsu.edu; fuentes@ncsu.edu)

More information

MACHINE LEARNING WITH SAS

MACHINE LEARNING WITH SAS This webinar will be recorded. Please engage, use the Questions function during the presentation! MACHINE LEARNING WITH SAS SAS NORDIC FANS WEBINAR 21. MARCH 2017 Gert Nissen Technical Client Manager Georg

More information

Department of Statistics and Data Science Courses

Department of Statistics and Data Science Courses Department of Statistics and Data Science Courses 1 Department of Statistics and Data Science Courses Note on Course Numbers Each Carnegie Mellon course number begins with a two-digit prefix which designates

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Hamed Pirsiavash CMSC 678 http://www.csee.umbc.edu/~hpirsiav/courses/ml_fall17 The slides are closely adapted from Subhransu Maji s slides Course background What is the

More information

Statistical Learning- Classification STAT 441/ 841, CM 764

Statistical Learning- Classification STAT 441/ 841, CM 764 Statistical Learning- Classification STAT 441/ 841, CM 764 Ali Ghodsi Department of Statistics and Actuarial Science University of Waterloo aghodsib@uwaterloo.ca Two Paradigms Classical Statistics Infer

More information

COMP 551 Applied Machine Learning Lecture 11: Ensemble learning

COMP 551 Applied Machine Learning Lecture 11: Ensemble learning COMP 551 Applied Machine Learning Lecture 11: Ensemble learning Instructor: Herke van Hoof (herke.vanhoof@mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~hvanho2/comp551

More information

MD - Data Mining

MD - Data Mining Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 017 70 - FIB - Barcelona School of Informatics 715 - EIO - Department of Statistics and Operations Research 73 - CS - Department of

More information

STA 414/2104 Statistical Methods for Machine Learning and Data Mining

STA 414/2104 Statistical Methods for Machine Learning and Data Mining STA 414/2104 Statistical Methods for Machine Learning and Data Mining Radford M. Neal, University of Toronto, 2014 Week 1 What are Machine Learning and Data Mining? Typical Machine Learning and Data Mining

More information

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics. Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

More information

Course Guide Year GENERAL INFORMATION Course information Name. Machine Learning Code

Course Guide Year GENERAL INFORMATION Course information Name. Machine Learning Code Course Guide Year 2017-2018 ESCUELA TÉCNICA SUPERIOR DE INGENIERÍA GENERAL INFORMATION Course information Name Machine Learning Code DOI-MIC-515 Degree MIC, MII, MIT Year Semester Spring ECTS credits 6

More information

Python Machine Learning

Python Machine Learning Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

More information

Machine Learning for Computer Vision

Machine Learning for Computer Vision Computer Group Prof. Daniel Cremers Machine Learning for Computer PD Dr. Rudolph Triebel Lecturers PD Dr. Rudolph Triebel rudolph.triebel@in.tum.de Room number 02.09.059 Main lecture MSc. Ioannis John

More information

White Paper. Using Sentiment Analysis for Gaining Actionable Insights

White Paper. Using Sentiment Analysis for Gaining Actionable Insights corevalue.net info@corevalue.net White Paper Using Sentiment Analysis for Gaining Actionable Insights Sentiment analysis is a growing business trend that allows companies to better understand their brand,

More information

Unsupervised Learning: Clustering

Unsupervised Learning: Clustering Unsupervised Learning: Clustering Vibhav Gogate The University of Texas at Dallas Slides adapted from Carlos Guestrin, Dan Klein & Luke Zettlemoyer Machine Learning Supervised Learning Unsupervised Learning

More information

Statistics for Risk Modeling Exam September 2018

Statistics for Risk Modeling Exam September 2018 Statistics for Risk Modeling Exam September 2018 IMPORTANT NOTICE This version of the syllabus is final, though minor changes may occur. This March 2018 version includes updates to this page and to the

More information

COMP 551 Applied Machine Learning Lecture 12: Ensemble learning

COMP 551 Applied Machine Learning Lecture 12: Ensemble learning COMP 551 Applied Machine Learning Lecture 12: Ensemble learning Associate Instructor: Herke van Hoof (herke.vanhoof@mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551

More information

Machine Learning and Applications in Finance

Machine Learning and Applications in Finance Machine Learning and Applications in Finance Christian Hesse 1,2,* 1 Autobahn Equity Europe, Global Markets Equity, Deutsche Bank AG, London, UK christian-a.hesse@db.com 2 Department of Computer Science,

More information

Session 1: Gesture Recognition & Machine Learning Fundamentals

Session 1: Gesture Recognition & Machine Learning Fundamentals IAP Gesture Recognition Workshop Session 1: Gesture Recognition & Machine Learning Fundamentals Nicholas Gillian Responsive Environments, MIT Media Lab Tuesday 8th January, 2013 My Research My Research

More information

ECE-271A Statistical Learning I

ECE-271A Statistical Learning I ECE-271A Statistical Learning I Nuno Vasconcelos ECE Department, UCSD The course the course is an introductory level course in statistical learning by introductory I mean that you will not need any previous

More information

GENERAL BUSINESS (GEN BUS)

GENERAL BUSINESS (GEN BUS) General Business (GEN BUS) 1 GENERAL (GEN BUS) GEN BUS 100 INTRODUCTION TO Introduction to the basic concepts, practices and analytical methods that are part of the market enterprise system. Overview of

More information

Ensemble Learning CS534

Ensemble Learning CS534 Ensemble Learning CS534 Ensemble Learning How to generate ensembles? There have been a wide range of methods developed We will study some popular approaches Bagging ( and Random Forest, a variant that

More information

Programming Social Robots for Human Interaction. Lecture 4: Machine Learning and Pattern Recognition
