CS434 Machine Learning and Data Mining. Fall 2013


Administrative Trivia
- Instructor: Dr. Xiaoli Fern, web.engr.oregonstate.edu/~xfern. Office hour: one hour before class or by appointment
- TA: Zahra Iman (zahra.iman87@gmail.com). Office hour: TBA
- Course webpage: classes.engr.oregonstate.edu/eecs/fall2013/cs434
- Please check the course webpage frequently for:
  - Learning objectives, syllabus, course policy
  - Course-related announcements
  - Assignments

Briefly
- Grading (tentative):
  - Homework and class participation: 40%
  - Exam (one midterm): 25%
  - Final project: 35%
- Homework is due at the beginning of class (the first 5 minutes)
- Late submissions are accepted within 24 hours but receive only 80%
- Collaboration policy (for solo assignments):
  - Verbal discussion of general approaches and strategies is allowed
  - You can talk about examples that are not in the assignments
  - Anything you turn in has to be created by you and you alone
- For team assignments, the above policies apply between teams
- Team assignment submissions must also indicate the individual roles

Course materials
- No textbook is required; slides and reading materials will be provided on the course webpage
- A few recommended books are good references:
  - Machine Learning by Tom Mitchell (TM): slightly out of date but a good intro to some topics
  - Pattern Recognition and Machine Learning by Chris Bishop (Bishop): dense material

What is learning?
- Generally speaking: "any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same distribution" (Herbert Simon, one of the founding fathers of AI and a Turing Award winner)

Machine learning
- Learning = improving with experience at some task
- Improve over task T with respect to performance measure P, based on experience E
- (Diagram labels: Task T, Performance P, Experience E, Learning Algorithm)

When do we need a computer to learn?
- A program that does tax returns
- A program that looks up phone numbers in a phone directory

When do we need learning?
- Sometimes there is no human expert knowledge
  - Predict whether a new compound will be effective for treating some disease
  - Predict whether two profiles on match.com would be a good match (or does this belong in the next category?)
- Sometimes humans can do it but can't describe how they do it
  - Recognize visual objects
  - Speech recognition
- Sometimes the things we need to learn change frequently
  - Stock market analysis, weather forecasting, computer network routing
- Sometimes the thing we need to learn needs customization
  - Spam filters, movie/product recommendation

Sub-fields of Interest
- Supervised learning: learn to predict (regression and classification)
- Unsupervised learning: learn to understand and describe the data (clustering, frequent pattern mining)
- Reinforcement learning: learn to act
- Data mining: a highly overlapping concept, but with a heavier focus on obtaining useful knowledge from large volumes of data

Supervised Learning: example
- Learn to predict output from input
  - Output can be continuous (regression) or discrete (classification)
  - E.g., predict the risk level (high vs. low) of a loan applicant based on income and savings
- MANY successful applications!
  - Spam filters
  - Collaborative filtering (predicting whether a customer will be interested in an advertisement)
  - Ecology (predicting the presence or absence of a species in a certain environment)
  - Medical diagnosis
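The loan example can be sketched in a few lines. This is only an illustration, not the method used in the course: it assumes a 1-nearest-neighbor classifier, and the training data and the name `predict_risk` are invented for the example.

```python
import math

# Hypothetical training data: (income, savings) in thousands -> risk label.
# These numbers are made up purely for illustration.
TRAIN = [
    ((20.0, 1.0), "high"),
    ((25.0, 3.0), "high"),
    ((30.0, 2.0), "high"),
    ((70.0, 40.0), "low"),
    ((85.0, 30.0), "low"),
    ((90.0, 60.0), "low"),
]

def predict_risk(applicant, train=TRAIN):
    """Return the label of the training example closest to `applicant`."""
    _, nearest_label = min(
        train, key=lambda example: math.dist(example[0], applicant)
    )
    return nearest_label

print(predict_risk((28.0, 2.5)))   # applicant near the low-income examples
print(predict_risk((80.0, 45.0)))  # applicant near the high-income examples
```

The input is the pair (income, savings), the output is a discrete label, so this is a classification problem; predicting a numeric risk score instead would make it regression.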

Unsupervised learning
- Find patterns and structure in data
- Example: clustering

Example Applications
- Market partition: divide a market into distinct subsets of customers
  - Find clusters of similar customers, where each cluster may conceivably be selected as a market target to be reached with a distinct marketing strategy
- Automatic organization of information
  - Automatic organization of images
  - Generate a categorized view of a collection of documents
  - Organize search results to diversify results
- Scientific applications
  - Bioinformatics: clustering genes based on their expression profiles to find clusters of similarly regulated genes (functional groups)
  - Atmospheric science: clustering temporal signals (e.g., temperature, wind, pressure) to find different weather regimes
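As a rough illustration of what "finding clusters of similar customers" can mean, here is a minimal sketch of k-means (Lloyd's algorithm). The slides do not name a specific clustering algorithm at this point, so the choice of k-means, the customer data, and the starting centers are all assumptions.

```python
import math

def kmeans(points, centers, iterations=10):
    """Lloyd's k-means: alternate an assignment step and a centroid update."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centers, clusters

# Hypothetical customers described by two made-up features.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[points[0], points[3]])
print(centers)
print(clusters)
```

No labels are given to the algorithm; the two groups emerge only from the distances between points, which is what distinguishes this from the supervised setting.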

Reinforcement learning

Example Applications
- Robotics
  - Gait control for robotic legs
  - Routing of a robot in a complex environment
- Controls
  - Helicopter control, autonomous vehicles
- Operations research
  - Automatic pricing of internet advertisements
- AI game agents
  - Real-time strategy game agents
  - Go, chess, ...

Course Learning Objectives
1. Students are able to apply supervised learning algorithms to prediction problems and evaluate the results.
2. Students are able to apply unsupervised learning algorithms to data analysis problems and evaluate the results.
3. Students are able to apply reinforcement learning algorithms to control problems and evaluate the results.
4. Students are able to take a description of a new problem and decide what kind of problem (supervised, unsupervised, or reinforcement) it is.

Example: Learning to play checkers
- Task: play checkers
- Performance: percent of games won in the world tournament
- To design a learning system for this task, we need to consider:
  - What experience to learn from? (the training data)
  - What exactly should we learn? (the target function)
  - How should we represent the thing that we are learning? (the representation of the target function)
  - What type of learning is it (supervised, unsupervised, or reinforcement learning), and what specific algorithm should we use?

Type of training experience
- Direct training (like watching a master play)
  - For a given board state, we observe a best move for that position
  - Observe many states and many moves (these will be our training data)
  - Try to learn a formula of some sort that tells us the best move for any arbitrary state
  - This fits in supervised learning
- Indirect training (like learning by playing)
  - Just observe a sequence of plays and the end result
  - More difficult: which moves were the bad (good) ones in a bad (good) game?
  - This is the credit assignment problem, which is challenging to solve
  - This is more like reinforcement learning

Choose the Target Function (what should we learn)
- ChooseMove: board state -> move? (supervised learning)
- V: board state -> reward (value of the state)? (reinforcement learning)
  - If you know the value of all possible states, then at any state you can choose a move that leads to the best next state
  - This is more similar to how people understand the game
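The idea that a value function V is enough to pick moves can be sketched as follows. The function names (`choose_move`, `legal_moves`, `apply_move`, `value`) and the toy counting game are hypothetical stand-ins; a real checkers program would supply its own move generator and board representation.

```python
def choose_move(state, legal_moves, apply_move, value):
    """Pick the move whose successor state has the highest value V."""
    return max(legal_moves(state), key=lambda m: value(apply_move(state, m)))

# Toy stand-in for a game: the state is an integer, a move adds 1, 2, or 3,
# and states closer to 10 are considered more valuable.
def legal_moves(state):
    return [1, 2, 3]

def apply_move(state, move):
    return state + move

def value(state):
    return -abs(10 - state)

print(choose_move(7, legal_moves, apply_move, value))
print(choose_move(9, legal_moves, apply_move, value))
```

With V in hand, no separate ChooseMove function has to be learned: move selection reduces to a one-step lookahead over the successor states.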

Possible definition for target function V
- If b is a final board state that is won, V(b) = 100
- If b is a final board state that is lost, V(b) = -100
- If b is a final board state that is drawn, V(b) = 0
- If b is not a final board state, then V(b) = V(b'), where b' is the best possible final state reachable from b
- This gives correct values, but is not operational
- A more practical approach is to compute a set of features describing the board state and let the value of the board state be a function of these features
- Features can be: # of black pieces, # of red pieces, # of black king pieces, ...
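Computing such features from a board could look like the sketch below. The board encoding is an assumption made for this example ('b'/'r' for ordinary black/red pieces, 'B'/'R' for kings, '.' for empty squares), and ordinary pieces are counted separately from kings, which the slide leaves open.

```python
def board_features(board):
    """Count piece types on a board given as a list of row strings.

    Hypothetical encoding: 'b'/'r' = ordinary black/red piece,
    'B'/'R' = black/red king, '.' = empty square.
    """
    cells = "".join(board)
    return {
        "black_pieces": cells.count("b"),
        "red_pieces": cells.count("r"),
        "black_kings": cells.count("B"),
        "red_kings": cells.count("R"),
    }

# An invented mid-game position, used only to exercise the function.
board = [
    ".b.b.b.b",
    "b.b.b.b.",
    ".b.B....",
    "........",
    "........",
    "r.r.R...",
    ".r.r.r.r",
    "r.r.r.r.",
]
print(board_features(board))
```

Unlike the recursive definition of V, these counts can be computed directly from the current board, which is what makes a feature-based value function operational.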

Choose representation for target function
- Linear function of the board features?
  $$V(b) = w_0 + w_1 f_1(b) + w_2 f_2(b) + \cdots + w_n f_n(b)$$
- Polynomial function of the board features?
  $$V(b) = w_0 + w_1 f_1(b) + w_2 f_2(b) + w_3 f_1^2(b) + w_4 f_2^2(b) + w_5 f_1(b) f_2(b)$$
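Both representations are simple to evaluate once the feature values are known. The sketch below assumes the features have already been computed as numbers; the function names and the weights are made up, and the polynomial case is written for two features with squared and cross terms.

```python
def linear_value(weights, features):
    """w0 + w1*f1 + ... + wn*fn; `weights` has one more entry than `features`."""
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * f for w, f in zip(rest, features))

def quadratic_value(weights, f1, f2):
    """w0 + w1*f1 + w2*f2 + w3*f1^2 + w4*f2^2 + w5*f1*f2 for two features."""
    w0, w1, w2, w3, w4, w5 = weights
    return w0 + w1 * f1 + w2 * f2 + w3 * f1**2 + w4 * f2**2 + w5 * f1 * f2

# Made-up weights; f1 and f2 could be, e.g., the numbers of black and red pieces.
print(linear_value([1.0, 2.0, -2.0], [9, 10]))
print(quadratic_value([0.0, 1.0, -1.0, 0.5, 0.5, -1.0], 3, 2))
```

Learning then amounts to choosing the weights; the polynomial form is more expressive but has more weights to fit for the same number of features.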

A diagram of design choices
- In this class, you will become familiar with many of these choices, and even try them in practice.
- We would like to prepare you so that you can make good design choices when facing a new learning problem!

For next lecture
- A small exercise for you to do
- Please take some measurements of yourself (all in cm) and send me the results by tomorrow:
  - Your height
  - Your knee height
  - Your arm span (spread your arms out and measure the length from fingertip to fingertip)