TDT4173 Machine Learning and Case-Based Reasoning


Lecture 1: Introduction
Agnar Aamodt and Helge Langseth
Norwegian University of Science and Technology

Outline
1 Introduction to Machine learning
   - Machine learning overview
   - Examples
   - The Learning Problem
2 Concept Learning From Examples
   - EnjoySport - example
   - The Inductive Learning Hypothesis
   - Find-S
   - Version Spaces
   - Candidate Elimination Algorithm
   - Summary
3 Practical information
   - About TDT4173
   - The other stuff

The grand vision
- An autonomous self-moving machine that acts, reasons, and learns like a human
- We are still very far from achieving this...

What is machine learning?
What is learning?
- Any process by which a system improves performance (H. Simon)
- Making useful changes in our minds (M. Minsky)
- Constructing or modifying representations of what is being experienced (R. Michalski)

What is machine learning?
Why study learning in computers?
- To model learning in human beings
- To study learning as a theoretical phenomenon
- To automate the development and maintenance of computer systems


What is machine learning?
Methods and techniques that enable computers to improve their performance through their own experience.
This is a basic definition. What about
- knowledge structures and representations?
- reasoning and reflection?
- explanation capabilities?
- performance vs. competence?

Why Machine Learning
- Recent progress in algorithms and theory
- Growing flood of online data
- Computational power is available
- Budding industry
Three niches for machine learning:
- Data mining: using historical data to improve decisions
  - medical records → medical knowledge
- Software applications we can't program by hand
  - autonomous driving
  - speech recognition
- Self-customizing programs
  - recommendation systems

Typical Datamining Task
Data: records for Patient103 at successive times
Feature                 time=1  time=2    time=n
Age                     23      23        23
FirstPregnancy          no      no        no
Anemia                  no      no        no
Diabetes                no      YES       no
PreviousPrematureBirth  no      no        no
Ultrasound              ?       abnormal  ?
Elective C-Section      ?       no        no
Emergency C-Section     ?       ?         Yes
...
Given:
- 9714 patient records, each describing a pregnancy and birth
- Each patient record contains 215 features
Learn to predict:
- Classes of future patients at high risk for Emergency Cesarean Section

Datamining Result (figure)

Problems Too Difficult to Program by Hand
ALVINN [Pomerleau] drives 70 mph on highways.
Network shown in the figure:
- 30x32 Sensor Input Retina
- 4 Hidden Units
- 30 Output Units, ranging from Sharp Left through Straight Ahead to Sharp Right

Example: DARPA Urban Challenge
- Autonomous vehicle research and development program
- Vehicles maneuvering in a mock city environment, executing simulated military supply missions while merging into moving traffic, navigating traffic circles, negotiating busy intersections, and avoiding obstacles
- Winner: Tartan Racing

The utility-based agent... and beyond (figure)

Software that Customizes to User
http://www.last.fm

Where Is this Headed?
Today: tip of the iceberg
- First-generation algorithms: neural nets, decision trees, regression...
- Applied to well-formatted databases
- Budding industry
Opportunity for tomorrow: enormous impact
- Learn across full mixed-media data
- Learn by active experimentation
- Cumulative, lifelong learning
- Programming languages with learning embedded?
- ... etc. (Only your imagination limits this list!)

Introduction: Machine Learning (Ch. 1)
Learning = Improving with experience at some task
- Improve over task T,
- with respect to performance measure P,
- based on experience E.

Learning to Play Checkers
- T: Play checkers
- P: Percent of games won in world tournament
- E: Opportunity to play against self

Design Choices
- What experience can we learn from?
- What exactly should be learned?
- How shall it be represented? Target function: collection of rules? neural network? polynomial function of board features? ...
- What specific algorithm can we use to learn it?


Type of knowledge learned
We wish to learn a function that, for any given board position B, chooses the best move M: $\mathit{ChooseMove}: B \to M$.
- Direct training: Examples of individual checkers board states and the correct move for each.
- Indirect training: Examples of sequences of moves and final outcomes of the various games played.
Indirect training makes ChooseMove impractical to learn: if we end up winning, is the first move then optimal?


Approximation: The start of the learning work
Instead of ChooseMove, we establish a value function $V: B \to \mathbb{R}$ that maps legal board states B to real values.
Playing rule: For any board position, choose the move that maximizes the value of the resulting board position.
1 If b is a final board state that is won, then $V(b) = 100$.
2 If b is a final board state that is lost, then $V(b) = -100$.
3 If b is a final board state that is drawn, then $V(b) = 0$.
4 If b is not a final state in the game, then $V(b) = V(b')$, where $b'$ is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
It is still not trivial...
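The playing rule above can be sketched in a few lines. This is a minimal illustration rather than the course's implementation: `choose_move`, `legal_moves`, `apply_move`, and the toy number game used to exercise it are hypothetical names introduced here.

```python
from typing import Callable, Iterable, TypeVar

Board = TypeVar("Board")
Move = TypeVar("Move")


def choose_move(
    board: Board,
    legal_moves: Callable[[Board], Iterable[Move]],
    apply_move: Callable[[Board, Move], Board],
    value: Callable[[Board], float],
) -> Move:
    """Return the legal move whose resulting board has the highest value."""
    return max(legal_moves(board), key=lambda move: value(apply_move(board, move)))


# Toy stand-in for a game (hypothetical): a "board" is an integer, a move adds
# 1, 2, or 3 to it, and boards closer to 5 are valued higher.
best = choose_move(
    3,
    legal_moves=lambda b: [1, 2, 3],
    apply_move=lambda b, m: b + m,
    value=lambda b: -abs(b - 5),
)
print(best)  # 2, because 3 + 2 = 5 has the highest value
```

The rule only needs the value of the boards one move ahead, which is why learning V (or an approximation of it) is enough to play.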


What is of importance?
- $x_1$: # black pieces
- $x_2$: # white pieces
- $x_3$: # black kings
- $x_4$: # white kings
- $x_5$: # white pieces threatened
- $x_6$: # black pieces threatened
Approximation: $\hat{V}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$, where $w_i$ is the weight assigned to $x_i$.
Learning task: Determine the weights $w_0, w_1, w_2, w_3, w_4, w_5$, and $w_6$.
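As a small sketch of the linear approximation above, the function below evaluates $\hat{V}(b)$ from a weight vector and the six board features. The function name `v_hat` and the example weights and feature counts are assumptions chosen only for illustration.

```python
from typing import Sequence


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation w0 + w1*x1 + ... + wn*xn for features x1..xn."""
    if len(weights) != len(features) + 1:
        raise ValueError("expected one more weight (w0) than features")
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Hypothetical weights w0..w6 and feature values x1..x6 for one board.
example_weights = [0.0, 1.0, -1.0, 3.0, -3.0, 0.5, -0.5]
example_features = [12, 12, 0, 0, 1, 2]
example_value = v_hat(example_weights, example_features)
print(example_value)  # -0.5
```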


How to learn
In order to learn $\hat{V}$, we require a set of training examples, each describing a board state b and a training value $V_{train}(b)$ for b.
- Current weights: $w_0, w_1, w_2, w_3, w_4, w_5$, and $w_6$ yield $\hat{V}$.
- New game: $b_1, b_2, \dots, b_{end}$
- What value $V_{train}(b_i)$ should we attach to position $b_i$?
Idea: When faced with a situation $b_k$, both players do the best they can, resulting in $b_{end}$.
In general: $V_{train}(b_i) \leftarrow \hat{V}(b_{i+1})$
This makes sense if $\hat{V}$ is more accurate for board states closer to the end of the game.

Ehhh... And what does this mean?
- Current state: $b_i$. Next state: $b_{i+1}$.
- (The system believes that) situation $b_i \Rightarrow b_{i+1}$. Therefore $V_{train}(b_i) = V(b_{i+1})$.
- $V(b_{i+1})$ is unknown, but assuming the system is very good, we have $\hat{V}(b_{i+1}) \approx V(b_{i+1})$.
- Thus, we decide that $V_{train}(b_i) \leftarrow \hat{V}(b_{i+1})$.
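The rule $V_{train}(b_i) \leftarrow \hat{V}(b_{i+1})$ can be sketched as a pass over one game trace. This is an illustrative sketch under assumptions: the function name `training_values`, the string board labels, and the lookup table standing in for the current $\hat{V}$ are all hypothetical. The final state is given its known game outcome.

```python
from typing import Callable, Hashable, List, Sequence, Tuple


def training_values(
    states: Sequence[Hashable],
    v_hat: Callable[[Hashable], float],
    final_value: float,
) -> List[Tuple[Hashable, float]]:
    """Pair each state with v_hat of its successor; the last state gets final_value."""
    pairs = []
    for i, state in enumerate(states):
        if i + 1 < len(states):
            target = v_hat(states[i + 1])  # estimate borrowed from the next state
        else:
            target = final_value  # the true value of a finished game is known
        pairs.append((state, target))
    return pairs


# Hypothetical current estimates for a three-position game that ends in a win.
current_estimates = {"b1": 10.0, "b2": 40.0, "b_end": 90.0}
pairs = training_values(["b1", "b2", "b_end"], current_estimates.get, final_value=100.0)
print(pairs)  # [('b1', 40.0), ('b2', 90.0), ('b_end', 100.0)]
```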

And will it work?
Can we get reasonable training data?
- We know what $V(b_{end})$ is for any state $b_{end}$.
- Using the previous setup, we should therefore be able to value situations that are one step away from being finished!
- ...and using the same setup again, we should next be able to value situations that are two steps away from being finished
- ...and so on.
This should work, but we need to be able to use the training data...


How to learn the weights
- Current weights: $w_0, w_1, w_2, w_3, w_4, w_5$, and $w_6$ yield $\hat{V}$.
- New game: $\langle b_1, V_{train}(b_1)\rangle, \dots, \langle b_{end}, V_{train}(b_{end})\rangle$.
Idea: Introduce an error function E, and change the weights such that the total error over all training examples is minimal:
$E = \sum_{\langle b, V_{train}(b)\rangle \in \text{training examples}} \left(V_{train}(b) - \hat{V}(b)\right)^2$
Note: E is a function of the weights, $E = E(w_0, w_1, w_2, w_3, w_4, w_5, w_6)$, and we will change the weights to make E attain its minimal value.
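As a worked step, the sketch below differentiates E with respect to a single weight, assuming the linear form of $\hat{V}$. The convention $x_0 = 1$ for the weight $w_0$ and the step size $\mu/2$ (so that the constant 2 cancels) are assumptions made here for illustration.

```latex
\begin{align*}
\hat{V}(b) &= \sum_{i=0}^{6} w_i x_i, \qquad x_0 = 1 \\
\frac{\partial E}{\partial w_i}
  &= \sum_{\langle b, V_{train}(b)\rangle} 2\left(V_{train}(b) - \hat{V}(b)\right)\left(-x_i\right)
   = -2 \sum_{\langle b, V_{train}(b)\rangle} x_i \left(V_{train}(b) - \hat{V}(b)\right) \\
w_i &\leftarrow w_i - \frac{\mu}{2}\,\frac{\partial E}{\partial w_i}
   = w_i + \mu \sum_{\langle b, V_{train}(b)\rangle} x_i \left(V_{train}(b) - \hat{V}(b)\right)
\end{align*}
```

Taking this step for one training example at a time, instead of for the whole sum, gives a per-example update in which each weight moves in proportion to its feature value and the current prediction error.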

LMS Weight update rule
For each training example $\langle b, V_{train}(b)\rangle$ do:
- Use the current weights to calculate $\hat{V}(b)$.
- For each weight $w_i$ do: $w_i \leftarrow w_i + \mu \, x_i \left(V_{train}(b) - \hat{V}(b)\right)$
where $\mu$ is the learning rate.
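A minimal sketch of one LMS step for a single training example follows. The function name `lms_update`, the default learning rate, and the convention of pairing $w_0$ with a constant feature $x_0 = 1$ are assumptions introduced for this example.

```python
from typing import List, Sequence


def lms_update(
    weights: Sequence[float],
    features: Sequence[float],
    v_train: float,
    mu: float = 0.1,
) -> List[float]:
    """Return the weights after one LMS step on a single training example."""
    x = [1.0] + list(features)  # x0 = 1 pairs with the weight w0 (assumption)
    prediction = sum(w * xi for w, xi in zip(weights, x))
    error = v_train - prediction
    return [w + mu * xi * error for w, xi in zip(weights, x)]


# Hypothetical two-feature example: the error shrinks from 10 to 4 after one step.
old_weights = [0.0, 0.0, 0.0]
new_weights = lms_update(old_weights, features=[1, 2], v_train=10.0, mu=0.1)
new_prediction = new_weights[0] + new_weights[1] * 1 + new_weights[2] * 2
```

Repeating the step over many training examples moves the weights toward values with a smaller squared error, provided the learning rate is small enough.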

http://folk.ntnu.no/thomame/checkers/

Design Choices
1 Determine Type of Training Experience
   - Games against experts
   - Games against self
   - Table of correct moves
   - ...
2 Determine Target Function
   - Board → move
   - Board → value
   - ...
3 Determine Representation of Learned Function
   - Polynomial
   - Linear function of six features
   - Artificial neural network
   - ...
4 Determine Learning Algorithm
   - Gradient descent
   - Linear programming
   - ...
5 Completed Design

About TDT4173
Goals of the course: The course will give a basic insight into principles and methods for how computer systems can learn from their own experience.
Syllabus:
- The textbook Machine Learning by Tom Mitchell
- A number of papers, to be decided and made available
How to get it:
- Book available at Tapir
- Papers will be made available for download from our webpage

Exercises
- Designed to give hands-on experience with the different machine learning methods we talk about
- Will contain both coding tasks and discussion requirements
- Typically given with a two-week deadline
NB! Counts towards final grade
All exercises count towards the final grade: if you fail one assignment, you automatically lose 3.3% of the total available score.

Paper presentation
- A number of classic texts
- Papers to be presented by students
NB! Counts towards final grade
Each student must participate in presenting at least one paper. This counts as one exercise (out of 6).

Getting information
Sources for information:
- Check the web page http://www.idi.ntnu.no/emner/tdt4173/
- Check that you are registered in It's Learning
- Ask the course assistant, Shengtong Zhong, or the lecturer if you have problems (contact info on the web page)

Reference group
We need two students to volunteer to be in the reference group.
Not much work (if all goes well):
- Evaluation meeting(s)
- Evaluation report
- Students' spokesman if there is something we should take into account

Concept Learning (Ch. 2)
Training Examples for EnjoySport
Index  Sky    Temp  Humid   Wind    Water  Forecast  EnjoySport
1      Sunny  Warm  Normal  Strong  Warm   Same      Yes
2      Sunny  Warm  High    Strong  Warm   Same      Yes
3      Rainy  Cold  High    Strong  Warm   Change    No
4      Sunny  Warm  High    Strong  Cool   Change    Yes
What is the general concept?

Training Examples for EnjoySport: what is the general concept?
Candidate concepts that fit the four training examples:
- Sky = Sunny?
- Sky = Sunny AND Temp = Warm?
- Forecast = Same OR Water = Cool?
- The Index, written in binary digits, contains exactly one 1?

Representing Hypotheses
Many possible representations. Here, h is a conjunction of constraints on attributes. Each constraint can be
- a specific value (e.g., Water = Warm)
- don't care (e.g., Water = ?)
- no value allowed (e.g., Water = ∅)
For example:
Sky    AirTemp  Humid  Wind    Water  Forecast
Sunny  ?        ?      Strong  ?      Same
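One way to make this representation concrete is a tuple of constraints plus a matching test. The sketch below is an assumption-laden illustration: the names `ANY`, `NO_VALUE`, and `matches` are introduced here, and `None` is used as a stand-in for the "no value allowed" constraint.

```python
from typing import Optional, Sequence

ANY = "?"        # "don't care" constraint
NO_VALUE = None  # "no value allowed" constraint (stand-in for the empty set)


def matches(hypothesis: Sequence[Optional[str]], instance: Sequence[str]) -> bool:
    """True if the instance satisfies every attribute constraint of the hypothesis."""
    return all(
        constraint is not NO_VALUE and (constraint == ANY or constraint == value)
        for constraint, value in zip(hypothesis, instance)
    )


# The example hypothesis from the slide: <Sunny, ?, ?, Strong, ?, Same>
h = ("Sunny", ANY, ANY, "Strong", ANY, "Same")
day_1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day_3 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")
print(matches(h, day_1), matches(h, day_3))  # True False
```

A hypothesis that contains `NO_VALUE` in any position matches no instance at all, so it classifies every day as negative.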


Prototypical Concept Learning Task
Given:
- Instances X: Possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
- Target function c: $\mathit{EnjoySport}: X \to \{0, 1\}$
- Hypotheses H: Conjunctions of literals, e.g., <?, Cold, High, ?, ?, ?>
- Training examples D: Positive and negative examples of the target function, $\langle x_1, c(x_1)\rangle, \dots, \langle x_m, c(x_m)\rangle$
Determine: a hypothesis $h \in H$ such that $\forall x \in D: h(x) = c(x)$.
The inductive learning hypothesis: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

Instances, Hypotheses, and More-General-Than
(Figure: instances X on the left, hypotheses H on the right, ordered from specific to general.)
x_1 = <Sunny, Warm, High, Strong, Cool, Same>
x_2 = <Sunny, Warm, High, Light, Warm, Same>
h_1 = <Sunny, ?, ?, Strong, ?, ?>
h_2 = <Sunny, ?, ?, ?, ?, ?>
h_3 = <Sunny, ?, ?, ?, Cool, ?>
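The more-general-than-or-equal relation between two conjunctive hypotheses can be checked attribute by attribute. The sketch below uses the three hypotheses from the slide; the function name `more_general_or_equal` and the use of `None` for the "no value allowed" constraint are assumptions made for illustration.

```python
from typing import Optional, Sequence

Hypothesis = Sequence[Optional[str]]


def more_general_or_equal(h_a: Hypothesis, h_b: Hypothesis) -> bool:
    """True if every instance that satisfies h_b also satisfies h_a."""
    if any(constraint is None for constraint in h_b):
        return True  # h_b matches no instance, so anything is at least as general
    return all(a == "?" or a == b for a, b in zip(h_a, h_b))


h_1 = ("Sunny", "?", "?", "Strong", "?", "?")
h_2 = ("Sunny", "?", "?", "?", "?", "?")
h_3 = ("Sunny", "?", "?", "?", "Cool", "?")

print(more_general_or_equal(h_2, h_1))  # True
print(more_general_or_equal(h_2, h_3))  # True
print(more_general_or_equal(h_1, h_3))  # False
print(more_general_or_equal(h_3, h_1))  # False
```

The last two results show that the relation is only a partial order: h_1 and h_3 are not comparable, while h_2 is more general than both.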

Find-S Algorithm
1 Initialize h to the most specific hypothesis in H
2 For each positive training instance x
    For each attribute constraint a_i in h
      If the constraint a_i in h is satisfied by x
      Then do nothing
      Else replace a_i in h by the next more general constraint that is satisfied by x
3 Output hypothesis h

Hypothesis Space Search by Find-S
(Figure: instances X on the left, hypotheses H on the right, ordered from specific to general.)
x_1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +
x_2 = <Sunny, Warm, High, Strong, Warm, Same>, +
x_3 = <Rainy, Cold, High, Strong, Warm, Change>, -
x_4 = <Sunny, Warm, High, Strong, Cool, Change>, +
h_0 = <∅, ∅, ∅, ∅, ∅, ∅>
h_1 = <Sunny, Warm, Normal, Strong, Warm, Same>
h_2 = <Sunny, Warm, ?, Strong, Warm, Same>
h_3 = <Sunny, Warm, ?, Strong, Warm, Same>
h_4 = <Sunny, Warm, ?, Strong, ?, ?>
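The Find-S search traced above can be sketched for conjunctive hypotheses as follows. This is an illustrative sketch, not the lecturers' code: the name `find_s` and the encoding (`None` for ∅, `"?"` for don't care, a boolean label per example) are assumptions.

```python
from typing import List, Optional, Sequence, Tuple

Example = Tuple[Sequence[str], bool]


def find_s(examples: Sequence[Example]) -> Tuple[Optional[str], ...]:
    """Return the maximally specific conjunctive hypothesis covering all positives."""
    n_attributes = len(examples[0][0])
    h: List[Optional[str]] = [None] * n_attributes  # most specific hypothesis
    for instance, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        for i, value in enumerate(instance):
            if h[i] is None:
                h[i] = value  # first positive example: adopt its value
            elif h[i] != "?" and h[i] != value:
                h[i] = "?"  # conflicting values: generalize to don't care
    return tuple(h)


enjoy_sport = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
h_final = find_s(enjoy_sport)
print(h_final)  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

The result agrees with h_4 in the trace. Because the negative example is skipped, the function would return the same hypothesis even if that example contradicted it.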

Complaints about Find-S
- Can't tell whether it has learned the concept
- Can't tell when the training data are inconsistent
- Picks a maximally specific h (why?)
- Depending on H, there might be several!

Version Spaces
A hypothesis h is consistent with a set of training examples D of target concept c if and only if $h(x) = c(x)$ for each training example $\langle x, c(x)\rangle$ in D:
$\mathit{Consistent}(h, D) \equiv (\forall \langle x, c(x)\rangle \in D)\; h(x) = c(x)$
The version space, $VS_{H,D}$, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D:
$VS_{H,D} \equiv \{h \in H \mid \mathit{Consistent}(h, D)\}$

The List-Then-Eliminate Algorithm
1 VersionSpace ← a list containing every hypothesis in H
2 For each training example $\langle x, c(x)\rangle$, remove from VersionSpace any hypothesis h for which $h(x) \neq c(x)$
3 Output the list of hypotheses in VersionSpace
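For a hypothesis space as small as EnjoySport, List-Then-Eliminate can be run by brute force. The sketch below is illustrative only. The attribute domains in `DOMAINS` are assumptions (the slides never list them in full; in particular "Cloudy" does not appear in the slides and "Light" is taken from the example instances), as are the names `matches` and `list_then_eliminate`.

```python
from itertools import product

# Assumed attribute domains: Sky, AirTemp, Humidity, Wind, Water, Forecast.
DOMAINS = [
    ("Sunny", "Cloudy", "Rainy"),
    ("Warm", "Cold"),
    ("Normal", "High"),
    ("Strong", "Light"),
    ("Warm", "Cool"),
    ("Same", "Change"),
]


def matches(hypothesis, instance):
    """True if the instance satisfies every constraint (None never matches)."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))


def list_then_eliminate(hypotheses, examples):
    """Keep only the hypotheses that classify every training example correctly."""
    version_space = list(hypotheses)
    for instance, label in examples:
        version_space = [h for h in version_space if matches(h, instance) == label]
    return version_space


# Every conjunctive hypothesis, plus the single all-empty hypothesis.
EMPTY = (None,) * len(DOMAINS)
ALL_HYPOTHESES = [EMPTY] + list(product(*[values + ("?",) for values in DOMAINS]))

enjoy_sport = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
version_space = list_then_eliminate(ALL_HYPOTHESES, enjoy_sport)
print(len(ALL_HYPOTHESES), len(version_space))  # 973 6
```

Under these assumed domains, 973 hypotheses are enumerated and six survive the four training examples. Enumerating H like this is only feasible because H is tiny.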

Example Version Space
S: { <Sunny, Warm, ?, Strong, ?, ?> }
   <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>
G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

Representing Version Spaces
- The General boundary, G, of version space $VS_{H,D}$ is the set of its maximally general members.
- The Specific boundary, S, of version space $VS_{H,D}$ is the set of its maximally specific members.
- Every member of the version space lies between these boundaries:
$VS_{H,D} = \{h \in H \mid (\exists s \in S)(\exists g \in G)(g \geq h \geq s)\}$
where $x \geq y$ means x is more general than or equal to y.

Candidate Elimination Algorithm
G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
For each training example d, do
  If d is a positive example
    Remove from G any hypothesis inconsistent with d
    For each hypothesis s in S that is not consistent with d
      Remove s from S
      Add to S all minimal generalizations h of s such that
        - h is consistent with d, and
        - some member of G is more general than h
      Remove from S any hypothesis that is more general than another hypothesis in S
  [continued]

Candidate Elimination Algorithm [continued]
  If d is a negative example
    Remove from S any hypothesis inconsistent with d
    For each hypothesis g in G that is not consistent with d
      Remove g from G
      Add to G all minimal specializations h of g such that
        - h is consistent with d, and
        - some member of S is more specific than h
      Remove from G any hypothesis that is less general than another hypothesis in G

Example Trace
Training examples:
1. <Sunny, Warm, Normal, Strong, Warm, Same>, EnjoySport = Yes
2. <Sunny, Warm, High, Strong, Warm, Same>, EnjoySport = Yes
S_0: { <∅, ∅, ∅, ∅, ∅, ∅> }
S_1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
S_2: { <Sunny, Warm, ?, Strong, Warm, Same> }
G_0, G_1, G_2: { <?, ?, ?, ?, ?, ?> }

Example Trace
Training example:
3. <Rainy, Cold, High, Strong, Warm, Change>, EnjoySport = No
S_2, S_3: { <Sunny, Warm, ?, Strong, Warm, Same> }
G_2: { <?, ?, ?, ?, ?, ?> }
G_3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }

Example Trace
Training example:
4. <Sunny, Warm, High, Strong, Cool, Change>, EnjoySport = Yes
S_3: { <Sunny, Warm, ?, Strong, Warm, Same> }
S_4: { <Sunny, Warm, ?, Strong, ?, ?> }
G_3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }
G_4: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
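The trace above can be reproduced with a compact sketch of Candidate Elimination for conjunctive hypotheses. Everything here is an illustration under assumptions: the function names, the encoding (`None` for ∅, `"?"` for don't care), and the attribute domains in `DOMAINS` ("Cloudy" in particular is not taken from the slides) are introduced for this example.

```python
def matches(h, x):
    """True if instance x satisfies hypothesis h (None never matches)."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def more_general_or_equal(h_a, h_b):
    """True if every instance that satisfies h_b also satisfies h_a."""
    if any(c is None for c in h_b):
        return True
    return all(a == "?" or a == b for a, b in zip(h_a, h_b))


def min_generalization(s, x):
    """The single minimal generalization of s that covers x."""
    return tuple(v if c is None else (c if c == v else "?") for c, v in zip(s, x))


def min_specializations(g, x, domains):
    """Replace one '?' in g by a value that differs from x, in every possible way."""
    return [
        g[:i] + (value,) + g[i + 1:]
        for i, c in enumerate(g) if c == "?"
        for value in domains[i] if value != x[i]
    ]


def candidate_elimination(examples, domains):
    """Return the boundary sets (S, G) after processing all examples."""
    S = {(None,) * len(domains)}
    G = {("?",) * len(domains)}
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            candidates = {s if matches(s, x) else min_generalization(s, x) for s in S}
            candidates = {s for s in candidates
                          if any(more_general_or_equal(g, s) for g in G)}
            S = {s for s in candidates
                 if not any(s != t and more_general_or_equal(s, t) for t in candidates)}
        else:
            S = {s for s in S if not matches(s, x)}
            candidates = set()
            for g in G:
                options = min_specializations(g, x, domains) if matches(g, x) else [g]
                candidates.update(h for h in options
                                  if any(more_general_or_equal(h, s) for s in S))
            G = {g for g in candidates
                 if not any(g != t and more_general_or_equal(t, g) for t in candidates)}
    return S, G


# Assumed attribute domains: Sky, AirTemp, Humidity, Wind, Water, Forecast.
DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Light"), ("Warm", "Cool"), ("Same", "Change")]
enjoy_sport = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
S_4, G_4 = candidate_elimination(enjoy_sport, DOMAINS)
```

With these inputs the function returns the S_4 and G_4 boundaries of the trace. Unlike brute-force enumeration, only the two boundary sets are stored at any time.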

How Should These Be Classified?
S: { <Sunny, Warm, ?, Strong, ?, ?> }
   <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>
G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
New instances:
- <Sunny, Warm, Normal, Strong, Cool, Change>
- <Rainy, Cold, Normal, Light, Warm, Same>
- <Sunny, Warm, Normal, Light, Warm, Same>
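One way to answer the question is to let every hypothesis in the version space vote on each new instance. The sketch below is illustrative: the names `matches` and `vote` are assumptions, and the second attribute of the second instance is taken to be the AirTemp value Cold.

```python
def matches(hypothesis, instance):
    """True if the instance satisfies every attribute constraint."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))


def vote(version_space, instance):
    """Return (positive votes, negative votes) over all hypotheses."""
    positive = sum(matches(h, instance) for h in version_space)
    return positive, len(version_space) - positive


# The six hypotheses of the example version space.
VERSION_SPACE = [
    ("Sunny", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "Strong", "?", "?"),
    ("Sunny", "Warm", "?", "?", "?", "?"),
    ("?", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "?", "?", "?"),
    ("?", "Warm", "?", "?", "?", "?"),
]

votes_a = vote(VERSION_SPACE, ("Sunny", "Warm", "Normal", "Strong", "Cool", "Change"))
votes_b = vote(VERSION_SPACE, ("Rainy", "Cold", "Normal", "Light", "Warm", "Same"))
votes_c = vote(VERSION_SPACE, ("Sunny", "Warm", "Normal", "Light", "Warm", "Same"))
print(votes_a, votes_b, votes_c)  # (6, 0) (0, 6) (3, 3)
```

The first instance is positive and the second negative under every remaining hypothesis, so both can be classified with confidence. The third splits the version space evenly, so the training data do not yet decide it.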

Summary Points
1 Concept learning as search through H
2 General-to-specific ordering over H
3 Version space candidate elimination algorithm
4 S and G boundaries characterize learner's uncertainty