Inducing a Decision Tree

Inducing a Decision Tree
In order to learn a decision tree, our agent will need some information to learn from: a training set of examples.
- Each example is described by its values for the problem's attributes.
- Each example is described by its output value, drawn from the possible values of the target attribute.
In the restaurant example, our problem attributes are "What is the estimated time?", "What kind of food do they serve?", and the like. The target attribute is "Will we wait?" It is a boolean attribute: its value is either yes or no. Problems with boolean target attributes are called classification problems. The learning agent is learning to recognize whether a situation is a positive example of some concept or a negative example.
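
As an illustration only (the attribute names here are hypothetical, not taken from the course's table), one such example might be represented in Python as a dictionary of attribute/value pairs plus the target value:

    example = {
        "EstimatedTime": "10-30 min",   # a problem attribute
        "FoodType": "Thai",             # another problem attribute
        "WillWait": "yes",              # the target attribute: a positive example
    }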

An Example
Our agent's ideal goal is to find the most efficient correct decision tree. Since "most efficient" is too hard, it will have to settle for as efficient a tree as can be found in reasonable time.
Input:
- examples: a training set
- attributes: a set of attributes
- default: the default goal predicate value
Output:
- a decision tree

The Induction Algorithm
if examples is empty, then return default
if all examples have the same classification, then return that classification
if attributes is empty, then return the most common classification of the remaining examples
choose the attribute a that best discriminates among the remaining examples
create a tree t with a as its root
for each possible value v of a:
    select the subset of examples ex having value(a) = v
    let subtree sub be the result of recursively calling the induction algorithm with ex, (attributes - a), and the most common classification of ex
    add a branch to t with label v and subtree sub
return t
[Assume, for now, that "best discriminates" means "creates subsets of roughly equal size, but with some subsets having members with a common answer".]
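
A minimal Python sketch of this procedure, under a few assumptions not in the slides: each example is a dict mapping attribute names to values plus a target key, internal nodes are (attribute, {value: subtree}) tuples, and choose_attribute is supplied by the caller (one possible definition appears after the next slide).

    from collections import Counter

    def plurality_value(examples, target):
        # Most common classification among the examples.
        return Counter(e[target] for e in examples).most_common(1)[0][0]

    def induce_tree(examples, attributes, default, target, choose_attribute):
        if not examples:                   # no examples: fall back to the default
            return default
        classes = {e[target] for e in examples}
        if len(classes) == 1:              # all examples agree: return that classification
            return classes.pop()
        if not attributes:                 # no attributes left: most common classification
            return plurality_value(examples, target)
        a = choose_attribute(attributes, examples, target)
        tree = (a, {})
        for v in {e[a] for e in examples}: # only values that actually occur in the examples
            ex = [e for e in examples if e[a] == v]
            # Pass the parent's most common classification down as the default,
            # so an empty deeper subset still gets a sensible answer.
            sub = induce_tree(ex, [x for x in attributes if x != a],
                              plurality_value(examples, target), target, choose_attribute)
            tree[1][v] = sub
        return tree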

What Does "Best Discriminates" Mean?
Starting set: +: x1, x3, x4, x6, x8, x12; -: x2, x5, x7, x9, x10, x11.
Splitting on Patrons?:
- none: +: (none); -: x7, x11
- some: +: x1, x3, x6, x8; -: (none)
- many: +: x4, x12; -: x2, x5, x9, x10
Splitting on Type?:
- French: +: x1; -: x5
- Thai: +: x6; -: x10
- Italian: +: x4, x8; -: x2, x11
- Burger: +: x3, x12; -: x7, x9
Using information theory, we can compute a right answer to "what discriminates best". Nilsson gives a simple approximation rule in Section 17.5.
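
One standard information-theoretic choice (the slide does not spell out the formula, so treat the details as an assumption) is to pick the attribute with the highest information gain: the entropy of the current example set minus the expected entropy of the subsets the attribute produces. A sketch that also supplies the choose_attribute helper used above:

    import math
    from collections import Counter

    def entropy(examples, target):
        counts = Counter(e[target] for e in examples)
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def information_gain(attribute, examples, target):
        total = len(examples)
        remainder = 0.0
        for v in {e[attribute] for e in examples}:
            ex = [e for e in examples if e[attribute] == v]
            remainder += len(ex) / total * entropy(ex, target)
        return entropy(examples, target) - remainder

    def choose_attribute(attributes, examples, target):
        # The attribute with the highest information gain "best discriminates".
        return max(attributes, key=lambda a: information_gain(a, examples, target))

On the splits shown above, Patrons? yields a gain of about 0.54 bits while Type? yields no gain at all (every Type subset is as mixed as the original set), so Patrons? is the better question to ask first.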

The Answer...

Evaluating a Learning Algorithm
A learning algorithm is good if it produces hypotheses that do a good job of predicting the values of unseen cases. One technique for evaluating a learning algorithm:
- Partition the set of cases into two sets: a training set and a test set.
- Run the algorithm on the training set to induce a decision tree.
- Evaluate the decision tree's performance when applied to the test set.
Experimental questions: How do we split the case set? Size? Make-up? How good is good enough? Partial credit?
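
A minimal sketch of this holdout evaluation, reusing the hypothetical induce_tree, plurality_value, and choose_attribute helpers above; classify is another hypothetical helper that walks the tree for one case (unseen attribute values are not handled in this sketch):

    import random

    def classify(tree, case):
        # Leaves are plain classifications; internal nodes are (attribute, {value: subtree}).
        while isinstance(tree, tuple):
            attribute, branches = tree
            tree = branches[case[attribute]]
        return tree

    def evaluate(cases, attributes, target, train_fraction=0.7):
        cases = list(cases)
        random.shuffle(cases)
        split = int(train_fraction * len(cases))
        train, test = cases[:split], cases[split:]
        default = plurality_value(train, target)
        tree = induce_tree(train, attributes, default, target, choose_attribute)
        correct = sum(classify(tree, c) == c[target] for c in test)
        return correct / len(test)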

Evaluating the Induction Algorithm
Russell and Norvig ran an experiment on our table from the restaurant domain. They generated random sets of cases using the problem and target attributes. Then they ran 20 trials each for training-set sizes of 1-100, with each training set chosen randomly from the set of all cases. On each trial, any case not in the training set was placed in the test set. Here are the results: [learning-curve figure: the proportion of test cases classified correctly, plotted against training-set size]. This is called a "happy graph": there was a pattern, and the algorithm found it. Questions: What would an unhappy graph look like? Can a learning agent learn too much?
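
A hedged sketch of the same kind of learning-curve experiment, again reusing the hypothetical helpers from the earlier sketches (the restaurant cases themselves are not reproduced here):

    import random

    def learning_curve(cases, attributes, target, max_size=100, trials=20):
        curve = []
        for size in range(1, min(max_size, len(cases) - 1) + 1):
            scores = []
            for _ in range(trials):
                shuffled = list(cases)
                random.shuffle(shuffled)
                train, test = shuffled[:size], shuffled[size:]
                default = plurality_value(train, target)
                tree = induce_tree(train, attributes, default, target, choose_attribute)
                scores.append(sum(classify(tree, c) == c[target] for c in test) / len(test))
            curve.append((size, sum(scores) / trials))
        return curve  # plot size vs. average accuracy: a rising curve is a "happy graph"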

Exercise: Build a Decision Tree
A number of patients have shown up at the local hospital emergency room complaining of certain symptoms. Our crack staff has identified the problem as an uncommon allergic reaction to a certain food. The patients all know each other, but some of their other friends have not had this reaction. The doctors know how to treat the reaction, but they would also like to be able to suggest some dining guidelines to this group of people so that they can avoid the reaction if they choose. Here is a set of case data on some members of the group of friends. Use our induction algorithm to build a decision tree for dining options...

Case #  Restaurant  Meal       Day       Cost       Reaction?
1       Sam's       breakfast  Saturday  cheap      yes
2       Lobdell     lunch      Saturday  expensive  no
3       Sam's       lunch      Sunday    cheap      yes
4       FooBarBaz   breakfast  Monday    cheap      no
5       Sam's       breakfast  Sunday    expensive  no
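
As a usage example, the exercise table can be fed directly to the earlier induce_tree and choose_attribute sketches (the column headers serve as attribute names):

    cases = [
        {"Restaurant": "Sam's", "Meal": "breakfast", "Day": "Saturday", "Cost": "cheap", "Reaction?": "yes"},
        {"Restaurant": "Lobdell", "Meal": "lunch", "Day": "Saturday", "Cost": "expensive", "Reaction?": "no"},
        {"Restaurant": "Sam's", "Meal": "lunch", "Day": "Sunday", "Cost": "cheap", "Reaction?": "yes"},
        {"Restaurant": "FooBarBaz", "Meal": "breakfast", "Day": "Monday", "Cost": "cheap", "Reaction?": "no"},
        {"Restaurant": "Sam's", "Meal": "breakfast", "Day": "Sunday", "Cost": "expensive", "Reaction?": "no"},
    ]
    attributes = ["Restaurant", "Meal", "Day", "Cost"]
    tree = induce_tree(cases, attributes, "no", "Reaction?", choose_attribute)
    print(tree)   # nested (attribute, {value: subtree}) tuples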

Toward a Solution

Reinforcement Learning
Induction is quite different from the sort of learning that neural networks and genetic algorithms do. A program can do induction in batch from problem/solution pairs. Neural nets and GAs rely on interleaving learning with problem solving in order to get feedback.
Basic statement: An agent is given a sequence of trials for which it knows:
- the states it visited on each trial
- the payoff it received at the end of the trial
The agent has no knowledge of:
- the domain (the full effects of actions)
- the payoff system
The agent is to learn:
- the domain
- the expected value of the payoff for each action
- a problem-solving policy
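
One way to make the "expected value of the payoff" part concrete is direct estimation from trials: average the end-of-trial payoff over every visit to a state. This is a hedged sketch, not the course's prescribed method, and the trial format ([states...], payoff) is an assumption:

    from collections import defaultdict

    def estimate_state_values(trials):
        # trials: list of ([state, state, ...], payoff) pairs observed by the agent.
        totals = defaultdict(float)
        visits = defaultdict(int)
        for states, payoff in trials:
            for s in states:
                totals[s] += payoff
                visits[s] += 1
        # Expected payoff of a state = average payoff of the trials that passed through it.
        return {s: totals[s] / visits[s] for s in totals}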

Types of Reinforcement Learning
- Passive learning: The agent has no real control over its actions. It wants to learn the expected values of states.
- Active learning: The agent can choose actions on its own. It wants to learn not only the expected values of states but also an optimal policy.
- Model-based learning: The agent learns the expected payoffs of each action and then tries to learn an optimal policy.
- Q learning: The agent tries to learn an optimal policy without knowing the payoffs or probabilities directly.
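
The usual tabular Q-learning update, which the last item alludes to, estimates action values directly from (state, action, reward, next state) experience. The learning rate alpha and discount gamma below are assumptions about the standard formulation, not values given on the slide:

    def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
        # One tabular Q-learning step; Q is a dict keyed by (state, action).
        best_next = max(Q.get((next_state, a), 0.0) for a in actions)
        old = Q.get((state, action), 0.0)
        Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        return Q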