THE DESIGN OF A LEARNING SYSTEM Lecture 2


Challenge: Design a Learning System for Checkers. What training experience should the system have? This is a design choice with great impact on the outcome. Choice #1: Direct training examples: a set of board states, each paired with the correct move. Choice #2: Indirect training: a set of recorded games, where the correctness of individual moves must be inferred from the result of the game. This requires credit assignment: deciding how much each move contributed to the final outcome (how good or bad it was).

Challenge: Design a Learning System for Checkers. What amount of interaction should there be between the system and the supervisor? Choice #1: No freedom. The supervisor provides all training examples. Choice #2: Semi-free. The supervisor provides training examples, but the system also constructs examples of its own and asks the supervisor questions in cases of doubt. Choice #3: Total freedom. The system learns to play completely unsupervised. How daring should the system be in exploring new board states?

Challenge: Design a Learning System for Checkers. Which training examples? There is a huge number of possible games, so there is no time to try them all. The system should learn from examples like the ones it will encounter in the future. For example, if the goal is to beat humans, it should do well in the situations that arise when humans play (this is hard to achieve in practice).

So far: Choosing the Training Experience. Determine the type of training experience: games against experts, games against self, a table of correct moves, etc. Next step?

The Wikipedia "Get to Philosophy" game, and its analogue for Machine Learning. Machine Learning: Get to Math. We start from a high-level definition of the problem we want to solve and progressively reduce it to something more mathematical.

What should be learned, exactly? Get in the mind of a player. The computer program already knows the legal moves; it should learn how to choose the best one. In other words, the computer should learn a hidden function. Perhaps the function should be V : Board -> Value, assigning a numeric value to each board state. Knowing the value of each board can be used to pick the best move: choose the move that leads to the board with the highest value. So far, in two steps, we've reduced the original problem to that of learning the function V.

Let's attempt to define the function V. 1. If b is an end-of-game win board state: V(b) = 100. 2. If b is an end-of-game lose board state: V(b) = -100. 3. If b is an end-of-game draw board state: V(b) = 0. 4. If b is not a final board state, then V(b) = V(b'), where b' is the best final board state that can be reached from b, assuming optimal play by both sides. With this definition, every board b has a value of 100, -100, or 0, but computing it requires expanding all possibilities down to the final states. So this is a non-operational definition, because we can't work with it in practice. Finding an exact operational definition may even be impossible. In practice, we will look for an operational definition of an approximation to V.
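The recursive definition above can be transcribed directly into code. On a toy game tree (encoded here as nested tuples, an illustrative stand-in for real checkers positions) it runs fine; on checkers the same recursion would have to expand every continuation, which is exactly why the definition is non-operational.

```python
def V(node):
    """Direct transcription of the definition of V on a toy game tree.
    A node is either a terminal ('win',), ('loss',), ('draw',) or an
    internal ('max', children) / ('min', children), depending on whose
    turn it is. This representation is illustrative, not real checkers."""
    kind = node[0]
    if kind == 'win':
        return 100
    if kind == 'loss':
        return -100
    if kind == 'draw':
        return 0
    # Not a final state: value of the best reachable outcome under
    # optimal play by both sides (minimax).
    values = [V(child) for child in node[1]]
    return max(values) if kind == 'max' else min(values)
```

Even on this toy encoding the cost is exponential in the depth of the tree, which makes the trade-off with operational approximations concrete.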

A break for notation. People in math really like using hats to denote approximations: V-hat denotes an approximation of the true function V, just as a sketch of a person approximates the actual person.

Let's attempt to define the approximation V-hat to the function V. Candidate representations: 1. A large table mapping every board b to its value, or 2. An artificial neural network that implements it: imagine a neuron for every cell of the board, firing when a piece is present, with many connections between neurons, or 3. A polynomial (here, linear) function of predefined board features: V-hat(b) = w0 + w1*x1 + w2*x2 + w3*x3 + w4*x4 + w5*x5 + w6*x6, where x1 = number of black pieces, x2 = number of red pieces, x3 = number of black kings, x4 = number of red kings, x5 = number of black pieces under threat, x6 = number of red pieces under threat. There is a trade-off: very expressive representations can get very close to V but are nearly non-operational, while simpler representations are efficiently computable but less accurate.
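The linear representation in option 3 is just a weighted sum over the six features. A minimal sketch, assuming the feature vector x = [x1, ..., x6] has already been extracted from the board (real feature extraction from a checkers position is omitted):

```python
def v_hat(x, w):
    """Linear approximation V-hat(b) = w0 + w1*x1 + ... + w6*x6.

    x -- the six board features [x1, ..., x6] described above
    w -- seven weights [w0, w1, ..., w6]; w0 is the bias term
    """
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

For example, with all weights except the bias set to 1, an empty feature vector just returns the bias, and the board value grows with each piece count.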

The reduction steps so far. Determine Type of Training Experience (games against experts, games against self, a table of correct moves, etc.) -> Determine Target Function (Board -> Move, Board -> Value, etc.) -> Determine Representation of Learned Function (polynomial, neural network, linear function of six features, etc.)

Estimating training values. 1. Recall that the training experience is indirect. We only observe the outcomes of games, i.e. the end-of-game board states, so at least we know the value of V_train for these final boards. For the earlier boards we can initialize the estimates. 2. Learning has a significant feedback component: when a new game trace comes in, the system uses it to update its estimates via the previously learned approximation. This makes sense; in learning, we build upon previous knowledge. 3. Let's make a wish: the training value of a board should be given by V_train(b) <- V-hat(Successor(b)), where Successor(b) is the next board at which it is again the program's turn to move. Is it reasonable? In the first training pass, all previous-to-last boards of a game get a value among 100, -100, and 0, which reflects our certainty toward the end of the game.
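The wish in point 3 can be sketched as a small Critic routine. This assumes the linear V-hat over six features from the previous slide; the feature vectors and the successor bookkeeping are illustrative, not taken from a real checkers engine.

```python
def v_hat(x, w):
    """Linear V-hat: w0 + w1*x1 + ... + w6*x6 over six board features."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def estimate_training_values(trace, w, final_value):
    """Critic sketch: apply V_train(b) <- V-hat(Successor(b)).

    trace -- feature vectors of the program's boards, in order of play
    final_value -- 100 / -100 / 0, known from the game outcome
    The last board gets the known end-of-game value; every earlier board
    gets the current estimate of its successor's value.
    """
    targets = []
    for i in range(len(trace)):
        if i + 1 < len(trace):
            targets.append(v_hat(trace[i + 1], w))
        else:
            targets.append(final_value)
    return targets
```

Note how, with all weights at zero, only the final board of the first game carries information; it then propagates backward over subsequent passes.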

Why can't V-hat simply equal V_train? Because the values we just assigned to V_train most probably don't obey the linear form of V-hat: no choice of the weights w_i can match all the values of V_train exactly. However, we can pick values for the w_i so that V-hat is as close as possible to V_train. To put it mathematically, we will try to find a V-hat which minimizes the squared error E = sum over training examples of (V_train(b) - V-hat(b))^2. This is the Least Mean Squares (LMS) problem: a classic problem in numerical analysis with a known solution that we can use to update the w_i.
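The LMS solution reduces to a simple iterative rule: for each training example, move each weight a small step in the direction that shrinks the error, w_i <- w_i + eta * (V_train(b) - V-hat(b)) * x_i, where eta is a small learning rate. A sketch, assuming the linear V-hat over six features:

```python
def v_hat(x, w):
    """Linear V-hat: w0 + w1*x1 + ... + w6*x6."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def lms_update(w, x, v_train, eta=0.1):
    """One LMS step on the squared error (V_train - V-hat)^2.

    A constant feature 1.0 is prepended to x so that the bias w0
    follows the same update rule as the other weights.
    """
    error = v_train - v_hat(x, w)
    xs = [1.0] + list(x)
    return [wi + eta * error * xi for wi, xi in zip(w, xs)]
```

With zero initial weights, a single example with V_train = 100 pulls the bias and the active features' weights toward positive values; features that are zero in x leave their weights untouched.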

The reduction steps so far. Determine Type of Training Experience (games against experts, games against self, a table of correct moves, etc.) -> Determine Target Function (Board -> Move, Board -> Value, etc.) -> Determine Representation of Learned Function (polynomial, neural network, linear function of six features, etc.) -> Determine Learning Algorithm (linear programming, gradient descent, etc.) -> FINAL DESIGN.

The Final Design. Experiment Generator: proposes a new problem (initial board state). Performance System: the player, using the current hypothesis V-hat; it produces a solution trace (game history). Critic: turns the trace into training examples <b1, V_train(b1)>, <b2, V_train(b2)>, ... Generalizer: updates the hypothesis V-hat from the training examples.
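The four components can be wired together in a short self-play training loop. This is a hedged sketch: generate_game is a hypothetical callable standing in for the Experiment Generator plus Performance System (it plays one game with the current weights and returns the feature-vector trace and the final outcome), and the linear V-hat with LMS updates plays the roles of hypothesis and Generalizer.

```python
def v_hat(x, w):
    """Linear V-hat over six board features (the current hypothesis)."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def lms_step(w, x, target, eta=0.01):
    """Generalizer: one LMS update of the weights toward the training value."""
    error = target - v_hat(x, w)
    xs = [1.0] + list(x)
    return [wi + eta * error * xi for wi, xi in zip(w, xs)]

def train(generate_game, n_games=100):
    """generate_game(w) is a hypothetical interface: it plays one game
    using weights w and returns (trace_of_feature_vectors, final_value),
    where final_value is 100 / -100 / 0."""
    w = [0.0] * 7
    for _ in range(n_games):
        trace, final_value = generate_game(w)
        # Critic: V_train(b) = V-hat(Successor(b)); the last board of
        # the game gets the known end-of-game value.
        targets = [v_hat(trace[i + 1], w) if i + 1 < len(trace) else final_value
                   for i in range(len(trace))]
        for x, t in zip(trace, targets):
            w = lms_step(w, x, t)
    return w
```

Even with a fixed fake game that always ends in a win, repeated passes push the estimated value of the observed boards toward 100, which is the bootstrapping behavior described above.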

Would it play well? Probably not, because the representation we chose is too simple. However, a better representation for V-hat could produce a very good system: the overall design is similar to that of a very successful learning system for backgammon, in which V-hat is a neural network. In any case, this is just one possible design. There are others, for instance: 1. Store many training examples (boards/moves), and for every new board find the stored board that looks closest; use it to determine the move [nearest-neighbor methods]. 2. Train many systems, not just one; have them play each other, and choose the best for the next generation [genetic algorithms]. 3. Imitate a more human-like approach: understand moves by constructing explanations of why they are good or bad [explanation-based learning].

General Perspectives and Issues in Machine Learning. In general, a learning algorithm searches a large (potentially infinite) set of hypotheses and tries to find one that best fits the data. The various techniques of Machine Learning are, in most cases, different ways of representing hypotheses, and different representations suit different tasks. The class will review these representations and explain how they exploit the underlying structure of different problems.

General Perspectives and Issues in Machine Learning. What algorithms exist for learning general target functions from specific training examples? In what settings will particular algorithms converge to the desired function, given sufficient training data? Which algorithms perform best for which types of problems and representations? How much training data is needed to achieve a given level of confidence in the learned hypothesis? Are there general theoretical bounds? Can prior knowledge be useful during training, even when it is only approximate? How should we choose the training experiences, and what is the relationship between that strategy and the complexity of the learning problem? Is it possible to automate the process of picking target functions for learning? Is it possible to have a system that doesn't have a fixed hypothesis representation, but keeps changing it in response to its performance?

Next Lecture: Chapter 2, Concept Learning and General-to-Specific Ordering. In general, a learning algorithm searches a large (potentially infinite) set of hypotheses and tries to find one that best fits the data. We will view learning as a search in a space of hypotheses and present several algorithms for performing the search.