Wrapup

Final Exam Monday, May 1, 5:30-8pm. Either here (FJ-D) or FJ-B (to be determined). Cumulative, but emphasizes material post-midterm. Study old homework assignments, including programming projects.

Victory Lap A victory lap is an extra trip around the track by the exhausted victors (us). Review course goals and see if we met them.

Artificial Intelligence

Goals Give you a toolbox of AI techniques. Show you when each technique is most appropriate.

Tools and techniques: state space search, adversarial search, probability, Bayes nets, Naïve Bayes, hypothesis choosing (ML/MAP), Markov chains & hidden Markov models, reinforcement learning, neural nets.

Environments: fully observable vs. partially observable, single agent vs. multiple agents, deterministic vs. stochastic, episodic vs. sequential, static vs. dynamic, discrete vs. continuous.

Models, Inference, and Learning A model is an abstract way of representing a problem, including its environment, how the environment works, and the possible solutions to the problem. Often includes data structures and/or mathematical relationships. Examples: state spaces, game trees, Bayes nets (including Naïve Bayes classifiers, Markov chains, and HMMs), MDPs, neural networks. A model is how we represent the world and how it works.

Models, Inference, and Learning An inference algorithm draws conclusions or makes inferences based on the model. Examples: search (uniform cost search, greedy best-first search, minimax, alpha-beta pruning), exact inference for Bayes nets, ML & MAP hypothesis choosing, inference in Markov chains, the forward and backward algorithms, calculating the output of a neural network, value iteration. Inference algorithms answer questions about an existing model of the world (they don't change the model, they just use it).

Models, Inference, and Learning A learning algorithm tries to deduce the structure or parameters of the model itself from auxiliary data (often examples). Examples: training a Naïve Bayes classifier by estimating the prior and feature probabilities; training a neural network using the backpropagation algorithm to learn the weights; Q-learning. Learning algorithms produce or modify a model of the world. (Studied further in machine learning courses.)
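
To make the model/inference/learning split concrete, here is a minimal sketch in Python (not from the course; the biased-coin model, function names, and numbers are all invented for illustration). The model is a single parameter, inference answers a question using the model, and learning estimates the parameter from data.

    import random

    # Model: a biased coin, described by one parameter, the probability of heads.
    def infer_prob_at_least_one_head(p_heads, n_flips):
        # Inference: use the model as-is to answer a question about the world.
        return 1 - (1 - p_heads) ** n_flips

    def learn_p_heads(observed_flips):
        # Learning: estimate the model's parameter from example data.
        return sum(1 for f in observed_flips if f == "H") / len(observed_flips)

    data = [random.choice("HHT") for _ in range(1000)]   # flips from a 2/3-heads coin
    p_hat = learn_p_heads(data)                          # learning builds the model
    print(infer_prob_at_least_one_head(p_hat, 3))        # inference then uses it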

State Space Search Represent a partial solution to the problem as a state. Use an algorithm to find the best path through the state space. Pros: Often easy to formulate the model: states and actions. Cons: Often slow with a mediocre heuristic; the state space is often too big to store explicitly in memory. Environment needed: Fully observable, single agent, deterministic, static.
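
The inference side of state space search, sketched as uniform cost search over a small explicit graph (the graph, costs, and state names here are made up; a real problem would generate successors from a state rather than storing the whole graph):

    import heapq

    def uniform_cost_search(graph, start, goal):
        """graph maps state -> list of (successor, step_cost) pairs.
        Returns (total_cost, path) for the cheapest path, or None."""
        frontier = [(0, start, [start])]      # priority queue ordered by path cost
        explored = set()
        while frontier:
            cost, state, path = heapq.heappop(frontier)
            if state == goal:
                return cost, path
            if state in explored:
                continue
            explored.add(state)
            for succ, step in graph.get(state, []):
                if succ not in explored:
                    heapq.heappush(frontier, (cost + step, succ, path + [succ]))
        return None

    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
    print(uniform_cost_search(graph, "A", "D"))   # (4, ['A', 'B', 'C', 'D'])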

Aside: What is a state? An (agent) state is an abstraction of the agent's current knowledge about the world. In state space search, this is the set of variables describing what the agent knows at a certain time. Suppose you were doing state space search by hand, and you had to stop in the middle. A friend is going to take over for you. What knowledge (separate from the environmental model) would you have to tell them to allow them to continue?

Aside: What is a state? You have a graph G = (V, E) and an integer n. Find a set of n vertices V' such that the set of vertices either in V' or adjacent to a vertex in V' is as large as possible. How do you represent a state? How do you represent the actions?
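
One possible answer (an assumption for illustration, not the only reasonable encoding): a state is the set of vertices chosen so far, and an action adds one unchosen vertex.

    # Hypothetical encoding: state = frozenset of chosen vertices V',
    # action = add one not-yet-chosen vertex.
    def actions(state, V):
        return [v for v in V if v not in state]

    def result(state, action):
        return state | {action}

    def dominated(state, adjacency):
        """The vertices either in V' or adjacent to a vertex in V'."""
        covered = set(state)
        for v in state:
            covered |= adjacency[v]
        return covered

    V = {1, 2, 3, 4}
    adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    state = frozenset({2})
    print(len(dominated(state, adjacency)))   # 3: vertices {1, 2, 3} are covered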

Adversarial Search Still uses a state, only we aren't usually interested in the entire best path, just the best next move. Can use minimax and alpha-beta pruning to search the game tree. Pros: The go-to model & algorithms for 2-player games. Cons: Can't represent the entire tree in memory, very slow for large games, still requires heuristics for deep trees. Environment needed: Fully observable, multiagent (2 opponents), deterministic, static.
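
A minimal sketch of minimax with alpha-beta pruning over a tiny explicit game tree (the tree is a made-up example; a real player would generate moves lazily and cut off with a heuristic at some depth):

    import math

    def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
        """A node is either a number (leaf utility) or a list of child nodes."""
        if isinstance(node, (int, float)):
            return node
        if maximizing:
            best = -math.inf
            for child in node:
                best = max(best, alphabeta(child, False, alpha, beta))
                alpha = max(alpha, best)
                if alpha >= beta:
                    break          # prune: MIN would never let play reach here
            return best
        best = math.inf
        for child in node:
            best = min(best, alphabeta(child, True, alpha, beta))
            beta = min(beta, best)
            if alpha >= beta:
                break              # prune: MAX already has a better option
        return best

    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # MAX at the root, MIN below
    print(alphabeta(tree, maximizing=True))      # 3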

Probability A way of representing uncertainty in a model or algorithm. Many modern AI techniques are based on the rules of probability. Often gives better results than heuristic approaches, where any numbers used may not be derived from any mathematical rules. Algorithms for ML (maximum likelihood) and MAP (maximum a posteriori) hypothesis choosing.
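
The difference between the two, with made-up priors and likelihoods (the hypothesis names and numbers are invented): ML picks the hypothesis under which the data is most probable; MAP also weighs in the prior.

    prior      = {"h1": 0.70, "h2": 0.25, "h3": 0.05}   # P(h)
    likelihood = {"h1": 0.02, "h2": 0.10, "h3": 0.30}   # P(data | h)

    # Maximum likelihood: argmax over h of P(data | h); ignores the prior.
    ml_hypothesis = max(likelihood, key=likelihood.get)

    # MAP: argmax over h of P(data | h) * P(h), proportional to the posterior.
    map_hypothesis = max(prior, key=lambda h: likelihood[h] * prior[h])

    print(ml_hypothesis)    # h3
    print(map_hypothesis)   # h2: the prior pulls the answer away from h3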

Bayesian Networks A representation of the conditional independences that hold among a set of random variables. Lets you compute the probability of any event, given any observation (setting) of a set of other variables. Pros: Simple representation, grounded in math. Cons: Hard to learn, exact inference can be slow, and a scientist must develop an appropriate set of variables.
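
How the factorization works on a tiny invented net (Rain with children WetGrass and Traffic; all numbers are made up): the structure says WetGrass and Traffic are conditionally independent given Rain, so the joint is a product of small tables, and any query can be answered by enumeration.

    from itertools import product

    # P(R, W, T) = P(R) * P(W | R) * P(T | R)
    P_R = {True: 0.2, False: 0.8}
    P_W = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}  # P_W[r][w]
    P_T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}  # P_T[r][t]

    def joint(r, w, t):
        return P_R[r] * P_W[r][w] * P_T[r][t]

    # Query by enumeration: P(Rain = true | Traffic = true)
    num   = sum(joint(True, w, True) for w in (True, False))
    denom = sum(joint(r, w, True) for r, w in product((True, False), repeat=2))
    print(num / denom)   # ~0.368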

Naïve Bayes Particular kind of Bayes net with nice properties. Assumes conditional independence among all pieces of evidence/features/data. Useful where you need to choose a hypothesis, but don't necessarily care about the actual posterior probability (often the conditional independence assumption messes that up). Pros: Very simple, parameters of model easy to learn, fast algorithms for inference and learning. Cons: Can make gross oversimplifications, probability estimates may not be very accurate (though hypothesis often is). Environment needed: Fully observable, (single agent), (deterministic?), static.
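
A sketch of Naïve Bayes hypothesis choosing on a toy spam example (the classes, word probabilities, and priors are made up): the score is the log of the prior times the per-feature likelihoods, and the argmax is the chosen hypothesis.

    import math

    prior  = {"spam": 0.4, "ham": 0.6}
    p_word = {   # P(word appears | class), one independent factor per feature
        "spam": {"free": 0.30, "meeting": 0.02, "winner": 0.20},
        "ham":  {"free": 0.05, "meeting": 0.25, "winner": 0.01},
    }

    def classify(words):
        scores = {}
        for c in prior:
            score = math.log(prior[c])
            for w in words:
                score += math.log(p_word[c][w])   # conditional independence assumption
            scores[c] = score
        return max(scores, key=scores.get)

    print(classify(["free", "winner"]))   # spam
    print(classify(["meeting"]))          # ham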

Markov chains and HMMs Another type of Bayes net! Makes the Markov assumption: the probability distribution of the next state depends only upon the current state. (Sometimes called the Markov property.) Used for sequential or temporal data. Pros: Only model so far that takes time into account, efficient algorithms for inference and learning. Cons: Again, might be overly simplistic for some applications. Environment needed: Fully/partially observable, single agent, stochastic, static.
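
One of those efficient inference algorithms, the forward algorithm, sketched on a made-up two-state HMM (the states, observations, and probabilities are all invented). It computes the probability of an observation sequence by summing over all hidden paths, one time step at a time:

    states = ["Rainy", "Sunny"]
    start  = {"Rainy": 0.5, "Sunny": 0.5}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.8, "shop": 0.2}}

    def forward(observations):
        # alpha[s] = P(observations so far, current hidden state = s)
        alpha = {s: start[s] * emit[s][observations[0]] for s in states}
        for obs in observations[1:]:
            alpha = {s: emit[s][obs] * sum(alpha[p] * trans[p][s] for p in states)
                     for s in states}
        return sum(alpha.values())

    print(forward(["walk", "shop", "walk"]))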

Reinforcement learning Model: MDP Inference: Bellman equations, value iteration Learning: Q-learning, lots of others Pros: Simple representation, good for cases where you'll be in the same state many times. Cons: Sloooooooooow, must be able to get experience by repeating same situations over and over. Environment needed: Fully (partially) observable, single/multi agent, stochastic, static (dynamic).
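
A sketch of tabular Q-learning on a made-up four-state corridor (move left or right, reward 1 for reaching the rightmost state); the line that matters is the update, which nudges Q(s, a) toward r + gamma * max over a' of Q(s', a'):

    import random
    from collections import defaultdict

    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
    ACTIONS = [-1, +1]
    Q = defaultdict(float)                  # Q[(state, action)], zero by default

    def step(s, a):
        s2 = max(0, min(3, s + a))
        return s2, (1.0 if s2 == 3 else 0.0)

    for episode in range(500):
        s = 0
        while s != 3:
            if random.random() < EPSILON:             # epsilon-greedy exploration
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a: Q[(s, a)])
            s2, r = step(s, a)
            best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])  # the update
            s = s2

    print([max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(3)])  # usually [1, 1, 1]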

All Markov Models

                                        Do we have control over the state transitions?
                                        No                           Yes
    Are the states           Yes       Markov chain                 MDP (Markov decision process)
    completely observable?   No        HMM (hidden Markov model)    POMDP (partially observable Markov decision process)

Neural networks Models: choice of activation function, # of hidden layers and # of nodes, what inputs look like. Inference: Calculating output of NN from given inputs. Learning: perceptron learning algorithm (single layer), backpropagation algorithm (multi-layer), all kinds of more modern algs (deep learning resurgence). Pros: Modern NNs are very accurate. Cons: Can be hard or slow to train; need lots of training data.
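
The inference step, sketched for a fixed two-input, one-output network with one hidden layer and sigmoid activations (all weights here are made up; in practice they would come from a learning algorithm such as backpropagation):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def forward(inputs, hidden_weights, output_weights):
        # Each weight row is [bias, w_1, w_2, ...]; one row per hidden unit.
        hidden = [sigmoid(row[0] + sum(w * i for w, i in zip(row[1:], inputs)))
                  for row in hidden_weights]
        out = output_weights[0] + sum(w * h for w, h in zip(output_weights[1:], hidden))
        return sigmoid(out)

    hidden_weights = [[-1.0,  2.0,  2.0],    # hidden unit 1
                      [ 3.0, -2.0, -2.0]]    # hidden unit 2
    output_weights = [-2.0, 1.5, 1.5]        # bias, then one weight per hidden unit
    print(forward([1.0, 0.0], hidden_weights, output_weights))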

Comparison of models Some model-algorithm combinations can solve any problem: state-space search (assuming a fully observable and deterministic environment). But often they either require lots of engineering on the human's part and/or are intractable on real-world problems.

Comparison of models Other model-algorithm combinations solve problems very quickly: e.g., Naïve Bayes and HMMs. But they only work for problems that fit the model well. Being good in AI involves picking the right combination of model and algorithm.

Future Other algorithms: local search/optimization, constraint satisfaction problems, formal logic, planning, knowledge representation, more Bayes net/NN material, most of machine learning, and so much more. Other application areas: robotics, speech/natural language processing, computer vision, ... What's hot now: NNs and deep learning. What will be hot in ten years: who knows?

What next? Take these ideas and use them in practice! (But only where it makes sense.) Stay in touch. Tell me when this class helps you out with something cool (seriously). Ask me cool AI questions (I may not always know the answer, but I can tell you where to find it). Don't be a stranger: let me know how the rest of your time at Rhodes (and beyond!) goes; I really do like to know.