Planning: Representation


Planning: Representation Alan Mackworth UBC CS 322 Planning 1 February 13, 2013 Textbook 8.0-8.2

Reminders Coming up: - Assignment 2 due on Friday, 1pm - Midterm Wednesday, Mar 6: DMP 110, 3-3:50pm - ~60% short answer questions. See Connect soon for the full set. - ~40% long answer question. - See Connect soon for a previous midterm with solutions. - Giuseppe Carenini will lecture the week of Feb 25 to Mar 1.

Lecture Overview Watson & Siri Recap: types of SLS algorithms Planning: intro Planning: example STRIPS: A Feature-Based Representation Time-permitting: forward planning (planning as search) 3

Watson & Siri Very impressive performance Watson's cancer knowledge is now being put to work in two products described here: http://allthingsd.com/20130209/ibms-game-show-winning-watson-computer-goes-to-work-treatingcancer/ Siri gets great reviews. Gets to know the user over time: locations, preferences, habits, ... Siri features heavily in stories about Apple's rumoured upcoming Dick Tracy-style iWatch: http://asktog.com/atc/apple-iwatch/ Both solve a very complex problem: question answering Much harder for AI than logical problems like chess or proofs Dealing with uncertainty: the last 2 modules in the course + 422 Knowledge of its own confidence is particularly important Many potential applications: medicine, law, business, ..., personal assistants 4

Lecture Overview Watson & Siri Recap: types of SLS algorithms Planning: intro Planning: example STRIPS: A Feature-Based Representation Time-permitting: start of forward planning (planning as search) 5

P.S. Definition of a plateau Local minimum: a search state n such that all its neighbours n' have h(n') > h(n) Plateau: a set of connected states {n_1, ..., n_k} with h(n_1) = h(n_2) = ... = h(n_k), where at least one of the n_i has a neighbour n' with h(n') < h(n_i) Problem: some problem instances have very large plateaus, in high-dimensional spaces: need to search them effectively [figure: plateaus, a local minimum region, and a strict local minimum] 6

Types of SLS algorithms Simulated Annealing Tabu Search Iterated Local Search (Stochastic) Beam Search Genetic Algorithms These algorithms can often do well at escaping local minima and plateaus. Only need to know high-level concepts 7

How to set the parameters? Automated algorithm configuration - Optimize the performance of arbitrary parameterized algorithms Parameter is a very general concept - Numerical domains: real or integer - Categorical domains: finite and unordered - Alternative heuristics to use in A* - Alternative data structures - Alternative Java classes in a framework implementation - 8

Lecture Overview Watson & Siri Recap: types of SLS algorithms Planning: intro Planning: example STRIPS: A Feature-Based Representation Time-permitting: start of forward planning (planning as search) 9

Course Overview (we just finished CSP). The course grid, by problem type (static vs. sequential), environment (deterministic vs. stochastic), representation, and reasoning technique:
- Static, deterministic: Constraint Satisfaction — variables + constraints; arc consistency and search. Logic — logics; search.
- Static, stochastic (Uncertainty): Bayesian Networks — variable elimination.
- Sequential, deterministic: Planning — STRIPS; search, or arc consistency (on a CSP encoding).
- Sequential, stochastic (Decision Theory): Decision Networks — variable elimination. Markov Processes — value iteration. 10

Course Overview (now we start planning). The course grid, by problem type (static vs. sequential), environment (deterministic vs. stochastic), representation, and reasoning technique:
- Static, deterministic: Constraint Satisfaction — variables + constraints; arc consistency and search. Logic — logics; search.
- Static, stochastic (Uncertainty): Bayesian Networks — variable elimination.
- Sequential, deterministic: Planning — STRIPS; search, or as a CSP (using arc consistency).
- Sequential, stochastic (Decision Theory): Decision Networks — variable elimination. Markov Processes — value iteration. 11

Planning With CSPs, we looked for solutions to essentially atemporal problems find a single variable assignment (state) that satisfies all of our constraints did not care about the path leading to that state Now consider a problem where we are given: A description of an initial state A description of the effects and preconditions of actions A goal to achieve...and want to find a sequence of actions that is possible and will result in a state satisfying the goal note: here we want not a single state that satisfies our constraints, but rather a sequence of states that gets us to a goal 12

Key Idea of Planning Open up the representation of states, goals and actions States and goals: as features (variable assignments), as in CSP Actions: as preconditions and effects defined on features Agent can reason more deliberately about what actions to consider to achieve its goals.

Contrast this to simple graph search How did we represent the problem in graph search? States, start states, goal states, and a successor function Successor function: when applying action a in state s, you end up in s' We used a flat state-based representation: there's no sense in which we can say that states a and b are more similar than states a and z (they're just nodes in a graph) Thus, we can't represent the successor function any more compactly than with a tabular representation 14

Problems with the Tabular Representation Usually too many states for a tabular representation to be feasible Small changes to the model can mean big changes for the representation e.g., if we added another variable, all the states would change There may be structure and regularity to the states and to the actions no way to capture this with a tabular representation 15

Feature-Based Representation Features helped us to represent CSPs more compactly than states could The main idea: factor states into joint variable assignments Each constraint only needed to mention the variables it constrains That enabled efficient constraint propagation: arc consistency No way to do this in flat state-based representation Want to use similar idea when searching for a sequence of actions that brings us from a start state to a goal state Main idea: compact, rich representation and efficient reasoning 16

Lecture Overview Watson & Siri Recap: types of SLS algorithms Planning: intro Planning: example STRIPS: A Feature-Based Representation Time-permitting: start of forward planning (planning as search) 17

Delivery Robot Example (textbook) Consider a delivery robot named Rob, who must navigate the environment shown in the textbook figure, and can deliver coffee and mail to Sam, in his office 18

Delivery Robot Example: features RLoc - Rob's location Domain: {coffee shop, Sam's office, mail room, laboratory}, for short {cs, off, mr, lab} RHC - Rob has coffee Domain: {true, false}. We write rhc to indicate that Rob has coffee, and ¬rhc to indicate that Rob doesn't have coffee SWC - Sam wants coffee {true, false} MW - Mail is waiting {true, false} RHM - Rob has mail {true, false} An example state is ... How many states are there? 32 / 64 / 48 / 16 19
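The state count can be checked mechanically: one 4-valued location feature and four Boolean features give 4 × 2⁴ = 64 states. A short enumeration sketch in Python (the dictionary layout is invented for illustration; feature names are taken from the slide):

```python
# Enumerate the delivery robot's state space from the feature domains
# given on the slide: one 4-valued location feature plus four Booleans.
from itertools import product

domains = {
    "RLoc": ["cs", "off", "mr", "lab"],  # Rob's location
    "RHC": [True, False],                # Rob has coffee
    "SWC": [True, False],                # Sam wants coffee
    "MW": [True, False],                 # mail is waiting
    "RHM": [True, False],                # Rob has mail
}

# Each state is one joint assignment of values to all five features.
states = [dict(zip(domains, vals)) for vals in product(*domains.values())]
print(len(states))  # 64 = 4 * 2**4
```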

Delivery Robot Example: Actions The robot's actions are: Move - Rob's move action: move clockwise (mc), move anti-clockwise (mac) PUC - Rob picks up coffee: must be at the coffee shop DelC - Rob delivers coffee: must be at the office, and must have coffee PUM - Rob picks up mail: must be in the mail room, and mail must be waiting DelM - Rob delivers mail: must be at the office and have mail The "must be ..." conditions are preconditions for action application

Example State-Based Representation Tabular representation: need an entry for every state and every action applicable in that state! 21

Example of a more compact representation A representation of the action pick up coffee, PUC: Only changes a subset of features In this case, only RHC (Rob has coffee) Only depends on a subset of features In this case, Loc = cs (Rob is in the coffee shop) preconditions: Loc = cs and RHC = ¬rhc effects: RHC = rhc 22

Lecture Overview Watson & Siri Recap: types of SLS algorithms Planning: intro Planning: example STRIPS: A Feature-Based Representation Time-permitting: forward planning (planning as search) 23

Feature-Based Representation Where we stand so far: the state-based representation is unworkable a feature-based representation might help How would a feature-based representation work? states are easy, just as in CSP: joint assignment to variables Includes initial states and goal states the key is modeling actions 24

Modeling actions To model actions in the feature-based representation, we need to solve two problems: Model when the actions are possible, in terms of the values of the features of the current state Model the state transitions in a factored way Why might this be more tractable/manageable than the tabular representation? If actions only depend on/modify some features Representation will be more compact (exponentially so!) The representation can be easier to modify/update 25

The STRIPS Representation For reference: The book also discusses a feature-centric representation for every feature, where does its value come from? causal rule: ways a feature s value can be changed by taking an action. frame rule: requires that a feature s value is unchanged if no action changes it. STRIPS is an action-centric representation: for every action, what does it do? This leaves us with no way to state frame rules. The STRIPS assumption: all variables not explicitly changed by an action stay unchanged 26

STRIPS representation (STanford Research Institute Problem Solver) STRIPS was the planner in Shakey, the first AI robot: http://en.wikipedia.org/wiki/Shakey_the_robot In STRIPS, an action has two parts: 1. Preconditions: a set of assignments to variables that must be satisfied in order for the action to be legal 2. Effects: a set of assignments to variables that are caused by the action

Example STRIPS representation of the action pick up coffee, PUC: preconditions: Loc = cs and RHC = ¬rhc effects: RHC = rhc STRIPS representation of the action deliver coffee, DelC: preconditions: Loc = off and RHC = rhc effects: RHC = ¬rhc and SWC = ¬swc Note that Sam doesn't have to want coffee for Rob to deliver it; one way or another, Sam doesn't want coffee after delivery. 28
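The two actions above can be written down as plain data: each STRIPS action is a pair of partial assignments, preconditions and effects, and legality is a one-line check. A Python sketch (the variable and function names are invented; ¬rhc is encoded as RHC = False):

```python
# STRIPS actions as (preconditions, effects) pairs over the features.
puc = {  # pick up coffee
    "preconditions": {"Loc": "cs", "RHC": False},
    "effects": {"RHC": True},
}
delc = {  # deliver coffee
    "preconditions": {"Loc": "off", "RHC": True},
    "effects": {"RHC": False, "SWC": False},
}

def applicable(action, state):
    """An action is legal iff all its preconditions hold in the state."""
    return all(state.get(var) == val
               for var, val in action["preconditions"].items())

state = {"Loc": "cs", "RHC": False, "SWC": True, "MW": True, "RHM": False}
print(applicable(puc, state))   # True: Rob is at the coffee shop, no coffee
print(applicable(delc, state))  # False: Rob is not at the office
```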

STRIPS (cont'd) The STRIPS assumption: all features not explicitly changed by an action stay unchanged So if the feature V has value v_i in state S_i, after action a has been performed, what can we conclude about a and/or the state of the world S_{i-1} immediately preceding the execution of a? S_{i-1} --a--> S_i, with V = v_i in S_i

What can we conclude about a and/or the state of the world S_{i-1} immediately preceding the execution of a? V = v_i was true in S_{i-1} / One of the effects of a is to set V = v_i / At least one of the above / Both of the above S_{i-1} --a--> S_i, with V = v_i in S_i
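The STRIPS assumption has a direct computational reading: the successor state is a copy of the current state with only the action's effects overwritten; every other feature carries its value over. A minimal Python sketch (apply_action is an invented name):

```python
def apply_action(state, effects):
    """Successor state under the STRIPS assumption: the effects overwrite
    their features; all other features keep their previous values."""
    return {**state, **effects}

s = {"Loc": "cs", "RHC": False, "SWC": True}
s2 = apply_action(s, {"RHC": True})  # effect of picking up coffee
print(s2)  # {'Loc': 'cs', 'RHC': True, 'SWC': True}
```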

Lecture Overview Watson & Siri Recap: types of SLS algorithms Planning: intro Planning: example STRIPS: A Feature-Based Representation Time-permitting: forward planning (planning as search) 31

Solving planning problems STRIPS lends itself to solving planning problems either as pure search problems or as CSP problems We will look at one technique for each approach

Forward planning Idea: search in the state-space graph The nodes represent the states The arcs correspond to the actions: the arcs leaving a state s represent all of the actions that are possible in state s A plan is a path from the node representing the initial state to a node that satisfies the goal Which actions a are possible in a state s? Those where a's effects are satisfied in s / Those where the state s' reached via a is on the way to the goal / Those where a's preconditions are satisfied in s 33
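Forward planning as described above is plain graph search: breadth-first search over states, where the arcs from a state are the actions whose preconditions hold there, and successors follow the STRIPS assumption. A Python sketch over the delivery domain; the clockwise ordering of the four locations (cs → off → lab → mr) is an assumption for illustration, the real layout being the textbook's figure:

```python
# Forward planning as breadth-first search in the state-space graph.
from collections import deque

# Assumed clockwise adjacency of the four locations (illustrative only).
CLOCKWISE = {"cs": "off", "off": "lab", "lab": "mr", "mr": "cs"}
ANTICLOCKWISE = {v: k for k, v in CLOCKWISE.items()}

def actions(state):
    """Yield (name, effects) for each action whose preconditions hold."""
    loc = state["RLoc"]
    yield "mc", {"RLoc": CLOCKWISE[loc]}       # move clockwise
    yield "mac", {"RLoc": ANTICLOCKWISE[loc]}  # move anti-clockwise
    if loc == "cs" and not state["RHC"]:
        yield "puc", {"RHC": True}                  # pick up coffee
    if loc == "off" and state["RHC"]:
        yield "delc", {"RHC": False, "SWC": False}  # deliver coffee
    if loc == "mr" and state["MW"]:
        yield "pum", {"MW": False, "RHM": True}     # pick up mail
    if loc == "off" and state["RHM"]:
        yield "delm", {"RHM": False}                # deliver mail

def forward_plan(init, goal):
    """BFS from init to any state satisfying the (partial) goal."""
    start = tuple(sorted(init.items()))
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state_t, plan = frontier.popleft()
        state = dict(state_t)
        if all(state[v] == val for v, val in goal.items()):
            return plan
        for name, effects in actions(state):
            # Successor under the STRIPS assumption.
            succ = tuple(sorted({**state, **effects}.items()))
            if succ not in seen:
                seen.add(succ)
                frontier.append((succ, plan + [name]))
    return None  # goal unreachable

init = {"RLoc": "cs", "RHC": False, "SWC": True, "MW": True, "RHM": False}
print(forward_plan(init, {"SWC": False}))  # ['puc', 'mc', 'delc']
```

Because BFS returns a shortest path, the plan found is the unique three-step solution: pick up coffee, move to the office, deliver coffee.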

Example state-space graph: first level [figure: the initial state, its successor states, and the goal] 34

Part of the state-space graph [figure: several levels of the state-space graph, and the goal] 35

Standard Search vs. Specific R&R systems Constraint Satisfaction (Problems): State: assignments of values to a subset of the variables Successor function: assign values to a free variable Goal test: set of constraints Solution: a possible world that satisfies the constraints Heuristic function: none (all solutions are at the same distance from the start) Planning: State: full assignment of values to features Successor function: states reachable by applying valid actions Goal test: partial assignment of values to features Solution: a sequence of actions Heuristic function: next lecture Slide 36

Learning Goals for today's class You can: Represent a planning problem with the STRIPS representation Explain the STRIPS assumption Solve a planning problem by search (forward planning): specify the states, successor function, goal test, and solution.