Artificial Intelligence


1 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 1/39 Artificial Intelligence 16. Non-Classical Planning Relaxing our assumptions about the agent's environment Álvaro Torralba Wolfgang Wahlster Summer Term 2018 Thanks to Prof. Hoffmann for slide sources

2 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 2/39 Agenda 1 Introduction 2 Planning with Uncertainty 3 Non-Deterministic Planning 4 Numeric Planning 5 Multi-Agent Planning 6 Temporal Planning 7 Planning in the Real World 8 Conclusion

3 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 4/39 Reminder (Chapter 3): What is an Agent in AI? Agents: Perceive the environment through sensors (percepts). Act upon the environment through actuators (actions). [Diagram: the agent perceives the environment through its sensors and acts upon it through its actuators.]

4 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 5/39 Reminder (Chapter 3): Environment of Rational Agents Fully observable vs. Partially observable: Are the relevant aspects of the environment accessible to the sensors? Deterministic vs. stochastic: Is the next state of the environment completely determined by the current state and the selected action? Episodic vs. sequential: Can the quality of an action be evaluated within an episode (perception + action), or are future developments decisive? Static vs. dynamic: Can the environment change while the agent is deliberating? Discrete vs. continuous: Is the environment discrete or continuous? Single agent vs. multi-agent: Is there just one agent, or several of them?

5 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 6/39 Planning Model-based sequential decision making: Given a model of the environment and the agent's capabilities, decide what action to execute next by reasoning about the possible future outcomes. Chapter 14: Classical STRIPS Planning: A STRIPS planning task, short planning task, is a 4-tuple Π = (P, A, I, G) where: P is a finite set of facts. A is a finite set of actions; each a ∈ A is a triple (pre_a, add_a, del_a). I ⊆ P is the initial state. G ⊆ P is the goal. Solution (plan) = a sequence of actions transforming I into a state s with G ⊆ s. So, what assumptions are we making about the environment? Fully observable, Deterministic, Static, Discrete, Single agent
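A minimal sketch (not from the slides) of how a STRIPS task can be represented and a candidate plan validated; the names Action, apply and is_plan are illustrative.

```python
from typing import FrozenSet, NamedTuple, Sequence

class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # facts that must hold for the action to be applicable
    add: FrozenSet[str]     # facts made true by the action
    delete: FrozenSet[str]  # facts made false by the action

def apply(state: FrozenSet[str], a: Action) -> FrozenSet[str]:
    """Progress a state through an applicable action."""
    assert a.pre <= state, f"{a.name} is not applicable"
    return (state - a.delete) | a.add

def is_plan(init: FrozenSet[str], goal: FrozenSet[str],
            plan: Sequence[Action]) -> bool:
    """Check that executing the plan from init ends in a state s with goal ⊆ s."""
    state = init
    for a in plan:
        if not a.pre <= state:
            return False
        state = apply(state, a)
    return goal <= state
```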

6 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 7/39 Our Agenda for This Chapter Planning with Uncertainty: Dealing with partially-observable environments. Non-deterministic Planning: Dealing with non-deterministic environments. Numeric Planning: Dealing with continuous environments. Multi-agent Planning: Dealing with environments where several agents cooperate. Temporal Planning: Dealing with actions whose effects take some time. Planning in the Real World: So, what is the role of classical planning in all this?

7 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 8/39 Disclaimer(s) This lecture intends to be a general overview of many different topics. The concepts and explanations in this chapter are broad and superficial; you are not expected to understand all the details. Moreover, we do not cover all types of non-classical planning.

8 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 10/39 Partially Observable Environments In the real world: we do not know everything! Example: Chapter 10: Wumpus world Differences with respect to classical planning: 1 We do not have full information about the initial state 2 (Optionally) Actions may give us new information about the current state

9 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 11/39 Belief State We have only partial knowledge about the initial state. How to represent our knowledge? Logic! Initial knowledge: You're in cell [1,1], and there is no pit there: ¬P_1,1. There's a Wumpus (W_1,1 ∨ W_1,2 ∨ W_1,3 ∨ ...). There's gold (G_1,1 ∨ G_1,2 ∨ G_1,3 ∨ ...). There's no stench in position [1,1]: ¬S_1,1. General knowledge: ¬S_1,1 ⇒ (¬W_1,1 ∧ ¬W_1,2 ∧ ¬W_2,1). Definition (Belief State). Let ϕ be a propositional formula that describes our knowledge about the current state. Then, the belief state B is the set of states that correspond to satisfying assignments of ϕ, i.e., the set of states that are consistent with our belief.
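The definition can be illustrated with a small sketch (an addition of this transcription, not part of the slides): a belief state is obtained by enumerating the truth assignments that satisfy the knowledge formula. In practice the formula would be handled symbolically rather than by enumeration.

```python
from itertools import product
from typing import Callable, FrozenSet, List

def belief_state(facts: List[str],
                 phi: Callable[[FrozenSet[str]], bool]) -> List[FrozenSet[str]]:
    """Enumerate all states (truth assignments over facts) that satisfy phi.

    Here phi is a Python predicate over the set of true facts; a real system
    would keep phi as a propositional formula and use a SAT solver instead of
    enumerating the (exponentially many) assignments.
    """
    states = []
    for values in product([False, True], repeat=len(facts)):
        state = frozenset(f for f, v in zip(facts, values) if v)
        if phi(state):
            states.append(state)
    return states

# Toy belief: the Wumpus is in exactly one of two distant cells, no stench in [1,1].
facts = ["W_3_1", "W_1_3", "S_1_1"]
phi = lambda s: (("W_3_1" in s) != ("W_1_3" in s)) and "S_1_1" not in s
print(belief_state(facts, phi))  # two states, one per possible Wumpus position
```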

10 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 12/39 Partially-Observable Planning: Conformant Planning Conformant Planning: A planning task with uncertainty in the initial state is a 4-tuple Π = (P, A, ϕ_I, G) where: P is a finite set of facts. A is a finite set of actions; each a ∈ A is a triple (pre_a, add_a, del_a). ϕ_I is the initial belief state. G ⊆ P is the goal. Find a sequence of actions that transforms the initial belief state ϕ_I into a belief state that entails the goal ϕ_G = ⋀ G (conformant plan). A conformant plan is a sequence of actions that works no matter which initial state we are in (among those compatible with our initial belief state).

11 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 13/39 Conformant Planning: How to solve it? Method 1: Heuristic search. Each search node contains a belief state that can be represented in different ways: Enumerate the states (possibly exponentially many). As a logical formula ϕ_B (checking whether an action is applicable in ϕ_B requires solving a satisfiability problem to check whether the formula entails the precondition of the action). Method 2: Compiling it to classical planning. The compilation is either exponential in the worst case or incomplete. Complexity: PlanEx of conformant planning is EXPSPACE-hard
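A sketch of Method 1 with explicitly enumerated belief states (illustrative only; real conformant planners use symbolic or compiled representations): an action is applicable in a belief state only if its precondition holds in every state of the belief, and the goal must hold in every state of the final belief.

```python
from typing import FrozenSet, NamedTuple, Set

class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]

def progress_belief(belief: Set[FrozenSet[str]], a: Action) -> Set[FrozenSet[str]]:
    """Apply a to an explicitly enumerated belief state. The action is applicable
    only if its precondition holds in every state of the belief; otherwise the
    plan could fail in some possible world."""
    if not all(a.pre <= s for s in belief):
        raise ValueError(f"{a.name} is not applicable in this belief state")
    return {(s - a.delete) | a.add for s in belief}

def belief_satisfies_goal(belief: Set[FrozenSet[str]], goal: FrozenSet[str]) -> bool:
    """A conformant plan succeeds only if the goal holds in all states of the final belief."""
    return all(goal <= s for s in belief)
```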

12 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 14/39 Questionnaire Question! What's a conformant plan for the Wumpus example? None, we need to sense the environment to discover where the Wumpus is!

13 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 15/39 Partially-Observable Planning: Contingent Planning Contingent Planning: A contingent planning task is a 4-tuple Π = (P, A, ϕ_I, G) where: P is a finite set of facts. A is a finite set of actions; each a ∈ A is a tuple (pre_a, add_a, del_a, obs_a). The values of the facts in obs_a are observed after executing the action. ϕ_I is the initial belief state. G ⊆ P is the goal.

14 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 16/39 Modelling the Wumpus Problem ϕ_I = { at_1,1, clear_1,1, clear_i,j ∨ wumpus_i,j, wumpus_i,j ⇒ stench_i,j+1, ... } G = {have_gold} move-x-y: pre: {at_x, clear_y}, add: {at_y}, del: {at_x}, obs: {stench_y, breeze_y} move-x-y has 4 different outcomes and we need to plan for each of them individually! (Blackboard)

15 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 17/39 Contingent Planning: How to Solve It Given a contingent planning task, find a tree of actions that transforms the initial belief state ϕ_I into belief states that entail the goal G. We may need different plans for every result of the observation actions! Method: An option is And-Or search, similar to minimax search, where we distinguish between Or/Max nodes (where we choose the action to apply) and And/Min nodes (where the environment chooses). Complexity: PlanEx of contingent planning is 2EXPSPACE-hard
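A generic And-Or search sketch roughly following the scheme described above (the interface functions actions and results are hypothetical placeholders for the contingent planning model): OR nodes choose an action, AND nodes require a sub-plan for every possible observation.

```python
from typing import Callable, Dict, Hashable, List, Optional, Union

State = Hashable
# A contingent plan: either the string "noop" (goal already reached), or a dict
# with the chosen action and one sub-plan per possible observation.
Plan = Union[str, Dict]

def and_or_search(state: State,
                  is_goal: Callable[[State], bool],
                  actions: Callable[[State], List[str]],
                  results: Callable[[State, str], Dict[str, State]],
                  path: Optional[List[State]] = None) -> Optional[Plan]:
    """Depth-first And-Or search: OR nodes pick an action, AND nodes must
    provide a sub-plan for every observation the environment may return."""
    path = path or []
    if is_goal(state):
        return "noop"
    if state in path:                 # loop: fail on this branch
        return None
    for a in actions(state):          # OR node: we choose the action
        branches: Dict[str, Plan] = {}
        for obs, succ in results(state, a).items():   # AND node: all outcomes
            sub = and_or_search(succ, is_goal, actions, results, path + [state])
            if sub is None:
                break
            branches[obs] = sub
        else:
            return {"action": a, "branches": branches}
    return None
```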

16 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 19/39 Stochastic Environments In the real world: we cannot always anticipate the effect of our actions! [Example: the action buy-lottery has two possible outcomes, win and lose.] Differences with respect to classical planning: 1 Actions can have multiple outcomes 2 (Optionally) If the probability of each outcome is known, then we call it probabilistic planning.

17 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 20/39 Markov Decision Processes The state space is replaced by a Markov Decision Process: A Markov Decision Process is a 6-tuple (S, A, T, R, I, S_G) where: S is a finite set of states. A is a finite set of actions. T is a transition function mapping each (s, a, s') to a probability. R is a reward function with R(s, a, s') ∈ ℝ. I is the initial state. S_G is the set of goal states. Differences with respect to classical search problems: Transitions have multiple outcomes, each with a given probability. Transitions provide a reward. Objective: Reach a goal state? Maximize reward?
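A small sketch of this MDP definition as a data structure, with the buy-lottery example filled in using made-up probabilities and rewards for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass
class MDP:
    states: List[str]
    actions: List[str]
    # T[(s, a)] = list of (successor, probability); probabilities sum to 1
    T: Dict[Tuple[str, str], List[Tuple[str, float]]]
    # R[(s, a, s')] = immediate reward for that transition
    R: Dict[Tuple[str, str, str], float]
    init: str
    goals: Set[str] = field(default_factory=set)

# The buy-lottery example: one action, two possible outcomes (numbers invented).
lottery = MDP(
    states=["start", "win", "lose"],
    actions=["buy-lottery"],
    T={("start", "buy-lottery"): [("win", 0.0001), ("lose", 0.9999)]},
    R={("start", "buy-lottery", "win"): 1_000_000.0,
       ("start", "buy-lottery", "lose"): -1.0},
    init="start",
    goals={"win"},
)
```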

18 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 21/39 Markov Decision Processes: Objective Find a policy: a mapping from states to actions. Multiple types of MDPs, depending on the objective: 1 Maximize reward: the accumulated reward is often infinite no matter what the agent does! (a) Maximize reward after a finite number of steps. (b) Maximize reward with a discount factor. 2 Reach a goal: (a) Find a policy that reaches a goal state with least average cost, if one exists. But what if it is not possible to reach the goal with probability 1? (b) Find a policy that reaches a goal state with maximum probability.

19 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 22/39 Probabilistic Planning: How to Solve It Value Iteration: Finding an optimal policy can be done in polynomial time in the size of the MDP! But: the MDP is often exponential in the size of the probabilistic planning task. LRTDP: We can use heuristic search to approximate the relevant part of the MDP. Complexity: PlanEx of contingent planning is 2EXPSPACE-hard
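A minimal sketch of value iteration for the discounted-reward objective, using the same dictionary representation as the MDP sketch above; it is illustrative and not how LRTDP or actual probabilistic planners are implemented.

```python
from typing import Dict, List, Tuple

def value_iteration(states: List[str],
                    actions: List[str],
                    T: Dict[Tuple[str, str], List[Tuple[str, float]]],
                    R: Dict[Tuple[str, str, str], float],
                    gamma: float = 0.95,
                    eps: float = 1e-6) -> Dict[str, str]:
    """Compute a policy maximizing expected discounted reward (discount gamma)."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: best expected value over the applicable actions.
            q = [sum(p * (R.get((s, a, s2), 0.0) + gamma * V[s2])
                     for s2, p in T[(s, a)])
                 for a in actions if (s, a) in T]
            if not q:
                continue            # no applicable action: value stays 0
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Extract a greedy policy from the converged value function.
    policy = {}
    for s in states:
        applicable = [a for a in actions if (s, a) in T]
        if applicable:
            policy[s] = max(applicable,
                            key=lambda a: sum(p * (R.get((s, a, s2), 0.0) + gamma * V[s2])
                                              for s2, p in T[(s, a)]))
    return policy
```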

20 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 23/39 Questionnaire (the buy-lottery example again) Question! In the real world, what do you do? Do you always plan ahead for every possibility/contingency? No, you do not want to spend an hour thinking about whether it is best to take the 102 or the 124 back to the city center. You may consider the possibility that the bus is delayed, but not that the bus crashes.

21 Online vs. Offline Planning Given a contingent/probabilistic planning task: Offline Planning: Plan ahead for every possibility. Find a contingent plan. Find an (optimal) policy. Online Planning: Decide what to do next: spend some time deciding what action to execute, execute it, observe the result, and re-plan if necessary. FF-Replan. Given a probabilistic planning task: 1 Drop the probabilities (assume that you get to choose the outcome). 2 Use a classical planner to solve the simplified task. 3 Execute the action recommended by the planner. 4 If the outcome is the expected one, continue. Otherwise, re-plan from the current state. Very effective online probabilistic planner, especially in tasks where probabilities model that actions have a (low) probability of failure. Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 24/39
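A sketch of the FF-Replan loop described above (the callbacks classical_plan, execute and predicted are hypothetical placeholders for the determinized planner, the environment, and the deterministic model).

```python
from typing import Callable, Hashable, List, Optional

State = Hashable

def ff_replan(initial: State,
              is_goal: Callable[[State], bool],
              classical_plan: Callable[[State], Optional[List[str]]],
              execute: Callable[[State, str], State],
              predicted: Callable[[State, str], State],
              max_steps: int = 1000) -> bool:
    """Online replanning loop in the spirit of FF-Replan.

    classical_plan solves the all-outcomes determinization of the task;
    execute returns the outcome actually observed in the environment;
    predicted returns the outcome the deterministic model expected.
    """
    state = initial
    plan: List[str] = []
    for _ in range(max_steps):
        if is_goal(state):
            return True
        if not plan:                       # no plan yet, or we ran out: (re)plan
            plan = classical_plan(state) or []
            if not plan:
                return False               # determinized task is unsolvable
        action = plan.pop(0)
        expected = predicted(state, action)
        state = execute(state, action)
        if state != expected:              # unexpected outcome: discard the plan
            plan = []
    return False
```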

22 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 26/39 Continuous Environments In the real world: We have numbers! Examples: fuel levels, voltage control.

23 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 27/39 Numeric STRIPS Numeric STRIPS Planning: Extends STRIPS by introducing numeric variables V_n with rational values in Q. Numeric expressions: We can do simple arithmetic (+, −, ×, ÷) with the values of variables and/or constants. Numeric conditions: compare numeric expressions with {<, ≤, =, ≥, >}. Numeric effects: assign a numeric expression to a numeric variable in V_n. Example: drive(x,y): pre: {at(x), fuel > 0}, add: {at(y)}, del: {at(x)}, assign: {fuel := fuel − 1}
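A small sketch of how numeric preconditions and effects extend STRIPS states and actions, using the drive example; the class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Set

@dataclass
class NumericState:
    facts: Set[str]                 # propositional part of the state
    values: Dict[str, float]        # numeric variables, e.g. {"fuel": 3.0}

@dataclass
class NumericAction:
    name: str
    pre: FrozenSet[str]                                     # propositional precondition
    num_pre: Callable[[Dict[str, float]], bool]             # numeric condition
    add: FrozenSet[str]
    delete: FrozenSet[str]
    assign: Callable[[Dict[str, float]], Dict[str, float]]  # numeric effects

drive_a_b = NumericAction(
    name="drive(a,b)",
    pre=frozenset({"at(a)"}),
    num_pre=lambda v: v["fuel"] > 0,
    add=frozenset({"at(b)"}),
    delete=frozenset({"at(a)"}),
    assign=lambda v: {**v, "fuel": v["fuel"] - 1},
)

def apply_numeric(s: NumericState, a: NumericAction) -> NumericState:
    assert a.pre <= s.facts and a.num_pre(s.values), f"{a.name} not applicable"
    return NumericState(facts=(s.facts - a.delete) | a.add, values=a.assign(s.values))

print(apply_numeric(NumericState(facts={"at(a)"}, values={"fuel": 2.0}), drive_a_b))
```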

24 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 28/39 Numeric STRIPS: How to solve it? Method 1: Compiling it to classical planning. Each number is discretized into a finite set of values: the loss of precision requires some rounding, causing these methods to be either unsound or incomplete. Method 2: Heuristic Search: Similar to classical planning, only that now heuristics must take the numbers into account (e.g. delete relaxation based on intervals). But the state space is not finite anymore! Complexity: PlanEx of Numeric STRIPS is undecidable

25 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 30/39 Multi-agent Environments In the real world: We are not the only agent! Competitive Environments: General Game Playing (see Chapter 07). Collaborative Environments: several agents cooperate towards a common goal.

26 Multi-agent Environments Multi-agent planning: Several agents must collaborate to achieve a common goal. Key: There is some global information known by all agents, but each agent has its own private facts, which it does not want to share with the rest. Multi-Agent STRIPS Planning: A multi-agent STRIPS planning task is a 4-tuple Π = (P, A, I, G) where: P is a finite set of facts, divided into private and public facts. A is a finite set of actions; each a ∈ A is a triple (pre_a, add_a, del_a); the actions are divided into private and public actions. I ⊆ P is the initial state. G ⊆ P is the goal. Agents must communicate during the planning process to share information about how they will achieve the goal. Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 31/39
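One way to make the private/public split concrete is the public projection of an action, i.e., the part of an action an agent is willing to share during distributed planning; this sketch is illustrative and assumes an MA-STRIPS-style split of facts into public and private.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class MAAction:
    name: str
    agent: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]

def public_projection(a: MAAction, public_facts: FrozenSet[str]) -> MAAction:
    """The view of an action that other agents get to see: private preconditions
    and effects are stripped away before the action is shared."""
    return MAAction(
        name=a.name,
        agent=a.agent,
        pre=a.pre & public_facts,
        add=a.add & public_facts,
        delete=a.delete & public_facts,
    )
```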

27 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 33/39 Hybrid Environments In the real world: Events do not happen instantaneously!

28 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 34/39 Durative Actions In classical planning the action effects happen immediately. However, in the real world, actions take time to execute. When does the precondition need to hold? When is the effect applied? Preconditions can be required at-start, at-end, or over-all (during the whole execution of the action); effects can be applied at-start or at-end.
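A sketch of a durative action with conditions and effects attached to time points, in the spirit of the list above; the truck-driving example and the field names are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class DurativeAction:
    name: str
    duration: float
    # conditions, split by when they must hold
    cond_at_start: FrozenSet[str] = frozenset()
    cond_over_all: FrozenSet[str] = frozenset()   # must hold during the whole execution
    cond_at_end: FrozenSet[str] = frozenset()
    # effects, split by when they are applied
    add_at_start: FrozenSet[str] = frozenset()
    del_at_start: FrozenSet[str] = frozenset()
    add_at_end: FrozenSet[str] = frozenset()
    del_at_end: FrozenSet[str] = frozenset()

# A drive that takes 10 time units; the road must stay clear throughout.
drive = DurativeAction(
    name="drive(a,b)",
    duration=10.0,
    cond_at_start=frozenset({"at(a)"}),
    cond_over_all=frozenset({"road-clear(a,b)"}),
    del_at_start=frozenset({"at(a)"}),
    add_at_end=frozenset({"at(b)"}),
)
```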

29 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 36/39 Planning in the Real World Question! If most real-world environments are non-deterministic, not fully observable, not discrete, not single-agent, and temporal, what is classical planning good for? [Agent-environment diagram as in Chapter 3.] The model does not try to simulate the environment; it is just a tool to make good decisions. Oftentimes, reasoning with a simplified model can still lead to intelligent decisions, and solutions are easier to compute than with more complex models. My two cents: Ideally, we should always provide an accurate description of the environment so that the AI simplifies it when necessary. However, automatic simplification methods are not yet powerful enough in all cases.

30 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 38/39 Summary There are many planning sub-areas that study model-based sequential decision making. Non-classical planning models reason about more complex environments: non-deterministic, partially-observable, continuous, temporal, etc. Solving these problems by computing a complete offline policy is hard (though many non-classical planners are able to do this satisfactorily in some domains). Many approaches are online, planning to decide the next action by looking into the future but without considering all alternatives. Classical planning and heuristic search techniques are still an important ingredient of many approaches that deal with complex environments.

31 Torralba and Wahlster Artificial Intelligence Chapter 16: Non-Classical Planning 39/39 References I
