Last Class: Agents acting in an environment


Last Class: Agents acting in an environment
[Figure: an agent, with abilities, goals/preferences, prior knowledge, and past experiences, receives observations from the environment and performs actions on it.]

Clicker Question: The ability of the agent is:
A. What functions the agent is able to carry out
B. The set of actions available to the agent
C. Whether it can play tennis
D. What the agent wants
E. What it has learned from experience
Answer: B

Clicker Question: An agent that does not learn does not need:
A. Abilities
B. Goals/Preferences
C. Prior Knowledge
D. Observations
E. Past experiences
Answer: E

Discussion Groups: For discussion groups I would prefer to use:
A. Connect
B. Piazza (even though it is hosted in the USA and will be monitored by the NSA)
C. My.CS.ubc.ca
D. wiki.ubc.ca
E. I volunteer to research other options and report back on Monday

Python Tutorial: I would prefer a Python tutorial (and would come) next week:
A. Tue 1:00-2:00pm
B. Wed 11:00am-12:00pm
C. Thu 1:00-2:00pm
D. Fri 2:00-3:00pm
E. I want to go but can't go to any of these

Learning Objectives
At the end of the class you should be able to:
characterize simplifying assumptions made in building AI systems
determine what simplifying assumptions particular AI systems are making
suggest what assumptions to lift to build a more intelligent system than an existing one

Dimensions
Research proceeds by making simplifying assumptions and gradually reducing them. Each simplifying assumption gives a dimension of complexity:
a dimension has multiple values, from simple to complex
simplifying assumptions can be relaxed in various combinations

Dimensions of Complexity
Deterministic or stochastic dynamics
Fully observable or partially observable
Explicit states or features or individuals and relations
Static or finite stage or indefinite stage or infinite stage
Goals or complex preferences
Perfect rationality or bounded rationality
Flat or modular or hierarchical
Single agent or multiple agents
Knowledge is given or learned from experience
Reason offline or reason online while interacting with the environment

Uncertainty
There are two dimensions for uncertainty. In each dimension an agent can have:
no uncertainty: the agent knows what is true
disjunctive uncertainty: there is a set of states that are possible
probabilistic uncertainty: a probability distribution over states

Why Probability?
Agents need to act even if they are uncertain. Predictions are needed to decide what to do:
definitive predictions: you will be run over tomorrow
disjunctions: be careful or you will be run over
point probabilities: the probability you will be run over tomorrow is 0.002 if you are careful and 0.05 if you are not careful
probability ranges: you will be run over with probability in the range [0.001, 0.34]
Acting is gambling: agents who don't use probabilities will lose to those who do. Probabilities can be learned from data and prior knowledge.
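The "acting is gambling" point can be made concrete by comparing actions on expected cost. A minimal sketch, using the slide's probabilities (0.002 if careful, 0.05 if not); the two cost values below are hypothetical, chosen only to illustrate the calculation:

```python
# Hypothetical costs (not from the lecture): being run over is very bad,
# being careful has a small fixed cost.
COST_RUN_OVER = 1000.0
COST_OF_CARE = 1.0

def expected_cost(p_run_over: float, fixed_cost: float) -> float:
    """Expected cost = P(run over) * cost of being run over + fixed cost."""
    return p_run_over * COST_RUN_OVER + fixed_cost

careful = expected_cost(0.002, COST_OF_CARE)   # 0.002 * 1000 + 1 = 3.0
careless = expected_cost(0.05, 0.0)            # 0.05  * 1000     = 50.0
best = "careful" if careful < careless else "careless"
print(best)  # -> careful
```

An agent with only the disjunction "be careful or you will be run over" cannot do this comparison; point probabilities make the tradeoff computable.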

Uncertain Dynamics
If an agent knew the initial state and its action, could it predict the resulting state? The dynamics can be:
deterministic: the resulting state is determined by the action and the state
stochastic: there is uncertainty about the resulting state
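The distinction can be sketched as two transition functions. The integer-position dynamics and the 0.8 success probability below are illustrative assumptions, not from the lecture:

```python
import random

# Deterministic dynamics: the next state is a function of the state and action.
def det_step(state: int, action: int) -> int:
    return state + action  # hypothetical toy dynamics on integer positions

# Stochastic dynamics: the same (state, action) pair can yield different
# next states. Here the move succeeds with probability 0.8; otherwise the
# agent stays put (0.8 is an illustrative assumption).
def stoch_step(state: int, action: int, rng: random.Random) -> int:
    return state + action if rng.random() < 0.8 else state

rng = random.Random(0)  # fixed seed for reproducibility
assert det_step(3, 1) == 4  # deterministic: always the same result
samples = {stoch_step(3, 1, rng) for _ in range(100)}
print(samples)  # -> {3, 4}: both outcomes occur across repeated trials
```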

Sensing Uncertainty
Whether an agent can determine the state from its observations:
fully observable: the agent can observe the state of the world
partially observable: there can be a number of states that are possible given the agent's observations

Clicker Question: Chess is:
A. Stochastic and Partially Observable
B. Stochastic and Fully Observable
C. Deterministic and Fully Observable
D. Deterministic and Partially Observable
E. None of the above or more than one of the above

Clicker Question: Backgammon is:
A. Stochastic and Partially Observable
B. Stochastic and Fully Observable
C. Deterministic and Fully Observable
D. Deterministic and Partially Observable
E. None of the above or more than one of the above

Clicker Question: Poker is:
A. Stochastic and Partially Observable
B. Stochastic and Fully Observable
C. Deterministic and Fully Observable
D. Deterministic and Partially Observable
E. None of the above or more than one of the above

Succinctness and Expressiveness
Much of modern AI is about finding compact representations and exploiting the compactness for computational gains. An agent can reason in terms of:
explicit states: a state is one way the world could be
features or propositions: states can be described using features; 30 binary features can represent 2^30 = 1,073,741,824 states
individuals and relations: there is a feature for each relationship on each tuple of individuals; often an agent can reason without knowing the individuals, or when there are infinitely many individuals
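The gap between the factored and explicit views can be shown directly: 30 binary features induce 2^30 explicit states, yet a single state is just an assignment to the features. The feature names below are hypothetical examples:

```python
# 30 binary features induce 2**30 explicit states.
n_features = 30
n_states = 2 ** n_features
assert n_states == 1_073_741_824

# A factored (feature-based) description names only what holds in this state:
state = {"light_on": True, "door_open": False, "robot_has_coffee": True}

# The explicit view identifies the same state with one of 2**k state ids,
# here by reading the feature values as bits.
state_id = sum(2**i for i, value in enumerate(state.values()) if value)
print(state_id)  # -> 5 (bits 0 and 2 set)
```

The compact view pays off because most reasoning touches only a few features at a time, rather than all 2^30 states.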

Planning Horizon
How far the agent looks into the future when deciding what to do:
static: the world does not change
finite stage: the agent reasons about a fixed finite number of time steps
indefinite stage: the agent reasons about a finite, but not predetermined, number of time steps
infinite stage: the agent plans for going on forever (process oriented)

Goals or Complex Preferences
An achievement goal is a goal to be achieved. This can be a complex logical formula.
Complex preferences may involve tradeoffs between various desiderata, perhaps at different times:
ordinal: only the order matters
cardinal: absolute values also matter
We will examine cardinal preferences, called utilities. Examples: coffee delivery robot, medical doctor.
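The ordinal/cardinal distinction can be sketched with the lecture's coffee example. The ordering and the utility numbers below are hypothetical, chosen only to show why absolute values matter:

```python
# Ordinal preference: only the order matters.
ordering = ["coffee", "tea", "water"]  # hypothetical: coffee over tea over water

def prefers(a: str, b: str) -> bool:
    """True if option a is ranked strictly above option b."""
    return ordering.index(a) < ordering.index(b)

assert prefers("coffee", "tea")

# Cardinal preference: absolute values (utilities) also matter, so the
# agent can weigh tradeoffs under uncertainty.
utility = {"coffee": 10.0, "tea": 6.0, "water": 1.0}

# E.g. a 50% chance of coffee is worth less than tea for sure here;
# the ordinal ranking alone cannot express this comparison.
assert 0.5 * utility["coffee"] < utility["tea"]
```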

Clicker Question: "Sam prefers coffee to tea" is:
A. achievement goal
B. ordinal preference
C. cardinal preference

Clicker Question: "Deliver coffee to Sam" is:
A. achievement goal
B. ordinal preference
C. cardinal preference

Perfect Rationality or Bounded Rationality
Perfect rationality: the agent can determine the best course of action, without taking into account its limited computational resources.
Bounded rationality: the agent must make good decisions based on its perceptual, computational and memory limitations.

Overview of Course

               dynamics  observable  repr    stage     preference  rationality
search         det       fully       states  indef     goals       perfect
CSPs           det       fully       feats   static    -           perfect
SLS            det       fully       feats   static    -           bounded
logic          det       fully       relns   static    -           perfect
planning       det       fully       feats   indef     goals       perfect
belief nets    stoch     partial     feats   static    -           perfect
stoch siml     stoch     partial     feats   static    -           bounded
decision nets  stoch     partial     feats   finite    utility     perfect
MDPs           stoch     fully       states  infinite  utility     perfect

Modularity
Model at one level of abstraction: flat.
Model with interacting modules that can be understood separately: modular.
Model with modules that are (recursively) decomposed into modules: hierarchical.
Example: planning a trip from here to see the Mona Lisa in Paris.
Flat representations are adequate for simple systems. Complex biological systems, computer systems, and organizations are all hierarchical.
A flat description is either continuous or discrete. Hierarchical reasoning is often a hybrid of continuous and discrete.

By a hierarchic system, or hierarchy, I mean a system that is composed of interrelated subsystems, each of the latter being in turn hierarchic in structure until we reach some lowest level of elementary subsystem. In most systems of nature it is somewhat arbitrary as to where we leave off the partitioning and what subsystems we take as elementary. Physics makes much use of the concept of elementary particle, although the particles have a disconcerting tendency not to remain elementary very long... Empirically a large proportion of the complex systems we observe in nature exhibit hierarchic structure. On theoretical grounds we would expect complex systems to be hierarchies in a world in which complexity had to evolve from simplicity. Herbert A. Simon, The Sciences of the Artificial, 1996

Single Agent or Multiple Agents
Single-agent reasoning: any other agents are part of the environment.
Multiple-agent reasoning: an agent reasons strategically about the reasoning of other agents.
Agents can have their own goals: cooperative, competitive, or independent of each other.

Learning from Experience
Whether the model is fully specified a priori:
knowledge is given
knowledge is learned from data or past experience

Interaction
Reason offline, or reason online while interacting with the environment.

Dimensions of Complexity
Deterministic or stochastic dynamics
Fully observable or partially observable
Explicit states or features or individuals and relations
Static or finite stage or indefinite stage or infinite stage
Goals or complex preferences
Perfect rationality or bounded rationality
Flat or modular or hierarchical
Single agent or multiple agents
Knowledge is given or learned from experience
Reason offline or reason online while interacting with the environment

State-space Search
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

Classical Planning
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

Decision Networks
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

Markov Decision Processes (MDPs)
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment
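MDPs combine stochastic dynamics, full observability, explicit states, an infinite stage, and utilities. A minimal value-iteration sketch; the two states, actions, transition probabilities, rewards, and discount factor below are all hypothetical, chosen only to illustrate the computation:

```python
GAMMA = 0.9  # discount factor for the infinite-stage criterion

# P[s][a] = list of (probability, next_state); R[s][a] = immediate reward.
# Hypothetical two-state MDP: "go" from state 0 reaches state 1 with prob 0.8.
P = {
    0: {"stay": [(1.0, 0)], "go": [(0.8, 1), (0.2, 0)]},
    1: {"stay": [(1.0, 1)], "go": [(1.0, 0)]},
}
R = {
    0: {"stay": 0.0, "go": -1.0},
    1: {"stay": 2.0, "go": 0.0},
}

# Value iteration: repeatedly apply the Bellman optimality backup.
V = {0: 0.0, 1: 0.0}
for _ in range(200):  # enough iterations to converge for GAMMA = 0.9
    V = {
        s: max(
            R[s][a] + GAMMA * sum(p * V[s2] for p, s2 in P[s][a])
            for a in P[s]
        )
        for s in P
    }
print(V)  # V[1] converges to 2 / (1 - 0.9) = 20.0
```

The max over actions at each state is what makes the result an optimal (expected discounted) utility rather than the value of one fixed plan.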

Decision-theoretic Planning
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

Reinforcement Learning
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

Classical Game Theory
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

Humans
deterministic or stochastic dynamics
fully observable or partially observable
explicit states or features or individuals and relations
static or finite stage or indefinite stage or infinite stage
goals or complex preferences
perfect rationality or bounded rationality
flat or modular or hierarchical
single agent or multiple agents
knowledge is given or knowledge is learned
reason offline or reason while interacting with environment

The Dimensions Interact in Complex Ways
Partial observability makes multi-agent and indefinite-horizon reasoning more complex.
Modularity interacts with uncertainty and succinctness: some levels may be fully observable, some may be partially observable.
Three values of dimensions promise to make reasoning simpler for the agent:
hierarchical reasoning
individuals and relations
bounded rationality