Reinforcement Learning
Reinforcement Learning, LU 1 - Introduction
Dr. Joschka Bödecker
AG Maschinelles Lernen und Natürlichsprachliche Systeme, Albert-Ludwigs-Universität Freiburg
jboedeck@informatik.uni-freiburg.de
Acknowledgement: slides courtesy of Martin Riedmiller and Martin Lauer
Prof. Dr. M. Riedmiller, Dr. M. Lauer, Dr. J. Boedecker, Machine Learning Lab, University of Freiburg
Organisational issues
Dr. Joschka Boedecker, Room 00010, building 079
Office hours: Tuesday, 2-3 pm
No script; slides available online
Dates, winter term 2015/
Lecture: Monday, 14:00 (c.t.) - 15:30, SR, building 052, and Wednesday, 16:00 (s.t.) - 17:30, SR, building 052
Exercise sessions: Wednesday, 16:00-17:30, interleaved with the lecture, starting Oct. 28
Held by Jan Wülfing, wuelfj@informatik.uni-freiburg.de
Goal of this lecture
Introduction to the learning problem type Reinforcement Learning
Introduction to the mathematical basics of an independently learning system
Goal of the first unit
Motivation, definition and differentiation
Outline:
- Examples
- Solution approaches
- Machine Learning
- Reinforcement Learning
- Overview
Example: Backgammon
Can a program independently learn Backgammon?
Learning from success (win) and failure (loss)
Neuro-Backgammon: playing at world-champion level (Tesauro, 1992)
Example: pole balancing (control engineering)
Can a program independently learn balancing?
Learning from success and failure
Neural RL controller: noise, inaccuracies, unknown behaviour, non-linearities, ... (Riedmiller et al.)
Example: robot soccer
Can programs independently learn how to cooperate?
Learning from success and failure
Cooperative RL agents: complexity, distributed intelligence, ... (Riedmiller et al.)
Example: autonomous (e.g. humanoid) robots
Task: movement control similar to humans (walking, running, playing soccer, cycling, skiing, ...)
Input: image from a camera
Output: control signals to the joints
Problems: very complex, consequences of actions hard to predict, interference / noise
Example: Maze
The Agent Concept [Russell and Norvig 1995, page 33]
"An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors."
Examples: a human, a robot arm, an autonomous car, a motor controller, ...
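The perceive-act loop behind this definition can be sketched in a few lines of Python. The thermostat-style agent, its target temperature, and all class names here are invented for illustration, not taken from the lecture:

```python
# Illustrative sketch of the agent concept: an agent perceives its
# environment through sensors (sense) and acts on it through
# effectors (apply). All names and numbers are made up.

class ThermostatAgent:
    """A trivial agent: sense the temperature, act on the heater."""
    def __init__(self, target):
        self.target = target

    def act(self, percept):
        # percept: current temperature; action: heater on or off
        return "heat_on" if percept < self.target else "heat_off"

class RoomEnvironment:
    def __init__(self, temperature):
        self.temperature = temperature

    def sense(self):
        return self.temperature

    def apply(self, action):
        self.temperature += 0.5 if action == "heat_on" else -0.5

agent = ThermostatAgent(target=20.0)
env = RoomEnvironment(temperature=18.0)
for _ in range(10):                      # the perceive-act loop
    env.apply(agent.act(env.sense()))
print(round(env.temperature, 1))
```

The motor-controller example from the slide fits the same pattern: the percept and action spaces change, the loop does not.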
Solution approaches in Artificial Intelligence (AI)
- Planning / search (e.g. A*, backtracking)
- Deduction (e.g. logic programming, predicate logic)
- Expert systems (knowledge provided by experts)
- Fuzzy control systems (fuzzy logic)
- Genetic algorithms (evolution of solutions)
- Machine Learning (e.g. reinforcement learning)
Types of learning (in humans)
- Learning from a teacher
- Structuring of objects
- Learning from experience
Types of Machine Learning (ML)
Learning with a teacher - Supervised Learning: examples of input / (target) output pairs. Goal: generalization (in general, not simply memorization).
Structuring / recognition of correlations - Unsupervised Learning. Goal: clustering of similar data points, e.g. for preprocessing.
Learning through reward / penalty - Reinforcement Learning. Prerequisite: specification of a target goal (or of events to be avoided).
Machine Learning: ingredients
1. Type of the learning problem (what is given / what is sought)
2. Representation of the learned solution knowledge: table, rules, linear mapping, neural network, ...
3. Solution process (observed data → solution): (heuristic) search, gradient descent, optimization techniques, ...
Never this way round: "For this problem I need a neural network"
Emphasis of the lecture: Reinforcement Learning
No information regarding the solution strategy is required
Independent learning of a strategy by smart trial of solutions ("trial and error") - the biggest challenge for a learning system
Representation of solution knowledge by a function approximator (e.g. tables, linear models, neural networks, etc.)
RL using the example of autonomous robots
bad: damage (fall, ...)
good: task done successfully
better: fast / low-energy / smooth movements / ... → optimization!
Reinforcement Learning (RL)
Also called: learning from evaluations, autonomous learning, neuro-dynamic programming
Defines a type of learning problem, not a method!
Central feature: an evaluative training signal, e.g. good / bad
RL with immediate evaluation: decision → evaluation. Example: parameters for a basketball throw
RL with rewards delayed in time: decision, decision, ..., decision → evaluation. Substantially harder; interesting because of its versatile applications
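The simpler of the two settings, RL with immediate evaluation, can be sketched as an ε-greedy learner over a handful of one-shot decisions (a bandit-style toy problem). The three decisions and their hidden reward probabilities are invented for the example:

```python
import random

# Toy RL with immediate evaluation: every decision (arm) is followed
# directly by an evaluation (reward 0 or 1). The learner keeps a
# running evaluation per decision and mostly picks the best one.
random.seed(0)
true_reward_prob = [0.2, 0.8, 0.5]       # hidden quality of each decision
estimates = [0.0, 0.0, 0.0]              # learned evaluation per decision
counts = [0, 0, 0]
epsilon = 0.1                            # exploration rate

for _ in range(5000):
    if random.random() < epsilon:        # explore: try a random decision
        a = random.randrange(3)
    else:                                # exploit: best evaluation so far
        a = max(range(3), key=lambda i: estimates[i])
    reward = 1.0 if random.random() < true_reward_prob[a] else 0.0
    counts[a] += 1
    estimates[a] += (reward - estimates[a]) / counts[a]  # incremental mean

print(max(range(3), key=lambda i: estimates[i]))
```

The delayed-reward setting is exactly this problem made harder: the evaluation arrives only after a whole sequence of decisions, so it must somehow be attributed to each of them.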
Delayed RL
Decision, decision, ..., decision → evaluation
Examples: robotics, control systems, games (chess, backgammon)
Basic problem: temporal credit assignment
Basic architecture: actor-critic system
Multistage decision problems
Actor-critic system (Barto, Sutton, 1983)
Actor: in situation s, choose action u (strategy π : S → U)
Critic: distributes the external evaluation signal onto the single actions
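A minimal tabular sketch in the spirit of an actor-critic system (not the exact AHC/ACE algorithm): the critic learns state values by temporal differences, and its error signal is the per-action evaluation handed to the actor. The chain task and all constants are invented for illustration:

```python
import math
import random

# Sketch of an actor-critic learner on a toy chain of 5 states.
# Moving right (+1) eventually reaches the goal state N with reward 1;
# moving left (-1) stays in the chain. All numbers are illustrative.
random.seed(1)
N, GAMMA, ALPHA, BETA = 5, 0.9, 0.1, 0.1
V = [0.0] * (N + 1)                        # critic: state values
pref = [[0.0, 0.0] for _ in range(N + 1)]  # actor: preferences (0=left, 1=right)

def policy(s):
    """Softmax over the actor's action preferences."""
    e = [math.exp(p) for p in pref[s]]
    return 0 if random.random() < e[0] / sum(e) else 1

for _ in range(500):
    s = 0
    while s < N:
        a = policy(s)
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == N else 0.0
        # Critic: the TD error distributes the delayed evaluation
        # signal onto the single decision taken in state s.
        delta = r + (0.0 if s2 == N else GAMMA * V[s2]) - V[s]
        V[s] += ALPHA * delta
        pref[s][a] += BETA * delta         # actor: reinforce if delta > 0
        s = s2

print(all(pref[s][1] > pref[s][0] for s in range(N)))
```

After training, the actor prefers "right" in every state, i.e. the critic has successfully turned the single end-of-episode evaluation into per-step feedback.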
Reinforcement Learning
1959: Samuel's checker player: temporal difference (TD) methods
1968: Michie and Chambers: Boxes
1983: Barto and Sutton's AHC/ACE; 1987: Sutton's TD(λ)
Early 90s: connection between dynamic programming (DP) and RL: Werbos, Sutton, Barto, Watkins, Singh, Bertsekas
DP: a classic optimization technique (late 50s: Bellman); too much effort for large tasks; advantage: clean mathematical formulation, convergence results
2000: policy gradient methods (Sutton et al., Peters et al., ...)
2005: Fitted Q (batch DP method) (Ernst et al., Riedmiller, ...)
Many examples of successful, at least practically relevant, applications since
Other examples

field             | input           | goal            | example                       | output (actions)
------------------|-----------------|-----------------|-------------------------------|-----------------
games             | board situation | winning         | backgammon, chess             | valid move
robotics          | sensor data     | reference value | pendulum, robot soccer        | control variable
sequence planning | state           | gain            | assembly line, mobile network | candidate
benchmark         | state           | goal position   | maze                          | direction
Goal: Autonomous learning system
Approach - rough outline
Formulation of the learning problem as an optimization task
Solution by learning, based on the optimization technique of Dynamic Programming
Difficulties: very large state space; process behaviour unknown → application of approximation techniques (e.g. neural networks, ...)
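The DP-based part of this approach can be sketched with value iteration on a tiny maze, echoing the maze example above. The grid layout, reward, and discount factor are invented for illustration:

```python
# Value iteration on a tiny deterministic 3x3 maze (one blocked cell,
# goal in the corner, reward 1 on entering the goal). Repeatedly
# applying the Bellman optimality update yields the optimal value of
# every state. All numbers are illustrative.

WALLS = {(1, 1)}
GOAL = (2, 2)
GAMMA = 0.9
states = [(r, c) for r in range(3) for c in range(3) if (r, c) not in WALLS]
actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(s, a):
    s2 = (s[0] + a[0], s[1] + a[1])
    return s2 if s2 in states else s      # bumping a wall keeps you in place

V = {s: 0.0 for s in states}
for _ in range(100):                      # iterate the Bellman update
    V = {s: 0.0 if s == GOAL else
         max((1.0 if step(s, a) == GOAL else 0.0) + GAMMA * V[step(s, a)]
             for a in actions)
         for s in states}

print(round(V[(0, 0)], 3))                # gamma^(path length - 1)
```

The difficulties named above are visible even here: the loop enumerates every state and assumes `step` is known. With a huge state space or unknown process behaviour, the table `V` must be replaced by a function approximator and the model by sampled experience.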
Outline of lecture
1. part: Introduction
2. part: Dynamic Programming - Markov Decision Problems, backwards DP, Value Iteration, Policy Iteration
3. part: Approximate DP / Reinforcement Learning - Monte Carlo methods, stochastic approximation, TD(λ), Q-learning
4. part: Advanced methods of Reinforcement Learning - policy gradient methods, hierarchical methods, POMDPs, relational Reinforcement Learning
5. part: Applications of Reinforcement Learning - robot soccer, pendulum, RL competition
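As a preview of the model-free methods of the 3. part, here is a tabular Q-learning sketch: unlike the DP methods of the 2. part, it learns action values from sampled transitions only, without knowing the process behaviour. The chain task and all constants are invented for illustration:

```python
import random

# Tabular Q-learning on a toy chain of 5 states: action 1 moves right
# (+1), action 0 moves left (-1, floored at 0); reaching state N pays
# reward 1 and ends the episode. All numbers are illustrative.
random.seed(2)
N, GAMMA, ALPHA, EPS = 5, 0.9, 0.2, 0.2
Q = [[0.0, 0.0] for _ in range(N + 1)]   # Q[s][a], a: 0=left, 1=right

for _ in range(2000):
    s = 0
    while s < N:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == N else 0.0
        target = r if s2 == N else r + GAMMA * max(Q[s2])
        Q[s][a] += ALPHA * (target - Q[s][a])  # Q-learning update
        s = s2

print([round(max(q), 2) for q in Q[:N]])
```

The learned values approach γ^(N-1-s) for the best action in each state, the same quantities that value iteration would compute from the model; Q-learning recovers them from experience alone.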
Further courses on machine learning
Lecture: Machine Learning (summer term)
Lab course: Deep Learning (Wed., 10-12)
Bachelor / Master theses, team projects
Further readings
WWW:
D. P. Bertsekas and J. N. Tsitsiklis. Neuro-Dynamic Programming. Athena Scientific, Belmont, Massachusetts.
A. Barto and R. Sutton. Reinforcement Learning. MIT Press, Cambridge, Massachusetts.
M. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, New York.
L. P. Kaelbling, M. L. Littman and A. W. Moore. Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research, 4, 1996.
M. Wiering (ed.). Reinforcement Learning: State-of-the-Art. Springer.
1. WE BELIEVE We believe a successful Teaching and Learning Policy enables all children to be effective learners; to have the confidence to take responsibility for their own learning; understand what it
More informationAn investigation of imitation learning algorithms for structured prediction
JMLR: Workshop and Conference Proceedings 24:143 153, 2012 10th European Workshop on Reinforcement Learning An investigation of imitation learning algorithms for structured prediction Andreas Vlachos Computer
More informationCOMPUTER-AIDED DESIGN TOOLS THAT ADAPT
COMPUTER-AIDED DESIGN TOOLS THAT ADAPT WEI PENG CSIRO ICT Centre, Australia and JOHN S GERO Krasnow Institute for Advanced Study, USA 1. Introduction Abstract. This paper describes an approach that enables
More informationComputers Change the World
Computers Change the World Computing is Changing the World Activity 1.1.1 Computing Is Changing the World Students pick a grand challenge and consider how mobile computing, the Internet, Big Data, and
More informationTop US Tech Talent for the Top China Tech Company
THE FALL 2017 US RECRUITING TOUR Top US Tech Talent for the Top China Tech Company INTERVIEWS IN 7 CITIES Tour Schedule CITY Boston, MA New York, NY Pittsburgh, PA Urbana-Champaign, IL Ann Arbor, MI Los
More informationAn Investigation into Team-Based Planning
An Investigation into Team-Based Planning Dionysis Kalofonos and Timothy J. Norman Computing Science Department University of Aberdeen {dkalofon,tnorman}@csd.abdn.ac.uk Abstract Models of plan formation
More informationWhole School Literacy Policy 2017/18
Whole School Literacy Policy 2017/18 A central aim of teaching and learning is to develop students ability to use language effectively, in order to think, explore, organise and communicate meaning. Improved
More informationLEARNING TO PLAY IN A DAY: FASTER DEEP REIN-
LEARNING TO PLAY IN A DAY: FASTER DEEP REIN- FORCEMENT LEARNING BY OPTIMALITY TIGHTENING Frank S. He Department of Computer Science University of Illinois at Urbana-Champaign Zhejiang University frankheshibi@gmail.com
More informationRover Races Grades: 3-5 Prep Time: ~45 Minutes Lesson Time: ~105 minutes
Rover Races Grades: 3-5 Prep Time: ~45 Minutes Lesson Time: ~105 minutes WHAT STUDENTS DO: Establishing Communication Procedures Following Curiosity on Mars often means roving to places with interesting
More informationMASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE
Master of Science (M.S.) Major in Computer Science 1 MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE Major Program The programs in computer science are designed to prepare students for doctoral research,
More informationXXII BrainStorming Day
UNIVERSITA DEGLI STUDI DI CATANIA FACOLTA DI INGEGNERIA PhD course in Electronics, Automation and Control of Complex Systems - XXV Cycle DIPARTIMENTO DI INGEGNERIA ELETTRICA ELETTRONICA E INFORMATICA XXII
More informationPlanning with External Events
94 Planning with External Events Jim Blythe School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 blythe@cs.cmu.edu Abstract I describe a planning methodology for domains with uncertainty
More informationFall Classes At A Glance
Fall 2017 Fall Classes At A Glance @ Stonegate Elementary WHAT IS THE ACE PROGRAM AND WHAT ARE ACE CLASSES? The ACE Program (Afterschool Classroom Enrichment) is a program sponsored by IPSF (Irvine Public
More informationMaster s Programme in Computer, Communication and Information Sciences, Study guide , ELEC Majors
Master s Programme in Computer, Communication and Information Sciences, Study guide 2015-2016, ELEC Majors Sisällysluettelo PS=pääsivu, AS=alasivu PS: 1 Acoustics and Audio Technology... 4 Objectives...
More informationADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY DOWNLOAD EBOOK : ADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY PDF
Read Online and Download Ebook ADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY DOWNLOAD EBOOK : ADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY PDF Click link bellow and free register to download
More informationSyllabus for CHEM 4660 Introduction to Computational Chemistry Spring 2010
Instructor: Dr. Angela Syllabus for CHEM 4660 Introduction to Computational Chemistry Office Hours: Mondays, 1:00 p.m. 3:00 p.m.; 5:00 6:00 p.m. Office: Chemistry 205C Office Phone: (940) 565-4296 E-mail:
More informationHuman Emotion Recognition From Speech
RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati
More informationAUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION
JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 11/2007, ISSN 1642-6037 Marek WIŚNIEWSKI *, Wiesława KUNISZYK-JÓŹKOWIAK *, Elżbieta SMOŁKA *, Waldemar SUSZYŃSKI * HMM, recognition, speech, disorders
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationApplying Fuzzy Rule-Based System on FMEA to Assess the Risks on Project-Based Software Engineering Education
Journal of Software Engineering and Applications, 2017, 10, 591-604 http://www.scirp.org/journal/jsea ISSN Online: 1945-3124 ISSN Print: 1945-3116 Applying Fuzzy Rule-Based System on FMEA to Assess the
More informationarxiv: v1 [cs.lg] 15 Jun 2015
Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy arxiv:1506.04477v1 [cs.lg] 15 Jun 2015 Sang-Woo Lee Min-Oh Heo School of Computer Science and
More informationA student diagnosing and evaluation system for laboratory-based academic exercises
A student diagnosing and evaluation system for laboratory-based academic exercises Maria Samarakou, Emmanouil Fylladitakis and Pantelis Prentakis Technological Educational Institute (T.E.I.) of Athens
More informationLearning to Schedule Straight-Line Code
Learning to Schedule Straight-Line Code Eliot Moss, Paul Utgoff, John Cavazos Doina Precup, Darko Stefanović Dept. of Comp. Sci., Univ. of Mass. Amherst, MA 01003 Carla Brodley, David Scheeff Sch. of Elec.
More informationPH.D. IN COMPUTER SCIENCE PROGRAM (POST M.S.)
PH.D. IN COMPUTER SCIENCE PROGRAM (POST M.S.) OVERVIEW ADMISSION REQUIREMENTS PROGRAM REQUIREMENTS OVERVIEW FOR THE PH.D. IN COMPUTER SCIENCE Overview The doctoral program is designed for those students
More informationSurprise-Based Learning for Autonomous Systems
Surprise-Based Learning for Autonomous Systems Nadeesha Ranasinghe and Wei-Min Shen ABSTRACT Dealing with unexpected situations is a key challenge faced by autonomous robots. This paper describes a promising
More informationMath 181, Calculus I
Math 181, Calculus I [Semester] [Class meeting days/times] [Location] INSTRUCTOR INFORMATION: Name: Office location: Office hours: Mailbox: Phone: Email: Required Material and Access: Textbook: Stewart,
More informationCS 1103 Computer Science I Honors. Fall Instructor Muller. Syllabus
CS 1103 Computer Science I Honors Fall 2016 Instructor Muller Syllabus Welcome to CS1103. This course is an introduction to the art and science of computer programming and to some of the fundamental concepts
More informationMultisensor Data Fusion: From Algorithms And Architectural Design To Applications (Devices, Circuits, And Systems)
Multisensor Data Fusion: From Algorithms And Architectural Design To Applications (Devices, Circuits, And Systems) If searching for the ebook Multisensor Data Fusion: From Algorithms and Architectural
More information