Subgoal Discovery for Hierarchical Reinforcement Learning Using Learned Policies


Sandeep Goel and Manfred Huber
Department of Computer Science and Engineering
University of Texas at Arlington
Arlington, Texas
{goel, huber}@cse.uta.edu

Abstract

Reinforcement learning addresses the problem of learning to select actions in order to maximize an agent's performance in unknown environments. To scale reinforcement learning to complex real-world tasks, agents must be able to discover hierarchical structures within their learning and control systems. This paper presents a method by which a reinforcement learning agent can discover subgoals with certain structural properties. By discovering subgoals and including policies to subgoals as actions in its action set, the agent is able to explore more effectively and accelerate learning in other tasks in the same or similar environments where the same subgoals are useful. The agent discovers the subgoals by searching a learned policy model for states that exhibit certain structural properties. This approach is illustrated using gridworld tasks.

Introduction

Reinforcement learning (RL) (Kaelbling, Littman, and Moore, 1996) comprises a family of incremental algorithms that construct control policies through real-world experimentation. A key scaling problem of reinforcement learning is that in large domains an enormous number of decisions must be made. Hence, instead of learning using individual primitive actions, an agent could potentially learn much faster if it could abstract away the innumerable micro-decisions and focus instead on a small set of important decisions. This immediately raises the question of how to recognize hierarchical structures within learning and control systems and how to learn strategies for hierarchical decision making. Within the reinforcement learning paradigm, one way to do this is to introduce subgoals with their own reward functions, learn policies for achieving these subgoals, and then include these policies as actions. This strategy can facilitate skill transfer to other tasks and accelerate learning. It is desirable that the reinforcement learning agent discover the subgoals automatically.

Several researchers have proposed methods by which policies learned for a set of related tasks are examined for commonalities (Thrun and Schwartz, 1995) or are probabilistically combined to form new policies (Bernstein, 1999). However, neither of these RL methods introduces subgoals. In other work, subgoals are chosen based on information about how frequently a state was visited during policy acquisition or based on the reward obtained. Digney (Digney 1996, 1998) chooses states that are visited frequently, or states where the reward gradient is high, as subgoals. Similarly, McGovern (McGovern and Barto, 2001) uses diverse density to discover useful subgoals automatically. However, in the case of more complicated environments and rewards it can be difficult to accumulate and classify the sets of successful and unsuccessful trajectories needed to compute the density measure or frequency counts. In addition, these methods do not allow the agent to discover subgoals that are not explicitly part of the tasks used in the process of discovering them. In this paper, the focus is on discovering subgoals by searching a learned policy model for certain structural properties. This method is able to discover subgoals even if they are not a part of the successful trajectories of the policy.
If the agent can discover these subgoal states and learn policies to reach them, it can include these policies as actions and use them for effective exploration as well as to accelerate learning in other tasks in which the same subgoals are useful.

Reinforcement Learning

In the reinforcement learning framework, a learning agent interacts with an environment over a series of time steps t = 0, 1, 2, 3, ... At any instant in time the learner can observe the state of the environment, denoted by s ∈ S, and apply an action, a ∈ A. Actions change the state of the environment and also produce a scalar pay-off value (reward), denoted by r ∈ R. In a Markovian system, the next state and reward depend only on the preceding state and action, but they may depend on these in a stochastic manner. The objective of the agent is to learn to maximize the expected value of the reward received over time. It does this by learning a (possibly stochastic) mapping from states to actions called a policy, Π : S → A, i.e. a mapping from states s ∈ S to actions a ∈ A.

More precisely, the objective is to choose each action so as to maximize the expected return:

$R = E\left[ \sum_{i=0}^{\infty} \gamma^i r_i \right]$   (1)

where γ ∈ [0, 1) is a discount-rate parameter and r_i refers to the pay-off at time i. A common approach to solving this problem is to approximate the optimal state-action value function, or Q-function (Watkins, 1989), Q : S × A → R, which maps states s ∈ S and actions a ∈ A to scalar values. In particular, Q(s, a) represents the expected discounted sum of future rewards if action a is taken in state s and the optimal policy is followed afterwards. Hence Q, once learned, allows the learner to maximize R by picking actions greedily with respect to Q:

$\Pi(s) = \arg\max_{a \in A} Q(s, a)$   (2)

The value function Q is learned on-line through experimentation. Suppose that during learning the learner executes action a in state s, which leads to a new state s′ and the immediate pay-off r_{s,a}. Q-learning then uses this state transition to update Q(s, a) according to:

$Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left( r_{s,a} + \gamma \max_{a'} Q(s', a') \right)$   (3)

The scalar α ∈ [0, 1) is the learning rate.
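To make the update rule concrete, the following minimal sketch shows tabular Q-learning with ε-greedy action selection, implementing equations 2 and 3. The environment interface (reset and step) and all parameter values are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.9):
    """Tabular Q-learning. `env` is assumed to expose reset() -> state and
    step(state, action) -> (next_state, reward, done)."""
    Q = defaultdict(float)  # Q[(s, a)], implicitly initialized to 0
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a_: Q[(s, a_)])
            s_next, r, done = env.step(s, a)
            # equation (3): blend old estimate with the bootstrapped target
            best_next = max(Q[(s_next, a_)] for a_ in actions)
            Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * best_next)
            s = s_next
        epsilon = max(0.05, epsilon * 0.99)  # gradually reduce exploration
    # equation (2): act greedily with respect to the learned Q
    return {s: max(actions, key=lambda a_: Q[(s, a_)]) for (s, _) in Q}
```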
Subgoal Extraction

An example that shows that subgoals can be useful is a room-to-room navigation task where the agent should discover the utility of doorways as subgoals. If the agent can recognize that a doorway is a subgoal, then it can learn a policy to reach the doorway. This policy can accelerate learning on related tasks in the same or similar environments by allowing the agent to move between the rooms using single actions. The idea of using subgoals, however, is not confined to gridworlds or navigation tasks. Other tasks should also benefit from subgoal discovery. For example, consider a game in which the agent must find a key to open a door before it can proceed. If it can discover that having a key is a useful subgoal, then it will more quickly be able to learn how to advance from level to level (McGovern and Barto, 2001b). In the approach described in this paper, the focus is on discovering useful subgoals that can be defined in the agent's state space. Policies to those subgoals are then learned and added as actions.

In a regular space (regular space here refers to a uniformly connected state space), every state will have approximately the same expected number of direct predecessors under a given policy, except for regions near the goal state or close to boundaries (where the space is not regular). In a regular and unconstrained space, if the count of all the predecessors of every state under a given policy is accumulated and a curve of these counts along a chosen path is plotted, the expected curve would behave like the positive part of a quadratic, and the expected ratio of gradients along such a curve would be a positive constant. In the approach presented here, a subgoal state is a state with the following structural property: the state space trajectories originating from a significantly larger than expected number of states lead to the subgoal state, while its successor state does not have this property. Such states represent a funnel for the given policy. To identify such states it is possible to evaluate the ratio of the gradients of the count curve before and after the subgoal state. Consider a path under a given policy going through a subgoal state. The predecessors of the subgoal state along this path lie in a relatively unconstrained space, and thus the count curve for those states should be quadratic. However, the dynamics change strongly at the subgoal state. There will be a strong increase in the count, and the curve will become steeper as the path approaches a subgoal state. On the other hand, the increase in the count can be expected to be much lower for the successor state of the subgoal, as it again lies in a relatively unconstrained space. Thus the ratio of the gradients at this point will be high and easily distinguishable.

Let C(s) represent the count of predecessors of a state s under a given policy, and let C_t(s) be the count of predecessors that can reach s in exactly t steps:

$C_1(s) = \sum_{s' \neq s} P(s \mid s', \Pi(s'))$   (4)

$C_{t+1}(s) = \sum_{s' \neq s} P(s \mid s', \Pi(s'))\, C_t(s')$   (5)

$C(s) = \sum_{i=1}^{n} C_i(s)$   (6)

where n is such that C_{n+1} = C_n or n equals the number of states, whichever is smaller. The condition s′ ≠ s prevents the counting of one-step loops. P(s | s′, Π(s′)) is the probability of reaching state s from state s′ by taking action Π(s′) (in a deterministic world this probability is 1 or 0). If there are loops within the policy, then the counts for the states in the loop will become very high. This implies that, if no precautions are taken, the gradient criterion used here might also identify states in a loop as subgoals.
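Before turning to how the ratio is computed along a path, the following sketch shows one way the iteration of equations 4 through 6 could be implemented for a deterministic world, where P(s | s′, Π(s′)) is 1 exactly when the policy moves s′ to s. The next_state transition model and the dictionary representation are assumptions made for illustration.

```python
def predecessor_counts(states, policy, next_state):
    """Total predecessor count C(s) of equation (6) under a fixed policy.
    Deterministic world assumed, so the sums in (4) and (5) reduce to counts."""
    def push_forward(weights):
        # One application of equation (5): each state's weight flows to its
        # successor under the policy; one-step loops (s' == s) are skipped.
        out = {s: 0 for s in states}
        for sp in states:
            succ = next_state(sp, policy[sp])
            if succ != sp:
                out[succ] += weights[sp]
        return out

    C_t = push_forward({s: 1 for s in states})  # equation (4): direct predecessors
    total = dict(C_t)                           # running sum of equation (6)
    for _ in range(len(states)):                # n is at most the number of states
        C_next = push_forward(C_t)
        if C_next == C_t:                       # C_{n+1} = C_n: counts converged
            break
        for s in states:
            total[s] += C_next[s]
        C_t = C_next
    return total
```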

To calculate the ratio along a path under the given policy, let C(s_{t-1}) be the predecessor count for the state reached after t−1 steps from the initial state of the path, and let C(s_t) be the count for the state the agent will be in after executing t steps. The slope of the curve at step t can then be computed as:

$\Delta_t = C(s_t) - C(s_{t-1})$   (7)

To identify subgoals, the gradient ratio Δ_t / Δ_{t+1} is computed if Δ_t > Δ_{t+1} (if Δ_t ≤ Δ_{t+1}, then the ratio is less than 1 and the state does not fit the criterion; avoiding the computation of the ratio for such points thus saves computational effort). If the computed ratio is higher than a specified threshold, state s_t will be considered a potential subgoal. The threshold depends largely on the characteristics of the state space but can often be computed independently of the particular environment.

The subgoal extraction technique presented here is illustrated using a simple gridworld navigation problem. Figure 1 shows a four-room example environment on a 20x20 grid. For these experiments, the goal state was placed in the lower right portion of the grid and each trial started from the same state in the upper left corner, as shown in Figure 1.

Figure 1. Set of primitive actions (right) and gridworld (left) with the initial state in the upper left corner, the goal in the lower right portion, and a random path under the learned policy.

The action space consists of eight primitive actions (North, East, West, South, North-west, North-east, South-west, and South-east). The world is deterministic and each action succeeds in moving the agent in the chosen direction. With every action the agent receives a negative reward of -1 for a straight action and -1.2 for a diagonal action. In addition, the agent gets a reward of +10 when it reaches the goal state. The agent learns using Q-learning and ε-greedy exploration. It starts with ε = 0.9 (which means that 90% of the time it tries to explore by choosing a random action) and gradually decreases the exploration rate over the course of learning.

In this experiment the predecessor count for every state is computed exhaustively using equations 4, 5, and 6. However, for large state spaces the counts can be approximated using Monte Carlo sampling methods. The agent then evaluates the ratio of gradients along the count curve by choosing random paths, and picks the states in which the ratio is higher than the specified threshold as subgoal states. For this experiment the count curve along one of the randomly chosen paths through a subgoal state is shown in Figure 2. The path chosen is indicated in Figure 1 and the subgoal state is highlighted in both Figure 1 and Figure 2. The gradient ratio at step 4 (which is in regular space) is small, while it is 95.0 at step 6 (which is at a subgoal state). To show that the gradient ratios in the unconstrained portion of the state space and at a subgoal state are easily distinguishable, histograms of the distribution of these ratios in randomly generated environments are shown in Figure 3.

Figure 2. Count curve along a randomly chosen path through a subgoal state under the learned policy.

The histogram shows data collected from 12 randomly generated 20x20 gridworlds with randomly placed rooms and goals. Each run learns a policy model for the respective task using Q-learning and computes the counts of predecessors for every state using equations 4, 5, and 6. Gradient ratios for 40 random paths in each environment are shown in the histogram.

The subgoal states that the agent discovered in this experiment are shown in Figure 4. The subgoal state leading to the left room is identified here due to its structural properties under the policy, despite the fact that it does not lie on the successful paths between the start and the goal state. The agent did not discover the doorway of the smaller room as a subgoal state because the number of states from which the policy leads through that doorway is small compared to the other rooms, and hence the count for this state is not influenced significantly by its structural property.
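As a complement to the experiment above, the sketch below applies the gradient-ratio test of equation 7 along one sampled path. The threshold value and the guard against zero slopes are illustrative assumptions rather than values taken from the paper.

```python
def subgoals_on_path(path, C, threshold=10.0):
    """Scan a path (list of states visited under the policy) for funnel states.
    C maps each state to its predecessor count from equations (4)-(6)."""
    subgoals = []
    for t in range(1, len(path) - 1):
        delta_in = C[path[t]] - C[path[t - 1]]    # equation (7) at step t
        delta_out = C[path[t + 1]] - C[path[t]]   # equation (7) at step t+1
        # compute the ratio only when it can exceed 1 (and avoid division by 0)
        if delta_in > delta_out > 0 and delta_in / delta_out > threshold:
            subgoals.append(path[t])
    return subgoals
```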

To show that the method for discovering subgoals discussed above is not confined to gridworlds or navigation tasks, random worlds with 1600 states were generated. In these worlds a fixed number of actions was available in each state, and each action in a state s connected to a randomly chosen state s′ in its local neighborhood. The count metric was then established and gradient ratios were computed for these spaces with and without a subgoal. The results showed that the gradient ratios in the unconstrained portion of the state space and at a subgoal state are again easily distinguishable.

Figure 3. Histogram of the distribution of the gradient ratio in regular space (dark bars) and at subgoal states (light bars).

Figure 4. Subgoal states discovered by the agent (light gray states).

Hierarchical Policy Formation

The motivation for discovering subgoals is the effect that available policies leading to subgoals have on the agent's exploration and speed of learning in related tasks in the same or similar environments. If the agent randomly selects exploratory primitive actions, it is likely to remain within the more strongly connected regions of the state space. A policy for achieving a subgoal region, on the other hand, will tend to connect separate strongly connected areas. For example, in a room-to-room navigation task, navigation using primitive movement commands produces relatively strongly connected dynamics within each room but not between rooms. A doorway links two strongly connected regions. By adding a policy to reach a doorway subgoal, the rooms become more closely connected. This allows the agent to explore its environment more uniformly. It has been shown that this effect on exploration is one of the two main reasons that extended actions can dramatically affect learning (McGovern, 1998).

Learning policies to subgoals

To take advantage of the subgoal states, the agent uses Q-learning to learn a policy to each of the subgoals discovered in the previous step. These policies, which lead to the respective subgoal states (subgoal policies), are added to the action set of the agent, as sketched below.
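One plausible way to fold a learned subgoal policy into the action set is to wrap it as a single extended action that runs until its subgoal is reached, as in the sketch below. The wrapper class, its step cap, and the discounted-return bookkeeping are assumptions made for illustration; the paper does not specify this interface.

```python
class SubgoalPolicyAction:
    """Wraps a learned subgoal policy so it can be selected like a primitive
    action: executing it follows the policy until the subgoal is reached."""

    def __init__(self, policy, subgoal, max_steps=200):
        self.policy = policy        # state -> primitive action, learned by Q-learning
        self.subgoal = subgoal      # the discovered funnel state
        self.max_steps = max_steps  # safety cap (illustrative assumption)

    def execute(self, env, s, gamma=0.95):
        """Run the subgoal policy; return (final state, discounted reward, steps)."""
        total, k, done = 0.0, 0, False
        while s != self.subgoal and k < self.max_steps and not done:
            s, r, done = env.step(s, self.policy[s])
            total += (gamma ** k) * r
            k += 1
        return s, total, k

# The agent's extended action set is then, e.g.:
# actions = primitive_actions + [SubgoalPolicyAction(pi_1, subgoal_1),
#                                SubgoalPolicyAction(pi_2, subgoal_2)]
```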

Learning hierarchical policies

One reason that it is important for the learning agent to be able to detect subgoal states is the effect of subgoal policies on the rate of convergence to a solution. If the subgoals are useful, then learning should be accelerated. To ascertain that these subgoals help the agent improve its policy more quickly, two experiments were performed in which the agent learned a new task with and without the subgoal policies. The same 20x20 gridworld with three rooms was used to illustrate the results. Subgoal policies were included in the action set of the agent (Subg1, Subg2). The task was changed by moving the goal to the left-hand room as shown in Figure 5. The agent solves the new task using Q-learning with an exploration rate of 5%. The action sequence under the policy learned for the new task, when its action set included the subgoal policies, is (Subg2, South-west, South, South, South, South), where Subg2 refers to the subgoal policy which leads to the state shown in Figure 5.

Figure 5. New task with the goal state in the left-hand room.

Figure 6 shows the learning curves when the agent was using the subgoal policies and when it was using only primitive actions. The learning performance is compared in terms of the total reward that the agent would receive under the learned policy at that point of the learning process. The curves in Figure 6 are averaged over 10 learning runs. Only an initial part of the data is plotted to compare the two learning curves; with primitives only, the agent is still learning after 150,000 learning steps, while with subgoal policies the policy has already converged. After 400,000 learning steps the agent without subgoal policies also converges to the same overall performance. The vertical intervals along the curve indicate one standard deviation in each direction at that point.

Figure 6. Comparison of learning speed using subgoal policies and using primitive actions only.

Conclusions

This paper presents a method for discovering subgoals by searching a learned policy model for states that exhibit a funneling property. These subgoals are discovered by studying the dynamics along the predecessor count curve and can include states that are not an integral part of the initial policy. The experiments presented here show that discovering subgoals and including policies for these subgoals in the action set can significantly accelerate learning in other, related tasks. While the examples shown here are gridworld tasks, the presented approach for discovering and using subgoals is not confined to gridworlds or navigation tasks.

Acknowledgements

This work was supported in part by NSF ITR and UTA REP/RES-Huber-CSE.

References

Bernstein, D. S. (1999). Reusing old policies to accelerate learning on new MDPs (Technical Report UM-CS). Dept. of Computer Science, Univ. of Massachusetts, Amherst, MA.

Digney, B. (1996). Emergent hierarchical structures: Learning reactive/hierarchical relationships in reinforcement environments. From Animals to Animats 4: SAB 96. MIT Press/Bradford Books.

Digney, B. (1998). Learning hierarchical control structure for multiple tasks and changing environments. From Animals to Animats 5: SAB 98.

Kaelbling, L. P., Littman, M. L., and Moore, A. W. (1996). Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research, 4.

McGovern, A. (1998). Roles of macro-actions in accelerating reinforcement learning. Master's thesis, U. of Massachusetts, Amherst. Also Technical Report.

McGovern, A., and Barto, A. G. (2001). Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density. Proceedings of the 18th International Conference on Machine Learning.

McGovern, A., and Barto, A. G. (2001b). Accelerating Reinforcement Learning through the Discovery of Useful Subgoals. Proceedings of the 6th International Symposium on Artificial Intelligence, Robotics and Automation in Space.

Sutton, R. S., and Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.

Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3.

Thrun, S. B., and Schwartz, A. (1995). Finding structure in reinforcement learning. Advances in Neural Information Processing Systems 7. San Mateo, CA: Morgan Kaufmann.

Watkins, C. J. C. H. (1989). Learning from delayed rewards. PhD thesis, Dept. of Psychology, Univ. of Cambridge.
