Learning Perceptual Coupling for Motor Primitives

Jens Kober, Betty Mohler, Jan Peters
Max-Planck-Institute for Biological Cybernetics, Spemannstr. 38, Tuebingen, Germany

Abstract: Dynamic system-based motor primitives [1] have enabled robots to learn complex tasks ranging from tennis swings to locomotion. However, to date there have been only few extensions which have incorporated perceptual coupling to variables of external focus, and, furthermore, these modifications have relied upon handcrafted solutions. Humans learn how to couple their movement primitives with external variables. Clearly, such a solution is needed in robotics. In this paper, we propose an augmented version of the dynamic systems motor primitives which incorporates perceptual coupling to an external variable. The resulting perceptually driven motor primitives include the previous primitives as a special case and can inherit some of their interesting properties. We show that these motor primitives can perform complex tasks such as the Ball-in-a-Cup or Kendama task even with large variances in the initial conditions where a skilled human player would be challenged. For doing so, we initialize the motor primitives in the traditional way by imitation learning without perceptual coupling. Subsequently, we improve the motor primitives using a novel reinforcement learning method which is particularly well-suited for motor primitives.

I. INTRODUCTION

The recent introduction of motor primitives based on dynamic systems [1]-[4] has allowed both imitation learning and reinforcement learning to acquire new behaviors fast and reliably. Resulting successes have shown that it is possible to rapidly learn motor primitives for complex behaviors such as tennis swings [1], [2], T-ball batting [5], drumming [6], biped locomotion [3], [7] and even tasks with potential industrial application [8]. However, in their current form these motor primitives are generated in such a way that they are either only coupled to internal variables [1], [2] or only include manually tuned phase-locking, e.g., with an external beat [6] or between the gait-generating primitive and the contact time of the feet [3], [7]. In many human motor control tasks, more complex perceptual coupling is needed in order to perform the task. Using handcrafted coupling based on human insight will in most cases no longer suffice.

In this paper, it is our goal to augment the Ijspeert-Nakanishi-Schaal approach [1], [2] of using dynamic systems as motor primitives in such a way that it includes perceptual coupling with external variables. Similar to the biokinesiological literature on motor learning (see, e.g., [9]), we assume that there is an object of internal focus described by a state x and one of external focus y. The coupling between both foci usually depends on the phase of the movement and, sometimes, the coupling only exists in short phases; in a catching movement, for example, this could be at the initiation of the movement (which is largely predictive) and during the last moment when the object is close to the hand (which is largely prospective or reactive and includes movement correction). Often, it is also important that the internal focus is in a different space than the external one. Fast movements, such as a tennis swing, always follow a similar pattern in the joint space of the arm while the external focus is clearly on an object in Cartesian space or fovea space. As a result, we have augmented the motor primitive framework in such a way that the coupling to the external, perceptual focus is phase-variant and both foci y and x can be in completely different spaces. Integrating the perceptual coupling requires additional function approximation and, as a result, the number of parameters of the motor primitives grows significantly.
It becomes increasingly harder to manually tune these parameters to high performance, and a learning approach for perceptual coupling is needed. The need for learning perceptual coupling in motor primitives has long been recognized in the motor primitive community [4]. However, learning perceptual coupling to an external variable is not as straightforward. It requires many trials in order to properly determine the connections from external to internal focus. It is straightforward to grasp a general movement by imitation, and a human can produce a Ball-in-a-Cup movement or a tennis swing after a single or few observed trials of a teacher, but he will never have a robust coupling to the ball. Furthermore, small differences between the kinematics of teacher and student amplify in the perceptual coupling. This is part of the reason why perceptually driven motor primitives can be initialized by imitation learning but will usually require self-improvement by reinforcement learning. This is analogous to the case of a human learning tennis: a teacher can show a forehand, but a lot of self-practice is needed for a proper tennis game.

II. AUGMENTED MOTOR PRIMITIVES WITH PERCEPTUAL COUPLING

In this section, we first introduce the general idea behind dynamic system motor primitives as suggested in [1], [2] and show how perceptual coupling can be introduced. Subsequently, we show how the perceptual coupling can be realized by augmenting the acceleration-based framework from [4].

A. Perceptual Coupling for Motor Primitives

The basic idea in the original work of Ijspeert, Nakanishi and Schaal [1], [2] is that motor primitives can be parted into two components, i.e., a canonical system h which drives transformed systems g_k for every considered degree of freedom k.

Figure 1. Illustration of the behavior of the motor primitives (i) and the augmented motor primitives (ii). (Block diagram: a canonical system drives transformed systems 1 to n, each outputting position, velocity and acceleration; the augmented version additionally feeds in the external variable.)

As a result, we have a system of differential equations given by

ż = h(z), (1)
ẋ = g(x, z, w), (2)

which determines the variables of internal focus x. Here, z denotes the state of the canonical system and w the internal parameters for transforming the output of the canonical system. The schematic in Figure 2 illustrates this traditional setup in black. In Section II-B, we will discuss good choices for these dynamical systems as well as their coupling based on the most current formulation [4].

When taking an external variable y into account, there are three different ways in which this variable can influence the motor primitive system: (i) it could only influence Eq. (1), which would be appropriate for synchronization problems and phase-locking (similar as in [6], [10]); (ii) it could only affect Eq. (2), which allows the continuous modification of the current state of the system by another variable; and (iii) the combination of (i) and (ii). While (i) and (iii) are the right solution if phase-locking or synchronization is needed, the coupling in the canonical system would destroy many of the nice properties of the system and make it prohibitively hard to learn in practice. Furthermore, as we consider discrete movements in this paper, we focus on case (ii), which has not been used to date. In this case, we have a modified dynamical system

ż = h(z), (3)
ẋ = ĝ(x, y, ȳ, z, v), (4)
ȳ̇ = ḡ(ȳ, z, w), (5)

where y denotes the state of the external variable, ȳ the expected state of the external variable and ȳ̇ its derivative. This architecture inherits most positive properties from the original work while allowing the incorporation of external feedback. We will show that we can incorporate previous work with ease and that the resulting framework resembles the one in [4] while allowing the external variables to be coupled into the system.

Figure 2. General schematic illustrating both the original motor primitive framework by [2], [4] in black and the augmentation for perceptual coupling in red.

B. Realization for Discrete Movements

The original formulation in [1], [2] was a major breakthrough as the right choice of the dynamical systems in Equations (1, 2) allows determining the stability of the movement, choosing between a rhythmic and a discrete movement, and is invariant under rescaling in both time and movement amplitude. With the right choice of function approximator (in our case locally weighted regression), fast learning from a teacher's presentation is possible. In this section, we first discuss how the most current formulation of the motor primitives as discussed in [4] is instantiated from Section II-A. Subsequently, we show how it can be augmented in order to incorporate perceptual coupling.

While the original formulation in [1], [2] used a second-order canonical system, it has since been shown that a single first-order system suffices [4], i.e., we have

ż = h(z) = -τ α_h z,

which represents the phase of the trajectory. It has a time constant τ and a parameter α_h which is chosen such that the system is stable. We can now choose our internal state such that the position of degree of freedom k is given by q_k = x_{2k}, i.e., the 2k-th component of x, the velocity by q̇_k = τ x_{2k+1} = ẋ_{2k} and the acceleration by q̈_k = τ ẋ_{2k+1}.
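As a concrete illustration of the canonical system and the state convention above, the following minimal sketch (Python/NumPy; all names and parameter values are illustrative and not taken from the paper) integrates the first-order phase variable z and reads out position and velocity of one degree of freedom.

import numpy as np

def integrate_canonical_system(z0=1.0, tau=1.0, alpha_h=2.0, dt=0.005, duration=1.5):
    """Euler integration of the canonical system z_dot = -tau * alpha_h * z (cf. Eq. 1).

    With alpha_h > 0 the system is stable, as required in the text; z then
    decays from z0 towards zero and serves as the phase of a discrete movement.
    """
    steps = int(duration / dt)
    z = np.empty(steps)
    z[0] = z0
    for t in range(1, steps):
        z[t] = z[t - 1] - tau * alpha_h * z[t - 1] * dt
    return z

def position_and_velocity(x, tau, k):
    """State convention of the text: q_k = x[2k], q_k_dot = tau * x[2k+1]."""
    return x[2 * k], tau * x[2 * k + 1]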
Upon these assumptions, we can express the motor primitives' function g in the following form:

ẋ_{2k+1} = τ α_g (β_g (g_k - x_{2k}) - x_{2k+1}) + τ ((g_k - x^0_{2k}) + a_k) f_k, (6)
ẋ_{2k} = τ x_{2k+1}. (7)

This function has the same time constant τ as the canonical system, appropriately set parameters α_g, β_g, a goal parameter g_k, an amplitude modifier a_k, and a transformation function f_k. This transformation function transforms the output of the canonical system so that the transformed system can represent complex nonlinear patterns, and it is given by

f_k(z) = Σ_{i=1}^N ψ_i(z) w_i z, (8)

where the w_i are adjustable parameters. It uses normalized Gaussian kernels without scaling, such as

ψ_i(z) = exp(-h_i (z - c_i)^2) / Σ_{j=1}^N exp(-h_j (z - c_j)^2), (9)

for localizing the interaction in phase space, where we have centers c_i and widths h_i.
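A minimal sketch of the transformed system of Equations (6)-(9) for a single degree of freedom, again in Python/NumPy with illustrative names; this is not code from the paper, only a direct transcription of the equations as reconstructed above.

import numpy as np

def forcing_term(z, weights, centers, widths):
    """Transformation function f_k(z) of Eq. (8) with the normalized
    Gaussian kernels psi_i(z) of Eq. (9)."""
    psi = np.exp(-widths * (z - centers) ** 2)
    psi = psi / psi.sum()
    return np.dot(psi, weights) * z

def transformed_system_step(x_pos, x_vel, z, dt, tau, alpha_g, beta_g,
                            goal, x_start, amplitude, weights, centers, widths):
    """One Euler step of Eqs. (6) and (7) for degree of freedom k.

    x_pos corresponds to x_{2k}, x_vel to x_{2k+1}; goal, x_start and
    amplitude are the goal parameter g_k, the start position x^0_{2k}
    and the amplitude modifier a_k.
    """
    f = forcing_term(z, weights, centers, widths)
    x_acc = (tau * alpha_g * (beta_g * (goal - x_pos) - x_vel)
             + tau * ((goal - x_start) + amplitude) * f)      # Eq. (6)
    x_vel = x_vel + dt * x_acc
    x_pos = x_pos + dt * tau * x_vel                          # Eq. (7)
    return x_pos, x_vel

Stepping this system along the phase z from the previous sketch produces a discrete movement towards the goal g_k, with the kernel weights w_i shaping the trajectory.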

In order to learn a motor primitive with perceptual coupling, we need two components. First, we need to learn the normal or average behavior ȳ of the variable of external focus y, which can be represented by a single motor primitive ḡ, i.e., we can use the same type of function from Equations (2, 5) for ḡ, learned based on the same z and given by Equations (6, 7). Additionally, we have the system ĝ for the variable of internal focus x which determines our actual movements and which incorporates the inputs of the normal behavior ȳ as well as the current state y of the external variable. We obtain the system ĝ by inserting a modified coupling function f̂(z, y, ȳ) instead of the original f(z) in g. The function f(z) is modified in order to include perceptual coupling to y, and we obtain

f̂_k(z, y, ȳ) = Σ_{i=1}^N ψ_i(z) ŵ_i z + Σ_{j=1}^M ψ̂_j(z) (κ_{jk}^T (y - ȳ) + δ_{jk}^T (ẏ - ȳ̇)),

where the ψ̂_j(z) denote Gaussian kernels as in Equation (9) with centers ĉ_j and widths ĥ_j. Note that it can be useful to set N > M for reducing the number of parameters. All parameters are given by v = [ŵ, κ, δ]. Here, ŵ are just the standard transformation parameters while κ_{jk} and δ_{jk} are the local coupling factors, which can be interpreted as gains acting on the difference between the desired behavior of the external variable and its actual behavior. Note that for noise-free behavior and perfect initial positions such coupling would never play a role; thus, the approach would simplify to the original approach. However, in the noisy, imperfect case, this perceptual coupling can ensure success even in extreme cases.

III. LEARNING FOR PERCEPTUALLY COUPLED MOTOR PRIMITIVES

While the transformation function f_k(z) can be learned from few or even just a single trial, this simplicity no longer transfers to learning the new function f̂_k(z, y, ȳ) as perceptual coupling requires that the coupling to an uncertain external variable is learned. While imitation learning approaches are feasible, they require larger numbers of presentations of a teacher with very similar kinematics for learning the behavior sufficiently well. As an alternative, we could follow Nature as our teacher and create a concerted approach of imitation and self-improvement by trial-and-error. For doing so, we first have a teacher who presents several trials and, subsequently, we improve our behavior by reinforcement learning.

A. Imitation Learning with Perceptual Coupling

For imitation learning, we can largely follow the original work in [1], [2] and only need minor modifications. We also make use of locally weighted regression in order to determine the optimal motor primitives, use the same weighting and compute the targets based on the dynamic systems. However, unlike in [1], [2], we need a bootstrapping step: we first determine the parameters for the system described by Equation (5) and, subsequently, use the learned results in the learning of the system in Equation (4). For doing so, we can compute the regression targets for the first system by taking Equation (6), replacing ȳ and ȳ̇ by samples of y and ẏ, and solving for f_k(z) as discussed in [1], [2]. A local regression yields good values for the parameters of f_k(z). Subsequently, we can perform the exact same step for f̂_k(z, y, ȳ), where only the number of variables has increased but the resulting regression follows analogously.
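To make the parameterization v = [ŵ, κ, δ] concrete, the following sketch (Python/NumPy, illustrative names, not code from the paper) evaluates the perceptually coupled transformation function f̂_k: the first sum is the usual phase-dependent shape term, and the second adds phase-localized gains acting on the deviation of the external variable from its expected behavior.

import numpy as np

def coupled_forcing_term(z, y, y_bar, y_dot, y_bar_dot,
                         w_hat, centers, widths,
                         kappa, delta, c_hat, h_hat):
    """Perceptually coupled transformation function f_hat_k(z, y, y_bar).

    w_hat:           standard transformation parameters, shape (N,)
    kappa, delta:    local coupling gains, shape (M, dim_y), acting on the
                     position and velocity error of the external variable
    centers, widths: parameters of the N shape kernels psi_i
    c_hat, h_hat:    parameters of the M coupling kernels psi_hat_j
    """
    # Phase-dependent shape term, as in f_k(z).
    psi = np.exp(-widths * (z - centers) ** 2)
    psi = psi / psi.sum()
    shape_term = np.dot(psi, w_hat) * z

    # Phase-localized feedback on the external variable.
    psi_hat = np.exp(-h_hat * (z - c_hat) ** 2)
    psi_hat = psi_hat / psi_hat.sum()
    pos_error = y - y_bar
    vel_error = y_dot - y_bar_dot
    coupling_term = np.dot(psi_hat, kappa @ pos_error + delta @ vel_error)

    return shape_term + coupling_term

For a perfect, noise-free execution the error terms vanish and the function reduces to the uncoupled case, mirroring the remark above that the approach then simplifies to the original primitives.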
However, note that while a single demonstration suffices for the parameter vectors w and ŵ, the parameters κ and δ cannot be learned by imitation as these require deviations from the nominal behavior of the external variable. Furthermore, as discussed before, pure imitation for perceptual coupling can be difficult for learning the coupling parameters as well as the best nominal behavior for a robot with kinematics different from the human, many different initial conditions, and in the presence of significant noise. Thus, we suggest to improve the policy by trial-and-error using reinforcement learning upon an initial imitation.

B. Reinforcement Learning for Perceptually Coupled Motor Primitives

Reinforcement learning [11] of discrete motor primitives is a very specific type of learning problem where it is hard to apply generic reinforcement learning algorithms [5], [12]. For this reason, the focus of this paper is largely on domain-appropriate reinforcement learning algorithms which operate on parametrized policies for episodic control problems.

1) Reinforcement Learning Setup: When modeling our problem as a reinforcement learning problem, we always have a state s_t = [z, y, ȳ, x] of high dimensionality (as a result, standard RL methods which discretize the state space can no longer be applied), and the action a_t = [f(z) + ε_t, f̂(z, y, ȳ) + ε̂_t] is the output of our motor primitives. Here, the exploration is denoted by ε_t and ε̂_t, and we can give a stochastic policy a_t ~ π(a_t | s_t, t), a distribution over actions, with parameters θ = [w, v] ∈ R^n. After the next time step δt, the actor transfers to a state s_{t+1} and receives a reward r_t. As we are interested in learning complex motor tasks consisting of a single stroke [4], [9], we focus on finite horizons of length T with episodic restarts [11] and learn the optimal parametrized policy for such problems. The general goal in reinforcement learning is to optimize the expected return of the policy with parameters θ, defined by

J(θ) = ∫_T p(τ) R(τ) dτ, (10)

where τ = [s_{1:T+1}, a_{1:T}] denotes a sequence of states s_{1:T+1} = [s_1, s_2, ..., s_{T+1}] and actions a_{1:T} = [a_1, a_2, ..., a_T].

Figure 3. This figure shows schematic drawings of the Ball-in-a-Cup motion, the final learned robot motion as well as a motion-captured human motion. The green arrows show the directions of the momentary movements. The human cup motion was taught to the robot by imitation learning with 91 parameters for 1.5 seconds. Also see the supplementary video in the proceedings.

The probability of an episode τ is denoted by p(τ), and R(τ) refers to the return of episode τ. Using the Markov assumption, we can write the path distribution as

p(τ) = p(s_1) ∏_{t=1}^{T+1} p(s_{t+1} | s_t, a_t) π(a_t | s_t, t),

where p(s_1) denotes the initial state distribution and p(s_{t+1} | s_t, a_t) is the next-state distribution conditioned on the last state and action. Similarly, if we assume additive, accumulated rewards, the return of a path is given by

R(τ) = (1/T) Σ_{t=1}^T r(s_t, a_t, s_{t+1}, t),

where r(s_t, a_t, s_{t+1}, t) denotes the immediate reward.

While episodic reinforcement learning (RL) problems with finite horizons are common in motor control, few methods exist in the RL literature (cf. model-free methods such as Episodic REINFORCE [13] and the Episodic Natural Actor-Critic eNAC [5] as well as model-based methods, e.g., using differential dynamic programming [14]). In order to avoid learning complex models, we focus on model-free methods and, to reduce the number of open parameters, we rather use a novel reinforcement learning algorithm which is based on expectation-maximization. Our new algorithm is called Policy learning by Weighting Exploration with the Returns (PoWER) and can be derived from the same higher principle as previous policy gradient approaches; see [15] for details.

2) Policy learning by Weighting Exploration with the Returns (PoWER): When learning motor primitives, we intend to learn a deterministic mean policy ā = θ^T μ(s, t) = [f(z), f̂(z, y, ȳ)] which is linear in the parameters θ and augmented by additive exploration ε(s, t) = [ε_t, ε̂_t] in order to make model-free reinforcement learning possible. As a result, the explorative policy can be given in the form a = θ^T μ(s, t) + ε(μ(s, t)). Previous work in [5], [12] has focused on state-independent, white Gaussian exploration, i.e., ε(μ(s, t)) ~ N(0, Σ), and has resulted in applications such as T-ball batting [5] and operational space control [12]. However, such unstructured exploration at every step has several disadvantages: (i) it causes a large variance which grows with the number of time steps [5], (ii) it perturbs actions too frequently, thus washing out their effects, and (iii) it can damage the system executing the trajectory. Alternatively, one could generate a form of structured, state-dependent exploration ε(μ(s, t)) = ε_t^T μ(s, t) with [ε_t]_{ij} ~ N(0, σ_{ij}^2), where the σ_{ij}^2 are meta-parameters of the exploration that can also be optimized. This argument results in the policy a ~ π(a_t | s_t, t) = N(a | μ(s, t), Σ̂(s, t)). Based on the EM updates for reinforcement learning as suggested in [12], [15], we can derive the update rule

θ' = θ + E_τ{ Σ_{t=1}^T ε_t Q^π(s_t, a_t, t) } / E_τ{ Σ_{t=1}^T Q^π(s_t, a_t, t) }.

In order to reduce the number of trials in this on-policy scenario, we reuse the trials through importance sampling [11], [16]. To avoid the fragility sometimes resulting from importance sampling in reinforcement learning, samples with very small importance weights are discarded.
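The update rule above can be implemented in a few lines; the sketch below (Python/NumPy; the rollout format and names are illustrative assumptions, and the reuse of previous rollouts via importance sampling is omitted) weights the per-time-step exploration by the state-action value, here estimated as the reward-to-go.

import numpy as np

def power_update(theta, rollouts):
    """One PoWER update: theta' = theta + E[sum_t eps_t Q_t] / E[sum_t Q_t].

    rollouts: list of (epsilons, rewards) pairs, where epsilons has shape
    (T, n_params) and holds the exploration added to theta at each time
    step, and rewards has shape (T,).
    """
    numerator = np.zeros_like(theta)
    denominator = 0.0
    for epsilons, rewards in rollouts:
        q = np.cumsum(rewards[::-1])[::-1]          # reward-to-go as Q estimate
        numerator += (epsilons * q[:, None]).sum(axis=0)
        denominator += q.sum()
    return theta + numerator / max(denominator, 1e-10)

def sample_exploration(sigma):
    """Per-time-step parameter perturbation eps_t with elementwise
    standard deviations sigma (the meta-parameters of the exploration)."""
    return np.random.randn(*np.shape(sigma)) * sigma

def explorative_action(theta, eps_t, mu):
    """Structured, state-dependent exploration: a_t = (theta + eps_t)^T mu(s_t, t)."""
    return (theta + eps_t) @ mu

In the paper's setting, previous trials are additionally reused through importance sampling and rollouts with very small importance weights are discarded; both refinements are left out of this sketch.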

IV. EVALUATION & APPLICATION

In this section, we demonstrate the effectiveness of the augmented framework for perceptually coupled motor primitives as presented in Section II and show that our concerted approach of using imitation for initialization and reinforcement learning for improvement works well in practice, particularly with our novel PoWER algorithm from Section III. We show that this method can be used in learning a complex, real-life motor learning problem, Ball-in-a-Cup, in a physically realistic simulation of an anthropomorphic robot arm. This problem is a good benchmark for testing the motor learning performance, and we show that we can learn the problem roughly at the efficiency of a young child. The algorithm successfully creates a perceptual coupling even to perturbations that are very challenging for a skilled adult player.

A. Robot Application: Ball-in-a-Cup

We have applied the presented algorithm in order to teach a physically realistic simulation of an anthropomorphic SARCOS robot arm how to perform the traditional American children's game Ball-in-a-Cup, also known as Balero, Bilboquet or Kendama. The toy consists of a ball which is attached to a wooden cup by a string. The initial position is the ball hanging down vertically on the string, and the player has to toss the ball into the cup by jerking his arm [17]; see Figure 3 (top) for an illustration. The state of the system is described in Cartesian coordinates of the cup (i.e., the operational space) and the Cartesian coordinates of the ball. The actions are the cup accelerations in Cartesian coordinates, with each direction represented by a motor primitive. An operational space control law [18] is used in order to transform accelerations in the operational space of the cup into joint-space torques. All motor primitives are perturbed separately but employ the same joint reward, which is

r_t = exp(-α (x_c - x_b)^2 - α (y_c - y_b)^2)

at the moment where the ball passes the rim of the cup with a downward direction, and r_t = 0 at all other times. The cup position is denoted by [x_c, y_c, z_c] ∈ R^3, the ball position by [x_b, y_b, z_b] ∈ R^3, and α is a scaling parameter. The task is quite complex as the reward is determined not solely by the movements of the cup but foremost by the movements of the ball, and the movements of the ball are very sensitive to perturbations. A small perturbation of the initial condition or of the trajectory will drastically change the movement of the ball, and hence the outcome of the trial, if we do not use any form of perceptual coupling to the external variable ball.

Due to the complexity of the task, Ball-in-a-Cup is even a hard motor task for children, who only succeed at it by observing another person playing or by deducing from similar previously learned tasks how to maneuver the ball above the cup in such a way that it can be caught. Subsequently, a lot of improvement by trial-and-error is required until the desired solution can be achieved in practice. The child will have an initial success when the initial conditions and the executed cup trajectory fit together by chance; afterwards, the child still has to practice a lot until it is able to get the ball in the cup (almost) every time and can thus cancel various perturbations. Learning the necessary perceptual coupling to get the ball in the cup on a consistent basis is even a hard task for adults, as our whole lab can testify. In contrast to a tennis swing, where a human just needs to learn a goal function for the one moment the racket hits the ball, in Ball-in-a-Cup we need a complete dynamical system as cup and ball constantly interact.

Figure 4. This figure shows the expected return (rewards over trials) for one specific perturbation of the learned policy in the Ball-in-a-Cup scenario, averaged over 3 runs with different random seeds and with the standard deviation indicated by the error bars; the curves compare the learned coupling, a hand-tuned coupling, and the initialization. Convergence is not uniform as the algorithm is optimizing the returns for a whole range of perturbations and not for this test case; this causes the variance in the return, as the improved policy might get worse for the test case but improve over all cases. Our algorithm rapidly improves, regularly beating the hand-tuned solution after less than fifty trials and converging after approximately 600 trials. Note that this plot is a double logarithmic plot and, thus, single unit changes are significant as they correspond to orders of magnitude.
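A direct transcription of the joint reward described above (Python; the rim-crossing detection is an illustrative assumption, as the text does not spell out how that event is computed, and the value of α is not given here):

import numpy as np

def ball_in_a_cup_reward(cup_pos, ball_pos, ball_vel, alpha, passes_rim):
    """Joint reward r_t shared by all motor primitives.

    cup_pos, ball_pos: [x, y, z] Cartesian positions of cup and ball
    ball_vel:          [vx, vy, vz] Cartesian velocity of the ball
    alpha:             scaling parameter
    passes_rim:        True only at the time step in which the ball passes
                       the plane of the cup's rim
    """
    if passes_rim and ball_vel[2] < 0.0:  # downward direction
        dx = cup_pos[0] - ball_pos[0]
        dy = cup_pos[1] - ball_pos[1]
        return float(np.exp(-alpha * dx ** 2 - alpha * dy ** 2))
    return 0.0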
Mimicking how children learn to play Ball-in-a-Cup, we first initialize the motor primitives by imitation and, subsequently, improve them by reinforcement learning in order to get an initial success. Afterwards, we also acquire the perceptual coupling by reinforcement learning. We recorded the motions of a human player using a VICON motion-capture setup in order to obtain an example for imitation, as shown in Figure 3 (c). The extracted cup trajectories were used to initialize the motor primitives using locally weighted regression for imitation learning. The simulation of the Ball-in-a-Cup behavior was verified using the tracked movements. We used one of the recorded trajectories for which, when played back in simulation, the ball goes in but does not pass the center of the opening of the cup and thus does not optimize the reward. This movement is then used for initializing the motor primitives and determining their parametric structure, where cross-validation indicates that 91 parameters per motor primitive are optimal from a bias-variance point of view. The trajectories are optimized by reinforcement learning using the PoWER algorithm on the parameters w for unperturbed initial conditions. The robot constantly succeeds at bringing the ball into the cup after a number of iterations, given no noise and perfect initial conditions.

One set of the found trajectories is then used to calculate the baseline ȳ = h - b and ȳ̇ = ḣ - ḃ, where h and b are the hand and ball trajectories. This set is also used to set the standard cup trajectories. Hand-tuned coupling factors work quite well for small perturbations of the initial conditions. In order to make them more robust, we use reinforcement learning with the same joint reward as before. The initial conditions (positions and velocities) of the ball are perturbed completely randomly (no PEGASUS trick) using Gaussian random values with variances set according to the desired stability region. The PoWER algorithm converges after a number of additional iterations, which is roughly comparable to the learning speed of a 10-year-old child (Figure 4).
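The baseline can, for example, be extracted from one successful trial as sketched below (Python/NumPy; the fixed sampling time dt and the array layout are illustrative assumptions):

import numpy as np

def compute_baseline(hand_traj, ball_traj, dt):
    """Expected behavior of the external variable, y_bar = h - b, and its
    time derivative, computed from recorded hand and ball trajectories of
    shape (T, 3) sampled every dt seconds."""
    y_bar = hand_traj - ball_traj
    y_bar_dot = np.gradient(y_bar, dt, axis=0)  # finite-difference derivative
    return y_bar, y_bar_dot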

Figure 5. This figure compares cup and ball trajectories with and without perceptual coupling (x, y and z positions in meters plotted over time in seconds for the cup and the ball, each with and without coupling). The trajectories and the different initial conditions are clearly distinguishable. The perceptual coupling cancels out the swinging motion of the string-and-ball pendulum. The successful trial is marked either by overlapping (x and y) or parallel (z) trajectories of the ball and cup from 1.2 seconds on.

For the training we used, concurrently, standard deviations of 0.01 m for x and y and of 0.1 m/s for ẋ and ẏ. The learned perceptual coupling gets the ball into the cup for all tested cases where the hand-tuned coupling was also successful. The learned coupling pushes the limits of the canceled perturbations significantly further and still performs consistently well for double the standard deviations seen in the reinforcement learning process. Figure 5 shows an example of how the visual coupling adapts the hand trajectories in order to cancel perturbations and to get the ball into the cup.

V. CONCLUSION

Perceptual coupling for motor primitives is an important topic as it results in more general and more reliable solutions while allowing the application of the dynamic systems motor primitive framework to many other motor control problems. As manual tuning can only work in limited setups, an automatic acquisition of this perceptual coupling is essential. In this paper, we have contributed an augmented version of the motor primitive framework originally suggested by [1], [2], [4] such that it incorporates perceptual coupling while keeping a distinctively similar structure to the original approach and, thus, preserving most of the important properties. We present a concerted learning approach which relies on an initialization by imitation learning and subsequent self-improvement by reinforcement learning. We introduce a particularly well-suited algorithm for this reinforcement learning problem called PoWER. The resulting framework works well for learning Ball-in-a-Cup on a simulated anthropomorphic SARCOS arm in setups where the original motor primitive framework would not suffice to fulfill the task.

REFERENCES

[1] A. J. Ijspeert, J. Nakanishi, and S. Schaal, "Movement imitation with nonlinear dynamical systems in humanoid robots," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Washington, DC, May 2002.
[2] A. J. Ijspeert, J. Nakanishi, and S. Schaal, "Learning attractor landscapes for learning motor primitives," in Advances in Neural Information Processing Systems (NIPS), vol. 15, S. Becker, S. Thrun, and K. Obermayer, Eds. Cambridge, MA: MIT Press, 2003.
[3] S. Schaal, J. Peters, J. Nakanishi, and A. J. Ijspeert, "Control, planning, learning, and imitation with dynamic movement primitives," in Proceedings of the Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE 2003 International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, October 2003.
[4] S. Schaal, P. Mohajerian, and A. J. Ijspeert, "Dynamics systems vs. optimal control: a unifying view," Progress in Brain Research, vol. 165, no. 1, 2007.
[5] J. Peters and S. Schaal, "Policy gradient methods for robotics," in Proceedings of the IEEE/RSJ 2006 International Conference on Intelligent Robots and Systems (IROS), Beijing, China, 2006.
[6] D. Pongas, A. Billard, and S. Schaal, "Rapid synchronization and accurate phase-locking of rhythmic motor primitives," in Proceedings of the IEEE 2005 International Conference on Intelligent Robots and Systems (IROS), 2005.
[7] J. Nakanishi, J. Morimoto, G. Endo, G. Cheng, S. Schaal, and M. Kawato, "Learning from demonstration and adaptation of biped locomotion," Robotics and Autonomous Systems (RAS), vol. 47, no. 2-3, 2004.
[8] H. Urbanek, A. Albu-Schäffer, and P. van der Smagt, "Learning from demonstration: repetitive movements for autonomous service robotics," in Proceedings of the IEEE/RSJ 2004 International Conference on Intelligent Robots and Systems (IROS), Sendai, Japan, 2004.
[9] G. Wulf, Attention and Motor Skill Learning. Champaign, IL: Human Kinetics, 2007.
[10] J. Nakanishi, J. Morimoto, G. Endo, G. Cheng, S. Schaal, and M. Kawato, "A framework for learning biped locomotion with dynamic movement primitives," in Proceedings of the IEEE-RAS International Conference on Humanoid Robots (HUMANOIDS), Santa Monica, CA, Nov. 10-12, 2004.
[11] R. Sutton and A. Barto, Reinforcement Learning. MIT Press, 1998.
[12] J. Peters and S. Schaal, "Reinforcement learning for operational space control," in Proceedings of the International Conference on Robotics and Automation (ICRA), Rome, Italy, 2007.
[13] R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Machine Learning, vol. 8, pp. 229-256, 1992.
[14] C. G. Atkeson, "Using local trajectory optimizers to speed up global optimization in dynamic programming," in Advances in Neural Information Processing Systems 6 (NIPS), J. E. Hanson, S. J. Moody, and R. P. Lippmann, Eds. Denver, CO, USA: Morgan Kaufmann, 1994.
[15] J. Kober and J. Peters, "Policy search for motor primitives in robotics," in Advances in Neural Information Processing Systems (NIPS), 2008.
[16] C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, "An introduction to MCMC for machine learning," Machine Learning, vol. 50, no. 1, pp. 5-43, 2003.
[17] Wikipedia, "Ball in a cup," June. [Online]. Available: http://en.wikipedia.org/wiki/Ball_in_a_cup
[18] J. Nakanishi, M. Mistry, J. Peters, and S. Schaal, "Experimental evaluation of task space position/orientation control towards compliant control for humanoid robots," in Proceedings of the IEEE/RSJ 2007 International Conference on Intelligent Robots and Systems (IROS), 2007.


Course Outline. Course Grading. Where to go for help. Academic Integrity. EE-589 Introduction to Neural Networks NN 1 EE EE-589 Introduction to Neural Assistant Prof. Dr. Turgay IBRIKCI Room # 305 (322) 338 6868 / 139 Wensdays 9:00-12:00 Course Outline The course is divided in two parts: theory and practice. 1. Theory covers

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

CSL465/603 - Machine Learning

CSL465/603 - Machine Learning CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am

More information

Grammars & Parsing, Part 1:

Grammars & Parsing, Part 1: Grammars & Parsing, Part 1: Rules, representations, and transformations- oh my! Sentence VP The teacher Verb gave the lecture 2015-02-12 CS 562/662: Natural Language Processing Game plan for today: Review

More information

Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I

Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I Session 1793 Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I John Greco, Ph.D. Department of Electrical and Computer Engineering Lafayette College Easton, PA 18042 Abstract

More information

Reflective Teaching KATE WRIGHT ASSOCIATE PROFESSOR, SCHOOL OF LIFE SCIENCES, COLLEGE OF SCIENCE

Reflective Teaching KATE WRIGHT ASSOCIATE PROFESSOR, SCHOOL OF LIFE SCIENCES, COLLEGE OF SCIENCE Reflective Teaching KATE WRIGHT ASSOCIATE PROFESSOR, SCHOOL OF LIFE SCIENCES, COLLEGE OF SCIENCE Reflective teaching means looking at what you do in the classroom, thinking about why you do it, and thinking

More information

Three Strategies for Open Source Deployment: Substitution, Innovation, and Knowledge Reuse

Three Strategies for Open Source Deployment: Substitution, Innovation, and Knowledge Reuse Three Strategies for Open Source Deployment: Substitution, Innovation, and Knowledge Reuse Jonathan P. Allen 1 1 University of San Francisco, 2130 Fulton St., CA 94117, USA, jpallen@usfca.edu Abstract.

More information

Exploration. CS : Deep Reinforcement Learning Sergey Levine

Exploration. CS : Deep Reinforcement Learning Sergey Levine Exploration CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 4 due on Wednesday 2. Project proposal feedback sent Today s Lecture 1. What is exploration? Why is it a problem?

More information

Vocational Training Dropouts: The Role of Secondary Jobs

Vocational Training Dropouts: The Role of Secondary Jobs Vocational Training Dropouts: The Role of Secondary Jobs Katja Seidel Insitute of Economics Leuphana University Lueneburg katja.seidel@leuphana.de Nutzerkonferenz Bildung und Beruf: Erwerb und Verwertung

More information

Cross Language Information Retrieval

Cross Language Information Retrieval Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................

More information

On the implementation and follow-up of decisions

On the implementation and follow-up of decisions Borges, M.R.S., Pino, J.A., Valle, C.: "On the Implementation and Follow-up of Decisions", In Proc.of the DSIAge -International Conference on Decision Making and Decision Support in the Internet Age, Cork,

More information

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction INTERSPEECH 2015 Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa Department of Computer

More information

A Game-based Assessment of Children s Choices to Seek Feedback and to Revise

A Game-based Assessment of Children s Choices to Seek Feedback and to Revise A Game-based Assessment of Children s Choices to Seek Feedback and to Revise Maria Cutumisu, Kristen P. Blair, Daniel L. Schwartz, Doris B. Chin Stanford Graduate School of Education Please address all

More information