Flexible Cognitive Strategies during Motor Learning

Jordan A. Taylor 1*, Richard B. Ivry 1,2*

1 Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America, 2 Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America

Abstract

Visuomotor rotation tasks have proven to be a powerful tool to study adaptation of the motor system. While adaptation in such tasks is seemingly automatic and incremental, participants may gain knowledge of the perturbation and invoke a compensatory strategy. When provided with an explicit strategy to counteract a rotation, participants are initially very accurate, even without on-line feedback. Surprisingly, with further testing, the angle of their reaching movements drifts in the direction of the strategy, producing an increase in endpoint errors. This drift is attributed to the gradual adaptation of an internal model that operates independently from the strategy, even at the cost of task accuracy. Here we identify constraints that influence this process, allowing us to explore models of the interaction between strategic and implicit changes during visuomotor adaptation. When the adaptation phase was extended, participants eventually modified their strategy to offset the rise in endpoint errors. Moreover, when we removed visual markers that provided external landmarks to support a strategy, the degree of drift was sharply attenuated. These effects are accounted for by a setpoint state-space model in which a strategy is flexibly adjusted to offset performance errors arising from the implicit adaptation of an internal model. More generally, these results suggest that strategic processes may operate in many studies of visuomotor adaptation, with participants arriving at a synergy between a strategic plan and the effects of sensorimotor adaptation.

Citation: Taylor JA, Ivry RB (2011) Flexible Cognitive Strategies during Motor Learning. PLoS Comput Biol 7(3): e1001096.
doi:10.1371/journal.pcbi.1001096

Editor: Jörn Diedrichsen, University College London, United Kingdom

Received April 11, 2010; Accepted January 28, 2011; Published March 3, 2011

Copyright: © 2011 Taylor, Ivry. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Jordan A. Taylor was supported by National Research Service Award F32NS064749 from the National Institute of Neurological Disorders and Stroke (NINDS). Richard B. Ivry was supported by R01HD060306 from the National Institutes of Child Health and Human Development, P01NS040813 from NINDS, and IIS0703787 from the National Science Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: jordan.a.taylor@berkeley.edu (JAT); ivry@berkeley.edu (RBI)

Introduction

When learning a new motor skill, verbal instruction often proves useful to hasten the learning process. For example, a new driver is instructed on the sequence of steps required to change gears when using a standard transmission. As the skill becomes consolidated, the driver no longer requires explicit reference to these instructions. Operating a vehicle with a stiffer or looser clutch does not generally require further instruction, but rather entails a subtle recalibration, or adaptation of the previously learned skill. Indeed, the use of an explicit strategy may even lead to degradation in the expert's performance. Consideration of these contradictory issues brings into question the role of instructions or explicit strategies in sensorimotor learning.

The type of motor task and nature of the instruction can have varying effects on motor execution and learning [1–3]. In the serial reaction time task (SRT), participants produce a sequence of cued button presses.
If the participant is informed of the underlying sequence, learning occurs much more rapidly compared to when sequential learning arises from repeated performance [4]. However, learning in the SRT task entails the linkage of a series of discrete actions. Explicit instructions of the sequence structure may be viewed as a way to create a working memory representation of the series. Many skills lack such a clear elemental partition and, as such, participants cannot easily verbalize what a successful movement entails. For example, the pattern of forces required to move the hand in a straight line in a novel force field [5–7] would be hard to verbalize.

Various studies have examined the role of explicit strategies in tasks involving sensorimotor adaptation [8–11]. The benefits of an explicit strategy may be illusory, with adaptive processes arising from automatic and incremental updating of a motor system that is impenetrable to conscious intervention [12–16]. However, performance measures indicate that adaptation may differ between conditions in which participants are either aware or unaware of the changes in the environment [17]. For example, a large visuomotor rotation can be introduced abruptly, in which case awareness is likely, or introduced incrementally such that participants are unaware of the rotation. The abrupt onset of large unexpected errors may promote the use of cognitive strategies [18–20]. Participants who gain explicit knowledge of an imposed visuomotor rotation show better performance during learning than participants who report little or no awareness of the rotation [10]. Moreover, the rate of learning, at least in the early phase of adaptation, correlates positively with spatial working memory span [21], suggesting that strategic compensation may be dependent on working memory capacity. Studies of sensorimotor adaptation during aging also indicate that the rate of learning is slower in older adults compared to young adults, despite similar aftereffects [22–24]. This cost is absent in older adults who report awareness of the rotation [25].
PLoS Computational Biology, www.ploscompbiol.org, March 2011, Volume 7, Issue 3, e1001096

Flexible Strategies during Motor Learning

Author Summary

Motor learning has been modeled as an implicit process in which an error, signaling the difference between the predicted and actual outcome, is used to modify a model of the actor-environment interaction. This process is assumed to operate automatically and implicitly. However, people can employ cognitive strategies to improve performance. It has recently been shown that when implicit and explicit processes are put in opposition, the operation of motor learning mechanisms will offset the advantages conferred by a strategy and, eventually, performance deteriorates. We present a computational model of the interplay of these processes. A key insight of the model is that implicit and explicit learning mechanisms operate on different error signals. Consistent with previous models of sensorimotor adaptation, implicit learning is driven by an error reflecting the difference between the predicted and actual feedback for that movement. In contrast, explicit learning is driven by an error based on the difference between the feedback and target location of the movement, a signal that directly reflects task performance. Empirically, we demonstrate constraints on these two error signals. Taken together, the modeling and empirical results suggest that the benefits of a cognitive strategy may lie hidden in many motor learning tasks.

In many of the studies cited above, the assumption has been that the development of awareness can lead to the utilization of compensatory strategies. However, few studies have directly sought to manipulate strategic control during sensorimotor adaptation. One striking exception is a study by Mazzoni and Krakauer [9]. Participants viewed a display of eight small circles, or visual landmarks, that were evenly spaced by 45° to form a large, implicit ring. The target location was specified by presenting a bullseye within one of the eight circles. After an initial training phase in which the visuomotor mapping was unaltered, a 45° rotation in the counterclockwise direction (CCW) was introduced. In the standard condition in which no instructions were provided, participants gradually reduced endpoint error by altering their movement heading in the clockwise direction (CW). In the strategy condition, participants were given explicit instructions to move to the circle located 45° clockwise to the target.
This strategy enabled these participants to immediately eliminate all error. However, as training continued, the participants progressively increased their movement heading in the clockwise direction. As such, the endpoint location of the feedback cursor drifted further from the actual target location and, thus, performance showed an increase in error over training, a rather counterintuitive result [26].

Mazzoni and Krakauer [9] proposed that this drift arises from the implicit adaptation of an internal forward model. Importantly, the error signal for this learning process is not based on the difference between the observed visual feedback and target location. Rather, it is based on the difference between the observed visual feedback and strategic aiming location. Even though participants aim to a clockwise location of the target (as instructed), the motor system experiences a mismatch between the predicted state and the visual feedback. This mismatch defines an error signal that is used to recalibrate the internal model. Reducing the mismatch results in an adjustment of the internal model such that the next movement will be even further in the clockwise direction. Thus, the operation of an implicit learning process that is impervious to the strategy produces the paradoxical deterioration in performance over time.

In the present paper, we start by asking how this hypothesis could be formalized in a computational model of motor learning. State space modeling techniques have successfully described adaptation and generalization during motor learning [27–29]. These models focus on how learning mechanisms minimize error from trial to trial. Variants of these models postulate multiple learning mechanisms that operate at different time scales [28]. Within this framework, strategic factors might be associated with fast learning processes that rapidly reduce error. However, such models are unable to account for the drift that arises following the deployment of a strategy.
To address these issues, we developed a series of setpoint state-space models of adaptation to quantitatively explore how strategic control and implicit adaptation interact. Assuming a fixed strategy, adaptation should continue to occur until the error signal, the difference between the feedback location and the aiming location, is zero; that is, the visual feedback matches the intended aim of the reach. As such, drift arising from implicit adaptation should continue to rise until it offsets the adopted strategy. To test this prediction, we increased the length of the adaptation phase. Moreover, we manipulated the salience of the visual landmarks used to support the strategy. We hypothesized that these landmarks served as a proxy for the aiming location. If this assumption is correct, the elimination of the visual landmarks should weaken the error signal, given uncertainty concerning the aiming location, and drift should be attenuated. We test this prediction by comparing performance with and without visual landmarks.

Results

Current models of sensorimotor adaptation have not addressed the effect of explicit strategies. Therefore, we started with the standard state-space model (Eqs. 1 and 2), and incrementally modified it to accommodate the use of an explicit strategy. The standard model for target error is given as:

$$e^{\mathrm{target}}_n = r_n - r^{\mathrm{est}}_n \tag{1}$$

$$r^{\mathrm{est}}_{n+1} = A\,r^{\mathrm{est}}_n + B\,e^{\mathrm{target}}_n \tag{2}$$

where $e^{\mathrm{target}}_n$ is the target endpoint error on movement $n$, $r_n$ is the rotation, $r^{\mathrm{est}}_n$ is the internal model's estimation of the rotation, A is a memory term, and B represents either a learning rate or sensitivity to error [27–30]. As expected, this model gradually learns to compensate for a visuomotor rotation (Figure 1A black line; simulated with A = 1 and B = 0.02).

Modeling strategy use during visuomotor adaptation

When informed of an appropriate strategy that will compensate for the rotation, participants immediately counteract the rotation and show on-target accuracy. The standard model as formulated above does not provide a mechanism to implement an explicit strategy.
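As a minimal sketch (not the authors' code), the standard model of Eqs. 1 and 2 can be simulated in a few lines of Python. The function name `simulate_standard` and the trial schedule (120 baseline trials, then a −45° rotation for 322 trials, as described in the Figure 1 caption) are assumptions made for illustration; A = 1 and B = 0.02 are the values quoted in the text.

```python
def simulate_standard(rotation, A=1.0, B=0.02):
    """Standard state-space model: returns the target error on every trial."""
    r_est = 0.0                             # internal estimate of the rotation
    target_errors = []
    for r in rotation:
        e_target = r - r_est                # Eq. 1
        target_errors.append(e_target)
        r_est = A * r_est + B * e_target    # Eq. 2
    return target_errors


# Assumed schedule: 120 baseline trials, then a -45 degree rotation for 322 trials.
rotation = [0.0] * 120 + [-45.0] * 322
standard_errors = simulate_standard(rotation)

# Error on the first rotation trial and on the last rotation trial.
print(round(standard_errors[120], 2), round(standard_errors[-1], 2))
```

With these settings the error starts at the full size of the rotation and shrinks gradually from trial to trial. Nothing in the loop can produce an immediate correction, which is the limitation the text points to.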
To allow immediate implementation of the strategy, we postulate that there is direct feedthrough of the strategy (s) to the target error equation (Eq. 1):

$$e^{\mathrm{target}}_n = s^{\mathrm{actual}}_n + r_n - r^{\mathrm{est}}_n \tag{3}$$

Direct feedthrough allows the strategy to contribute to the target error equation without directly influencing the updating of the internal model. If the strategy operated through the internal model, then the impact of the strategy would take time to evolve, assuming there is substantial memory of the internal model's estimation of the rotation (i.e., A has a high value in Eq. 2). With direct feedthrough, the implementation of an appropriate strategy can immediately compensate for the rotation. In the current arrangement, the appropriate strategy is fixed at 45° in the CW direction from the cued target.
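The effect of direct feedthrough can be illustrated with a small Python sketch. It assumes, for simplicity, that the +45° strategy is switched on at the same trial as the −45° rotation (the experiment used two rotation probe trials first), and it keeps the target error as the update signal of Eq. 2. The name `simulate_feedthrough` is hypothetical.

```python
def simulate_feedthrough(rotation, strategy, A=1.0, B=0.02):
    """Strategy feeds through to target error; target error updates the model."""
    r_est = 0.0
    target_errors = []
    for r, s in zip(rotation, strategy):
        e_target = s + r - r_est            # Eq. 3
        target_errors.append(e_target)
        r_est = A * r_est + B * e_target    # Eq. 2, still driven by target error
    return target_errors


# Assumption: the +45 degree strategy starts on the same trial as the -45 degree rotation.
rotation = [0.0] * 120 + [-45.0] * 322
strategy = [0.0] * 120 + [45.0] * 322
feedthrough_errors = simulate_feedthrough(rotation, strategy)

# Largest absolute target error over the whole session.
print(max(abs(e) for e in feedthrough_errors))
```

Because the strategy cancels the rotation exactly, the target error never departs from zero and the internal model is never updated. Under these assumptions the model compensates immediately but cannot drift.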

Figure 1. Setpoint model simulation. A −45° rotation was introduced on movement 121 and remained present for the next 322 trials. A) Simulated target error for four state space models. Black: Standard state-space model (A = 1, B = 0.02); Green: Setpoint model in which the target error is used to adapt the internal model; Red: Setpoint model with direct-feedthrough in which the aiming error is used to adapt the internal model. Drift is attenuated by either reduced adaptation rate (Cyan: A = 1, B = 0.01) or reducing the availability of the aiming error signal that selectively operates on the strategy (Magenta: K = 0.5). B) Internal model estimation of rotation when a fixed strategy (blue) is combined with either high (red, K = 1) or low (magenta, K = 0.5) certainty of aiming errors. C) Effect of a variable strategy on target error. The strategy was simulated with a low (blue; E = 1 and F = 0.01) or high (orange; F = 0.05) weighting of the target error. doi:10.1371/journal.pcbi.1001096.g001

Once the strategy is implemented, performance should remain stable since the error term is small. Indeed, a model based on Eq. 3 immediately compensates for the rotation. The target error, the difference between the feedback location and target location, is essentially zero on the first trial with the strategy, and remains so throughout the rotation block (Figure 1A green line). However, this model fails to match the empirical results observed by Mazzoni and Krakauer [9]: performance drifts over time with an increase in errors in the direction of the strategy. This phenomenon led the authors to suggest that the prediction error signal to the internal model is not based on target error. Instead, the error signal should be defined by the difference between the feedback location and aiming location (see Figure 2E):

$$e^{\mathrm{aiming}}_n = \text{Feedback} - \text{Aiming Location} = \left[s_n + r_n - r^{\mathrm{est}}_n\right]_{\mathrm{actual}} - \left[s_n\right]_{\mathrm{desired}} \tag{4}$$

The formulation of the prediction error term in Eq. 4 is akin to a setpoint or reference signal from engineering control theory [30].
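To see what changes when the aiming error of Eq. 4 replaces the target error as the input to Eq. 2, the following Python sketch may help. It assumes a fixed +45° strategy that is executed exactly as desired, A = 1 and B = 0.02, and the same illustrative trial schedule as Figure 1. The name `simulate_setpoint_fixed` is hypothetical.

```python
def simulate_setpoint_fixed(rotation, strategy, A=1.0, B=0.02):
    """Fixed strategy with direct feedthrough; aiming error updates the model."""
    r_est = 0.0
    target_errors = []
    for r, s in zip(rotation, strategy):
        feedback = s + r - r_est            # cursor position relative to the target (Eq. 3)
        e_target = feedback                 # feedback minus target location
        e_aiming = feedback - s             # Eq. 4, assuming desired == actual strategy
        target_errors.append(e_target)
        r_est = A * r_est + B * e_aiming    # Eq. 2 with the aiming error as input
    return target_errors


rotation = [0.0] * 120 + [-45.0] * 322
strategy = [0.0] * 120 + [45.0] * 322
drift = simulate_setpoint_fixed(rotation, strategy)

# Target error on the first strategy trial and on the last rotation trial.
print(round(drift[120], 2), round(drift[-1], 2))
```

In this sketch the target error is zero on the first strategy trial, yet the aiming error is as large as the rotation, so the internal model keeps adapting and the target error grows toward the size of the strategy.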
In typical motor learning studies, the setpoint is to reach to the target. When there is no strategy (s = 0), the target error in Eq. 1 is the same as the error term in Eq. 4. However, when a strategy with direct feedthrough is used (s ≠ 0), the strategy terms may cancel out if the actual implemented strategy is similar to the desired strategy. The input error to update the internal model's estimate of the rotation becomes:

$$e^{\mathrm{aiming}}_n = s^{\mathrm{actual}}_n + r_n - r^{\mathrm{est}}_n - s^{\mathrm{desired}}_n = r_n - r^{\mathrm{est}}_n + \left(s^{\mathrm{actual}}_n - s^{\mathrm{desired}}_n\right) \tag{5}$$

This model shows immediate compensation for the visuomotor rotation, and more importantly, produces a gradual deterioration in performance over the course of continued training with the reaching error drifting in the direction of the strategy (Figure 1A red line), consistent with the results reported by Mazzoni and Krakauer [9].

It is important to emphasize that the error signal for sensorimotor recalibration in Eq. 4 is not based on the difference between the feedback location and target location (target error). Rather, the error signal is defined by the difference between the feedback location and aiming location, or what we will refer to as aiming error. When a fixed strategy is adopted throughout training (Figure 1B blue line), the aiming error is (initially) quite large given that the predicted hand location is far from the location of visual feedback, even though the feedback cursor may be close to the actual target. In its simplest form, the setpoint model predicts that, as the internal model minimizes this error (Figure 1B red line), drift will continue until the observed feedback of the hand matches the aiming location. That is, the magnitude of the drift should equal the size of the strategic adjustment. In the Mazzoni and Krakauer set-up [9], the drift would eventually reach 45° in the CW direction (Figure 1A red line).

A second prediction can be derived by considering that the error signal in Eq. 4 relies on an accurate estimate of the strategic aiming location.
We assume that a visual landmark in the display can be used as a reference point for strategy implementation (e.g., the blue circle adjacent to the target). This landmark can serve as a proxy for the aiming location. The salience of this landmark provides an accurate estimate of the aiming location and, from Eq. 4, drift should be pronounced. However, if these landmarks are not available, then the estimate of the aiming location will be less certain. Previous studies have shown that adaptation is attenuated when sensory feedback is noisy [31,32]. One approach for modeling the effect of changing the availability or certainty of the (strategy defined) aiming location would be to vary the adaptation rate (B). For example, B could be smaller if there is a decrease in certainty of the aiming location, and correspondingly, a decrease in the certainty of the aiming error. This model predicts that the rate of drift is directly related to B: if B is lower due to decreased certainty of the aiming location, then the rate of drift will be attenuated (Figure 1A cyan line).

To evaluate the predictions of this setpoint model, participants were tested in an extended visuomotor rotation task in which we varied the visual displays used to define the target and strategic landmarks (see methods). The target was defined as a green circle, appearing at one of eight possible locations, separated by 45° (Figure 2, only three shown here). By encouraging the participants to make movements that sliced through the target, and only providing feedback at the point of intersection with the virtual target ring, we were able to train the participants to move quickly with relatively low trial-to-trial variability. We assume that participants mostly relied on feedforward control given the ballistic nature of the movements and absence of continuous online feedback.

Figure 2. Experimental task design. The experiment workspace consisted of 8 empty blue circles separated by 45° (three locations are shown here). The target was defined when a green circle appeared at one of the locations. The hand was occluded by the apparatus and, on feedback trials, a red cursor appeared as soon as the participant crossed a virtual ring, 10 cm from the start location. A) In the baseline block, participants moved towards the cued green target. B) In the strategy-only block, participants moved to the blue circle located 45° in the clockwise direction. Feedback was presented at the veridical hand position. C) For the two rotation probes, participants were instructed to move to the green target, but feedback of hand position was rotated 45° in the counter-clockwise direction. D) In the rotation plus strategy block, participants were instructed to move to the blue circle located 45° in the clockwise direction from the target. The feedback of hand position was rotated 45° counter-clockwise. E) Two sources of movement error: a target error between the feedback location and target location and an aiming error between the feedback location and aiming location. doi:10.1371/journal.pcbi.1001096.g002

Participants were assigned to one of three experimental groups (n = 10 per group), with the groups defined by our manipulation of the blue landmarks in the visual displays. For the aiming-target group (AT), the blue circles were always visible, similar to the method used by Mazzoni and Krakauer.
For the disappearing aiming-target group (DAT), the blue circles were visible at the start of the trial and disappeared when the movement was initiated. For the no aiming-target group (NoAT), the blue landmarks were not included in the display.

The participants were initially required to reach to the green target (Figure 2A). Movement duration, measured when the hand crossed the target ring, averaged 275±50.8 ms with no significant difference between groups (F(2,27) = 1.02, p = 0.37). Following the initial familiarization block, participants were trained to use a strategy of moving 45° in the CW direction from the green target location (Figure 2B). This location corresponded to the position of the neighboring blue circle. Feedback was veridical in this phase (e.g., corresponded to hand position). To help participants in the NoAT group learn to move at 45°, the blue circles were also presented on half of the trials for this group (in this phase only). The mean angular shifts, relative to the green target, were 43.4±1.6° and 42.9±1.2° for the AT and DAT groups, respectively (Figure 3, orange). For the NoAT group, the mean angular shift was 43.5±0.9° when the aiming target was present and 40.1±7.1° when the aiming target was absent. While the variance was considerably larger for trials without the aiming target, the means were not significantly different (t(18) = 0.95, p = 0.38).

Practicing the 45° CW strategy did not produce interference on a subsequent baseline block in which participants were again instructed to reach to the cued, green target (Figure 3, black). Over the last 10 movements of the familiarization block, participants across all groups had an average target error of −1.5±0.7°. Over the first 10 movements of the baseline block, this value was −0.5±0.6°, confirming that the strategy-only block did not produce a substantial bias.

Without warning, the CCW rotation was introduced (Figure 2C). As expected, the introduction of the CCW rotation induced a large target error.
Averaged over the two rotation probe trials, the mean values were −41.6±3.3°, −43.8±1.1°, and −43.5±3.2° for the AT, DAT, and NoAT groups, respectively (Figure 3, "x"). After the participants were instructed to use the clockwise strategy (Figure 2D), the target error was reduced immediately to 3.5±4.4°, 1.0±4.3°, and −2.5±6.6°, values that were not significantly different from each other (F(2,27) = 1.96, p = 0.16).

The participants were then instructed to use the strategy and required to produce a total of 320 reaching movements under the CCW rotation. This extended phase allowed us to a) verify that error increased over time, drifting in the direction of the strategy, and b) determine if the magnitude of the drift would approximate the magnitude of the rotation, a prediction of the simplest form of the setpoint model. Consistent with the results of Mazzoni and Krakauer [9], error increased in the direction of the strategy over the initial phase of the rotation block. However, the extent of the drift fell far short of the magnitude of the rotation. To quantify the peak drift, each participant's time series of endpoint errors was averaged over bins of 10 movements and we identified the bin with the largest error. Based on this estimate of peak drift, a significant difference was observed between groups (F(2,27) = 21.9, p < 0.001; Figure 4A). This is consistent with the prediction of the model that the salience of the aiming targets would influence the estimation of the aiming location. Drift was largest when the aiming targets were always visible, and progressively less for the DAT and NoAT groups. Drift was not isolated to particular target locations (Figure 4B).

Figure 3. Group averaged endpoint error relative to the target for the three groups. Participants first practiced moving to the cued target without a rotation (black) and while using the strategy without a rotation (orange). The rotation was turned on between movements 121 and 443 (dashed vertical lines). For the first two of these trials, the rotation probes, the participants had not been given the strategy (X's). For the next 320 rotation trials, participants were instructed to use the strategy. Following this, the rotation was turned off and participants were instructed to move towards the cued target, first without endpoint feedback (X's) and then with endpoint feedback (circles). A) Aiming-Target Group (blue). B) Disappearing Aiming-Target Group (magenta). C) No Aiming-Target Group (red). Shading represents the 95% confidence interval of the mean. doi:10.1371/journal.pcbi.1001096.g003

Our rotation plus strategy block lasted 320 trials, nearly four times the number of trials used by Mazzoni and Krakauer [9]. This larger window provides an interesting probe on learning given that the participants become progressively worse in performance with respect to the target over the drift phase.
While the AT group had the largest drift, they eventually showed a change in performance such that the heading angle at the end of the rotation block was close to 45° CW from the green target (Figure 3A). By the end of training, their target error was only 0.3±3.9°, which was not significantly different from zero (t(9) = 0.17, p = 0.85). We did not observe a consistent pattern in how these participants counteracted the drift (Figure 5). Two participants showed clear evidence of an abrupt change in their performance, suggesting a discrete change in their aiming strategy. For the other eight AT participants, the changes in performance were more gradual.

The drift persisted over the 320 trials of the rotation block for participants in the DAT group (Figure 3B). The average drift was 5.9±4.8° at the end of training, a value that was significantly greater than zero (t(9) = 2.40, p = 0.04). Given that the NoAT group showed minimal drift, we did not observe any consistent changes in performance over the block. At the end of training, the mean target error was only 1.0±1.9°, a value which is not significantly different from zero (t(9) = 1.01, p = 0.33).

The effect of aiming target availability

The availability or certainty in the estimate of the aiming location was manipulated by altering the presence of the aiming target across the groups. As predicted by the setpoint model, the degree of drift was attenuated as the availability of the aiming targets decreased. In the current implementation of our model, this decrease in drift rate is captured by a decrease in the adaptation rate (B): with greater uncertainty, the weight given to the error term for updating the internal model is reduced.

However, one prediction of this model is at odds with the empirical results. Variation in the adaptation rate not only predicts a change in drift rate, but also predicts a change in the washout period. Specifically, decreasing the adaptation rate should produce a slower washout, or extended aftereffect (Figure 1A cyan). This prediction was not supported.
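The link between the adaptation rate and washout can be checked with a short Python sketch of the fixed-strategy setpoint model. It is an illustration under stated assumptions: A = 1, a 320-trial rotation block, a 120-trial washout in which both rotation and strategy are set to zero, and feedback on every washout trial (the experiment withheld feedback for the first eight). The helper names are hypothetical.

```python
def simulate_with_washout(B, A=1.0, n_rotation=320, n_washout=120):
    """Fixed-strategy setpoint model followed by washout (no rotation, no strategy)."""
    r_est = 0.0
    target_errors = []
    for trial in range(n_rotation + n_washout):
        in_rotation = trial < n_rotation
        r = -45.0 if in_rotation else 0.0
        s = 45.0 if in_rotation else 0.0
        feedback = s + r - r_est                  # Eq. 3
        target_errors.append(feedback)
        r_est = A * r_est + B * (feedback - s)    # aiming error drives the update
    return target_errors


def trials_to_halve(errors, start):
    """Trials after `start` until the error drops below half its initial size."""
    initial = abs(errors[start])
    for i, e in enumerate(errors[start:]):
        if abs(e) < initial / 2:
            return i
    return None


fast = simulate_with_washout(B=0.02)
slow = simulate_with_washout(B=0.01)
print(round(fast[50], 1), round(slow[50], 1))                  # drift early in the block
print(trials_to_halve(fast, 320), trials_to_halve(slow, 320))  # washout speed
```

In this sketch the lower adaptation rate gives both slower drift and a slower decay of the aftereffect, which is the coupling that the B-based account cannot avoid.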
The washout rates are similar across the three groups (bootstrap, p > 0.11 between all groups). One could hypothesize different adaptation rates during the rotation and washout phases, with the effect of target certainty only relevant for the former. However, a post hoc hypothesis along these lines is hard to justify.

Alternatively, it is possible that the adaptation rate (B) is similar for the three groups and that the variation in drift rate arises from another process. One possibility is that the manipulation of the availability of the aiming targets influences the certainty of the desired strategy term in Equation 4, and correspondingly, modifies the aiming error term:

$$e^{\mathrm{aiming}}_n = r_n - r^{\mathrm{est}}_n + \left(s^{\mathrm{actual}}_n - K\,s^{\mathrm{desired}}_n\right) \tag{6}$$

A value of K that is less than 1 will attenuate drift (Figure 1A magenta line; simulated with K = 0.5) because the strategy output (Eq. 3) and the desired strategy (Eq. 6) do not completely cancel out. Consequently, the error used to adjust the internal model will be smaller and produce attenuated drift (Figure 1B magenta line). Moreover, because the strategy is no longer used during the washout phase, the K term is no longer relevant. Thus, the washout rates should be identical across the three groups, assuming a constant value of B.

In sum, while variation in B or K can capture the group differences in drift rate, only the latter accounts for the similar rates of washout observed across groups. When the availability of the aiming targets is reduced, either by flashing them briefly or eliminating them entirely, the participants' certainty of the aiming location is attenuated. This hypothesis is consistent with the notion that the aiming locations serve as a proxy for the predicted aiming location.

Figure 4. Time course of drift and aftereffect, and the relationship of drift to target location and aftereffect. A) Average endpoint angular error relative to the target for the three groups, binned by averaging over epochs of ten movements (AT group in blue, DAT group in magenta, NoAT group in red). B) Peak drift with respect to the eight target locations for the three groups. The empty circles are the target locations. To identify peak drift, 10 bins of four movements were calculated for each direction. C) Angular error after the rotation was turned off and participants were instructed to stop using the strategy. Triangles are the average of the first eight post-rotation trials, performed without visual feedback. Squares are the washout block with feedback. D) Relationship of drift and aftereffect based on the estimated peak drift for each participant and the first eight post-rotation trials. For B) and D), the means and 95% confidence interval of the mean were estimated through bootstrapping. doi:10.1371/journal.pcbi.1001096.g004

Figure 5. Performance during the rotation block of three participants. A and B are from the AT group; C is from the DAT group. A) Drift followed by large fluctuations in error. B) Drift followed by an abrupt change in target error. C) Continuous drift across training. doi:10.1371/journal.pcbi.1001096.g005

Strategy adjustment based on performance error

As noted above, none of the participants showed drift approaching 45°. Even those exhibiting the largest drift eventually reversed direction such that they became more accurate over time in terms of reducing endpoint error with respect to the target location. To capture this feature of the results, we considered how participants might vary their strategy over time as performance deteriorates. It is reasonable to assume that the participant may recognize that the adopted strategy should be modified to offset the rising error. One salient signal that could be used to adjust the strategy is the target error, the difference between the target location and the visual feedback. To capture this idea, we modified the setpoint model, setting the strategy as a function of target error (Figure 2E):

$$s_{n+1} = E\,s_n - F\,e^{\mathrm{target}}_n \tag{7}$$

where E defines the retention of the state of the strategy and F defines the rate of strategic adjustment. As target error grows (i.e., drift), the strategy will be adjusted to minimize this error (Figure 2E). In our initial implementation of the setpoint model, the strategy term was fixed at 45°. Equation 7 allows the strategy term to vary, taking on any value between 0° and 360°.

The availability of the aiming targets, captured by K in Eq. 6, influences the magnitude of the drift. Greater drift occurs when the aiming error, that between the feedback location and aiming location, is salient (Figure 1B red line; K = 1). However, when the target error grows too large, adjustments to the strategy begin to gain momentum and performance becomes more accurate with respect to the target given the change in strategy (Figure 1C blue line; simulated with E = 1 and F = 0.01).
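The interplay of Eqs. 3, 6 and 7 can be explored with the following Python sketch. It is a hedged illustration, not the fitted model: it assumes that the executed strategy equals the desired strategy on every trial, that the strategy starts at +45°, that A = 1 and B = 0.02, and that the rotation block lasts 320 trials. The name `simulate_flexible` is hypothetical; K = 0.5 and F = 0.05 are the alternative values given in the Figure 1 caption.

```python
def simulate_flexible(K=1.0, E=1.0, F=0.01, A=1.0, B=0.02,
                      n_trials=320, r=-45.0, s0=45.0):
    """Setpoint model with a strategy that is adjusted by target error."""
    r_est, s = 0.0, s0
    target_errors = []
    for _ in range(n_trials):
        e_target = s + r - r_est              # Eq. 3
        e_aiming = r - r_est + (s - K * s)    # Eq. 6, assuming actual == desired strategy
        target_errors.append(e_target)
        r_est = A * r_est + B * e_aiming      # internal model update (aiming error)
        s = E * s - F * e_target              # Eq. 7: strategy update (target error)
    return target_errors


salient = simulate_flexible(K=1.0, F=0.01)        # aiming targets fully available
uncertain = simulate_flexible(K=0.5, F=0.01)      # reduced certainty of aiming location
target_driven = simulate_flexible(K=1.0, F=0.05)  # heavier weighting of target error

for name, errs in [("K=1, F=0.01", salient),
                   ("K=0.5, F=0.01", uncertain),
                   ("K=1, F=0.05", target_driven)]:
    print(name, "peak drift:", round(max(errs), 1), "final error:", round(errs[-1], 1))
```

Under these assumptions the target error first rises and then falls back as the strategy is revised, and either a smaller K or a larger F lowers the peak of the drift.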
More emphasis on target errors rather than the aiming error results in less drift (Figure 1C orange line; simulated with F = 0.05). Thus, the relative values of K and F determine the degree of performance error that is tolerated before strategic adjustments compensate to offset the drift (Figure 1C).

This setpoint model (Eqs. 8–11) was fit by bootstrapping (see methods) each group's time series of target errors:

$$e^{\mathrm{target}}_n = s^{\mathrm{actual}}_n + r_n - r^{\mathrm{est}}_n \tag{8}$$

$$e^{\mathrm{aiming}}_n = r_n - r^{\mathrm{est}}_n + \left(s^{\mathrm{actual}}_n - K\,s^{\mathrm{desired}}_n\right) \tag{9}$$

$$r^{\mathrm{est}}_{n+1} = A\,r^{\mathrm{est}}_n + B\,e^{\mathrm{aiming}}_n \tag{10}$$

$$s^{\mathrm{desired}}_{n+1} = E\,s^{\mathrm{desired}}_n - F\,e^{\mathrm{target}}_n \tag{11}$$

The fits (Figure 6A–C and Table 1) show that K is the greatest for the AT group and progressively less for the DAT group and the NoAT group (AT vs DAT group: p = 0.003; AT vs NoAT groups: p < 0.001; DAT vs NoAT groups: p < 0.001). When the aiming targets remain visible, the aiming error signal is readily available, and the weight given to the strategic aiming location, K, is larger. Conversely, the weight given to the target error, F, is significantly greater for the NoAT group compared to the AT and DAT groups (NoAT vs AT group: p = 0.005; NoAT vs DAT group: p = 0.001). These results are consistent with the hypothesis that participants in the NoAT group rely more on target errors because the absence of the aiming targets removes a reference point for generating a reliable aiming error (Eq. 9). The dynamics of the recalibration process and strategy state (Eqs. 10 and 11) are plotted in Figure 6D. These parameters, along with the other parameters that represent the memory of the internal model (A), the adaptation rate (B), and the memory of the strategy (E), are listed in Table 1.

Following the rotation block, we instructed the participants that the rotation would be turned off and they should reach to the cued green target. For the first eight trials, no endpoint feedback was presented. This provided a measure of the degree of sensorimotor recalibration in the absence of learning (Figure 4C triangles). Aftereffects were observed in all three groups.
The average error was significantly different from zero in the CW direction from the green target for all three groups (one-sample t-test for each group, p < 0.001). In comparisons between the groups, the AT group showed the largest aftereffect of 19.2 ± 3.7° (t(18) = 3.5, p = 0.003 and t(18) = 5.61, p < 0.001 compared to the DAT and NoAT groups, respectively). The mean aftereffects for the DAT and NoAT groups were 10.4 ± 3.2° and 6.8 ± 2.2°, values that were not significantly different. When endpoint feedback was again provided, the size of the aftereffect diminished over the course of the washout block (Figure 4C, squares).

Relationship between drift and aftereffect

In the setpoint model, the internal model will continue to adapt even in the face of strategic adjustments adopted to improve endpoint accuracy. As such, the model predicts that the size of the aftereffect should be larger than the degree of drift. To test this prediction, we compared the peak drift during the rotation block to the aftereffect. In the preceding analysis, we had estimated peak drift for each participant by averaging over 10 movements and identifying the bin with the largest error. However, a few errant movements could easily bias the estimate of drift within a 10-movement bin. As an alternative procedure, we used a bootstrapping procedure to identify the bin with the largest angular error for each group. This method should decrease the effect of noise because the estimate of peak drift is selected from an averaged sample of the participants' data. Moreover, any bias in the estimate of the magnitude of the peak should be uniform across the three groups of participants. For consistency, we estimated the aftereffect (the first 8 trials without feedback) using the same bootstrap procedure. For the AT group, the peak drift was 14.8 ± 2.5° in the CW direction, occurring 64 ± 30 movements into the rotation block. For the DAT group, the peak drift was 10.0 ± 1.8°, occurring at a later point in the rotation block (130 ± 106).
For the NoAT group, peak drift was only 3.2 ± 2.7° and occurred after 145 ± 131 movements. As predicted by the model, the aftereffect was significantly larger than peak drift for the AT and NoAT groups (Figure 4D; bootstrap: p = 0.002 and p < 0.001, respectively). The difference between the degree of peak drift and aftereffect in the DAT group was not reliable. It is important to emphasize that estimates of the time of peak drift should be viewed cautiously, especially in terms of comparisons between the three groups. These estimates have lower variance for the AT group because it was easier to detect the point of peak drift in this group compared to the DAT and NoAT groups.
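The bootstrap estimate of peak drift described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' analysis code: on each bootstrap iteration participants are resampled with replacement, their error curves are averaged, the average is split into 10-movement bins, and the bin with the largest mean error is recorded. The function names, the number of resamples, the use of signed errors with CW as positive, and the choice to report a bin by its first movement are all assumptions.

```python
import random
from statistics import mean


def bin_means(series, bin_size=10):
    """Average consecutive, non-overlapping bins of `bin_size` movements."""
    n_bins = len(series) // bin_size
    return [mean(series[i * bin_size:(i + 1) * bin_size]) for i in range(n_bins)]


def bootstrap_peak_drift(errors_by_participant, n_boot=1000, bin_size=10, seed=0):
    """Bootstrap the peak of the group-averaged, binned error curve.

    errors_by_participant: list of equal-length lists of signed angular
    errors in degrees (assumed convention: CW positive).
    Returns (peaks, positions): for each resample, the largest binned error
    and the 0-based index of the first movement in that bin.
    """
    rng = random.Random(seed)
    n_subj = len(errors_by_participant)
    n_trials = len(errors_by_participant[0])
    peaks, positions = [], []
    for _ in range(n_boot):
        # Resample participants with replacement, then average across them.
        sample = [rng.choice(errors_by_participant) for _ in range(n_subj)]
        group_curve = [mean(p[t] for p in sample) for t in range(n_trials)]
        # Bin the averaged curve and find the bin with the largest error.
        binned = bin_means(group_curve, bin_size)
        best = max(range(len(binned)), key=lambda i: binned[i])
        peaks.append(binned[best])
        positions.append(best * bin_size)
    return peaks, positions
```

The spread of the returned peak magnitudes and positions across resamples gives a variability estimate of the kind quoted above; how the original analysis summarized that spread is not specified in this passage.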

Figure 6. Setpoint model fit. The setpoint model with direct feedthrough (Equations 8–11) was fit to each group's data (from Figure 3) via bootstrapping. A–C: Solid functions are averaged model fits for the AT group (blue), DAT group (magenta), and NoAT group (red) in comparison with actual group-averaged data (black). D: The target error is the combination of learning within the sensorimotor recalibration process (solid) and the aiming location associated with the strategy (dashed), shown here for the AT group (blue), DAT group (magenta), and NoAT group (red). doi:10.1371/journal.pcbi.1001096.g006

Table 1. Modeling results for each group based on the setpoint model (Eqs. 8–11).

| Parameter | AT Group | DAT Group | NoAT Group |
|---|---|---|---|
| A | 0.991 ± 0.002 | (shared) | (shared) |
| B | 0.012 ± 0.003 | (shared) | (shared) |
| E | 0.999 ± 0.001 | (shared) | (shared) |
| K | 0.985 ± 0.034 | 0.409 ± 0.122 | 0.195 ± 0.108 |
| F | 0.023 ± 0.006 | 0.002 ± 0.003 | 0.725 ± 0.319 |
| Goodness of fit (r) | 0.682 ± 0.075 | 0.713 ± 0.069 | 0.650 ± 0.081 |

A: Retention factor of the internal model; B: Adaptation rate based upon aiming errors; E: Retention factor of the strategy; K: Availability of the strategic aiming location; F: Adjustment rate of the strategy based upon target errors (high values favor strategy change). A, B, and E were constrained to be the same across groups, while parameters K and F were allowed to vary between groups. The means and 95% confidence interval of the mean were estimated through bootstrapping. doi:10.1371/journal.pcbi.1001096.t001

Discussion

Behavioral summary

Visuomotor rotation tasks are well-suited to explore how explicit cognitive strategies influence sensorimotor adaptation. Following the approach introduced by Mazzoni and Krakauer [9], we instructed participants to aim 45° CW in order to offset a −45° rotation. Between groups, we manipulated the information available to support the strategy by either constantly providing an aiming target, blanking the aiming target at movement initiation, or never providing an aiming target. In all groups, the strategy was initially effective, resulting in the rapid elimination of the rotation-induced endpoint error.
However, when the aiming target was present, participants showed a drift in the direction of the strategy, replicating the behavior observed in Mazzoni and Krakauer [9]. This effect was markedly attenuated when the aiming target was not present, suggesting that an accurate estimate of the strategic aiming location is responsible for causing the drift. In addition, when the drift became quite large (as in the AT group), participants began to adjust their strategy to offset the implicit drift.

Incorporating a strategy into state-space models

Mathematical models of sensorimotor adaptation have not explicitly addressed how a strategy influences learning and performance. By formalizing the effect of strategy usage into the standard state-space model of motor learning, we can begin to evaluate qualitative hypotheses that have been offered to account for the influence of strategies on motor learning. Mazzoni and Krakauer [9] suggested that drift reflects the interaction of the independent contribution of strategic and implicit learning processes in movement execution. Current models of adaptation cannot be readily modified to account for this interaction. Rather, we had to consider more substantive architectural changes. Borrowing from engineering control theory, we used a setpoint model in which the internal model can be recalibrated around any given reach location. The idea of a setpoint is generally implicit in most models of learning, but this component does not come into play since the regression is around zero. However, simply making the setpoint explicit is not sufficient to capture the drift phenomenon. The strategy must have direct feedthrough to the output equation in order to implement the explicit strategy while allowing for an internal model to implicitly learn the visuomotor rotation. This simple setpoint model was capable of completely eliminating error on the first trial and capturing the deterioration of performance with increased training.

Drift arises because the error signal is driven by the difference between the internal model's prediction of the aiming location and the actual endpoint feedback. The idea that an aiming error signal is the source of drift is consistent with the conjecture of Mazzoni and Krakauer [9]. An important observation in the current study is that, given uncertainty in the prediction of the aiming location, participants use external cues as a proxy in generating this prediction. This hypothesis accounts for the observation that drift was largest when the aiming target was always visible, intermediate when the aiming target was only visible at the start of the trial, and negligible when the aiming target was never visible. The aiming target, when present, served as a proxy for predicted hand position, and helped define the error between the feedback cursor and aiming location in visual coordinates. When the aiming target was not present, the aiming location was less well-defined in visual coordinates, and thus, the relationship between the aiming location and feedback cursor was less certain. Under this condition, the participant's certainty of the error was reduced and adaptation based on this signal was attenuated.
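As a rough worked illustration (a sketch under stated assumptions, not a calculation reported in the paper), consider the aiming error of Eq. 9 on the first rotation trial. Assume the actual and desired strategies coincide at the instructed value, s = 45°, the rotation is r = −45°, and the internal model is still unadapted (r^est = 0). The aiming error then reduces to −45°·K, and the fitted K values from Table 1 give the size of the signal that drives drift in each group.

```latex
\begin{align*}
e_1^{aiming} &= r_1 - r_1^{est} + (s - K s) = r_1 - r_1^{est} + (1 - K)\,s \\
             &= -45^\circ - 0 + (1 - K)\,45^\circ = -45^\circ K \\[4pt]
\text{AT } (K = 0.985):&\quad e_1^{aiming} \approx -44.3^\circ \\
\text{DAT } (K = 0.409):&\quad e_1^{aiming} \approx -18.4^\circ \\
\text{NoAT } (K = 0.195):&\quad e_1^{aiming} \approx -8.8^\circ
\end{align*}
```

With the same adaptation rate B in all groups, a smaller initial aiming error means a smaller trial-by-trial update of the internal model, which is one way to read the ordering of drift across the AT, DAT, and NoAT groups.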
Quantitatively, progressively smaller values of K were observed with decreasing availability of the aiming targets. The attenuation of adaptation with increasing uncertainty (as reflected by reduced drift) is similar to the effects on adaptation predicted by a Kalman filter when measurement noise is large. Several studies have shown that adaptation rates can change when the certainty of sensory information is manipulated [31,32]. In our study, variation in certainty of the desired aiming location (K) influenced the magnitude of drift. As the availability of the aiming targets was reduced, the corresponding estimate of the aiming error became less certain, producing slower adaptation of the internal model, or reduced drift. Moreover, since K directly operates on the estimate of the strategic aiming location, this parameter does not affect the rate of washout since the strategy is no longer used. Consistent with this prediction, the rate of washout was similar across the three groups.

The effect of the visual landmarks on adaptation also provides insight into why other studies have not observed drift, even when participants develop some explicit awareness of the rotation, and presumably, use that knowledge [8,10,11,24,25] to improve performance rapidly. Several key methodological differences are relevant. First, in most visuomotor rotation studies, online visual feedback is provided during the movements. This may impede drift because participants observe the causal relationship between movement of their hand and the endpoint cursor feedback [33]. Drift itself could be corrected by online feedback. Second, participants in the earlier studies were not given a clear, explicit strategy, and importantly, were not provided with visual landmarks that could support a self-generated strategy. Under such conditions, participants face a difficult estimation process. The absence of landmarks would increase uncertainty in implementing a self-generated strategy. Moreover, the motor system would not have a salient visual signal for grounding the comparison of feedback and aiming location.
As shown by our nonaiming target condition, drift is minimal when the landmarks are absent. Thus, the absence of drift in the visuomotor adaptation literature cannot be taken as evidence that strategies are not relevant. It is likely that, when initial error signals are large, learning involves a combination of strategic and recalibration processes.

Two sources of errors

Our model entails two types of error signals: an aiming prediction error between the feedback location and aiming location, and a performance error between the feedback location and the target location (Figure 2E). The aiming error drives the drift phenomenon while the target error is used to restore performance. Intuitively, the motor system should be able to recalibrate the internal model around any desired reach location, a feature captured by the setpoint model. When there is an accurate estimate of the strategy (the setpoint), then the strategy naturally falls out of the error equation, allowing the internal model to recalibrate around any position. The setpoint mechanism is revealed when a strategy is imposed to counteract a visuomotor rotation. A counterintuitive consequence of this process is the rise in error over time because the motor system is recalibrating around the strategic aiming location (or its proxy) and not the target location.

Interestingly, while there was an initial rise in endpoint error, this function eventually reversed, returning close to zero endpoint error by the end of the strategy phase for the AT group. We assume that at some point, the size of the endpoint error exceeded the participant's self-defined tolerance for errors and caused them to modify the strategy. Unfortunately, we do not have a direct measure of strategy change. Examination of the learning profiles revealed considerable variability across individual participants (Figure 4A and Figure 5). This variability likely reflects multiple sources of noise, as well as instability in the use of a strategy. We obtained self-reports in a debriefing session at the end of the experiment.
A few subjects in the AT and DAT groups reported adjusting their strategy such that they reached to a location between the cued target and aiming target, or that they shifted to reach straight to the cued target.

At a minimum, multiple processes are required to capture this nonmonotonic learning function. In our initial modeling efforts, we fixed the strategy for the entire training process. Under this assumption, the system should exhibit drift that is equal in size to the rotation, an effect never observed. Thus, the final version of our model is a variant of a two-rate state-space model [28], but with the two rates reflecting different error sources. As described above, adaptation of an implicit model is driven by the aiming error. In contrast, the strategy is adjusted on a trial-by-trial basis as a function of the current target error. Target errors are initially quite small and thus have little effect on performance. However, as the target errors become large due to adaptation of the internal model, adjustments in the strategy are required to improve endpoint accuracy. Aiming to a new location resets the recalibration around a new setpoint. To reach a stable state, participants would need to progressively adjust their strategic aiming location to a point where aiming error and target error cancel each other out.

It is reasonable to assume that our manipulation of the availability of the aiming locations influenced the degree of certainty associated with the desired aiming location. When certainty is reduced, adaptation arising from the aiming error signal is slower, and in our two-process model, the level of adaptation achieved by the motor system is lowered. Moreover, the model does not predict that drift will reach 45°. The strategy is adjusted, reaching a point where it offsets the drift arising from adaptation of the internal model. The interplay of these two processes is complex (Figure 6D). With both occurring continuously during training, the system reaches a pseudo-equilibrium state at which additional changes to both processes become relatively small.

Linking the strategy adjustment to the target error signal offers a process-based approach to capture flexibility in strategy use. Our setpoint model captures this through the strategy adjustment parameter (F), a weighting term on target error. The NoAT group appears to give more weight to target error than the AT and DAT groups. Interestingly, the modeling results indicate that the AT group showed more utilization of the target errors than the DAT group. We assume this arises because the AT group eventually offset the relatively large drift to restore on-target accuracy. In contrast, the DAT group never corrected for drift, suggesting that the weight given to target errors for this group was nearly zero.

It is important to highlight one difference in how we conceptualize changes in the rate of strategy adjustment (F) compared to changes in the rate of adaptation (B). Adjustments in a strategy can occur on a very fast timescale; for example, once instructed, participants were able to immediately offset the full rotation. Variation in F refers to the rate at which participants change where to aim. In contrast, B reflects a gradual process, reflecting the rate of change in a system designed to reach a desired location.

Alternative models of strategy change

In many sensorimotor adaptation tasks, variable learning rates are used to model the substantial variability observed in individual learning curves.
In a similar manner, our setpoint model captures individual differences in strategy utilization by varying the strategy adjustment rate (F). Nonetheless, this formulation does not adequately capture the full range of behavior observed in the current study. In particular, this approach is insufficient to account for abrupt changes in performance. For example, the learning profile shown in Figure 5B suggests a categorical change in strategy. That is, the participant abandoned what was becoming an unacceptable strategy to search for a new strategy. Indeed, in a post-test interview, this participant reported changing the aiming location to a position halfway between the cued target and the aiming location.

An alternative approach to model strategy change could be derived from models of reinforcement learning [34–37]. In such models, participants explore different regions of a strategy space, attempting to quickly identify the policy that results in small target error. In our task, a shift in policy might occur when the rise in target error due to adaptation exceeds a threshold. That is, when a chosen action fails to achieve the predicted reward, a new strategy is adopted. This approach would provide a way to fit the data of the few participants who exhibited categorical-like changes in performance.

A reinforcement learning approach based on a discrete set of strategies is problematic with the current data set. At one extreme, one might suppose that such values could take on the locations of the aiming targets (e.g., 0° and 45°), and perhaps some intermediary points (e.g., 22.5°, the point halfway between two aiming targets). At the other extreme, the set might consist of a large set of values. Choosing a sparse set of potential actions will result in more abrupt changes in performance, while choosing a finer set of potential actions will allow for more gradual changes. Studies designed to explore reinforcement learning models generally use a limited set of choices and performance thus entails discrete shifts in behavior.
In our task, reach direction spans a continuous space, and in fact, for most of our participants, the changes in performance were gradual. Future experiments that constrain the set of potential actions and manipulate reward may be better suited for employing a reinforcement learning perspective to explore strategy change.

Qualitative changes in performance may also indicate that the participants have fundamentally changed their conceptualization of the task. For example, rather than view the task goal as one involving reaching to targets, the participant may have switched to an orientation in which the task goal involved mastering a game in which the hand is a tool [38–40]. By this account, the initial drift would result from the operation of implicit adaptation of an internal model of the arm as described above. However, when this drift became too large, the participant switched to treating the task as a game, with the arm now conceptualized in a manner similar to how we view a computer mouse. Accurate performance now required learning the appropriate transformation between the movements of the tool and the task workspace. The error signal for this form of learning would no longer be based on the difference of predicted hand/object location and the feedback location; we are able to readily accept that the movement of a mouse-driven cursor results in feedback in an alternative workspace. Rather, the error signal here is the difference between the cued target location and the feedback location. An error signal of this form would not produce drift.

The reconceptualization hypothesis would predict that peak drift should equal or be greater than the aftereffect. This follows from the idea that adaptation of the internal model should cease at the time the task goal changes from reaching to tool mastery. Once the participant switches from learning about their arm to learning how to play the visuomotor game, the internal model would not continue to learn. The target error gains emphasis and the aiming error falls out.
As such, the aftereffect should equal the drift value or be lower if there is some time-dependent decay of the adaptation effects [41]. While this hypothesis is plausible, there are also some limitations. First, it is important to keep in mind that in almost all adaptation studies, the only visual signals are the target location and a feedback cursor. Under such conditions, aftereffects are prominent, indicating adaptation of an internal model and not just learning a game. One would have to assume that tool conceptualization was more pronounced in the present study because of the strategic instructions. Second, our estimate of the aftereffect is actually larger than the peak drift for two of the three groups (Figure 4D). This observation, while at odds with the reconceptualization hypothesis, is consistent with the setpoint model. In our model, the aiming error signal will continue to modify the internal model even as strategy adjustments reduce target error. As such, the aftereffect, an estimator of implicit adaptation, should be equal to or larger than peak drift.

While future research will be required to explore the mechanisms of strategy change, the current study advances our understanding of the interactions that arise between explicit, strategic processes and implicit, motor adaptation. Consistent with Mazzoni and Krakauer [9], the results make clear that strategies should not be viewed simply as representations that can facilitate implicit learning mechanisms. Rather, implicit learning mechanisms operate with a considerable degree of autonomy and, under certain conditions, can override the influence of an explicit strategy. Nonetheless, the benefits of strategic capabilities are also