Optimizing Knowledge Component Learning Using a Dynamic Structural Model of Practice

In Proceedings of ICCM - 2007 - Eighth International Conference on Cognitive Modeling. 37-42. Oxford, UK: Taylor & Francis/Psychology Press.

Philip I. Pavlik Jr. (ppavlik@cs.cmu.edu), Human Computer Interaction Institute
Nora Presson (presson@cmu.edu), Psychology Department
Kenneth Koedinger (koedinger@cmu.edu), Human Computer Interaction Institute
Pittsburgh Science of Learning Center
Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213 USA

Abstract

This paper presents a generalized scheme for modeling learning in simple and more complex tasks, and shows how such a model can be applied to optimizing conditions of practice to maximize some desired performance. To enable this optimal allocation of lesson time, this paper describes how to quantify the preferences of students using utility functions that can be maximized. This conventional game theoretic approach is enabled by specifying a mathematical model that allows us to compute the expected utility of various student choices and to select the choice with maximal expected utility. This method is applied to several educational decisions that can benefit from optimization.

Keywords: Memory; Economics; Practice; Computer-Aided Instruction.

Introduction

This paper describes a method for applying economic principles in order to allocate the scarce resource of learning time toward satisfying the unlimited need for education. To do this, we describe a model that decomposes learning into individual knowledge components (KCs) that possess some degree of independence from other skills (a knowledge component is any proficiency that can be learned). By assuming this independence, the model accounts for the unique effects of practice on specific KCs, with the goal of optimizing the benefit of practice. We do not argue that the model is a precise representation of all the processes involved in learning, but rather that it provides a heuristic tool to track observed strengths of KCs as a general function of practice, so that improvement over time and across KCs can be optimized.
The model we will present, like similar models, is effective in capturing practice effects (Cen, Koedinger, & Junker, 2006). Further, it is interesting to note that the dynamic practice model presented here (based on the ACT-R computational model of declarative memory, Anderson & Lebiere, 1998) might be substituted with another model of cognition with only minimal modification to the approach. Although the model is a simplification of learning processes in most cases, this simplicity provides an important advantage in application. It allows closed form predictions of which learning events (LEs) might be assigned at what times to maximize learning (a learning event is any discrete interval over which a learned proficiency increases). Ultimately, it is explaining this collection of closed form predictions and recommendations that is the goal of this paper.

To explain these concepts this paper has three parts. The first section on the dynamic practice model is largely a review of the ACT-R model of declarative memory. This section serves to orient the reader on the output functions (probability and latency of recall) that will be used later. The second section on structural models details how compound events can be modeled using the dynamic practice model. Compound events are important to consider when responses are not independent and are especially relevant for certain kinds of optimization situations (i.e., part-task to whole-task transfer of performance). The final section shows several ways the model built in the first half of the paper can be applied to optimizing knowledge component learning.

Dynamic Practice Model

To understand the quantitative model that will be used to predict and optimize learning, we will begin with the equations that predict probability of correct performance and latency of correct performance as a function of the activation strength of a KC.

Probability Correct. The first dependent measure of KC performance is probability of correct response.
Equation 1 shows the standard Boltzmann equation (similar to the Rasch model used in item response theory), a logistic function that characterizes the threshold of correct performance (the level of activation at which performance is correct greater than 50% of the time) and distributional noise as τ and s respectively. Equation 1 describes a model of the probability of giving a correct response (p) for a given KC activation strength value (m) and the parameters described above.

$$p_m = \frac{1}{1 + e^{(\tau - m)/s}} \qquad \text{(Equation 1)}$$

Latency. A second dependent measure used to track KC performance is latency (labeled q in our model). Various sources suggest modeling latency with a Weibull distribution (Anderson & Lebiere, 1998; Logan, 1995). Such a Weibull distribution can be produced by using Equation 2 to represent latency as a function of F (which scales latency magnitude), m (memory strength) and a fixed cost (which is determined from data and captures the minimum time necessary for perceptual and motor costs of responding). Logistic noise on m determines the shape of the aggregate Weibull function for a population of response latencies.

$$q_m = F e^{-m} + \text{fixedtimecost} \qquad \text{(Equation 2)}$$

Knowledge Component Strength Function

Given these two output functions, which correspond to two important ways of measuring KC performance, we can now elaborate how current m is computed as a function of the history of a student's practice of a KC practice item across prior LEs. Equation 3 shows this KC strength function. The history term, the final portion of Equation 3, is essentially described by three values, t, d and b, for each LE. The values for t represent the times since each past LE (the ages of each LE effect). The d values are the power law decay values for each LE. The b values scale the effect of each LE depending on the amount of learning for the LE (i.e., longer duration LEs and successful test LEs result in higher b values). To model history, the quantity b t^{-d} is summed over the learning events (LEs). The logarithm serves to scale the quantity from -∞ to +∞. This power law decay formulation was first explored by Anderson and Schooler (1991), who showed that it results in patterns of forgetting that match the relative need for performance in the environment.

The β parameters, the first portion of Equation 3, capture naturally occurring error when the model is fit to data from multiple students or multiple KCs. β_s is the parameter that captures consistent error across KCs as a function of student. β_i captures consistent error across students as a function of KC (i stands for item). Finally, β_si captures the residual error for a specific KC and a specific student over multiple LEs.

$$m = \beta_s + \beta_i + \beta_{si} + \ln\left(\sum_{k=1}^{n} b_k t_k^{-d_k}\right) \qquad \text{(Equation 3)}$$

$$b_{study} = g\left(1 - e^{-v \cdot \text{studyduration}}\right) \qquad \text{(Equation 4)}$$

Equation 4 shows how b can be computed as a function of the duration of a study LE (where v and g represent a growth constant and the maximum possible encoding respectively).
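As a minimal sketch, Equations 1 through 4 can be written as plain functions. The function names and every numeric value below are illustrative assumptions, not fitted parameters from the paper.

```python
import math

def p_correct(m, tau, s):
    """Equation 1: logistic probability of a correct response at activation m."""
    return 1.0 / (1.0 + math.exp((tau - m) / s))

def latency(m, F, fixed_time_cost):
    """Equation 2: predicted latency of a correct response."""
    return F * math.exp(-m) + fixed_time_cost

def b_study(duration, g, v):
    """Equation 4: encoding weight of a study LE, saturating at g."""
    return g * (1.0 - math.exp(-v * duration))

def activation(ages, bs, ds, beta_s=0.0, beta_i=0.0, beta_si=0.0):
    """Equation 3: strength from per-LE ages t_k, weights b_k and decays d_k."""
    history = sum(b * t ** (-d) for t, b, d in zip(ages, bs, ds))
    return beta_s + beta_i + beta_si + math.log(history)

# Illustrative values only (assumed, not taken from the paper).
m = activation(ages=[100.0, 20.0], bs=[1.0, 1.0], ds=[0.5, 0.5])
print(p_correct(m, tau=-0.7, s=0.25), latency(m, F=1.5, fixed_time_cost=0.6))
```

At m = τ the logistic returns exactly 0.5, which matches the description of τ as the threshold of correct performance.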
This captures the notion that continuous time spent on a single KC has a diminishing effect on learning (Metcalfe & Kornell, 2003). Recent work by Pavlik (in press) has shown how this b scalar can be used to capture the learning difference between active correct responding and passive study. In such work, b_successfulretrieval is typically set at a constant, whereas b_study varies as described in Equation 4. This supposes two canonical forms of the LE: the study LE, which comes from unassessed study over some fixed period of time of a stimulus representing a KC, and the test LE, which comes from a variable-duration assessment of learning (test LEs are often followed by a study opportunity and are then called "drill LEs").

Test LEs are interesting not only because they tend to lead to more learning than passive study (for correct responses), but also because they provide information about the current state of learning that can be used to implement knowledge tracing. Such knowledge tracing algorithms have changed form over different applications of the model. In the original version (Pavlik Jr., 2005), the distribution of residual β_si variance is used as the initial Bayesian prior for item strength and numerical integration is used to adjust this value after each practice by integrating the logistic distribution for correctness given the response of the student. In the more recent version, we have found that a more computationally inexpensive model that allows the simpler b_successfulretrieval parameter to capture the β_si variance works well in practice (Pavlik Jr. et al., 2007). Further, the latest version also uses a b_latency parameter multiplied by each b_successfulretrieval parameter for each successful test. This b_latency parameter is a natural log transform (with a scalar parameter) of the difference between q_m (the predicted latency) and the latency data from the student. This creates a knowledge tracing model that assumes that faster responding means more learning has occurred.
Equation 5 shows a more recent modification of the ACT-R equations to capture the spacing effect, the spacing-by-practice interaction, and the spacing-by-retention interval interaction (Pavlik Jr. & Anderson, 2005). This change says that the forgetting rate from any LE depends on the level of activation at the time of the LE. As modeled in Equation 5, when spacing between trials gets wider, activation decreases between presentations; decay is therefore less for each new presentation, and long-term probability of correct performance does not decrease as much. In Equation 5, the decay rate d_k is calculated for the kth presentation of a KC item as a function of the activation m_{k-1} at the time the presentation occurred (e.g., the decay rate for the 7th LE, whose age is t_7, depends on the activation at the time of the 7th LE, which is a function of the time from last exposure of the prior 6 LEs and their decay rates). It is important to note that since the t_k values are ages (differences between the current time and the time of the past trial), activation and decay depend on the current time as well as the number of LEs.

$$d_k(m_{k-1}) = c\,e^{m_{k-1}} + a \qquad \text{(Equation 5)}$$

Anderson, Fincham, and Douglass (1997) found that Equation 3 could account for practice and forgetting during an experiment, but it could not fit retention data over long intervals. Because of this, they concluded that between sessions, the presence of intervening events erodes KCs more slowly than during an experimental session. This slower forgetting was modeled by scaling time as if it were slower outside the experiment. Forgetting is therefore dependent on the psychological time between presentations, rather than the true intersession interval. This factor is implemented by multiplying the portion of time that occurs between sessions by h (a scalar parameter for time) when calculating recall. This is done by subtracting h*total intersession time from each age (t_k) in Equation 3 (Pavlik Jr., 2005; Pavlik Jr. & Anderson, 2005). Because of this mechanism, time in the model is essentially a measure of destructive interfering events.
The decay rate, therefore, is a measure of the fragility of memories to the corrosive effect of these other events.
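The activation-dependent decay of Equation 5 can be illustrated with a short recursion. This is a sketch under stated assumptions: the β terms are set to zero, every LE has the same weight b, the first LE has no prior trace (so its decay equals a), intersession time scaling by h is left out, and the values of c and a are illustrative rather than fitted.

```python
import math

def decay_rates(le_times, c, a, b=1.0):
    """Equation 5: d_k = c * exp(m_{k-1}) + a, computed LE by LE.

    le_times must be strictly increasing. m_{k-1} is the activation
    (Equation 3 with beta terms assumed zero) just before the k-th LE.
    """
    decays = []
    for k, t_now in enumerate(le_times):
        prior = sum(b * (t_now - t_j) ** (-d_j)
                    for t_j, d_j in zip(le_times[:k], decays))
        m_prev = math.log(prior) if prior > 0 else float("-inf")
        decays.append(c * math.exp(m_prev) + a)
    return decays

def activation_at(now, le_times, decays, b=1.0):
    """Equation 3 history term evaluated at time `now`."""
    return math.log(sum(b * (now - t) ** (-d)
                        for t, d in zip(le_times, decays)))

C, A = 0.2, 0.18            # assumed decay scale and intercept
RETENTION = 100000.0        # same delay after the last LE in both schedules

massed = [0.0, 10.0, 20.0]
spaced = [0.0, 1000.0, 2000.0]
d_massed = decay_rates(massed, C, A)
d_spaced = decay_rates(spaced, C, A)
m_massed = activation_at(massed[-1] + RETENTION, massed, d_massed)
m_spaced = activation_at(spaced[-1] + RETENTION, spaced, d_spaced)
print(d_massed, d_spaced, m_massed, m_spaced)
```

With these assumed values the spaced schedule yields lower decay rates for the second and third LEs and a higher activation after the long retention interval, which is the qualitative spacing effect the equation is meant to capture.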

This model has the flexibility to capture many varieties of learning and practice effects. To further understand this flexibility, consider the issue of more implicit production rule (procedural) learning in contrast to explicit factual (declarative) learning. This distinction is supported by research from widely distinct theoretical perspectives such as ACT-R and connectionism and is supported by dissociable neural mechanisms (McClelland, McNaughton, & O'Reilly, 1995). We might wonder whether the equations just presented are adequate to capture both knowledge (and KC) types. Specifically, that work implies that declarative learning is both faster (reflected by a larger b parameter) and more easily forgotten (reflected by a larger d parameter) than procedural learning, and our model can clearly characterize these differences.

Structural Model

The structural level model assumes that few domains are made up of entirely independent KCs, as seems to be implied in the model we just presented. The word structural refers to the fact that, because of this lack of independence, the modeler must be concerned with the structure that links the multiple KCs and their association. In many domains, predictions of the probability of correct response and latency are derived from the strength of more than one underlying KC. For example, in studies of Chinese vocabulary learning, stimuli can be presented in one of four modes (Hanzi character, pinyin text, sound file, and English text). This results in 16 possible test LE types, two of which are English→pinyin and Hanzi→pinyin drill LEs [(stimulus)→(response)]. In both of these cases, drill success depends not only on the strength of the link between the stimulus and response, but also on the ability to recall and produce the pinyin response. Because of this, performance for these pairs cannot be independent. Similarly, in work with a French gender identification task, words fall into gender categories based on spelling and semantic cues. For instance, words that end in -age are most often masculine in French, as in le fromage.
Although each of these words might yield a correct response independent of the general rule (through recall), it is also obvious that all rule exemplars share a KC that can be used to respond to any items in a cue category (and in fact, it is this generalized responding, rather than exemplar-based recall, that we want to optimize). To deal with the fact that multiple KCs are required for these single skills, we will propose two basic structural models that account for this, each of which fits some possible learning tasks: the conjunctive structure and the disjunctive structure.

Conjunctive Model

In a conjunctive model, all component KCs must be active to produce a correct response. For instance, in the Chinese vocabulary work, probability of correct performance for each trial is captured by the probability of correct recall for both the response and the link between the stimulus and the response. Given this model, probability correct depends on both the strength of the link and the strength of the response in a conjunctive function: p(link) * p(response), such that both elements are necessary for a correct response. The more general form for the conjunction of 2 KCs is shown in Equation 6. Latency, on the other hand, is handled as the sum of the perceptual motor costs, the cost for recall of the link KC, and the cost for recall of the response KC. Not only does this structural model handle the pinyin response example above, but it also captures data showing that responding with a word in the native language should be easier than the recently learned foreign equivalent (e.g., Schneider, Healy, & Bourne, 2002).

$$p(KC_1 \text{ and } KC_2) = p(KC_1)\,p(KC_2) \qquad \text{(Equation 6)}$$

Disjunctive Model

The disjunctive model, in contrast, assumes that a trial can yield a correct response due to performance of any one of the two or more independent KCs. Often disjunctive models apply in a generalization situation where the domain contains specific KCs that apply for individual stimuli and general KCs that each apply to a group of stimuli, as in the French gender case.
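The two ways of combining KC probabilities in this section can be sketched in a few lines. The conjunctive form follows Equation 6; the disjunctive form is the standard probability that at least one of two independent events occurs. Function names and the example probabilities are illustrative assumptions.

```python
def conjunctive(p_kc1, p_kc2):
    """Both KCs must succeed (e.g., link KC and response KC)."""
    return p_kc1 * p_kc2

def disjunctive(p_kc1, p_kc2):
    """Either independent KC suffices (e.g., general rule or rote item)."""
    return p_kc1 + p_kc2 * (1.0 - p_kc1)

def conjunctive_latency(perceptual_motor_cost, link_recall_cost, response_recall_cost):
    """Conjunctive latency: fixed cost plus the recall cost of each KC."""
    return perceptual_motor_cost + link_recall_cost + response_recall_cost

# Illustrative probabilities only.
print(conjunctive(0.9, 0.8))   # lower than either component
print(disjunctive(0.5, 0.5))   # higher than either component
```

A conjunctive skill can never be more likely to succeed than its weakest component, while a disjunctive skill is at least as likely to succeed as its strongest one.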
In this example, we can imagine that general group KCs control performance for clusters, the members of which can also be learned by rote. Given the example of a general (rule-based) and specific (rote) component controlling each performance, probability of correct skill performance depends on the strength of both general and specific components in a disjunctive function, p(general) + p(specific) * (1 - p(general)), such that (for example) a student could classify a novel word on the sole basis of the general KC. The general form of this model is shown in Equation 7.

$$p(KC_1 \text{ or } KC_2) = p(KC_1) + p(KC_2)\,(1 - p(KC_1)) \qquad \text{(Equation 7)}$$

Optimizing Learning

The following procedures describe how one can use the model to compute optimal practice schedules. Usually, we assume that what is being optimized is gain in some long-term measure of learning for a KC or multiple KCs. Although using long-term probability correct as a dependent measure works when we focus on optimizing some global aggregate task (like the optimal total number of practices for an item), we need a different utility function for more dynamic local scheduling (such as picking an item to practice next), in order to formalize preferences for the learning gains from different LE schedules.

Utility Optimizations

We propose to use Equation 8 as the utility function for a LE (where b controls the weight of the LE, t is the desired retention interval of the LE, and decay (d) is a function of the activation (m) at the time of practice). Most importantly, Equation 8 does not have the all-or-none property of probability correct (because probability correct is a sigmoid function, it usually approaches 0 or 1). If we tried to use long-term probability correct as our measure of local utility, it would value practice most heavily when it comes near the transition from mostly incorrect performance to mostly correct performance across a sequence of test LEs (those LEs that fall on the intermediate part of the curve). This bias distorts the fact that we are ultimately more concerned with the minimum number of practice trials required to reach a certain long-term retention, not with scheduling each practice trial so that it increases percent correct maximally. These goals are actually quite different, since long-term percent correct gain from the next practice depends on nearness to long-term floor or ceiling performance, while utility gain is not affected by these bounds. Thus, our utility function maximizes the overall goal by valuing LEs independently of the order in which they occurred, considering only their unique contributions (a function of strength of encoding, recency, and the decay rate) to the long-term KC strength.

$$u = b\,t^{-d_m} \qquad \text{(Equation 8)}$$

We will use Equation 8 as a cardinal utility function: e.g., a 0.2 increase in strength is half as good as a 0.4 increase in strength. One reason why this assumption is reasonable is that LEs contribute to KC strength in small increments and these increments are interchangeable, as illustrated in Equation 3. Using a cardinal utility function allows us to directly compare different possible spacings and KC presentation orders, to determine when learning is maximal, given learning history. Further, we assume that this utility equation satisfies the von Neumann and Morgenstern game theoretic axioms of completeness, transitivity, continuity and independence required for comparing expected utility lotteries (Von Neumann & Morgenstern, 1944).

Practice Spacing Optimization (PSO). For each KC and each student, it is useful to decide when it would be optimal to repeat a drill LE of that KC. Therefore, we are trying to schedule the LEs under conditions of allocative efficiency.
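The contrast between the Equation 8 utility and probability correct can be made concrete with a small sketch. It assumes the decay takes the Equation 5 form with illustrative constants c and a, and uses assumed values of τ and s for the logistic; none of these numbers come from the paper.

```python
import math

def decay(m, c, a):
    """Equation 5 form: decay grows with activation m at the time of practice."""
    return c * math.exp(m) + a

def le_utility(b, t, m, c, a):
    """Equation 8: u = b * t**(-d_m) for desired retention interval t."""
    return b * t ** (-decay(m, c, a))

def p_correct(m, tau, s):
    """Equation 1, used here only to show the sigmoid's floor and ceiling."""
    return 1.0 / (1.0 + math.exp((tau - m) / s))

C, A, TAU, S = 0.2, 0.18, -0.7, 0.25   # assumed values

# Utility scales linearly with the LE weight b ...
u1 = le_utility(1.0, 1000.0, -1.0, C, A)
u2 = le_utility(2.0, 1000.0, -1.0, C, A)

# ... whereas the same 0.2 gain in strength changes probability correct
# very differently depending on where it falls on the sigmoid.
gains = {m: p_correct(m + 0.2, TAU, S) - p_correct(m, TAU, S)
         for m in (-2.0, -0.7, 1.0)}
print(u1, u2, gains)
```

The probability gain is largest near the threshold and nearly vanishes at floor and ceiling, which is the bias the text describes; the utility measure has no such bounds.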
In economics, allocative efficiency is a condition where costs (time spent learning) are allocated in a way that maximizes gains (increases in utility). Taking this parallel to learning theory, we search for the retention interval (for each KC) at which the expected rate of learning utility gain is maximal given a new LE. This is expressed in Equation 9, which calculates the maximum utility gain for a KC as a function of m (activation of that KC) and t (the target retention interval needed to compute g in Equation 8). All the other values are fixed parameters (b_s = success LE weight from Equation 8 if the test LE is successful, b_f = failure LE weight from Equation 8 for the study LE given as review, -d computed from the current m (needed with t, b_f and b_s to compute u values), p_m and q_m estimated for the test LE from Equations 1 and 2, and failure costs estimated from prior data). Because t and m are the only values that vary in finding the optimum spacing, we can solve for the optimal level of the one given the other. For example, if we know the desired retention interval, we can solve for the max of Equation 9 to find the optimal level of activation at which practice should occur. In practice, Equation 9 tends to suggest (for a drill procedure) that when failure costs for errors and error feedback are high, or success gains from correct responding are much greater than failure gains from feedback study, long-term gains in utility per second of practice will be highest when repetitions are scheduled so that test LE performance is maintained at a high probability. However, because the decay parameter can be large for a LE after a short spacing, some spacing is always preferred.

$$\frac{p_m\,u_{b_s} + (1 - p_m)\,u_{b_f}}{p_m\,q_m + (1 - p_m)\,\text{failurecost}} \qquad \text{(Equation 9)}$$

[Figure 1. Organizing diagram of the mathematical relationships in this paper.]

Learning Event Type Optimization. The above discussion assumes a single task (drill) which can be selected for each item. However, we can also propose other types of LEs and then compare them with the drill trial. For example, we could decide whether it was better to give a study LE alone or to give a drill LE (a test LE followed by a study LE when the test fails). To do this, Equation 10 shows how we can compare the learning rates for each trial type to determine the optimal next trial type for the student. This principle can be extended to compare any two tasks (e.g., tutored problem solving vs. untutored problem solving). This is typically used in combination with dynamic PSO calculations (when the PSOs in Equation 10 are computed as a function of the current time) to pick the optimal time for the optimal task.

$$\max\left(PSO_{Task1},\ PSO_{Task2}\right) \qquad \text{(Equation 10)}$$

Part- to Whole-Task Transfer Optimization. For this optimization, the question is whether to practice only single KC components of a whole skill (a conjunctive skill containing at least 2 KCs), only the whole skill, or some mixture of the two types of practice. Imagine, for example, practicing simple algebra, and consider that a component of the whole task may be knowing the times tables (the low level component). In this case, the question is how much practice should be allocated to times tables practice before doing algebra practice (the high level component).
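The rate in Equation 9 and the comparison in Equation 10 can be sketched numerically. This sketch assumes the decay inside the utility follows the Equation 5 form, searches a coarse grid of activation values instead of solving in closed form, and uses illustrative parameter values throughout; the helper names are hypothetical.

```python
import math

def p_correct(m, tau, s):
    return 1.0 / (1.0 + math.exp((tau - m) / s))          # Equation 1

def latency(m, F, fixed_time_cost):
    return F * math.exp(-m) + fixed_time_cost             # Equation 2

def utility(b, t, m, c, a):
    return b * t ** (-(c * math.exp(m) + a))              # Equation 8

def pso_rate(m, t, b_s, b_f, failure_cost, tau, s, F, fixed_time_cost, c, a):
    """Equation 9: expected utility gain per unit of expected time."""
    p = p_correct(m, tau, s)
    gain = p * utility(b_s, t, m, c, a) + (1.0 - p) * utility(b_f, t, m, c, a)
    cost = p * latency(m, F, fixed_time_cost) + (1.0 - p) * failure_cost
    return gain / cost

def best_activation(t, params, grid):
    """Grid search for the activation at which practice is most efficient."""
    return max(grid, key=lambda m: pso_rate(m, t, **params))

def pick_task(rates):
    """Equation 10: choose the LE type with the highest rate."""
    return max(rates, key=rates.get)

# Assumed parameter values, for illustration only.
PARAMS = dict(b_s=1.0, b_f=0.5, failure_cost=8.0, tau=-0.7, s=0.25,
              F=1.5, fixed_time_cost=0.6, c=0.2, a=0.18)
GRID = [i / 100.0 for i in range(-300, 101)]
m_star = best_activation(t=86400.0, params=PARAMS, grid=GRID)
print(m_star, p_correct(m_star, PARAMS["tau"], PARAMS["s"]))
print(pick_task({"drill": 0.02, "study": 0.01}))
```

Once m_star is known, the scheduler would wait until the KC's activation has decayed to that level before presenting it again; a finer grid or a numerical optimizer could replace the coarse search.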
We might expect that either spending no time on times tables or no time on algebra would likely result in poorer algebra performance than some mixture of these extremes, and that an optimal mixture would allow for the best possible algebra performance. Part to whole transfer optimization allows us to determine this optimal mixture. To compute this optimal mixture, we model the effect of the low level component LEs on the high level component learning rate. To do this, we must create an equation expressing whole task learning as a function of part task learning. Equation 11 (where subscripts w and p refer to whole and part task respectively) captures the notion that we are looking to maximize whole task time (T_w) * learning rate from an optimally spaced LE, which equals the total learning (this method assumes that all practice occurs at the PSO optimal point). Here we specify that PSO for a conjunctive task is a function of the strength of the whole (dependent) KC and the probability and latency estimates for the part task. By doing this, we have created a new version of the PSO, PSO_w, that depends on the strength of both the part and whole task KCs. At the same time, we are only concerned with the learning of the whole task, so in practice, the t (retention interval) and g (utility gain) terms are not changed from the original PSO. This provides a mechanism whereby the higher probability and lower latency for a practiced part task increases the expected strength of the PSO_w.

$$\max\left(T_w \cdot PSO_w\left[p_p\,p_w,\ q_p + q_w\right]\right) \qquad \text{(Equation 11)}$$

Having this mechanism, we can compute the time needed to train the part task to maximize its effects on whole task learning. In this case, it can be noted that totaltime - T_w is spent on the part task, with a learning rate of PSO_p; these values, therefore, control p_p (probability correct) and q_p (latency).
This allows us to construct Equation 11, which represents total learning as a function of time spent on the whole task, multiplied by the learning rate for the whole task (which, because of the conjunctive response functions in the PSO_w, is itself a function of time spent on the part task multiplied by the part task learning rate). Equation 11 can then be solved for T_w where T_w ≥ 0 and T_w ≤ totaltime.

Practice Length Optimization

Practice length optimization (PLO) determines the optimal duration of a given LE. PLO relies on the fact that KC study for each LE has diminishing marginal returns as a function of time, as shown in various studies (Metcalfe & Kornell, 2003; Pavlik Jr., in press). Equation 12 shows how this optimal study duration is found when the total LE weight score (from Equation 4) divided by the time spent studying is maximized. (Equation 12 assumes some minimum study duration greater than 0 to account for fixed costs.)

$$\frac{g\left(1 - e^{-v \cdot \text{studyduration}}\right)}{\text{studyduration} + \text{fixedcost}} \qquad \text{(Equation 12)}$$

Practice Quantity Optimization

Practice quantity optimization uses probability correct for long-term practice as a utility measure, then determines how many optimally spaced repetitions it takes to reach the point where probability gain per LE is maximal (the practice quantity optimization point is the p_m value when Equation 13 is maximized) for each item being learned of a set of items.

$$\frac{p_m}{n} \qquad \text{(Equation 13)}$$

Figure 2 graphs Equation 13 for the parameter set in Pavlik Jr. (2005, Experiment 4) where it was found that […] practices would have been optimal for each KC, as the maximum value of the probability correct/practices curve occurs at […] repetitions. It is useful to note that the utility function should reflect the nature of our preferences for target knowledge. For example, if the need for one KC is higher than others, then getting it correct has a higher utility.
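The practice length optimization in Equation 12 has no need for anything beyond a one-dimensional search. The sketch below scans a grid of candidate durations; the values of g, v and the fixed cost are illustrative assumptions, and the function names are hypothetical.

```python
import math

def study_rate(duration, g, v, fixed_cost):
    """Equation 12: encoding gained per unit of total time spent."""
    return g * (1.0 - math.exp(-v * duration)) / (duration + fixed_cost)

def best_duration(durations, g, v, fixed_cost):
    """Return the candidate duration with the highest learning rate."""
    return max(durations, key=lambda d: study_rate(d, g, v, fixed_cost))

# Candidate durations from 0.1 to 20.0 in steps of 0.1 (assumed units: seconds).
durations = [i / 10.0 for i in range(1, 201)]
best = best_duration(durations, g=1.0, v=0.5, fixed_cost=2.0)
print(best, study_rate(best, 1.0, 0.5, 2.0))
```

Very short study events are dominated by the fixed cost and very long ones by diminishing returns, so the maximum falls at an intermediate duration.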

Figure 2. Practice quantity optimization. (Plot of Probability Correct and Probability Correct/Practices against the number of practices.)

To implement this in the model, for instance, we can weight the utility function by the expected frequency of the item we are interested in. This captures the notion that it is twice as important to know a word when that word is used twice as frequently. Having weighted the utility functions, we could then determine a cutoff word frequency below which we will not be concerned with learning the word (this fixes the total amount of time we will need to spend learning the corpus in question).

$$\frac{p_m \cdot expectedfrequency}{m} \qquad \text{(Equation 14)}$$

Because the weights represent our preferences, other ways of weighting the relative values of different distributions of practice amongst items might further improve the usefulness of such procedures in implementation. For example, items could also be weighted based on the consequences of slow or incorrect performance with the item.

Conclusion

This paper presented a general microeconomic method of using a computational model of cognition to compute the efficiency of various decisions that occur during practice. This work is relevant to education because it shows a new approach to understanding how to improve education by considering learning by the student as the measure of profit. In this new approach, the learning of sets of skills can be optimized to maximize output given input. While we tied this method to an ACT-R cognitive model, there seems to be no reason why this method could not be used to optimize learning using another computational model. The elegance of the method explained here is that it is theory neutral (given a particular model) and so results in predictions that must be true given the limits of the particular model and the accuracy of the utility function used to capture preferences. In practice, however, the potential of this method can be limited in domains where the complexity of the KC or LEs prevents the clear specification of a utility function to optimize.
Acknowledgments

This research was supported in part by a grant from Ronald Zdrojkowski for educational research; the Pittsburgh Science of Learning Center, which is funded by the National Science Foundation award number SBE-0354420; and a Graduate Training Grant awarded to Carnegie Mellon University by the Dept. of Education (#R305B040063).

References

Anderson, J. R., Fincham, J. M., & Douglass, S. (1997). The role of examples and rules in the acquisition of a cognitive skill. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23(4), 932-945.

Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ, US: Lawrence Erlbaum Associates Publishers.

Anderson, J. R., & Schooler, L. J. (1991). Reflections of the environment in memory. Psychological Science, 2(6), 396-408.

Cen, H., Koedinger, K. R., & Junker, B. (2006). Learning factors analysis - A general method for cognitive model evaluation and improvement. In T.-W. Chan (Ed.), Lecture Notes in Computer Science: Intelligent Tutoring Systems (Vol. 4053, pp. 164-175): Springer.

Logan, G. D. (1995). The Weibull distribution, the power law, and the instance theory of automaticity. Psychological Review, 102(4), 751-756.

McClelland, J. L., McNaughton, B. L., & O'Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419-437.

Metcalfe, J., & Kornell, N. (2003). The dynamics of learning and allocation of study time to a region of proximal learning. Journal of Experimental Psychology: General, 132(4), 530-542.

Pavlik Jr., P. I. (2005). The microeconomics of learning: Optimizing paired-associate memory. Dissertation Abstracts International: Section B: The Sciences and Engineering, 66(10-B), 5704.

Pavlik Jr., P. I. (in press). Understanding and applying the dynamics of test practice and study practice. Instructional Science.

Pavlik Jr., P. I., & Anderson, J. R. (2005). Practice and forgetting effects on vocabulary memory: An activation-based model of the spacing effect. Cognitive Science, 29(4), 559-586.

Pavlik Jr., P. I., Presson, N., Dozzi, G., Wu, S.-m., MacWhinney, B., & Koedinger, K. R. (2007). The FaCT (Fact and Concept Training) System: A new tool linking cognitive science with educators. In D. S. McNamara & J. G. Trafton (Eds.). Mahwah, NJ: Lawrence Erlbaum.

Schneider, V. I., Healy, A. F., & Bourne, L. E., Jr. (2002). What is learned under difficult conditions is hard to forget: Contextual interference effects in foreign vocabulary acquisition, retention, and transfer. Journal of Memory and Language, 46(2), 419-440.

Von Neumann, J., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton: Princeton University Press.