RESEARCH DESIGN AND METHODOLOGY SECTION. Generalizability of Oral Reading Fluency Measures: Application of G Theory to Curriculum-Based Measurement

School Psychology Quarterly, Vol. 15, No. 1, 2000

John M. Hintze, University of Massachusetts at Amherst
Steven V. Owen, University of Connecticut
Edward S. Shapiro, Lehigh University
Edward J. Daly III, Western Michigan University

Address correspondence to John M. Hintze, Ph.D., University of Massachusetts at Amherst, School of Education, School Psychology Program, Amherst, MA 01003; hintze@educ.umass.edu

The purpose of this study was to demonstrate the use of Generalizability (G) theory as an alternative method of validating direct behavioral measures. Reliability and validity from a classic test score theory perspective are explored and rephrased in terms of G theory. Two studies that used oral reading fluency measures within a curriculum-based measurement (CBM) approach are examined with G theory. Results indicate that CBM oral reading fluency measures are highly dependable and can be reliably used to make both between-individual (nomothetic) and within-individual (idiographic) decisions.

Formulated by Cronbach, Gleser, Nanda, and Rajaratnam (1972), Generalizability (G) theory is a statistical technique developed specifically to assess the dependability of behavioral measurements. As an alternative to classical test score theory, G theory allows researchers to examine multiple sources of error simultaneously (e.g., across different occasions, materials, or examiners) (Cronbach et al., 1972; Shavelson & Webb, 1991; Shavelson, Webb, & Rowley, 1989; Suen, 1990). In the process, G theory provides a summary coefficient reflecting the level of dependability (a generalizability coefficient analogous to classical test score theory's reliability coefficient) and partitions the variance that can be attributed to various sources of error (e.g., different raters, occasions, test forms, etc.). Unlike classic test score theory, which attributes everything not explained by true score variance to error, G theory allows researchers to estimate proportions of variance attributable to environmental arrangements and contexts (Burns, 1998).

G theory also provides information about the precision of decisions made with various measurement techniques. Thus, the reliability of decisions made about the relative standing of individuals (e.g., "John scored higher than 95% of his peers.") and decisions about an individual's absolute performance across a variety of contexts (e.g., "John's score would be expected to be similar from one week to the next, across different forms of the test, in different environments, etc.") can be ascertained. G theory not only provides estimates of overall reliability ("true score") and error ("residual"), but also allows the researcher to explore the dependability of decisions made with measures from both intra- and interindividual perspectives. These added features make G theory particularly relevant to behavioral assessment measures, which often use repeated measurement over time, in a variety of contexts, with multiple raters.

THEORY AND APPLICATION OF GENERALIZABILITY (G) THEORY

G theory is different from classic test score theory in a number of important respects. Within G theory, reliability and error variance are considered within the context of the testing situation (Suen, 1990). Conceptualizing reliability in this manner is in direct contrast to classic test score theory, which describes measurement error as random without any specific context (Allen & Yen, 1979). G theory attempts to specify portions of error that can be accounted for by various situational variables under which the measurements were taken; it identifies and explains portions of variance that in classic test score theory are simply attributed to random error.
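
As a minimal sketch of this partitioning, assume the simplest possible crossed design (persons by occasions, one score per cell) and the expected-mean-squares approach from a two-way ANOVA. The Python function below splits the observed score variation into a person component, an occasion component, and a residual. The function name, the example scores, and the clipping of negative estimates to zero are illustrative assumptions; none of them comes from the studies reported here.

```python
def g_study_p_by_o(scores):
    """Estimate variance components for a crossed persons x occasions design.

    scores: list of rows, one row per person, one column per occasion.
    Returns (var_person, var_occasion, var_residual), estimated from the
    expected mean squares of a two-way ANOVA with one score per cell.
    """
    n_p = len(scores)
    n_o = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_o)
    person_means = [sum(row) / n_o for row in scores]
    occasion_means = [sum(row[j] for row in scores) / n_p for j in range(n_o)]

    # Sums of squares for persons, occasions, and the residual
    # (person x occasion interaction confounded with error).
    ss_p = n_o * sum((m - grand) ** 2 for m in person_means)
    ss_o = n_p * sum((m - grand) ** 2 for m in occasion_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_o

    ms_p = ss_p / (n_p - 1)
    ms_o = ss_o / (n_o - 1)
    ms_res = ss_res / ((n_p - 1) * (n_o - 1))

    # Solve the expected-mean-square equations; clip negative estimates to 0
    # (a common convention, assumed here).
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_o, 0.0)
    var_o = max((ms_o - ms_res) / n_p, 0.0)
    return var_p, var_o, var_res


# Hypothetical words-correct-per-minute scores: 4 students x 3 sessions.
SCORES = [
    [42, 45, 47],
    [60, 58, 65],
    [81, 84, 83],
    [98, 103, 105],
]

var_p, var_o, var_res = g_study_p_by_o(SCORES)
total = var_p + var_o + var_res
for label, value in [("person", var_p), ("occasion", var_o), ("residual", var_res)]:
    print(f"{label:9s} {value:10.2f} {100 * value / total:6.1f}% of total")
```

In classic test score theory the occasion and residual terms would be lumped together as undifferentiated error; here each receives its own estimate and its own share of the total.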
A direct consideration of measurement error is an assumption that is particularly relevant to behavioral assessment measures, which often assume a domain sampling approach in which alternate forms of similar measures are randomly drawn from a total universe of possible items and administered repeatedly over time by different raters within a variety of contexts (Suen, 1990). As an example of such an approach, curriculum-based measurement (CBM) progress monitoring in reading uses a series of alternate form reading passages sampled from a larger universe of all potential reading samples (typically those that students are expected to read by the end of a given school year) administered repeatedly over time (typically once or twice per week) by a teacher, instructional support team member, etc. (Fuchs & Deno, 1991). Within this context, G theory can provide a coefficient that is interpreted similarly to the reliability coefficient of classical test score theory (Allen & Yen, 1979). In this manner, the G coefficient would provide an indication of the dependability of the CBM measures, with higher coefficients suggesting stronger and more robust measurement properties.

A second advantage of G theory is its ability to assess multiple sources of measurement error. To assess these multiple sources of error, G theory uses a repeated measures ANOVA design that allows researchers to estimate a variance component (σ²) for each source of variation in observed scores. In the CBM progress monitoring example, variance components for persons (here, true individual differences among students), occasions (repeated measures over time), raters (evaluators), and residual (unexplained error) may be calculated. By partitioning variability in this manner, G theory enables the researcher to pinpoint the major sources of measurement error and estimate the relative magnitude of each source. What is considered unexplained error in classic test score theory may be partitioned into distinct components in G theory. By identifying such sources of measurement variability, testing personnel are provided with information that can be used to improve assessment procedures and the characteristics of tests themselves.

Perhaps most importantly, G theory provides a mechanism by which researchers can examine the nature and fidelity of decisions made with the scores under study. These analyses, termed Decision (D) studies, provide the researcher with information about the usefulness of the measure in producing data for making interindividual and intraindividual decisions. Interindividual decisions are ones that address "how much better" one individual performed compared to another (Shavelson & Webb, 1991). These studies are similar to the between-individual comparisons that are made within behavioral assessment. Intraindividual decisions focus on "how well" an individual can perform, regardless of the performance of his or her peers (Shavelson & Webb, 1991). Again, such analyses explore the type of idiographic decisions that are made with behavioral assessment data. For interindividual decisions, only those variance components that influence the relative standing of the individual within the group are used for analysis.
These components include variables that may interact with a person's score (e.g., the effect of different test forms or evaluators on a person's score, or the effect of the time of testing on a person's score) and are considered in comparison with what is known about how individuals typically respond to the measurement device. The results of this analysis yield a coefficient that signals the credibility of the decisions that involve comparing individuals. For intraindividual decisions, all facets of variance are considered in comparison with what is known about individual variation associated with the measurement device. The results of such an analysis yield a coefficient that suggests the strength of within-individual decisions that can be made (i.e., across times, test forms, etc.) regardless of an individual's ranking within a group. In the case of CBM in reading, D studies can provide valuable information regarding the appropriateness of using oral reading fluency measures to make both between-individual and within-individual decisions.

The purpose of this article is twofold. The first objective is to provide a workable example of G theory in practice that illustrates its usefulness with direct behavioral measurements. Data from two previously published studies in the area of CBM will be used for illustrative purposes. The second objective is to evaluate the technical merits of the CBM oral reading fluency metric from a G theory perspective. Of particular importance are assessing the dependability of CBM measures and examining the variability that can be accounted for across time, curricula, and difficulty levels. Based on extant literature, we hypothesized that the CBM oral reading fluency measure would prove highly dependable and would not be influenced unduly by sources of error arising from variations in curricula or the time series nature of the measurement system. In addition, in the current study we were interested in assessing the practicality of using oral reading fluency measures to make both inter- and intraindividual decisions.

METHOD

Study 1

Participants and Procedures

The primary purpose of Study 1 was to ascertain whether CBM procedures, typically constructed for use in a traditional skills-based basal reading series, would be sensitive to progress in a literature-based basal reading series over time. Participants were 160 general education students from 31 second- through fifth-grade classrooms, located in two different school districts. All students received primary reading instruction in their general education classroom. Half of the students (20 from each grade) were instructed primarily in a literature-based reading series, whereas the remaining 80 participants (20 from each grade) were instructed primarily in a traditional skills-based reading program. Progress monitoring sessions were conducted twice a week over an 8-week period. Missed probe sessions were not made up. At each session, students were provided with two reading passages, one from each of the reading series (i.e., literature- and traditional skills-based) corresponding to the long-term goal level material of the student's respective grade. Order of presentation of reading passages was counterbalanced across sessions for each student. The number of words read correctly per minute on each reading passage served as the oral reading fluency outcome datum for each individual probe session.

Data Analysis

G study.
The process through which the magnitudes of error associated with each facet in a measurement design are estimated is referred to as the G study (Suen, 1990). To estimate the magnitude of the various sources of measurement error, the data in the current study were analyzed using a repeated measures ANOVA (BMDP 8V; Dixon, 1992). The purpose of this ANOVA was to calculate variance components for the object of measurement (represented by persons), different facets of measurement (represented by different grades, alternate forms, and repeated measures taken over time), and the interactions among persons and facets. Data from missing probe sessions were imputed through regression. The resulting variance components indicated the expected degree of score variation for a single level of each facet. For example, they indicate the proportion of error that can be explained by the average subject, a single form of CBM, and a single occasion of measurement. Table 1 provides the variance component estimates for the current study.

1. Portions of these data have been previously published by Hintze and Shapiro (1997). The current work represents new and previously unpublished analyses of the data.

TABLE 1. Estimates of Variance Components for Study 1

Facet                                               Symbol
Person                                              σ²_p
Grade                                               σ²_g
Method                                              σ²_m
Occasion                                            σ²_o
Person x Method                                     σ²_{p,m}
Person x Occasion                                   σ²_{p,o}
Method x Occasion                                   σ²_{m,o}
Grade x Method                                      σ²_{g,m}
Grade x Occasion                                    σ²_{g,o}
Grade x Method x Occasion                           σ²_{g,m,o}
Person x Grade x Method x Occasion, Residual        σ²_{p,g,m,o,e}
Total

[The table's numeric columns (n, Estimated Variance Component, Percentage of Total Variance) are not recoverable from this transcription.]

Note. "Person" refers to participants in the study, "grade" refers to the four different grade levels, "method" refers to the two different reading series used for progress monitoring, "occasion" refers to the 16 repeated progress monitoring sessions conducted over the 8-week period, n = the number of contributors to the variance component.

The greatest amount of observed variation in oral reading fluency scores is explained by individual variation among the participants (σ²_p) and developmental changes across grades (σ²_g) (approximately 48% and 19%, respectively). The variance attributed to the type of material used for progress monitoring (σ²_m), repeated measures over time (σ²_o), and the interactions of persons by method (σ²_{p,m}), persons by occasion (σ²_{p,o}), method by occasion (σ²_{m,o}), grade by method (σ²_{g,m}), grade by occasion (σ²_{g,o}), and grade by method by occasion (σ²_{g,m,o}) was considerably smaller (combined total of approximately 12%).
Lastly, the variance attributed to the residual term (σ²_{p,g,m,o,e}) was low, about 21% of the observed variation in oral reading scores. What this analysis indicates is that of the total amount of variance present, the bulk of variation in oral reading fluency is accounted for by individual variation among the participants and expected developmental changes across grades, and very little is attributable to the CBM progress monitoring methods or unexplained sources of variability. In the current framework, this suggests that much of the observed variation is a result of individual differences among the students themselves and developmental changes in oral reading fluency, and comparatively little is a result of the assessment method itself; these are standard expectations of any worthwhile assessment method.

2. In nested designs not all sources of variability can be estimated because of confounding. As such, not all possible interactions are noted.

D study. Recall that in addition to partitioning variance associated with various facets of measurement, G theory allows the researcher to calculate a generalizability coefficient analogous to classical test score theory's reliability coefficient. There are two types of G coefficients that can be calculated, depending on how the researcher intends to use the scores. In the case of intraindividual or within-individual decision making, the researcher needs to calculate G_absolute. The first step in this procedure is rather straightforward and is expressed as

\[
\sigma^2_{abs} = \frac{\sigma^2_{g}}{n_g} + \frac{\sigma^2_{m}}{n_m} + \frac{\sigma^2_{o}}{n_o} + \frac{\sigma^2_{p,m}}{n_m} + \frac{\sigma^2_{p,o}}{n_o} + \frac{\sigma^2_{m,o}}{n_m n_o} + \frac{\sigma^2_{g,m}}{n_g n_m} + \frac{\sigma^2_{g,o}}{n_g n_o} + \frac{\sigma^2_{g,m,o}}{n_g n_m n_o} + \frac{\sigma^2_{p,g,m,o,e}}{n_g n_m n_o n_e} \tag{1}
\]

where σ²_abs is the total amount of measurement error for absolute decisions; σ² is a variance component from the ANOVA source table (see Table 1); and n is the number of contributors to the variance component (e.g., σ²_o has 16 contributors, so n_o = 16). In the current study, this is represented as

[Equation 2: numeric values not recoverable from this transcription]

Next, an adaptation to classic test score theory, which expresses reliability as the ratio of true score variance to true score variance plus error, is made to reflect what proportion of the total variance is attributable to true score variance. Using the language of generalizability theory this may be expressed as

\[
G_{abs} = \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2_{abs}} \tag{3}
\]

In the current study, this translates to

[Equation 4: numeric values not recoverable from this transcription]

which produces a generalizability coefficient of .90.
This result suggests that researchers and practitioners can expect a high level of dependability of measurement for making intraindividual decisions when implementing CBM in Grades 2 through 5 over the course of 8 weeks (16 progress monitoring sessions) in two different sets of monitoring materials. Those who routinely use CBM, however, will be quick to note that, as specified, CBM is not typically conducted in two sets of monitoring materials. Furthermore, individual decisions are usually made across a maximum of two grade levels, not four as was the case in the example study. In this case, the G coefficient of .90 may be an artifact of the specific research design and may not translate directly to practice.

Fortunately, one of the added features of D studies is that parameters within the study can be isolated and examined analogously to using the Spearman-Brown formula in classic test score theory. For example, in the current study the effects of using only one source of monitoring materials (which is typical of CBM progress monitoring) in two grade levels may be predicted. The only change that is required is adjusting the n values that appear in the first equation to reflect one grade level and one set of progress monitoring materials over eight progress monitoring sessions. Inserting these values into Equation 1 produces

[Equation 5: numeric values not recoverable from this transcription]

or σ²_abs = [value not recoverable]. Inserting this new value of σ²_abs into Equation 3 yields

[Equation 6: numeric values not recoverable from this transcription]

or a G coefficient of .82. The results of this analysis suggest that researchers and practitioners can expect adequate levels of dependability with CBM progress monitoring as it is typically conducted, with data that can be used to make within-individual decisions over the course of an 8-week period.

Overall, the results of both absolute D studies make it clear that the dependability of the CBM measurement system for use in making individual decisions is quite strong. Researchers and practitioners can feel secure in the dependability of the CBM oral reading metric as it is currently used in monitoring individual progress over time, across a variety of curricula and grades. As the second step in the D study, the researcher may also want to explore the dependability of relative or interindividual decision making.
Here, as in the case of intraindividual decisions, the researcher needs to calculate G_relative. This formula is simpler because it only concerns variance that has to do with the rank ordering of persons. As such,

\[
\sigma^2_{rel} = \frac{\sigma^2_{p,m}}{n_m} + \frac{\sigma^2_{p,o}}{n_o} + \frac{\sigma^2_{p,g,m,o,e}}{n_g n_m n_o n_e} \tag{7}
\]

In the current study this is represented as

[Equation 8: numeric values not recoverable from this transcription]

which reduces to σ²_rel = [value not recoverable]. The formula for G_relative changes slightly so that

\[
G_{rel} = \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2_{rel}} \tag{9}
\]

In the current study, this once again translates to

[Equation 10: numeric values not recoverable from this transcription]

or more simply, a generalizability coefficient of .99. This result suggests that researchers and practitioners can expect an exceedingly high level of dependability of measurement for making interindividual decisions when implementing CBM over the course of 8 weeks (16 progress monitoring sessions) in two different sets of monitoring materials in Grades 2 through 5.

As in the case of the absolute D study, it makes sense to examine the relative decision-making power when only one set of materials is used for progress monitoring over a shorter period. As in the absolute D study, the only change that is required is adjusting the n values as they appear in Equation 7 to reflect this change in design. Making these changes produces

[Equation 11: numeric values not recoverable from this transcription]

or σ²_rel = [value not recoverable]. Inserting this new value of σ²_rel into Equation 9 yields

[Equation 12: numeric values not recoverable from this transcription]

or more simply, a generalizability coefficient of .98. Thus, researchers and practitioners can expect high levels of dependability with CBM progress monitoring as typically conducted, with data that can be used to make interindividual decisions in as little as 4 weeks. Using only three reading passages (as is typically done in survey level assessment and developing local norms), the generalizability coefficient is .95. Researchers and practitioners can place a high level of trust in the dependability of the CBM oral reading fluency metric as it is currently used in developing local norms, identifying and certifying problems between individuals, and estimating performance discrepancies of students within local curricula. Such reliability coefficients are as good as, if not better than, those of most published norm-referenced materials used for similar purposes.
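
The absolute and relative D-study calculations described above can be sketched in Python as follows. This is a hedged illustration, not the authors' computation: the variance components are hypothetical placeholders (the study's estimates are not reproduced here), the function names are invented for the example, and the sketch assumes the common convention that each error component is divided by the product of the D-study sample sizes of the facets it involves, persons excluded.

```python
from math import prod

# Hypothetical variance components (NOT the study's estimates), keyed by the
# facets each component involves: p = person, g = grade, m = method, o = occasion.
COMPONENTS = {
    ("p",): 480.0,
    ("g",): 190.0,
    ("m",): 10.0,
    ("o",): 5.0,
    ("p", "m"): 30.0,
    ("p", "o"): 25.0,
    ("m", "o"): 5.0,
    ("g", "m"): 10.0,
    ("g", "o"): 15.0,
    ("g", "m", "o"): 20.0,
    ("p", "g", "m", "o"): 210.0,  # highest-order interaction plus residual
}


def error_variance(components, n, relative=False):
    """Sum the error components, each divided by the product of the sample
    sizes of its facets (persons excluded). With relative=True, keep only
    components that interact with persons, since only those can change the
    rank ordering of persons."""
    total = 0.0
    for facets, variance in components.items():
        if facets == ("p",):
            continue  # person variance is the "true score" part, not error
        if relative and "p" not in facets:
            continue
        divisor = prod(n[f] for f in facets if f != "p")
        total += variance / divisor
    return total


def g_coefficient(components, n, relative=False):
    """Person variance divided by person variance plus error variance."""
    var_p = components[("p",)]
    return var_p / (var_p + error_variance(components, n, relative))


# Full design versus a reduced design closer to typical practice.
full = {"g": 4, "m": 2, "o": 16}
reduced = {"g": 1, "m": 1, "o": 8}
for label, n in [("full", full), ("reduced", reduced)]:
    g_abs = g_coefficient(COMPONENTS, n)
    g_rel = g_coefficient(COMPONENTS, n, relative=True)
    print(f"{label:8s} G_abs = {g_abs:.2f}  G_rel = {g_rel:.2f}")
```

Changing only the `n` dictionary re-runs the D study for a different design, which is the same move the text makes when it adjusts the n values to reflect fewer grades, materials, or sessions.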
Study 2

Participants and Procedures

The purpose of the second study was to compare the growth rates obtained using instructional and challenging level long-term goal level material with CBM progress monitoring. Participants included 80 students from 12 first- through fourth-grade classrooms located in one elementary school. Of the sample, 88% of the students received their reading instruction within the general education classroom with no supplementary assistance, and 12% received either remedial or special educational services in reading outside the classroom, in addition to their instruction in the general education classroom. Progress monitoring sessions were conducted twice a week over a 10-week period. Missed probe sessions were not made up. At each session, students were provided with two reading passages, one from the instructional level and one from the challenging level of the reading basal used in the school at each grade. Order of presentation of reading passages was counterbalanced across sessions for each student. The number of words read correctly per minute on each reading passage served as the oral reading fluency outcome datum for each individual probe session.

TABLE 2. Estimates of Variance Components for Study 2

Facet                                               Symbol
Person                                              σ²_p
Grade                                               σ²_g
Method                                              σ²_m
Occasion                                            σ²_o
Person x Method                                     σ²_{p,m}
Person x Occasion                                   σ²_{p,o}
Method x Occasion                                   σ²_{m,o}
Grade x Method                                      σ²_{g,m}
Grade x Occasion                                    σ²_{g,o}
Grade x Method x Occasion                           σ²_{g,m,o}
Person x Grade x Method x Occasion, Residual        σ²_{p,g,m,o,e}
Total

[The table's numeric columns (n, Estimated Variance Component, Percentage of Total Variance) are not recoverable from this transcription.]

Note. "Person" refers to participants in the study; "grade" refers to the four different grade levels; "method" refers to the two different reading series used for progress monitoring; "occasion" refers to the repeated progress monitoring sessions conducted over the 10-week period; n = the number of contributors to the variance component.

G Study. To estimate the magnitude of the various sources of measurement error, the data were analyzed through a repeated measures ANOVA (BMDP 8V; Dixon, 1992).
Because missed progress monitoring sessions were not made up, missing data were imputed with regression substitution. Table 2 presents the variance component estimates. Individual variation among the participants (σ²_p) and developmental differences across grade (σ²_g) contributed the greatest amount of observed variation in oral reading fluency scores (approximately 42% and 36%, respectively). The variance attributed to the type of material used for progress monitoring (σ²_m), repeated measures over time (σ²_o), and the interactions of persons by method (σ²_{p,m}), persons by occasion (σ²_{p,o}), method by occasion (σ²_{m,o}), grade by method (σ²_{g,m}), grade by occasion (σ²_{g,o}), and grade by method by occasion (σ²_{g,m,o}) was considerably smaller (combined total of approximately 13%). Furthermore, the variance attributed to the residual term (σ²_{p,g,m,o,e}) was low, explaining only about 9% of the observed variation in oral reading scores. Results of the G study suggest that of the total amount of variance present, roughly three-quarters of the variation in oral reading fluency was accounted for by variation among individuals and developmental differences across grades, and very little by the CBM progress monitoring procedures themselves or by unexplained error. These results concur with those from Study 1 and attest to the construct validity of oral reading fluency as it is used in CBM.

3. Portions of these data may also be found in Hintze, Daly, and Shapiro (1998). The current work represents new and previously unpublished analyses of the data.

D Study. To explore the dependability of the CBM progress-monitoring procedures for decision making, both absolute and relative decision studies were conducted. First, to estimate G_absolute the proper terms from Table 2 are inserted into Equation 1 as

[Equation 13: numeric values not recoverable from this transcription]

which reduces to σ²_abs = [value not recoverable]. Second, to estimate G_absolute this information is introduced into Equation 3 so that

[Equation 14: numeric values not recoverable from this transcription]

which gives a generalizability coefficient of .80. This result suggests that researchers and practitioners can expect adequate dependability of measurement for making intraindividual decisions when implementing CBM over the course of 10 weeks across both instructional and long-term goal level material.
Further analysis using a case scenario in which only one set of progress monitoring materials is used over two grade levels can be explored in much the same manner as in Study 1. Adjustments to Equation 13 to reflect such a change in σ²_abs would appear as

[Equation 15: numeric values not recoverable from this transcription]

or σ²_abs = [value not recoverable]. As such,

[Equation 16: numeric values not recoverable from this transcription]

which gives a generalizability coefficient of .67. The results of this analysis suggest that researchers and practitioners can expect lower levels of dependability with CBM progress monitoring data when the materials are either too easy or too difficult. It would appear that carefully assessing the difficulty level of CBM progress monitoring reading passages and a student's response to such passages would be important when the data are used for making within-individual decisions.

To investigate the dependability of between- or interindividual decision making, the proper terms from Table 2 are first inserted into Equation 7 so that

[Equation 17: numeric values not recoverable from this transcription]

or σ²_rel = [value not recoverable]. Inserting σ²_rel into Equation 9 indicates that

[Equation 18: numeric values not recoverable from this transcription]

which gives a generalizability coefficient of .98. Further investigation of relative decisions using only one set of progress monitoring materials over 5- or 3-week periods, respectively, indicates that

[Equations 19 and 20: numeric values not recoverable from this transcription]

which produce generalizability coefficients of .96 and .95, respectively. Once again, using only three reading passages from one reading series in one grade (as is done in survey level assessment and creating local norms) yields a generalizability coefficient of .88. These results concur with those from Study 1, which suggest good dependability in the CBM oral reading fluency metric as it is currently used in developing local norms, identifying and certifying problems between individuals, and determining performance discrepancies of students within local curricula.

DISCUSSION

The purposes of this study were to provide workable examples of G theory with direct behavioral measures and to explore the dependability and sensitivity of decisions that can be made with oral reading fluency measures such as those used in CBM.
With a repeated measures ANOVA and a few straightforward calculations, researchers can begin to study the degree to which a given set of measurements of an individual generalize to a broader and more extensive set of measurements that help answer important questions regarding the technical adequacy of a measurement system for making decisions.

As illustrated, G theory provides a flexible and straightforward framework for examining the dependability of behavioral measurements. A principal assumption of the theory posits that a measurement taken on a person is only a random sample of that person's behavior. More importantly, the usefulness of any measurement depends on the degree to which any one measurement sample can be generalized accurately to the behavior of the same person across a wider set of contexts. From this perspective, G theory fits well into most forms of clinical assessment, which typically make assumptions and draw inferences from isolated measurement of behavior to overall global functioning and patterns across a variety of situations. This notion is in contrast to the concept of reliability from a classic test score approach. Instead of asking how reliably a set of scores represents a particular construct of interest, G theory asks how well a set of observed scores can be used to represent a person's behavior in a general manner. From a clinical perspective, the answer to this question is the sine qua non of good assessment. Not only is it important that we measure behavior and constructs reliably, but the scores from these measures must also be representative of a person's functioning across time and settings.

In addition, G theory extends classic test score theory in a number of important ways. As illustrated by the current work and elucidated by Shavelson and Webb (1991), G theory allows the researcher to estimate statistically the magnitude of each source of measurement error separately in one analysis and provides a mechanism for optimizing the reliability of measurement.
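
One way to picture this optimization is a small search over D-study designs. The sketch below assumes the simplest case (persons crossed with occasions, relative decisions only) and asks how many measurement occasions are needed to reach a target G coefficient. The function names, variance values, and target are hypothetical and chosen only for illustration.

```python
def g_relative(var_person, var_residual, n_occasions):
    """Relative G coefficient for a persons x occasions design in which the
    person-by-occasion residual is averaged over n_occasions measurements."""
    return var_person / (var_person + var_residual / n_occasions)


def min_occasions(var_person, var_residual, target, max_n=100):
    """Smallest number of occasions whose G coefficient reaches the target,
    or None if the target is not reached within max_n occasions."""
    for n in range(1, max_n + 1):
        if g_relative(var_person, var_residual, n) >= target:
            return n
    return None


# Hypothetical variance components: person = 6.0, residual = 4.0.
for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} occasions -> G = {g_relative(6.0, 4.0, n):.3f}")
print("occasions needed for G >= .85:", min_occasions(6.0, 4.0, 0.85))
```

In this one-facet case the calculation is algebraically the same as the Spearman-Brown projection: with single-occasion reliability r, the coefficient for n occasions is n·r / (1 + (n − 1)·r).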
By providing a G coefficient that is analogous to a classic test score reliability coefficient, G theory provides feedback to researchers and test developers about sources of error variance and the magnitude of each source of error affecting measurement. Perhaps its most distinctive feature is the ability to distinguish between inter- and intraindividual decisions. Unlike with classic test score theory, researchers are able to evaluate empirically the dependability of any measure from the perspective of how much better one individual performed than another, or conversely, how well an individual can perform regardless of his or her peers' performance. This characteristic is particularly salient to school psychologists, who historically have used both summative and formative assessment measures, and to test developers, who increasingly are being asked to develop time- and cost-efficient measurement procedures for use in schools.

Applications of G Theory to CBM

Results of the current study indicate that CBM oral reading fluency measures are extremely dependable for a variety of decision-making purposes and continue to be a highly reliable means for indexing students' oral reading proficiency. More specifically, from an interindividual decision-making perspective, the current findings indicate that the dependability of CBM oral reading fluency measures for making between-individual decisions is quite strong. For CBM assessments as typically used in practice (i.e., survey level assessments, problem identification and certification), G coefficients of approximately .90 were observed in both cases. The magnitude of such coefficients suggests that practitioners can feel safe in using CBM data for screening and educational decisions that may include classification or eligibility determination (Salvia & Ysseldyke, 1995). The current findings are also consistent with research showing CBM oral reading fluency measures to be highly related to reported differences in oral reading performance across students of different grades and classifications (Deno, Marston, Shinn, & Tindal, 1983; Fuchs, Fuchs, Hamlett, Walz, & Germann, 1993; Hintze & Shapiro, 1997; Hintze, Daly, & Shapiro, 1998; Hintze, Shapiro, Conte, & Basile, 1997; Shinn & Marston, 1985; Shinn, Tindal, & Stein, 1988) and to teachers' judgement of student reading proficiency in both general and special education (Fuchs & Deno, 1981; Fuchs, Fuchs, & Deno, 1982; Marston & Deno, 1982).

In addition, the current findings continue to support the intraindividual decision-making abilities of CBM oral reading fluency measures. As such, the use of CBM for developing and monitoring Individualized Education Plan (IEP) goals and objectives (Fuchs, 1993; Fuchs et al., 1993; Fuchs & Deno, 1991; Fuchs, Fuchs, & Deno, 1985), and the monitoring of individual progress over time (Fuchs, 1986, 1989, 1993; Fuchs & Fuchs, 1986; Fuchs, Fuchs, & Hamlett, 1989a, 1989b, 1989c) appear to be psychometrically defensible. Interestingly, the current study lends support to the notion that the difficulty level of the material chosen for progress monitoring can have a substantial effect on resultant CBM outcomes (Hintze et al., 1998). Moreover, results suggest that practitioners may obtain reliable estimates of performance based on the most recent 8 to 10 data points.
Although other work has suggested that a minimum of 20 points is required for accurate prediction to some future point in time (Good & Shinn, 1990; Shinn, Good, & Stein, 1989), the current findings suggest that fewer data points can serve as reliable indicators of generalized performance over time. The good news for practitioners is that fewer data points require less time and effort without sacrificing precision. Clearly, such efficient and reliable measures fit the recent amendments to the Individuals with Disabilities Education Act (IDEA), which require that evaluations be linked to IEP and programming objectives through the use of classroom-based data (Turnbull & Turnbull, 1998).

Limitations and Considerations for Future Research

Although the current article has attempted to provide a working example of G theory as it pertains to direct measures of reading, the results must be interpreted within context. First, because the main purpose of the work was to illustrate a set of methodological techniques, the experimental design and data were ex post facto in nature. As noted, the two data sets were part of two previous research endeavors. Readers should be cautious about overgeneralizing the results because of the retrospective nature of the analyses. Indeed, future work should include a careful consideration of design elements and data analytic plans concurrently to strengthen external validity.

Second, researchers interested in using G theory should consider carefully the use of power analysis before designing experiments. Although not important from a hypothesis testing perspective (i.e., establishing alpha level), without an adequate sample size, variability may be either under- or overestimated, which in turn would affect the variance component estimates. For example, with small N the variance component estimates may be underestimated (because of reduced variability) and with large N the variance component estimates may be overestimated (because of increased variability). A well-done power analysis in this case should provide an estimated sample size that leads to neither over- nor underestimation of the variance components (Cohen, 1988).

In addition to these methodological considerations, future research may also focus on other features that interact with the assessment process. Among the many universes to which a score may belong, Cone (1977) has identified six "generalities" that are particularly relevant to behavioral assessors and behavioral assessment measures: (a) scorer, (b) item, (c) time, (d) method, (e) setting, and (f) dimension. These universes of generalizability represent measurement conditions under which a given behavior for a given individual may be measured. Scorer generality refers to the extent to which data obtained by one observer or scorer are comparable to the observations of all observers who have been used. Item generality reflects the extent to which a given response or set of responses is representative of a larger universe of similar responses. In behavioral assessment, item generality would be most closely linked with broad- and narrow-band informant report measures used as verbal analogues to actual behavior.
The relevance of time generality concerns the extent to which data collected on one occasion are representative of those that might have been collected at other times. Although behavioral assessors have long subscribed to the controlling effects of situational specificity, such temporal generalizability is of specific concern when measures of treatment outcome are used to make high-stakes decisions (e.g., using treatment outcome data to make categorical classifications). In other words, evaluators can investigate whether changes in behavior over time are reliable or the result of unidentified sources of variance. Method generality refers to the comparability of data produced from two or more ways of measuring the same construct. For example, evaluators would be interested in knowing the degree to which direct observations of inattention agree with informant reports of the same construct. The extent to which the two measurement methods agree is evidence of method generality. Setting generality asks whether data obtained in one situation are representative of those obtainable in others. For example, does a measure of inattention during independent seat work apply to the same measures taken during other academic periods of the day, such as small group activities? Such information is especially important for behavior change agents who are interested in the external validity of a particular intervention across a variety of situations and contexts. For example, "Will the results of a token economy in a special education classroom generalize if it is implemented in a general education classroom?" Having such knowledge a priori may influence the intervention decisions of a behavior change agent, depending on the ultimate goals for generalization. Finally, dimension generality refers to the comparability of data on two or more different behaviors. For example, "To what extent are measures of students' academic engaged time associated with academic achievement?"

CONCLUSIONS

As with other forms of assessment, the validity of behavioral assessment measures has frequently been called into question. At the forefront of such questions has been an apparent difficulty of behavioral assessment measures to meet the principles and assumptions of classic test score theory. Because of a basic belief that direct behavioral measures were situation-specific samples of behavior, some have argued that behavior as a basic unit of datum should not be expected to evidence properties such as test-retest reliability, concurrent validity across situations, or convergent validity across methods (Nelson, 1983). However, the principles and assumptions underlying G theory are well suited for the validation of behavioral assessment measures. The current paper has argued that differences between classic test score theory and G theory are more conceptual than methodological or statistical. Nonetheless, one important advantage of using G theory with behavioral assessment measures is the acknowledgment that behavior is greatly influenced by the environmental context in which it occurs. Sensitivity to situational specificity, in addition to the ability to partition the variance attributed to contextual arrangements, makes G theory a conceptually strong and compatible methodology for validating behavioral measures and exploring the sensitivity of the decisions made with such measures at the same time.

REFERENCES

Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Monterey, CA: Brooks/Cole.
Burns, K. J. (1998). Beyond classical reliability: Using generalizability theory to assess dependability. Research in Nursing & Health, 21,
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cone, J. D. (1977). The relevance of reliability and validity for behavioral assessment. Behavior Therapy, 5,
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley.
Deno, S. L., Marston, D., Shinn, M. R., & Tindal, G. (1983). Oral reading fluency: A simple datum for scaling reading disability. Topics in Learning and Learning Disabilities, 2,
Dixon, W. J. (Ed.). (1992). BMDP statistical software manual. Los Angeles: University of California Press.

Fuchs, L. S. (1986). Monitoring progress among mildly handicapped pupils: Review of current practice and research. Remedial and Special Education, 7,
Fuchs, L. S. (1989). Evaluating solutions, monitoring progress and revising intervention plans. In M. R. Shinn (Ed.), Curriculum-based measurement: Assessing special children (pp ). New York: Guilford.
Fuchs, L. S. (1993). Enhancing instructional programming and student achievement with curriculum-based measurement. In J. J. Kramer & J. C. Conoley (Eds.), Curriculum-based measurement (pp ). Lincoln, NE: University of Nebraska-Lincoln, Buros Institute of Mental Measurements.
Fuchs, L. S., & Deno, S. L. (1981). The relationship between curriculum-based mastery measures and standardized achievement tests in reading (Report No. 57). Minneapolis, MN: University of Minnesota Institute for Research on Learning Disabilities (ERIC Document Reproduction Service No. ED ).
Fuchs, L. S., & Deno, S. L. (1991). Paradigmatic distinctions between instructionally relevant measurement models. Exceptional Children, 57,
Fuchs, L. S., & Fuchs, D. (1986). Effects of systematic formative evaluation: A meta-analysis. Exceptional Children, 53,
Fuchs, L. S., Fuchs, D., & Deno, S. L. (1982). Reliability and validity of curriculum-based informal reading inventories. Reading Research Quarterly, 18,
Fuchs, L. S., Fuchs, D., & Deno, S. L. (1985). Importance of goal ambitiousness and goal mastery to student achievement. Exceptional Children, 52,
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989a). Computers and curriculum-based measurement: Effects of teacher feedback systems. School Psychology Review, 18,
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989b). Effects of alternative goal structures within curriculum-based measurement. Exceptional Children, 55,
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989c). Effects of instrumental use of curriculum-based measurement to enhance instructional programs. Remedial and Special Education, 10,
Fuchs, L. S., Fuchs, D., Hamlett, C. L., Walz, L., & Germann, G. (1993). Formative evaluation of academic progress: How much growth can we expect? School Psychology Review, 22,
Good, R. H., & Shinn, M. R. (1990). Forecasting accuracy of slope estimates for reading curriculum-based measurement: Empirical evidence. Behavioral Assessment, 12,
Hintze, J. M., Daly, E. J., & Shapiro, E. S. (1998). An investigation of the effects of passage difficulty level on oral reading fluency for progress monitoring. School Psychology Review, 27,
Hintze, J. M., & Shapiro, E. S. (1997). Curriculum-based measurement and literature-based reading: Is curriculum-based measurement meeting the needs of changing reading curricula? Journal of School Psychology, 35,
Hintze, J. M., Shapiro, E. S., Conte, K. L., & Basile, I. M. (1997). Oral reading fluency and authentic reading material: Criterion validity of the technical features of CBM survey-level assessment. School Psychology Review, 26,
Marston, D., & Deno, S. L. (1982). Implementation of direct and repeated measurement in the school setting (Report No. 106). Minneapolis, MN: University of Minnesota Institute for Research on Learning Disabilities.
Nelson, R. O. (1983). Behavioral assessment: Past, present, and future. Behavioral Assessment, 5,
Salvia, J., & Ysseldyke, J. E. (1995). Assessment (6th ed.). Boston: Houghton Mifflin.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park, CA: Sage.
Shavelson, R. J., Webb, N. M., & Rowley, G. L. (1989). Generalizability theory. American Psychologist, 44,
Shinn, M. R., Good, R. H., & Stein, S. (1989). Summarizing trend in student achievement: A comparison of methods. School Psychology Review, 18,
Shinn, M. R., & Marston, D. (1985). Differentiating mildly handicapped, low-achieving and regular education students: A curriculum-based approach. Remedial and Special Education, 6,
Shinn, M. R., Tindal, G., & Stein, S. (1988). Curriculum-based assessment and identification of mildly handicapped students: A research review. Professional School Psychology, 3,
Suen, H. K. (1990). Principles of test theories. Hillsdale, NJ: Erlbaum.
Turnbull, H. R., & Turnbull, A. P. (1998). Free appropriate public education: The law and children with disabilities (5th ed.). Denver, CO: Love.

Action Editor: Timothy Z. Keith
Acceptance Date: July 26, 1999

OVERVIEW OF CURRICULUM-BASED MEASUREMENT AS A GENERAL OUTCOME MEASURE

OVERVIEW OF CURRICULUM-BASED MEASUREMENT AS A GENERAL OUTCOME MEASURE OVERVIEW OF CURRICULUM-BASED MEASUREMENT AS A GENERAL OUTCOME MEASURE Mark R. Shinn, Ph.D. Michelle M. Shinn, Ph.D. Formative Evaluation to Inform Teaching Summative Assessment: Culmination measure. Mastery

More information

QUESTIONS ABOUT ACCESSING THE HANDOUTS AND THE POWERPOINT

QUESTIONS ABOUT ACCESSING THE HANDOUTS AND THE POWERPOINT Answers to Questions Posed During Pearson aimsweb Webinar: Special Education Leads: Quality IEPs and Progress Monitoring Using Curriculum-Based Measurement (CBM) Mark R. Shinn, Ph.D. QUESTIONS ABOUT ACCESSING

More information

Using CBM for Progress Monitoring in Reading. Lynn S. Fuchs and Douglas Fuchs

Using CBM for Progress Monitoring in Reading. Lynn S. Fuchs and Douglas Fuchs Using CBM for Progress Monitoring in Reading Lynn S. Fuchs and Douglas Fuchs Introduction to Curriculum-Based Measurement (CBM) What is Progress Monitoring? Progress monitoring focuses on individualized

More information

Evidence for Reliability, Validity and Learning Effectiveness

Evidence for Reliability, Validity and Learning Effectiveness PEARSON EDUCATION Evidence for Reliability, Validity and Learning Effectiveness Introduction Pearson Knowledge Technologies has conducted a large number and wide variety of reliability and validity studies

More information

Progress Monitoring & Response to Intervention in an Outcome Driven Model

Progress Monitoring & Response to Intervention in an Outcome Driven Model Progress Monitoring & Response to Intervention in an Outcome Driven Model Oregon RTI Summit Eugene, Oregon November 17, 2006 Ruth Kaminski Dynamic Measurement Group rkamin@dibels.org Roland H. Good III

More information

NCEO Technical Report 27

NCEO Technical Report 27 Home About Publications Special Topics Presentations State Policies Accommodations Bibliography Teleconferences Tools Related Sites Interpreting Trends in the Performance of Special Education Students

More information

Alpha provides an overall measure of the internal reliability of the test. The Coefficient Alphas for the STEP are:

Alpha provides an overall measure of the internal reliability of the test. The Coefficient Alphas for the STEP are: Every individual is unique. From the way we look to how we behave, speak, and act, we all do it differently. We also have our own unique methods of learning. Once those methods are identified, it can make

More information

PROGRESS MONITORING FOR STUDENTS WITH DISABILITIES Participant Materials

PROGRESS MONITORING FOR STUDENTS WITH DISABILITIES Participant Materials Instructional Accommodations and Curricular Modifications Bringing Learning Within the Reach of Every Student PROGRESS MONITORING FOR STUDENTS WITH DISABILITIES Participant Materials 2007, Stetson Online

More information

BENCHMARK TREND COMPARISON REPORT:

BENCHMARK TREND COMPARISON REPORT: National Survey of Student Engagement (NSSE) BENCHMARK TREND COMPARISON REPORT: CARNEGIE PEER INSTITUTIONS, 2003-2011 PREPARED BY: ANGEL A. SANCHEZ, DIRECTOR KELLI PAYNE, ADMINISTRATIVE ANALYST/ SPECIALIST

More information

Further, Robert W. Lissitz, University of Maryland Huynh Huynh, University of South Carolina ADEQUATE YEARLY PROGRESS

Further, Robert W. Lissitz, University of Maryland Huynh Huynh, University of South Carolina ADEQUATE YEARLY PROGRESS A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to Practical Assessment, Research & Evaluation. Permission is granted to distribute

More information

Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge

Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Innov High Educ (2009) 34:93 103 DOI 10.1007/s10755-009-9095-2 Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Phyllis Blumberg Published online: 3 February

More information

How to Judge the Quality of an Objective Classroom Test

How to Judge the Quality of an Objective Classroom Test How to Judge the Quality of an Objective Classroom Test Technical Bulletin #6 Evaluation and Examination Service The University of Iowa (319) 335-0356 HOW TO JUDGE THE QUALITY OF AN OBJECTIVE CLASSROOM

More information

NCSC Alternate Assessments and Instructional Materials Based on Common Core State Standards

NCSC Alternate Assessments and Instructional Materials Based on Common Core State Standards NCSC Alternate Assessments and Instructional Materials Based on Common Core State Standards Ricki Sabia, JD NCSC Parent Training and Technical Assistance Specialist ricki.sabia@uky.edu Background Alternate

More information

Psychometric Research Brief Office of Shared Accountability

Psychometric Research Brief Office of Shared Accountability August 2012 Psychometric Research Brief Office of Shared Accountability Linking Measures of Academic Progress in Mathematics and Maryland School Assessment in Mathematics Huafang Zhao, Ph.D. This brief

More information

On-the-Fly Customization of Automated Essay Scoring

On-the-Fly Customization of Automated Essay Scoring Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,

More information

THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS

THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS ELIZABETH ANNE SOMERS Spring 2011 A thesis submitted in partial

More information

Norms How were TerraNova 3 norms derived? Does the norm sample reflect my diverse school population?

Norms How were TerraNova 3 norms derived? Does the norm sample reflect my diverse school population? Frequently Asked Questions Today s education environment demands proven tools that promote quality decision making and boost your ability to positively impact student achievement. TerraNova, Third Edition

More information

Extending Place Value with Whole Numbers to 1,000,000

Extending Place Value with Whole Numbers to 1,000,000 Grade 4 Mathematics, Quarter 1, Unit 1.1 Extending Place Value with Whole Numbers to 1,000,000 Overview Number of Instructional Days: 10 (1 day = 45 minutes) Content to Be Learned Recognize that a digit

More information

Document number: 2013/ Programs Committee 6/2014 (July) Agenda Item 42.0 Bachelor of Engineering with Honours in Software Engineering

Document number: 2013/ Programs Committee 6/2014 (July) Agenda Item 42.0 Bachelor of Engineering with Honours in Software Engineering Document number: 2013/0006139 Programs Committee 6/2014 (July) Agenda Item 42.0 Bachelor of Engineering with Honours in Software Engineering Program Learning Outcomes Threshold Learning Outcomes for Engineering

More information

Using CBM to Help Canadian Elementary Teachers Write Effective IEP Goals

Using CBM to Help Canadian Elementary Teachers Write Effective IEP Goals Exceptionality Education International Volume 21 Issue 1 Article 6 1-1-2011 Using CBM to Help Canadian Elementary Teachers Write Effective IEP Goals Chris Mattatall Queen's University, cmattatall@mun.ca

More information

CAAP. Content Analysis Report. Sample College. Institution Code: 9011 Institution Type: 4-Year Subgroup: none Test Date: Spring 2011

CAAP. Content Analysis Report. Sample College. Institution Code: 9011 Institution Type: 4-Year Subgroup: none Test Date: Spring 2011 CAAP Content Analysis Report Institution Code: 911 Institution Type: 4-Year Normative Group: 4-year Colleges Introduction This report provides information intended to help postsecondary institutions better

More information

The Oregon Literacy Framework of September 2009 as it Applies to grades K-3

The Oregon Literacy Framework of September 2009 as it Applies to grades K-3 The Oregon Literacy Framework of September 2009 as it Applies to grades K-3 The State Board adopted the Oregon K-12 Literacy Framework (December 2009) as guidance for the State, districts, and schools

More information

STA 225: Introductory Statistics (CT)

STA 225: Introductory Statistics (CT) Marshall University College of Science Mathematics Department STA 225: Introductory Statistics (CT) Course catalog description A critical thinking course in applied statistical reasoning covering basic

More information

CONTINUUM OF SPECIAL EDUCATION SERVICES FOR SCHOOL AGE STUDENTS

CONTINUUM OF SPECIAL EDUCATION SERVICES FOR SCHOOL AGE STUDENTS CONTINUUM OF SPECIAL EDUCATION SERVICES FOR SCHOOL AGE STUDENTS No. 18 (replaces IB 2008-21) April 2012 In 2008, the State Education Department (SED) issued a guidance document to the field regarding the

More information

Social Emotional Learning in High School: How Three Urban High Schools Engage, Educate, and Empower Youth

Social Emotional Learning in High School: How Three Urban High Schools Engage, Educate, and Empower Youth SCOPE ~ Executive Summary Social Emotional Learning in High School: How Three Urban High Schools Engage, Educate, and Empower Youth By MarYam G. Hamedani and Linda Darling-Hammond About This Series Findings

More information

Algebra 1, Quarter 3, Unit 3.1. Line of Best Fit. Overview

Algebra 1, Quarter 3, Unit 3.1. Line of Best Fit. Overview Algebra 1, Quarter 3, Unit 3.1 Line of Best Fit Overview Number of instructional days 6 (1 day assessment) (1 day = 45 minutes) Content to be learned Analyze scatter plots and construct the line of best

More information

Ph.D. in Behavior Analysis Ph.d. i atferdsanalyse

Ph.D. in Behavior Analysis Ph.d. i atferdsanalyse Program Description Ph.D. in Behavior Analysis Ph.d. i atferdsanalyse 180 ECTS credits Approval Approved by the Norwegian Agency for Quality Assurance in Education (NOKUT) on the 23rd April 2010 Approved

More information

TU-E2090 Research Assignment in Operations Management and Services

TU-E2090 Research Assignment in Operations Management and Services Aalto University School of Science Operations and Service Management TU-E2090 Research Assignment in Operations Management and Services Version 2016-08-29 COURSE INSTRUCTOR: OFFICE HOURS: CONTACT: Saara

More information

A Study of Metacognitive Awareness of Non-English Majors in L2 Listening

A Study of Metacognitive Awareness of Non-English Majors in L2 Listening ISSN 1798-4769 Journal of Language Teaching and Research, Vol. 4, No. 3, pp. 504-510, May 2013 Manufactured in Finland. doi:10.4304/jltr.4.3.504-510 A Study of Metacognitive Awareness of Non-English Majors

More information

success. It will place emphasis on:

success. It will place emphasis on: 1 First administered in 1926, the SAT was created to democratize access to higher education for all students. Today the SAT serves as both a measure of students college readiness and as a valid and reliable

More information

The My Class Activities Instrument as Used in Saturday Enrichment Program Evaluation

The My Class Activities Instrument as Used in Saturday Enrichment Program Evaluation Running Head: MY CLASS ACTIVITIES My Class Activities 1 The My Class Activities Instrument as Used in Saturday Enrichment Program Evaluation Nielsen Pereira Purdue University Scott J. Peters University

More information

Learning By Asking: How Children Ask Questions To Achieve Efficient Search

Learning By Asking: How Children Ask Questions To Achieve Efficient Search Learning By Asking: How Children Ask Questions To Achieve Efficient Search Azzurra Ruggeri (a.ruggeri@berkeley.edu) Department of Psychology, University of California, Berkeley, USA Max Planck Institute

More information

Proficiency Illusion

Proficiency Illusion KINGSBURY RESEARCH CENTER Proficiency Illusion Deborah Adkins, MS 1 Partnering to Help All Kids Learn NWEA.org 503.624.1951 121 NW Everett St., Portland, OR 97209 Executive Summary At the heart of the

More information

Assessing Functional Relations: The Utility of the Standard Celeration Chart

Assessing Functional Relations: The Utility of the Standard Celeration Chart Behavioral Development Bulletin 2015 American Psychological Association 2015, Vol. 20, No. 2, 163 167 1942-0722/15/$12.00 http://dx.doi.org/10.1037/h0101308 Assessing Functional Relations: The Utility

More information

A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING

A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING Yong Sun, a * Colin Fidge b and Lin Ma a a CRC for Integrated Engineering Asset Management, School of Engineering Systems, Queensland

More information

Conceptual and Procedural Knowledge of a Mathematics Problem: Their Measurement and Their Causal Interrelations

Conceptual and Procedural Knowledge of a Mathematics Problem: Their Measurement and Their Causal Interrelations Conceptual and Procedural Knowledge of a Mathematics Problem: Their Measurement and Their Causal Interrelations Michael Schneider (mschneider@mpib-berlin.mpg.de) Elsbeth Stern (stern@mpib-berlin.mpg.de)

More information

Early Warning System Implementation Guide

Early Warning System Implementation Guide Linking Research and Resources for Better High Schools betterhighschools.org September 2010 Early Warning System Implementation Guide For use with the National High School Center s Early Warning System

More information

Peer Influence on Academic Achievement: Mean, Variance, and Network Effects under School Choice

Peer Influence on Academic Achievement: Mean, Variance, and Network Effects under School Choice Megan Andrew Cheng Wang Peer Influence on Academic Achievement: Mean, Variance, and Network Effects under School Choice Background Many states and municipalities now allow parents to choose their children

More information

ACADEMIC AFFAIRS GUIDELINES

ACADEMIC AFFAIRS GUIDELINES ACADEMIC AFFAIRS GUIDELINES Section 8: General Education Title: General Education Assessment Guidelines Number (Current Format) Number (Prior Format) Date Last Revised 8.7 XIV 09/2017 Reference: BOR Policy

More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

The Good Judgment Project: A large scale test of different methods of combining expert predictions

The Good Judgment Project: A large scale test of different methods of combining expert predictions The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania

More information

An Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District

An Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District An Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District Report Submitted June 20, 2012, to Willis D. Hawley, Ph.D., Special

More information

Math 96: Intermediate Algebra in Context

Math 96: Intermediate Algebra in Context : Intermediate Algebra in Context Syllabus Spring Quarter 2016 Daily, 9:20 10:30am Instructor: Lauri Lindberg Office Hours@ tutoring: Tutoring Center (CAS-504) 8 9am & 1 2pm daily STEM (Math) Center (RAI-338)

More information

A cautionary note is research still caught up in an implementer approach to the teacher?

A cautionary note is research still caught up in an implementer approach to the teacher? A cautionary note is research still caught up in an implementer approach to the teacher? Jeppe Skott Växjö University, Sweden & the University of Aarhus, Denmark Abstract: In this paper I outline two historically

More information

Lahore University of Management Sciences. FINN 321 Econometrics Fall Semester 2017

Lahore University of Management Sciences. FINN 321 Econometrics Fall Semester 2017 Instructor Syed Zahid Ali Room No. 247 Economics Wing First Floor Office Hours Email szahid@lums.edu.pk Telephone Ext. 8074 Secretary/TA TA Office Hours Course URL (if any) Suraj.lums.edu.pk FINN 321 Econometrics

More information

A Note on Structuring Employability Skills for Accounting Students

A Note on Structuring Employability Skills for Accounting Students A Note on Structuring Employability Skills for Accounting Students Jon Warwick and Anna Howard School of Business, London South Bank University Correspondence Address Jon Warwick, School of Business, London

More information

Sheila M. Smith is Assistant Professor, Department of Business Information Technology, College of Business, Ball State University, Muncie, Indiana.

Sheila M. Smith is Assistant Professor, Department of Business Information Technology, College of Business, Ball State University, Muncie, Indiana. Using the Social Cognitive Model to Explain Vocational Interest in Information Technology Sheila M. Smith This study extended the social cognitive career theory model of vocational interest (Lent, Brown,

More information

Tun your everyday simulation activity into research

Tun your everyday simulation activity into research Tun your everyday simulation activity into research Chaoyan Dong, PhD, Sengkang Health, SingHealth Md Khairulamin Sungkai, UBD Pre-conference workshop presented at the inaugual conference Pan Asia Simulation

More information
