OVERVIEW OF CURRICULUM-BASED MEASUREMENT AS A GENERAL OUTCOME MEASURE

Mark R. Shinn, Ph.D.
Michelle M. Shinn, Ph.D.

Formative Evaluation to Inform Teaching

Summative Assessment: Culmination measure. Mastery assessment. Pass/fail type assessments which summarize the knowledge students learn. Typical summative assessments include:
  End-of-chapter tests
  High-stakes tests (e.g., state assessments)
  GRE, ACT, SAT, GMAT, etc.
  Driver's license test
  Final exams

Formative Evaluation: Process of assessing student achievement during instruction to determine whether an instructional program is effective for individual students. Informs:
  When students are progressing, continue using your instructional programs.
  When tests show that students are not progressing, you can change your instructional programs in meaningful ways.

Big Ideas of Benchmark Assessment

It's about using General Outcome Measures (GOMs) for formative assessment/evaluation to:
  Inform teaching AND ensure accountability.
It's different from, but related to, summative high-stakes testing/evaluation, which:
  Doesn't inform teaching.
  Is mostly used for accountability/motivation.

Today's High Stakes Evaluation World

High Stakes Tests are USUALLY appropriate only for summative evaluation.
NOT useful for decisions teachers need to make every day (formative):
  For whom do I need to individualize instruction or find more intensive instructional programs?
  How do I organize my classrooms for instructional grouping?
  How do I know that my teaching is working for each student so that I can make changes in instruction when necessary?
NOT very useful to administrators who must make decisions about allocating instructional resources, especially in a preventative or responsive model.

High Stakes Evaluation World (continued)

Reliability/validity issues on high-stakes tests:
  Guessing factor (bubble-in, skipping items, etc.)
  Cheating at various levels
  Culture-sensitivity concerns
  Test may not match what is taught
  Fatigue
  Enabling behaviors required by the test may not be present
  Test anxiety
  Political pressures on student performance
Testing often takes place after year(s) of instruction, with long time frames between tests (annual).
Information about success and failure rates is provided too late to make changes.
Primary use of high stakes tests may then be to assign the blame to students, their parents, teachers, or schools.

High Stakes Evaluation World (continued)

High Stakes = High cost in terms of:
  Loss of instructional time.
  Time for test taking.
  Paying for the tests.
In summary, typical High Stakes Testing is:
  Too little!
  Too late!
  At too high a cost!

An Example: Weight

High Standard: All children will have a healthy weight by the end of third grade.
High Stakes Assessment: Based on assessing body density.
  Weighing each student.
  Immersing each student in a large tub filled with water, and measuring the amount of water displaced.
  Divide weight by displacement and get density, a very accurate picture of physical status.
After 8-9 YEARS of growth, we would:
  1. Place students who are unhealthy in remedial programs.
  2. Create new health programs.
  3. Blame the effectiveness of old health programs.
  4. Blame the students (or their families) for over- or under-eating.

Weight (continued)

Formative Evaluation: Same Standard - Different Assessment
High Standard: All children will have a healthy weight by the end of third grade.
Benchmark Assessment: Monitor weight directly, frequently, and continuously.
From birth, measure weight frequently and continuously with a simple, albeit less precise, general outcome measure: weight in pounds, using a scale.

At ANY Point in Development

The child could be weighed and a decision made about healthy weight. This process is:
  Efficient.
  Sufficiently accurate.
  Proactive.
  Cost effective.
We would know their health status before they reached the high stakes point!

AIMSweb in a Picture and a Sentence

AIMSweb is a 3-tier Progress Monitoring System based on direct, frequent, and continuous student assessment, which is reported to students, parents, teachers, and administrators via a web-based data management and reporting system for the purpose of determining response to instruction.

Common Characteristics of GOMs

The same kind of evaluation technology as other professions.
Powerful measures that are:
  Simple
  Accurate
  Efficient indicators of performance that guide and inform a variety of decisions
  A generalizable thermometer that allows for reliable, valid cross-comparisons of data

General Outcome Measures (GOMs) from Other Fields

  Medicine measures height, weight, temperature, and/or blood pressure.
  The Federal Reserve Board measures the Consumer Price Index.
  Wall Street measures the Dow-Jones Industrial Average.
  Companies report earnings per share.
  McDonald's measures how many hamburgers they sell.

CBM is a GOM Used for Scientific Reasons Based on Evidence

  Reliable and valid indicator of student achievement
  Simple, efficient, and of short duration to facilitate frequent administration by teachers
  Provides assessment information that helps teachers plan better instruction
  Sensitive to the improvement of students' achievement over time
  Easily understood by teachers and parents
  Improves achievement when used to monitor progress

Things to Always Remember About CBM

  Designed to serve as indicators of general reading achievement: CBM probes don't measure everything, but measure the important things.
  Standardized tests to be given, scored, and interpreted in a standard way
  Researched with respect to psychometric properties to ensure accurate measures of learning

Items to Remember (continued)

  Are sensitive to improvement in brief intervals of time
  Also tell us how students earned their scores (qualitative information)
  Designed to be as short as possible to ensure their doability
  Are linked to decision making for promoting positive achievement and Problem-Solving

What is CBM?

CBM is a form of Curriculum-Based Assessment (CBA).
Curriculum-Based Measurement (CBM) is the method of monitoring student progress through direct, continuous assessment of basic skills.
CBM is used to assess skills such as reading fluency, comprehension, spelling, mathematics, and written expression. Early literacy skills (phonics and phonological awareness) are similar measures and are downward extensions of CBM.
CBM probes last from 1 to 4 minutes depending on the skill being measured, and student performance is scored for speed and accuracy to determine proficiency. Because CBM probes are quick to administer and simple to score, they can be given frequently to provide continuous progress data. The results are charted and provide for timely evaluation based on hard data.

Origins of CBM as General Outcome Measures

Curriculum-Based Measurement (CBM) was developed more than 20 years ago by Stanley Deno at the University of Minnesota through a federal contract to develop a reliable and valid measurement system for evaluating basic skills growth.
CBM is supported by more than 25 years of school-based research by the US Department of Education.
Starting in the area of reading, researchers have expanded to investigate additional academic areas over the years. With significant additional research, AIMSweb now offers CBM assessments in 7 areas and 2 languages, with new areas currently being tested and developed.
Supporting documentation can be found in 100s of articles, book chapters, and books in the professional literature describing the use of CBM to make a variety of important educational decisions.

Skill Areas Currently Assessable via AIMSweb:

  Early Literacy [K-1 benchmark, Progress Monitor (PM) any age]
    Letter Naming Fluency
    Letter Sound Fluency
    Phonemic Segmentation Fluency
    Nonsense Word Fluency
  Early Numeracy (K-1 benchmark, PM any age)
    Oral Counting
    Number Identification
    Quantity Discrimination
    Missing Number
  Oral Reading (K-8, PM any age)
  MAZE (Reading comprehension) (1-8, PM any age)
  Math Computation (1-6, PM any age)
  Math Facts (PM any age)
  Spelling (1-8, PM any age)
  Written Expression (1-8, PM any age)
  Early Literacy and Oral Reading, Spanish (K-8)

Advantages of CBM

  Direct measure of student performance.
  Correlates strongly with best practices for instruction and assessment, and research-supported methods for assessment and intervention.
  Focus is on repeated measures of performance. (This cannot be done with most norm-referenced and standardized tests due to practice effect or limited forms.)

Advantages of Using CBM

  Quick to administer, simple, easy, and cost-efficient.
  Performance is graphed and analyzed over time.
  Sensitive to even small improvements in performance. This is KEY, as most standardized/norm-referenced tests do NOT show small, incremental gains.
  CBM allows teachers to do what they do better!
  Capable of having many forms, whereas most standardized tests only have a maximum of two forms.
  Monitoring frequently enables staff to see trends in individual and group performance and compare those trends with targets set for their students.

Big Ideas of Benchmark (Tier 1) Assessment

Benchmarking allows us to add systematic Formative Evaluation to current practice.
  For Teachers (and Students)
    Early Identification of At Risk Students
    Instructional Planning
    Progress Monitoring
  For Parents
    Opportunities for Communication/Involvement
    Accountability
  For Administrators
    Resource Allocation/Planning and Support
    Accountability

Benchmark Testing: Useful for Communicating with Parents

  Designed for collaboration and communication with parents.
  Student achievement is enhanced by teacher-parent communication about achievement growth on a continuous basis.
  A Parent Report is produced for each Benchmark Testing.
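The idea of comparing a student's trend with a target, noted under the advantages of using CBM, can be illustrated with a small sketch. This is only an illustration: the slides do not say how a trend is computed, so the least-squares slope used here is an assumption, and the function names (`weekly_slope`, `on_track`), the weekly scores, and the targets are all hypothetical.

```python
def weekly_slope(scores):
    """Least-squares slope of scores taken at weeks 0, 1, 2, ...

    Assumption: one score per week, evenly spaced. The slides do not
    prescribe this method; it is one common way to summarize a trend.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def on_track(scores, target, weeks_total):
    """True if the observed weekly gain is at least the gain needed to
    move from the first score to the target within weeks_total weeks."""
    needed = (target - scores[0]) / weeks_total
    return weekly_slope(scores) >= needed


# Hypothetical weekly words-read-correctly scores for one student.
scores = [40, 42, 45, 44, 48, 50]

print(on_track(scores, target=70, weeks_total=30))   # modest target
print(on_track(scores, target=100, weeks_total=20))  # ambitious target
```

A slope that falls short of the needed weekly gain is the kind of signal the slides describe as a reason to change the instructional program rather than wait for an end-of-year result.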

School Calendar Year (2006-2007): Benchmarking (Tier 1)

2 weeks during:
  September 1 to October 15
  January 1 to February 1
  May 1 to June 1

Benchmark (Tier 1) for Oral Reading (R-CBM)

  Set of 3 probes (passages) at grade level* (~1 through 8).
  Administer the same set, three times per year, to all students.
  Requires 3 minutes per student, 3 times per year.

How the AIMSweb System Works for Benchmark (Tier 1): Oral Reading (R-CBM) as an Example

  Research suggests there is no significant practice effect by repeating the set of three passages for benchmark assessment periods.
  Using the same passage sets for each benchmark increases confidence in the data obtained and reduces extraneous variables.
  Students read aloud for 1 minute from each of the three Edformation Standard Reading Assessment Passages.
  Passages contain meaningful, connected text.
  Number of words read correctly (WRC) and number of errors are counted per passage read.
  Scores are reported as WRC/errors.

Benefits of Using Edformation's Standard Reading Assessment Passages

  Passages are written to represent general curriculum or to be curriculum independent.
  Allow decision making about reading growth, regardless of between-school, between-school-district, and between-teacher differences in reading curriculum.
  Are graded to be of equal difficulty.
  Have numerous alternate forms for testing over time without practice effects.
For additional data on R-CBM passages, review: Standard Reading Assessment Passages for Use in General Outcome Measurement: A manual describing development and technical features. Kathryn B. Howe, Ph.D. & Michelle M. Shinn, Ph.D.

Sample R-CBM Assessment Passage: Student Copy

Standard Reading Assessment Passage Student Copy:
  No numbers
  Between 250-300 words (exception: 1st grade)
  An informative first sentence
  Same font style and size
  Text without pictures

Sample R-CBM Assessment Passage: Examiner Copy

Standard Reading Assessment Passage Examiner Copy:
  Pre-numbered so they can be scored quickly and immediately.

The Data: Get the MEDIAN score for the student's 3 passages

Scores (WRC / errors), 1 minute per passage: 67 / 2, 85 / 8, 74 / 9

  1. Throw out the HIGH and LOW scores for Words Read Correct.
  2. Throw out the HIGH and LOW scores for the Errors. Remaining scores = MEDIAN.
  3. Report this score in your AIMSweb account: 74/8.

Why use Median vs. Average?

  Averages are susceptible to outliers when dealing with small number sets.
  The median score is a statistically more reliable number than the average for R-CBM.

Managing Data after Assessment

Take the median score for each student and report it in the AIMSweb System.
AIMSweb instantly generates multiple reports for analysis and various decision-making purposes. A few of the many reports available appear here as SAMPLES.

For Teachers: Classroom Report

Box & Whiskers Graphs (box plots): A Brief Explanation

AIMSweb commonly uses box plots to report data. This chart will help you familiarize yourself with box plots. Consider the bell curve: box plots are somewhat similar in shape and representation.
Chart labels, top to bottom: outlier, 90th percentile, 75th percentile, Median (50th percentile), 25th percentile, 10th percentile.
Ranges marked on the chart: Above Average Range; Average range of population included in sample; Below Average Range.
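The median-scoring steps for the three sample passages (67 / 2, 85 / 8, 74 / 9) can be sketched in a few lines. This is a minimal illustration, not AIMSweb's own software: the function name `median_score` and the tuple layout are assumptions made for the example. Words read correctly and errors are each reduced to their own median, as the steps describe.

```python
from statistics import median

# (words read correctly, errors) for the three 1-minute passages in the example
passages = [(67, 2), (85, 8), (74, 9)]


def median_score(scores):
    """Return (median WRC, median errors), each taken independently.

    With three passages this is the same as throwing out the high and
    low value in each column and keeping what remains.
    """
    wrc = median(w for w, _ in scores)
    errors = median(e for _, e in scores)
    return wrc, errors


wrc, errors = median_score(passages)
print(f"{wrc}/{errors}")  # prints 74/8

# For comparison: the mean WRC shifts whenever one passage is unusually
# high or low, which is the concern raised about averages on small sets.
mean_wrc = sum(w for w, _ in passages) / len(passages)
print(round(mean_wrc, 1))  # prints 75.3
```

Note that the reported pair 74/8 does not correspond to any single passage: the WRC comes from the third passage and the errors from the second.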

Individual Report: Student

  Report Beginning of Year Status
  Know When Things are Working
  Have Data to Know When Things Need Changing
  Data to Know that Changes Made a Difference
  Data to Know that Things Went Well

Identifying At Risk Students

For Teachers: Classroom Report
At-a-Glance Views of Student Ranking & Growth
Follow student progress over time.
Compare Sub-group Trends
Compare a School to a Composite

Many Reporting Options Available

Finally

Benchmark Testing, using simple, general, RESEARCHED outcome measures, provides an ONGOING database to teachers, administrators, and parents for making decisions about the growth and development of basic skills.
Professionally managed by staff in a process that communicates that WE are in charge of student learning.

The End