The 2017 Reading MCA-III Benchmark Report

The Reading MCA-III Benchmark Report is a tool that educators can use to compare the performance of students in their school on content benchmarks relative to their overall performance on the Reading MCA-III. That is, a school's performance on each benchmark is described in terms of a deviation around the performance expected given its students' scores on the entire test. The first section of this document presents an introduction to the benchmark reports and their interpretation. The second section presents a more detailed discussion of the technical details involved in the calculations displayed in the reports.

The Reading MCA-III Benchmark Reports are organized by four content domains from the Minnesota Academic Standards for Reading (2010). The four content domains are defined by the Key Ideas and Details and Craft and Structure skill domains nested within the Literature and Informational Text sub-strands. A separate graph is produced for each content domain to report school performance on the benchmarks that the content domain comprises. A list of the benchmarks assessed by the Reading MCA-III is included in the document MCA-III Test Specifications: Reading, Grades 3-8, 10, which is available on the Test Specifications page of the MDE website. View the Test Specifications page (MDE > Educator Excellence > Districts, Schools and Educators > Statewide Testing > Test Specifications).

How to Interpret the Reading MCA-III Benchmark Reports

Figure 1 displays the performance for a school on three benchmarks in the Craft and Structure skill domain of the grade 7 Literature sub-strand. Each plotted point represents performance on a benchmark (identified by its four-digit numeric label) relative to the overall performance expectation for the school (represented by the dashed vertical line that crosses the x-axis at the value 0.50).

Figure 1. Sample Benchmark Report for the Grade 7 Literature-Craft & Structure Domain.

The relative performance on benchmarks for students within a school is reported using the Common Language Effect Size (CLES). When benchmark performance for a group of students is equivalent to that expected given their overall MCA scores, the CLES will equal 0.50. CLES values greater than 0.50 indicate that performance on the benchmark by students at the school exceeds that expected from their overall scores. CLES values less than 0.50 have the opposite implication: school performance on the benchmark is lower than expected based on the overall MCA scores of students in the school.

In addition to markers showing the school's relative performance on each benchmark, a synthetic statewide reference line (the solid gray vertical line) is included in the report to add a normative perspective. This statewide reference line reflects the performance expected from students at the state mean for that grade in overall reading ability compared to that expected from students in the school across all items. Thus, the CLES value associated with the state vs. school reference lines reflects global, rather than benchmark- or content domain-specific, differences in performance expectations. When the CLES value for the state reference line is greater than 0.50, it indicates that overall expected state performance is greater than expected school performance; values less than 0.50 have the opposite meaning.

Benchmark Indicators

Individual benchmarks within each content domain are identified by a 4-digit code. Relative performance on each benchmark is indicated by a color- and shape-coded symbol and a dashed line extending from the symbol. The symbol's horizontal position indicates the actual effect size value for the benchmark (on the CLES metric). The dashed line around the symbol represents a corresponding 95% credible interval (i.e., a 0.95 probability range of plausible CLES values given the data). Within each content domain, the benchmarks are arranged from highest performance relative to the school at the top right of the graphic to lowest relative performance at the bottom left.

As described in Table 1, the color and shape of each plotted symbol indicate how the school's students performed on the benchmark relative to expectations based on their overall Reading MCA-III scores. Because the state reference line is based on a comparison of state vs. school expectations across all items, comparing individual benchmark performance with the state reference line is not appropriate.

Table 1. Benchmark Marker Color and Shape Codes

Evaluating Performance Differences between Benchmarks

In making comparisons between pairs of benchmarks, pay close attention to the amount of overlap of the credible bands for those benchmarks. If their credible bands overlap by more than one-half, regardless of the color or position of the markers, performance on those benchmarks may be considered statistically equivalent. In other words, if the bands for two different benchmarks have substantial overlap, there is little credible evidence that actual performance differed between the two benchmarks. If the credible bands for two benchmarks do not overlap, there is clear evidence of a reliable difference in performance between them.
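The overlap rule above can be made concrete with a short sketch. This is a minimal illustration, not part of the report itself: the guide does not define "overlap by more than one-half" precisely, so the code assumes it means the overlapping segment covers more than half of the narrower credible band, and the function name and sample intervals are hypothetical.

def band_overlap_fraction(band_a, band_b):
    """Fraction of the narrower credible band covered by the overlap of two bands."""
    overlap = max(0.0, min(band_a[1], band_b[1]) - max(band_a[0], band_b[0]))
    narrower = min(band_a[1] - band_a[0], band_b[1] - band_b[0])
    return overlap / narrower if narrower > 0 else 0.0

# Example with made-up CLES credible bands for two benchmarks.
frac = band_overlap_fraction((0.47, 0.59), (0.52, 0.66))
if frac > 0.5:
    print("Treat the two benchmarks as statistically equivalent.")
elif frac == 0.0:
    print("Clear evidence of a reliable performance difference.")
else:
    print("Partial overlap: interpret with caution.")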

Benchmark Codes

Benchmark codes are indicated to the left of each marker. A list of the benchmarks assessed by the Reading MCA-III is included in the document MCA-III Test Specifications: Reading, Grades 3-8, 10, available on the Test Specifications page of the MDE website. View the Test Specifications page (MDE > Districts, Schools and Educators > Statewide Testing > Test Specifications).

Cautions in Interpreting the Benchmark Report

As with any data, caution must be exercised in making inferences from the benchmark report. It is important to frame any interpretation within the context of the school's environment. Consideration of external information about the Reading curriculum, instructional practices, and data from other classroom assessments is critical to making appropriate and meaningful inferences from this report. Interpretation of this report should also take the following factors into account:

- The generalizability of inferences about student performance in the content domain comprising the benchmark depends upon the representative sampling of (a) items from the benchmark that students in a school are administered, and (b) students in a school who are administered items from the benchmark.
- For a computer-adaptive test, such as the online Reading MCA-III, there generally will be multiple items administered across students at a school assessing each benchmark. Test blueprint specifications are at the sub-strand and standard level; thus, the number of items administered from each benchmark can vary.
- The length of the credible band around a benchmark report marker reflects, in part, the number of item responses included in calculating the benchmark CLES value; shorter credible bands are associated with larger numbers of student responses to items from the benchmark.

There are several misinterpretations that should be avoided:

- Color/shape and position of markers in the graphs do not reflect benchmark difficulty.
- Color/shape and position of markers in the graphs do not correspond to achievement levels (i.e., Does Not Meet, Partially Meets, Meets, or Exceeds the Standards).
- When comparing Benchmark Report graphs from different schools within a district, be aware that the range of values on the horizontal CLES axis is adjusted to fit each school's data. If a school has a large outlier (i.e., a benchmark with very high or very low relative performance), the graph will have a greater range on the horizontal axis, and its benchmark markers will appear to be clustered more tightly together than those for a school with a smaller range of benchmark CLES values.

The primary purpose of the MDE Benchmark Report is to provide information to help curriculum and instructional staff make inferences about their instructional and curricular activities and their students' level of understanding, based on performance data from the online Reading MCA-III. The purpose of the data in this report is not to designate strengths and weaknesses in the school. Rather, the Benchmark Report is meant to serve as a guidance tool to identify possible gaps in instructional content that school staff may find relevant and important. In particular, it is important to recognize that this report reflects data on a sample of student testing behavior obtained at a single time point in the academic year, and may not fully reflect systematic instructional and curricular outcomes as a whole.
Furthermore, some of the results may depend upon the timing and sequence of when content was presented during the school year. For those reasons, it is critical to appropriately involve knowledgeable instructional staff in the discussion and interpretation of the results, and in deliberations about their implications for curriculum and instructional activities.

Technical Details for the Reading MCA-III Benchmark Reports

Relative Benchmark Performance and Common Language Effect Size

The relative performance on benchmarks for students within a school or district is reported using the Common Language Effect Size (CLES). The CLES is a non-parametric statistic used to summarize group differences. The basic notion is that two groups (say, Group A and Group B) exist, where each group member has a score on an outcome of interest. The CLES is calculated as the probability that a randomly selected member from one group (e.g., Group A) will have a higher score than a randomly selected member of the other group (Group B). When the group score distributions are equivalent, the probability will be 0.50. As scores in Group A become increasingly higher than those in Group B, the probability that the score of a randomly selected member from Group A will be greater than that of a randomly selected member from Group B increases correspondingly, and the CLES becomes increasingly greater than 0.50. Conversely, as scores in Group A become progressively lower than those in Group B, the CLES will move progressively lower than 0.50.

The Reading MCA-III Benchmark Report uses the CLES to compare the performance of two groups on the items administered from a benchmark. The scores comprising the first group are the observed item scores for students in the school on items from the benchmark. The scores comprising the second group are the item scores on benchmark items that would be expected from each student, given their overall score on the Reading MCA-III. These expected scores (or average conditional performance) are calculated based on the 3-parameter logistic (3PL) measurement model that underlies all scaling on the MCA-III. The 3PL model estimates the probability of a correct response on each benchmark item given each student's overall MCA-III score (see Figure 2). Using MCA-III item response data from students in the school, the observed and expected numbers of correct and incorrect responses to items from each benchmark are obtained and used to calculate the CLES.

Figure 2. Sample item response function: Probability of correct response conditional on ability.
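As an illustration of the expected-score step described above, the following sketch computes 3PL response probabilities and totals the observed and expected correct counts that feed the CLES. It is a minimal sketch, assuming the standard 3PL parameterization with discrimination a, difficulty b, lower asymptote c, and the common scaling constant of 1.7; the function name, item parameters, ability values, and responses are hypothetical, not MCA-III operational values.

import math

def p_correct_3pl(theta, a, b, c, D=1.7):
    """3PL probability of a correct response for a student with ability theta (assumed form)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# Each tuple: (student ability estimate, item parameters, observed 0/1 score)
# for the benchmark items each student was administered. Hypothetical data.
responses = [
    (0.4, {"a": 1.1, "b": -0.2, "c": 0.20}, 1),
    (0.4, {"a": 0.9, "b": 0.5, "c": 0.25}, 0),
    (-0.3, {"a": 1.1, "b": -0.2, "c": 0.20}, 1),
    (1.2, {"a": 0.9, "b": 0.5, "c": 0.25}, 1),
]

observed_correct = sum(score for _, _, score in responses)
expected_correct = sum(p_correct_3pl(theta, **item) for theta, item, _ in responses)

print(f"observed correct: {observed_correct} of {len(responses)}; "
      f"expected correct: {expected_correct:.2f} of {len(responses)}")

Observed and expected counts of this kind are the two "groups" that the School-Based CLES described in the next section summarizes.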

The School-Based CLES expresses the probability that an item score (0 or 1) selected at random from the observed set of scores on items representing a benchmark is greater than an item score (0 or 1) drawn at random from the expected set of benchmark item scores based on the 3PL model (i.e., average conditional performance). When the number of correct observed benchmark item responses is equivalent to the number of correct responses expected based on overall MCA-III scores, the result will be a CLES of 0.50.

School or District Reference Line

The gray dashed school expectation reference line located at 0.50 on the CLES scale of each benchmark report graph represents observed benchmark item performance equal to expected benchmark item performance. A benchmark CLES value greater than 0.50 indicates that student performance on items from the benchmark exceeds the expected conditional student performance. CLES values less than 0.50 have the opposite implication: school performance is lower than expected given the ability of the students who were administered the set of items from the benchmark. When the CLES is calculated from dichotomous item scores, the deviation from 0.50 is approximately equal to one-half the difference in proportions correct in the two groups. Thus, a benchmark CLES value of 0.55 can be interpreted to mean that the observed proportion of correct responses to benchmark items is 0.10 greater than the expected proportion correct.

State Reference Line

Although the focus of the Reading Benchmark Report is within-school comparisons of observed and expected benchmark item score distributions, some users may be interested in comparing school and statewide performance on the CLES scale. In an adaptive testing context, comparison of observed state and school item scores is problematic because students are administered items tailored to their ability level, and the difficulty of items taken by students in a school may be very different from what is typical statewide. The heuristic approach adopted in the benchmark report is to calculate the expected count of correct responses if a student whose reading ability was at the state average for the grade had been administered the same items actually taken by students in the school. As before, a CLES index is calculated, this time comparing the expected correct response count (across all items) for the average state student vs. the students in the school. The gray solid vertical line plotted at the obtained CLES value represents statewide performance relative to the school or district. When the solid state reference line is to the right of the dashed school/district line (i.e., greater than 0.50), the expected overall state performance exceeded that of the school. Conversely, when the state reference line is less than 0.50, expected state overall performance is lower than expected performance for the school.
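The relationship between the CLES and the proportion-correct difference noted in the School or District Reference Line subsection above can be checked with a short sketch. This is a minimal illustration, assuming ties (both draws correct, or both incorrect) count as one-half, a common CLES convention that the guide does not state explicitly; the function name and proportions are hypothetical.

def cles_dichotomous(p_observed, p_expected):
    """CLES for 0/1 scores: P(observed > expected) plus half the probability of a tie."""
    p_win = p_observed * (1.0 - p_expected)
    p_tie = p_observed * p_expected + (1.0 - p_observed) * (1.0 - p_expected)
    return p_win + 0.5 * p_tie  # algebraically 0.5 + (p_observed - p_expected) / 2

# Example mirroring the guide: observed and expected proportions correct of 0.70 and 0.60
# differ by 0.10, giving a CLES of 0.55.
print(round(cles_dichotomous(0.70, 0.60), 3))  # 0.55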
95% Credible Interval Bands

The credible interval is the Bayesian analogue of the confidence interval reported in common statistical (i.e., frequentist) practice. The 95% credible interval band reported here can be interpreted as the range of CLES values within which there is a 95% probability that the true CLES value lies, given the observed data.

The 95% credible interval bands are estimated empirically, based on the CLES values resulting from 20,000 paired random draws from beta binomial distributions with parameters (1 + observed number correct, 1 + observed number wrong) for the school observed data and (1 + expected number correct, 1 + expected number wrong) for the school expectation data. The 0.025 and 0.975 quantiles of the resulting CLES sampling distribution serve as the limits of the 95% credible interval band. One consequence of this empirical approach is that when a district has a single school at a grade, the resampled distributions from the school and district analyses can differ very slightly, and the resulting school and district benchmark graphs will not be quite identical.
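The following sketch shows one way the resampling described above could look in practice. It is an illustration only, assuming the paired draws are proportions from Beta(1 + correct, 1 + wrong) distributions and that the CLES is recomputed from each pair using the dichotomous-score relationship; the counts, seed, and variable names are hypothetical, and the operational MDE procedure may differ in detail.

import numpy as np

rng = np.random.default_rng(2017)

obs_correct, obs_wrong = 61, 39        # observed responses to items from one benchmark
exp_correct, exp_wrong = 55.4, 44.6    # 3PL-expected counts for the same responses

n_draws = 20_000
p_obs = rng.beta(1 + obs_correct, 1 + obs_wrong, size=n_draws)
p_exp = rng.beta(1 + exp_correct, 1 + exp_wrong, size=n_draws)

# CLES for dichotomous scores: 0.5 plus half the difference in proportions correct.
cles_draws = 0.5 + (p_obs - p_exp) / 2.0

lower, upper = np.quantile(cles_draws, [0.025, 0.975])
print(f"95% credible band for the benchmark CLES: [{lower:.3f}, {upper:.3f}]")

With larger observed and expected counts the simulated band narrows, consistent with the guide's note that shorter credible bands are associated with larger numbers of student item responses.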