The LASER Model: A Systemic and Sustainable Approach for Achieving High Standards in Science Education

SSEC i3 Validation Final Report of Confirmatory and Exploratory Analyses

Todd Zoblotsky, Ed.D.
Christine Bertz, Ph.D.
Brenda Gallagher, Ed.D.
Marty Alberg, Ph.D., Principal Investigator

The University of Memphis
8/31/2016
Driven by Doing.

Acknowledgments

The success of this evaluation would not have been possible without the herculean efforts built on strong partnerships among the Center for Research in Educational Policy (CREP), the Smithsonian Science Education Center (SSEC), Abt Associates, Bernalillo Public Schools, Chama Public Schools, Cleveland County Schools, Greene County Schools, Houston Independent School District, Jemez Valley Public Schools, Johnston County Schools, Los Alamos Public Schools, McDowell County Schools, Moore County Schools, Mora Public Schools, Pecos Independent School District, Rio Rancho Public Schools, Santa Fe Public Schools, Warren County Schools, and Wilson County Schools. We extend our heartfelt thanks and appreciation to all who contributed to this amazing endeavor, and sought and still seek to improve the state of science education in America.

CREP Project Staff: Marty Alberg, Carolyn Kaldon, Dan Strahl, Michael Rowe, John Burgette, Todd Zoblotsky, Brenda Gallagher, Yun Tang, Lou Franceschini, Haixia Qian, Bryan Winter, Ying Huang, Adrian Young, Cindy Muzzi, Dallas Burkhardt, Margie Stevens, Ruby Booth.

Staff roles: Principal Investigator, Co-Principal Investigator, Co-Principal Investigator, Project Manager, Qualitative Analysis, Statistics, Statistics, Statistics, Statistics, Statistics, Statistics, Statistics, Liaison, Liaison, Site Researcher, Liaison, SMS Administration, SMS Administration.

Table of Contents

Acknowledgments
Introduction
Methodology
Instrumentation
Population and Context
Sample
Findings
All Regions: Results for Spring 2014 PASS Multiple Choice
All Regions Spring 2014 PASS Multiple Choice Key Findings for Phase 1
Fall 2011 to Spring 2014 PASS Results: All Regions
PASS Multiple Choice: All Regions
Elementary and Middle School Cohort PASS Multiple Choice Analyses: All Regions
Elementary Cohort PASS Multiple Choice Spring 2014 Results: All Regions
Middle School Cohort PASS MC Spring 2014 Results: All Regions
All Regions: Results for Spring 2014 PASS Open-Ended and Performance Tasks
All Regions Spring 2014 PASS Open-Ended and Performance Task Key Findings for Phase 1
Spring 2014 PASS Open-Ended and Performance Task Results: All Regions
PASS Open-Ended and Performance Task Scoring
Elementary and Middle School Cohorts PASS Open-Ended Analyses: All Regions
Elementary Cohort Spring 2014 PASS Open-Ended Results: All Regions
Middle School Cohort Spring 2014 PASS Open-Ended Results: All Regions
Elementary and Middle School Cohorts PASS Performance Task Analyses: All Regions
Elementary Cohort Spring 2014 PASS Performance Task Results: All Regions
Middle School Cohort Spring 2014 PASS Performance Task Results: All Regions
References
Appendix
Table A-1: Clustering Correction for Mismatched Analyses
Table A-2: Benjamini-Hochberg Correction for Multiple Comparisons
Table A-3: Elementary Cohort Cluster-Level Attrition
Table A-4: Middle School Cohort Cluster-Level Attrition
Table A-5: WWC House Overall and Differential Attrition Rates
Figure A-1: WWC Confirmation Letter
Figure A-2: WWC Email Chain Regarding Cluster-Level Inference
Figure A-3: WWC Email Regarding Multiple Comparisons

Introduction

In August 2010, the Smithsonian Science Education Center (SSEC), a division of the Smithsonian Institution formerly known as the National Science Resources Center (NSRC), received a grant of more than $25 million from the U.S. Department of Education's Investing in Innovation (i3) program for a five-year study to validate its Leadership and Assistance for Science Education Reform (LASER) model in three regions of the United States: rural North Carolina, northern New Mexico, and the Houston Independent School District (HISD). Matching funds to support the study in the amount of more than $5 million were obtained from partners in the three regions as required by the Department of Education.

The independent third-party research evaluation of the LASER model was conducted by the Center for Research in Educational Policy (CREP) with technical assistance from Westat and Abt Associates, who were provided to i3 grantees' evaluation partners by the US Department of Education (USDOE). CREP, a Tennessee Center of Excellence, is a research and evaluation unit based at the College of Education at the University of Memphis.

Interim evaluation results were reported to SSEC by CREP in annual formal technical reports as well as more informally through presentations and written materials. In July 2015, a comprehensive report (The LASER Model: A Systemic and Sustainable Approach for Achieving High Standards in Science Education Summative Report) was submitted to SSEC containing overall findings, conclusions, and recommendations in summary form based on analysis of the final year of available quantitative and qualitative data for students in two cohorts of schools: an elementary cohort and a middle school cohort. Supporting materials comprising the complete final report included an overview of implementation findings related to the five pillars of the LASER model and a report of findings from case studies in addition to quantitative analyses of achievement data related to both confirmatory and exploratory research questions.

The current report focuses on the confirmatory and exploratory research questions submitted to i3 for the two studies conducted for the LASER i3 validation grant, providing clarifying detail related to methodology and instrumentation. The studies were conducted to answer two confirmatory research questions:

After three years of participation in the study (i.e., after Year 3), do schools containing the Grade 3 elementary school cohort that receive the LASER intervention (i.e., Phase 1 schools) attain higher levels of science achievement than schools that do not receive this intervention (i.e., Phase 2 schools) as measured by the PASS?

After three years of participation in the study (i.e., after Year 3), do schools containing the Grade 6 middle school cohort that receive the LASER intervention (i.e., Phase 1 schools) attain higher levels of science achievement than schools that do not receive this intervention (i.e., Phase 2 schools) as measured by the PASS?

In addition, the studies were conducted to answer two exploratory research questions:

After three years of participation in the study (i.e., after Year 3), do schools containing the Grade 3 elementary school cohort that receive the LASER intervention (i.e., Phase 1 schools) attain higher levels of science achievement than schools that do not receive this intervention (i.e., Phase 2 schools) as measured by the PASS for the following underrepresented students in STEM?
a. Students with Disabilities
b. English Language Learners
c. Economically Disadvantaged
d. Females

After three years of participation in the study (i.e., after Year 3), do schools containing the Grade 6 middle school cohort that receive the LASER intervention (i.e., Phase 1 schools) attain higher levels of science achievement than schools that do not receive this intervention (i.e., Phase 2 schools) as measured by the PASS for the following underrepresented students in STEM?
a. Students with Disabilities
b. English Language Learners
c. Economically Disadvantaged
d. Females

Two new versions of the What Works Clearinghouse Procedures and Standards Handbook (Version 2.1 and Version 3.0) were published after the research design of the study was approved under i3 and data collection had begun. The Reviewer Guidance for Use with the Procedures and Standards Handbook Version 3.0 followed in March 2016, after the summative report had been submitted to SSEC in July 2015. The current report contains language intended to clarify findings and ensure alignment with the most recent version of the handbook (Version 3.0) and the Reviewer Guidance document. For example, issues such as level of inference (individual or cluster) and joiners vs. stayers were not part of the WWC Handbook at the time this study was first implemented under i3. As a result, considerations such as these were not factored into the original presentation of results. The current version of the report reflects language that the WWC would evaluate as part of their review of the study findings under the most current version of the handbook.

Note: As previously indicated, data for the elementary and middle school samples reported in this manuscript should be treated as two separate studies for reporting and interpretation purposes: an elementary school study and a middle school study (as confirmed by the WWC in the letter in Figure A-1), both with cluster-level research questions and inference, but with analysis at the individual (i.e., student) level. Subgroup analyses focused on underrepresented groups in STEM: Students with a Disability (IEP), English Language Learners (ELL), Economically Disadvantaged (FRL), and females.

Methodology

The LASER i3 Validation study utilized a matched-pair, randomized controlled trial (RCT) and was designed to meet the What Works Clearinghouse (WWC) criteria without reservations, which is the highest possible rating. Schools with intact elementary (grades 3-5) and middle school (grades 6-8) cohorts were paired and randomly assigned to Phase 1 (immediate implementation) or Phase 2 (delayed implementation). Schools in Phase 1 began implementing LASER in the fall of 2011; Phase 2 schools served as the control group, receiving a reduced version of LASER following the conclusion of the research study. The matched-pair design was utilized to ensure equivalency between groups. Baseline equivalence was established with the analytic sample for elementary schools overall and for all subgroups and in all but two cases for middle schools (PASS multiple choice and open-ended for the ELL subgroup, both of which favored Phase 1 schools).

For the middle school study, all sixth graders in the impact analyses attended elementary schools the year prior to the inception of this study and were therefore not in their middle schools at the time of random assignment of clusters. By WWC definition, they were considered joiners (students not enrolled in the school at the time of random assignment). There were also joiners included in the elementary study (students whose first year in the study school/cluster was the third grade), although this number was relatively small (17.9% of the Phase 1 and 9.9% of the Phase 2 sample, 14.5% of the total sample).

Please also note that while the word "student" is used throughout this manuscript in reporting some incidental findings (e.g., percentile ranks) due to the individual-level analyses, and while tables of sample sizes and outcomes reference samples at both the school and individual levels due to the individual-level analyses, inferences for both the elementary and middle school results should be at the cluster (i.e., school) level as whole schools, not individual students, received the intervention.
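
The matched-pair random assignment described at the start of this section can be illustrated with a minimal sketch. It is not the study's actual procedure: the report says only that schools were matched on several school-level demographic and achievement variables, so the single `matching_score` per school, the function name, and the rule of pairing adjacent schools after sorting are all assumptions made for illustration.

```python
import random


def assign_matched_pairs(schools, seed=None):
    """Pair schools with adjacent matching scores and randomize within each pair.

    `schools` is a list of (school_id, matching_score) tuples. The single
    matching score is a hypothetical stand-in for the several school-level
    demographic and achievement variables the study matched on.
    """
    rng = random.Random(seed)
    ranked = sorted(schools, key=lambda school: school[1])
    assignment = {}
    # Walk the ranked list two schools at a time; with an odd count the
    # last school has no partner and is left unassigned.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i][0], ranked[i + 1][0]]
        rng.shuffle(pair)  # random order decides which member starts first
        assignment[pair[0]] = "Phase 1"  # immediate implementation
        assignment[pair[1]] = "Phase 2"  # delayed implementation (control)
    return assignment


if __name__ == "__main__":
    # Hypothetical schools and scores, for illustration only.
    example = [("A", 0.10), ("B", 0.90), ("C", 0.15), ("D", 0.85)]
    print(assign_matched_pairs(example, seed=2011))
```

Randomizing within pairs, rather than across the whole sample, guarantees that each phase receives exactly one member of every matched pair, which is what supports the baseline equivalence described above.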

Overall and differential cluster-level attrition levels were calculated for the full elementary and middle school samples as well as for each subgroup for comparison to the What Works Clearinghouse (WWC) cluster-level attrition standards (see Table A-3, Table A-4, and Table A-5 in the Appendix). Analyses of aggregate and subgroup data included ANCOVAs with a cluster correction (see Table A-1 in the Appendix) as well as the Benjamini-Hochberg correction for multiple comparisons (see Table A-2 in the Appendix) that the WWC applies to primary (i.e., confirmatory) and secondary (i.e., exploratory) contrasts (What Works Clearinghouse, 2014 and 2016). Details are provided in subsequent sections of this report and in the appendices.

Instrumentation

The evaluation team elected to use the WestEd-developed Partnership for Standards-based Science Assessment (PASS at WestEd) test as the primary measure of student learning. PASS is a standards-based test, developed with funding from the National Science Foundation (NSF). The work of PASS builds upon research on the properties of science assessment and current approaches for assessment development and scoring. For the purposes of this study, PASS was administered at the elementary and middle school levels. It consists of three assessment components at each grade level: 1) selected-response/multiple choice items (hereafter referred to in this report as multiple choice or MC), 2) constructed response investigations (grades 3-5) or open-ended questions (grades 6-8) (hereafter referred to in this report as open-ended or OE), and 3) hands-on performance tasks (PT). All students in the study completed the MC component; the OE and PT sections were completed by a subgroup of students in focal schools. Table 1 provides a description of each PASS Assessment Component.

Table 1: PASS Assessment Components

PASS Assessment Component: Selected Response or Multiple Choice Items (MC) (29 items for both Elementary and Middle school)
Description: Items assess students' understanding of important scientific facts, concepts, principles, laws, and theories.

PASS Assessment Component: Constructed Response Investigations and Open-Ended Questions (OE) (2 items for Elementary and 6 items for Middle school)
Description: Students analyze a problem, think critically, conduct a secondary analysis, and apply learning. They construct explanations using evidence.

PASS Assessment Component: Hands-on Performance Tasks (PT) (6 items for both Elementary and Middle school)
Description: Investigations identifying a problem to solve. Students use equipment to perform investigations; make observations; generate, organize, and analyze data; communicate understandings; and apply learning.

Measurement specialists on the PASS development teams conducted equating studies in the design of the forms and established the validity and reliability of the assessments. Table 2 below, obtained from WestEd, shows score reliabilities and inter-rater agreement calculated for the PASS assessments, including selected response/multiple choice items (MC), constructed response/open-ended investigations (OE), and hands-on performance tasks (PT).
The reliabilities of the item calibrations are given by the Rasch equivalent of the Cronbach alpha statistic and are derived from the ratio of the spread of the items over the scale to their own root mean squared error (RMSE). The statistic is scaled to stretch from 0 to 1.0. Inter-rater agreement is the correlation between the first and second reader on each answer within a task.

Table 2: PASS Score Reliabilities and Inter-Rater Reliability

Grade 5 (administered at grades 3, 4, and 5): Number of Students 7,429; Overall Score Reliability (Cronbach Alpha) .87; Inter-Rater Agreement (Constructed Response and Performance Tasks) CR 1 = .84, CR 2 = .87, PT = .85.

Grade 8 (administered at grades 6, 7, and 8): Number of Students 7,777; Overall Score Reliability (Cronbach Alpha) .92; Inter-Rater Agreement (Constructed Response and Performance Tasks) CR 1 = .95, CR 2 = .88, PT = .91.

To ensure that only students present at the beginning of LASER implementation would continue to be tested throughout the course of the study (and would therefore be considered as accurately representing their school in terms of science achievement), CREP researchers pre-slugged answer sheets following the baseline administration of the PASS MC component. Data from each administration were carefully monitored to ensure accuracy of the data set.

In both the elementary and middle school cohorts, there were four alternate forms of the PASS MC subtest administered over the course of the study: Fall 2011 (pretest), Spring 2012 (posttest), Spring 2013 (posttest), and Spring 2014 (posttest). The three posttest forms were equated, separately by grade level band, with the Fall 2011 (base) form using common items and the Mean/Sigma method based on single parameter (item difficulty) IRT Rasch models estimated under the assumption of random items. As the OE and PT subtests for both grade band tests were the same on all test forms, no equating was required for those subtests. It should be noted that the OE and PT sections were administered beginning in Spring 2012, so there were only three administrations of those sections, versus four for the MC section. In addition to the equating, scaled scores (0-600 scale) were established for the MC subtest to enable analyses of outcomes across the different test forms.

CREP recruited, trained, and calibrated scorers for the PASS OE and PT components of the assessment using WestEd's validated training and calibration materials and conducted an adaptation of WestEd's training and calibration process starting with the Spring 2013 administration. Scorers first participated in a five-hour training session, focused on either elementary or middle school scoring, where the open-ended questions and performance tasks, along with their associated scoring rubrics, were presented and discussed. At the end of the training session and prior to scoring any of the actual PASS OE/PT student materials, participants independently scored a calibration set of eight OE/PT calibration booklets (validated and provided by WestEd). The participants' results were compared to the 80% agreement benchmark set to ensure their readiness to become a certified scorer.
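
The certification check above amounts to comparing a scorer's ratings with the pre-determined scores for the calibration booklets. The sketch below shows one way such a check could work. The report does not say whether agreement was counted as exact matches or within a tolerance, so exact-match agreement, the function names, and the sample scores are assumptions; only the 80% benchmark comes from the report.

```python
def percent_agreement(scorer_scores, reference_scores):
    """Share of items on which the scorer exactly matches the reference score."""
    if len(scorer_scores) != len(reference_scores) or not reference_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(
        1 for given, expected in zip(scorer_scores, reference_scores)
        if given == expected
    )
    return matches / len(reference_scores)


def meets_benchmark(scorer_scores, reference_scores, benchmark=0.80):
    """True when agreement meets or exceeds the benchmark (80% in the report)."""
    return percent_agreement(scorer_scores, reference_scores) >= benchmark


if __name__ == "__main__":
    # Hypothetical rubric scores for ten calibration items.
    reference = [3, 2, 4, 1, 1, 2, 3, 4, 1, 2]
    candidate = [3, 2, 4, 1, 0, 2, 3, 4, 1, 2]
    print(percent_agreement(candidate, reference))
    print(meets_benchmark(candidate, reference))
```

The same comparison would apply to any later calibration booklets scored against pre-determined scores.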
After qualifying as certified scorers, each was randomly assigned to score student test booklets, based on the elementary or middle school training they had received. Each time a scorer received a set of 100 booklets, three unidentified calibration booklets were randomly inserted into the set. When scorers returned each set of papers, CREP checked the calibration papers against the pre-determined scores. Scorers could not receive a new set of booklets unless they met or exceeded the benchmark calibration level. Certified scorers signed a non-disclosure agreement and kept booklets secure and confidential until they were scored and returned. They scored each question individually and recorded results on student scan documents. The same scorers returned to score the following year's PASS OE/PT materials, and completed a second training and calibration process prior to scoring those assessments.

Population and Context

The population from which the sample for this study was drawn encompasses three regions (Houston Independent School District, rural North Carolina, and northern New Mexico), and represents a total of 16 school districts. This population of school districts includes more than 325,000 students and 20,000 teachers, and over 150 district and building-level instructional leaders, with more than half of students (56.2%) identified as economically disadvantaged by the National Center for Education Statistics (NCES) based on free and reduced lunch status. From this total, schools were nominated for participation in the study, and from that nominated list the study sample was created.

Although these regions, and schools within the regions, are very diverse, commonalities exist across the regions that stem from conversations, trends, and initiatives taking place at the national level. These have had varying levels of impact on the teaching of science in elementary and middle schools, and therefore on the implementation of the LASER model during the course of this study. Most notable are the national debate around Common Core State Standards and associated testing; the Next Generation Science Standards (final draft released in April 2013); new teacher evaluation models; and identification and implementation of programs for low-performing schools.

Within each region, unique conditions also existed during the LASER implementation window that had potential to impact implementation of the model. In North Carolina, a program called Read to Achieve was adopted in July 2012 via a state budget act and became effective during the 2013-2014 school year. With mandated summer reading camps and possible retention imposed on students who did not achieve acceptable reading levels by third grade, more instructional time was devoted to the teaching of reading in the lower grades in many NC schools. In Houston, 11 of the LASER i3 schools (25.6%) were also part of that district's Apollo 20 initiative, with multiple strategies in place to close gaps in achievement and more time devoted to testing than was required in other schools. Implementation in Northern New Mexico was impacted by the geography of the region: remote mountain school locations affected teacher travel to professional development sessions; delivery of materials, particularly live specimens; and access to SSEC's regional support system.
Partnership with the Los Alamos National Laboratory (LANL) Foundation provided support for schools in that area.

Baseline data collected from teachers prior to LASER i3 implementation in fall 2011 revealed that most students in all three regions received science instruction from their classroom teachers rather than from a science specialist (reported by more than 90% of teachers in all three regions) and that these teachers did not major in science or science education (approximately 10% of respondents from New Mexico and HISD and 8% from North Carolina reported holding majors in science). Science laboratories were more prevalent in Houston schools than in the other regions, with nearly half (49.8%) of teachers reporting that their students went to labs to receive science instruction compared to 10.5% in North Carolina and 1.5% in New Mexico. Time devoted to the teaching of science was also greater in Houston, with teachers reporting an average of 3.3 hours per week of science instruction compared to slightly under 2.5 hours in North Carolina and slightly over 2 hours in New Mexico.

When asked at baseline about challenges associated with teaching science in elementary and middle schools, teachers' responses were consistent across all three regions. The greatest challenge (reported as substantial or significant by 70% of NM teachers, 63% of NC teachers, and 59% of HISD teachers) was limited time for science instruction. Limited funds for purchasing equipment and supplies and more emphasis on English/language arts and mathematics than science instruction were also reported challenges in all regions.

Sample

The study sample originally included 139 schools across the three regions. These schools went through a nomination and qualification process, were matched based on several school-level demographic and achievement variables, and then randomly assigned to Phase 1 (immediate implementation) or Phase 2 (delayed implementation). After random assignment, changes in school participation occurred within each region, and the final sample contained 125 study schools within the 16 districts and encompassed approximately 60,000 students, 1,900 teachers, and over 140 district administrators and principals. While LASER is a school-level intervention in which all students in the participating schools received the treatment, a subsample of 9,000 students in two cohorts (third and sixth graders in 2011-12) was followed longitudinally over the three years of the study, and a further subset of focal schools was randomly selected to participate in additional components of data collection. A breakdown of the study schools by region, phase, and focal status as well as a detailed description of selection methodology has been provided in previous annual reports.

HISD is the largest school district in the study. Participating schools generally served Hispanic and African American populations (63.8% and 28.9%, respectively), with most students (88.1%) identified as eligible for free and reduced lunch. New Mexico LASER schools in the participating school districts ranged in size from 26 to 984 students. New Mexico districts served mostly Hispanic, White, and American Indian/Alaskan populations (48.0%, 34.8%, and 11.7%, respectively), with over half of students (58.0%) qualifying for free and reduced lunch status. The sizes of LASER schools in the participating school districts in North Carolina (NC) ranged from 186 to 930 students.
Most of the districts served primarily White and African American populations (60.7% and 18.1%, respectively), with almost two thirds of students (63.1%) identified as eligible for free or reduced lunch by the North Carolina Public Schools.

Findings

Although student achievement gains as measured by traditional standardized tests comprise only one component of a successful intervention, it is the single outcome of most interest to many constituencies. To obtain valid achievement outcome data, CREP researchers analyzed scores from only students in the elementary and middle school cohorts for whom both pretest and posttest (baseline and spring 2014) PASS scores were available and established baseline equivalence using these analytic samples. It is important to note that all schools identified as Phase 1 are considered to be in the treatment group regardless of their level of implementation of the LASER model, and that fidelity of implementation varied widely across regions and across schools within regions. It should also be noted that the statistical analyses utilized an Intent to Treat model, where students were included in the Phase 1 or Phase 2 groups for analysis based on their treatment status at the time of random assignment.

Important and positive trends between Phase 1 and Phase 2 schools are evidenced in exploratory subgroup outcomes related to characteristics commonly agreed upon as most valued by employers. Both the OE and PT sections of the PASS call upon students to communicate their knowledge in written form. They also engage students in activities associated with critical thinking and problem solving. These twenty-first century skills are associated with college and career readiness; it is therefore noteworthy that these are the areas of achievement in which Phase 1 schools excelled. It is also important to note that the underserved populations of economically disadvantaged and special needs students, as well as those for whom English is a second language, seem to have benefited from their experiences with LASER as reflected in scores on the PASS.

PASS results for all three regions combined follow, first for the MC, then the OE and PT sections. A summary of the key findings for each set of analyses is presented at the beginning of each section, followed by information on the samples included, baseline equivalence between the Phase 1 and Phase 2 groups, and the detailed outcomes by grade level (i.e., elementary cohort and middle school cohort) and subgroup. In keeping with guidelines in the most recent What Works Clearinghouse Procedures and Standards Handbook (Version 3.0), a clustering correction and the Benjamini-Hochberg correction for multiple comparisons were applied to all statistically significant findings, including secondary contrasts (i.e., exploratory analyses).
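
For readers unfamiliar with the Benjamini-Hochberg correction named above, the following sketch shows the standard step-up procedure. It is an illustration, not the study's analysis code: the function name, the 0.05 level, and the p-values are hypothetical, and the WWC's own implementation may differ in details such as strict versus non-strict inequality.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag which contrasts stay significant under the standard BH step-up rule.

    Returns a list of booleans in the same order as `p_values`.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Critical value for the rank-th smallest p-value is rank * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            largest_rank = rank
    significant = [False] * m
    # Every p-value ranked at or below the largest passing rank is retained.
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant


if __name__ == "__main__":
    # Hypothetical p-values for four contrasts within one outcome domain.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Because the critical value shrinks as the number of contrasts grows, a finding that is significant on its own can lose significance once it is tested alongside several others.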
In addition, the overall and differential cluster-level attrition rate for each outcome was calculated and compared to the WWC allowable standards. Based on guidelines in the most recent What Works Clearinghouse Procedures and Standards Handbook (Version 3.0), guidelines in Reviewer Guidance for Use with the Procedures and Standards Handbook (Version 3.0), consultation with representatives from the WWC (see Figure A-2 and Figure A-3 in the Appendix), and the fact that the elementary school study was an RCT with low cluster-level attrition, it should receive a WWC Study Rating of Meets WWC Group Design Standards without Reservations. While the middle school study was also an RCT, none of the outcomes met the WWC cluster-level attrition standards. For the OE and PT sections, this was due to issues of differential attrition related to Phase 2 schools (no Phase 1 middle schools were lost to attrition). However, all but two outcomes demonstrated baseline equivalence, meaning the middle school study should receive a WWC Study Rating of Meets WWC Group Design Standards with Reservations.

There were three exploratory outcomes for the elementary study that were statistically significant and positive after the cluster correction: the ELL subgroup on the PASS OE and PT, and the IEP subgroup on the PT. However, there were no statistically significant positive confirmatory or exploratory findings in either the elementary or middle school studies after applying both the cluster correction and the Benjamini-Hochberg correction for multiple comparisons that the WWC, but not IES, applies to secondary contrasts. As stated by the Institute of Education Sciences (IES) at the U.S. Department of Education (Schochet, 2008), multiplicity adjustments are not required for exploratory analyses. Furthermore, it should be noted that there were no statistically significant or substantively important confirmatory or exploratory negative findings for the elementary study. Therefore, based on the IES guidance, these three exploratory outcomes for the elementary study in favor of Phase 1 schools, which remained statistically significant after the cluster correction, should still be considered meaningful and positive findings. In addition, the IEP (g = 0.39) and ELL (g = 0.30) PT exploratory outcomes had substantively important effect sizes.

According to the What Works Clearinghouse Procedures and Standards Handbook (Version 3.0):

For the WWC, effect sizes of 0.25 standard deviations or larger are considered to be substantively important. Effect sizes at least this large are interpreted as a qualified positive (or negative) effect, even though they may not reach statistical significance in a given study (What Works Clearinghouse, 2014, p. 23).

Furthermore, the American Statistical Association (ASA) recently released a statement on the use and interpretation of p values in which they concluded:

Scientific conclusions and business or policy decisions should not be based only on whether a p value passes a specific threshold. A p value, or statistical significance, does not measure the size of an effect or the importance of a result. By itself, a p value does not provide a good measure of evidence regarding a model or hypothesis (Wasserstein and Lazar, 2016, pp. 131-132).

Given that the WWC considers substantively important effect sizes as a qualified positive effect, even if not statistically significant, the ASA's recent guidance on p values, as well as the IES guidance on multiple comparison corrections for exploratory analyses, the SSEC's LASER model can claim success in meeting its goal of improving student achievement.

All Regions: Results for Spring 2014 PASS Multiple Choice

All Regions Spring 2014 PASS Multiple Choice Key Findings for Phase 1

After applying the cluster correction and the Benjamini-Hochberg correction for multiple comparisons that the WWC applies to secondary contrasts (i.e., exploratory analyses), and calculating the overall and subgroup cluster-level attrition, there were no statistically significant or substantively important PASS scaled score outcomes favoring Phase 1 elementary schools on the Spring 2014 PASS multiple choice section.

It should be noted that while none of the middle school analyses (overall or by subgroup) met the WWC cluster-level attrition standard, all outcomes demonstrated baseline equivalence using the analytic samples. For the elementary cohort, only the IEP subgroup did not meet the WWC cluster-level attrition standard, but the subgroup did demonstrate baseline equivalence with the analytic sample.

Fall 2011 to Spring 2014 PASS Results: All Regions

There were a total of 29 multiple choice questions on both the Fall 2011 and Spring 2014 forms of the PASS (PASS MC), addressing five broad science content standard categories for the elementary cohort and six broad science content standard categories for the middle school cohort. Scaled scores on the PASS MC for both elementary and middle schools range from 0-600. Only students who answered at least one multiple choice achievement question at both time points were included in the analyses for each respective area of analysis.

PASS Multiple Choice: All Regions

Table 3 shows the final cluster (i.e., school) and student sample sizes employed in the elementary cohort analyses (5th graders in 2013-2014) once students missing data on all 29 PASS MC questions at either time point were excluded.

Table 3: PASS MC, Spring 2014: School and Student Samples for the PASS MC Analyses for the Elementary Cohort: All Regions

| Sample | Phase 1 | Phase 2 |
|---|---|---|
| Schools available for the PASS MC achievement analysis | 51 | 43 |
| Students available for the PASS MC achievement analysis | 2,338 | 1,785 |

Table 4 shows the final school and student sample sizes employed in the middle school cohort analyses (8th graders in 2013-2014) once students missing all 29 PASS MC questions at either time point were excluded.

Table 4: PASS MC, Spring 2014: School and Student Samples for the PASS MC Analyses for the Middle School Cohort: All Regions

| Sample | Phase 1 | Phase 2 |
|---|---|---|
| Schools available for the PASS MC achievement analysis | 11 | 11 |
| Students available for the PASS MC achievement analysis | 1,036 | 1,132 |

To determine baseline equivalence on the Fall 2011 PASS MC scaled scores between Phase 1 and Phase 2 for elementary and middle schools included in the present analysis, a series of independent t-tests were conducted using the analytic samples for all elementary and middle schools in the aggregate as well as for subgroups designated by their Special Education (IEP) status, English Language Learner (ELL) status, Economically Disadvantaged (FRL) status, and Gender.
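As an illustration of the independent t-tests described above, the following minimal sketch recomputes two of the baseline t statistics from the summary values reported in Table 5. It is not the evaluators' code. The report does not say which t-test variant was used; the whole-number and fractional degrees of freedom quoted in the text suggest an equal-variance test for some comparisons and a Welch-type test for others, and that reading is an assumption here. The function names are hypothetical.

```python
from math import sqrt

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t (unequal variances) with Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Phase 1 vs. Phase 2 means, SDs, and student counts from Table 5.
t_all, df_all = pooled_t(312.02, 101.33, 2338, 314.39, 98.11, 1785)  # All group
t_ell, df_ell = welch_t(263.82, 91.57, 537, 277.31, 84.24, 418)      # ELL subgroup

print(f"All: t({df_all}) = {t_all:.2f}")
print(f"ELL: t({df_ell:.1f}) = {t_ell:.2f}")
```

Because the inputs are rounded table values, the recomputed degrees of freedom for the ELL comparison agree with the reported figure only approximately.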
In addition, an effect size was also calculated as a measure of baseline equivalence. As an indicator of the impact or practical significance of the treatment, the effect size (calculated as Hedges' g) is a descriptive statistic that indicates the magnitude of the difference (in standard deviation units) between two measures. For example, a positive effect size would indicate a higher (i.e., better) Phase 1 mean, while a negative effect size would indicate a higher (i.e., better) Phase 2 mean. Based on guidelines from the What Works Clearinghouse (WWC), a unit within the research division of the U.S. Department of Education, an effect size of +/-0.25 is considered to be substantively important (What Works Clearinghouse, 2014).

With respect to the elementary cohort (Table 5), in the aggregate (the All group), there was no statistically significant difference by Phase between schools in baseline achievement levels (t (4121) = -0.75, p = .45, g = -0.02, PR = 49) based on the analytic samples. At the same time, the ELL subgroup was the only one that demonstrated a statistically significant difference in baseline achievement, with Phase 2 ELL schools outperforming their Phase 1 counterparts, although based on the effect size (g), not to a substantively meaningful degree (t (926.9) = -2.36, p = .02, g = -0.15, PR = 44). Overall, there were no substantively important effect size differences for the elementary cohort schools, meaning there was baseline equivalence for all groups based on the analytic samples.

Table 5: Baseline Comparison of Fall 2011 PASS MC Scaled Scores for Elementary Cohort Phase 1 (Treatment) and Phase 2 (Control) Schools (N = 94): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 51 | 2,338 | 312.02 | 101.33 | 43 | 1,785 | 314.39 | 98.11 | -0.75 | -0.02 | 49 |
| IEP | 42 | 209 | 267.05 | 93.13 | 30 | 152 | 260.83 | 98.96 | 0.61 | 0.06 | 53 |
| ELL | 45 | 537 | 263.82 | 91.57 | 38 | 418 | 277.31 | 84.24 | -2.36* | -0.15 | 44 |
| FRL | 47 | 1,416 | 284.79 | 94.85 | 40 | 1,060 | 289.39 | 90.29 | -1.23 | -0.05 | 48 |
| Female | 51 | 1,157 | 310.55 | 97.49 | 43 | 887 | 311.97 | 96.89 | -0.33 | -0.01 | 49 |

Note: PR = The percentile rank of the average Phase 1 student in the control group based on the effect size (g). For example, if the PR is 60, then the average Phase 1 student scored at the 60th percentile of the control group.
Note: PASS MC scaled scores range from 0-600.
* p < .05.
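The effect size and percentile rank columns can be approximated from the reported means, standard deviations, and student counts. The sketch below assumes the common form of Hedges' g (pooled standard deviation with the small-sample correction 1 - 3/(4df - 1)) and assumes PR is the standard normal cumulative probability at g, expressed as a percentage. The report does not spell out either formula, so both are assumptions, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def hedges_g(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference (group 1 minus group 2) with small-sample correction."""
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    correction = 1 - 3 / (4 * df - 1)
    return correction * (m1 - m2) / pooled_sd

def percentile_rank(g):
    """Percentile of the average treatment student within a normal control distribution."""
    return round(100 * NormalDist().cdf(g))

# Elementary ELL subgroup, Fall 2011 baseline values from Table 5.
g_ell = hedges_g(263.82, 91.57, 537, 277.31, 84.24, 418)
print(f"g = {g_ell:.2f}, PR = {percentile_rank(g_ell)}")
```

Under these assumptions a negative g corresponds to a PR below 50, which matches the reading given in the table note.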
Likewise, with respect to schools in the middle school cohort (Table 6), there was no statistically significant difference between schools in baseline achievement by Phase (t (2166) = 1.17, p = .24, g = 0.05, PR = 52) in the aggregate based on the analytic samples. When the outcomes for the FRL subgroup were compared by Phase, there was a statistically significant difference in Fall 2011 PASS scores that favored Phase 1 schools, but the effect size linked to the comparison did not meet WWC criteria for substantive importance (i.e., g ≥ 0.25) (t (1223.2) = 3.62, p < .01, g = 0.20, PR = 58). On the other hand, there was a statistically significant difference in Fall 2011 PASS scores for the ELL subgroup, and the effect size associated with the difference met the WWC threshold for substantive importance, favoring Phase 1 schools (t (181) = 3.30, p < .01, g = 0.49, PR = 69). Therefore, the outcome for the ELL subgroup comparison for the middle school cohort should be interpreted in light of the substantively important difference in baseline achievement between Phase 1 and Phase 2 schools.

Employing these Fall 2011 data as covariates to statistically adjust the outcomes for baseline differences in achievement for the analytic sample, analyses were conducted to determine differences between Phase 1 and Phase 2 elementary and middle schools, with scaled scores on the Spring 2014 PASS MC used as the outcome measure.

Table 6: Baseline Comparison of Fall 2011 PASS MC Scaled Scores for Middle School Cohort Phase 1 (Treatment) and Phase 2 (Control) Schools (N = 22): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 11 | 1,036 | 364.51 | 102.66 | 11 | 1,132 | 359.10 | 112.40 | 1.17 | 0.05 | 52 |
| IEP | 10 | 111 | 282.22 | 96.57 | 6 | 114 | 276.25 | 112.07 | 0.43 | 0.06 | 52 |
| ELL | 10 | 83 | 290.70 | 82.07 | 11 | 100 | 248.08 | 90.75 | 3.30* | 0.49 | 69 |
| FRL | 11 | 644 | 339.08 | 98.64 | 11 | 614 | 317.63 | 110.88 | 3.62* | 0.20 | 58 |
| Female | 11 | 531 | 367.68 | 99.85 | 11 | 562 | 357.67 | 108.00 | 1.59 | 0.10 | 54 |

Note: PR = The percentile rank of the average Phase 1 student in the control group based on the effect size (g). For example, if the PR is 60, then the average Phase 1 student scored at the 60th percentile of the control group.
Note: PASS MC scaled scores range from 0-600.
* p < .05.

Elementary and Middle School Cohort PASS Multiple Choice Analyses: All Regions

With respect to the cohort of elementary schools in Phase 1 (n = 51) and Phase 2 (n = 43) and the cohort of middle schools in Phase 1 (n = 11) and Phase 2 (n = 11), a set of ANCOVA analyses intended to generate pairs of adjusted scaled score means and to compute the treatment effect sizes (g) were conducted on the PASS MC outcomes for all elementary and middle schools by Phase within cohort, as well as for subgroups categorized by their IEP status, ELL status, FRL status, and Gender (see Table 7 and Table 8).
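For readers unfamiliar with ANCOVA-adjusted means, the standard single-covariate adjustment is sketched below. This is the textbook formula, offered as an assumption about how the adjusted means were obtained; the report does not state the model specification or the estimated slope. Here \bar{Y}_j is the unadjusted Spring 2014 mean for Phase j, \bar{X}_j is its Fall 2011 mean, \bar{X} is the student-weighted grand mean of the covariate, and b_w is the pooled within-group regression slope. The numerical check uses the All-group values reported in Table 5 and Table 7, and the implied slope is a back-of-envelope inference rather than a reported result.

```latex
\bar{Y}_j^{\mathrm{adj}} = \bar{Y}_j - b_w\,(\bar{X}_j - \bar{X}),
\qquad
\bar{X} = \frac{2338(312.02) + 1785(314.39)}{2338 + 1785} \approx 313.05

% Phase 1: 435.80 - 435.28 =  0.52 = -b_w\,(312.02 - 313.05) \;\Rightarrow\; b_w \approx 0.51
% Phase 2: 434.88 - 435.56 = -0.68 = -b_w\,(314.39 - 313.05) \;\Rightarrow\; b_w \approx 0.51
```

Both phases imply roughly the same slope, which is consistent with a common-slope adjustment: the group that started lower at baseline (Phase 1) is adjusted upward and the group that started higher is adjusted downward.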
Elementary Cohort PASS Multiple Choice Spring 2014 Results: All Regions

For the elementary cohort schools across the three regions, while the overall (i.e., the All group) ANCOVA adjusted scaled score mean presented in Table 7 was higher for Phase 1 schools (n = 51, Adjusted Mean = 435.80) compared to Phase 2 schools (n = 43, Adjusted Mean = 434.88), the difference fell short of being statistically significant (F (1, 4116) = 0.15, p = 0.698, g = 0.01, PR = 50), and the effect size (g = 0.01) was not substantively important. Consistent with these overall outcomes, two subgroup analyses (IEP and ELL) were linked to positively signed effect sizes that favored Phase 1 schools in the elementary cohort (see Table 7). Meanwhile, although the ELL subgroup in Phase 2 schools statistically significantly outperformed the ELL subgroup in Phase 1 schools at baseline, the ELL subgroup in Phase 1 schools had a higher adjusted mean scaled score on the posttest, a difference that ultimately fell short of being statistically significant or substantively important. Overall, none of the effect sizes for the ANCOVA analyses were large enough to be substantively important, ranging from a low of -0.03 (FRL and Female) to a high of 0.19 (IEP).

Table 7: PASS MC, Spring 2014: Subgroup Mean Scaled Score Comparison for Elementary Cohort Phase 1 (Treatment) and Phase 2 (Control) Schools (N = 94): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 1 Adj. M | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | Phase 2 Adj. M | F | p | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 51 | 2,338 | 435.28 | 88.72 | 435.80 | 43 | 1,785 | 435.56 | 88.76 | 434.88 | 0.15 | 0.698 | 0.01 | 50 |
| IEP | 42 | 209 | 392.08 | 104.38 | 390.08 | 30 | 152 | 366.49 | 116.11 | 369.23 | 3.86 | 0.050 | 0.19 | 58 |
| ELL | 45 | 537 | 402.87 | 100.18 | 405.64 | 38 | 418 | 403.77 | 104.85 | 400.21 | 0.80 | 0.370 | 0.05 | 52 |
| FRL | 47 | 1,416 | 415.28 | 94.98 | 415.85 | 40 | 1,060 | 419.38 | 94.50 | 418.62 | 0.66 | 0.416 | -0.03 | 49 |
| Female | 51 | 1,157 | 434.92 | 84.14 | 435.33 | 43 | 887 | 438.12 | 84.10 | 437.59 | 0.50 | 0.481 | -0.03 | 49 |

Note: PR = The percentile rank of the average Phase 1 student in the control group based on the effect size (g). For example, if the PR is 60, then the average Phase 1 student scored at the 60th percentile of the control group.
Note: PASS MC scaled scores range from 0-600.
* p < .05

Middle School Cohort PASS MC Spring 2014 Results: All Regions

For the schools across the three regions in the middle school cohort, unlike the outcomes observed for the elementary cohort, the overall scaled score performance result for the ANCOVA analysis (i.e., the All group) shown in Table 8 was negative for middle school cohort Phase 1 schools (n = 11, Adjusted Mean = 323.02) compared to middle school cohort Phase 2 schools (n = 11, Adjusted Mean = 327.22), and was not statistically significant (F (1, 2161) = 1.35, p = 0.246). In addition, the effect size (g = -0.04, PR = 48) was not substantively important. Furthermore, despite the advantage of Phase 1 schools on the pretest for all subgroups, including a substantively important advantage of Phase 1 schools on the Fall 2011 baseline for the ELL subgroup, Phase 2 outperformed Phase 1 for all subgroups. However, the effect size favoring the Phase 2 IEP subgroup (g = -0.28) was the only substantively important subgroup effect found, and indicated that the average Phase 1 student scored at the 39th percentile of the control group.
Furthermore, none of the outcomes was statistically significant, including the outcome for the IEP subgroup, which was not statistically significant after applying the cluster correction.

Table 8: PASS MC, Spring 2014: Subgroup Mean Scaled Score Comparison for Middle School Cohort Phase 1 (Treatment) and Phase 2 (Control) Schools (N = 22): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 1 Adj. M | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | Phase 2 Adj. M | F | p | p^ | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 11 | 1,036 | 323.75 | 110.85 | 323.02 | 11 | 1,132 | 326.55 | 106.00 | 327.22 | 1.35 | 0.246 | | -0.04 | 48 |
| IEP | 10 | 111 | 220.02 | 124.36 | 217.99 | 6 | 114 | 250.53 | 124.45 | 252.50 | 5.91 | 0.016* | 0.196 | -0.28 | 39 |
| ELL | 10 | 83 | 235.75 | 107.78 | 224.46 | 11 | 100 | 232.97 | 106.64 | 242.34 | 1.39 | 0.240 | | -0.17 | 43 |
| FRL | 11 | 644 | 299.72 | 111.57 | 293.04 | 11 | 614 | 293.58 | 109.45 | 300.58 | 2.25 | 0.134 | | -0.07 | 47 |
| Female | 11 | 531 | 333.00 | 104.39 | 330.93 | 11 | 562 | 330.84 | 99.39 | 332.80 | 0.16 | 0.690 | | -0.02 | 49 |

Note: PR = The percentile rank of the average Phase 1 student in the control group based on the effect size (g). For example, if the PR is 60, then the average Phase 1 student scored at the 60th percentile of the control group.
Note: PASS MC scaled scores range from 0-600.
* p < 0.05. p^ = Clustering-corrected statistical significance

All Regions: Results for Spring 2014 PASS Open-Ended and Performance Task

All Regions Spring 2014 PASS Open-Ended and Performance Task Key Findings for Phase 1

After applying both the cluster correction and the Benjamini-Hochberg correction for multiple comparisons that the WWC applies to secondary contrasts (exploratory analyses), and calculating the overall and subgroup cluster-level attrition, for all students combined (the All group) and the specified subgroups, the following outcomes (percentage correct) favoring Phase 1 elementary and middle schools were found on the Spring 2014 PASS Open-Ended/Constructed Response (OE) and Performance Task (PT) sections.

It should be noted that while none of the middle school analyses (overall or by subgroup) met the WWC cluster-level attrition standard (due to differential attrition rates related to Phase 2 schools, as no Phase 1 schools were lost to attrition), all outcomes demonstrated baseline equivalence with the analytic samples except for the ELL subgroup on the Open-Ended section. For the elementary cohort, only the IEP subgroup did not meet the WWC cluster-level attrition standard, but the subgroup did demonstrate baseline equivalence with the analytic sample.

ELL
- Elementary Cohort Open-Ended: Phase 1 schools had statistically significantly higher achievement than Phase 2 schools after the cluster correction. However, the difference was not statistically significant after the Benjamini-Hochberg correction for multiple comparisons that the WWC, but not IES, applies to secondary contrasts (i.e., exploratory analyses).
- Elementary Cohort Performance Task: After controlling for the statistically significant advantage Phase 2 schools demonstrated on the pretest (g = -0.18), Phase 1 schools demonstrated a statistically significant advantage over Phase 2 schools after the cluster correction. The difference was not statistically significant after the Benjamini-Hochberg correction for multiple comparisons that the WWC, but not IES, applies to secondary contrasts (exploratory analyses), but it was substantively important (g = 0.30).
- Middle School Cohort Performance Task: Phase 1 schools had a substantively important advantage over Phase 2 schools on the posttest (g = 0.37).

Economically Disadvantaged (FRL)
- Middle School Cohort Performance Task: Phase 1 schools outperformed Phase 2 schools with an effect size that was substantively important (g = 0.27).

IEP
- Elementary Cohort Performance Task: Phase 1 schools demonstrated statistically significantly higher achievement than Phase 2 schools after the cluster correction, but the difference was not statistically significant after the Benjamini-Hochberg correction for multiple comparisons that the WWC, but not IES, applies to secondary contrasts (i.e., exploratory analyses). In addition, the effect size was substantively important (g = 0.39).

Female
- Middle School Cohort Performance Task: Phase 1 schools outperformed Phase 2 schools with a nearly substantively important effect size (g = 0.23).

Spring 2014 PASS Open-Ended and Performance Task Results: All Regions

Introduction

A random sample of schools in the three regions took the PASS Open-Ended and Performance Task assessment for the first time in Spring 2012 (end of the first posttest year) and again in Spring 2013 and Spring 2014 (second and third posttest years, respectively). Students in the elementary cohort (5th graders in 2013-2014) responded to two Open-Ended (OE) and six Performance Task (PT) items, while students in the middle school cohort (8th graders in 2013-2014) responded to six OE and six PT items. It should be noted that a random sample of schools in the HISD middle school cohort took the OE and PT sections for the first time in Spring 2013, and are therefore not included in these analyses.

PASS Open-Ended and Performance Task Scoring

For the elementary cohort, there are a total of six points possible for the OE section and 17 total points possible for the PT section. For the middle school cohort, there are a total of 15 points possible for the OE section and 17 total points possible for the PT section. The items are scored using a rubric, with the number of points available for each item in each section shown in Table 9 below.

For a section to be scored, the student had to answer at least one item (i.e., give a response that received a score of zero or higher). The section was dropped from the analysis if all of its items were missing, scored a B (blank), or a combination of the two. If the section was scored, any item with a B and any missing item was given a value of zero. As a result, within a scored section, missing items and items scored B were treated the same as items to which the student actually responded but received a score of zero, indicating the response did not contain any correct elements or was irrelevant. For both the OE and PT sections, the outcome score used in the analyses was the percentage correct out of the total number of points possible.
Table 9: PASS OE and PT Scoring Scales, Spring 2012, Spring 2013, and Spring 2014

| Item | Elementary Cohort Open-Ended Question Scale | Elementary Cohort Performance Task Scale | Middle School Cohort Open-Ended Question Scale | Middle School Cohort Performance Task Scale |
|---|---|---|---|---|
| 1 | B, 0, 1, 2, 3 | B, 0, 1, 2, 3 | B, 0, 1, 2 | B, 0, 1, 2, 3 |
| 2 | B, 0, 1, 2, 3 | B, 0, 1, 2, 3 | B, 0, 1, 2 | B, 0, 1, 2, 3 |
| 3 | | B, 0, 1, 2, 3 | B, 0, 1, 2 | B, 0, 1, 2, 3 |
| 4 | | B, 0, 1, 2, 3 | B, 0, 1, 2, 3 | B, 0, 1, 2, 3 |
| 5 | | B, 0, 1, 2, 3 | B, 0, 1, 2, 3 | B, 0, 1, 2, 3 |
| 6 | | B, 0, 1, 2 | B, 0, 1, 2, 3 | B, 0, 1, 2 |
| Total Points | 6 | 17 | 15 | 17 |

B = Blank
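The section-scoring rule described above can be sketched as a small function. The representation is an assumption made for illustration: each item is an integer rubric score, the string "B" for a blank, or None for missing data, and the function name is hypothetical.

```python
def percent_correct(items, max_points):
    """Percent correct for one PASS section, or None if the section is dropped.

    items: rubric scores per item; "B" marks a blank, None marks missing data.
    max_points: total points possible for the section (e.g., 6, 15, or 17).
    """
    # Keep only real responses, i.e., items that received a score of zero or higher.
    answered = [x for x in items if x is not None and x != "B"]
    if not answered:
        # Every item was blank or missing, so the section is excluded.
        return None
    # Blank and missing items contribute zero points to a scored section.
    return 100 * sum(answered) / max_points

# Elementary OE section (two items, 6 points possible).
print(percent_correct([3, "B"], 6))     # one scored item plus one blank
print(percent_correct(["B", None], 6))  # nothing answered, section dropped
```

Returning None rather than 0 keeps dropped sections distinguishable from sections in which a student responded but earned no points.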

A summary of the Key Findings for each set of analyses is presented at the beginning of the report, followed by information on the samples included, baseline equivalence between the Phase 1 and Phase 2 schools, and the detailed outcomes by grade level (i.e., elementary cohort and middle school cohort), outcome (PASS OE and PASS PT), and subgroup.

A preliminary analysis was conducted on the Spring 2012 OE and PT sections of the PASS for students in the analytic sample who had a Spring 2014 OE or PT percent correct score to determine baseline equivalence between Phase 1 and Phase 2 for elementary and middle schools included in the present analysis (see Table 10). This was necessary because the PASS OE and PT sections were not administered until the end of the first posttest year, meaning there were no Fall 2011 baseline scores available. In addition, an effect size was also calculated as a measure of baseline equivalence. As an indicator of the impact or practical significance of the treatment, the effect size (calculated as Hedges' g) is a descriptive statistic that indicates the magnitude of the difference (in standard deviation units) between two measures. For example, a positive effect size would indicate a higher (i.e., better) Phase 1 mean, while a negative effect size would indicate a higher (i.e., better) Phase 2 mean. Based on guidelines from the What Works Clearinghouse, a unit within the research division of the U.S. Department of Education, an effect size of +/-0.25 is considered to be substantively important (What Works Clearinghouse, 2014).

Results indicated that for the elementary cohort aggregate scores (i.e., for all students combined), there was no statistically significant difference between Phase 1 and Phase 2 schools on the Spring 2012 OE or PT percent correct, along with no substantively important effect sizes according to What Works Clearinghouse (WWC) standards.
For the middle school cohort aggregate scores, Phase 1 schools had a statistically significantly higher mean Spring 2012 OE percent correct, as well as Spring 2012 PT percent correct, with the magnitude of the effects for both being substantively important.

Table 10: PASS OE and PT, Spring 2012, Treatment (Phase 1) and Control (Phase 2) Percent Correct Means Comparison: All Regions

| Section | Cohort | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Open-Ended | Elementary | 35 | 1,159 | 43.3 | 20.66 | 31 | 991 | 44.43 | 18.69 | -1.37 | -0.06 |
| Performance Task | Elementary | 35 | 1,326 | 53.68 | 19.76 | 32 | 1,099 | 54.41 | 17.35 | -0.97 | -0.04 |
| Open-Ended | Middle | 8 | 795 | 72.6 | 16.32 | 7 | 578 | 68.06 | 19.43 | 4.56* | 0.26 |
| Performance Task | Middle | 8 | 697 | 52.11 | 20.08 | 7 | 514 | 42.23 | 23.45 | 7.69* | 0.46 |

* p < 0.05

Because the PASS OE and PT were not administered until the end of the first posttest year, meaning there were no true baseline scores available, and because of substantively meaningful differences on the Spring 2012 scores, correlation analyses were conducted to examine the relationship between the Spring 2014 PASS OE and PT percent correct and (1) the Spring 2012 PASS OE and PT percent correct, as well as (2) the Fall 2011 PASS Multiple Choice (MC) scaled score results, to determine which scores would serve as the better baseline measure of achievement. The analyses revealed statistically significant, but low, correlations among each of the measures of achievement (see Table 11). For the schools in both the elementary and middle school cohorts, the Fall 2011 PASS MC scaled scores had higher statistically significant correlations with the Spring 2014 PASS OE and PT, compared to the Spring 2012 OE and PT.

Table 11: Correlations on the Percent Correct for Spring 2014 PASS OE and PT with Spring 2012 PASS OE and PT, and Fall 2011 PASS Multiple Choice for Phase 1 and Phase 2 Schools: All Regions

| Spring 2014 PASS | Cohort | Fall 2011 PASS Multiple Choice | Spring 2012 Open-Ended | Spring 2012 Performance Task |
|---|---|---|---|---|
| Spring 2014 Open-Ended | Elementary | 0.37* | 0.33* | NA |
| Spring 2014 Open-Ended | Middle | 0.45* | 0.38* | NA |
| Spring 2014 Performance Task | Elementary | 0.36* | NA | 0.35* |
| Spring 2014 Performance Task | Middle | 0.39* | NA | 0.34* |

* p < 0.05

To determine baseline equivalence on the Fall 2011 PASS MC scaled score between the analytic samples in Phase 1 and Phase 2 elementary and middle schools included in the present analyses, a series of independent t-tests were conducted for all elementary and middle schools in the aggregate as well as for subgroups identified by their Special Education (IEP) status, English Language Learner (ELL) status, Economically Disadvantaged (FRL) status, and Gender (see Table 12).

For the analytic sample in the aggregate elementary OE cohort (i.e., the All group), Phase 2 schools demonstrated a statistically significant advantage over Phase 1 schools in their baseline achievement levels (t(2583) = -2.53, p = 0.011, g = -0.10, PR = 46), but the effect size linked to this advantage did not meet WWC criteria for substantive importance (i.e., g ≥ 0.25). In addition to this overall difference in performance, a statistically significant, but not substantively important, advantage was also observed to favor the analytic sample in the Female subgroup in Phase 2 schools in the elementary cohort.

For schools in the middle school OE cohort, no statistically significant difference in aggregate performance (i.e., the All group) between the analytic sample in Phase 1 and Phase 2 schools was observed (t(1525) = 1.02, p = 0.309, g = 0.05, PR = 52), and the associated effect size did not meet the WWC criteria for substantive importance.
Meanwhile, a statistically significant, but not substantively important, advantage in baseline performance for the analytic sample was observed for the FRL subgroup in Phase 1 middle schools. Additionally, the ELL subgroup in Phase 1 schools had an advantage over Phase 2 schools that was not statistically significant, but was substantively important (g = 0.31).

With respect to the elementary PT cohort in the aggregate (i.e., the All group), Phase 2 schools demonstrated a statistically significant advantage over Phase 1 schools in their baseline achievement levels for the analytic sample (t(2599) = -2.20, p = 0.028, g = -0.09, PR = 47), but the effect size linked to this advantage did not meet WWC criteria for substantive importance (i.e., g ≥ 0.25). Consistent with this overall difference in performance, a statistically significant, but not substantively important, advantage was observed to favor the ELL subgroup in Phase 2 schools.

With respect to the middle school PT cohort, no statistically significant difference in aggregate performance (i.e., the All group) between the analytic samples in Phase 1 and Phase 2 schools was observed (t(1406) = -0.55, p = 0.582, g = -0.03, PR = 49), and the associated effect size did not meet the WWC criteria for substantive importance. No statistically significant or substantively important advantages in baseline performance were observed for the analytic samples in any of the four subgroups for middle schools.

While neither the Spring 2012 PASS OE and PT nor the Fall 2011 PASS Multiple Choice provided complete baseline equivalence between Phase 1 and Phase 2 schools, the Fall 2011 PASS Multiple Choice was administered as a true baseline assessment, whereas the Spring 2012 PASS OE and PT was not administered until the end of the first posttest year. Therefore, due to its stronger relationship to the Spring 2014 PASS OE and PT outcomes, and because it was a true baseline measure, the Fall 2011 PASS Multiple Choice scaled score was chosen as the covariate (i.e., pretest measure) for both the elementary and middle school cohort analyses.

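Table 12: Fall 2011 PASS Multiple Choice, Treatment (Phase 1) and Control (Phase 2) Mean Scaled Score Comparison: All Regions

Elementary Cohort - Open-Ended

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 35 | 1,409 | 306.03 | 100.39 | 31 | 1,176 | 316 | 98.81 | -2.53* | -0.10 | 46 |
| IEP | 28 | 133 | 265.75 | 92.62 | 21 | 90 | 258.42 | 102.86 | 0.55 | 0.08 | 53 |
| ELL | 32 | 370 | 263.68 | 92.01 | 28 | 247 | 277.96 | 86.41 | -1.94 | -0.16 | 44 |
| FRL | 32 | 890 | 280.08 | 95.69 | 28 | 659 | 287.52 | 89.2 | -1.56 | -0.08 | 47 |
| Female | 35 | 693 | 302.23 | 94.44 | 31 | 584 | 313.05 | 97.34 | -2.01* | -0.11 | 46 |

Middle School Cohort - Open-Ended

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 8 | 832 | 368.45 | 103.45 | 7 | 695 | 362.84 | 111.7 | 1.02 | 0.05 | 52 |
| IEP | 7 | 83 | 275.46 | 90 | 6 | 71 | 269.59 | 120.65 | 0.34 | 0.06 | 52 |
| ELL | 7 | 44 | 274.25 | 80.46 | 7 | 64 | 246.67 | 91.15 | 1.62 | 0.31 | 62 |
| FRL | 8 | 490 | 338.92 | 99.53 | 7 | 380 | 318.23 | 108.71 | 2.92* | 0.20 | 58 |
| Female | 8 | 434 | 372.06 | 99.2 | 7 | 352 | 359.58 | 108.43 | 1.68 | 0.12 | 55 |

Elementary Cohort - Performance Task

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 35 | 1,429 | 308.05 | 101.59 | 32 | 1,172 | 316.73 | 98.39 | -2.20* | -0.09 | 47 |
| IEP | 28 | 132 | 266.52 | 91.6 | 22 | 94 | 254.68 | 100.22 | 0.92 | 0.12 | 55 |
| ELL | 32 | 371 | 263.52 | 92.41 | 29 | 238 | 279.6 | 85.71 | -2.15* | -0.18 | 43 |
| FRL | 32 | 895 | 280.24 | 95.46 | 29 | 654 | 288.61 | 88.3 | -1.76 | -0.09 | 46 |
| Female | 35 | 703 | 303.87 | 94.93 | 32 | 581 | 314.23 | 96.22 | -1.93 | -0.11 | 46 |

Middle School Cohort - Performance Task

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | t | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 8 | 772 | 365.64 | 104.81 | 7 | 636 | 368.78 | 107.97 | -0.55 | -0.03 | 49 |
| IEP | 7 | 84 | 271.93 | 90.84 | 5 | 61 | 280.89 | 114.37 | -0.53 | -0.09 | 46 |
| ELL | 7 | 42 | 274.98 | 82.04 | 7 | 50 | 259.06 | 91.01 | 0.87 | 0.08 | 57 |
| FRL | 8 | 465 | 338.95 | 100.62 | 7 | 338 | 325.94 | 106.72 | 1.76 | 0.13 | 55 |
| Female | 8 | 405 | 368.36 | 102.13 | 7 | 328 | 362.54 | 105.82 | 0.75 | 0.06 | 52 |

* p < 0.05
Note: PASS MC scaled scores range from 0-600.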
Employing the Fall 2011 PASS MC data as a covariate to statistically adjust the outcomes for baseline differences in achievement, preliminary analyses were conducted on Spring 2014 PASS OE and PT percent correct scores to determine any differences between Phase 1 and Phase 2 elementary and middle schools. As noted earlier, for the elementary cohort, there were statistically significant differences between Phase 1 and Phase 2 schools on the baseline measures for both the OE and PT, with Phase 2 schools having an advantage both overall and for several subgroups. For the middle school cohort, the Phase 1 ELL subgroup had a substantively important advantage and the FRL subgroup had a statistically significant advantage on the OE section. Due to these baseline differences, results for these particular groups should be interpreted with these advantages in mind.

Elementary and Middle School Cohorts PASS Open-Ended Analyses: All Regions

A set of ANCOVA analyses intended to generate pairs of adjusted percentage correct scores and to compute the treatment effect sizes (g) was conducted by Phase within cohort on the PASS OE outcomes for all elementary (Phase 1 = 35, Phase 2 = 31) and middle schools (Phase 1 = 8, Phase 2 = 7), as well as for subgroups categorized by their Special Education (IEP) status, English Language Learner (ELL) status, Economically Disadvantaged (FRL) status, and Gender.

Elementary Cohort Spring 2014 PASS Open-Ended Results: All Regions

For the elementary cohort schools across the three regions, after applying the cluster correction, the ANCOVA adjusted means presented in Table 13 indicated no statistically significant difference between Phase 1 and Phase 2 schools overall (i.e., the All group) (F (1, 2578) = 6.32, p^ = 0.240, g = 0.09, PR = 54). In addition, the magnitude of the effect size for the All group (g = 0.09) was not considered to be substantively important. It should be noted that on the pretest for this group, Phase 2 schools had a statistically significant advantage over Phase 1 schools, although it was not considered substantively important.

Only the ELL subgroup demonstrated a statistically significant difference after the cluster correction, which favored Phase 1 schools (F (1, 610) = 6.70, p^ = 0.043, g = 0.20, PR = 58). The difference, though, was not statistically significant after the Benjamini-Hochberg correction for multiple comparisons that the WWC applies to secondary contrasts (exploratory analyses). It should be noted, however, that according to the Institute of Education Sciences (IES) at the U.S. Department of Education (Schochet, 2008), multiplicity adjustments are not required for exploratory analyses.
Therefore, based on the IES guidance, the ELL outcome, which was still statistically significant after the cluster correction, should still be considered a meaningful finding. Furthermore, even though Phase 2 schools had an advantage on the pretest overall and for all but the IEP subgroup, after controlling for pretest differences, Phase 1 schools outperformed Phase 2 schools on the posttest for all groups, although no posttest effect size was substantively important.

Table 13: PASS Open-Ended Questions, Spring 2014: Mean Percent Correct Comparison of Phase 1 (Treatment) and Phase 2 (Control) Elementary Cohort Schools (N = 66): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 1 Adj. M | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | Phase 2 Adj. M | F | p | p^ | p^^ | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 35 | 1,409 | 65.87 | 21.09 | 66.39 | 31 | 1,176 | 65.12 | 20.18 | 64.50 | 6.32 | 0.012* | 0.240 | | 0.09 | 54 |
| IEP | 28 | 133 | 56.27 | 22.99 | 55.97 | 21 | 90 | 51.48 | 22.16 | 51.92 | 1.91 | 0.168 | | | 0.18 | 57 |
| ELL | 32 | 370 | 61.89 | 22.38 | 62.38 | 28 | 247 | 58.77 | 20.54 | 58.05 | 6.70 | 0.010* | 0.043* | 0.013 | 0.20 | 58 |
| FRL | 32 | 890 | 63.43 | 21.69 | 63.69 | 28 | 659 | 61.51 | 20.35 | 61.15 | 6.34 | 0.012* | 0.134 | | 0.12 | 55 |
| Female | 35 | 693 | 67.68 | 20.07 | 68.21 | 31 | 584 | 67.01 | 19.47 | 66.37 | 3.22 | 0.073 | | | 0.09 | 54 |

* p < 0.05.
p^ = Clustering-corrected statistical significance.
p^^ = Benjamini-Hochberg correction for multiple comparisons of the clustering-corrected statistical significance. To remain statistically significant after the multiple comparison correction, p^ ≤ p^^.
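The effect sizes (g) reported in Table 13 can be approximated from the adjusted means and unadjusted standard deviations. The following is a minimal sketch, assuming the WWC convention of dividing the adjusted mean difference by the pooled unadjusted standard deviation and applying the small-sample correction 1 - 3/(4N - 9); the report does not spell out its formula, and the function name hedges_g is hypothetical.

```python
import math

def hedges_g(adj_mean_t, adj_mean_c, sd_t, sd_c, n_t, n_c):
    """Hedges' g from adjusted means and unadjusted SDs (assumed WWC convention)."""
    # Pooled within-group variance from the unadjusted standard deviations.
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    # Small-sample bias correction.
    omega = 1 - 3 / (4 * (n_t + n_c) - 9)
    return omega * (adj_mean_t - adj_mean_c) / math.sqrt(pooled_var)

# Values taken from the All and ELL rows of Table 13.
g_all = hedges_g(66.39, 64.50, 21.09, 20.18, 1409, 1176)
g_ell = hedges_g(62.38, 58.05, 22.38, 20.54, 370, 247)
print(round(g_all, 2), round(g_ell, 2))
```

Under these assumptions the sketch gives values that round to the g entries shown for the All and ELL rows.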

Middle Cohort Spring 2014 PASS Open-Ended Results: All Regions

Across the middle schools in the three regions, the ANCOVA adjusted means presented in Table 14 indicated no statistically significant difference between Phase 1 and Phase 2 schools overall (i.e., the All group) (F(1, 1520) = 0.51, p = 0.477, g = 0.03, PR = 51). While no subgroup comparison was statistically significant, the ELL subgroup produced an effect size that was substantively important favoring Phase 2 schools (g = 0.32) (with the ELL subgroup in Phase 1 schools having an advantage (g = 0.31) on the pretest).

Table 14: PASS Open-Ended Questions, Spring 2014: Mean Percent Correct Comparison of Phase 1 (Treatment) and Phase 2 (Control) Middle Cohort Schools (N = 15): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 1 Adj. M | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | Phase 2 Adj. M | F | p | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 8 | 832 | 85.32 | 15.49 | 85.08 | 7 | 695 | 84.32 | 15.32 | 84.60 | 0.51 | 0.477 | 0.03 | 51 |
| IEP | 7 | 83 | 70.92 | 20.04 | 70.28 | 6 | 71 | 69.48 | 23.55 | 70.24 | 0.00 | 0.992 | 0.00 | 50 |
| ELL | 7 | 44 | 66.67 | 19.40 | 66.25 | 7 | 64 | 72.40 | 20.12 | 72.68 | 2.77 | 0.099 | -0.32 | 37 |
| FRL | 8 | 490 | 82.91 | 16.74 | 63.69 | 7 | 380 | 80.02 | 17.01 | 61.15 | 1.09 | 0.297 | 0.15 | 56 |
| Female | 8 | 693 | 67.68 | 20.07 | 68.21 | 7 | 584 | 67.01 | 19.47 | 66.37 | 0.36 | 0.549 | 0.09 | 54 |

* p < 0.05.

Elementary and Middle Cohorts PASS Performance Task Analyses: All Regions

A set of ANCOVA analyses intended to generate pairs of adjusted percentage correct scores and to compute the treatment effect sizes (g) was conducted by Phase within cohort on the PASS PT outcomes for all elementary (Phase 1 = 35, Phase 2 = 32) and middle schools (Phase 1 = 8, Phase 2 = 7), as well as for subgroups categorized by their Special Education (IEP) status, English Language Learner (ELL) status, Economically Disadvantaged (FRL) status, and Gender.
Elementary Cohort Spring 2014 PASS Performance Task Results: All Regions

For the elementary cohort schools across the three regions, the ANCOVA adjusted means presented in Table 15 demonstrate no statistically significant difference between Phase 1 and Phase 2 schools overall (i.e., the All group) after applying the cluster correction (F(1, 2594) = 6.28, p^ = 0.355, g = 0.09, PR = 54), indicating that the average Phase 1 student scored at the 54th percentile of the control group. In addition, the effect size was not considered to be substantively important according to WWC standards. On the other hand, both the IEP (F(1, 219) = 10.16, p^ = 0.014, g = 0.39, PR = 65) and ELL (F(1, 602) = 15.54, p^ = 0.039, g = 0.30, PR = 62) subgroups in Phase 1 schools demonstrated statistically significant outcomes after the cluster correction, but neither difference remained statistically significant after the Benjamini-Hochberg correction for multiple comparisons that the WWC applies to secondary contrasts (exploratory analyses). Furthermore, effect sizes for both subgroups were substantively important. The IEP subgroup, however, did not meet the WWC cluster-level attrition standard, but did demonstrate baseline equivalence with the analytic sample. Again, based on the IES guidance that adjustments for multiple comparisons are not required for exploratory analyses (Schochet, 2008), the IEP and ELL outcomes, which remained statistically significant after the cluster correction, should still be considered meaningful findings.
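The percentile rank (PR) values quoted alongside each effect size can be read as the percentile of the control-group distribution at which the average Phase 1 student would fall. A minimal sketch, assuming normally distributed scores so that PR is the standard normal cumulative probability at g, expressed as a percentage (the function name percentile_rank is hypothetical, and the report does not state how its PR values were computed):

```python
import math

def percentile_rank(g):
    """Percentile of the control distribution for the average treatment student.

    Assumes normally distributed outcomes, so PR = 100 * Phi(g).
    """
    return 100 * 0.5 * (1 + math.erf(g / math.sqrt(2)))

# Effect sizes quoted in the text for the All, IEP, and ELL groups.
for g in (0.09, 0.39, 0.30):
    print(g, round(percentile_rank(g)))
```

With this assumption, g = 0.09, 0.39, and 0.30 map to the 54th, 65th, and 62nd percentiles, in line with the PR values quoted above. Because the reported g values are rounded, a recomputed PR can differ by a point from a tabled value.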

Meanwhile, even though Phase 2 schools had a statistically significant advantage on the pretest overall and for the ELL subgroup, for both groups, after controlling for statistically significant pretest differences (All, g = 0.09, and ELL, g = 0.18), there was no statistically significant difference between Phase 1 and Phase 2 schools on the posttest. Furthermore, both the IEP (g = 0.39) and ELL (g = 0.30) subgroups had substantively important posttest effect sizes. In addition, after controlling for the advantage of the Phase 2 FRL (g = 0.09) and Female (g = 0.11) subgroups on the pretest, the Phase 1 FRL and Female subgroups were able to demonstrate small but positive effect sizes on the posttest (g = 0.14 and g = 0.06, respectively).

Table 15: PASS Performance Task Questions, Spring 2014: Mean Percent Correct Comparison of Phase 1 (Treatment) and Phase 2 (Control) Elementary Cohort Schools (N = 67): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 1 Adj. M | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | Phase 2 Adj. M | F | p | p^ | p^^ | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 35 | 1,429 | 66.16 | 15.50 | 66.55 | 32 | 1,172 | 65.56 | 16.82 | 65.09 | 6.28 | 0.012* | 0.355 | | 0.09 | 54 |
| IEP | 28 | 132 | 60.29 | 16.26 | 59.96 | 22 | 94 | 52.25 | 21.16 | 52.72 | 10.16 | 0.002* | 0.014* | 0.004 | 0.39 | 65 |
| ELL | 32 | 371 | 63.20 | 14.85 | 63.55 | 29 | 238 | 59.17 | 17.95 | 58.63 | 15.54 | <0.001* | 0.039* | 0.008 | 0.30 | 62 |
| FRL | 32 | 895 | 63.75 | 15.47 | 63.95 | 29 | 654 | 61.93 | 17.56 | 61.66 | 8.37 | 0.004* | 0.157 | | 0.14 | 56 |
| Female | 35 | 703 | 67.15 | 14.82 | 67.57 | 32 | 581 | 67.22 | 16.26 | 66.70 | 1.18 | 0.278 | | | 0.06 | 52 |

* p < 0.05.
p^ = Clustering-corrected statistical significance.
p^^ = Benjamini-Hochberg correction for multiple comparisons of the clustering-corrected statistical significance. To remain statistically significant after the multiple comparison correction, p^ ≤ p^^.

Middle Cohort Spring 2014 PASS Performance Task Results: All Regions

Across the middle schools in the three regions, the ANCOVA adjusted means presented in Table 16 demonstrate no statistically significant difference between Phase 1 and Phase 2 schools overall (i.e., the All group) after applying the cluster correction (F(1, 1401) = 19.09, p < 0.001, p^ = 0.516, g = 0.12, PR = 58), indicating that the average Phase 1 student scored at the 58th percentile of the control group. In addition, the effect size was not considered to be substantively important according to WWC standards. Furthermore, while there were no statistically significant differences for any subgroups after applying the cluster correction, for the All and IEP subgroups, after controlling for a Phase 2 advantage on the pretest (g = 0.55 and g = 0.53, respectively), Phase 1 schools outperformed Phase 2 schools on the posttest (g = 0.12 and g = 0.19, respectively). Meanwhile, although Phase 1 schools had advantages on the pretest for three additional subgroups (ELL, FRL, and Female) that were either statistically significant or substantively important (ELL, g = 0.08; FRL, g = 0.13; and Female, g = 0.06), the Phase 1 advantages were even stronger on the posttest, with substantively important effects for the ELL (g = 0.37) and FRL (g = 0.27) subgroups and nearly substantively important effects for the Female subgroup (g = 0.23).

Table 16: PASS Performance Task Questions, Spring 2014: Mean Percent Correct Comparison of Phase 1 (Treatment) and Phase 2 (Control) Middle Cohort Schools (N = 15): All Regions

| Group | Phase 1 Schools | Phase 1 Students | Phase 1 M | Phase 1 SD | Phase 1 Adj. M | Phase 2 Schools | Phase 2 Students | Phase 2 M | Phase 2 SD | Phase 2 Adj. M | F | p | p^ | g | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 8 | 772 | 58.64 | 24.49 | 58.81 | 7 | 636 | 53.95 | 23.01 | 53.74 | 19.09 | <0.001* | 0.516 | 0.12 | 58 |
| IEP | 7 | 84 | 40.97 | 20.43 | 41.02 | 5 | 61 | 37.32 | 19.65 | 37.25 | 1.28 | 0.260 | | 0.19 | 57 |
| ELL | 7 | 42 | 45.10 | 20.45 | 44.48 | 7 | 50 | 36.94 | 18.45 | 37.21 | 3.46 | 0.066 | | 0.37 | 65 |
| FRL | 8 | 465 | 55.75 | 23.31 | 55.26 | 7 | 338 | 48.45 | 21.70 | 49.13 | 17.19 | <0.001* | 0.170 | 0.27 | 61 |
| Female | 8 | 405 | 61.68 | 24.33 | 61.40 | 7 | 328 | 55.69 | 22.94 | 56.04 | 11.21 | 0.001* | 0.287 | 0.23 | 59 |

* p < 0.05.
p^ = Clustering-corrected statistical significance.

References

Schochet, P. Z. (2008). Technical Methods Report: Guidelines for Multiple Testing in Impact Evaluations (NCEE 2008-4018). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/pdf/20084018.pdf

Wasserstein, R. L., & Lazar, N. A. (2016). The ASA's Statement on p-Values: Context, Process, and Purpose. The American Statistician, 70:2, 129-133. doi: 10.1080/00031305.2016.1154108

What Works Clearinghouse (2014). Procedures and standards handbook (Version 3.0). Washington, DC: Author. Retrieved from ies.ed.gov/ncee/wwc/pdf/reference_resources/wwc_procedures_v3_0_standards_handbook.pdf

What Works Clearinghouse (2016). Reviewer Guidance for Use with the Procedures and Standards Handbook (Version 3.0). Washington, DC: Author. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/reference_resources/wwc_reviewer_guidance_030416.pdf

Appendix

Table A-1: Clustering Correction for Mismatched Analyses

Because random assignment was carried out at the cluster level (i.e., school level) but the analyses were conducted at the student level, a clustering correction was applied to the p values for statistically significant outcomes of the ANCOVA analyses to calculate clustering-corrected statistical significance levels (p values).

| Cohort | Subtest | Group | Treatment (Phase 1) | Control (Phase 2) | g | Uncorrected p | M | ICC | t | t_a | df | Corrected p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Elementary | PASS OE | All | 1,409 | 1,176 | 0.09 | 0.012 | 94 | 0.10 | 2.278617 | 1.1744776 | 2014.564 | 0.240 |
| Elementary | PASS OE | ELL | 370 | 247 | 0.20 | 0.010 | 83 | 0.07 | 2.434091 | 2.0241855 | 596.9861 | 0.043* |
| Elementary | PASS OE | FRL | 890 | 659 | 0.12 | 0.012 | 87 | 0.08 | 2.335034 | 1.5001646 | 1384.108 | 0.133 |
| Elementary | PASS PT | All | 1,429 | 1,172 | 0.09 | 0.012 | 94 | 0.19 | 2.283769 | 0.9253471 | 1332.025 | 0.354 |
| Elementary | PASS PT | IEP¹ | 132 | 94 | 0.39 | 0.002 | 72 | 0.17 | 2.889757 | 2.4783267 | 211.5658 | 0.013* |
| Elementary | PASS PT | ELL | 371 | 238 | 0.30 | <0.001 | 83 | 0.32 | 3.612335 | 2.0716161 | 369.8459 | 0.038* |
| Elementary | PASS PT | FRL | 895 | 654 | 0.14 | 0.004 | 87 | 0.16 | 2.721465 | 1.4145159 | 1086.117 | 0.157 |
| Middle | PASS MC | IEP¹ | 111 | 114 | 0.28 | 0.016 | 16 | 0.12 | 2.09981 | 1.2978464 | 189.7401 | 0.195 |
| Middle | PASS PT | All¹ | 772 | 636 | 0.12 | <0.001 | 22 | 0.17 | 2.240872 | 0.6498379 | 518.5035 | 0.516 |
| Middle | PASS PT | FRL¹ | 465 | 338 | 0.27 | <0.001 | 22 | 0.18 | 3.777381 | 1.3750417 | 381.8819 | 0.169 |
| Middle | PASS PT | Female¹ | 405 | 328 | 0.23 | 0.001 | 22 | 0.23 | 3.09628 | 1.0658156 | 285.8985 | 0.287 |

¹ Subgroup did not meet the WWC cluster-level attrition standard. All subgroups demonstrated baseline equivalence.
* p < 0.05 after clustering correction.
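The corrected t statistic (t_a) and degrees of freedom (df) in Table A-1 appear to follow the clustering correction described in the WWC Procedures and Standards Handbook (Version 3.0). The sketch below assumes that formula, takes N as the sum of the treatment and control student counts, M as the number of clusters, and ICC as the intraclass correlation; the function names are hypothetical. The corrected p value would then be the two-tailed probability of t_a under a t distribution with the corrected df, which is not computed here.

```python
import math

def cluster_corrected_t(t, n_total, n_clusters, icc):
    """Adjust a student-level t statistic for school-level assignment."""
    avg = n_total / n_clusters  # average number of students per cluster
    num = (n_total - 2) - 2 * (avg - 1) * icc
    den = (n_total - 2) * (1 + (avg - 1) * icc)
    return t * math.sqrt(num / den)

def cluster_corrected_df(n_total, n_clusters, icc):
    """Degrees of freedom that accompany the corrected t statistic."""
    avg = n_total / n_clusters
    num = ((n_total - 2) - 2 * (avg - 1) * icc) ** 2
    den = ((n_total - 2) * (1 - icc) ** 2
           + avg * (n_total - 2 * avg) * icc ** 2
           + 2 * (n_total - 2 * avg) * icc * (1 - icc))
    return num / den

# Elementary PASS OE, All group (first row of Table A-1).
n_students = 1409 + 1176
t_a = cluster_corrected_t(2.278617, n_students, 94, 0.10)
h = cluster_corrected_df(n_students, 94, 0.10)
print(round(t_a, 3), round(h, 1))
```

With the ICC entered as the rounded 0.10, the sketch lands close to, but not exactly on, the tabled t_a and df. An ICC of roughly 0.104 reproduces them, which suggests the table reports a rounded ICC.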

Table A-2: Benjamini-Hochberg Correction for Multiple Comparisons

Because the What Works Clearinghouse applies a multiple comparison correction for both primary (confirmatory) and secondary (exploratory) contrasts, the Benjamini-Hochberg correction was applied to the following secondary contrasts that remained statistically significant after the cluster correction.

| Cohort | Subtest | Group | Clustering Corrected p value (p_x) | p value Rank (x) | Alpha | x*alpha | Total Number of Tests | New Critical p value (p_x' = x*alpha / Total Number of Tests) | Finding p value < New Critical p value? (p_x ≤ p_x') | Statistical Significance after BH Correction? |
|---|---|---|---|---|---|---|---|---|---|---|
| Elementary | PASS OE | ELL | 0.043 | 3 | 0.05 | 0.15 | 12 | 0.013 | No | No |
| Elementary | PASS PT | IEP¹ | 0.014 | 1 | 0.05 | 0.05 | 12 | 0.004 | No | No |
| Elementary | PASS PT | ELL | 0.039 | 2 | 0.05 | 0.10 | 12 | 0.008 | No | No |

¹ Subgroup did not meet the WWC cluster-level attrition standard. All subgroups demonstrated baseline equivalence.
Note: To remain statistically significant after the multiple comparison correction, the Clustering Corrected p value ≤ New Critical p value (p_x ≤ p_x').
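The critical values in Table A-2 can be reproduced with a few lines of arithmetic. This is a minimal sketch of the row-by-row comparison shown in the table, assuming the three listed p values are the smallest of the 12 tests so that they take ranks 1 through 3; the function name bh_critical_values is hypothetical. The full Benjamini-Hochberg procedure is a step-up rule, but the two coincide here because no p value falls at or below its critical value.

```python
def bh_critical_values(p_values, total_tests, alpha=0.05):
    """Return (p, rank, critical value, passes) tuples, smallest p first.

    Assumes the supplied p values are the smallest among all tests.
    """
    rows = []
    for rank, p in enumerate(sorted(p_values), start=1):
        critical = rank * alpha / total_tests  # x * alpha / total number of tests
        rows.append((p, rank, critical, p <= critical))
    return rows

# Clustering-corrected p values from Table A-2, with 12 tests in total.
rows = bh_critical_values([0.043, 0.014, 0.039], total_tests=12)
for p, rank, critical, passes in rows:
    print(p, rank, round(critical, 4), passes)
```

None of the three contrasts passes, which agrees with the "No" entries in the table.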

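Table A-3: Elementary Cohort Cluster-Level Attrition

Differential Attrition is a single value per test and is shown in the Overall column.

| Group | Cluster Category | MC Phase 1 | MC Phase 2 | MC Overall | OE Phase 1 | OE Phase 2 | OE Overall | PT Phase 1 | PT Phase 2 | PT Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| All | Total | 54 | 51 | 105 | 35 | 34 | 69 | 35 | 34 | 69 |
| All | Dropped | 3 | 8 | 11 | 0 | 3 | 3 | 0 | 2 | 2 |
| All | Attrition Rate | 5.6% | 15.7% | 10.5% | 0.0% | 8.8% | 4.3% | 0.0% | 5.9% | 2.9% |
| All | Differential Attrition | | | 10.1% | | | 8.8% | | | 5.9% |
| All | Final Count | 51 | 43 | 94 | 35 | 31 | 66 | 35 | 32 | 67 |
| IEP | Total | 51 | 47 | 98 | 33 | 32 | 65 | 33 | 32 | 65 |
| IEP | Dropped | 9 | 17 | 26 | 5 | 11 | 16 | 5 | 10 | 15 |
| IEP | Attrition Rate | 17.6% | 36.2% | 26.5% | 15.2% | 34.4% | 24.6% | 15.2% | 31.3% | 23.1% |
| IEP | Differential Attrition | | | 18.5% | | | 19.2% | | | 16.1% |
| IEP | Final Count | 42 | 30 | 72 | 28 | 21 | 49 | 28 | 22 | 50 |
| ELL | Total | 52 | 48 | 100 | 35 | 33 | 68 | 35 | 33 | 68 |
| ELL | Dropped | 7 | 10 | 17 | 3 | 5 | 8 | 3 | 4 | 7 |
| ELL | Attrition Rate | 13.5% | 20.8% | 17.0% | 8.6% | 15.2% | 11.8% | 8.6% | 12.1% | 10.3% |
| ELL | Differential Attrition | | | 7.4% | | | 6.6% | | | 3.5% |
| ELL | Final Count | 45 | 38 | 83 | 32 | 28 | 60 | 32 | 29 | 61 |
| FRL | Total | 51 | 48 | 99 | 33 | 31 | 64 | 33 | 31 | 64 |
| FRL | Dropped | 4 | 8 | 12 | 1 | 3 | 4 | 1 | 2 | 3 |
| FRL | Attrition Rate | 7.8% | 16.7% | 12.1% | 3.0% | 9.7% | 6.3% | 3.0% | 6.5% | 4.7% |
| FRL | Differential Attrition | | | 8.8% | | | 6.6% | | | 3.4% |
| FRL | Final Count | 47 | 40 | 87 | 32 | 28 | 60 | 32 | 29 | 61 |
| Females | Total | 54 | 51 | 105 | 35 | 34 | 69 | 35 | 34 | 69 |
| Females | Dropped | 3 | 8 | 11 | 0 | 3 | 3 | 0 | 2 | 2 |
| Females | Attrition Rate | 5.6% | 15.7% | 10.5% | 0.0% | 8.8% | 4.3% | 0.0% | 5.9% | 2.9% |
| Females | Differential Attrition | | | 10.1% | | | 8.8% | | | 5.9% |
| Females | Final Count | 51 | 43 | 94 | 35 | 31 | 66 | 35 | 32 | 67 |

Note: Only the IEP subgroup did not meet the WWC cluster-level attrition standard.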
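The attrition figures in Table A-3 follow from the cluster counts. A minimal sketch, assuming the attrition rate is dropped clusters divided by total clusters and differential attrition is the absolute difference between the Phase 1 and Phase 2 rates (the function name attrition is hypothetical):

```python
def attrition(total_t, dropped_t, total_c, dropped_c):
    """Return (Phase 1 rate, Phase 2 rate, overall rate, differential) as proportions."""
    rate_t = dropped_t / total_t
    rate_c = dropped_c / total_c
    overall = (dropped_t + dropped_c) / (total_t + total_c)
    differential = abs(rate_t - rate_c)
    return rate_t, rate_c, overall, differential

# MC cluster counts for the All group in Table A-3.
rates = attrition(total_t=54, dropped_t=3, total_c=51, dropped_c=8)
print([f"{100 * r:.1f}%" for r in rates])
```

For these counts the sketch gives 5.6%, 15.7%, 10.5%, and 10.1%, the same values listed for the All group under MC.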

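Table A-4: Middle Cohort Cluster-Level Attrition

Differential Attrition is a single value per test and is shown in the Overall column.

| Group | Cluster Category | MC Phase 1 | MC Phase 2 | MC Overall | OE Phase 1 | OE Phase 2 | OE Overall | PT Phase 1 | PT Phase 2 | PT Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| All | Total | 13 | 17 | 30 | 8 | 8 | 16 | 8 | 8 | 16 |
| All | Dropped | 2 | 6 | 8 | 0 | 1 | 1 | 0 | 1 | 1 |
| All | Attrition Rate | 15.4% | 35.3% | 26.7% | 0.0% | 12.5% | 6.3% | 0.0% | 12.5% | 6.3% |
| All | Differential Attrition | | | 19.9% | | | 12.5% | | | 12.5% |
| All | Final Count | 11 | 11 | 22 | 8 | 7 | 15 | 8 | 7 | 15 |
| IEP | Total | 12 | 12 | 24 | 7 | 7 | 14 | 7 | 6 | 13 |
| IEP | Dropped | 2 | 6 | 8 | 0 | 1 | 1 | 0 | 1 | 1 |
| IEP | Attrition Rate | 16.7% | 50.0% | 33.3% | 0.0% | 14.3% | 7.1% | 0.0% | 16.7% | 7.7% |
| IEP | Differential Attrition | | | 33.3% | | | 14.3% | | | 16.7% |
| IEP | Final Count | 10 | 6 | 16 | 7 | 6 | 13 | 7 | 5 | 12 |
| ELL | Total | 12 | 17 | 29 | 7 | 8 | 15 | 7 | 8 | 15 |
| ELL | Dropped | 2 | 6 | 8 | 0 | 1 | 1 | 0 | 1 | 1 |
| ELL | Attrition Rate | 16.7% | 35.3% | 27.6% | 0.0% | 12.5% | 6.7% | 0.0% | 12.5% | 6.7% |
| ELL | Differential Attrition | | | 18.6% | | | 12.5% | | | 12.5% |
| ELL | Final Count | 10 | 11 | 21 | 7 | 7 | 14 | 7 | 7 | 14 |
| FRL | Total | 13 | 17 | 30 | 8 | 8 | 16 | 8 | 8 | 16 |
| FRL | Dropped | 2 | 6 | 8 | 0 | 1 | 1 | 0 | 1 | 1 |
| FRL | Attrition Rate | 15.4% | 35.3% | 26.7% | 0.0% | 12.5% | 6.3% | 0.0% | 12.5% | 6.3% |
| FRL | Differential Attrition | | | 19.9% | | | 12.5% | | | 12.5% |
| FRL | Final Count | 11 | 11 | 22 | 8 | 7 | 15 | 8 | 7 | 15 |
| Females | Total | 13 | 17 | 30 | 8 | 8 | 16 | 8 | 8 | 16 |
| Females | Dropped | 2 | 6 | 8 | 0 | 1 | 1 | 0 | 1 | 1 |
| Females | Attrition Rate | 15.4% | 35.3% | 26.7% | 0.0% | 12.5% | 6.3% | 0.0% | 12.5% | 6.3% |
| Females | Differential Attrition | | | 19.9% | | | 12.5% | | | 12.5% |
| Females | Final Count | 11 | 11 | 22 | 8 | 7 | 15 | 8 | 7 | 15 |

Note: Neither the Overall sample nor any subgroup met the WWC cluster-level attrition standard.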

Table A-5: What Works Clearinghouse (WWC) Allowable Overall and Differential Attrition Rates

Note: Reproduces Table III.1 in the WWC Procedures and Standards Handbook (Version 3.0): Highest Differential Attrition for a Sample to Maintain Low Attrition, by Overall Attrition, Under Liberal and Conservative Assumptions