Chapter 6: Assessment of Student Learning

Introduction

This chapter will address the nature of assessment and the purposes of assessment at different levels in the educational system, from the classroom, to the district and state, to the national and international levels. The big idea of assessment is that assessments are cyclical in nature: teachers and students use assessment to monitor student progress, which in turn informs instructional decisions that support learning. This chapter will discuss the role of the teacher and the role of the student in the assessment-instruction cycle. It will also address a variety of assessments designed to test student mastery of higher-order thinking and the integral role of the Investigation and Experimentation standards as part of assessment.

Results from classroom assessments provide quality feedback to teachers. This chapter will address using data and results to improve student learning, inform and guide instruction, and research teaching practices. It will also cover strategies for the assessment of English learners and students with special needs, as well as information about the current statewide assessment system in science.

I. The Nature of Assessment

The nature of assessment is the essence of knowing what students should know, do, and understand. In science, assessments should provide students the opportunity to demonstrate their understanding of important and meaningful science content, to use scientific tools and processes, and to apply their knowledge and understanding to real-life situations. Assessment does not exist in isolation; it must be closely aligned with the goals of state standards, curriculum, and instruction to support learning. Current research calls for balanced assessment systems characterized by:

- Comprehensiveness: the use of multiple sources of evidence to draw inferences about an individual student's proficiency;
- Coherence: a shared model of learning that links curriculum, instruction, and assessment within the classroom, and links the classroom with large-scale assessments; and
- Continuity between classroom, district, and state assessments, calling for a longitudinal assessment of learning progress over time. [i]

Assessment, testing, and educational measurement are often used interchangeably to refer to a process by which educators use students' responses to stimuli in order to draw inferences about students' knowledge and skills. [ii] While testing usually refers to standardized multiple-choice instruments, assessment per se denotes a more comprehensive view of student performance. [iii]

The terms assessment and evaluation are also used interchangeably in many contexts; here, assessment refers to the judgment of student performance, and evaluation refers to the judgment of program or organizational effectiveness. [iv]

II. Purpose of Assessment

Assessment is a systematic, multi-step process involving the collection and interpretation of educational data. As the primary feedback mechanisms in the educational system, assessments provide information to students about how well they are performing; to teachers about how well their students are learning; to districts about the effectiveness of their teachers and programs; and to policymakers about the effects of their policies. The intent of this feedback is to allow stakeholders within the educational system to make informed decisions regarding improved student learning, teacher development, program modifications, and changes in policy. [v]

The purposes of assessment can be categorized into three main areas: (1) support of student learning; (2) certification, which includes reporting individual student achievement, placement, and/or promotion; and (3) accountability, which is designed to evaluate programs and inform policy. The first purpose focuses primarily on formative and summative classroom assessments, while the second and third are geared more toward large-scale assessments, including district, state, and national tests.

Formative and Summative Classroom Assessments

Formative assessment is defined as assessment carried out during the instructional process for the purpose of improving teaching or learning. [vi] Assessment becomes formative only when either the teacher or the student uses that information to inform teaching and/or to influence learning. [vii] Formative assessments are informal, ongoing assessments that provide continuous opportunities for teachers to observe, question, listen, and provide immediate feedback to students. Formative assessment also provides opportunities for students to become more involved in the assessment process and to become self-reflective about their own learning.

The line between instruction and assessment is blurred in classrooms where formative assessment is used to support learning. Everything students do, from conversing in groups, completing seatwork, and answering and asking questions, to sitting quietly and looking confused, is a potential source of information about what they do and do not understand. The teacher who consciously uses assessment to support learning takes in this information, analyzes it, and makes instructional decisions that address the understandings and misunderstandings that these assessments reveal. [viii]

While formative assessments occur minute-by-minute and day-by-day, summative assessments are cumulative assessments, usually occurring at the end of a unit of instruction. Designed to measure what a student has learned after a certain period of time, summative assessments are administered less frequently than formative assessments. Teachers also use summative assessments as pretests to see what students understand before they teach a unit of instruction and as posttests afterwards to see what students learned as a result of their instruction. Summative assessments are also used for reporting grades at the end of a semester. Summative classroom assessments should (1) enable students to draw on what they have learned to explain new phenomena, think critically, and make informed decisions [ix], and (2) consist of multiple measures including hands-on performance tasks, constructed response investigations, long and short essays, portfolios, interactive computer tasks, and well-constructed multiple-choice tests.

District Summative Assessments

School districts administer summative tests to students throughout the year to determine whether students are learning the grade-level science content recommended in the state standards and to evaluate the district science program. Examples of the types of summative science assessments used in school districts in California include benchmark and interim assessments and end-of-course tests.

Benchmark and interim assessments are used to monitor progress during the school year toward meeting state standards and NCLB performance goals. [x] These assessments usually consist of multiple-choice questions and are administered at the end of every quarter. These types of assessments focus on program evaluation and provide teachers with information about which science standards students have mastered. Current research does not show that benchmark or interim assessments help to improve student learning or achievement in science. [xi]

End-of-course tests are used by districts at the high school level to determine the content learned by students as a result of taking a specific course of study. Districts also implement end-of-course tests to: establish the effectiveness of the curriculum in each science domain; ensure that course content is focused on state standards; establish a common level of expected student performance; ensure that evaluation of student performance is consistent across classrooms and schools in the district; and help identify students who need additional help to meet graduation requirements.

Districts participating in state-funded projects also administer summative assessments. For example, districts participating in the California Mathematics and Science Partnership (CaMSP) program are required to administer a standards-based assessment as a pretest and posttest to students in both treatment and control groups at the beginning and end of the school year. The analyses of the pretest and posttest results are used to determine whether the treatment teachers' training makes a difference in student learning and achievement in science. Districts use a variety of multiple-choice tests aligned to enduring grade-level standards.

State Summative Assessments

The California Standards Tests (CSTs) are summative assessments that measure student achievement of the Science Content Standards for California Public Schools. The CSTs are a battery of standardized tests that comprise the state's STAR (Standardized Testing and Reporting) Program. All students in grades two through eleven participate in the STAR Program, including students with disabilities and students who are English learners. Section VII of this chapter addresses the science portion of the CST in more depth.

National and International Assessments

The National Assessment of Educational Progress (NAEP), also known as the Nation's Report Card, measures fourth-, eighth-, and twelfth-grade students' performance in science with assessments designed specifically for national and state information needs.

The NAEP assessment contains multiple measures including multiple-choice items, constructed response questions, hands-on performance tasks, and interactive computer tasks. All of the components of NAEP are aligned to the content recommendations in the 2009 NAEP Science Framework. Participation in NAEP allows states to compare the achievement of their students to the achievement of students in other states.

The international assessments, the Trends in International Mathematics and Science Study (TIMSS) and the Program for International Student Assessment (PISA), enable the United States to benchmark its performance in fourth- and eighth-grade mathematics and science (TIMSS) and in 15-year-old students' mathematics, science, and reading literacy (PISA) against that of other countries. Each test was designed to serve a different purpose, and each is based on a separate and unique framework and set of assessment questions, although the content areas assessed and the ages and grade levels of the students are quite similar.

The definitions of science differ among the three assessments:

- The NAEP framework defines science as physical science, life science, and earth science. NAEP identifies three categories of knowing and doing science: conceptual understanding, scientific investigation, and practical reasoning.
- TIMSS also includes life science and earth science, but splits physical science into separate domains for physics and chemistry.
- PISA takes a broader approach than both NAEP and TIMSS in addressing important competencies required for scientific literacy: identifying scientific issues, explaining scientific phenomena, and using scientific evidence. PISA's content can be divided into knowledge of the natural world (in the fields of life systems, physical systems, Earth and space systems, and technology systems) and knowledge about science itself (scientific inquiry and scientific explanations).

All three assessments are conducted regularly to allow the monitoring of student outcomes over time.

III. Assessment of Student Learning

Quality classroom assessment informs instruction and improves student understanding, learning, and achievement. Both formative and summative assessments make up quality classroom assessment. Formative assessment is defined as a planned, ongoing, dynamic process in which teachers and students use evidence to adjust teaching and learning. Measurements of student learning, such as scores from a summative test, are just one component of the formative assessment process. While informal or formal assessments play a role in this process, they are not the process itself. [xii] In this chapter, assessment of student learning in science is defined as a process of formative assessment that integrates instruction with multiple measures of student ability, including a variety of techniques for various learning styles and levels of readiness.

The research base clearly links the process of formative assessment to improved student learning and achievement. A synthesis of more than 4,000 research studies undertaken during the last 40 years consistently shows that, when implemented well, formative assessment can improve student learning and achievement more than almost any other educational intervention. [xiii] The big idea of formative assessment is that evidence about student learning is used to adjust instruction to better meet student needs; in other words, teaching is adaptive to the student's learning needs and assessment is done in real time. [xiv] When teachers engage in formative assessment, the purpose of assessment changes from merely measuring what students know to enhancing student learning. [xv] In this new role, assessment is a shared responsibility between teachers and students.

The Teacher's Role in Classroom Assessment

The teacher's role in ongoing assessment is to facilitate student growth, understanding, and learning. Teachers use continuous assessment to improve classroom practice, plan curricula, develop self-directed learners, report student progress, and investigate their own teaching practices. [xvi] While numerous strategies can be used for formative assessment, current research shows that higher-level questioning, descriptive feedback, student self-assessment and reflection, and student self-regulated learning all have a positive effect on student achievement and on students' ability to transfer their learning to new situations. [xvii]

Good questioning is at the heart of classroom practice; teachers spend at least 80 percent of their time on any given day engaged in questioning. Research shows that questioning can improve student learning when teachers: (1) structure questions around information that is critical to the topic, not around information that is merely interesting or unusual; (2) ask higher-level questions that require students to analyze, synthesize, and apply information rather than simply recall facts; (3) provide students with wait time after a question so that they have time to think about their response; and (4) help students establish a mental map to process their learning experiences. [xviii]

Teacher feedback is central to formative assessment. The effectiveness of formative assessment depends on both the quality of the information gathered and the quality of the feedback provided. In a study on teacher grading and feedback, researchers investigated the effectiveness of different kinds of feedback over a series of lessons. The students were randomly assigned to one of three groups: Group A received written feedback clearly describing what they did correctly, what was incorrect, and what was needed to improve their work; Group B received only grades derived from scoring; and Group C received grades and general comments. The scores for the students in Group A, who received constructive descriptive feedback, increased significantly from the first to the second lesson, while the scores for the students in Groups B and C declined between the first and the second lesson. [xix] Research shows that in order for feedback to be effective it should be: (1) corrective in nature, clearly describing what the student is doing that is correct, what is not correct, and what needs to be done to improve the work; (2) provided in a timely manner; and (3) specific to a criterion, referencing a particular level of skill or knowledge. [xx]

The Student's Role in Classroom Assessment

Students are ultimately responsible for taking action to bridge the gap between where they are and where they need to go in their learning. [xxi] Research shows that when students have insight into their own strengths and weaknesses and develop their own repertoires of strategies for learning, their learning improves. Self-assessment, peer-assessment, and self-regulation are metacognitive strategies that assist students in improving their own learning. [xxii]

Peer-assessment is a powerful complement to formative assessment. Student discourse during peer-assessment is valuable because it allows students to assume the role of the teacher. In the role of teacher, students have to make sure that they understand the content so that they can evaluate the understanding of their peers. [xxiii] As students become self-regulated learners, they are able to describe their strengths, analyze learning tasks to consider their options, explain their choices in completing their learning tasks, and regularly set goals for future learning. [xxiv]

In order for self-assessment, peer-assessment, and self-regulated learning to become effective components of student learning, students must understand the criteria used to evaluate their work and the difference between quality work and substandard work. Students should also be taught the habits and skills of the collaborative process used in peer-assessment, which requires them to see their work objectively. To become self-directed learners, students set their sights on their own learning goals and understand the steps they must go through in order to get there. [xxv]

Strategies and Techniques for Formative Assessment

Research maintains that the process of effective formative assessment consists of five key strategies. [xxvi] Figure 1 below outlines the five key strategies and one suggested technique for implementing each strategy. [xxvii]

Figure 1: Key Strategies and Techniques for Effective Formative Assessment

Strategy 1: Clarifying Learning Intentions and Sharing Criteria for Success
Technique: Sharing Exemplars
Description: Before asking 11th grade students to write a lab report, the teacher gives each student four sample lab reports representing varying degrees of quality. The lab reports are teacher-generated or from a previous class. Students are asked to analyze the reports and identify why certain reports are of a higher quality than others.

Strategy 2: Engineering Effective Classroom Discussions, Questions, and Learning Tasks that Elicit Evidence of Learning
Technique: White Boards
Description: During a 4th grade lesson on magnetism, the teacher asks the class what would happen if two like poles of two magnets were placed together. He asks the class to write their answers on their white boards and hold them up on the count of three. Using this kind of all-student response system helps the teacher get a sense of what students understand while requiring all students to engage in the task. If all answers are correct, the teacher moves on. If none are correct, the teacher may choose to re-teach the concept in another way. If there are a variety of answers, the teacher can use the information from the student answers to direct a class discussion.

Strategy 3: Providing Feedback that Moves Learners Forward
Technique: Find It and Fix It
Description: Students in a 7th grade classroom have just completed a task on plant and animal cells. Rather than checking all correct answers and putting a check next to incorrect ones, the teacher tells a student, "Three of your answers are incorrect; find them and fix them." This requires the student to engage cognitively in response to the feedback rather than reacting emotionally to a letter grade.

Strategy 4: Activating Students as Learners of Their Own Learning
Technique: Traffic Lighting
Description: After students in a 3rd grade class complete a lesson on energy and matter, they review the learning goal their teacher provided at the beginning of the lesson and hold up a colored circle to indicate their level of understanding. Green means "I understand"; yellow, "I'm not sure"; and red, "I do not understand." At regular intervals, the teacher provides time in class for students to move their learning forward by turning their reds to yellows and their yellows to greens.

Strategy 5: Activating Students as Instructional Resources for One Another
Technique: Pre-Flight Checklist
Description: For homework, students in a 9th grade class write a paper on a science-based societal issue. Before turning in their work, students trade papers with a peer. Each student completes a pre-flight checklist by comparing the peer's document against a list of required elements, e.g., identify a science-based societal issue, cite research studies, analyze data, and communicate findings.

As teachers utilize these key strategies and techniques for formative assessment and integrate them into their practice, they come to view their own practice in new ways.

Implementing and Sustaining Formative Assessment with Teacher Learning Communities

Formative assessment is not yet common practice in most teachers' classrooms, and changes in teacher practice are not always easy to implement. Furthermore, professional development in almost any aspect of assessment is sparse. By working with practicing classroom teachers in real time, researchers have identified practical suggestions for setting up Teacher Learning Communities (TLCs) to implement and sustain formative assessment. [xxviii] Figure 2 below outlines strategies found to be successful in establishing a TLC. [xxix]

Figure 2: Strategies for Implementing a Teacher Learning Community Around Classroom Assessment

Suggestion: Plan for the TLC to run for at least two years.
Rationale: Formative assessment is not a quick fix. It takes time to learn, practice, and refine the strategies.

Suggestion: Start with volunteers.
Rationale: Formative assessment cuts across many established practices in schools, and volunteers are more likely to find ways around obstacles.

Suggestion: Meet monthly for at least 75 minutes.
Rationale: Monthly meetings are better suited to teachers' schedules and time. To ensure that all individuals have adequate time to report and share, the meeting should last 75 minutes or longer.

Suggestion: Aim for a group of 8-10 teachers.
Rationale: When the group is too small, there are not enough differences of opinion to provide for good teacher learning. When the group is too large, all members may not have time to talk.

Suggestion: Try to group teachers with similar assignments.
Rationale: Teachers should work and share in small grade-level groups.

Suggestion: Establish building-based groups.
Rationale: While cross-building meetings can be productive, it is best to work within sites so that support can be maintained with a group of trusted colleagues.

Suggestion: Require teachers to make detailed, modest, individual action plans (creating an action plan).
Rationale: At the first meeting, each teacher should make a specific plan about what they want to change. Teachers should focus on a small number of changes they can integrate into their practice. The following questions are intended to help teachers format their own action plans:
1. What is one thing that you will find easy to change? What difference do you expect it to make to your practice?
2. What is one thing that you would like to change that will require support? What help would you need?
3. What other changes would you like to make later in the year? What help might you need?
4. What will you do differently, or stop doing, to implement these changes?

Suggestion: Designate a teacher leader to organize and coordinate meetings.
Rationale: Someone needs to make sure the meetings happen, e.g., secure a room, send out the agenda, arrange refreshments, and so on. This person should not be an expert. The idea of a TLC is that each person comes with a clear idea about what they want help with, and the group helps that person with the task.

The following five-part process [xxx] was also found to be successful in implementing and sustaining teacher learning community meetings:

1. Introduction (5-10 minutes): Participating teachers agree on the goals and agenda of the meeting.

2. How's it going? (30-50 minutes): Each teacher provides a summary of what they did in relation to their action plan during the previous month. The other teachers listen and provide support for that teacher in moving their plans forward.

3. New learning about formative assessment (25-40 minutes): The teacher leader or a small group of other teachers research and introduce new ideas in formative assessment to the group. The teachers engage in shared activities intended to improve their understanding of formative and summative assessment.

4. Individual teacher planning (10-15 minutes): Based on the group discussion, feedback, and new learning, teachers may want to revise their action plans. Teachers need time to think through what they are planning to do in the next month. They may also want to discuss new ideas with their colleagues.

5. Review of the meeting (5 minutes): The lead teacher redirects the group to the original goals and objectives for the meeting and checks to see if they were achieved.

Teacher learning communities have the potential to support the implementation of formative assessment while instilling in teachers ownership of their own professional development. The strategies mentioned above provide the foundation for a practical and workable model that will enable schools to initiate and sustain teacher professional development based on formative assessment.

The Assessment-Instruction Cycle

During the assessment-instruction cycle, teachers continuously observe student behavior, collect evidence, and make reasonable inferences about what students know. Assessment is central to teaching and to instruction; an invisible thread connects assessment, curriculum, and teaching together in the service of learning. [xxxi] There are four major components to the assessment-instruction cycle: (1) achievement expectations; (2) the cyclical nature of assessment and instruction; (3) multiple forms of assessment; and (4) evidence and feedback. [xxxii]

The bases for state, district, and classroom assessment, as well as for curriculum and instruction in California, are the state Science Content Standards. Achievement expectations start with the state standards, and there is strong alignment among the state standards, the state-adopted science curricula, teachers' instructional practices, and students' learning goals. Student learning goals are clearly translated into plain language that all students can understand. Teachers guide students through well-defined learning progressions, and students understand where they need to go next to accomplish their goals. Teachers also provide students with criteria for how their work will be judged and with exemplars or models of quality student work. [xxxiii]

Assessment and instruction are cyclical in nature. Teachers and students use assessment to monitor student progress, which in turn informs instructional decisions that support learning. Teachers assess, determine needs, provide descriptive feedback, set goals, provide guided practice, and keep the cycle in continuous motion. Students work with their teacher to know where they are on their learning continuum. With their teacher's guidance, students track and manage their progress, assess and reflect on their learning, set goals, learn, and keep the cycle in continuous motion. [xxxiv]

Teachers use multiple forms of assessment that yield accurate information about students to support their learning and achievement. Teachers continuously collect evidence, analyze it, and provide timely descriptive feedback to students. The evidence and feedback are: directly related to the standards and to the students' learning goals; communicated to and understood by students to encourage self-reflection and goal setting; and used to show growth and improvement over time for students, teachers, and parents. [xxxv]

IV. Examples of Quality Formative and Summative Science Assessments

Assessments should provide students the opportunity to demonstrate their understanding of important and meaningful science content, to use scientific tools and processes, to apply their understandings to solve new problems, and to draw on what they have learned to explain new phenomena, think critically, and make informed decisions. [xxxvi] All assessments should have clear expectations for students and should be valid, reliable, and free of bias.

Validity

Three types of validity are central to assessment: content validity, construct validity, and instructional validity. Content validity addresses the degree to which an assessment measures the intended content of the standards. Construct validity refers to the degree to which an assessment measures a construct or ability. The Investigation and Experimentation standards, for example, outline the skills, or constructs, necessary to engage in scientific inquiry; to make a valid claim about a student's ability to conduct inquiry, an assessment would need to assess the range of skills in the Investigation and Experimentation standards. Finally, an assessment has instructional validity if the content of the test matches what is actually being taught during instruction.

Reliability

When assessments are reliable, they consistently measure what they are intended to measure. There are three kinds of consistency in classroom assessments: stability, the consistency of student scores over time; alternate forms, the consistency of results between two or more different forms of a test; and internal consistency, the consistency with which the items on an assessment work together. [xxxvii]

Bias

Sometimes assessments can be biased against particular groups of students. When an assessment is biased, features of the test unrelated to the constructs being measured cause some students to perform poorly. All assessments should be free of bias; they should not penalize students because of their gender, ethnicity, socioeconomic status, religion, or other defining characteristics. Assessments should also not be offensive to students. [xxxviii] Different forms of bias include: [xxxix]

- Content Bias: Does the assessment contain content that is different or unfamiliar to different groups? Example: asking girls to compare the mass of different footballs when they have not had experience with footballs.
- Language Bias: Does the assessment contain words that have different or unfamiliar meanings for different groups? Example: asking urban students about farming techniques such as forage pits.
- Item Structure and Format Bias: Does the nature of the task confuse members of different groups? Example: requiring students who are still learning English to write a long essay in English.
- Stereotyping: Does the assessment give a positive representation of different groups? Assessments should be free of material that may be offensive, demeaning, or emotionally charged.
- Fairness: Is the assessment balanced in terms of being equally familiar to every group? Tests should be free of words or phrases that are generally associated with elitism (polo, yacht, regatta), finances (venture capital, stock options), regionalisms (grinder, hoagie, parish), military topics (rapier, mortar, breech), political topics (alderman, pork barrel), legal topics (tort, docket), and farm topics (combine, thresher).

Assessing the Science Content Standards for California Public Schools

Assessments should cover the content of the standards at each grade level, including the standards for Investigation and Experimentation. The Investigation and Experimentation standards are central to the role of assessment in the teaching of science. Involving students in scientific inquiry helps them develop proficiency in: (1) understanding scientific concepts; (2) appreciating how and what we know in the realm of science; (3) understanding the nature of science; (4) the ability to inquire about the natural world; and (5) the ability to use the skills and attitudes associated with science. [xl]

The Investigation and Experimentation standards are multifaceted; they call for students to make observations, pose questions, make predictions, plan and conduct investigations, use tools to gather, analyze, and use data, generate and evaluate evidence and explanations, use critical and logical thinking, examine information, consider alternative explanations, and communicate their results. Student understanding of this rich array of skills cannot be captured in a simple set of multiple-choice questions. Assessments should consist of different strategies, ranging from formative assessments, which include teacher observation and feedback as well as challenge statements, to summative assessments, which include hands-on performance tasks, constructed response investigations, open-ended questions, portfolios, and well-constructed multiple-choice tests.

Multiple Measures of Student Achievement

Assessments should be based on multiple measures of student ability and include a variety of techniques for various learning styles and levels of readiness. Figure 4 below outlines examples of formative and summative assessments.

Figure 4: Examples of Formative and Summative Assessments

Formative:
- Teacher observation, listening, questioning, and feedback
- Self-reflection and self-assessment
- Peer assessment and reflection
- Science notebooks
- White boards
- Graphic organizers: concept maps, concept webs, Venn diagrams, flowcharts
- Challenge statements

Summative:
- Hands-on performance tasks
- Constructed response
- Open-ended questions
- Multiple-choice questions
- Portfolios
- Extended research projects
- Student presentations
- Interviews
- Homework assignments
- Interactive computer assessments

Constructed Response Items

Constructed response items require students to write their own answers. Student responses are scored with a scoring rubric tailored specifically to each task. Scoring rubrics can be holistic (a single score is assigned to the entire task) or analytical (each question on a task receives an individual score). Analytical rubrics are more diagnostic in nature and provide more detailed information regarding student understanding of the science content and inquiry constructs in the task.

Hands-on Performance Tasks

Hands-on performance tasks integrate standards for life, earth, and/or physical science with Investigation and Experimentation constructs. During a hands-on task, students are presented with a scenario identifying a problem that needs to be solved. Students are provided hands-on materials organized on a placemat and asked to: make predictions; set up and conduct an investigation; record data and observations; organize data (graphs, charts, tables, etc.); explain if and how the results of their investigation either support or refute their prediction; analyze their results and use their own data and findings to explain their answers; use what they've learned in the task to make an application beyond the task; and/or think of another (new) question to investigate and briefly describe the steps of a plan for a new investigation. Students work with a partner to conduct their investigation and to collect their data. They work individually to record their answers in their test booklet. Examples of performance tasks are in Appendix A.

Constructed Response Investigations

Constructed response investigations are extended paper-and-pencil tasks that integrate science concepts with inquiry and investigation. Students are presented with a problem that hypothetical students in another school are trying to solve. They are provided a set of authentic data and a set of questions and are required to: analyze the problem and the data; graph and interpret data; interpret relationships on graphs; construct models, questions, predictions, and/or hypotheses; recommend solutions; and/or design new investigations to further explore the problem in the task. Although students usually work individually, these tasks can be designed to include information that students discuss with a partner before writing their individual responses. Examples of constructed response tasks are in Appendix A.

Open-ended Questions

Open-ended questions are short paper-and-pencil tasks that focus on evaluating understanding and reasoning. They are designed to explore students' abilities to communicate scientific understandings, use inquiry, reason scientifically, express positions on societal issues, and design an experiment. Students are presented with a prompt, usually in the form of a problem or scenario, and asked to communicate their understanding of scientific concepts and processes. Students work individually to record their responses in their test booklet. Examples of open-ended questions are in Appendix A.

Challenge Statements

Challenge statements are assessment probes designed to investigate students' thinking about important science concepts. The probe consists of a deliberately provocative or ambiguous statement about a science concept, such as "As electrical current passes through devices such as light bulbs and motors, some of it gets used up." The learner is asked to agree or disagree with the statement and to explain their reasoning. Students are expected to explain their thinking using everyday language rather than academic vocabulary, because academic vocabulary can act as a screen that conceals misconceptions. The goal of challenge statements is to make student thinking visible rather than letting misconceptions hide behind science vocabulary.

Challenge statements are used before and after a unit of instruction. Students start by thinking about the challenge statement and writing their thoughts individually. They then discuss their ideas with their peers and have an opportunity to revise their statements based on input from their group. Challenge statements demand deeper thinking and investigation, and they set the stage for meaningful discussion as part of learning. Challenge statements are evaluated using a 5-point rubric modeled after the five levels of proficiency measured in the California Standards Tests. In evaluating responses, both valid conceptions and sophistication of reasoning are considered.

Student Science Notebooks

Student science notebooks engage students in scientific thinking as they explore questions, make predictions, plan and conduct investigations, collect, organize, and use data, apply their learning, and communicate their understanding of science. As an assessment tool, science notebooks have been found to: help students construct their conceptual thinking; inform and guide instruction; enhance literacy skills; support differentiated learning; and foster teacher collaboration.

White Boards

White boards are powerful tools for allowing students to make their thinking visible. The use of white boards at the beginning of an instructional unit is an effective way to elicit students' prior knowledge of the content to be taught. Before teaching a fourth grade lesson on circuits, a teacher may ask the class to quickly draw a complete circuit on their white boards and hold them up. The teacher can easily find out which students understand circuits and use this information to teach the lesson.

During the lesson, the teacher may ask expert students to use their white boards to explain their thinking. This provides novice learners an opportunity to learn from expert thinking, which is usually hidden. [xli] At the end of the lesson, the teacher may have the students use the white boards to show what they learned and use this information to prepare for the next lesson.

Graphic Organizers: Concept Maps, Venn Diagrams, Flowcharts

Graphic organizers, such as concept maps, Venn diagrams, and flowcharts, are mental maps of student thinking and understanding. Concept maps help students see the connections between concepts and the differences among concepts, Venn diagrams help students see the relationships between ideas, and flowcharts can help students sequence events. Like white boards, graphic organizers can be used as assessment strategies for making student thinking visible, helping teachers assess what students do and do not understand.

Portfolios

Portfolios are collections of student work designed to provide the best evidence of a student's scientific literacy. They are used to measure student growth over time, showing achievement of science concepts, the deepening of understanding of the scientific method, and the growth of both communication and problem-solving skills. Through portfolios, students can become actively engaged in their own learning, gaining a sense of pride and ownership of their work. As an assessment tool, portfolios provide opportunities for students to: reflect on and self-evaluate their learning and work; select a variety of different types of work they think best represent their understanding of science; and learn how to score and evaluate the work of peers. Teachers use student portfolios to evaluate the progress of the student, the class, the curriculum, and their own instruction.

Interactive Computer Tasks

Computer simulations can present students with rich, interactive assessments that model systems in the natural world. Science simulations can model authentic environments and make visible concepts that are difficult to represent in a graphic format, such as convection currents, the movement of molecules in solids, liquids, and gases, and plate tectonics. In an interactive computer task, students have the opportunity to manipulate stimuli that they would not be able to manipulate in real time. In an assessment of plate tectonics and Earth's structure, for example, students can investigate the results of different plate movements or how wind, water, and ice shape and reshape Earth's surface. Interactive computer simulations allow students to demonstrate their understanding of science content and inquiry in an active manner. Moreover, the computer technology associated with simulations can provide automatic feedback to students and teachers and can help to inform and guide instruction.

Select Response Items

Select response items are commonly called multiple-choice items. In responding to a multiple-choice item, students select one of four possible answer choices and record their response on a separate answer sheet. Each multiple-choice item is aligned to only one content standard, contains a stem in either a question or a completion format, and offers four answer choices with only one correct answer. The four answer choices should be approximately the same length, have the same format, and have parallel syntax and semantic structures.

At least ten items are needed for each standard to reliably report student achievement on that standard. Ten items are also needed to reliably report student achievement for each domain: life science, earth science, physical science, and investigation and experimentation. Two examples of multiple-choice items follow.

Regular Multiple-choice Items

A well-constructed multiple-choice item can be a valuable component of an assessment system because it can provide broad coverage of important topics and allow students to demonstrate a variety of skills and knowledge. Many regular multiple-choice items, however, focus on lower-level recall, assessing small, topical pieces of information such as the parts of a cell or the year in which helium was discovered. Multiple-choice items can also be written to require higher-level thinking; such items focus on important skills and can probe analytical reasoning.

While any incorrect student answer might reflect a misconception, there is a relatively large research base of documented student misconceptions in science; documented misconceptions have been studied and confirmed by researchers through thorough investigations. Documented common student misconceptions in science can be built into the answer choices. If documented misconceptions are used, it is recommended that only one of the four answer choices contain the documented misconception.

Justified Multiple-choice Items

A modified multiple-choice question is called a justified multiple-choice question. Students select an answer choice and then explain why they think the answer is correct. Students are directed to use their understanding of specific science content and inquiry to explain why their answer is correct. Teachers use scoring rubrics specific to each question to score student work. Examples of justified multiple-choice questions are in Appendix A.

Graphic Organizers for Monitoring and Tracking Formative and Summative Assessments Aligned to the California Science Content Standards

Teachers can use various methods to monitor and track different classroom assessments aligned to the California Science Content Standards. The matrix shown in Figure 5 below uses general headings for formative and summative assessments, with the enduring California science standards for grade 4 listed down the left side. Teachers can record specific formative and summative assessments in the cells.

Figure 5: Graphic Organizer for Monitoring Formative and Summative Assessments Aligned to the California Science Content Standards

By using a variety of assessments that have clear expectations for students and are closely linked to the standards and to learning goals, teachers can capture the full range of student understanding and progress. They can also use the resulting data in thoughtful and powerful ways to improve student learning and achievement and to inform and guide their instruction.
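One way a teacher might keep an electronic version of the kind of monitoring matrix shown in Figure 5 is sketched below in Python. The standard codes and assessment names are hypothetical placeholders for illustration only; they are not taken from the figure or from the grade 4 standards.

    # Hypothetical sketch of a standards-by-assessment tracking matrix.
    # Standard codes and assessment names are invented placeholders.

    tracking_matrix = {
        "Grade 4 Standard 1a": {"Formative": [], "Summative": []},
        "Grade 4 Standard 1b": {"Formative": [], "Summative": []},
    }

    def record_assessment(standard, category, assessment):
        """Record a formative or summative assessment against a standard."""
        tracking_matrix[standard][category].append(assessment)

    record_assessment("Grade 4 Standard 1a", "Formative", "White board check, 9/12")
    record_assessment("Grade 4 Standard 1a", "Summative", "Unit 1 multiple-choice test")

    # Print the matrix: standards down the side, categories across the top.
    for standard, cells in tracking_matrix.items():
        formative = "; ".join(cells["Formative"]) or "-"
        summative = "; ".join(cells["Summative"]) or "-"
        print(f"{standard} | Formative: {formative} | Summative: {summative}")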

V. Analyzing and Using Data and Results

Results from classroom assessments provide quality feedback to teachers, allowing them to: improve student learning and achievement; inform and modify instruction; plan curriculum; target teaching; and research teaching practices. Once teachers collect data and results, they need to make sense of their findings before they can apply them to improve learning and instruction. Analyzing data involves: looking for patterns or trends in individual student work and for similar patterns in the work of all students in the class; reflecting on inferences and plausible explanations for findings; making sense of clusters of information that go together; and making informed decisions about using the results with students and in instruction.

Tally Sheets

Tally sheets can be designed to record and analyze student results on multiple-choice tests. A tally sheet is a matrix with the item numbers and the codes for the standards assessed identified across the top and the names of the students listed down the left side. The teacher enters (+) for a correct answer and (-) for an incorrect answer and then tallies the number correct for each student and for each standard. By reading across the matrix from left to right, teachers can quickly determine how many items each student answered correctly. By reading down the matrix for each item, teachers can quickly determine which standards on a particular test were difficult for students and which were not. In order to make a reliable inference about student understanding of a single standard, there must be at least ten items for that standard. Figure 6 below shows a tally sheet made in Excel for recording student responses to a multiple-choice test. Several tally sheets can be made in Excel to keep track of student results and progress.

Figure 6: Tally Sheet for Multiple-choice Answers
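The tallying described above can also be scripted. The Python sketch below uses a small, invented answer record; the student names, item labels, and standard codes are placeholders, not data from Figure 6. It counts the number correct per student and per standard, and flags any standard covered by fewer than ten items.

    # Hypothetical data for illustration: '+' marks a correct answer, '-' an incorrect one.
    # Each item is tagged with the standard it assesses (codes are invented).
    item_standards = {"item1": "4.1a", "item2": "4.1a", "item3": "4.2b", "item4": "4.2b"}

    responses = {
        "Student A": {"item1": "+", "item2": "+", "item3": "-", "item4": "+"},
        "Student B": {"item1": "-", "item2": "+", "item3": "-", "item4": "-"},
    }

    # Reading "across" the matrix: number correct per student.
    for student, marks in responses.items():
        correct = sum(1 for mark in marks.values() if mark == "+")
        print(f"{student}: {correct} of {len(marks)} items correct")

    # Reading "down" the matrix: number correct per standard, across all students.
    standard_correct = {}
    standard_items = {}
    for item, standard in item_standards.items():
        standard_items[standard] = standard_items.get(standard, 0) + 1
        for marks in responses.values():
            if marks.get(item) == "+":
                standard_correct[standard] = standard_correct.get(standard, 0) + 1

    for standard, items in standard_items.items():
        attempts = items * len(responses)
        correct = standard_correct.get(standard, 0)
        note = "" if items >= 10 else " (fewer than ten items: interpret with caution)"
        print(f"Standard {standard}: {correct} of {attempts} responses correct{note}")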

Tally sheets can also be used to capture and analyze information from a hands-on performance task. A hands-on performance task was administered to eighth grade students in a large urban school district; the students investigated variables related to force and motion. After the students took the test, each question in their booklets was scored with an analytical rubric and summarized in the tally sheet in Figure 7 below. The parts of the performance task and the associated questions are listed across the top of the matrix. The score points, 1 for a correct response, 0 for an incorrect response, and B for blank, are listed down the left side of the matrix. The data, reported as percentages of the 4,500 students tested, are recorded in the cells of the table. The data for question 3B, for example, show that 76% of the 4,500 eighth grade students correctly recorded data from their investigation in a data table, while 23% of the students did not record data in a table correctly; the matrix also shows that 1% of the students left the question blank. In contrast, the data for question 4 show that only 34% of the 4,500 students were able to organize their results correctly on a graph, 62% did not graph their data correctly, and 4% did not attempt to graph the data from their investigation.

Figure 7: Tally Sheet Showing Student Results for an Eighth Grade Performance Task

The information in Figures 6 and 7 allows teachers to use data from a summative test to inform instruction and improve student learning. Teachers can identify specific areas where students are experiencing difficulty and target their instruction to address those areas. This allows teachers to use results from a summative test in a formative manner. Furthermore, research shows that when teachers identify specific student weaknesses and target their instruction using metacognitive teaching strategies to address those weaknesses, student achievement improves significantly. [xlii]
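A summary like the one in Figure 7 can be produced directly from scored booklets. The sketch below uses invented rubric scores (1, 0, or B) for a handful of students; the question labels and values are placeholders, not the district data described above. For each question, it reports the percentage of students at each score point.

    # Hypothetical rubric scores for illustration: 1 = correct, 0 = incorrect, "B" = blank.
    scores_by_question = {
        "Q3B record data in a table": [1, 1, 1, 0, "B", 1, 0, 1],
        "Q4 graph the results":       [0, 1, 0, 0, "B", 1, 0, 1],
    }

    for question, scores in scores_by_question.items():
        total = len(scores)
        print(question)
        for point in (1, 0, "B"):
            count = scores.count(point)
            print(f"  score {point}: {100 * count / total:.0f}% of {total} students")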

Assessment data should be drawn from multiple sources and triangulated. Triangulation is a technique of using data from three different sources to determine student achievement of specific content. Three different sources of data give teachers three different perspectives on student work and understanding of that content, making their inferences about student understanding more reliable. The Logic Model for Assessment in Figure 8 shows a graphic representation of triangulating data from pretests and posttests, formative and summative classroom assessments, and the state Standards Test.

In this model, the grey box in the middle represents the formative and summative assessments that take place during the course of standards-based instruction throughout the school year. At the start of instruction in the fall, the teacher administers a pretest to determine students' prior knowledge of the science concepts for that particular grade level. In this scenario, the school is participating in a CaMSP and is required to pretest and posttest students. Throughout the course of the year, the teacher engages in continuous formative and summative assessment. In the spring, the teacher administers the California Standards Test for science, and at the end of the year, the posttest is administered. The model shows that the data from the pretest, posttest, and CST are intended to: show how well students are achieving the Science Content Standards; determine whether the school is meeting its state performance targets in science; investigate program effects among the schools participating in the CaMSP; determine program impact; and inform local and state evaluators. The model also shows that data from all assessments are triangulated to form a culminating body of evidence. At a larger grain size, the results of the culminating body of evidence are used to inform and guide instruction, inform and guide professional development, plan instruction, allocate resources, and disseminate findings about what worked to the larger learning network.
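As a minimal illustration of what triangulation might look like at the level of an individual student's records, the sketch below lines up three invented sources of evidence side by side; the student names, scales, and values are placeholders and are not drawn from the logic model itself.

    # Hypothetical sketch: juxtapose three sources of evidence for each student.
    # All names and scores are invented placeholders on arbitrary scales.
    classroom_evidence = {"Student A": "proficient", "Student B": "basic"}
    pre_post_gain = {"Student A": 18, "Student B": 4}   # posttest minus pretest, in points
    cst_level = {"Student A": "Proficient", "Student B": "Below Basic"}

    for student in classroom_evidence:
        print(
            f"{student}: classroom evidence = {classroom_evidence[student]}, "
            f"pre/post gain = {pre_post_gain[student]}, CST = {cst_level[student]}"
        )
        # When the three sources agree, inferences about the student's achievement
        # are more trustworthy; disagreement signals a record worth a closer look.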