BASIC TESTING TERMINOLOGY


TOPIC 3 BASIC TESTING TERMINOLOGY

3.0 SYNOPSIS

Topic 3 provides input on basic testing terminology. It looks at the definitions, purposes and differences of various types of tests.

3.1 LEARNING OUTCOMES

By the end of this topic, you will be able to:

1. explain the meaning and purpose of different types of language tests;
2. compare Norm-Referenced and Criterion-Referenced Tests, Formative and Summative Tests, and Objective and Subjective Tests.

3.2 FRAMEWORK OF TOPICS

Types of Tests
    Norm-Referenced and Criterion-Referenced
    Formative and Summative
    Objective and Subjective

CONTENT

SESSION THREE (3 hours)

3.3 Norm-Referenced Test (NRT)

According to Brown (2010), in NRTs an individual test-taker's score is interpreted in relation to a mean (average score), median (middle score), standard deviation (extent of variance in scores), and/or percentile rank. The purpose of such tests is to place test-takers along a mathematical continuum in rank order. Scores are commonly reported back to the test-taker in the form of a numerical score (for example, 250 out of 300) and a percentile rank (for instance, 78 percent, which denotes that the test-taker's score was higher than 78 percent of the total number of test-takers but lower than 22 percent in that administration).

In other words, an NRT is administered to compare an individual's performance with that of his or her peers and/or to compare one group with other groups. In School-Based Evaluation, NRTs are used for summative evaluation, such as the end-of-year examination used for the streaming and selection of students.

3.4 Criterion-Referenced Test (CRT)

Gottlieb (2006), on the other hand, describes criterion-referenced tests as the collection of information about student progress or achievement in relation to a specified criterion. In a standards-based assessment model, the standards serve as the criteria or yardstick for measurement. Following Glaser (1973), the word criterion refers to score values that can be accepted as an index of a test-taker's attainment. Thus, CRTs are designed to provide feedback to test-takers, mostly in the form of grades, on specific course or lesson objectives. The Curriculum Development Centre (2001) defines CRT as an approach that provides information on a student's mastery based on criteria determined by the teacher. These criteria are based on the learning outcomes or objectives specified in the syllabus.

The main advantage of CRTs is that they allow testers to make inferences about how much language proficiency (in the case of language proficiency tests) or knowledge and skills (in the case of academic achievement tests) test-takers originally have, and about their successive gains over time. As opposed to NRTs, CRTs focus on students' mastery of a subject matter (represented in the standards) along a continuum instead of ranking students on a bell curve.

Table 3 below shows the differences between Norm-Referenced Test (NRT) and Criterion-Referenced Test (CRT).
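The two ways of interpreting a score described in 3.3 and 3.4 can be illustrated with a minimal Python sketch. The scores, the function names and the 80 percent cut score are all hypothetical and chosen only for illustration; they do not come from Brown (2010) or any real test. The percentile rank follows the reading given in 3.3: the percentage of test-takers who scored lower than the given score.

```python
from statistics import mean, median, pstdev


def percentile_rank(score, all_scores):
    """Norm-referenced reading: percentage of test-takers scoring below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)


def has_mastered(score, max_score, cut_score_percent):
    """Criterion-referenced reading: compare the score with a fixed criterion."""
    return 100 * score / max_score >= cut_score_percent


# Hypothetical scores out of 300 for ten test-takers (illustration only)
scores = [120, 150, 180, 200, 210, 225, 240, 250, 265, 280]

# NRT: the score of 250 is interpreted relative to the group
print("mean:", mean(scores), "median:", median(scores))
print("standard deviation:", round(pstdev(scores), 1))
print("percentile rank of 250:", percentile_rank(250, scores))  # 70.0

# CRT: the same score is interpreted against an assumed 80 percent criterion
print("mastery at 80 percent:", has_mastered(250, 300, 80))  # True
```

The same raw score of 250 therefore yields two different kinds of information: its position among peers (NRT) and whether a fixed standard has been reached, regardless of how others performed (CRT).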

Definition
    NRT: A test that measures a student's achievement as compared to other students in the group
    CRT: An approach that provides information on a student's mastery based on a criterion specified by the teacher
Purpose
    NRT: Determine performance differences among individuals and groups
    CRT: Determine learning mastery based on a specified criterion and standard
Test Item
    NRT: From easy to difficult level and able to discriminate examinees' ability
    CRT: Guided by minimum achievement in the related objectives
Frequency
    NRT: Continuous assessment in the classroom
    CRT: Continuous assessment
Appropriateness
    NRT: Summative evaluation
    CRT: Formative evaluation
Example
    NRT: Public exams: UPSR, PMR, SPM, and STPM
    CRT: Mastery tests: monthly tests, coursework, projects, exercises in the classroom

Table 3: The differences between Norm-Referenced Test (NRT) and Criterion-Referenced Test (CRT)

3.5 Formative Test

Formative test or assessment, as the name implies, involves the feedback teachers give students while the course is in progress. Formative assessment can be seen as assessment for learning. It is part of the instructional process, and we can think of it as practice. With continual feedback, teachers can help students to improve their performance: they point out what the students have done wrong and help them to get it right. This can take place when teachers examine the results of achievement and progress tests. Based on the results of a formative test or assessment, teachers can suggest changes to the focus of the curriculum or place more emphasis on specific lesson elements. Students, for their part, may also need to change and improve.

Because formative assessment is demanding, many teachers prefer not to adopt it, even though returning any assessed homework or achievement test offers both teachers and students valuable learning opportunities.

3.6 Summative Test

Summative test or assessment, on the other hand, refers to the kind of measurement that summarises what the student has learnt or gives a one-off measurement. In other words, summative assessment is assessment of student learning. Students are more likely to experience assessment carried out individually, in which they are expected to reproduce discrete language items from memory. The results are then used to produce a school report and to determine what students know and do not know.

A summative test does not necessarily provide a clear picture of an individual's overall progress or even his or her full potential, especially if he or she is hindered by the fear of physically sitting for a test, but it may provide straightforward and invaluable results for teachers to analyse. It is given at a point in time, after learning is supposed to have occurred, to measure student achievement in relation to a clearly defined set of standards, but it does not necessarily show the way to future progress. End-of-year tests in a course and other general proficiency or public exams are examples of summative tests or assessment.

Table 3.1 shows formative and summative assessments that are common in schools.

Formative Assessment      Summative Assessment
Anecdotal records         Final exams
Quizzes and essays        National exams (UPSR, PMR, SPM, STPM)
Diagnostic tests          Entrance exams

Table 3.1: Common formative and summative assessments in schools

3.7 Objective Test

According to BBC Teaching English, an objective test is a test that consists of right or wrong answers or responses and thus can be marked objectively. Objective tests are popular because they are easy to prepare and take, quick to mark, and provide a quantifiable and concrete result. They tend to focus more on specific facts than on general ideas and concepts.

The types of objective tests include the following:

i. Multiple-choice items/questions;
ii. True-false items/questions;
iii. Matching items/questions; and
iv. Fill-in-the-blanks items/questions.
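For the selection-type items in the list above (true-false and multiple-choice), part of a score can be earned by chance alone. The short Python sketch below estimates that chance score under stated assumptions: every item has exactly one correct option, and a test-taker who guesses blindly picks each option with equal probability. The function name and the 40-item test are hypothetical.

```python
def expected_guessing_score(num_items, num_options):
    """Expected number of correct answers from blind guessing.

    Assumes one correct option per item and equally likely choices.
    """
    return num_items / num_options


# Hypothetical 40-item tests (illustration only)
print(expected_guessing_score(40, 2))  # true-false items: 20.0
print(expected_guessing_score(40, 4))  # four-option multiple choice: 10.0
```

Under these assumptions, adding options to each item lowers the share of the score that blind guessing can account for.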

In this topic, let us focus on multiple-choice questions, which may look easy to construct but are in reality very difficult to build correctly. This is congruent with the viewpoint of Hughes (2003, pp. 76-78), who warns against many weaknesses of multiple-choice questions. The weaknesses include:

    It may limit beneficial washback;
    It may enable cheating among test-takers;
    It is very challenging to write successful items;
    This technique strictly limits what can be tested;
    This technique tests only recognition knowledge;
    It may encourage guessing, which may have a considerable effect on test scores.

Let's look at some important terminology used when designing multiple-choice questions. This type of objective test item involves five terms, namely:

1. Receptive or selective response
Items in which the test-taker chooses from a set of responses rather than creating a response (creating a response is commonly called a supply type of response).

2. Stem
Every multiple-choice item consists of a stem (the body of the item that presents a stimulus). The stem is the question or assignment in an item. It may be a complete or open, positive or negative sentence. The stem must be short, simple, compact and clear. However, it must not easily give away the right answer.

3. Options or alternatives
These are the list of possible responses to a test item. There are usually between three and five options or alternatives to choose from.

4. Key
This is the correct response. It can be either the correct response or the best one. In a good item, the key does not stand out as obviously correct when compared with the distractors.

5. Distractors
Also known as disturbers, distractors are the incorrect options included to distract students from selecting the correct answer. An excellent distractor is very close to the correct answer but is not correct.

When building multiple-choice items for both classroom-based and large-scale standardised tests, consider the four guidelines below:

i. Design each item to measure a single objective;
ii. State both stem and options as simply and directly as possible;
iii. Make certain that the intended answer is clearly the only correct one;
iv. (Optional) Use item indices to accept, discard or revise items.

3.8 Subjective Test

In contrast to an objective test, a subjective test is evaluated by giving an opinion, usually based on agreed criteria. Subjective tests include essay, short-answer, vocabulary, and take-home tests. Some students become very anxious about these tests because they feel their writing skills are not up to par. In reality, a subjective test gives test-takers more opportunity to demonstrate their understanding and/or in-depth knowledge and skills in the subject matter. Test-takers might also provide acceptable alternative responses that the tester, teacher or test developer did not predict. Generally, subjective tests test the higher-order skills of analysis, synthesis, and evaluation. In short, subjective tests enable students to be more creative and critical.

Table 3.2 shows various types of objective and subjective assessments.

Objective Assessments        Subjective Assessments
True/False Items             Extended-response Items
Multiple-choice Items        Restricted-response Items
Multiple-response Items      Essay
Matching Items

Table 3.2: Various types of objective and subjective assessments
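Guideline iv in 3.7 mentions item indices without defining them. As a hedged illustration, the Python sketch below computes two indices that are widely used in classroom item analysis: item facility (the proportion of test-takers who answered an item correctly) and item discrimination (how much better a high-scoring group did on the item than a low-scoring group of the same size). The choice of these two indices, the function names and the response data are assumptions made for this example; the text above does not specify which indices are intended.

```python
def item_facility(responses):
    """Proportion of test-takers who answered the item correctly (0.0 to 1.0)."""
    return sum(responses) / len(responses)


def item_discrimination(high_group, low_group):
    """Correct answers in the high group minus those in the low group,
    divided by the size of one group (groups must be equal in size)."""
    if len(high_group) != len(low_group):
        raise ValueError("high and low groups must be the same size")
    return (sum(high_group) - sum(low_group)) / len(high_group)


# Hypothetical responses to one item: 1 = correct, 0 = incorrect
high = [1, 1, 1, 1, 0]  # five highest scorers on the whole test
low = [1, 0, 0, 0, 0]   # five lowest scorers on the whole test

print(item_facility(high + low))       # 0.5
print(item_discrimination(high, low))  # 0.6
```

An item that almost everyone answers correctly or incorrectly, or one that the low group answers as well as the high group, would be a candidate for revision or removal.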

Some have argued that the distinction between objective and subjective assessments is neither useful nor accurate because, in reality, there is no such thing as objective assessment. In fact, all assessments are created with inherent biases built into decisions about relevant subject matter and content, as well as cultural (class, ethnic, and gender) biases.

Reflection

1. Objective test items are items that have only one answer or correct response. Describe the multiple-choice test item in depth.
2. Subjective test items allow for subjectivity in the responses given by test-takers. Explain in detail the various types of subjective test items.

Discussion

1. Identify at least three differences between formative and summative assessment.
2. What are the strengths of multiple-choice items compared to essay items?
3. Informal assessments are often unreliable, yet they are still important in classrooms. Explain why this is the case, and defend your explanation with examples.
4. Compare and contrast Norm-Referenced Test with Criterion-Referenced Test.