TOPIC 3 BASIC TESTING TERMINOLOGY

3.0 SYNOPSIS

Topic 3 provides input on basic testing terminology. It looks at the definitions, purposes and differences of various tests.

3.1 LEARNING OUTCOMES

By the end of this topic, you will be able to:

1. explain the meaning and purpose of different types of language tests;
2. compare Norm-Referenced and Criterion-Referenced Tests, Formative and Summative Tests, and Objective and Subjective Tests.

3.2 FRAMEWORK OF TOPICS

Types of Tests
    Norm-Referenced and Criterion-Referenced
    Formative and Summative
    Objective and Subjective
CONTENT

SESSION THREE (3 hours)

3.3 Norm-Referenced Test (NRT)

According to Brown (2010), in NRTs an individual test-taker's score is interpreted in relation to a mean (average score), median (middle score), standard deviation (extent of variance in scores), and/or percentile rank. The purpose of such tests is to place test-takers along a mathematical continuum in rank order. Scores are commonly reported back to the test-taker in the form of a numerical score (for example, 250 out of 300) and a percentile rank (for instance, the 78th percentile, which denotes that the test-taker's score was higher than that of 78 percent of the test-takers in the administration but lower than that of the remaining 22 percent).

In other words, an NRT is administered to compare an individual's performance with that of his or her peers, and/or to compare one group with other groups. In School-Based Evaluation, NRTs are used for summative evaluation, such as the end-of-year examination used for the streaming and selection of students.

3.4 Criterion-Referenced Test (CRT)

Gottlieb (2006), on the other hand, describes criterion-referenced tests as the collection of information about student progress or achievement in relation to a specified criterion. In a standards-based assessment model, the standards serve as the criteria or yardstick for measurement. Following Glaser (1973), the word criterion refers to score values that can be accepted as an index of a test-taker's attainment. Thus, CRTs are designed to provide feedback to test-takers, mostly in the form of grades, on specific course or lesson objectives.

The Curriculum Development Centre (2001) defines CRT as an approach that provides information on a student's mastery based on criteria determined by the teacher. These criteria are based on the learning outcomes or objectives specified in the syllabus.
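To make the two kinds of interpretation concrete, the short Python sketch below reads the same raw score in both ways. It is only an illustration under stated assumptions: the score list, the function names `percentile_rank` and `has_mastered`, and the 80 percent mastery cutoff are all hypothetical and do not come from Brown (2010) or the Curriculum Development Centre (2001). Percentile rank is computed here as the percentage of test-takers who scored strictly below the given score, which is one common convention.

```python
from statistics import mean, median, pstdev


def percentile_rank(score, all_scores):
    """Norm-referenced reading: percentage of test-takers scoring strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)


def has_mastered(score, max_score, cutoff=0.8):
    """Criterion-referenced reading: compare the score with a fixed criterion.

    The 0.8 cutoff is a hypothetical mastery level chosen for illustration.
    """
    return score / max_score >= cutoff


# Hypothetical scores (out of 300) from one test administration.
scores = [150, 180, 200, 210, 225, 240, 250, 260, 275, 290]

# Descriptive statistics that a norm-referenced report relies on.
print("mean:", mean(scores))
print("median:", median(scores))
print("standard deviation:", round(pstdev(scores), 1))

# The same raw score, interpreted in two ways.
print("percentile rank of 250:", percentile_rank(250, scores))
print("250 out of 300 meets the criterion:", has_mastered(250, 300))
```

The percentile rank changes whenever the comparison group changes, whereas the mastery decision depends only on the fixed criterion. That is the practical difference between the two approaches.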
The main advantage of CRTs is that they allow testers to make inferences about how much language proficiency (in the case of language proficiency tests) or knowledge and skills (in the case of academic achievement tests) test-takers originally have, and about their successive gains over time. As opposed to NRTs, CRTs focus on students' mastery of a subject matter (represented in the standards) along a continuum instead of ranking students on a bell curve.

Table 3 below shows the differences between Norm-Referenced Test (NRT) and Criterion-Referenced Test (CRT).
| Aspect | Norm-Referenced Test | Criterion-Referenced Test |
|---|---|---|
| Definition | A test that measures a student's achievement as compared to other students in the group | An approach that provides information on a student's mastery based on a criterion specified by the teacher |
| Purpose | Determine performance differences among individuals and groups | Determine learning mastery based on a specified criterion and standard |
| Test Item | Range from easy to difficult and are able to discriminate between examinees' abilities | Guided by minimum achievement in the related objectives |
| Frequency | Continuous assessment in the classroom | Continuous assessment |
| Appropriateness | Summative evaluation | Formative evaluation |
| Example | Public exams: UPSR, PMR, SPM, and STPM | Mastery tests: monthly test, coursework, project, exercises in the classroom |

Table 3: The differences between Norm-Referenced Test (NRT) and Criterion-Referenced Test (CRT)

3.5 Formative Test

Formative test or assessment, as the name implies, is a kind of feedback teachers give students while the course is progressing. Formative assessment can be seen as assessment for learning. It is part of the instructional process, and we can think of it as practice. With continual feedback, teachers can help students to improve their performance: they point out what the students have done wrong and help them to get it right. This can take place when teachers examine the results of achievement and progress tests.

Based on the results of formative tests or assessments, teachers can suggest changes to the focus of the curriculum or place more emphasis on specific lesson elements. Students, for their part, may also need to change and improve. Because formative assessment is demanding, many teachers prefer not to adopt it, even though returning any assessed homework or achievement test offers both teachers and students valuable learning opportunities.
3.6 Summative Test

Summative test or assessment, on the other hand, refers to the kind of measurement that summarises what the student has learnt or gives a one-off measurement. In other words, summative assessment is assessment of student learning. Students are more likely to experience assessment carried out individually, where they are expected to reproduce discrete language items from memory. The results are then used to produce a school report and to determine what students know and do not know.

Summative assessment does not necessarily provide a clear picture of an individual's overall progress or even his or her full potential, especially if he or she is hindered by the fear of physically sitting for a test, but it may provide straightforward and valuable results for teachers to analyse. It is given at a point in time to measure student achievement in relation to a clearly defined set of standards, but it does not necessarily show the way to future progress. It is given after learning is supposed to have occurred. End-of-year tests in a course and other general proficiency or public exams are examples of summative tests or assessments.

Table 3.1 shows formative and summative assessments that are common in schools.

| Formative Assessment | Summative Assessment |
|---|---|
| Anecdotal records | Final exams |
| Quizzes and essays | National exams (UPSR, PMR, SPM, STPM) |
| Diagnostic tests | Entrance exams |

Table 3.1: Common formative and summative assessments in schools

3.7 Objective Test

According to BBC Teaching English, an objective test is a test that consists of right or wrong answers or responses, and thus it can be marked objectively. Objective tests are popular because they are easy to prepare and take, quick to mark, and provide a quantifiable and concrete result. They tend to focus more on specific facts than on general ideas and concepts.

The types of objective tests include the following:

i. Multiple-choice items/questions;
ii. True-false items/questions;
iii. Matching items/questions; and
iv. Fill-in-the-blanks items/questions.
In this topic, let us focus on multiple-choice questions, which may look easy to construct but are in reality very difficult to build correctly. This is congruent with the viewpoint of Hughes (2003, pp. 76-78), who warns of many weaknesses of multiple-choice questions. The weaknesses include:

i. It may limit beneficial washback;
ii. It may enable cheating among test-takers;
iii. It is very challenging to write successful items;
iv. This technique strictly limits what can be tested;
v. This technique tests only recognition knowledge;
vi. It may encourage guessing, which may have a considerable effect on test scores.

Let's look at some important terminology used when designing multiple-choice questions. There are five terms to know:

1. Receptive or selective response: Items in which the test-taker chooses from a set of responses rather than creating a response. (Creating a response is commonly called a supply type of response.)
2. Stem: Every multiple-choice item consists of a stem, the body of the item that presents a stimulus. The stem is the question or task in an item. It may be a complete or an incomplete (open) sentence, stated positively or negatively. The stem must be short, simple, compact and clear. However, it must not give away the right answer easily.
3. Options or alternatives: The list of possible responses to a test item. There are usually between three and five options/alternatives to choose from.
4. Key: This is the correct response. The response can be either the correct one or the best one. In a good item, the key does not stand out as obvious when compared with the distractors.
5. Distractors: A distractor is an incorrect option that is included to draw students away from the correct answer. An excellent distractor is plausible: it closely resembles the correct answer but is not correct.

When building multiple-choice items for both classroom-based and large-scale standardised tests, consider the four guidelines below:

i. Design each item to measure a single objective;
ii. State both stem and options as simply and directly as possible;
iii. Make certain that the intended answer is clearly the only correct one;
iv. (Optional) Use item indices to accept, discard or revise items.

3.8 Subjective Test

Contrary to an objective test, a subjective test is evaluated by giving an opinion, usually based on agreed criteria. Subjective tests include essay, short-answer, vocabulary, and take-home tests. Some students become very anxious about these tests because they feel their writing skills are not up to par. In reality, a subjective test gives test-takers more opportunity to demonstrate their understanding and/or in-depth knowledge and skills in the subject matter. Test-takers might provide acceptable alternative responses that the tester, teacher or test developer did not predict. Generally, subjective tests test the higher-order skills of analysis, synthesis, and evaluation. In short, subjective tests enable students to be more creative and critical.

Table 3.2 shows various types of objective and subjective assessments.

| Objective Assessments | Subjective Assessments |
|---|---|
| True/False Items | Extended-response Items |
| Multiple-choice Items | Restricted-response Items |
| Multiple-response Items | Essay |
| Matching Items | |

Table 3.2: Various types of objective and subjective assessments
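Guideline iv for multiple-choice items in section 3.7 mentions item indices without defining them. Two indices commonly used in classical item analysis are item facility (the proportion of test-takers who answer an item correctly) and item discrimination (how much better high scorers do on the item than low scorers). The Python sketch below illustrates both under stated assumptions: the function names, the choice of comparing the top third with the bottom third of test-takers, and the data are all hypothetical. It is not a procedure prescribed by this module.

```python
def item_facility(responses):
    """Proportion of test-takers who answered the item correctly (0.0 to 1.0).

    `responses` is a list of booleans, one per test-taker.
    """
    return sum(responses) / len(responses)


def item_discrimination(results):
    """Difference in correct answers between high and low scorers, per group member.

    `results` is a list of (total_test_score, answered_this_item_correctly) pairs.
    Assumption: the high and low groups are the top and bottom thirds by total score.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    group_size = max(1, len(ranked) // 3)
    high_correct = sum(1 for _, correct in ranked[:group_size] if correct)
    low_correct = sum(1 for _, correct in ranked[-group_size:] if correct)
    return (high_correct - low_correct) / group_size


# Hypothetical results for one item: (total test score, item answered correctly).
results = [
    (28, True), (27, True), (25, True),
    (22, True), (20, False), (19, True),
    (15, False), (12, True), (10, False),
]

facility = item_facility([correct for _, correct in results])
discrimination = item_discrimination(results)
print("item facility:", round(facility, 2))
print("item discrimination:", round(discrimination, 2))
```

An item that almost everyone answers correctly, or almost no one does, tells the teacher little about differences between students. An item with discrimination near zero or below does not separate stronger from weaker test-takers, so it is a candidate for revision or removal.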
Some have argued that the distinction between objective and subjective assessments is neither useful nor accurate because, in reality, there is no such thing as objective assessment. In fact, all assessments are created with inherent biases built into decisions about relevant subject matter and content, as well as cultural (class, ethnic, and gender) biases.

Reflection

1. Objective test items are items that have only one answer or correct response. Describe the multiple-choice test item in depth.
2. Subjective test items allow for subjectivity in the responses given by the test-takers. Explain in detail the various types of subjective test items.

Discussion

1. Identify at least three differences between formative and summative assessment.
2. What are the strengths of multiple-choice items compared to essay items?
3. Informal assessments are often unreliable, yet they are still important in classrooms. Explain why this is the case, and defend your explanation with examples.
4. Compare and contrast Norm-Referenced Test with Criterion-Referenced Test.