Designing Multiple Choice Tests to Measure Higher Order Thinking


University of North Texas Health Science Center
UNTHSC Scholarly Repository
Test Item Writing, Assessment, 3-15-2012

Designing Multiple Choice Tests to Measure Higher Order Thinking
Carol Kominski

Follow this and additional works at: http://digitalcommons.hsc.unt.edu/test_items

Recommended Citation: Kominski, Carol, "Designing Multiple Choice Tests to Measure Higher Order Thinking" (2012). Test Item Writing. Paper 10. http://digitalcommons.hsc.unt.edu/test_items/10

This article is brought to you for free and open access by the Assessment at UNTHSC Scholarly Repository. It has been accepted for inclusion in Test Item Writing by an authorized administrator of UNTHSC Scholarly Repository. For more information, please contact Tom.Lyons@unthsc.edu.

Designing Multiple Choice Tests to Measure Higher Order Thinking Carol A. Kominski, Ph.D. Assessment Specialist

Learning Outcomes
1. Apply Bloom's conceptual model to construct higher order learning outcomes.
2. Analyze structured vs. unstructured assessments.
3. Evaluate multiple choice test items for quality and skill level.
4. Construct multiple choice test items to assess higher order thinking.

Assumptions About Workshop Participants
Target audience: Faculty designing course tests
Primary assessment goals: Diagnose strengths and weaknesses; certify student competence
Units for analysis: Student; course
Outcomes: Higher order thinking learning outcomes

Workshop Agenda
1. Bloom's conceptual model of higher order thinking; learning outcomes that assess higher order thinking
2. Structured vs. unstructured tests (pluses and minuses); types of structured tests
3. Analysis of multiple choice test items: general quality standards and level of thinking
4. Construction of multiple choice test items for higher order thinking

Just what is higher order thinking?

Bloom's taxonomy remix (highest to lowest level):
Create: putting things together; creative thinking
Evaluate: making judgments
Analyze: breaking things down; critical thinking
Apply: using knowledge in new situations
Understand
Remember (recall)

Learning Outcomes: Questions Should Have Yes Answers
Clarity: Do they specify what students are expected to know and/or be able to do?
Communication: Are they included in the syllabus? Are they communicated in course activities?
Relationship to test: Can you report test performance on each outcome?

Learning Outcomes that Assess Higher Order Thinking: Key Words
Useful websites:
http://www.celt.iastate.edu/teaching/revisedblooms1.html
http://cte.uwaterloo.ca/ksu/bloom's_taxonomy_Cognitive_Domain.pdf

Bloom's Taxonomy: Definitions and Verbs

Remember
Definition: Remember previously learned information.
Verbs: Arrange, Define, Describe, Duplicate, Identify, Label, List, Match, Memorize, Name, Order, Outline, Recognize, Relate, Recall, Repeat, Reproduce, Select, State

Understand
Definition: Demonstrate an understanding of the facts.
Verbs: Classify, Convert, Defend, Describe, Discuss, Distinguish, Estimate, Explain, Express, Extend, Generalize, Give example, Identify, Indicate, Infer, Locate, Paraphrase, Predict, Recognize, Rewrite, Review, Select, Summarize, Translate

Apply
Definition: Apply knowledge to actual situations.
Verbs: Apply, Change, Choose, Compute, Demonstrate, Discover, Dramatize, Employ, Illustrate, Interpret, Manipulate, Modify, Operate, Practice, Predict, Prepare, Produce, Relate, Schedule, Show, Sketch, Solve, Use, Write

Analyze
Definition: Break down objects or ideas into simpler parts and find evidence to support generalizations.
Verbs: Analyze, Appraise, Break down, Calculate, Categorize, Compare, Contrast, Criticize, Diagram, Differentiate, Discriminate, Distinguish, Examine, Experiment, Identify, Illustrate, Infer, Model, Outline, Point out, Question, Relate, Select, Separate, Subdivide, Test

Evaluate
Definition: Make and defend judgments based on internal evidence or external criteria.
Verbs: Appraise, Argue, Assess, Attach, Choose, Compare, Conclude, Contrast, Defend, Describe, Discriminate, Estimate, Evaluate, Explain, Judge, Justify, Interpret, Relate, Predict, Rate, Select, Summarize, Support, Value

Create
Definition: Compile component ideas into a new whole or propose alternative solutions.
Verbs: Arrange, Assemble, Categorize, Collect, Combine, Compile, Compose, Construct, Create, Design, Develop, Devise, Explain, Formulate, Generate, Plan, Prepare, Rearrange, Reconstruct, Relate, Reorganize, Revise, Rewrite, Set up, Summarize, Synthesize, Tell, Write
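
One practical use of the verb table is to check which level a draft learning outcome points to by its leading verb. The sketch below is a hypothetical helper, not part of the workshop materials; it uses a truncated subset of the verb lists above, and a verb that appears at more than one level would be reported at every matching level.

```python
# Abbreviated verb table keyed by Bloom level (small subset of the full lists above).
BLOOM_VERBS = {
    "Remember":   {"define", "list", "recall", "name", "identify"},
    "Understand": {"explain", "summarize", "paraphrase", "classify"},
    "Apply":      {"apply", "compute", "demonstrate", "solve", "use"},
    "Analyze":    {"analyze", "compare", "differentiate", "examine"},
    "Evaluate":   {"evaluate", "judge", "justify", "assess", "defend"},
    "Create":     {"design", "construct", "formulate", "compose", "develop"},
}

def levels_for_outcome(outcome: str) -> list[str]:
    """Return the Bloom levels whose verb list contains the outcome's first word."""
    first_verb = outcome.split()[0].lower().strip(".,")
    return [level for level, verbs in BLOOM_VERBS.items() if first_verb in verbs]

if __name__ == "__main__":
    outcomes = [
        "Construct multiple choice test items to assess higher order thinking.",
        "List the six desirable qualities of alternatives.",
    ]
    for outcome in outcomes:
        print(levels_for_outcome(outcome) or ["(no match)"], "-", outcome)
```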

Structured vs. Unstructured Tests

Structured tests have a limited number of response options. Examples:
True-false
Multiple choice
Matching
Fill in the blanks

Unstructured tests have a wider variety of response options controlled by the test taker. Examples:
Technical writing
Oral presentation
Procedural demonstration
Case study analysis
E-portfolio

Pluses and Minuses for Structured Response Tests

Pluses:
Comprehensive knowledge assessed efficiently
Scoring economical and speedy
Moderate to high reliability
Amenable to statistical analysis
Amenable to collection of comparative and trend data

Minuses:
Test items laborious to construct
Higher order thinking items even more difficult to construct
Impact of cueing, guessing, test savvy, and motivation uncertain
Test security a requirement
Less related to tasks of professional life

Pluses and Minuses for Unstructured Response Tests

Pluses:
Higher order thinking more easily assessed
Moderate to high authenticity for real-life tasks
Requires greater student activity and engagement
Minimal influence of guessing and motivation on performance
Ease of construction

Minuses:
Necessity for rubric/scoring key construction and calibration
Scoring requires significant time
Pre-calibration of evaluators needed to increase reliability
More difficult to assess a broad range of knowledge quickly
Comparative and trend data harder to collect

Examples of Structured Test Items

Forced choice:
True-false
Multiple choice (usually 3-5 choices)

Matching:
Allows use of the same options for more than one question
Options can be extended (15-20 options)
One-to-one match or unevenly matched lists

Fill in the blanks:
Complete a diagram
Cloze test for comprehension, where every nth word is omitted (see the sketch after this list)
Complete a sentence
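
To make the cloze format concrete, here is a minimal illustrative sketch, not drawn from the workshop itself: it blanks out every nth word of a passage and returns an answer key. The passage, interval, and blank marker are arbitrary choices for the example.

```python
import re

def make_cloze(passage: str, n: int = 7, blank: str = "_____"):
    """Replace every nth word with a blank; return the cloze text and an answer key."""
    words = passage.split()
    answers = []
    for i in range(n - 1, len(words), n):
        # Strip trailing punctuation so the answer key holds only the word itself.
        word = re.sub(r"[^\w'-]+$", "", words[i])
        answers.append(word)
        words[i] = words[i].replace(word, blank, 1)
    return " ".join(words), answers

if __name__ == "__main__":
    text = ("Structured tests have a limited number of response options, "
            "while unstructured tests give the test taker much wider control "
            "over the form of the response.")
    cloze, key = make_cloze(text, n=6)
    print(cloze)
    print("Answer key:", key)
```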

Structured Assessments: Focus on Multiple Choice Items

Commonly used for:
Large classes
High-stakes testing
Admission to professional schools
Professional licensure

Item analysis:
Highly developed
Facilitates systematic item improvement
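
The slide does not spell out what item analysis computes. Two statistics commonly reported for a multiple choice item are its difficulty (the proportion of examinees answering correctly) and its discrimination (the point-biserial correlation between the item score and the rest of the test). The sketch below is an illustrative helper, not from the original slides, that computes both from a 0/1 scored response matrix with fabricated data.

```python
import numpy as np

def item_analysis(scores: np.ndarray):
    """Compute difficulty and discrimination for each item.

    scores: 2-D array of shape (examinees, items), with 1 = correct, 0 = incorrect.
    Returns (difficulty, discrimination) arrays, one value per item.
    """
    difficulty = scores.mean(axis=0)          # proportion correct per item
    total = scores.sum(axis=1)                # each examinee's total score
    discrimination = np.empty(scores.shape[1])
    for j in range(scores.shape[1]):
        # Point-biserial: correlate the item with the total of the remaining items.
        rest = total - scores[:, j]
        discrimination[j] = np.corrcoef(scores[:, j], rest)[0, 1]
    return difficulty, discrimination

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = (rng.random((50, 5)) < 0.7).astype(int)   # fabricated 50-examinee, 5-item test
    diff, disc = item_analysis(demo)
    print("Difficulty:", np.round(diff, 2))
    print("Discrimination:", np.round(disc, 2))
```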

Multiple Choice Items: Basic Guidelines
State learning evaluation objectives clearly.
Determine the level of thinking required: recall, understanding, application, analysis, or evaluation.

Multiple Choice Question Terminology
Test item
Item stem
Alternatives
Context
Keyed response
Distractors

Context: Helpful for Higher Order Thinking Questions
No context: Usually not needed for testing of factual knowledge.
Context skeleton: A small amount of context may be desirable for testing understanding.
Context rich: Rich context is usually helpful for assessment of higher order thinking skills like application, analysis, and evaluation.

Three Desirable Qualities of an Item Stem
1. Succinctness
2. Clear statement of the question, problem, or task
3. Positive wording

Six Desirable Qualities of Alternatives
1. Similar lengths
2. Correct grammar
3. One correct answer
4. Absence of extremes like never, always, only
5. No "all of the above"
6. Mutually exclusive alternatives
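
Several of these qualities are mechanical enough to screen for automatically. The sketch below is a hypothetical checker, not part of the workshop, that flags obvious violations of guidelines 1, 4, and 5 for a set of alternatives; the thresholds and word list are assumptions made for the example.

```python
EXTREMES = {"never", "always", "only"}

def check_alternatives(alternatives: list[str]) -> list[str]:
    """Return warnings for alternatives that violate simple quality guidelines."""
    warnings = []
    lengths = [len(a) for a in alternatives]
    # Guideline 1: similar lengths (flag when the longest option dwarfs the shortest).
    if max(lengths) > 2 * min(lengths):
        warnings.append("Option lengths vary widely; the longest may cue the answer.")
    for alt in alternatives:
        lowered = alt.lower()
        # Guideline 4: avoid extreme words.
        if any(word in lowered.split() for word in EXTREMES):
            warnings.append(f"Contains an extreme word: {alt!r}")
        # Guideline 5: avoid 'all of the above'.
        if "all of the above" in lowered:
            warnings.append(f"Uses 'all of the above': {alt!r}")
    return warnings

if __name__ == "__main__":
    options = ["aorta", "pulmonary arteries", "stomach", "All of the above"]
    for warning in check_alternatives(options):
        print("WARNING:", warning)
```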

Let's Try Some Questions

Question 1: What is wrong with this question?
The way to a man's heart is through his
a. aorta
b. pulmonary arteries
c. pulmonary veins
d. stomach
Source: Constructing Written Test Questions for the Basic and Clinical Sciences, Third Edition (Revised). National Board of Medical Examiners, 2002, p. 15.

Question 2: What is wrong with this question?
Structured tests
a. Usually assess higher order thinking.
b. Are better for large classes.
c. Do not require a high level of test security.
d. Requires rubrics or scoring key.
e. Are easy to construct.
f. All of the above.

Question 3: What is wrong with this question?
Assume you are a biology professor interested in deciding whether or not team-based learning has a significant impact upon your students. You give half the students a lesson in which you employ team-based learning and the other half a lesson in which you teach using a traditional lecture. After both lessons, you give students a 100-point test to determine how well they have learned the material covered in each class. If you were to do a 2-tailed t test on the students' test results, what is the hypothesis that you are seeking to test?
a. Students in the team-based learning class will score higher on the test.
b. Students in both classes will score about the same on the test.
c. Students in the traditional lecture class will score higher on the test.
d. All of the above.

Question 3: New and Improved
An instructor teaching half of his students using team-based learning and the other half using traditional lecture gives each group the same test at the end of each class. He performs a 2-tailed t test to compare the two groups. What hypothesis is he testing?
a. Students in the team-based learning class will score higher.
b. Students in both classes will score about the same.
c. Students in the traditional lecture class will score higher.
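
For readers who want to see the statistic behind this item, the sketch below runs a two-sample, two-tailed t test on fabricated scores; it is an illustration, not part of the original slides. The null hypothesis it evaluates is that the two classes score about the same, which is the idea option b points to.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Fabricated 100-point test scores for the two groups (purely illustrative).
team_based = rng.normal(loc=78, scale=8, size=30)
lecture = rng.normal(loc=74, scale=8, size=30)

# Two-tailed independent-samples t test: H0 is that the group means are equal.
t_stat, p_value = stats.ttest_ind(team_based, lecture)
print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.3f}")
if p_value < 0.05:
    print("Reject H0: the two classes do not appear to score about the same.")
else:
    print("Fail to reject H0: no evidence of a difference between the classes.")
```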

Question 4: What level of thinking is assessed?
Which of the following blood tests is used in diagnosis and treatment of diabetes?
a. Hemoglobin A1C
b. C-reactive protein (CRP)
c. Antinuclear antibodies (ANA)
d. Aspartate aminotransferase (AST)

Question 5: What level of thinking is assessed?
In a routine physical exam, John Smith, age 47, had a blood glucose level of 140 and an A1C level of 4.1%. What is the most plausible explanation of these numbers?
a. He has Type I diabetes which is probably controlled by insulin.
b. He shows early signs of development of Type II diabetes.
c. He probably fasted before his blood glucose test.
d. He probably did not fast before his blood glucose test.

Question 6: What level of thinking is assessed?
Two 60-year-old male patients have Type 2 diabetes. Each has a BMI of 27. The primary treatment for each is a diet to reduce blood glucose levels. What is the most likely reason Patient #2 did not show a decline in glucose after three months?
[Charts: monthly weight and blood glucose readings for Patient #1 and Patient #2, 1 Jan through 1 Apr]
a. P#1 may have exercised more than P#2.
b. P#2 probably leads a more sedentary life than P#1.
c. P#1 lost more weight on the glucose reduction diet.
d. P#2 may have a more resistant form of diabetes.

Question 7: What level of thinking is assessed?
Without any other data, which conclusion can you make from reviewing Figure 17?
a. The average American uses more drugs than citizens of any country except the United Kingdom.
b. The average Mexican or Chilean consumes fewer drugs than citizens from other countries.
c. Americans are more likely than residents of other countries to use new drugs.
d. Japanese have regulations that make it very difficult to obtain new drugs.

Question 8: What level of thinking is assessed?
What data would be most helpful in estimating average levels of personal drug consumption for the countries identified in Figure 17?
a. Percent of population in each country buying the covered drugs.
b. Average cost of new drugs for each country.
c. Average cost of all drugs for each country.
d. Population of each country.

Question 9: What level of thinking is assessed?
Susan and Clara each want to lose weight. Susan goes on a low-carbohydrate diet and Clara goes on a Vegan diet. After six months, Susan loses 30 pounds and Clara loses 15 pounds. Relative to losing weight, which of the following conclusions is supported?
a. The low-carbohydrate diet is more effective at producing weight loss than the Vegan diet.
b. The Vegan diet contains more calories than the low-carbohydrate diet.
c. The low-carbohydrate diet is easier to maintain than the Vegan diet.
d. Additional information is needed before making any conclusions.

Characteristics of Multiple Choice Items That Measure Higher Order Thinking
Difficult to construct: must develop context
Require lots of context: reading selections; scenarios, vignettes; tables, charts, graphs
Require more testing time: reading selections, studying tables and charts; the thinking itself is more complex
Require review by others: other faculty, colleagues, or a small sample of students

Objectives and Test Items: Dimensions and Guidelines

Dimensions:
Number of learning evaluation outcomes
Relative importance of each outcome
Total testing time (higher order thinking items require more time)

Guidelines:
Minimum of two items per objective
5-10 items for important learning objectives
Additional items increase reliability and validity (see the sketch below)
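
The claim that additional items increase reliability can be made concrete with the Spearman-Brown prophecy formula, which projects reliability when a test is lengthened by a factor k. The sketch below is an illustrative calculation under assumed numbers, not figures from the workshop.

```python
def spearman_brown(reliability: float, k: float) -> float:
    """Projected reliability when a test is lengthened by a factor k
    (k = new number of items / current number of items)."""
    return (k * reliability) / (1 + (k - 1) * reliability)

if __name__ == "__main__":
    current = 0.70           # assumed reliability of a 5-item objective
    for k in (1, 2, 3):      # 5, 10, and 15 items
        projected = spearman_brown(current, k)
        print(f"{5 * k:2d} items -> projected reliability {projected:.2f}")
```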