Short vs. Extended Answer Questions in Computer Science Exams

Alejandro Salinger (ajsalinger@uwaterloo.ca)
Opportunities and New Directions, April 26th, 2012

Computer Science Written Exams
Many choices of question formats:
- Multiple-choice
- True/false
- Short answer
- Problem solving
- Code writing
How suitable is each type for CS exams?
What are their implications?

Outline
- Short and Extended Answer Questions
- Do both measure the same skills?
- Instructors' perspectives
- Influence on learning
- Discussion
  - Intended Learning Outcomes
  - Assessments as a learning instance
  - Structural fidelity

Short Answer Questions
- Multiple-choice
- True-or-false
- Fill-in-the-blank
- Brief-answer
Example: What are the best-, average-, and worst-case times to sort n items using Quicksort?
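The sample Quicksort question has the standard short answer Θ(n log n) in the best and average cases and Θ(n²) in the worst case. The sketch below makes the gap visible by counting comparisons. It is only an illustration: the function name `quicksort_comparisons`, the first-element pivot, and the input size are assumptions chosen for the example, not part of the talk.

```python
import random


def quicksort_comparisons(items):
    """Sort a copy of items with a first-element-pivot Quicksort.

    Returns (sorted_list, number_of_element_comparisons).
    The pivot rule is an assumption made for this illustration.
    """
    comparisons = 0
    data = list(items)
    # An explicit stack of (lo, hi) ranges avoids Python's recursion limit
    # on the worst-case (already sorted) input.
    stack = [(0, len(data) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = data[lo]
        boundary = lo  # data[lo+1 .. boundary] holds elements smaller than the pivot
        for i in range(lo + 1, hi + 1):
            comparisons += 1
            if data[i] < pivot:
                boundary += 1
                data[boundary], data[i] = data[i], data[boundary]
        # Move the pivot between the "smaller" and "not smaller" parts.
        data[lo], data[boundary] = data[boundary], data[lo]
        stack.append((lo, boundary - 1))
        stack.append((boundary + 1, hi))
    return data, comparisons


if __name__ == "__main__":
    n = 200
    # Already sorted input is the worst case for this pivot rule:
    # exactly n(n-1)/2 comparisons.
    _, worst = quicksort_comparisons(range(n))
    shuffled = list(range(n))
    random.Random(0).shuffle(shuffled)
    _, typical = quicksort_comparisons(shuffled)
    print("sorted input:", worst, "comparisons")
    print("shuffled input:", typical, "comparisons")
```

With this pivot rule an already sorted input forces n(n-1)/2 comparisons, while a shuffled input typically needs far fewer, in line with the average-case bound.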

Extended Answer Questions
- Code Writing
- Problem Solving
- Mathematical Proof
Example: Prove that in the comparison model any sorting algorithm requires Ω(n log n) comparisons in the worst case when sorting n items.
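For comparison with the short-answer example, an extended answer to the sample proof question would be expected to follow the usual decision-tree argument. The outline below is a sketch of that standard argument, not a model answer taken from the talk; h denotes the height of the decision tree, a symbol introduced here for illustration.

```latex
% Decision-tree argument (sketch).
% A comparison sort on n distinct items can be viewed as a binary decision
% tree: each internal node is one comparison, each leaf is one output order.
% Every one of the n! input permutations must reach its own leaf, and a
% binary tree of height h has at most 2^h leaves, so
\[
  2^{h} \;\ge\; n! \quad\Longrightarrow\quad h \;\ge\; \log_2 (n!).
\]
% The largest n/2 factors of n! are each at least n/2, hence
\[
  \log_2 (n!) \;\ge\; \log_2\!\left[\left(\frac{n}{2}\right)^{n/2}\right]
  \;=\; \frac{n}{2}\,\log_2\frac{n}{2} \;=\; \Omega(n \log n).
\]
% The height h is the worst-case number of comparisons, which gives the bound.
```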

Pros and Cons
Short Answer Questions
- Efficient administration
- Objective grading
- Timely feedback
- Can test wide range of topics
- Independent of writing skills
- Easy to evaluate test itself
Extended Answer Questions
- Take less time to construct
- Easier to test high levels of learning
- Partial credit
- No guessing
- Test writing abilities
- Higher structural fidelity

Can both format types measure the same skills?

Can both formats measure the same?
Can multiple-choice (MC) replace constructed-response (CR)?
- CR items provide less information in more time and at greater cost than MC (Lukhele, Thissen, Wainer, 1994)
- Score of essay adds minimal information about grade beyond MC score (Walstad & Becker, 1994)
- Little support for stereotype of MC and CR measuring different constructs (Bennett, Rock, Wang, 1991)

Can both formats measure the same?
- Some skills are too complex to be measured effectively with MC questions
- Measures can change over time
- MC questions are not homogeneous
- Substantial differences in MC questions
- Extremely difficult to construct MC question at Application level
- No combination of MC questions exactly matches skills of some CR questions
- Weak correlations at same difficulty level
- The evidence is inconclusive
(Livingston, 2009) (Martinez, 1999) (Simkin, Kuechler, 2005, 2010)

Computer Science instructors' perspectives on multiple-choice questions

Why do you use MC questions?
Understanding
- To get an idea of the breadth of students' understanding
- Determine level of knowledge
- MC can also be used to test the depth of knowledge
Confidence
- To give weaker students confidence to answer questions
- Shorter feedback time
- Easy question
- Keep the weaker students on track
- To test understanding of fundamental terms/phrases used in programming
Student-centered
- To keep students happy
- To constrain the students' creativity
- To prepare students for later courses where this will be even more common
(Shuhidan, Hamilton, D'Souza, 2010)

CS Instructors' perspectives on MC
10% do not support the use of MC:
- "I have NEVER used multiple choice questions in an exam!"
- "You need to include essay questions because Computer Science students need to know how to write"
- "I feel that multiple choices is a completely inappropriate tool for judging deep understanding and comprehension of programming concepts"

Influence on learning approaches

Influence on learning approaches
- Multiple-choice: surface approach; knowledge-based skills
- Essay: deep approach; comprehension, application, analysis
- Preference:
  - Essays → deep approach and better performance in essays
  - MCQ → surface approach and worse performance in essays
- Deep approach and perception of MCQ assessing higher levels → poor performance
(Scouller, 1998)

Discussion

Intended Learning Outcomes
- ILOs go beyond Knowledge and Comprehension in most CS courses
- Difficult to measure these with short-answer questions
- If tests rely on short-answer questions, high-level skills are not measured and likely not attained
- What about other assessments?

CS Courses Assessments Weights
[Chart: share of the final grade by assessment type (assignments, midterms, final, others); values shown: 1%, 46%, 29%, 24%]

Assessment as a Learning Instance
Example: Show how to sort n integers in the range [0, n^2 - 1] in O(n) time.
Exam with a few long questions enables:
- Reflecting
- Deriving ideas
- Making connections
- Evaluation of ideas
- Creativity
- Writing skills
Valuable feedback for instructors and students
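To illustrate the kind of connection-making the sample sorting question asks for, here is one possible answer, reading the range as [0, n^2 - 1]: treat each value as a two-digit number in base n and run two stable counting-sort passes (radix sort). This is a sketch of that standard technique under that reading of the range; the function names are hypothetical and the talk does not prescribe a particular solution.

```python
def sort_in_linear_time(values):
    """Sort n integers drawn from [0, n^2 - 1] in O(n) time.

    Each value v is viewed as two base-n digits (v // n, v % n), and the
    list is sorted by the low digit, then stably by the high digit.
    """
    n = len(values)
    if n <= 1:
        return list(values)
    if any(v < 0 or v >= n * n for v in values):
        raise ValueError("every value must lie in [0, n^2 - 1]")

    def counting_pass(seq, digit):
        # Stable counting sort on one base-n digit; digit(v) is in [0, n - 1].
        counts = [0] * n
        for v in seq:
            counts[digit(v)] += 1
        # Turn the counts into the starting output position of each digit value.
        start = 0
        for d in range(n):
            counts[d], start = start, start + counts[d]
        out = [0] * len(seq)
        for v in seq:
            d = digit(v)
            out[counts[d]] = v
            counts[d] += 1
        return out

    by_low_digit = counting_pass(values, lambda v: v % n)
    return counting_pass(by_low_digit, lambda v: v // n)


if __name__ == "__main__":
    # n = 4, so values must lie in [0, 15].
    print(sort_in_linear_time([15, 3, 0, 8]))  # prints [0, 3, 8, 15]
```

Each pass touches n values and n counters, so the two passes together take O(n) time. A plain counting sort over the whole range would need n^2 counters, which is why the base-n view matters.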

Structural Fidelity
- Easier to achieve with extended-answer questions
- For example, computer programming:
  - Writing a program is closer to a real-world situation
  - Skills measured by short-answer questions are not very useful
- Extended-answer questions can be complemented by provided aid

Conclusions
- Choice of question format has an important influence on learning assessment, learning approaches, and perceptions
- Skills involved in Computer Science courses are better measured with extended-answer questions
- Extended-answer questions should be used in CS courses from the first year, with the appropriate weight in the final grade

Thank you Simon Wilson

References
CUT project: http://www.cs.uwaterloo.ca/~ajsalinger/cut_project.pdf
Bennett, R. E., Rock, D. A., & Wang, M. (1991). Equivalence of free-response and multiple-choice items. Journal of Educational Measurement, 28(1), 77-92.
Kuechler, W. L., & Simkin, M. G. (2010). Why is performance on multiple-choice tests and constructed-response tests not more closely related? Theory and an empirical test. Decision Sciences Journal of Innovative Education, 8(1), 55-73.
Livingston, S. A. (2009). Constructed-response test questions: Why we use them; how we score them. ETS, R&D Connections 11.
Lukhele, R., Thissen, D., & Wainer, H. (1994). On the relative value of multiple-choice, constructed response, and examinee-selected items on two achievement tests. Journal of Educational Measurement, 31(3), 234-250.
Scouller, K. (1998). The influence of assessment method on students' learning approaches: Multiple choice question examination versus assignment essay. Higher Education, 35, 453-472.
Shuhidan, S., Hamilton, M., & D'Souza, D. (2010). Instructor perspectives of multiple-choice questions in summative assessment for novice programmers. Computer Science Education, 20(3), 229-259.
Simkin, M. G., & Kuechler, W. L. (2005). Multiple-choice tests and student understanding: What is the connection? Decision Sciences Journal of Innovative Education, 3(1), 73-98.
Struyven, K., Dochy, F., & Janssens, S. (2005). Students' perceptions about evaluation and assessment in higher education: A review. Assessment & Evaluation in Higher Education, 30(4), 325-341.
Walstad, W. B., & Becker, W. E. (1994, May). Achievement differences on multiple-choice and essay tests in economics. American Economic Review, 84(2), 193-196.