Why Is the One-Group Pretest-Posttest Design Still Used?

Guest Editorial

Clinical Nursing Research
2016, Vol. 25(5) 467-472
© The Author(s) 2016
Reprints and permissions: sagepub.com/journalspermissions.nav
DOI: 10.1177/1054773816666280
cnr.sagepub.com

Thomas R. Knapp, EdD, FAAN (1, 2)

1 University of Rochester, NY, USA
2 The Ohio State University, Columbus, USA

Corresponding Author: Thomas R. Knapp, Professor Emeritus of Education, University of Rochester, 145 Rockingham St., Rochester, NY 14620, USA. Email: tknapp5@juno.com

Abstract

The one-group pretest-posttest pre-experimental design has been widely criticized, yet continues to be used in some clinical nursing research studies. This editorial explains what is wrong with the design, suggests reasons for its continued use, and gives some recommendations regarding what can be done about it.

Keywords

experimental design, causality, graduate education

More than 50 years ago, Donald Campbell and Julian Stanley (1963) carefully explained why the one-group pretest-posttest pre-experimental design (Y1 X Y2) was a very poor choice for testing the effect of an independent variable X on a dependent variable Y that is measured at Time 1 and Time 2. The reasons ranged from obvious matters such as the absence of a control group to technical considerations such as regression toward the mean. Yet that design continues to be used in clinical nursing research. After briefly summarizing some of the things that Campbell and Stanley (hereinafter referred to as C&S) said were wrong with this design, I will try to suggest some reasons for its survival in the clinical nursing research literature.

But first I would like to emphasize one other weakness that is not in the C&S list. There is no basis for any sort of helpful inference from it, statistical or scientific, even if the sample used in the study has been randomly selected (which is rarely the case). Suppose there is a statistically significant difference (change) between the pretest and posttest results. What can you say? You can't say that there is a statistically significant effect of X on Y, because there is no random assignment to experimental and control groups (there is no control group). The difference is what it is, and that's that. If the sample is random, you could construct a confidence interval around the difference, but that wouldn't help in inferring anything about the effect of X.

Threats to Its Internal Validity and External Validity

C&S use the term "threats" to indicate uncontrolled matters that could affect Y instead of, or in addition to, X. They also use the term "internal validity" as synonymous with causal interpretability. Some of the threats to the internal validity of the one-group pretest-posttest design are as follows:

1. History: History is a threat in the sense that while the participants are being exposed to X, there could be some other event occurring at the same time that could be the cause of the change in Y.

2. Maturation: If there is a long time between T1 for Y1 and T2 for Y2, the participants have grown older and possibly less healthy or more healthy, which might account for any change in Y.

3. Testing: If the posttest is a cognitive test that is the same test as the pretest, the questions might be familiar, and therefore now easier, so if the scores improve from pretest to posttest, it could be a practice effect rather than a treatment effect.

4. Instrumentation: Instrumentation is a threat regarding the scoring or rating of the pre-experimental measurements and the post-experimental measurements. If the posttest performance is evaluated by someone different from, and a more stringent evaluator than, the person who scored the pretest, the posttest measurements could be lower even if there were no treatment effect. The same threat could be posed by a mechanical or electrical instrument's reduction in precision from Time 1 to Time 2.

5. Statistical Regression: If the participants are well below average in the population of interest, and have been selected on that basis, they must perform better, on the average, on the posttest than on the pretest as an artifact of the elliptical shape of the scatter diagram for a positive relationship between pretest and posttest scores (which is usually the case). They have no other way to go than up, so to speak. This threat also is a problem for participants selected because they are well above average. They have nowhere to go but down.

C&S use the term "external validity" as synonymous with generalizability. Two such threats for the one-group pretest-posttest design are described as follows:

1. Interaction of Testing and X: This threat refers to the possibility of participants being sensitized to the treatment by the pretest, so that the generalizability of the findings might only extend to pretested populations.

2. Interaction of Selection and X: This pertains to the unfortunate fact that experiments are rarely carried out on random samples of participants, thus making generalizations to other potential participants difficult if not impossible. The measuring instrument(s) used is (are) also rarely randomly sampled from a set of equally appropriate instruments, thereby further restricting generalizability.

Some Possible Reasons for Its Survival

Perhaps some researchers in disciplines such as nursing, medicine, and public health have not heard about the C&S cautions. I personally doubt it, for three reasons: (a) I am familiar enough with graduate curricula in nursing to know that C&S has indeed been used in courses in research design in many schools and colleges of nursing; (b) discussions (sometimes dangerously close to plagiarisms) of the C&S designs appear in several textbooks in the health sciences; and (c) the Google prompt "campbell stanley experimental design" (without the quotation marks) returns about 180,000 entries, not all of which are to social scientific research. The prompt "campbell stanley one-group pretest posttest design" (again without the quotation marks) returns about 25,000 entries, several of which are to clinical research studies.
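For readers who have met the C&S cautions only in the abstract, the statistical regression threat described above can be made concrete with a small simulation. What follows is a minimal sketch under assumed values, not an analysis of any real study: the function name, the sample size, the selection cutoff, and the normal distributions for true scores and measurement error are all hypothetical choices made for illustration. No treatment is applied at any point, yet participants selected for low pretest scores still score higher, on the average, on the posttest.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=10_000, cutoff=-1.0, seed=0):
    """Simulate pretest and posttest scores with NO treatment effect.

    Each simulated participant has a stable true score; the pretest and the
    posttest each add independent measurement error to it. Only participants
    whose pretest falls below `cutoff` are "selected" for the study.
    All numeric settings here are illustrative assumptions.
    """
    rng = random.Random(seed)
    selected_pre, selected_post = [], []
    for _ in range(n):
        true_score = rng.gauss(0, 1)
        pretest = true_score + rng.gauss(0, 1)
        posttest = true_score + rng.gauss(0, 1)  # X has no effect at all
        if pretest < cutoff:  # selected because well below average
            selected_pre.append(pretest)
            selected_post.append(posttest)
    return mean(selected_pre), mean(selected_post)


pre_mean, post_mean = simulate_regression_to_mean()
print(f"Selected group pretest mean:  {pre_mean:.2f}")
print(f"Selected group posttest mean: {post_mean:.2f}")
```

In a one-group pretest-posttest study, an apparent gain of this kind could easily be mistaken for an effect of X, because there is no control group in which the same artifact could be observed.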
Perhaps some researchers find random assignment to treatment and control groups difficult to carry out, for practical and/or ethical reasons. As an obvious example, one cannot randomly assign elementary schoolchildren to a reading program or to no program to study the change in the understanding of directions for taking low-dose aspirin.

Perhaps some researchers are subject to pressures from colleagues and/or superiors to give the experimental treatment to everybody. The Sinclair Lewis (1925) novel Arrowsmith provides a good example of that with respect to an untried serum. The researcher who might otherwise argue for a better design might not be willing to spend the political capital necessary to overturn an original decision to go with the Y1 X Y2 approach.

Perhaps some researchers might want to conserve personal effort by using the one-group design. Having a control group to contend with is much more work.

Perhaps some researchers don't care whether or not the difference is attributable to X; all they might care about is whether things get better or worse between pretest and posttest, not why.

Perhaps some researchers use the design in a negative way. If X is hoped to produce an increase in Y from pretest to posttest, and if in fact a decrease is observed, any hypothesis regarding a positive change would not be supported by the data, no matter how big or how small that decrease is.

Perhaps some researchers consider the use of this design as a pilot effort (for a main study that might or might not follow).

Perhaps some researchers feel that the time between pretest and posttest is often so short (a measure of Y, a brief exposure to X, and another measure of Y) that if there's any change in Y, it must be X that did it.

Perhaps some researchers not only don't care about causality but are interested primarily in individual changes (John lost 5 points, Mary gained 10 points, etc.) even if the gains and the losses cancel each other out. The raw data for a Y1 X Y2 design show that nicely.

Perhaps some researchers are so eager to get a paper published that they'll try almost anything, including the use of a weak design.

Can the Design Be Salvaged?

There have been several suggestions for improving upon the one-group pretest-posttest design to make it more defensible as a serious approach to experimentation.
One suggestion (Glass, 1965) was to use a more complicated design that is capable of separating maturation and testing effects from the treatment effect. Another approach (Johnson, 1986) was to randomly assign participants to the various measurement occasions surrounding the treatment (e.g., pretest, posttest, post-posttest) and compare the findings for those subgroups within the one-group context. A third variation was to incorporate a double pretest before implementing the treatment. If the difference between either pretest and the posttest is much greater than the difference between the two pretests, additional support is provided for the effect of X. Marin, Marin, Perez-Stable, Otero-Sabogal, and Sabogal (1990) actually used that design in their study of the effect of an anti-smoking campaign.
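The double-pretest comparison just described can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function name, the dictionary keys, and the scores are hypothetical, and the code is not the analysis used by Marin et al. (1990). It simply computes the mean difference between the two pretests and the mean difference between each pretest and the posttest, so that the two kinds of change can be set side by side.

```python
from statistics import mean


def double_pretest_summary(pre1, pre2, post):
    """Summarize a double-pretest, one-group design (Y1 Y2 X Y3).

    pre1, pre2, post: equal-length lists of scores for the same participants,
    in the same order. Returns mean differences only; no significance test
    is attempted here.
    """
    if not (len(pre1) == len(pre2) == len(post)):
        raise ValueError("All three score lists must have the same length.")
    return {
        # Change with no treatment in between (maturation, testing, etc.)
        "pretest_drift": mean(b - a for a, b in zip(pre1, pre2)),
        # Change from each pretest to the posttest, with X in between
        "change_from_pretest_1": mean(c - a for a, c in zip(pre1, post)),
        "change_from_pretest_2": mean(c - b for b, c in zip(pre2, post)),
    }


# Hypothetical scores for five participants
pre1 = [10, 12, 9, 11, 13]
pre2 = [11, 12, 10, 11, 13]
post = [15, 17, 14, 16, 18]

summary = double_pretest_summary(pre1, pre2, post)
for label, value in summary.items():
    print(f"{label}: {value:.1f}")
```

A pretest-to-posttest change that is much larger than the drift between the two pretests lends some additional support to an effect of X, but with no control group the comparison remains descriptive.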

But all of those approaches pale in comparison with having two groups (one experimental, one control) to which participants are randomly assigned, with each group pretested and posttested, that is,

    R  Y1  X  Y2
    R  Y3     Y4

which is C&S Design 4. If you have the random assignment you can even do without the pretest, using their Design 6,

    R  X  Y1
    R     Y2

which they prefer to Design 4 in any event because it has greater generalizability. Both are equally strong for assessing causality.

What Can Be Done to Minimize Its Use?

It's all well and good to complain about the misuse or overuse of the one-group pretest-posttest design. It's much more difficult to try to fix the problem. I have only the following three relatively mild recommendations:

1. Every graduate program (master's and doctoral) in nursing should include a required course in the design of experiments in which the C&S chapter is one of the adopted readings, with particular emphasis placed upon the section dealing with the one-group pretest-posttest design. (C&S use the notation O1 X O2 rather than the notation Y1 X Y2, where the O stands for observation on the dependent variable Y; but in my opinion Y1 X Y2 is much more straightforward.)

2. Thesis and dissertation committees should take a much stronger stance against the one-group design. The best people to insist upon that are those who serve as statistical consultants in nursing colleges and departments.

3. Editors of, and reviewers for, nursing research journals should automatically reject a manuscript in which this design plays the principal role.

A Historical Note Regarding the C&S Work

As indicated in the References section that follows, "Experimental and quasi-experimental designs for research on teaching" first appeared as a chapter in a set of papers devoted to educational research. It received such acclaim that it was reprinted (essentially intact) as a paperback book published in 1966, but without the words "on teaching" (undoubtedly in the hope of attracting a larger market, which it indeed did). It has gone in and out of print many times.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of research on teaching (pp. 171-246). Chicago, IL: Rand McNally. Reprinted in 1966 under the title Experimental and quasi-experimental designs for research.

Glass, G. V. (1965). Evaluating testing, maturation, and treatment effects in a pretest-posttest quasi-experimental design. American Educational Research Journal, 2, 83-87.

Johnson, C. W. (1986). A more rigorous quasi-experimental alternative to the one-group pretest-posttest design. Educational and Psychological Measurement, 46, 585-591.

Lewis, S. (1925). Arrowsmith. New York, NY: Harcourt Brace.

Marin, B. V., Marin, G., Perez-Stable, E. J., Otero-Sabogal, R., & Sabogal, F. (1990). Cultural differences in attitudes toward smoking: Developing messages using the theory of reasoned action. Journal of Applied Social Psychology, 20, 478-493.

Author Biography

Thomas R. Knapp is Professor Emeritus of Education at the University of Rochester and Professor Emeritus of Nursing at The Ohio State University. His specializations are statistics, measurement, and research design.