Objective Structured Clinical Exercise (OSCE): An Overview
Jeanne M. Farnan, MD MHPE
The University of Chicago Pritzker School of Medicine
CDIM Pre-course for Experienced Educators

Outline
- Background and Use
- Development and Implementation
- Evaluation and Outcomes
- Examples and Alternatives
- Discussion and Wrap-up
Learning Objectives
- Introduce the OSCE as an effective assessment tool for clerkship students
- Discuss advantages of, and barriers to, development and implementation
- Practice developing a blueprint for an OSCE assessment
- Review validity and reliability in the context of OSCE evaluation

Who is in the room?
- Clerkship directors? Program directors? Program administrators? SP program staff? Others?
- What are you hoping to accomplish?
What is an OSCE?
An objective, structured clinical examination with one or more assessment tools administered at separate, sequenced stations, each requiring the student to perform a specified task.

Historical Use
- Initially described by Harden in 1975 (Harden, R. BMJ. 1975)
- Designed to replace the traditional clinical examination, which lacked:
  - Reliability
  - Standardized objectives
  - Structured observation
- Simultaneous evolution of the Standardized or Simulated Patient (Barrows HS. Simulated Patients. 1971), pioneered by Dr. Howard Barrows in 1963 at USC
Evolution of the OSCE
Recent innovative adaptations of the OSCE model to teach topics beyond clinical skills, including:
- Quality improvement/practice-based learning and improvement (Varkey, et al. Am J Med Qual. 2008)
- Objective Structured TEACHING Exercise (OSTE) (Stone, et al. TLM. 2003)
- Handoff skills and transitions of care (Farnan, et al. JGIM. 2010)
Applications in UME, GME, and faculty development

Formalized Use
- UME: clinical skill development; clerkship assessment
- GME: pre-orientation assessment (e.g., Indiana University OSCE; UM GME POA)
- Licensure: Step 2 CS examination
Toolbox of Assessment Methods. 2000. ACGME & ABMS
OSCEs in the IM Clerkship
- Among sources of validity evidence for IM clerkship assessments, the MCQ and OSCE carried the MOST sources of validity evidence (more than WE, PE, and OPC) (Auewarakul. Med Educ. 2005 Mar;39(3):276-83)

Needs assessment: Do I need an OSCE?
- Existing curricula
- Direct observation opportunities
- Resource considerations

Why administer an OSCE?
- Test integrative knowledge
- Specificity of assessment to the program
- Programmatic evaluation
ASE, The OSCE, 2011
BARRIERS & FACILITATORS

What are the challenges?
- Efficiency & cost-effectiveness: cost-effective ONLY when many candidates are evaluated in ONE administration
- Logistics: resources, personnel, and expertise
- Time-consuming development
- Maintenance & test security
Despite those challenges, the OSCE offers several advantages:
- Wide range of learning objectives & contexts
- Different specialties & disciplines
- Formative & summative assessment
- Programmatic & individual evaluation
Patricio, Med Teach, 2013;35:503-514

OSCE DEVELOPMENT
Major Components
- Lists of skills, behaviors, and attitudes to assess
- Instruments for evaluation
- Criteria for scoring the assessment, and passing standards
- Resources: dedicated space
- Personnel!
ASE, The OSCE, 2011

What is an OSCE?
An objective, structured clinical examination with one or more assessment tools administered at separate, sequenced stations, each requiring the student to perform a specified task.
"perform a specified task": Behaviors
- Communication skills: history-taking, patient counseling
- Physical examination skills
- Higher-ordered skills: specific communication challenges, documentation
- Medical knowledge/clinical reasoning

"specified tasks": Setting
- Standardized patient encounters: diagnosis and treatment, patient counseling
- Simulations & models
- Role-playing stations: transitions of care
- Oral presentation tasks
- Decision-making & problem-solving
Development: Issues to Consider
- Material development is influenced by the stakes of the examination (Adamo, Med Teach, 2003)
- Institution- and curriculum-specific
- Resource-specific: a pre-existing standardized patient program; student use as simulated patients (Rollnick, Patient Educ Couns, 2007)

Development: Planning
- Map the skills to be assessed to competencies or milestones: communication skills, medical knowledge, patient management/counseling
- Structural considerations? Station-specific resources
- Who are the evaluators?
Blueprinting
- Review curricular objectives
- Decide on the domains to be assessed: observable skills
- Map domains against learning objectives: the framework for the blueprint
- Sampling: the proportion of cases/stations in each section (see the sketch below)
- Testing time and assessment
www.faculty.londondeanery.ac.uk/e-learning/structured-assessments-of-clinicalcompetence

Blueprinting
- Compile the specific CONTENT objectives the student is expected to achieve
- Consider levels & ranges of patients and problems, from simple to complex
- Consider the frequency, relevance & necessity of the skills to assess
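To make the sampling step concrete, here is a minimal sketch of allocating stations in proportion to blueprint weights. The domains, weights, and station count are hypothetical, not drawn from any published blueprint.

```python
# Hypothetical blueprint: allocate a fixed number of stations across
# assessment domains in proportion to their curricular weight.
blueprint_weights = {
    "History-taking": 0.30,
    "Physical examination": 0.25,
    "Communication/counseling": 0.25,
    "Clinical reasoning": 0.20,
}
total_stations = 12

# Largest-remainder allocation keeps the proportions while summing
# exactly to the number of stations available.
raw = {d: w * total_stations for d, w in blueprint_weights.items()}
alloc = {d: int(v) for d, v in raw.items()}
leftover = total_stations - sum(alloc.values())
for d in sorted(raw, key=lambda d: raw[d] - alloc[d], reverse=True)[:leftover]:
    alloc[d] += 1

for domain, n in alloc.items():
    print(f"{domain}: {n} station(s)")
```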
BLUEPRINTING PRACTICE
[Figure: ASME OSCE Blueprint]
What is an OSCE?
An objective, structured clinical examination with one or more assessment tools administered at separate, sequenced stations, each requiring the student to perform a specified task.

"sequenced stations": Station development
- Resource and training needs: SPs and observers
- Define the purpose
- Define instructions for the candidate, the evaluator, and the SP
- Concomitant EVALUATION development
Station Development
- Review of the literature: guidelines, expert opinion
- STAKES!
- Peer review
- Existing resources available: ASPE, MedEdPORTAL

Station Development
- Assess the feasibility of what can be simulated
- Task and time agreement: Who [trainee] can do how much of what [observable skill] in the context of [subject matter] at what level [passing standards]?
- Provide as much or as little patient information as is appropriate given the purpose of the case
Dalhousie University Summer Institute, June 2012
How many stations?
- 14-18 stations are recommended to obtain reliable measurements of performance (Toolbox of Assessment Methods. 2000. ACGME & ABMS)
- BEME review of OSCEs: 20-315 minutes total; 6-20 minutes per station; 4-40 stations (Patricio, Med Teach, 2013;35:503-514)

SP training
- Training for a ~10-minute interview varies between 3 and 4 hours
- Preparation for physical findings
- Conflicting reports on SPs' reliability and accuracy as evaluators
- Evaluation is best focused on communication skills and behaviors (McLaughlin, BMC Med Educ, 2006)
SP training
- Link training information to the assessment instrument
- Train with responses across the spectrum of performance
- Be SPECIFIC with findings, historical or otherwise
- Identify information to be volunteered and information to be elicited

Inter-station Exercises
Allow additional tasks to be assessed:
- Documentation
- Chart review
- Oral case presentation
- Interpretation of diagnostics
- Identification and interpretation of findings
Align the task with the time allotted.
Longitudinal Evaluation (Miller, J Assoc Am Med Coll, 1990)
- Knows / Knows how: knowledge -> MCQ
- Shows how: clinical skills -> OSCE
- Does: clinical practice -> direct observation

Evaluation Considerations
- Individual & programmatic: able to assess student skill in individual domains
- Establishing individual instruments and passing standards
- Rigor is again dependent upon the STAKES of the evaluation
Side note: Passing Standards
- Passing score = the score needed to pass the assessment (e.g., percent correct)
- Passing rate = the percentage of students who pass at any given score

Angoff Method
- Discussion of the characteristics of the borderline trainee
- Judges estimate the item-by-item performance of the borderline trainee
- Estimates are totaled and averaged to determine the passing score (see the sketch below)
Downing, Teach Learn Med. 2006;18(1):50-57.

Angoff Method: Example
http://teachinganatomy.blogspot.com/2013/07/standardsetting.html
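A minimal sketch of the Angoff arithmetic described above; the judges' probability estimates are invented for illustration.

```python
# Each judge estimates, item by item, the probability that a borderline
# trainee would perform the item correctly. Hypothetical data: 3 judges
# rating a 5-item station checklist.
judge_estimates = [
    [0.8, 0.6, 0.7, 0.5, 0.9],  # judge 1
    [0.7, 0.5, 0.6, 0.6, 0.8],  # judge 2
    [0.9, 0.6, 0.8, 0.4, 0.9],  # judge 3
]

# Average across judges for each item, then sum across items: the
# expected score of a borderline trainee becomes the passing score.
n_judges = len(judge_estimates)
n_items = len(judge_estimates[0])
item_means = [
    sum(judge[i] for judge in judge_estimates) / n_judges
    for i in range(n_items)
]
passing_score = sum(item_means)

print(f"Passing score: {passing_score:.2f} of {n_items} items "
      f"({100 * passing_score / n_items:.0f}%)")
```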
Factor Analysis & Passing
- Non-compensatory approach: excellence in one domain of skill should not compensate for deficiency in another
- Compensatory approach: errors/difficulty with one station can be compensated for by performance on another (a sketch of both rules follows below)
- Factor analysis: identify the variables that account for the variance in scores on the task at each station; identify critical skills for the pass/fail discussion
Chesser, et al. Med Educ. 2004;38:825-831

"one or more assessment tools": Checklists vs. Global Rating Scales
Checklists:
- Assess key elements; emphasis on thoroughness
- Procedural tasks; done/not done
Global rating scales:
- Assess behaviors; emphasis on judgment
- Communication skills, professionalism
- Behavioral assessment with descriptive anchors: a guide for examiners and SPs
- Formative; focus on feedback; increases rater training needs
Longer is not necessarily better (Yudkowsky, in press): content validity has been demonstrated with a shorter, clinically discriminating checklist.
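A minimal sketch contrasting the two pass/fail rules named above; the station names and cut scores are hypothetical.

```python
# Hypothetical station scores (0-100) and a uniform cut score of 60.
station_scores = {"history": 72, "exam": 55, "counseling": 81}
overall_cut = 60      # mean score required under the compensatory rule
per_station_cut = 60  # minimum required at EVERY station under the
                      # non-compensatory rule

# Compensatory: a strong station can offset a weak one, so only the
# average matters.
mean_score = sum(station_scores.values()) / len(station_scores)
passes_compensatory = mean_score >= overall_cut

# Non-compensatory: failure at any single station fails the candidate.
passes_non_compensatory = all(
    score >= per_station_cut for score in station_scores.values()
)

print(f"mean={mean_score:.1f}  compensatory pass: {passes_compensatory}")
print(f"non-compensatory pass: {passes_non_compensatory}")
```

With these invented scores the candidate passes the compensatory rule (mean ~69) but fails the non-compensatory rule (exam station below 60), which is exactly the distinction the two approaches are meant to capture.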
Other Considerations
- Items must be separate; avoid the double-barreled item (e.g., "Student asked patient about location and radiation of pain")
- The number of items on the checklist should reflect the length of the station
- Consider KEY FACTORS/critical actions (Payne, et al. Acad Med. 2008): a task-driven checklist
Antoun J, Romani M, Saab B. Disclosure of Medical Error: Objective Structured Clinical Examination (OSCE). MedEdPORTAL; 2012. Available from: www.mededportal.org/publication/9226
Global Rating Scale
PPQ: Patient Perception Questionnaire
- Introduction
- Appearance
- Rapport
- Verbal communication & active listening
- Encounter closure
Global rating scale with descriptive anchors, from http://www.mcc.ca/pdf/rating_scale_qeii_e.pdf
Clinical Skill Observer Checklist
Bergus, et al. Adv Med Educ Pract. 2010;1:67-73

VALIDITY & RELIABILITY
Validity Primer
Validity: the degree to which a test measures what it intends to measure; do evidence and theory support the interpretation of test scores?
- Construct
- Content
- Concurrent
- Predictive
- Response process
Inferences we make from evidence (Downing; Messick, 1989)

Establishing OSCE Validity
- Content validity: the degree to which the assessment measures the construct (what we intend to measure); blueprinting
- Response process: the degree to which students understand the assessment as we intend; demonstration of student understanding; piloting cases
http://nsse.iub.edu/html/validity.cfm
Establishing OSCE Validity
- Concurrent validity: correlation with other measures of the same construct, e.g., SP-based scores and clinical ratings/written exams
- Predictive validity: does the assessment correlate with other measures? An OSCE score predicting other criterion measures
http://nsse.iub.edu/html/validity.cfm

Reliability
The degree to which the scores obtained on one administration of the test would be consistent with those obtained on a second administration, using the same or a similar group.
- Pearson's correlation coefficient
- Cronbach's alpha is most commonly reported (Brannick, Med Educ, 2011); a computational sketch follows below
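A minimal sketch of Cronbach's alpha computed over station scores; the data are invented, and treating stations as the "items" is one common convention.

```python
import numpy as np

# Hypothetical scores: rows = examinees, columns = stations ("items").
scores = np.array([
    [7, 8, 6, 9],
    [5, 6, 5, 6],
    [9, 9, 8, 9],
    [4, 5, 6, 5],
    [6, 7, 7, 8],
], dtype=float)

k = scores.shape[1]                         # number of stations
item_vars = scores.var(axis=0, ddof=1)      # variance of each station
total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores

# Cronbach's alpha: internal consistency of the station scores.
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```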
Reliability
- It is often more difficult to reliably assess communication skills than clinical skills
- Reliability can be improved by:
  - Using two raters/observers
  - A larger number of stations (see the Spearman-Brown sketch below)
  - Pre-assessment rater training to establish common expectations
- SAMPLING vs. standardization (Van der Vleuten, 2005, Med Educ)

OSCE Relationship to Other Measures of Competency
Reported correlations: r = 0.296 (p < 0.001) and r = 0.422 (p < 0.001)
(Lukas RV, Adesoye T, Smith S, Blood A, Brorson JR. Neurology. 2012;79(7):681-685)
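The gain from adding stations (or raters) can be projected with the Spearman-Brown prophecy formula, a standard psychometric result not taken from these slides; the baseline reliability here is invented.

```python
# Spearman-Brown prophecy: projected reliability when the test is
# lengthened by a factor k (e.g., more stations or more raters).
def spearman_brown(reliability: float, k: float) -> float:
    return (k * reliability) / (1 + (k - 1) * reliability)

baseline = 0.60  # hypothetical reliability of a short OSCE
for k in (1.5, 2.0, 3.0):
    print(f"{k:.1f}x stations -> projected reliability "
          f"{spearman_brown(baseline, k):.2f}")
```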
Generalizability (G) Theory
- A score is defined in relation to all facets of the measurement process (Kreiter, 2009)
- In an OSCE, facets can include students, cases, domains, and items
- Provides an estimate of which facets should be adjusted to achieve an acceptable level of reliability (a one-facet sketch follows below)

EVALUATION CONSIDERATIONS
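A minimal one-facet G-study sketch (persons crossed with stations) with invented scores; real OSCE G-studies typically model additional facets such as cases and items.

```python
import numpy as np

# Hypothetical scores: rows = persons (students), columns = stations.
scores = np.array([
    [7, 8, 6, 9],
    [5, 6, 5, 6],
    [9, 9, 8, 9],
    [4, 5, 6, 5],
    [6, 7, 7, 8],
], dtype=float)
n_p, n_s = scores.shape
grand = scores.mean()

# Two-way ANOVA without replication: sums of squares for persons,
# stations, and the residual (person-x-station interaction + error).
ss_p = n_s * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_s = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_res = ((scores - grand) ** 2).sum() - ss_p - ss_s

ms_p = ss_p / (n_p - 1)
ms_s = ss_s / (n_s - 1)
ms_res = ss_res / ((n_p - 1) * (n_s - 1))

# Estimated variance components for each facet.
var_res = ms_res                     # person x station + error
var_person = (ms_p - ms_res) / n_s   # "true" person variance
var_station = (ms_s - ms_res) / n_p  # station difficulty variance

# Relative G coefficient: how reliably these n_s stations rank persons.
g_coef = var_person / (var_person + var_res / n_s)
print(f"person var={var_person:.2f}  station var={var_station:.2f}  "
      f"residual var={var_res:.2f}  G={g_coef:.2f}")
```

Comparing the person variance with the residual shows which facet to adjust: if the residual dominates, adding stations raises the G coefficient fastest.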
OSCE vs. Direct Observation
Variables:
- Reproducibility
- Sampling of learning objectives
- Formative and summative use
- Variety of assessment methods

Implementation Considerations
- Faculty development: observer training
- Passing standards
- Case development: writing group
- Cost and resources: space
FUTURE DIRECTIONS & ALTERNATIVES

Formative & Summative Assessment
- Emphasis on developmental progression (Frank, Med Teach, 2010)
- CBME framework (Harris, Med Teach, 2010): translating competency into observable behaviors
Mookherjee, et al. Med Teach, 2013
Long Case
- Its appropriateness for high-stakes decisions remains in question
- Ways to improve reliability:
  - Structured assessment tools
  - Increasing testing time
  - Adding observation to the clinical interaction
Wilkinson TJ. Med Educ. 2008;42:887-893

SOCE: Systematically Observed Clinical Encounters
- Trained lay observers + MD observers
- Score reliability increased with an increasing number of observations
- Advantage: incorporation of different patient populations
Bergus, et al. Adv Med Educ Pract. 2010;1:67-73
Narrative Feedback
- Case-specific rating scales versus generalized narrative feedback
- Rater training is simplified if a single, global assessment is used
- Common Ground checklist (Van Nuland, Med Educ, 2007): LONGER training, but a single instrument

Sub-internship OSCE
- Used by <5% of internal medicine sub-internships (CDIM survey data, 2011-12)
- Skills focus: transitions of care; communication within the healthcare system; communication with patients/providers
- Improved self-efficacy and perceived preparedness for residency training
Mischler, et al. Teach Learn Med. 2013;25(3):242-248
Conclusions
- Need & current evaluation metrics
- Design: blueprinting, case development, assessment tools
- Implementation: resources, feasibility, cost
- Evaluation & feedback

Questions?