Epi Bio 502: Advanced Biostatistics
1.0 Credit
Winter Quarter 2015

Lecture Times: Mondays and Wednesdays 10:30 AM-12:00 PM
Lecture Location: Lurie-Gray Seminar Room
Lab Time: Wednesdays 2:00-3:00 PM
Lab Location: Galter Health Science Library - LRC

Course Instructor
Edward F. Vonesh, Ph.D.
Professor of Preventive Medicine
Department of Preventive Medicine
Division of Biostatistics
e-vonesh@northwestern.edu
*Phone:
Office hours: Mondays from 1:00-2:00 PM, 633 N St. Clair Street, Room 18-043.

Teaching Assistant
Hongyan Ning
h-ning@northwestern.edu

I. Course description

This course covers modern approaches to the analysis of correlated response data arising from longitudinal studies and studies involving repeated measurements or clustered observations within subjects. Emphasis will be on applications requiring the use of linear models for normally distributed data and generalized linear models for non-normally distributed data (e.g., binary outcomes data and count data). Nonlinear models for normally distributed data will also be discussed and contrasted with linear models but not emphasized.

The course will focus on two primary approaches for analyzing correlated responses within subjects. The first approach entails fitting a marginal linear or generalized linear model to a set of data, in which case a covariance structure describing correlation and variation is specified directly within the model. The second approach entails fitting a linear mixed-effects or generalized linear mixed-effects model to a set of data. In this case, correlation and variation are introduced indirectly through specification of subject-specific random effects within the model. Conceptual differences between marginal and mixed-effects models will be discussed, as will differences in how model parameters are estimated and interpreted.
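As an illustrative sketch of the two approaches described above, the linear case can be written as follows for subject i. The notation is an assumption made for exposition and is not taken from the syllabus: Y_i is the vector of repeated responses, X_i and Z_i are design matrices for the fixed and random effects, beta is the vector of regression parameters, and b_i is the vector of subject-specific random effects.

```latex
% Marginal linear model: the covariance structure is specified directly
Y_i = X_i \beta + \epsilon_i, \qquad \epsilon_i \sim N\bigl(0, \Sigma_i(\theta)\bigr)

% Linear mixed-effects model: correlation enters through random effects b_i
Y_i = X_i \beta + Z_i b_i + e_i, \qquad
b_i \sim N(0, G), \quad e_i \sim N(0, R_i), \quad b_i \text{ independent of } e_i

% Marginal covariance implied by the mixed-effects model
\mathrm{Var}(Y_i) = Z_i G Z_i^{\top} + R_i
```

In the linear case both formulations share the marginal mean X_i beta, so the regression parameters carry the same interpretation. With a nonlinear link function (for example, the logit link for binary outcomes), the parameters of a marginal model describe population-averaged effects while those of a mixed-effects model describe subject-specific effects, and the two generally differ.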
Applications will focus on repeated measures profile analysis for both continuous and discrete response data, growth curve and random coefficient regression models for continuous response data, logistic regression models for binary outcomes data and log-linear models for count data. Techniques for assessing model goodness-of-fit through graphical and analytical means will be presented together with how tests of hypotheses and confidence intervals are constructed. Lastly, problems arising from missing data in longitudinal studies will be discussed with an emphasis placed on understanding the impact missing values can have on subsequent inference. All modeling and numerical analyses will be done using SAS.

II. Prerequisites

Equivalent of Biostatistics I and II (Epi Bio 302 and Epi Bio 402) or permission from instructor.

III. Learning Objectives

This course is designed to familiarize students with various statistical techniques used to analyze longitudinal data arising from clinical and epidemiological studies in which the outcome or response variable of interest is measured repeatedly over time and possibly under different experimental conditions on each subject. At the end of this course, students should have the ability to:
- describe different sources of correlation and variation in longitudinal studies
- differentiate between within-subject covariates and between-subject covariates
- describe various linear models one can use to analyze continuous longitudinal data, their key assumptions and the types of covariance structures most commonly used to account for the correlation that exists with longitudinal data
- describe the key elements that define a univariate generalized linear model for non-normally distributed data including the type of response variable being modeled and its corresponding mean, variance and probability distribution
- describe different methods for how correlation can be introduced into a generalized linear model in order to accommodate the analysis of correlated longitudinal data
- describe key differences between marginal generalized linear models and generalized linear mixed-effects models in terms of underlying assumptions and subsequent methods of estimation used by each
- describe inherent differences between marginal models and mixed-effects models in terms of how regression parameters are interpreted under each
- formulate an appropriate statistical model based on both the type of outcome being measured and on the stated objectives of a particular application
- use SAS to fit various models to longitudinal data as appropriate to the study objectives and constraints of the observed data
- differentiate between various missing data mechanisms and what impact each has on one's ability to draw valid inference based on the method of analysis chosen for a particular application

IV. Required Textbook

Fitzmaurice GM, Laird NM and Ware JH (2011). Applied Longitudinal Analysis, 2nd Edition, Hoboken, New Jersey: Wiley.

V. Required Software

All numerical analyses will be performed using SAS.

VI. Blackboard

The syllabus, class handouts/lecture notes, and both homework and lab assignments will be posted on the course's Blackboard site, available at https://courses.northwestern.edu/webapps/login if you are registered for the course.

VII. Class assignments, labs, examinations

a) Class assignments
Homework is assigned on Wednesdays following the Lab session and will be due no later than the following Wednesday (to be submitted via email). See the Class Schedule for projected dates when homework is assigned and due.

b) Labs
Labs will consist of select exercises including SAS programming exercises assigned during the Lab sessions and will be specific to the material covered in class each week.

c) Examinations
There will be an in-class mid-term exam and a comprehensive final exam. The final exam will involve both a one week take-home examination and an in-class examination. The in-class examinations may consist of multiple choice, short answer and statistical computations or interpretation of SAS output and will be administered at the scheduled times. The take-home portion of the final examination will entail the analysis and interpretation of results from a longitudinal dataset which will require the use of SAS. Make up examinations should be arranged in advance and will only be given under extenuating circumstances.
VIII. Student Evaluation

Students will be evaluated based on completed homework assignments (20%), a mid-term in class exam (30%) and final exam (50%). The final exam will be part take home (25%) and part in class (25%). Lab exercises and student participation will not formally factor into the course grade but students are expected to fully participate in any assigned lab session exercises as well as in classroom discussions.

IX. Course evaluation

The MPH Program administers web-based course evaluations to students for each course near the end of the quarter. Your completion of both the unit (course) and faculty evaluation components is required; failure to complete either of the evaluations will result in an incomplete grade until the evaluations are submitted. You will be sent the web link and instructions via email later in the quarter. You will have about two weeks to complete the evaluations before grades are submitted.

X. Academic Integrity

Every Northwestern faculty member and student belongs to a community of scholars where academic integrity is a fundamental commitment. The Program in Public Health abides by the standards of academic conduct, procedures, and sanctions as set forth by The Graduate School at Northwestern University. Students and faculty are responsible for knowledge of the information provided by The Graduate School on their Web page at http://www.tgs.northwestern.edu/academics/academic-services/integrity/index.html

Academic misconduct includes, but is not limited to:
1. Receiving or giving unauthorized aid on examinations or homework
2. Plagiarism
3. Fabrication
4. Falsification or manipulation of academic records
5. Aiding or abetting any of the above

The PPH follows The Graduate School's procedure for evaluating alleged academic misconduct, as outlined on the TGS website.
http://www.tgs.northwestern.edu/academics/academicservices/integrity/dishonesty/index.html

Faculty reserve the right to use the Safe Assignment: Plagiarism Detection Tool that is part of the Course Management System to evaluate student assignments. Information about this tool can be found at http://www.it.northwestern.edu/education/coursemanagement/support/assessments/safeassignment.html

XI. Class schedule

Class | Topic | Readings, HW Assignment | Homework Due
Mon 1/5 | Introduction to Longitudinal Analysis | Chapters 1, 2 (2.1-2.3), Appendix B, Class |
Wed 1/7 | Overview of Linear Models for Longitudinal Data - Part I | Chapters 2 (2.3-2.6) and 3 (3.1-3.5), Appendix A, Class |
Wed 1/7 | Lab session | Lab Exercises, HW #1 |
Mon 1/12 | Overview of Linear Models for Longitudinal Data - Part II | Chapters 2 (2.3-2.6) and 3 (3.1-3.5), Appendix A, Class |
Wed 1/14 | Marginal Linear Models for Correlated Data: Estimation and Inference | Chapter 4, Class | HW #1 due
Wed 1/14 | Lab session | Lab Exercises, HW #2 |
Mon 1/19 | Martin Luther King Day: NO CLASSES | |
Wed 1/21 | Modeling the Mean: Profile Analysis using PROC MIXED in SAS, Part I | Chapter 5, Class | HW #2 due
Wed 1/21 | Lab session | Lab Exercises |
Mon 1/26 | Modeling the Mean: Profile Analysis using PROC MIXED in SAS, Part II | Chapter 5, Class |
Wed 1/28 | Modeling the Mean: Growth Curve Analysis using PROC MIXED in SAS | Chapter 6, Class |
Wed 1/28 | Lab session | Lab Exercises, HW #3 |
Mon 2/2 | Modeling the Covariance | Chapter 7, Class |
Wed 2/4 | Linear Mixed-Effects Models for Correlated Data: Estimation, Prediction and Inference | Chapter 8 (8.1-8.7), Class |
Wed 2/4 | Lab session | Lab Exercises, HW #4 | HW #3 due
Mon 2/9 | Fitting Linear Mixed-Effects Models using PROC MIXED in SAS (1 hour); Review and Q & A (30 minutes) | Chapter 8 (8.8-8.9), Class |
Wed 2/11 | Mid-Term Exam | | HW #4 due
Wed 2/11 | Lab session | NO LAB |
Mon 2/16 | Fitting Linear Mixed-Effects Models using PROC MIXED in SAS (cont) | Chapter 8 (8.8-8.9), Class |
Wed 2/18 | Model Diagnostics and Goodness-of-Fit | Chapter 10, Class |
Wed 2/18 | Lab session | Lab Exercises, HW #5 |
Mon 2/23 | Introduction to Generalized Linear Models | Chapter 11 (11.1-11.3, 11.5-11.7), Class |
Wed 2/25 | Marginal Generalized Linear Models for Correlated Data: Generalized Estimating Equations (GEE) | Chapters 12 and 13 (13.1-13.2), Class | HW #5 due
Wed 2/25 | Lab session | Lab Exercises, HW #6 |
Mon 3/2 | Fitting Marginal Logistic and Log-linear Models to Correlated Longitudinal Data using PROC GENMOD in SAS | Chapter 13 (13.3-13.6), Class |
Wed 3/4 | Generalized Linear Mixed-Effects Models: Interpretation of Regression Parameters, Maximum Likelihood Estimation and Inference | Chapter 14 (14.1-14.5), Class |
Wed 3/4 | Lab session | Lab Exercises, HW #7 |
Mon 3/9 | Fitting Logistic and Log-Linear Mixed-Effects Models to Longitudinal Data using PROC GLIMMIX in SAS | Chapters 14 (14.7-14.8) and 15 (15.5-15.6), Class |
Wed 3/11 | Missing Data in Longitudinal Studies | Chapter 17, Class |
Wed 3/11 | Lab session | Lab Exercises, Review and Q & A, Take home exam |
Mon 3/16 | Introduction to Nonlinear Models (30 minutes), Review and Q & A session (1 hour) | Class |
Wed 3/18 | Final Exam | Turn in take home exam, In Class Exam | HW #6 due, HW #7 due

Additional resources (SAS, )

Optional readings:
Littell RC, Milliken GA, Stroup WW, Wolfinger RD and Schabenberger O (2006). SAS for Mixed Models. Cary, NC: SAS Institute Inc.
Vonesh EF (2012). Generalized Linear and Nonlinear Models for Correlated Data: Theory and Applications Using SAS. Cary, NC: SAS Institute Inc.