SDS 385 2: APPLIED REGRESSION, UNIQUE NO. 57555
and
PA397C: ADVANCED EMPIRICAL METHODS FOR POLICY ANALYSIS, APPLIED REGRESSION, UNIQUE NO. 61630
Spring 2017

Instructor: Dr. Jay Zarnikau
Classroom: SRH 3.122
Email: jayz@utexas.edu (When I reply, my consulting firm's email domain name might be on the email address, since I often forward UT emails to my consulting firm's server.)
Office: On Campus: SRH 3.273, Visiting Faculty Office on 2nd floor of Sid Richardson Hall (LBJ School of Public Affairs) near the Computer Lab. Sometimes, I may hold office hours in my office in the Economics Dept.
Office Hours: Each class day, I'll announce my office hours for the coming week. Additionally, I'll often be in my office late on Monday afternoons before class. Feel free to send me an email to schedule an appointment. Typically, I will hold office hours in the LBJ School's Computer Lab.
Teaching Assistant: Su Chen <s.chen@utexas.edu>

Description
This class will explore how various regression modeling techniques can be used to study practical problems and issues in the social sciences, public policy, and other fields. The course is intended to be interdisciplinary, and the content will be tailored to some degree to match the interests of the students. For students in the LBJ School of Public Affairs, this class builds upon the basic statistical concepts presented in Introduction to Empirical Methods (IEM) and qualifies as the second course in the core sequence in quantitative analysis. The focus of this class is on statistical modeling concepts and how to analyze different problems and data sets. Little attention will be devoted to formal proofs and derivations. The content of this course will be very similar to an introductory Econometrics course, but we will use less mathematics than is used in the Economics Department.
Course topics will include:
  A review of basic statistical concepts
  Dealing with uncertainty in research and policy problems
  The thought process behind setting up a statistical model
  Software (Excel and SAS will be used)
  Approaches to estimating relationships (e.g., least squares, likelihood estimation, MCMC simulation)
  Interpretation of regression statistics
  Hypothesis testing (from frequentist and Bayesian perspectives)
  Multivariate regression
  Variable selection, specification testing, and functional forms
  How to identify and address modeling problems (e.g., autocorrelation, multicollinearity, heteroskedasticity, omitted variables bias, misspecification, endogeneity bias)
  Dummy variables and qualitative variables
  Logit, Tobit, and Poisson models
  Basic index theory
  Time series approaches
  Spatial regression
  Forecasting techniques
  An introduction to Bayesian estimation
  Causality
  A brief introduction to more advanced approaches, including non-parametric regression and simulation techniques.

Objectives
Upon completing this course, you should be able to read, understand, critically interpret, and identify the strengths and limitations of many statistical studies that you encounter in policy reports and in the literature of the social sciences. You should also be able to competently analyze data sets using common statistical techniques.

Prerequisites
You should already be familiar with basic statistical concepts (including Ordinary Least Squares regression) and be able to read and understand simple mathematical notation. We will avoid the use of linear algebra. You will not need any prior background with SAS. If you are enrolled in the LBJ School of Public Affairs, IEM is a prerequisite.

Textbooks and References
There are no required textbooks for this class. I think we've reached a point where there are so many good references on these topics available on the Internet that there is no need to spend a lot of money on textbooks. Nonetheless, you might find it useful to acquire your own personal copy of some of the references described below. Many of the lectures will be based on the following textbook:
Damodar Gujarati, Basic Econometrics. Any edition is fine.
Again, this textbook is not required, but you might find it helpful. A copy is on Reserves in the PCL (the university's main library). There also appear to be a lot of PDFs of the textbook available on the Internet.

By the 4th lecture, we will start examining problems that are too difficult to study with a spreadsheet. At that point, we will start using SAS software. There are a number of good SAS tutorials floating around, and I'll provide you with links.

Our discussion of logit models will follow:
Allison, Paul, Logistic Regression Using the SAS System, John Wiley & Sons.
A copy of this book will be placed on Reserves in the PCL and some chapters will be posted on Canvas.

Software
One objective of this course is to make you feel comfortable with using statistical software programs. We will start off by analyzing data sets in Excel. You should already have some experience with Excel through prior coursework or work experience. Most of the exercises with Excel will require the Data Analysis Add-In. If you will be using your own computer, ensure that you can install this add-in, if you don't already have it on your computer.

SAS is installed on most computers in the Public Affairs Computer Lab. If you plan to use your own computer to work on class assignments, you may wish to acquire SAS from the UT Computation Center. Make sure your CD has the ETS (Econometrics and Time Series) module. SAS can also be accessed remotely through one of the Department of Statistics servers, via UT's computer networks.

Finally, if you plan to use your own personal computer to review lecture notes and supplemental readings, you will need software capable of reading Microsoft PowerPoint and Adobe Acrobat files.

Lecture Notes
My lecture notes will be posted on Canvas. My lectures will loosely follow the content of the Gujarati textbook and the other reference materials noted above.
However, I will try to explain the material a bit differently, provide additional examples that are less economics-focused, and provide some different perspectives on some of the topics. I will also provide additional readings, example Excel spreadsheets, and SAS programs on Canvas.

Homework Problems
Typically, I'll provide you with a data set and ask you to perform some analysis using Excel or SAS. I might ask you to describe how you would organize an empirical study to examine some issue or hypothesis of current importance. Generally, homework assignments are due one week after they are assigned and discussed in class. Assignments will be posted on Canvas. The simpler lesson assignments will be graded by the Teaching Assistant (if I have one). I'll grade the Mid-Term, Final Projects, and any lesson assignments that involve more creativity and judgment.
Final Project
I will help to establish teams to work together on a final project and will meet with each team to discuss appropriate topics. Appropriate topics might include some data analysis in a field in which you are interested or critiquing a recent policy report. You are free to work on an individual project, rather than a group project.

Grading Policies
Homework assignments count for 30% of your final grade. The mid-term and final projects each count for 35%. Your final aggregate homework grade will be the average of the scores on the top n-1 of the n total homework assignments. For example, if I end up assigning 6 homework problem sets, your grade will be based on your best 5 grades, i.e., I'll throw out the lowest grade. Generally, I do not accept late homework solutions, since we go over the solutions in class on the day on which they are due.

I consider a 93.3% or higher to be an A, a 90.0% to 93.3% to be an A-, an 86.7% to 90.0% to be a B+, an 83.3% to 86.7% to be a B, etc. Students who take an incomplete in this course will receive a reduction in their grade of at least 7 points if/when they make up the incomplete. I reserve the right to adjust anyone's grade upward based on class participation.

Alternative Path for Grading and Assignments
If you already have a strong background in statistics, econometrics, or operations research and are engaged in a major research project involving statistical analysis, I may be willing to let you substitute a much bigger term project for some of the homework assignments.

No Email or Facebook while in Class, Please.
Feel free to bring a laptop or tablet to class. But please keep your attention on the class material while you are in class!

Tentative Agenda
The content described below may change. In particular, we may explore some current issues in newspapers, journals, or policy reports in lieu of some of the topics identified here. Also, I may make some changes depending upon the interests of the class.

Class 1, Jan. 23: Introduction and Review of Approaches to Quantitative Analysis
  Preview of the course material.
  Solicit feedback from the class on class interests.
  Review basic probability theory
  Discuss the objectives of quantitative analysis and modeling in the social sciences, policy research, and other fields
  Discuss two competing theories of probability (frequentist and Bayesian)
  Review of basic regression analysis (OLS)
  Demonstrate data analysis using Excel

Class 2, Jan 30: Distributions, the Specification of Simple Linear Models, and ANOVA
  Review various probability distributions
  Discuss examples of simple linear models
  Time-series analysis vs. cross-sectional analysis
  Analysis of Variance

Class 3, Feb 6: Specification and Estimation of Simple Linear Models
  How estimates are conditional upon the model selected
  Interpretation of regression coefficients and regression statistics
  Controversies regarding interpretation of p-values and hypothesis tests
  Further estimation of regression models using Excel

Class 4, Feb 13: Simple Multivariate Regression Models
  Examples of simple linear multivariate models
  How to select appropriate variables
  Discuss two different ways of estimating regression coefficients: least squares vs. likelihood estimation
  Discussion of software options

Class 5, Feb 20: Identifying and Correcting Problems Related to Functional Form, Multicollinearity, Heteroskedasticity, Omitted Variables, and Autocorrelation
  Identifying common problems with regression models
  Using dummy independent variables
  Issues on model specification
  Handling non-linearities
  Subsample stability
  How do we address these common problems?
  Examining residuals to identify problems
  Consequences of various statistical problems
  Likelihood estimation
  Introduction to SAS

Class 6, Feb 27: Models with Binary or Truncated Dependent Variables
  Finish up non-linear relationships.
  An introduction to logit, Tobit, and Poisson regression
  Review of binomial distributions, Bernoulli processes
  Further demonstration of SAS software

Class 7, Mar 6: Further Discussion of Qualitative Dependent Variables
  Further discussion of qualitative variables
  Discuss the need for indices
  The aggregation problem
  Describe different index methods
  Discuss how indices are commonly used to present economic phenomena
  Likely date for mid-term exam

Spring Break. Yippee!

Class 8, March 20: Index Theory and Time-Series Models; Review and Discussion of Term Projects
  Discussion of common time-series techniques
  Forecasting using time-series models
  Identifying trends, cycles, and seasonality in time-series data
  Demonstration of time-series models using Excel and SAS
  Discuss term project topics

Class 9, Mar 27: Introduction to Panel Data; Simulation Techniques
  Get started on panel data concepts
  Data intensive simulation methods
  Fixed effects, random effects, and mixed models
  Hierarchical multilevel models

Class 10, Apr 3: Panel Techniques
  Panel data and cohort analysis
  Using seemingly unrelated regressions (Zellner's method) when you don't have true simultaneous equation models, but there clearly is some relationship among numerous regression equations.
Class 11, April 10: Simultaneous Equation Systems; Concepts in Forecasting
  When are simultaneous equation models appropriate?
  How can we estimate relationships and coefficients in simultaneous equation models?
  Demonstration of forecasting models
  Using simultaneous equation forecasting models
  Combining forecasts from different models

Class 12, April 17: Causality and Modeling
  Review of endogeneity and exogeneity.
  Are we explaining or predicting our Y variable(s)?
  Causal inferences in statistical models
  Philosophical background
  Can we test for causality?
  Possible additional topic: Spatial regression

Class 13, April 24: Bayes
  The Bayesian notion of probability.
  Bayes' law
  Is it acceptable to use prior knowledge and judgment?
  Is this the scientific method?
  Introduction to simulation techniques
  Multiplying probability distribution functions together and solving Bayesian problems with Markov Chain Monte Carlo techniques
  Examples in SAS
  Some term paper projects will be presented

Class 14, May 1: Presentation of Term Projects

Notices from the University
Students who violate University rules on scholastic dishonesty are subject to disciplinary penalties, including the possibility of failure in the course and/or dismissal from the University. Since such dishonesty harms the individual, all students, and the integrity of the University, policies on scholastic dishonesty will be strictly enforced. For further information, please visit the Student Judicial Services web site at: www.utexas.edu/depts/dos/sjs/.
At the beginning of the semester, students with disabilities who need special accommodations should notify the instructor by presenting a letter prepared by the Services for Students with Disabilities Office to ensure that appropriate accommodations can be provided.