MLLG - Linear and Generalized Linear Models


Coordinating unit: 200 - FME - School of Mathematics and Statistics
Teaching unit: 715 - EIO - Department of Statistics and Operations Research; 749 - MAT - Department of Mathematics
Academic year: 2017
Degree: MASTER'S DEGREE IN STATISTICS AND OPERATIONS RESEARCH (Syllabus 2013). (Teaching unit Optional)
ECTS credits: 5
Teaching languages: Spanish, English

Teaching staff
Coordinator: MARTA PÉREZ CASANY
Others: First semester: MARTA PÉREZ CASANY - A; JORDI VALERO BAYA - A

Prior skills
With respect to the Theory of Probability, students should know the basic probability distributions, their main properties and the situations that they can model appropriately. They should also be familiar with the main concepts of Statistical Inference covered in a first course in Statistics.

Requirements
Modelling starts from scratch, so there are no prerequisites. Nevertheless, some knowledge of linear regression and/or ANOVA will help students understand the subject better.

Degree competences to which the subject contributes
Specific:
MESIO-CE4. CE-4. Ability to use different inference procedures to answer questions, identifying the properties of different estimation methods and their advantages and disadvantages, tailored to a specific situation and a specific context.
MESIO-CE3. CE-3. Ability to formulate, analyze and validate models applicable to practical problems. Ability to select the most appropriate statistical or operations research method and/or technique to apply this model to the situation or problem.
MESIO-CE6. CE-6. Ability to use appropriate software to perform the necessary calculations in solving a problem.
MESIO-CE1. CE-1. Ability to design and manage the collection of information, and to code, handle, store and process it.
MESIO-CE7. CE-7. Ability to understand statistical and operations research papers of an advanced level. Knowledge of the research procedures for both the production of new knowledge and its transmission.
MESIO-CE9. CE-9. Ability to implement statistical and operations research algorithms.
MESIO-CE8. CE-8. Ability to discuss the validity, scope and relevance of these solutions, and to present and defend the conclusions.
Transversal:
CT3. TEAMWORK: Being able to work in an interdisciplinary team, whether as a member or as a leader, with the aim of contributing to projects pragmatically and responsibly and making commitments in view of the resources that are available.
CT5. FOREIGN LANGUAGE: Achieving a level of spoken and written proficiency in a foreign language, preferably English, that meets the needs of the profession and the labour market.
CT2. SUSTAINABILITY AND SOCIAL COMMITMENT: Being aware of and understanding the complexity of the economic and social phenomena typical of a welfare society, and being able to relate social welfare to globalisation and sustainability and to use technique, technology, economics and sustainability in a balanced and compatible manner.

Teaching methodology
The course is held in the second semester (S2) in an intensive format lasting 7 weeks. Each week there are two three-hour sessions, each divided into two parts with a 15-minute break. The first part is the theory session and takes place in a regular classroom. The second part takes place in a computer room, since it consists of analysing data sets with the statistical software R.

Learning objectives of the subject
The main objectives of this subject are that the students acquire:
1) Deep knowledge of LINEAR MODELS, in particular simple and multiple regression, ANOVA and ANCOVA.
2) Some skills in non-linear models that can be linearized.
3) Deep knowledge of GENERALIZED LINEAR MODELS, in particular logistic regression, log-linear models, models for polytomous data and models for Gamma response.
4) Knowledge of modelling using QUASI-LIKELIHOOD.
5) A substantial level of practice dealing with real data. This knowledge will be very useful when the students later collaborate with research groups in different areas in order to advise them on statistical matters.
6) Ability to draw conclusions and explain them.
These skills will allow the student:
1) To assimilate other subjects more easily later on, such as LONGITUDINAL MODELS or BAYESIAN ANALYSIS.
2) To collaborate, at the end of the Master, with research groups of different kinds and give advice from the statistical point of view.

Study load
Total learning time: 125h
Hours large group: 30h (24.00%)
Hours small group: 15h (12.00%)
Self study: 80h (64.00%)

Content

Linear Model
Learning time: 18h (Theory classes: 10h 30m; Laboratory classes: 7h 30m)
Presentation and Linear Model.
1.1. Generalities. Objectives. Definition. Hypotheses. Matrix formulation. Examples and counter-examples. Parameter estimation. Parameter distribution. Residuals. Goodness-of-fit techniques. Checking the model hypotheses.
1.2. Analysis of Variance. One-factor ANOVA: parameter estimation. Confidence intervals for the means and for differences of means. Multiple comparisons. Randomized block designs. Two-way ANOVA. Designs with nested factors. Designs with crossed and nested factors.
1.3. Multiple linear regression. Simple linear regression: parameter estimation, coefficient of determination, mean square error, confidence intervals for the parameters and estimations, model adequacy checking. Multiple regression: collinearity, causality, robust models and outlier detection. Parsimony principle. ANOVA table. Common mistakes in regression.
1.4. Transformations to obtain linearity, normality and/or homoscedasticity. Non-linear models that can be linearized.

Exponential families
Learning time: 6h 45m (Theory classes: 3h 45m; Practical classes: 3h)
Definition. Canonical parameter. Parameter space. Minimal and sufficient statistic. Examples and counter-examples. Complete and regular exponential models. Moment and cumulant generating functions. Different parametrizations of the same model. Maximum likelihood estimation.
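
As an illustration of the exponential-family topics listed above (canonical parameter, cumulant generating function), here is a brief sketch for a one-parameter family. The notation (h, T, \kappa, \Theta) is an assumption chosen for this example and may differ from the notation used in the course.

```latex
% One-parameter exponential family in canonical form (notation assumed)
f(y;\theta) = h(y)\,\exp\{\theta\,T(y) - \kappa(\theta)\}, \qquad \theta \in \Theta .

% Cumulant generating function of T(Y), and its first two cumulants
K_T(s) = \kappa(\theta + s) - \kappa(\theta), \qquad
\mathrm{E}[T(Y)] = \kappa'(\theta), \qquad
\mathrm{Var}[T(Y)] = \kappa''(\theta).

% Example: Poisson(\lambda) with canonical parameter \theta = \log\lambda
f(y;\theta) = \frac{1}{y!}\,\exp\{\theta y - e^{\theta}\}, \qquad
\kappa(\theta) = e^{\theta}, \qquad
\mathrm{E}[Y] = \mathrm{Var}[Y] = e^{\theta} = \lambda .
```

The Poisson case also shows why the mean and variance coincide in that model, which is the starting point for the overdispersion topics in the generalized linear models block.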

Generalized Linear Models
Learning time: 16h 30m (Theory classes: 9h; Practical classes: 7h 30m)
3.1. Basic concepts. Objectives. Definition. Hypotheses. Link function and canonical link function. Variance function. Dispersion parameter. Parameter estimation and asymptotic distribution of the estimators. Goodness-of-fit measures: deviance, scaled deviance, generalized Pearson X^2 statistic. AIC. Residuals.
3.2. Models for binary data. Grouped and ungrouped data. Important link functions for binary data. Logit model: parameter interpretation, deviance, likelihood ratio test. Wald test. Confidence intervals for the probabilities. Contingency tables with given marginals. Overdispersion.
3.3. Models for polytomous data. Models for ordinal responses. Models for nominal responses. Contingency tables with given total.
3.4. Models for count data. Poisson model. Overdispersion. Models with mixed Poisson distribution. Zero-inflated Poisson models. Contingency tables with unknown total and unknown marginals.
3.5. Quasi-likelihood models. When are they necessary? Definition. Parameter estimation. Goodness of fit. Quasi-residuals. Comparative analysis of likelihood and quasi-likelihood models.

Qualification system
60% of the final mark comes from the Final Exam. This exam contains a theoretical part and a practical part, both with the same weight. The remaining 40% comes from the activities carried out during the course. The activities and their weights are the following:
1) Reading, report and oral presentation of a scientific paper (10%).
2) Mini Exam consisting of 10 short questions (10%).
3) Two assignments in which the student models a data set with R (20%).

Regulations for carrying out activities
The Mini Exam and the Final Exam are closed book, but students may need to bring a calculator and statistical tables.
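
The weighting described in the qualification system can be checked with a short calculation. The sketch below is only illustrative: the component names, the 0 to 10 marking scale and the example marks are assumptions, and the two R modelling tasks are treated as a single combined mark because the syllabus does not say how the 20% is split between them.

```python
# Weights taken from the qualification system; the final exam's 60% is split
# equally between its theoretical and practical parts (30% each).
WEIGHTS = {
    "exam_theory": 0.30,
    "exam_practice": 0.30,
    "paper": 0.10,        # reading, report and oral presentation
    "mini_exam": 0.10,
    "assignments": 0.20,  # the two R modelling tasks, combined (assumption)
}


def final_mark(marks):
    """Return the weighted average of the component marks."""
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical marks on an assumed 0-10 scale.
example = {
    "exam_theory": 6.0,
    "exam_practice": 8.0,
    "paper": 9.0,
    "mini_exam": 7.0,
    "assignments": 8.5,
}
print(round(final_mark(example), 2))
```

With these hypothetical marks the exam contributes 4.2 points and the course activities 3.3 points to the final mark.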

Bibliography
Basic:
Seber, G.A.F.; Lee, A.J. Linear regression analysis. Wiley, 2003.
Dobson, A.J. An introduction to generalized linear models. Chapman and Hall, 1990.
Fox, J. Applied regression analysis and generalized linear models. Sage, 2008.
Fox, J.; Weisberg, S. An R companion to applied regression. Sage, 2011.
Complementary:
McCullagh, P.; Nelder, J.A. Generalized linear models. Chapman and Hall, 1989.
Collett, D. Modelling binary data. Chapman and Hall, 2003.
Lindsey, J.K. Applying generalized linear models. Springer, 1997.
Montgomery, D. Design and analysis of experiments. 8th ed. Wiley, 2013.