Applied Multivariate Statistics

Applied Multivariate Statistics
Fall Semester 2017
University of Mannheim, Department of Economics, Chair of Statistics
Toni Stocker

Applied Multivariate Statistics (AMS) - Content
  Introduction to AMS                      1
  Matrix Algebra                          20
  Multivariate Samples                    57
  Principal Component Analysis (PCA)      77
  Biplots                                129
  Factor Analysis                        141
  Multidimensional Scaling (MDS)         152
  Cluster Analysis                       170
  Linear Discriminant Analysis (LDA)     183
  Binary Response Models                 194
  Correspondence Analysis                212

Introduction to AMS 1

General Course Information
Prerequisites
- Students in Economics from Mannheim: no problem.
- All other students: should have attended two or more courses in Statistics (descriptive statistics, estimation and hypothesis testing).
A course in Basic Econometrics is helpful but not strictly required.
The statistical software R will be used intensively throughout this course. Students who are not yet familiar with R should work through chapters 1-5 of the R introduction (see course folder) on their own by September 15 at the latest.
If you are not yet sure whether you will attend this course, you may read sections 1.1 and 1.2 in Johnson & Wichern (see p. 4) to get an idea of the purposes of this course.
Though R is easy to learn, you need to invest some time at the beginning, but you may benefit from it for a long time.
2

General Course Information
Time and Locations
              Day      Time          Location
  Lecture     Friday   10:15-11:45   L7, 3-5, P043
  Tutorial    Friday   08:30-10:15   L7, 3-5, P043
Tutorials start in the 2nd week.
Contact
  Office hour: Wednesday, 3:00-4:30 p.m., or by appointment
  Office: L7, 3-5, 1st floor, room 143
  Phone: 0621-181-3963
  Email: stocker@rumms.uni-mannheim.de
3

General Course Information
Course Material
Slides (lecture), assignments (tutorials), Introduction to R (see p. 2)
Material is updated weekly (Fridays) and can be found in the course folder on the Studierendenportal (ILIAS).
References
R. Johnson, D. Wichern (2007): Applied Multivariate Statistical Analysis; Pearson Intl. Ed. (main reference)
A. Rencher (2002): Methods of Multivariate Analysis; Wiley.
W. Härdle, L. Simar (2003): Applied Multivariate Statistical Analysis; Wiley.
A. J. Izenman (2008): Modern Multivariate Statistical Techniques; Springer.
P. Hewson (2009): Multivariate Statistics with R; Open Text Book.
4

Examination
Exam + Assignments: 80% written exam (120 minutes) + 20% assignments, measured in points earned out of 100 in total.
Example:
  Written exam:  60 (out of 80)
  Assignments:   18 (out of 20)
  Total:         78 (out of 100)
  => Grading will be based on 78 points (out of 100).
Minimum for passing: 40
Assignments: You need to submit homework and attend the tutorial. To get full points (20) you need to work on at least 10 assignments (out of 11) in a meaningful way. (See Guidelines for Assignments.)
5
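The point arithmetic in the example can be sketched in a few lines of Python. This is only an illustration: the function names `total_points` and `has_passed` are hypothetical, and the sketch assumes that the passing minimum of 40 refers to the total out of 100, which the slide does not state explicitly.

```python
def total_points(exam_points, assignment_points, max_exam=80, max_assignments=20):
    """Add exam and assignment points; the maxima follow the 80/20 split."""
    if not 0 <= exam_points <= max_exam:
        raise ValueError("exam points out of range")
    if not 0 <= assignment_points <= max_assignments:
        raise ValueError("assignment points out of range")
    return exam_points + assignment_points


def has_passed(total, minimum=40):
    # Assumption: the minimum of 40 applies to the total out of 100.
    return total >= minimum


total = total_points(60, 18)
print(total, has_passed(total))  # 78 True
```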

Issues of Applied Multivariate Statistics (AMS)
Multivariate analysis consists of a collection of methods that can be used when several measurements are made on each individual or object in one or more samples. (See Rencher (2002), p. 1)
Objectives
- Dimension reduction and structural simplification
- Visualization of high-dimensional data
- Investigation of the dependence among variables
- Grouping, discrimination and classification
Close link to other areas such as Exploratory Data Analysis (EDA) and Data Mining (see also J&W (2007), p. 2)
6

Example 1: Dimension Reduction
What is it about?
Economic Indicators for the 27 European Union Countries in 2011 (see WIREs Comput Stat 2012, 4:399-406. doi: 10.1002/wics.1200)
7

Example 2: Modern Graphical Techniques What is it about? 8

Example 2... 9

Example 3: Factor Analysis
Consumer Preference (J&W, example 9.9, p. 508)
Correlation matrix R:
                                  (1)    (2)    (3)    (4)    (5)
  (1) Taste                      1.00   0.02   0.96   0.42   0.01
  (2) Good buy for money         0.02   1.00   0.13   0.71   0.85
  (3) Flavor                     0.96   0.13   1.00   0.50   0.11
  (4) Suitable for snack         0.42   0.71   0.50   1.00   0.79
  (5) Provides lots of energy    0.01   0.85   0.11   0.79   1.00
10
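To see which variables move together in the correlation matrix above, one can list the strongly correlated pairs. The following Python sketch does only that; it is not a factor analysis, and the helper names and the cut-off of 0.7 are assumptions chosen for illustration.

```python
LABELS = ["Taste", "Good buy for money", "Flavor",
          "Suitable for snack", "Provides lots of energy"]

# Correlation matrix R as given on the slide.
R = [
    [1.00, 0.02, 0.96, 0.42, 0.01],
    [0.02, 1.00, 0.13, 0.71, 0.85],
    [0.96, 0.13, 1.00, 0.50, 0.11],
    [0.42, 0.71, 0.50, 1.00, 0.79],
    [0.01, 0.85, 0.11, 0.79, 1.00],
]


def is_symmetric(matrix):
    """A correlation matrix must equal its own transpose."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i))


def strong_pairs(matrix, labels, threshold=0.7):
    """Return (label_i, label_j, r) for i < j with |r| >= threshold, strongest first."""
    n = len(matrix)
    pairs = [(labels[i], labels[j], matrix[i][j])
             for i in range(n) for j in range(i + 1, n)
             if abs(matrix[i][j]) >= threshold]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


assert is_symmetric(R)
for a, b, r in strong_pairs(R, LABELS):
    print(f"{a} / {b}: {r:.2f}")
```

With this cut-off, Taste and Flavor form one strongly correlated pair (0.96), while Good buy for money, Suitable for snack and Provides lots of energy are linked among themselves. This is the kind of grouping that factor analysis tries to explain with a small number of underlying factors.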

Example 4: Distances
Voting results for 15 congressmen from New Jersey (example from R package HSAUR)
Excerpt from the distance matrix:
                 Hunt(R)   Sandman(R)   Howard(D)
  Hunt(R)           0          8           15
  Sandman(R)        8          0           17
  Howard(D)        15         17            0
11
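A distance matrix like the excerpt above can be read row by row: for each congressman, the smallest off-diagonal entry identifies the closest colleague. The Python sketch below uses only the three rows shown on the slide; the function name `nearest_neighbour` is hypothetical.

```python
NAMES = ["Hunt(R)", "Sandman(R)", "Howard(D)"]

# Excerpt of the distance matrix as given on the slide.
D = [
    [0, 8, 15],
    [8, 0, 17],
    [15, 17, 0],
]


def nearest_neighbour(dist, names, i):
    """Return (name, distance) of the row closest to row i, excluding i itself."""
    j = min((k for k in range(len(dist)) if k != i), key=lambda k: dist[i][k])
    return names[j], dist[i][j]


for i, name in enumerate(NAMES):
    print(name, "->", nearest_neighbour(D, NAMES, i))
```

In this excerpt the two Republicans are each other's nearest neighbour (distance 8), and both are farther from the Democrat (15 and 17).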

Example 5: Grouping 12

Example 5... 13

Example 6: Classification Labor Market Participation of Married Women in Switzerland (1981) (example from R package AER) 14

Example 7: Discrimination Heights and weights of students 15

Example 8: Correspondence Analysis What is it about? Baccalauréat in France (Härdle & Simar, p. 313) 16

Course Outline
Generally: Chapters 1-4, 8, 9, 11, 12 from Johnson and Wichern (J&W)
Timetable and Contents
Lecture 1: Introduction (today)
Lecture 2: Matrix Algebra (part 1)
Lecture 3: Matrix Algebra (part 2)
Lecture 4: Multivariate Samples
Lecture 5: Principal Component Analysis
Lecture 6: Biplots
Lecture 7: Factor Analysis
17

Lecture 8: Multidimensional Scaling
Lecture 9: Cluster Analysis
Lecture 10: Linear Discriminant Analysis
Lecture 11: Binary Response Models
Lecture 12: Correspondence Analysis
Lecture 13: Used as time buffer
Note: This is just a plan! Topics may be skipped, the order may be changed, and lecture topics may overlap.
18

Main Objectives
At the end of the semester you
- know and (hopefully) understand the most common methods for analyzing multivariate data and their theoretical background
- can use R proficiently for multivariate techniques: data import, constructing graphics, inference, model diagnosis and assessment
- have experienced the possibilities and limitations of multivariate methods on the basis of real data examples
Generally: This is an introductory and applied course. Modern multivariate techniques based on machine learning algorithms will hardly be covered.
19