Applied Functional Data Analysis. What is Functional Data?


Applied Functional Data Analysis

Venue: Tuesday/Thursday 11:40-12:55, WN 360
Lecturer: Giles Hooker
Office Hours: Wednesday 2-4, Comstock 1186
Ph: 5-1638
e-mail: gjh27
http://www.bscb.cornell.edu/ hooker/fda2008/
See also Blackboard

What are the most obvious features of these data?
  quantity
  frequency (resolution)

  similar trends
  Most important: smoothness

These data describe (nearly) a process that changes smoothly and continuously over time.
Functional Data Analysis = Analysis of data that are functions.

Domain is usually time, but can be anything: space, energy...
Functional data analysis involves repeated measures of the same process.
  20 replications, 1401 observations within replications
  complicated: not easily described by mathematical formulae
  variation between replications even harder to describe
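The replication structure described above (20 replications, 1401 observations within each) can be made concrete with a minimal Python sketch. It assumes simulated curves, not the actual data shown on the slides: each replication is stored as one list of values on a shared time grid, and the mean and standard deviation are computed across replications at each time point. The curve shape, noise levels and the names `make_replication`, `pointwise_mean` and `pointwise_sd` are illustrative assumptions and are not part of the course software.

```python
import math
import random

N_REPS = 20      # replications, as in the example above
N_OBS = 1401     # observations within each replication


def make_replication(rng, n_obs=N_OBS):
    """Simulate one curve: a smooth shape with random amplitude and phase shift.

    The shape is made up for illustration; it is not the data from the slides.
    """
    amp = 1.0 + rng.gauss(0.0, 0.1)
    shift = rng.gauss(0.0, 0.02)
    grid = [i / (n_obs - 1) for i in range(n_obs)]
    return [amp * math.sin(2 * math.pi * (t - shift)) for t in grid]


def pointwise_mean(curves):
    """Mean across replications at each time point."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def pointwise_sd(curves):
    """Sample standard deviation across replications at each time point."""
    n = len(curves)
    means = pointwise_mean(curves)
    return [
        math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
        for values, m in zip(zip(*curves), means)
    ]


rng = random.Random(0)
# 20 observations, each of which is a whole curve
curves = [make_replication(rng) for _ in range(N_REPS)]
mean_curve = pointwise_mean(curves)
sd_curve = pointwise_sd(curves)
print(len(curves), len(mean_curve), len(sd_curve))
```

Treating each list in `curves` as one observation is the point of the sketch: the sample size is 20, and the summaries are themselves curves of length 1401.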

  often a large number of related quantities
  viewing each replication as a single observation can make the data easier to think about (once we have the right machinery)

What are these data, anyway?

Classical Functional Data
Measures of position of nib of a pen writing "fda".
20 replications, measurements taken at 200 hertz.
What if I plot one component against another?

Characteristics
  Data are measurements of smooth processes over time
  We usually do not want to make parametric assumptions about those processes.
  Often have multiple measurements of the same process
  We are interested in describing the variation of processes.
  Frequently, collected data have high resolution and low noise.
  Can be applied to any estimate of a smooth process.

About Functional Data Analysis
1 FDA is New
  First named in Ramsay & Dalzell, 1991
  Relatively little penetration into applied fields (= easy publication)
  Several competing methodologies (we focus on one)
  Limited public software/resources
  data analysis rather than inference
2 Functional Data is Complex
  Requires more thought/judgement than a t-test
  data needs pre-processing
  parametric inference is rarely available/appropriate

Audience: application areas with functional data
Focus: What can Functional Data Analysis do? How do I make it happen?
Software: packages in R, Matlab
Goals: Enabling you to
  Understand and interpret the result of FDA applied to real data
  Use existing FDA libraries to analyze functional data
  Evaluate its usefulness/correctness
  Extend the methods in existing software if you need to
Not Covered: reproducing-kernel Hilbert spaces, asymptotics, theorems...

Pre-requisites and Recommendations
Pre-requisites: BTRY 601 and 602 or equivalent (at least multiple linear regression)
Useful: Life will be easier if you do not need to learn some of the following:
  R/Matlab or other programming experience
  Calculus
  Matrix algebra
  Multivariate statistics
  Computational statistics
Any necessary material will be covered in class, but will be out of context.

Resources
Textbook: Ramsay and Silverman, 2005, Functional Data Analysis, Springer.
Books:
  Ramsay and Silverman, 2002, Applied Functional Data Analysis, Springer.
  Chapters from Ramsay, Graves and Hooker, (2009, hopefully) Functional Data Analysis in R.
Online:
  http://www.functionaldata.org for FDA
  http://www.r-project.org a general site for R
  http://www.bscb.cornell.edu/ hooker/fda2008 All class notes, exercises etc will be posted here.
Class materials will also be posted to Blackboard; a general discussion board has also been set up.

Assessment
3 Assignments (20% each)
  Using the FDA libraries to analyze data
  Interpreting results of this analysis
  Some simulation studies
Class Project (40%)
  Analysis of real-world data
  End of semester presentation
  Short written report.
  More details later.
Policies:
  you are welcome to discuss homework, but you should do and write it individually
  project may be done as a group, but should be submitted with a statement of who did which parts

Back to "What is Functional Data" (Or: What isn't Functional Data?)
Do my data need to look this good?

Data may be measured more noisily
  We need to find the smooth process under the data.
Data may be measured more sparsely
  We may not have repeated measurements
Data are low noise but low-resolution
  Measured at unequal intervals
  We know that the curves must always increase
Single time series
  But, repeated "shapes" over each year
  We can use this to investigate variation, development, dynamics
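Finding "the smooth process under the data" can be illustrated with a small basis-expansion fit. The sketch below is only one possible approach and is not the method implemented in the course's FDA libraries: it assumes equally spaced observations covering exactly one period, so that a truncated Fourier basis is orthogonal on the grid and the least-squares coefficients reduce to simple projections. The function names, the simulated curve and the noise level are all assumptions made for illustration.

```python
import math
import random


def fourier_smooth(y, n_harmonics):
    """Least-squares fit of a truncated Fourier series to equally spaced data.

    Assumes y[j] is observed at t_j = j / n for j = 0..n-1 (one full period)
    and n_harmonics < n / 2, so the basis functions are orthogonal on the grid
    and each coefficient is a simple projection.
    """
    n = len(y)
    if not 0 <= n_harmonics < n / 2:
        raise ValueError("n_harmonics must satisfy 0 <= n_harmonics < n / 2")
    t = [j / n for j in range(n)]
    fit = [sum(y) / n] * n  # constant term
    for k in range(1, n_harmonics + 1):
        cos_k = [math.cos(2 * math.pi * k * tj) for tj in t]
        sin_k = [math.sin(2 * math.pi * k * tj) for tj in t]
        a = 2.0 / n * sum(yj * c for yj, c in zip(y, cos_k))
        b = 2.0 / n * sum(yj * s for yj, s in zip(y, sin_k))
        fit = [f + a * c + b * s for f, c, s in zip(fit, cos_k, sin_k)]
    return fit


def mse(u, v):
    """Mean squared difference between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


# Simulated example: a smooth process observed with measurement noise.
rng = random.Random(1)
n = 200
truth = [
    math.sin(2 * math.pi * j / n) + 0.5 * math.cos(4 * math.pi * j / n)
    for j in range(n)
]
noisy = [x + rng.gauss(0.0, 0.3) for x in truth]
smooth = fourier_smooth(noisy, n_harmonics=3)

print("raw error:     ", mse(noisy, truth))
print("smoothed error:", mse(smooth, truth))
```

The number of harmonics plays the role of a smoothing parameter: too few and real features are lost, too many and the fit follows the noise. Sparse, unequally spaced or monotone data, as listed above, would need a different basis or a constrained fit.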

Necessities for Functional Data
  must believably derive from a smooth process
  process should not be easily parameterizable (should not be able to write down a formula)
  enough data to resolve the essential features of the process (peaks, zero-crossings, speed... will depend on application)
  some repetition in the process
  do not need equally-spaced or perfect measurements

Common Sources
  medical monitoring: EEG, ECG, fMRI, blood pressure...
  medical tests: HIV antibodies, flu screens...
  biology: animal behavior (whale songs, fly egg-laying...)
  environmental monitoring: weather, pollution, solar radiation, traffic...
  Optotrak experiments: psychology/physiology
  economics/marketing: macro-trends, futures markets
  web data: eBay auction prices, Google Trends

Essential Questions (Or: what can FDA do for me?)
  How do we go from discrete to functional data?
  How do we describe random variation in functional data?
  How do we decide if groups of functional data are different?
  How do we relate functional data to other data? To other functional data?
  What is special about functional data?
    Aligning functions (registration)
    Use of rates of change (dynamics)

Approximate Class Agenda
1 Introduction, R, Projects (weeks 1 and 2)
2 From data to functional data (weeks 3-6/7)
  Basis expansions and smoothing
  The fda library
  Positive and monotone smoothing
  No classes Sept 16 and 18
3 Exploring Functional Data (weeks 7-9)
  Means, variances, covariances
  Functional PCA
4 Functional Linear Models (weeks 9-11)
5 Registration (week 12)
6 Dynamic Models (weeks 13-14)
7 Project Presentations (week 15)
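As a rough illustration of the "use of rates of change (dynamics)" item above, the following Python sketch estimates first and second derivatives of a sampled curve by finite differences. It is a simplified stand-in, not the approach taken by the course software: the curve is made up, the grid spacing of 0.005 is chosen to match 200 observations per second, and `central_difference` is a hypothetical helper name.

```python
import math


def central_difference(y, dt):
    """Estimate dy/dt on an equally spaced grid.

    Uses central differences in the interior and one-sided differences
    at the two end points.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    d = [0.0] * n
    d[0] = (y[1] - y[0]) / dt
    d[-1] = (y[-1] - y[-2]) / dt
    for i in range(1, n - 1):
        d[i] = (y[i + 1] - y[i - 1]) / (2 * dt)
    return d


dt = 0.005  # spacing for 200 observations per second
t = [i * dt for i in range(401)]
position = [math.sin(2 * math.pi * ti) for ti in t]
velocity = central_difference(position, dt)
acceleration = central_difference(velocity, dt)

# For this made-up curve the exact velocity is 2*pi*cos(2*pi*t).
max_err = max(
    abs(v - 2 * math.pi * math.cos(2 * math.pi * ti))
    for v, ti in zip(velocity[1:-1], t[1:-1])
)
print("largest interior velocity error:", max_err)
```

Differencing amplifies measurement noise, which is one reason to estimate a smooth function first and differentiate that, not the raw observations.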