Multivariate models and machine learning for fMRI

Methods and Models in fMRI, 15.11.2016
Jakob Heinzle, heinzle@biomed.ee.ethz.ch
Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering (IBT), University and ETH Zürich
Many thanks to Sudhir Raman and Kay Brodersen for material.

Overview
- Motivation
- Modelling Terminology
- Learning from data
- Multivariate Bayes in SPM
- Generative Embedding
fMRI Analysis and Classification

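Why multivariate?
Univariate approaches are excellent for localizing activations in individual voxels.
[Figure: responses of voxels v1 and v2 for reward vs. no reward; one comparison is significant (*), the other is not (n.s.).]

Why multivariate?
Multivariate approaches can be used to examine responses that are jointly encoded in multiple voxels.
[Figure: responses of voxels v1 and v2 for orange juice vs. apple juice; neither voxel differs significantly on its own (n.s.), next to a joint plot of v1 against v2.]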
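The point of the two slides above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic, the noise model (strong noise shared by both voxels, small condition effect of opposite sign) is hypothetical, and accuracies are computed in-sample with a simple midpoint threshold rather than a cross-validated classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000  # trials per condition (hypothetical)

def simulate(offset):
    """Two voxels with strong shared noise; the condition shifts them in opposite directions."""
    shared = rng.normal(0.0, 1.0, n)
    v1 = shared + offset + rng.normal(0.0, 0.1, n)
    v2 = shared - offset + rng.normal(0.0, 0.1, n)
    return np.column_stack([v1, v2])

orange = simulate(+0.1)
apple = simulate(-0.1)

def threshold_accuracy(a, b):
    """Accuracy of separating two 1-D samples at the midpoint of their means."""
    cut = 0.5 * (a.mean() + b.mean())
    sign = 1.0 if a.mean() > b.mean() else -1.0
    return 0.5 * (np.mean(sign * (a - cut) > 0) + np.mean(sign * (b - cut) < 0))

# Each voxel on its own carries little information ...
acc_v1 = threshold_accuracy(orange[:, 0], apple[:, 0])
acc_v2 = threshold_accuracy(orange[:, 1], apple[:, 1])
# ... but the joint pattern (here the difference v1 - v2) separates the conditions well.
acc_joint = threshold_accuracy(orange[:, 0] - orange[:, 1], apple[:, 0] - apple[:, 1])

print(f"v1 alone: {acc_v1:.2f}, v2 alone: {acc_v2:.2f}, joint: {acc_joint:.2f}")
```

Taking the difference cancels the shared noise, which is why the joint readout works where the single voxels do not.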

A bit of history: multidimensional scaling
Two-dimensional projection of a similarity measure for both the psychophysical rating and the fMRI response.
[Figure panels: psychophysical rating; fMRI.]
Edelman et al., Psychobiology, 1998

A bit of history: classification studies
Haxby et al., Science, 2001

A bit of history: classification studies
Kamitani and Tong, Nat Neurosci, 2005

Representational similarity analysis
Idea: compare the similarity of representations (correlation between activation patterns) across different stimuli.
This allows a comparison between monkey (neural firing patterns) and human (fMRI activation patterns) data.
Kriegeskorte et al., Neuron, 2008
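A minimal sketch of the idea, assuming synthetic data and hypothetical array sizes: each measurement modality yields one activation pattern per stimulus, a dissimilarity matrix is built as 1 minus the correlation between patterns, and the two matrices are then compared. Using a Pearson correlation for that second step is an assumption made here for brevity; rank correlations are a common alternative.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation between rows."""
    return 1.0 - np.corrcoef(patterns)

def compare_rdms(rdm_a, rdm_b):
    """Correlate the upper triangles (diagonal excluded) of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    return float(np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1])

rng = np.random.default_rng(1)
n_stimuli = 6
latent = rng.normal(size=(n_stimuli, 3))  # hypothetical stimulus structure shared by both systems

# Rows are stimuli, columns are measurement channels (voxels or neurons).
fmri = latent @ rng.normal(size=(3, 50)) + 0.1 * rng.normal(size=(n_stimuli, 50))
neurons = latent @ rng.normal(size=(3, 30)) + 0.1 * rng.normal(size=(n_stimuli, 30))

rdm_fmri = rdm(fmri)
rdm_neurons = rdm(neurons)
similarity = compare_rdms(rdm_fmri, rdm_neurons)
print(f"RDM similarity: {similarity:.2f}")
```

Because the comparison happens at the level of stimulus-by-stimulus matrices, the two data sets do not need the same number of channels.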

Overview
- Motivation
- Modelling Terminology
- Learning from data
- Multivariate Bayes in SPM
- Generative Embedding

Analysis steps
- Feature extraction
- Modelling: classification, clustering, regression
- Inference: cross-validation, performance, prediction, model selection

Feature space
[Figure: an N x P data matrix with data points S_1 ... S_N as rows and features F_1 ... F_P as columns. Features can be discrete (e.g. 0 or 1) or continuous (e.g. 0.5, 5.7, 5.3, 6.6).]

Feature selection for fMRI multivariate analysis
Different features answer different questions. Reducing the dimensionality might reduce noise, but could also remove relevant information.
- Raw data
- Mean values
- Correlations between regions
- Model parameters, e.g. DCM

Model selection: generalizability
[Figure: model fit as a function of model complexity.]
Bishop (2006); Pitt & Myung (2002), TICS

Encoding and decoding models
- Context (cause or consequence): $X_t \in \mathbb{R}^d$, e.g. condition, stimulus, response, prediction error
- BOLD signal: $Y_t \in \mathbb{R}^v$
- Encoding model $g: X_t \to Y_t$
- Decoding model $h: Y_t \to X_t$
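The two mapping directions can be sketched with ordinary least squares. Everything below is an illustrative assumption: the data are synthetic, the sizes T, d and v are hypothetical, and a linear map stands in for whatever form g and h take in a real analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
T, d, v = 200, 2, 20                        # scans, context variables, voxels (hypothetical)
X = rng.normal(size=(T, d))                 # context X_t in R^d
W = rng.normal(size=(d, v))                 # hypothetical ground-truth weights
Y = X @ W + 0.5 * rng.normal(size=(T, v))   # BOLD signal Y_t in R^v

train, test = slice(0, 150), slice(150, T)

# Encoding model g: X_t -> Y_t (predict voxel responses from the context).
G, *_ = np.linalg.lstsq(X[train], Y[train], rcond=None)
Y_pred = X[test] @ G

# Decoding model h: Y_t -> X_t (predict the context from voxel responses).
H, *_ = np.linalg.lstsq(Y[train], X[train], rcond=None)
X_pred = Y[test] @ H

encoding_error = float(np.mean((Y[test] - Y_pred) ** 2))
decoding_error = float(np.mean((X[test] - X_pred) ** 2))
print(f"encoding MSE: {encoding_error:.3f}, decoding MSE: {decoding_error:.3f}")
```

Both models are evaluated on held-out scans, so the errors reflect prediction rather than fit.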

Modelling goals: prediction
[Figure: a decoding model h maps Y to X; predictive density.]

Modelling goals: model selection
Compare coding hypotheses, e.g. sparse coding vs. distributed coding, using the model evidence.

Overview
- Motivation
- Modelling Concepts
- Learning from data
- Multivariate Bayes in SPM
- Generative Embedding

Learning from data
- Supervised learning: labels for the training data are known.
- Unsupervised learning: labels for the training data are NOT known.
- Reinforcement learning
- Semi-supervised learning

Supervised learning
A function f maps the independent variables X to the dependent variable Y, which can be continuous (regression) or categorical (classification).

Classification
A function f maps X to a categorical Y.
Kernel methods, e.g. support vector machines, use a feature map $\phi$ and a kernel function
$K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$
Shawe-Taylor & Cristianini, Kernel Methods for Pattern Analysis, 2004
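As a worked example of the kernel identity, the sketch below uses the homogeneous quadratic kernel in two dimensions. The choice of kernel and the two input vectors are illustrative assumptions; the slide does not specify a particular kernel.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the homogeneous quadratic kernel for 2-D inputs."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def kernel(x, z):
    """Quadratic kernel K(x, z) = (x . z)^2, evaluated without the feature map."""
    return float(np.dot(x, z)) ** 2

x_i = np.array([1.0, 2.0])
x_j = np.array([3.0, -1.0])

via_feature_map = float(np.dot(phi(x_i), phi(x_j)))  # inner product in feature space
via_kernel = kernel(x_i, x_j)                         # same value from the input space

print(via_feature_map, via_kernel)
```

The kernel gives the feature-space inner product while only ever touching the original two-dimensional inputs, which is what makes high-dimensional feature maps tractable.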

Other popular classifiers
- Gaussian processes: C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006
- Deep belief networks: http://deeplearning.net/tutorial/dbn.html; G. E. Hinton, S. Osindero, and Y. Teh, A fast learning algorithm for deep belief nets, Neural Computation, vol. 18, 2006

Generative and discriminative classifiers
- Generative classifiers learn the parameters of p(y) and p(x | y), e.g. the naïve Bayes classifier.
- Discriminative classifiers learn the parameters of p(y | x), e.g. logistic regression, SVM.
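The link between the two families is Bayes' rule: a generative classifier obtains the posterior over labels from the quantities it has learned, whereas a discriminative classifier models that posterior directly. Written out for discrete labels (the summation index y' is introduced here for illustration):

```latex
\[
  p(y \mid x) \;=\; \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}
\]
```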

Cross-validation
The generalization ability of a classifier can be estimated using a resampling procedure known as cross-validation. One example is 2-fold cross-validation.
[Figure: examples 1 to 100 are divided into training and test examples in each of 2 folds, followed by performance evaluation.]
Uses: model selection, performance evaluation
Performance measures: balanced accuracy, F1 score
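A minimal sketch of k-fold cross-validation with balanced accuracy, using 2 folds and 100 examples as on the slide. The data are synthetic, and the nearest-centroid classifier is only a stand-in chosen to keep the example self-contained; it is not the classifier used in the lecture.

```python
import numpy as np

def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean pattern per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    classes, centroids = model
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dists, axis=1)]

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls."""
    return float(np.mean([np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]))

def k_fold_cv(X, y, k, rng):
    """Balanced accuracy on each of k held-out folds."""
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit_centroids(X[train_idx], y[train_idx])
        scores.append(balanced_accuracy(y[test_idx], predict(model, X[test_idx])))
    return scores

rng = np.random.default_rng(3)
y = np.repeat([0, 1], 50)                           # 100 examples, two classes
X = rng.normal(size=(100, 10)) + 0.8 * y[:, None]   # hypothetical class shift
scores = k_fold_cv(X, y, k=2, rng=rng)
print("balanced accuracy per fold:", np.round(scores, 2))
```

The classifier never sees the test fold during fitting, which is what makes the resulting score an estimate of generalization.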

Cross-validation
Another commonly used variant is leave-one-out cross-validation.
[Figure: with 100 examples there are 100 folds; in each fold a single example is held out for testing and the rest are used for training, followed by performance evaluation.]
In fMRI, leave-one-run-out cross-validation is often used.
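Leave-one-run-out splitting can be sketched as follows. The function name `leave_one_run_out` and the session layout (5 runs of 4 trials) are hypothetical. Holding out whole runs keeps trials from the same run, which tend to share slow noise, out of both the training and the test set at once.

```python
import numpy as np

def leave_one_run_out(runs):
    """Yield (held_out_run, train_idx, test_idx), holding out each run once."""
    for run in np.unique(runs):
        test_mask = runs == run
        yield run, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Hypothetical session: 5 runs with 4 trials each.
runs = np.repeat(np.arange(5), 4)
splits = list(leave_one_run_out(runs))

for run, train_idx, test_idx in splits:
    print(f"run {run}: {len(train_idx)} training trials, {len(test_idx)} test trials")
```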

Performance: single subject
Binomial test: $p = P(X \geq k \mid H_0) = 1 - B(k \mid n, \pi_0)$
[Figure annotation: k = 30]
Caution: cross-validated data are not necessarily binomially distributed. Permutation tests are better.
Brodersen et al. 2013, NeuroImage
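The tail probability in the binomial test can be computed directly from the binomial probability mass function. The sketch assumes that k is the number of correctly classified trials, n the number of test trials and pi0 the chance level. The value n = 40 is a hypothetical choice, since the slide only gives k = 30. The function sums the inclusive upper tail, that is, all outcomes from k to n.

```python
from math import comb

def binomial_p_value(k, n, pi0=0.5):
    """P(X >= k | H0) for X ~ Binomial(n, pi0): inclusive upper tail."""
    return sum(comb(n, i) * pi0 ** i * (1.0 - pi0) ** (n - i) for i in range(k, n + 1))

# Hypothetical example: 30 of 40 test trials classified correctly, chance level 0.5.
p_value = binomial_p_value(30, 40, 0.5)
print(f"p = {p_value:.5f}")
```

As the slide cautions, this test treats the trials as independent Bernoulli draws, which cross-validated predictions need not be.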

Performance: multiple subjects
- Random effects
- Fixed effects
http://www.translationalneuromodeling.org/tapas/
Brodersen et al. 2013, NeuroImage

Confounds: GLM vs. MVPA
Todd et al. 2013, NeuroImage

Second-level t-tests for accuracies?
True β-values are normally distributed. True accuracies are not normally distributed and are truncated at chance.
A possible solution is given by Allefeld et al., NeuroImage, 2016.

Statistical testing with classification
Within subjects:
- Use permutation statistics.
- Parametric tests, e.g. the binomial test or t-test, are not valid because their assumptions are not met (cf. Schreiber and Krekelberg, 2013).
Across subjects:
- The assumptions for t-tests are not met.
- Full Bayesian model (Brodersen et al. 2013), but its assumptions are not met for cross-validation.
- Use the prevalence statistic proposed in Allefeld et al., 2016.
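A within-subject permutation test can be sketched as follows: the whole cross-validation is repeated with shuffled labels to build a null distribution of accuracies. The design choices here are assumptions rather than the procedure from the cited papers: the data are synthetic, the classifier is a nearest-centroid stand-in, labels are shuffled within runs, and 200 permutations are used only to keep the example fast.

```python
import numpy as np

def cv_accuracy(X, y, runs):
    """Leave-one-run-out accuracy of a nearest-centroid classifier (labels 0/1)."""
    correct = 0
    for run in np.unique(runs):
        test = runs == run
        c0 = X[~test & (y == 0)].mean(axis=0)
        c1 = X[~test & (y == 1)].mean(axis=0)
        d0 = ((X[test] - c0) ** 2).sum(axis=1)
        d1 = ((X[test] - c1) ** 2).sum(axis=1)
        correct += np.sum((d1 < d0).astype(int) == y[test])
    return correct / len(y)

def permutation_p_value(X, y, runs, n_perm, rng):
    """Compare the observed accuracy with accuracies obtained under shuffled labels."""
    observed = cv_accuracy(X, y, runs)
    null = np.empty(n_perm)
    for p in range(n_perm):
        y_perm = y.copy()
        for run in np.unique(runs):            # shuffle labels within each run
            idx = np.flatnonzero(runs == run)
            y_perm[idx] = y[rng.permutation(idx)]
        null[p] = cv_accuracy(X, y_perm, runs)
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (1 + np.sum(null >= observed)) / (1 + n_perm)

rng = np.random.default_rng(4)
runs = np.repeat(np.arange(5), 8)              # 5 runs, 8 trials each (hypothetical)
y = np.tile([0, 1], 20)
X = rng.normal(size=(40, 10)) + 1.0 * y[:, None]
observed, p_value = permutation_p_value(X, y, runs, n_perm=200, rng=rng)
print(f"accuracy = {observed:.2f}, permutation p = {p_value:.3f}")
```

Because the null distribution is generated by the same cross-validation scheme, dependencies introduced by that scheme affect the observed and the permuted accuracies alike.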

Research questions for classification
- Overall classification accuracy: left or right button? Truth or lie? Healthy or ill? [Figure: accuracy between 50 % and 100 % per classification task.]
- Spatial deployment of discriminative regions [Figure: map with example accuracies of 55 % and 80 %.]
- Temporal evolution of discriminability [Figure: accuracy between 50 % and 100 % over within-trial time; annotations: "Accuracy rises above chance", "Participant indicates decision".]
- Model-based classification: { group 1, group 2 }
Pereira et al. (2009) NeuroImage; Brodersen et al. (2009) The New Collection

Decoding «hidden» intentions: searchlight approach
Haynes et al., Current Biology, 2007

Decoding of free decisions
Decoding of finger presses (red line). Participants freely choose timing and hand.
The earliest information about left vs. right is available long before execution. Free will?
Soon et al., Nat Neurosci, 2008

Decoding task preparation: connectivity-based decoding
- SV classifier on the connectivity graph (correlation)
- Discriminative maps
Heinzle et al., J Neurosci, 2012

Unsupervised learning
Building a representation of the data:
- Dimensionality reduction
- Clustering: k-means, mixture models
- Time series

K-means clustering
[Figure: cost function.]
Algorithm:
1. Initialize
2. Estimate assignments
3. Estimate cluster centroids
4. Repeat 2 and 3 until convergence
Bishop PRML (2006)
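The four steps can be sketched in a few lines of NumPy. The sketch assumes the cost function is the usual sum of squared distances between each data point and its assigned centroid (the distortion measure in Bishop, 2006). The random initialization from data points and the synthetic two-blob data are illustrative choices.

```python
import numpy as np

def k_means(X, k, rng, max_iter=100):
    """Plain k-means; returns (centroids, assignments, cost)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]     # 1. initialize
    assignments = np.full(len(X), -1)
    for _ in range(max_iter):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_assignments = np.argmin(dists, axis=1)                # 2. estimate assignments
        if np.array_equal(new_assignments, assignments):          # 4. stop at convergence
            break
        assignments = new_assignments
        for j in range(k):                                        # 3. estimate centroids
            if np.any(assignments == j):
                centroids[j] = X[assignments == j].mean(axis=0)
    cost = float(((X - centroids[assignments]) ** 2).sum())
    return centroids, assignments, cost

rng = np.random.default_rng(5)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))   # hypothetical cluster around (0, 0)
blob_b = rng.normal(loc=6.0, scale=0.5, size=(50, 2))   # hypothetical cluster around (6, 6)
X = np.vstack([blob_a, blob_b])

centroids, assignments, cost = k_means(X, k=2, rng=rng)
total_ss = float(((X - X.mean(axis=0)) ** 2).sum())
print(f"cost = {cost:.1f} (sum of squares around the global mean: {total_ss:.1f})")
```

Each of the two update steps can only lower or keep the cost, which is why the loop terminates. The result can still depend on the initialization.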

Clustering: mixture of Gaussians
Bishop PRML (2006)
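For reference, the standard form of the mixture-of-Gaussians density is given below. The notation, with K components, mixing coefficients pi_k, means mu_k and covariances Sigma_k, is an assumption, since the slide's own formulas did not survive in the transcript.

```latex
\[
  p(x) \;=\; \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
  \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
\]
```

Unlike k-means, which assigns each point to exactly one cluster, this model yields a posterior probability of cluster membership for every point.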

Interpretation
- Cluster parameters (cluster 1, cluster 2)
- Internal criterion: model evidence
- External criterion: purity (inferred labels compared with external labels across subjects)
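Purity as an external criterion can be computed as sketched below, assuming the common definition: each cluster is credited with its most frequent external label, and purity is the fraction of subjects covered by those majority labels. The example labels are hypothetical.

```python
import numpy as np

def purity(inferred, external):
    """Fraction of items that carry the majority external label of their cluster."""
    inferred = np.asarray(inferred)
    external = np.asarray(external)
    total = 0
    for cluster in np.unique(inferred):
        _, counts = np.unique(external[inferred == cluster], return_counts=True)
        total += counts.max()
    return total / len(external)

# Hypothetical example: 8 subjects, 2 inferred clusters, 2 external diagnostic labels.
inferred_labels = [0, 0, 0, 1, 1, 1, 1, 0]
external_labels = ["patient", "patient", "control", "control",
                   "control", "control", "patient", "patient"]
score = purity(inferred_labels, external_labels)
print(f"purity = {score:.2f}")
```

Purity increases trivially with the number of clusters, so it is most informative when the number of clusters is fixed or chosen by an internal criterion.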

Overview
- Motivation
- Modelling
- Learning from Data
- Multivariate Bayes in SPM
- Generative Embedding

Encoding vs. decoding models

Coding hypotheses
- Sparse vectors
- Spatial vectors
- Smooth vectors
- Distributed vectors
- Singular vectors of data: $U D V^T = R Y^T$
- Support vectors: $U = R Y^T$

Coding hypotheses
Friston et al. 2008 NeuroImage

Solved with variational Bayes
Friston et al. 2008 NeuroImage

Example: decoding of motion
Attention to motion dataset (Büchel & Friston 1999, Cerebral Cortex)
Experimental factors:
1. Photic
2. Motion
3. Attention
Friston et al. 2008 NeuroImage

Friston et al. 2008 NeuroImage

Results
Friston et al. 2008 NeuroImage

Multivariate Bayes in SPM
[Figure: SPM results for the motion contrast(s). SPMmip at [-36, -87, -3]; SPM{T_338}; SPMresults: .\SPM-practical\attention\GLM; height threshold T = 4.874226 {p < 0.05 (FWE)}.]
[Figure: MVB_Motion (Motion), prior: sparse. Panels: log-evidence over partitions (maximum p = 100.00%); distribution of weights (frequency); PPM; target and prediction (adjusted response over scans); predicted contrast, SNR (variance) 0.64.]
506 observed voxels and 360 scans.

Posterior probabilities, voxel weights at maxima:
p(|w| > 0)   location (x, y, z) in mm      weight (w)
0.993        -39.0, -90.0,  -3.0            0.0254
0.983        -33.0, -99.0,  -3.0           -0.0216
0.983        -30.0, -99.0,   3.0            0.0211
0.982        -42.0, -90.0,   9.0            0.0201
0.980        -45.0, -75.0,  -3.0            0.0168
0.979        -30.0, -84.0,   6.0           -0.0187
0.977        -39.0, -87.0,   3.0           -0.0196
0.973        -30.0, -84.0,  -6.0           -0.0204
0.972        -39.0, -81.0, -15.0            0.0166
0.946        -36.0, -84.0,  12.0           -0.0144
0.933        -48.0, -84.0,  -3.0           -0.0119
0.929        -39.0, -75.0,   3.0           -0.0160

Laminar activity related to novelty and episodic memory
Maass et al. 2014 Nature Communications

Overview
- Motivation
- Modelling Principles
- Learning from Data
- Multivariate Bayes in SPM
- Generative Embedding

Classifying groups of subjects
- Candidate features per subject (subject 1, subject 2, ..., subject N): voxel activity, connectivity, dynamic causal model (DCM)
- Challenges: high dimensionality, unusual cluster distributions, lack of interpretation
- Classification or clustering assigns subjects to group 1 or group 2

Generative embedding
Brodersen et al., PLoS Computational Biology, 2011

DCM for speech processing

Working memory in schizophrenia
- 41 schizophrenia patients (DSM-IV, ICD-10), 42 controls
- Visual numeric n-back working memory task
[Figure: example digit sequence of the n-back task with timing labels of 500 ms and 900 ms.]
Deserno et al. (2012) The Journal of Neuroscience

Model-based clustering
Brodersen et al. 2014 NeuroImage

Results: healthy controls vs. schizophrenia patients
Brodersen et al. 2014 NeuroImage

Within-patient clustering
Brodersen et al. 2014 NeuroImage

Be aware
Interpretation of decoding or classification results is difficult. The decoded information must be in the data, but exactly which features carry it is often hard to determine.

Summary
- Modelling Principles
- Learning from Data
- Multivariate Bayes in SPM
- Generative Embedding

Acknowledgments
Many thanks to K.E. Stephan, Sudhir S. Raman and K. Brodersen for sharing their teaching material.