Stopping rules for sequential trials in high-dimensional data

Sonja Zehetmayer, Alexandra Graf, and Martin Posch
Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna
Supported by - Funds Nr. T401 and P23167

Hunting for significance inflates the probability of a false positive result ($\alpha = 0.05$)
[Figure: probability of a false positive result]


Conclusion I
Testing a single hypothesis repeatedly at several interim analyses at level $\alpha$ ("hunting for significance") increases the probability of a false positive result.
Solution: group sequential tests, which adjust $\alpha$.
What about very many hypotheses?
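The inflation described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not on the slides: equally sized stages, a two-sided z-test with known variance, and an unadjusted critical value of 1.96 at every look. The function name `false_positive_rate` and its parameters are hypothetical.

```python
import math
import random

def false_positive_rate(n_looks, n_sim=20000, crit=1.96, seed=1):
    """Estimate P(at least one |Z_k| > crit) under H0 with n_looks unadjusted looks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        cum = 0.0
        for k in range(1, n_looks + 1):
            cum += rng.gauss(0.0, 1.0)   # standardized score of stage k under H0
            z = cum / math.sqrt(k)       # z-statistic of the pooled data after k stages
            if abs(z) > crit:
                hits += 1
                break                    # stop at the first "significant" look
    return hits / n_sim

# Print the estimated false positive rate for several numbers of looks.
for looks in (1, 2, 5, 10):
    print(looks, round(false_positive_rate(looks), 3))
```

With a single look the estimate should be close to the nominal 0.05, and it should grow as more unadjusted looks are added.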

Many hypotheses
m hypotheses (genes), e.g., a microarray study
$H_{0i}: \mu_i = 0$ versus $H_{1i}: \mu_i \neq 0$, $i = 1, \dots, m$

The False Discovery Rate (FDR) (Benjamini and Hochberg, 1995)
$FDR = E\left(\frac{V}{\max\{R, 1\}}\right)$
V: number of erroneously rejected null hypotheses
R: number of rejected null hypotheses
The FDR of the experiment is controlled with the procedure of Benjamini and Hochberg (1995):
Order the individual p-values: $p_{(1)} \le \dots \le p_{(m)}$
$d = \max\{i : p_{(i)} \le i\alpha/m\}$
Reject all hypotheses with p-values $p_{(1)}, \dots, p_{(d)}$
This procedure controls the FDR conservatively if the test statistics are independent or positively dependent (Benjamini and Yekutieli, 2001).
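The step-up rule above can be sketched in a few lines of plain Python. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions, not part of the original slides.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return (d, rejected): d is the number of rejections, rejected the sorted indices."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    d = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            d = rank                                    # largest rank meeting the bound so far
    return d, sorted(order[:d])

# Made-up p-values: only the third smallest meets its bound, yet all three are rejected.
p = [0.02, 0.5, 0.035, 0.03]
d, rejected = benjamini_hochberg(p, alpha=0.05)
print(d, rejected)  # 3 [0, 2, 3]
```

The example shows the step-up character of the rule: the two smallest p-values exceed their own bounds but are still rejected because the third one satisfies its bound.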

[Figure: scheme of the sequential design; 1 spot corresponds to 1 hypothesis]
Samples 1 to 5: analysis controlling the FDR at level $\alpha$. Either stop the experiment (reject all significant hypotheses, retain all others) or continue.
Samples 6 to 10: analysis controlling the FDR at level $\alpha$ for the pooled data. Again, either stop (reject, retain) or continue.
Samples 11, 12, and so on: proceed in the same way.
At the end, only the significant hypotheses from the final stage can be rejected!

What is the effect of unadjusted repeated analyses on the FDR?
It depends on the number of true null hypotheses $m_0$:
In case of $m_0/m < 1$: for $m \to \infty$, the FDR is controlled asymptotically regardless of the stopping stage (under suitable assumptions).
In case of $m_0/m = 1$ (global $H_0$): a constraint on the stopping rule has to be imposed: stop early only if at least a certain number $s(m)$ of hypotheses can be rejected. Then early stopping hardly occurs, and the FDR is controlled asymptotically (Posch, Zehetmayer, Bauer, 2009).

Stopping the experiment
Stopping for futility: futility boundary $\alpha_1 > \alpha$
Early rejection, based on one of the following criteria (and at least $s(m)$ hypotheses can be rejected):
Proportion of rejected $H_0$
$\Delta$ Proportion of rejected $H_0$
False Negative Rate
$\Delta$ False Negative Rate
False Non Discovery Rate
Concordance

Stop as soon as the FNR is < 20% (e.g., Zehetmayer & Posch, 2010)
Multiple type II error: the expected proportion of not-rejected true alternative hypotheses among all true alternative hypotheses
$FNR = E\left(1 - \frac{R - V}{m - m_0}\right)$
R: number of rejections
V: number of false rejections
m: number of hypotheses
$m_0$: number of true null hypotheses

In each stage k the FNR is estimated from the data
$\gamma_k$: critical value from the FDR-controlling procedure
The p-values corresponding to the true null hypotheses are uniformly distributed, so that
$FNR_k = E\left(1 - \frac{R_k - V_k}{m - m_0}\right) = 1 - \frac{E(R_k) - m_0 \gamma_k}{m - m_0}$
$\hat{m}_{0k}$: estimator for $m_0$
$R_k(\gamma_k) = \#\{p_{ik} < \gamma_k\}$
$\widehat{FNR}_k = 1 - \frac{R_k(\gamma_k) - \hat{m}_{0k} \gamma_k}{m - \hat{m}_{0k}}$
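The plug-in estimate of the FNR can be sketched as follows. The slides do not say which estimator of $m_0$ is used, so the Storey-type estimator with $\lambda = 0.5$ below is an assumption, as are the function names, the clipping to [0, 1], and the example p-values.

```python
def storey_m0(pvalues, lam=0.5):
    """Storey-type estimate of the number of true nulls (assumed choice, lam is a tuning value)."""
    m = len(pvalues)
    return min(m, sum(1 for p in pvalues if p > lam) / (1.0 - lam))

def estimate_fnr(pvalues, gamma, m0_hat):
    """Plug-in FNR estimate: 1 - (R(gamma) - m0_hat * gamma) / (m - m0_hat)."""
    m = len(pvalues)
    if m0_hat >= m:
        return 1.0  # convention chosen here: no estimated alternatives, so never stop
    r = sum(1 for p in pvalues if p < gamma)     # R_k(gamma_k) = #{p_ik < gamma_k}
    fnr = 1.0 - (r - m0_hat * gamma) / (m - m0_hat)
    return min(1.0, max(0.0, fnr))               # clip to the range of a proportion

# Made-up stage-k p-values for m = 10 hypotheses and an assumed critical value gamma.
p_stage = [0.001, 0.001, 0.001, 0.04, 0.1, 0.3, 0.45, 0.6, 0.8, 0.9]
m0_hat = storey_m0(p_stage)                      # 3 p-values above 0.5, so 3 / 0.5 = 6
print(m0_hat, estimate_fnr(p_stage, gamma=0.01, m0_hat=m0_hat))
```

In this toy example three hypotheses are rejected and four alternatives are estimated, so the estimated FNR is roughly one quarter.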

Stop as soon as $\Delta FNR$ < 0.05
$\Delta FNR$ is based on the increment of the stagewise FNR:
$\Delta FNR_k = |FNR_k - FNR_{k-1}|$ with $FNR_0 = 1$.
In each stage $\Delta FNR$ is estimated as described before:
$\widehat{\Delta FNR}_k = |\widehat{FNR}_k - \widehat{FNR}_{k-1}|$
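A minimal sketch of this increment rule, assuming the increment is read as the absolute change of the estimated FNR between consecutive stages. The function name `first_stop_stage` and the sequence of FNR estimates are made up for illustration.

```python
def first_stop_stage(fnr_estimates, limit=0.05):
    """Return the first stage k (1-based) with |FNR_k - FNR_{k-1}| < limit, else None."""
    previous = 1.0                          # FNR_0 = 1 by definition
    for k, fnr in enumerate(fnr_estimates, start=1):
        if abs(fnr - previous) < limit:     # assumed reading: absolute change between stages
            return k
        previous = fnr
    return None

# Made-up stagewise FNR estimates: the changes are 0.2, 0.3, 0.2, 0.03, 0.01.
print(first_stop_stage([0.8, 0.5, 0.3, 0.27, 0.26]))  # 4
```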

Stop as soon as the concordance of the rejected hypotheses from stage to stage is > 0.9
Concordance (CO) measures the proportion of significant genes in stage k which were also significant in stage k-1:
$CO_k = \frac{\sum_{i=1}^{m} I_{ik} I_{i,k-1}}{\sum_{i=1}^{m} I_{ik}}$
where $I_{ik} = 1$ if hypothesis i was significant in stage k and 0 otherwise, with $CO_1 = 0$.
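The concordance can be computed directly from the sets of rejected hypotheses of two consecutive stages. The sketch below uses a hypothetical function `concordance` and made-up rejection sets; returning 0 when nothing is rejected in the current stage is a convention assumed here.

```python
def concordance(rejected_now, rejected_before):
    """Proportion of hypotheses rejected in the current stage that were also rejected before."""
    now, before = set(rejected_now), set(rejected_before)
    if not now:
        return 0.0  # assumed convention when the current stage rejects nothing
    return len(now & before) / len(now)

# Made-up rejection sets (indices of significant hypotheses) for three stages.
stages = [{1, 2}, {1, 2, 3, 4}, {1, 2, 3, 4, 5}]
co = [0.0]  # CO_1 = 0 by definition
for k in range(1, len(stages)):
    co.append(concordance(stages[k], stages[k - 1]))
print(co)  # [0.0, 0.5, 0.8]
```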

Example: $m_0/m = 0.9$, $\mu/\sigma = 0.5$
[Figure: true FNR for different sample sizes (theoretical curve)]
[Figure: true $\Delta FNR$ for different sample sizes (theoretical curve)]
[Figure: true CO for different sample sizes (theoretical curve)]

Simulation study (50000 runs)
The setting:
m = 5000 / 50000
$m_0/m = 0.9$, $\mu/\sigma = 0.5$
10 stages with stage-wise sample sizes of 5
z-tests, $\alpha = 0.05$
Stopping rules: FNR < 0.2, $\Delta FNR$ < 0.05, CO > 0.9, s(m) > 9
Results:
Independent data: the FDR is controlled at level $\alpha = 0.05$ for the 3 considered stopping criteria.
Equi-correlated data ($\rho = 0.5$): the FDR is controlled at level $\alpha = 0.05$ for the 3 considered stopping criteria.
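A heavily reduced version of such a simulation might look like the sketch below. It is not the authors' code: it assumes independent data, known $\sigma = 1$, two-sided z-tests on the pooled data, only the concordance rule (CO > 0.9 together with at least 10 rejections), and much smaller values of m and of the number of runs so that it finishes quickly. All function and parameter names are hypothetical.

```python
import math
import random

def bh_reject(pvalues, alpha):
    """Set of indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    d = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            d = rank
    return set(order[:d])

def run_trial(rng, m=500, m0_frac=0.9, effect=0.5, n_stage=5, stages=10,
              alpha=0.05, co_limit=0.9, s_min=10):
    """Run one sequential trial; return (stopping stage, false discovery proportion)."""
    m0 = int(m0_frac * m)
    mu = [0.0] * m0 + [effect] * (m - m0)        # the first m0 hypotheses are true nulls
    sums = [0.0] * m                             # cumulative sums of the observations
    prev, rejected = set(), set()
    for k in range(1, stages + 1):
        pvals = []
        for i in range(m):
            # Sum of n_stage new observations ~ N(n_stage * mu_i, n_stage), sigma = 1.
            sums[i] += rng.gauss(n_stage * mu[i], math.sqrt(n_stage))
            z = sums[i] / math.sqrt(k * n_stage)           # z-statistic of the pooled data
            pvals.append(math.erfc(abs(z) / math.sqrt(2)))  # two-sided p-value
        rejected = bh_reject(pvals, alpha)
        co = len(rejected & prev) / len(rejected) if rejected else 0.0  # CO_1 = 0: prev is empty
        if co > co_limit and len(rejected) >= s_min:
            break                                # stop early, reject the current set
        prev = rejected
    false_rejections = sum(1 for i in rejected if i < m0)
    return k, false_rejections / max(len(rejected), 1)

rng = random.Random(2024)
results = [run_trial(rng) for _ in range(100)]
fdr = sum(fdp for _, fdp in results) / len(results)
mean_stage = sum(k for k, _ in results) / len(results)
print(f"estimated FDR: {fdr:.3f}, mean stopping stage: {mean_stage:.1f}")
```

Averaging the false discovery proportion at the stopping stage over the runs gives a Monte Carlo estimate of the FDR of the sequential procedure; with so few runs and such a small m it is only indicative.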

[Figures: simulation results, with one panel for independent data and one for equi-correlated data]

The Family Wise Error Rate (FWER)
Replace the BH procedure by the Bonferroni test.
If no multiplicity adjustment for the repeated looks is applied, the FWER may be inflated (Armitage et al., 1969).
If stopping rules are applied that are asymptotically deterministic, the sequential procedure controls the FWER.
Reason: the sequential procedure degenerates to a fixed sample size procedure.
For the considered stopping rules and scenarios the FWER is controlled at level $\alpha = 0.05$.
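For comparison with the BH procedure, the Bonferroni test rejects every hypothesis whose p-value is at most $\alpha/m$. A minimal sketch, with a hypothetical function name and made-up p-values:

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Indices of hypotheses with p-value <= alpha / m."""
    m = len(pvalues)
    return [i for i, p in enumerate(pvalues) if p <= alpha / m]

# With m = 4 the per-hypothesis level is 0.05 / 4 = 0.0125.
print(bonferroni_reject([0.001, 0.02, 0.012, 0.5]))  # [0, 2]
```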

Outlook
Muralidharan (2010) considered an empirical Bayes mixture method for effect size estimation (mean values and standard deviations).
We try to apply the estimated values for a power estimation: Power(reject effect sizes > $\Delta$)

Discussion
Is it necessary to adjust for the number of looks? If the number of hypotheses is very large, multiple analyses hardly inflate the error rate.
Is this the solution to the sequential problem? There are limitations:
The result applies only for large m.
The convergence rate depends on $m_0/m$ and the alternative.
Appropriate stopping rules are needed: increment rules seem to work better; however, the performance depends on the stage-wise sample size.

Selected References
Armitage P, McPherson CK, Rowe BC (1969) J R Stat Soc Ser B.
Benjamini Y, Hochberg Y (1995) J R Stat Soc Ser B.
Marot G, Mayer CD (2009) SAGMB.
Muralidharan (2010) Annals of Applied Statistics.
Pawitan et al. (2005) Bioinformatics.
Posch M, Zehetmayer S, Bauer P (2009) JASA.
Storey JD, Taylor JE, Siegmund D (2004) J R Stat Soc Ser B.
Zehetmayer S, Posch M (2010) Bioinformatics.