Module 7: Hypothesis Testing I Statistics (OA3102)


Module 7: Hypothesis Testing I
Statistics (OA3102)
Professor Ron Fricker, Naval Postgraduate School, Monterey, California
Reading assignment: WM&S chapter 10.1-10.5
Revision: 2-12

Goals for this Module
Introduction to testing hypotheses: steps in conducting a hypothesis test, types of errors, and lots of terminology.
Large-sample tests for means and proportions (aka z-tests).
Type II error probability calculations.
Sample size calculations.

A Simple Example
You hypothesize that a coin is fair. To test this, take the coin and start flipping it. If it is fair, you expect about half of the flips to be heads and half tails. After a large number of flips, if you see either a large fraction of heads or a large fraction of tails, you tend to disbelieve your hypothesis. This is an informal hypothesis test! But how do we decide when the fraction is too large?

Is the Coin Fair?
Back to the in-class exercise from Module 1: flip a coin 10 times and count the number of heads. Assuming the coin is fair (and the flips are independent), the number of heads has a binomial distribution with $n = 10$ and $p = 0.5$. So, if our assumption is true, the number of heads follows a Binomial(10, 0.5) distribution, which is centered at 5. And if our assumption is not true, what do we expect to see? Either too few or too many heads.

Setting Up a Test for the Coin
Idea: make a rule, based on the number of heads observed, from which we will conclude either that our assumption ($p = 0.5$) is true or that it is false. Here's one rule: assume $p = 0.5$ if $3 \le x \le 7$; otherwise, conclude $p \ne 0.5$.
If $p = 0.5$, what's our chance of making a mistake? About 11%.
If $p = 0.8$, what's our chance of detecting the biased coin? About 68%.
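The two percentages quoted for the coin rule can be checked directly from the binomial distribution. The following is a minimal Python sketch, assuming the rule keeps the null for 3 to 7 heads inclusive; the helper names `binom_pmf` and `prob_reject` are illustrative and do not come from the slides.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_reject(n: int, p: float, lo: int, hi: int) -> float:
    """Probability that the number of heads falls outside [lo, hi]."""
    return 1.0 - sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

# Chance of wrongly rejecting a fair coin (a Type I error).
alpha = prob_reject(10, 0.5, 3, 7)
# Chance of correctly rejecting when the coin really has p = 0.8 (the power).
power = prob_reject(10, 0.8, 3, 7)
print(f"alpha = {alpha:.3f}, power = {power:.3f}")
```

Under these assumptions the first value is about 0.11 and the second about 0.68, matching the figures on the slide.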

CIs vs. Hypothesis Testing
Previously we would have answered this question with a confidence interval. Say we observed $\hat{p} = y/n = 0.7$: we're 95% confident that the interval [0.5, 0.9] covers the true $p$.
Now we look to answer the question using a hypothesis test. If the true probability of heads is $p = 0.5$ (i.e., the coin is fair), how unlikely would it be to see 7 heads out of ten flips? If we see an outcome inconsistent with our hypothesis (the coin is fair), then we reject it.

Why Do Hypothesis Tests?
Confidence intervals provide more information than hypothesis tests, but often we need to test a particular theory, e.g., "The mod decreases the mean down-time."
Sometimes confidence intervals are hard or impossible to construct: when the theory you're testing has several dimensions (e.g., regression slope and intercept), or when intervals don't make sense (e.g., are two categorical variables independent?).

Elements of a Statistical Test
Null hypothesis: denoted $H_0$. We believe the null unless/until it is proven false.
Alternative hypothesis: denoted $H_a$. It's what we would like to prove.
Test statistic: the empirical evidence from the data.
Rejection region: if the test statistic falls in the rejection region, we say "reject the null hypothesis" or "we've proven the alternative." Otherwise, we "fail to reject the null."

Errors in Hypothesis Testing
True situation    Conclusion: don't reject $H_0$    Conclusion: reject $H_0$
$H_0$ true        no error                          Type I error
$H_a$ true        Type II error                     no error

Probability of a Type I Error
$\alpha$ = Pr(Type I error) = Pr($H_0$ is rejected when it is true).
It is also called the significance level or the level of the test. The experimenter chooses it prior to the test. Conventions are $\alpha$ = 0.10, 0.05, and 0.01, but it can be increased or decreased depending on the particular problem.
The choice of $\alpha$ defines the rejection region (or the size of the "p-value") that results in rejection of the null.

Probability of a Type II Error
$\beta$ = Pr(Type II error) = Pr(not rejecting $H_0$ when it is false).
It is a function of the actual alternative distribution and the sample size. The quantity $1 - \beta$ is called the power of a test: it's Pr(rejecting $H_0$ when it is false).
The ideal is both small $\alpha$ and small $\beta$ (i.e., high power), but for a fixed sample size they trade off. By convention, we control $\alpha$ by choice and $\beta$ with sample size (bigger sample, more power).

Choosing the Null and Alternative Based on the Severity of Error
In hypothesis testing, we get to control the Type I error. So, if one error is more severe than the other, set the test up so that it's the Type I error.
E.g., in the US, we consider sending an innocent person to jail more serious than letting a guilty person go free. Hence, the burden of proof of guilt is placed on the prosecution at trial ("innocent until proven guilty"). I.e., the null hypothesis is that a person is innocent, and the trial process controls the chance of sending an innocent person to jail.

Example 10.1
Consider a political poll of $n = 15$ people. We want to test $H_0: p = 0.5$ vs. $H_a: p < 0.5$, where $p$ is the proportion of the population favoring a candidate. The test statistic is $Y$, the number of people in the sample favoring the candidate. If $RR = \{y \le 2\}$, find $\alpha$.
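A short Python sketch of the calculation for Example 10.1, assuming the rejection region contains the outcomes y = 0, 1, 2. The helper `binom_cdf` is an illustrative name, not something from the slides or the textbook.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

# alpha = P(Y in RR | H0 true) = P(Y <= 2 | n = 15, p = 0.5)
alpha = binom_cdf(2, 15, 0.5)
print(f"alpha = {alpha:.4f}")
```

With these inputs the level of the test comes out at roughly 0.004, so this rejection region makes a Type I error very unlikely.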

Example 10.2
Continuing with the previous problem, if $p = 0.3$, what is the probability of a Type II error ($\beta$)? I.e., what's the probability of concluding that the candidate will win?

Example 10.3
Still continuing with the previous problem, if $p = 0.1$, then what is the probability of a Type II error ($\beta$)?
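The Type II error probabilities asked for in Examples 10.2 and 10.3 can be sketched the same way. This assumes the rejection region from Example 10.1 contains y = 0, 1, 2, so a Type II error occurs whenever Y is 3 or more; `binom_cdf` is again a hypothetical helper.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

n, cutoff = 15, 2  # reject H0 when Y <= cutoff

# beta = P(fail to reject H0 | true p) = P(Y > cutoff | true p)
beta_p03 = 1.0 - binom_cdf(cutoff, n, 0.3)  # Example 10.2
beta_p01 = 1.0 - binom_cdf(cutoff, n, 0.1)  # Example 10.3
print(f"beta(p=0.3) = {beta_p03:.3f}, beta(p=0.1) = {beta_p01:.3f}")
```

Under these assumptions the test almost always misses the alternative p = 0.3 (beta near 0.87) and does much better when the truth is further from the null (beta near 0.18 at p = 0.1).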

Example 10.4
Still continuing the problem, if $RR = \{y \le 5\}$: what is the level of the test ($\alpha$)? If $p = 0.3$, what is $\beta$?
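For Example 10.4, a sketch of both quantities, assuming the enlarged rejection region contains y = 0 through 5. The helper name is illustrative only.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

n, cutoff = 15, 5  # reject H0 when Y <= cutoff

alpha = binom_cdf(cutoff, n, 0.5)       # P(reject | p = 0.5)
beta = 1.0 - binom_cdf(cutoff, n, 0.3)  # P(fail to reject | p = 0.3)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

Compared with the smaller rejection region, alpha rises to about 0.15 while beta at p = 0.3 falls to about 0.28, which illustrates the trade-off between the two error probabilities at a fixed sample size.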

Terminology
Null and alternative hypotheses: we will believe the null until it is proven false.
Acceptance vs. rejection region: the null is proven false if the test statistic falls in the rejection region.
Type I vs. Type II error. Type I: rejecting the null hypothesis when it is really true. Type II: accepting (failing to reject) the null hypothesis when it is really false.
Significance level or level of the test ($\alpha$): the probability of a Type I error.
Pr(Type II error) = $\beta$, and $1 - \beta$ is called the power. It's a function of the actual alternative distribution.

Another Example
As a program manager, you want to decrease the mean down-time $\mu$ for a type of equipment. The manufacturer suggests a modification. The goal is to see whether the mod actually does decrease $\mu$.
Implement the modification on a sample of equipment ($n = 25$) and measure the down-time (say, per month). Currently, equipment down-time is 75 mins/month, and the standard deviation is $\sigma = 9$ mins.

Intuition Behind the Test
1. We will assume the status quo is true. In this case, that means there is no change in the mean down-time...
2. ...unless there is sufficient evidence to contradict it. In this case, that means we observe a much smaller sample mean than we would expect to see assuming $\mu = 75$.
[Figure: sampling distribution of $\bar{X}$, centered at $\mu = 75$ with standard error $\sigma/\sqrt{n} = 9/5$.]

Steps in Hypothesis Testing
1. Identify the parameter of interest
2. State null and alternative hypotheses
3. Determine form of test statistic
4. Calculate rejection region
5. Calculate test statistic
6. Determine test outcome by comparing test statistic to rejection region

1. Identify Parameter of Interest
In hypothesis tests, we are testing the parameter of a distribution: e.g., $\mu$ or $\sigma$ for a normal distribution, $p$ for a binomial distribution, or $\alpha$ and $\beta$ for a gamma distribution. So, the first step is to identify the parameter of interest.
Often we'll be testing the mean $\mu$ of a normal, since the CLT often applies to the sample mean. E.g., in the equipment down-time example, we're interested in testing the mean down-time. For the coin and election examples, we're testing $p$.

2. State Null and Alternative Hypotheses
$H_0$: The null hypothesis is a specific theory about the population that we wish to disprove. We will believe the null until it is disproved. Example: mean down-time is equal to 75. In notation, $H_0: \mu = 75$.
$H_a$: The alternative hypothesis is what we want to prove. It is what we will believe if the null is rejected. Example: mean down-time is less than 75. In notation, $H_a: \mu < 75$.

Null and Alternative Hypotheses are Fundamentally Different
The null hypothesis is what you have assumed. Generally, it's the status quo or the less desirable test outcome. Failing to prove the alternative does not mean the null is true, only that you don't have enough evidence to reject it.
The alternative is proven based on empirical evidence. It's the desired test outcome and/or the outcome upon which the burden of proof rests. The significance level ($\alpha$) is set so that the chance of incorrectly proving the alternative is tolerably low.
Having proved the alternative is a much stronger outcome than failing to reject the null. Thus, structure the test so the alternative is what needs proving.

In the Example
In the example, we want to test whether the mod decreases mean down-time. So, the null hypothesis is the status quo and the alternative carries the burden of proof to show a decrease. We write this out as $H_0: \mu = 75$ vs. $H_a: \mu < 75$.
The other possibilities are $H_0: \mu = 75$ vs. $H_a: \mu > 75$, and $H_0: \mu = 75$ vs. $H_a: \mu \ne 75$. What would you be testing with these hypotheses?

Expressing the Null as an Equality
We will express the null hypothesis as an equality and the alternative as an inequality, e.g., $H_0: \mu = 75$ versus $H_a: \mu < 75$.
In reality, the hypotheses divide the real line into two separate regions, e.g., $H_0: \mu \ge 75$ versus $H_a: \mu < 75$. However, the most powerful test occurs when the null hypothesis is at the boundary of its region. Hence, we write the null as an equality.

3. Determine Test Statistic (and its Sampling Distribution)
The test statistic is (a function of) the sample statistic corresponding to the population parameter you are testing. Population and sample statistic examples:
Population mean: sample mean
Difference of two population means: difference of two sample means
It is sometimes a function of the sample statistic, as we may rescale the sample statistic. We use the sampling distribution to determine whether the observed statistic is unusual.

In the Example
In the example, we are testing the mean $\mu$, so the obvious choice for the sample statistic is $\bar{X}$. In this case, it's easier to make the test statistic the rescaled sample statistic,
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$,
where $\mu_0$ is the null hypothesis mean. Why? Because, assuming the null hypothesis is true, we know the sampling distribution of $Z$: $Z \sim N(0,1)$. This is exactly true if $X$ has a normal distribution and approximately true via the CLT for large sample sizes.

4. Calculate Rejection Region
The rejection region depends on the alternative hypothesis. Set the significance level $\alpha$ so that Pr(fall in rejection region | null hyp. is true) = $\alpha$. This means you will have to determine the appropriate quantile or quantiles of the sampling distribution.

The Way to Think About It (for a two-sided test)
Rejection region: unlikely under the null (i.e., probability $\alpha$). If the test statistic falls in this region, reject the null.
Acceptance region: likely under the null hypothesis. If the test statistic falls in this region, fail to reject the null.

In the Example
In our example, the test statistic is $Z \sim N(0,1)$ under the null. In addition, we know $H_a: \mu < 75$, so this is a one-tailed or one-sided test. So, we need to find $z_\alpha$ so that $\Pr(Z < z_\alpha) = \alpha$.
Choosing $\alpha = 0.05$: from Table 4 we get $\Pr(Z < -1.645) = 0.05$. Using Excel: =NORMSINV(0.05). And in R: qnorm(0.05).
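The same quantile lookup can be sketched in Python with the standard library's `statistics.NormalDist`. The second line converts the cutoff back to the original down-time scale, assuming the example's values of a null mean of 75, a standard deviation of 9, and n = 25; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
# Lower-tail critical value: the z with Pr(Z < z) = alpha under N(0, 1).
z_crit = NormalDist().inv_cdf(alpha)

# Equivalent cutoff for the sample mean itself (assumed example values).
mu_0, sigma, n = 75.0, 9.0, 25
x_bar_crit = mu_0 + z_crit * sigma / sqrt(n)
print(f"z_crit = {z_crit:.3f}, reject if sample mean < {x_bar_crit:.2f}")
```

This gives a critical value of about -1.645, which corresponds to rejecting when the sample mean falls below roughly 72 minutes.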

Picturing the Rejection Region
For the rescaled test statistic, the rejection region is in red. The probability of falling in that region, assuming the null is true, is $\alpha = 0.05$.
[Figure: pdf of $Z \sim N(0,1)$ with the region below $-1.645$ shaded.]
Here's the equivalent test without rescaling. The probability of falling in that region, assuming the null is true, is still $\alpha = 0.05$.
[Figure: pdf of $\bar{X} \sim N(75, 81/25)$ with the region below $75 - 1.645 \times 9/5 \approx 72$ shaded.]

5. Calculate the Test Statistic
Now, plug the necessary quantities into the formula and calculate the test statistic. E.g., in the example, imagine we tested 25 pieces of equipment for a month, measuring the total down-time for each. Calculating the sample mean gives $\bar{X} = 68.6$. Thus, the test statistic is
$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{68.6 - 75}{9/\sqrt{25}} = \frac{-6.4}{1.8} \approx -3.56$
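As a check on the arithmetic, here is a minimal Python sketch that plugs the stated values into the formula and compares the result with the lower-tail cutoff. The variable names are illustrative, and the cutoff of -1.645 assumes a significance level of 0.05.

```python
from math import sqrt

x_bar = 68.6   # observed sample mean (mins/month)
mu_0 = 75.0    # mean down-time under the null hypothesis
sigma = 9.0    # known standard deviation
n = 25         # sample size

std_error = sigma / sqrt(n)      # standard error of the sample mean
z = (x_bar - mu_0) / std_error   # rescaled test statistic

z_crit = -1.645                  # lower-tail cutoff, assuming alpha = 0.05
reject_null = z < z_crit
print(f"z = {z:.2f}, reject H0: {reject_null}")
```

With these inputs the standard error is 1.8 and the statistic works out to about -3.56, well inside the rejection region.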

6. Determine the Outcome
Now, compare the test statistic with the rejection region. If it falls within the rejection region, you have rejected the null hypothesis (equivalently, proven the alternative). If it falls in the acceptance region, you have failed to reject the null hypothesis (equivalently, failed to prove the alternative).

In the Example
We observed $z = (68.6 - 75)/1.8 \approx -3.56$, which falls in the rejection region ($z < -1.645$). It would be very unusual to see this if the null were true. Thus, conclude that the mod is effective at reducing mean down-time for the equipment.

One Way to Think About It
The test statistic tells us our observation was about 3.56 standard errors away from the hypothesized mean. The calculation:
$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{68.6 - 75}{9/\sqrt{25}} \approx -3.56$
How likely is this? If the null were true, it would be very unlikely:
$\Pr(Z \le -3.56 \mid H_0 \text{ is true}) = \Phi(-3.56) \approx 0.0002$
Another way to think about it: there is about one chance in 5,000 of seeing something like this or more extreme (i.e., 1/0.0002). That is pretty convincing evidence that the mod decreases mean down-time.
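The tail probability can be evaluated with the standard normal CDF. This is a small Python sketch using `statistics.NormalDist`, assuming the example's inputs (sample mean 68.6, null mean 75, standard deviation 9, n = 25); the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0, sigma, n = 68.6, 75.0, 9.0, 25
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Lower-tail probability: chance of a statistic this small or smaller
# if the null hypothesis were true.
p_value = NormalDist().cdf(z)
one_in = 1.0 / p_value
print(f"z = {z:.2f}, p-value = {p_value:.6f}, about 1 in {one_in:,.0f}")
```

The resulting probability is on the order of two in ten thousand, far below the usual significance levels of 0.10, 0.05, or 0.01.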

Logic of Hypothesis Testing
It's proof by contradiction:
Null hypothesis: suppose I am a general/flag officer.
What I expect if the null hypothesis is true: people would salute me on Tuesdays.
What I see is inconsistent with what I expect: no one ever salutes me.
My initial assumption must be wrong: I must not be a general/flag officer.

Now, for the Example
Null hypothesis: assume $\mu$ is 75.
What you expect to see if the null is true: the sample mean ($n = 25$) is normally distributed with mean 75 and s.e. $9/\sqrt{25} = 1.8$.
Not very likely to happen if the null is true ($p \approx 0.0002$): we observe a sample mean of 68.6, which is about 3.56 s.e.s below the assumed mean.
Null must be false: the real mean $\mu$ must be less than 75.

Large-Sample or z-tests
The statistic $\hat{\theta}$ is (approximately) normally distributed. The rescaled test statistic is
$Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$
The null hypothesis is $H_0: \theta = \theta_0$. There are three possible alternative hypotheses and tests:
Alternative Hypothesis        Rejection Region for Level $\alpha$ Test
$H_a: \theta > \theta_0$      $z > z_\alpha$ (upper-tailed test)
$H_a: \theta < \theta_0$      $z < -z_\alpha$ (lower-tailed test)
$H_a: \theta \ne \theta_0$    $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$ (two-tailed test)
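The three rejection rules can be collected into one small function. The following Python sketch is only an illustration: the function name `z_test`, its argument names, and the `alternative` labels are assumptions, not part of the slides or of any library.

```python
from statistics import NormalDist

def z_test(estimate, null_value, std_error, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for a large-sample z-test at level alpha."""
    z = (estimate - null_value) / std_error
    std_normal = NormalDist()
    if alternative == "greater":      # H_a: theta > theta_0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":       # H_a: theta < theta_0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":  # H_a: theta != theta_0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, reject

# Hypothetical usage: an estimate 2.1 standard errors above the null value.
z_upper, reject_upper = z_test(10.5, 10.0, 0.5 / 2.1, alternative="greater")
z_two, reject_two = z_test(10.5, 10.0, 0.5 / 2.1, alpha=0.01)
```

In the hypothetical usage, the same z of 2.1 leads to rejection in the upper-tailed test at level 0.05 but not in the two-tailed test at level 0.01, since the cutoffs are about 1.645 and 2.576 respectively.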

Picturing z-tests
[Figure: rejection regions for the upper-tailed test, the lower-tailed test, and the two-tailed test.]
Figure from Probability and Statistics for Engineering and the Sciences, 7th ed., Duxbury Press, 2008.

Example 10.5
A VP claims the mean number of contacts per week is less than 15. Data are collected on a random sample of $n = 36$ people. Given $\bar{y} = 17$ and $s^2 = 9$, is there evidence to refute the claim at a significance level of $\alpha = 0.05$?
Specify the hypotheses to be tested, specify the test statistic and the rejection region, then conduct the test and state the conclusion.
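One way to work Example 10.5 is sketched below in Python. It assumes the hypotheses are a null mean of 15 against the upper-tailed alternative that the mean exceeds 15 (evidence against the VP's claim), and it uses the sample standard deviation in place of the unknown population value, as is usual for a large-sample test. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

y_bar, s_squared, n = 17.0, 9.0, 36
mu_0, alpha = 15.0, 0.05

std_error = sqrt(s_squared / n)             # s / sqrt(n) = 3 / 6
z = (y_bar - mu_0) / std_error              # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)    # upper-tail critical value
reject_null = z > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
```

Under these assumptions the statistic is 4, well above the cutoff of about 1.645, so the data would refute the claim at the 0.05 level.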

Large Sample Tests for Population Proportion (p)
If we have a large sample, then via the CLT, $\hat{p}$ has an approximate normal distribution. For the null hypothesis $H_0: p = p_0$ there are three possible alternative hypotheses:
Alternative Hypothesis    Rejection Region for Level $\alpha$ Test
$H_a: p > p_0$            $z > z_\alpha$ (upper-tailed test)
$H_a: p < p_0$            $z < -z_\alpha$ (lower-tailed test)
$H_a: p \ne p_0$          $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$ (two-tailed test)
where the test statistic is
$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$, with $\hat{p} = y/n$.

What's a Large Sample?
Remember that the $y$ in $\hat{p} = y/n$ is a sum of indicator variables. If you sum enough, then the CLT kicks in and you can assume $\hat{p}$ has an approximately normal distribution. So use the same rule we used back in the CLT lecture:
$n > 9 \, \frac{\max(p_0, q_0)}{\min(p_0, q_0)}$

Example 10.6
A machine must be repaired if it produces more than 10% defectives. A random sample of $n = 100$ items has 15 defectives. Is there evidence to support the claim that the machine needs repairing at a significance level of $\alpha = 0.01$?
Specify the hypotheses to be tested, specify the test statistic and the rejection region, then conduct the test and state the conclusion.
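A Python sketch of one way to work Example 10.6, assuming a null proportion of 0.10 against the upper-tailed alternative that the defective rate exceeds 0.10. It also checks the large-sample rule of thumb with q0 = 1 - p0. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

y, n = 15, 100
p_0, alpha = 0.10, 0.01
q_0 = 1 - p_0

# Rule of thumb for the normal approximation: n > 9 * max(p0, q0) / min(p0, q0).
large_enough = n > 9 * max(p_0, q_0) / min(p_0, q_0)

p_hat = y / n
z = (p_hat - p_0) / sqrt(p_0 * q_0 / n)     # standard error uses the null value p0
z_crit = NormalDist().inv_cdf(1 - alpha)    # upper-tail critical value
reject_null = z > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
```

With these inputs the statistic is about 1.67, below the cutoff of about 2.33, so at the 0.01 level there is not enough evidence to conclude that the machine needs repair.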

Example 10.7
A reaction time study on men and women is conducted. Data on independent random samples of 50 men and 50 women are collected, giving $\bar{y}_{men} = 3.6$, $s^2_{men} = 0.18$, $\bar{y}_{women} = 3.8$, and $s^2_{women} = 0.14$. Is there evidence to suggest a difference in the true mean reaction times between men and women at the $\alpha = 0.01$ level?
Specify the hypotheses to be tested, specify the test statistic and the rejection region, then conduct the test and state the conclusion.
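Example 10.7 is a two-sample problem, so the estimator is the difference of sample means and its standard error combines both sample variances. The Python sketch below assumes a null difference of zero against a two-sided alternative; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

y_men, var_men, n_men = 3.6, 0.18, 50
y_women, var_women, n_women = 3.8, 0.14, 50
alpha = 0.01

# Standard error of the difference of two independent sample means.
std_error = sqrt(var_men / n_men + var_women / n_women)
z = (y_men - y_women - 0.0) / std_error
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject_null = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
```

Here the statistic is about -2.5 against cutoffs of roughly plus or minus 2.576, so at the 0.01 level the test narrowly fails to reject the hypothesis of equal means.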

Calculating Type II Error Probabilities
The probability of a Type II error ($\beta$) is the probability that a test fails to reject the null when the alternative hypothesis is true. Note that it depends on a particular alternative hypothesis. We can write it mathematically as
$\beta = \Pr(\text{fail to reject } H_0 \mid H_a \text{ is true with } \theta = \theta_a)$
To determine $\beta$, first figure out the rejection region (a function of the null hypothesis), then calculate the probability of falling in the acceptance region when $\theta = \theta_a$.

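Calculating Type II Error Probabilities, continued
For example, consider the test $H_0: \theta = \theta_0$ versus $H_a: \theta > \theta_0$. Then the rejection region is of the form $RR = \{\hat{\theta} : \hat{\theta} > k\}$ for some value of $k$. So, the probability of a Type II error is
$\beta = \Pr(\hat{\theta} \text{ not in } RR \text{ when } H_a \text{ is true with } \theta = \theta_a) = \Pr(\hat{\theta} \le k \mid \theta = \theta_a) = \Pr\left(\frac{\hat{\theta} - \theta_a}{\sigma_{\hat{\theta}}} \le \frac{k - \theta_a}{\sigma_{\hat{\theta}}}\right)$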

Example 10.8
Returning to Example 10.5, find $\beta$ if $\mu_a = 16$. Remember that $\mu_0 = 15$, $\alpha = 0.05$, $n = 36$, $\bar{y} = 17$, and $s^2 = 9$.
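A Python sketch of the two-step calculation for Example 10.8: first find the cutoff k on the sample-mean scale under the null, then find the probability of falling below k when the true mean is 16. It assumes an upper-tailed test and treats s = 3 as the standard deviation; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_a = 15.0, 16.0
s, n, alpha = 3.0, 36, 0.05
std_error = s / sqrt(n)

# Step 1: rejection region is {y_bar > k}, with k set under the null.
k = mu_0 + NormalDist().inv_cdf(1 - alpha) * std_error

# Step 2: beta = Pr(y_bar <= k) when the true mean is mu_a.
beta = NormalDist(mu=mu_a, sigma=std_error).cdf(k)
power = 1 - beta
print(f"k = {k:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Under these assumptions the cutoff is about 15.82 and beta is about 0.36, so the test would detect a true mean of 16 roughly 64% of the time.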

Alternatively, the Power of a Test
Remember that the power of a test is the probability that the test will reject the null for a particular alternative hypothesis. It's just $1 - \beta$.
Why is this important? Prior to doing a test, you want to make sure you have sufficient power to prove interesting alternatives. Sometimes, after a test gives a null result, you might want to know the probability of rejecting at the observed level.

Sample Size Calculations
The sample size $n$ for which a level $\alpha$ test also has Type II error probability $\beta$ at the alternative value $\mu_a$ is
$n = \frac{\sigma^2 (z_\alpha + z_\beta)^2}{(\mu_a - \mu_0)^2}$ for a one-tailed test
$n = \frac{\sigma^2 (z_{\alpha/2} + z_\beta)^2}{(\mu_a - \mu_0)^2}$ for a two-tailed test (approximate sol'n)
Here $z_\alpha$ and $z_\beta$ are the quantiles of the normal distribution for $\alpha$ and $\beta$.

Example 10.9
Returning to Example 10.5, assuming $\sigma^2 = 9$, what sample size $n$ is required to test $H_0: \mu = 15$ vs. $H_a: \mu = 16$ with $\alpha = \beta = 0.05$?
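The one-tailed sample size formula can be applied to Example 10.9 as follows. This is a minimal Python sketch in which `one_tailed_sample_size` is a hypothetical helper name; the result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def one_tailed_sample_size(mu_0, mu_a, sigma, alpha, beta):
    """Smallest n giving a level-alpha one-tailed z-test Type II error beta at mu_a."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-tail quantile for alpha
    z_beta = NormalDist().inv_cdf(1 - beta)     # upper-tail quantile for beta
    n_exact = (sigma * (z_alpha + z_beta) / (mu_a - mu_0)) ** 2
    return n_exact, ceil(n_exact)

n_exact, n_required = one_tailed_sample_size(15.0, 16.0, 3.0, 0.05, 0.05)
print(f"n = {n_exact:.1f}, so use n = {n_required}")
```

With these inputs the formula gives about 97.4, so 98 observations would be needed.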

Another Example
Consider a test of $H_0: \mu = 30{,}000$ vs. $H_a: \mu \ne 30{,}000$, where the desired significance level is $\alpha = 0.01$ and the population has a normal distribution with $\sigma = 1{,}500$. Find the required sample size so that at $\mu_a = 31{,}000$, the probability of a Type II error is 0.1:
$n = \frac{\sigma^2 (z_{\alpha/2} + z_\beta)^2}{(\mu_a - \mu_0)^2}$
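For this two-tailed case, the approximate formula uses the quantile for alpha/2. The Python sketch below assumes the alternative is two-sided; the variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist

mu_0, mu_a = 30_000.0, 31_000.0
sigma, alpha, beta = 1_500.0, 0.01, 0.10

z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.576
z_beta = NormalDist().inv_cdf(1 - beta)              # about 1.282

# Approximate two-tailed sample size, rounded up to a whole observation.
n_exact = (sigma * (z_half_alpha + z_beta) / (mu_a - mu_0)) ** 2
n_required = ceil(n_exact)
print(f"n = {n_exact:.2f}, so use n = {n_required}")
```

Under these assumptions the formula gives roughly 33.5, so a sample of 34 would be needed.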

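Relationship Between Hypothesis Tests and Confidence Intervals
The large sample hypothesis test (z-test) is based on the statistic
$Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$
where the acceptance region for a two-tailed test is
$\overline{RR} = \left\{ -z_{\alpha/2} \le \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} \le z_{\alpha/2} \right\}$
which we can write as
$\overline{RR} = \left\{ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}} \le \theta_0 \le \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right\}$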

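Relationship Between Hypothesis Tests and Confidence Intervals
Now, remember the general form for a two-sided large sample confidence interval:
$100(1-\alpha)\%\ \text{CI} = \left[ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}},\ \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right]$
Note the similarities to the acceptance region:
$\overline{RR} = \left\{ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}} \le \theta_0 \le \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right\}$
So, if $\theta_0 \in \left[ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}},\ \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right]$, we would fail to reject the hypothesis test. We can interpret the $100(1-\alpha)\%$ CI as the set of all values of $\theta_0$ for which $H_0: \theta = \theta_0$ is acceptable at level $\alpha$.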
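The equivalence between the interval and the two-tailed test can be illustrated numerically. This Python sketch uses an estimate of 17 with standard error 0.5 (borrowed from Example 10.5) and a handful of candidate null values chosen purely for illustration; the function names are hypothetical.

```python
from statistics import NormalDist

def two_sided_ci(estimate, std_error, alpha):
    """Large-sample 100(1 - alpha)% confidence interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * std_error, estimate + z * std_error

def fails_to_reject(estimate, null_value, std_error, alpha):
    """True when the two-tailed level-alpha z-test does not reject H0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs((estimate - null_value) / std_error) <= z_crit

estimate, std_error, alpha = 17.0, 0.5, 0.05
low, high = two_sided_ci(estimate, std_error, alpha)

# For each candidate null value, "inside the CI" should match "not rejected".
candidates = [15.0, 16.5, 17.0, 18.5]
agreement = [
    (low <= theta_0 <= high) == fails_to_reject(estimate, theta_0, std_error, alpha)
    for theta_0 in candidates
]
print(f"CI = [{low:.2f}, {high:.2f}], all agree: {all(agreement)}")
```

The interval here runs from about 16.02 to 17.98, so null values of 16.5 and 17 are retained while 15 and 18.5 are rejected, exactly as the test decides.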

What We Covered in this Module
Introduced hypothesis testing: steps in conducting a hypothesis test, types of errors, and lots of terminology.
Large-sample tests for means and proportions (aka z-tests).
Type II error probability calculations.
Sample size calculations.

Homework
WM&S chapter 10
Required: 2, 3, 20, 21, 27, 34, 38, 41, 42
Extra credit: None
Useful hints: Always first write out the null and alternative hypotheses. I also recommend drawing the null distribution and then highlighting the rejection region. This can be particularly helpful when calculating $\beta$.
Exercises 21 and 27: We didn't do these types of problems in class, but just use what you learned with two-sample confidence intervals. The relevant hypotheses are $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$, and $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \ne 0$.