THE NEW STATISTICAL ANALYSIS OF DATA


Springer New York Berlin Heidelberg Barcelona Budapest Hong Kong London Milan Paris Santa Clara Singapore Tokyo

THE NEW STATISTICAL ANALYSIS OF DATA

T.W. Anderson
Stanford University

Jeremy D. Finn
State University of New York at Buffalo

Springer

T.W. Anderson
Stanford University
Department of Statistics
Stanford, CA 94305
USA

Jeremy D. Finn
State University of New York at Buffalo
Graduate School of Education
Buffalo, NY 14260-1000
USA

Library of Congress Cataloging-in-Publication Data
Anderson, T.W. (Theodore Wilbur), 1918-
The new statistical analysis of data / T.W. Anderson, Jeremy D. Finn.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-1-4612-8466-6
e-ISBN-13: 978-1-4612-4000-6
DOI: 10.1007/978-1-4612-4000-6
1. Statistics. I. Finn, Jeremy D. II. Title. III. Series.
QA276.12.A46 1996
519.5-dc20 95-44885

Printed on acid-free paper.

© 1996 Springer-Verlag New York, Inc.
Softcover reprint of the hardcover 1st edition 1996

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by Bill Imbornoni; manufacturing supervised by Joe Quatela.
Typeset by Integre Technical Publishing Co., Inc., Albuquerque, NM.

9 8 7 6 5 4 3

To Dorothy and Kristin

Preface

The Nature of the Book

This book is a text for a first course in statistical concepts and methods. It introduces the analysis of data and statistical inference and explains various methods in enough detail so that the student can apply them. Little mathematical background is required; only high school algebra is used. No mathematical proof is given in the body of the text, although algebraic demonstrations are given in appendices at the ends of some chapters. The exposition is based on logic, verbal explanations, figures, and numerical examples. The verbal and conceptual levels are higher than the mathematical level.

The concepts and methods of statistical analysis are illustrated by more than 100 interesting real-life data sets. Some examples are taken from daily life; many deal with the behavioral sciences, some with business, the health sciences, the physical sciences, and engineering. The exercises are of varying degrees of difficulty.

This book is suitable for undergraduates in many majors and for graduate students in the health sciences, behavioral sciences, and education. It has grown out of our experience over many years of teaching such courses.

An earlier text by T. W. Anderson and S. L. Sclove, The Statistical Analysis of Data, had similar objectives and was of a similar nature. The organization of The New Statistical Analysis of Data follows an outline like that of the former book. However, the explanations of statistical topics are more elementary, detailed, and comprehensive in the new book. Instead of one chapter on the organization of data, there is one chapter on the description of quantitative measurements and another chapter on qualitative variables. Probability distributions are treated more thoroughly, with emphasis on the normal distribution. Many more examples appear throughout The New Statistical Analysis of Data.

Statistics and Computers

It is expected that in putting their statistical knowledge into practice students will use computers to carry out the procedures they have learned. Indeed, the computer has become an essential tool for researchers analyzing statistical data. The computer furnishes more accurate results than either hand computation or a calculator and is more efficient with respect to time and effort. Most courses provide an opportunity for students to learn about statistical computing together with the statistical content itself.

At the present time there are a number of computer packages available for performing basic statistical procedures, each with its advantages and disadvantages. Rather than basing the entire text on one of these packages, the authors are preparing a separate volume with instructions for a general-purpose package, SPSS, which will be available from the publisher. The instructor who wishes to use SPSS as part of the course will find the self-teaching supplement invaluable. The instructor who has a preference for another statistical package (or none) will still find the textbook appropriate.

The SPSS package is user-friendly and will carry out virtually all of the procedures described in this text. The SPSS Guide to the New Statistical Analysis of Data provides step-by-step instructions on the use of the program for performing these analyses plus guidelines for locating and interpreting results in the output files. Many of the illustrations in the guide use specific data sets and analyses presented in the text. Alternative data sets are presented to give the reader additional experience with the methodology; a number of recommended "computer exercises" are furnished.

The Organization of the Book

Part One, "Introduction," shows how statistical methods are used in several substantive fields for answering important questions. Part Two, "Descriptive Statistics," considers the organization of data and summarization by means of descriptive statistics; the student learns how to approach information that comes in numerical form. Both univariate and multivariate data are considered. Part Three, "Probability," develops the ideas of probability to form the basis of statistical inference. Part Four, "Statistical Inference," begins by considering the use of sample characteristics to estimate population characteristics; the idea of variability of sample quantities leads directly to confidence intervals. Part Four also discusses tests of hypotheses and the allied concepts of significance level and power. This part treats some of the basic methods for means, proportions, and variances. Part Five, "Statistical Methods for Other Problems," includes techniques such as chi-square tests, analysis of variance, regression analysis, and sample surveys.

The Use of the Book

The 17 chapters in the book provide enough material for a course of two semesters or three quarters, although it is anticipated that the most common use will be for a one-semester introductory course. In some chapters there are starred sections (*) that may be omitted without affecting the understanding of subsequent sections; some such material is put into appendices in order not to burden the main development.

Chapter 1 presents examples of the use of statistics in studies of general interest and some basic concepts of statistics. Chapters 2 through 6 are largely descriptive statistics. Most beginning students will start with these chapters. Association is discussed at length in Chapters 5 and 6, which include many examples of both numerical and categorical data.

Chapters 7, 8, and 9 on probability and probability distributions provide the theoretical background for statistical inference. If the course is of an applied nature, some of the more formal parts of these chapters may be given less emphasis.

The ideas of estimation and testing hypotheses in Chapters 10 and 11 are the basic ingredients of inferential statistics and require in-depth attention. If time is at a premium, the sections on nonparametric procedures (Sections 10.5 and 11.6) can be omitted. Chapter 12, dealing with statistical inference based on two samples, presents useful methods. Again, the instructor may omit the material on nonparametric tests (Section 12.5) if time is short. Chapter 13 on variances in one and two populations presents chi-square and F-tests, which are based on an assumption of normal populations. An instructor may choose to emphasize certain topics, for example, the F-test.

Chapter 14 presents statistical inference for "contingency" tables. Chapter 15 discusses simple regression and correlation. Chapter 16 is on the analysis of variance; the nonparametric material is optional (Section 16.3). Chapter 17 presents an introduction to the basic ideas of sample surveys. In a one-quarter or one-semester course in certain disciplines (e.g., sociology or management), the instructor may prefer to include this material instead of the analysis of variance chapter.

The chapters are divided into sections (numbered 1.1 and 1.2 in Chapter 1, for example). Many sections are in turn presented in subsections. The effect of this hierarchy of chapters, sections, and subsections is to form an outline of the material. The summary at the end of each chapter reviews that chapter's contents and helps the student determine whether the important concepts have been learned.

The material of the book is organized so that the 17 chapters provide a basis for courses of varying lengths. Some guidelines for coverage are as follows.

A one-year course (two semesters or three quarters). All 17 chapters can be covered in a one-year course. Some instructors may wish to omit the starred sections and possibly de-emphasize other relatively advanced sections, such as Sections 6.3 and 6.4 on effects of a third variable on the relationship between two given variables.

A two-quarter course. One can go in the direction of either breadth or depth. A survey course covering all 17 chapters would mean (i) the omission of starred sections, (ii) de-emphasis of Chapter 7, and (iii) omission of peripheral topics in Chapters 13-16. An instructor can design a course which goes in the direction of depth rather than breadth by omitting one or more of Chapters 13-17.

A one-semester course. The first twelve chapters comprise a text for a one-semester course. Some instructors may wish to include smatterings of later chapters, such as 13, 14, 15, or 16.

A one-quarter course. The first twelve chapters provide a basis for such a course. The instructor can omit starred sections and de-emphasize Chapter 7. Again, some teachers may wish to include parts of Chapters 13, 14, 15, or 16.

In order to learn statistical concepts and methods, hands-on experience with many data sets, real and simulated, is essential. To provide practice, numerous exercises are provided at the end of each chapter. These apply the ideas presented in the chapter and demonstrate the utility of the statistical methods to a broad range of problems. Answers to selected exercises appear at the end of the book. The instructor has many options in assigning exercises of varying difficulty, either with or without answers available. The instructor can obtain a complete Solutions Manual from the publisher.

Acknowledgments

We are indebted to Stanley L. Sclove and the many colleagues, teaching assistants, typists, and reviewers who helped in the preparation of the texts preceding this volume. We wish to thank the staff of Springer-Verlag, Bill Imbornoni, and several anonymous reviewers for their many helpful comments and suggestions in preparing the present text, and Marjorie Weinstock for assistance with the references. A special note of gratitude is extended to Ingram Olkin for encouraging us to begin this book and for useful suggestions during its preparation.

T.W.A.
J.D.F.
March 1996

Contents

Preface

Part I: Introduction

Chapter 1 - The Nature of Statistics
  1.1 Some Examples of the Use of Statistics
      Political Polls; The Polio Vaccine Trial; Smoking and Health; Project "Head Start"; "A Minority of One Versus a Unanimous Majority"; Deciding Authorship
  1.2 Basic Concepts of Statistics
      Experimental and Nonexperimental Research; Populations and Samples; Descriptive and Inferential Statistics; Planning Statistical Investigations

Part II: Descriptive Statistics

Chapter 2 - Organization of Data
  2.1 Kinds of Variables: Scales
      Categorical Variables; Numerical Variables; Scales
  2.2 Organization of Categorical Data
      Frequencies; Frequency Graphs
  2.3 Organization of Numerical Data
      Discrete Data; Continuous Data; Frequency Distributions for Continuous Data

Chapter 3 - Measures of Location
  3.1 The Mode
      Definition and Interpretation of the Mode; Mode of Grouped Data
  3.2 The Median and Other Percentiles
      The Median; Quartiles; Deciles, Percentiles, and Other Quantiles
  3.3 The Mean
      Definition and Interpretation of the Mean; Use of Notation; Calculating the Mean from a Frequency Distribution; The Proportion as a Mean; Other Properties of the Mean; Effects of Change of Scale
  3.4 Choosing Among Measures of Location
      Shape of Distributions
  Appendices
      Appendix 3A Computing the Median and Other Quantiles of Grouped Continuous Data
      Appendix 3B Rules for Summation
      Appendix 3C Change of Scale
      Appendix 3D Significant Digits

Chapter 4 - Measures of Variability
  4.1 Ranges
      The Range; The Interquartile Range
  4.2 The Mean Deviation
  4.3 The Standard Deviation
      Definitions; Reasons for Dividing by One Less than the Sample Size; Interpreting the Standard Deviation
  4.4 Formulas for the Standard Deviation
      Computing Formula; Calculating the Standard Deviation from a Frequency Distribution; Effects of Change of Scale
  4.5 Some Uses of Location and Dispersion Measures Together
      Standard Scores; Box-and-Whisker Plots
  Appendices
      Appendix 4A Proofs of Some Algebraic Principles
      Appendix 4B Adjusting Data to Maintain Computational Accuracy

Chapter 5 - Summarizing Multivariate Data: Association Between Numerical Scales
  5.1 Association of Two Numerical Variables
      Scatter Plots; Other Information Revealed by Scatter Plots; The Correlation Coefficient; Rank Correlation
  5.2 More than Two Variables
      Profiles; Correlation Matrix
  Appendices
      Appendix 5A Computational Form for the Covariance
      Appendix 5B Change of Scale

Chapter 6 - Summarizing Multivariate Data: Association Between Categorical Variables
  6.1 Two-by-Two Frequency Tables
      Organization of Data into Two-by-Two Tables; Calculation of Percentages; Interpretation of Frequencies
  6.2 Larger Two-Way Frequency Tables
      Organization of Data for Two Categorical Variables; Interpretation of Frequencies
  6.3 Three Categorical Variables
      Organization of Data for Three Yes-No Variables; Larger Three-Way Frequency Tables
  6.4 Effects of a Third Variable
      Association and Interpretation; Independence in Subtables; Similar Association in Subtables; Reversal of Association in Subtables; Hidden Relationships

Part III: Probability

Chapter 7 - Basic Ideas of Probability
  7.1 Intuitive Examples of Probability
      Physical Devices Which Approximate Randomness; The Draft Lottery; Probability and Everyday Life
  7.2 Probability and Statistics
  7.3 Probability in Terms of Equally Likely Cases
  7.4 Events and Probabilities in General Terms
      Outcomes, Events, and Probabilities; Addition of Probabilities of Mutually Exclusive Events; Addition of Probabilities
  7.5 Interpretation of Probability: Relation to Real Life
  7.6 Conditional Probability
  7.7 Independence
  7.8 Random Sampling; Random Numbers
      Random Devices; Random Numbers; Sampling with Replacement; Sampling without Replacement
  7.9 Bayes' Theorem
      Bayes' Theorem in a Simplified Case; Examples; Use of Subjective Prior Probabilities in Bayes' Theorem

Chapter 8 - Probability Distributions
  8.1 Random Variables
  8.2 Cumulative Probability
  8.3 The Mean and Variance of a Probability Distribution
      The Mean of a Discrete Random Variable; The Variance of a Discrete Random Variable; The Mean and Variance of a Continuous Random Variable
  8.4 Uniform Distributions
      The Discrete Uniform Distribution; The Continuous Uniform Distribution
  8.5 The Family of Normal Distributions
      The Normal Distributions; Different Normal Distributions; The Standard Normal Distribution; Other Normal Distributions

Chapter 9 - Sampling Distributions
  9.1 Sampling from a Population
      Random Samples; Sampling Distributions; Independence of Random Variables; Sampling from a Probability Distribution
  9.2 Sampling Distributions of a Sum and of a Mean
      Sampling Distribution of a Sum; Sampling Distribution of the Sample Mean
  9.3 The Binomial Distribution
      Sampling Distribution of the Number of Heads; Proportion of Heads in Bernoulli Trials; The Mean and Variance of the Binomial Probability Distribution
  9.4 The Law of Averages (Law of Large Numbers)
  9.5 The Normal Distribution of Sample Means
      The Central Limit Theorem; Normal Approximation to the Binomial Distribution
  Appendix 9A The Correction for Continuity

Part IV: Statistical Inference

Chapter 10 - Using a Sample to Estimate Characteristics of One Population
  10.1 Estimation of a Mean by a Single Number
  10.2 Estimation of Variance and Standard Deviation
  10.3 An Interval of Plausible Values for a Mean
      Confidence Intervals when the Standard Deviation is Known; Confidence Intervals when the Standard Deviation is Estimated
  10.4 Estimation of a Proportion
      Point Estimation of a Proportion; Interval Estimation of a Proportion
  10.5 Estimation of a Median
      Point Estimation of a Median; Interval Estimation of a Median*
  10.6 Paired Measurements
      Mean of a Population of Differences; Matched Samples
  10.7 Importance of Size of Population Relative to Sample Size*
  Appendix 10A The Continuity Adjustment

Chapter 11 - Answering Questions about Population Characteristics
  11.1 Testing a Hypothesis About a Mean
      An Example and Terminology; Hypothesis Testing Procedures; Deciding Whether a Population Mean Differs from a Given Value; Relation of Two-Tailed Tests to Confidence Intervals; Validity Conditions
  11.2 Errors and Power
      Types of Error; Probability of a Type I Error; Probability of a Type II Error and Power
  11.3 Testing Hypotheses About a Mean when the Standard Deviation Is Unknown
  11.4 P Values: Another Way to Report Tests of Significance
      P Values for Two-Tailed Tests; P Values when the Population Standard Deviation Is Unknown
  11.5 Testing Hypotheses About a Proportion
      Testing Hypotheses About the Probability of a Success; Example of a Two-Tailed Test
  11.6 Testing Hypotheses About a Median: The Sign Test*
  11.7 Paired Measurements
      Testing Hypotheses About the Mean of a Population of Differences; Testing the Hypothesis of Equality of Proportions
  Appendix 11A The Continuity Correction

Chapter 12 - Differences Between Populations
  12.1 Comparison of Two Independent Sample Means When the Population Standard Deviations Are Known
      One-Tailed Tests; Two-Tailed Tests; Confidence Intervals; Validity Conditions
  12.2 Comparison of Two Independent Sample Means When the Population Standard Deviations Are Unknown but Treated as Equal
      Confidence Intervals
  12.3 Comparison of Two Independent Sample Means When the Population Standard Deviations Are Unknown and Not Treated as Equal*
  12.4 Comparison of Two Independent Sample Proportions
      Hypothesis Tests; Confidence Intervals
  12.5 The Sign Test for a Difference in Locations*
  Appendix 12A The Continuity Adjustment

Chapter 13 - Variability in One Population and in Two Populations
  13.1 Variability in One Population
      The Sampling Distribution of the Sum of Squared Deviations; Testing the Hypothesis that the Variance Equals a Given Number; Confidence Intervals for the Variance*
  13.2 Variability in Two Populations
      The Sampling Distribution of the Ratio of Two Sample Variances; Testing the Hypothesis of Equality of Two Variances; Confidence Intervals for the Ratio of Two Variances*

Part V: Statistical Methods for Other Problems

Chapter 14 - Inference on Categorical Data
  14.1 Tests of Goodness of Fit
      Two Categories - Dichotomous Data; Any Number of Categories; Combining Categories
  14.2 Chi-Square Tests of Independence
      Two-by-Two Tables; Two-Way Tables in General; Combining Categories
  14.3 Measures of Association
      The Phi Coefficient; A Coefficient Based on Prediction; A Coefficient Based on Ordering*
  Appendix 14A The Continuity Adjustment

Chapter 15 - Simple Regression Analysis
  15.1 Functional Relationship
  15.2 Statistical Relationship
  15.3 Least-Squares Estimates
  15.4 Statistical Inference for β
  15.5 The Correlation Coefficient: A Measure of Linear Relationship
      The Bivariate Normal Distribution and Test of Significance for a Correlation

Chapter 16 - Comparison of Several Populations
  16.1 One-Way Analysis of Variance
      A Complete Example; The Algebra of ANOVA; An Example with Unequal Sample Sizes; More About the Analysis of Variance
  16.2 Which Groups Differ from Which, and by How Much?
      Comparing Two Means; Comparing More Than Two Means
  16.3 Analysis of Variance of Ranks*

Chapter 17 - Sampling from Populations: Sample Surveys
  17.1 Simple Random Sampling
  17.2 Stratified Random Sampling
      Two Strata; More than Two Strata
  17.3 Cluster Sampling
  17.4 Systematic Sampling with a Random Start
  17.5 Systematic Subsampling with Random Starts

Answers to Selected Exercises

References

Appendices
  Appendix I Table of the Standard Normal Distribution
  Appendix II Table of Binomial Probabilities
  Appendix III Percentage Points of Student's t-distributions
  Appendix IV Percentage Points of Chi-Square Distributions
  Appendix V Upper Percentage Points of F-Distributions
  Appendix VI Table of Random Digits

Index