Chapter 1: Where Do Data Come From?

Thought Question
Researchers observed that children who get tutoring get worse grades than children who do not get tutored. Should we conclude that tutoring lowers grades? Is there a more reasonable conclusion?

Thought Question
In 1997, the Orlando Sentinel released the results of a poll in which more than 90% of those people who called in to the paper said that Orlando's NBA team, the Orlando Magic, should not re-sign its center, Shaquille O'Neal, for the amount of money he was asking. Based on this poll, would you conclude that over 90% of Orlando's population felt that the team should not re-sign Shaquille O'Neal?

Data and Statistics

Data
    pieces of information.
    numbers.
    measurements.

Terminology
    Individuals: the objects described by a set of data. Individuals may be people, animals, or things.
    Variable: any characteristic of an individual.

Statistics
    The science of data.
    The study of how to collect, organize, analyze and interpret information.
    A process to make decisions in the presence of uncertainty without prejudice.
    The goal of statistics is to gain information from data.

Example: Course Data

    Name              Major    Points    Grade
    Advani, Sura      Math     397       B
    Barton, David     Chem     323       C
    Brown, Annette    Lit      446       A
    Chiu, Sun         Psych    405       B
    Cortez, Maria     Psych    461       A

Who are the individuals? students
What are the variables? major, points, and grade
Which variables are numerical (quantitative)? points
Which variables are non-numerical (categorical)? major, grade
What are the benefits of using numerical variables? can calculate numbers which describe the data
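
As a minimal sketch of what "can calculate numbers which describe the data" means, the course data above can be summarized in Python. The numerical variable (points) supports arithmetic such as a mean, while the categorical variables (major, grade) can only be counted. The variable names below are chosen for illustration and are not part of the original notes.

```python
from collections import Counter
from statistics import mean

# Course data from the example: (name, major, points, grade)
students = [
    ("Advani, Sura", "Math", 397, "B"),
    ("Barton, David", "Chem", 323, "C"),
    ("Brown, Annette", "Lit", 446, "A"),
    ("Chiu, Sun", "Psych", 405, "B"),
    ("Cortez, Maria", "Psych", 461, "A"),
]

# Numerical (quantitative) variable: arithmetic is meaningful.
points = [row[2] for row in students]
mean_points = mean(points)
point_range = max(points) - min(points)

# Categorical variables: we can count categories, but not average them.
major_counts = Counter(row[1] for row in students)
grade_counts = Counter(row[3] for row in students)

print("Mean points:", mean_points)
print("Range of points:", point_range)
print("Students per major:", dict(major_counts))
print("Students per grade:", dict(grade_counts))
```

Averaging the majors or grades would make no sense, which is why the sketch only tallies them.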

Collecting Data: Observational Studies

Terminology
    Population: entire group of individuals about which we want information.
    Sample: subset of individuals from which information is collected.
    Observational study: observes individuals and measures variables of interest but does not attempt to influence the responses.
    Sample Survey: observational study where the population is studied by observing a sample.
    Census: observational study which attempts to include the entire population in the sample.

Examples
    Population: all American adults, all WSU students
    Sample: 2,531 randomly selected American adults, 15 WSU students from MI
    Sample Survey: public opinion polls, pre-election polls, teacher evaluations
    Census: US Census (every 10 years)
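
The difference between a population, a sample, and a census can be sketched in a few lines of Python. This is only an illustration: the population size of 20,000 and the function name `draw_simple_random_sample` are assumptions made up for the example, and the ID numbers stand in for real individuals.

```python
import random

def draw_simple_random_sample(population, n, seed=None):
    """Return n distinct individuals chosen at random from the population.

    Hypothetical helper for illustration; the seed only makes the
    sketch reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)  # sampling without replacement

# Assumed population: ID numbers for 20,000 students (made-up size).
population = list(range(1, 20001))

# Sample survey: observe only a subset of the population.
sample = draw_simple_random_sample(population, 15, seed=2024)

# Census: attempt to include every individual in the population.
census = list(population)

print("Sample size:", len(sample))
print("Census size:", len(census))
```

Because the sample is drawn without replacement, no individual appears in it twice.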

Collecting Data: Experiments

Terminology
    Experiment: deliberately imposes some treatment on individuals in order to observe their responses. Studies cause and effect relationships.

Examples
    Is a new drug effective?
    Does collecting homework improve student grades?

Warning: Statistical conclusions hold on average, not necessarily for individuals.
Warning: Experiments can raise ethical problems, e.g., does smoking cause lung cancer? (It would be unethical to assign subjects to smoke.)

Case Study: The Effect of Hypnosis on the Immune System
Reported in Science News, Sept. 4, 1993, p. 153.

Objective: to determine if hypnosis strengthens the disease-fighting capacity of immune cells.

Individuals: 65 college students, 33 who are easily hypnotized and 32 who are not easily hypnotized.

Method:
    1. Students' white blood cell counts were measured.
    2. Students were randomly assigned to one of three conditions:
           subjects hypnotized
           subjects relaxed in sensory deprivation tank
           control group (no treatment)
    3. White blood cell counts were re-measured after one week.
    4. The two white blood cell counts were compared for each group.

Results: Students who were hypnotized showed a larger jump in white blood cells than those who relaxed or received no treatment. Students who were easily hypnotized showed the largest immune enhancement.

What is the population? college students
What is the sample? 65 students
What data were collected? easy or difficult to hypnotize, group assignment, pre-study white blood cell count, post-study white blood cell count
Is this an experiment or an observational study? experiment
Does hypnosis affect the immune system? Quite possibly, at least for college students.
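
Step 2 of the method, random assignment, might look something like the sketch below. The case study does not report the group sizes or the exact randomization procedure, so the near-equal split (22, 22, 21) and the function name `assign_conditions` are assumptions made purely for illustration.

```python
import random

CONDITIONS = ["hypnosis", "sensory deprivation", "control"]

def assign_conditions(subject_ids, conditions, seed=None):
    """Randomly split subjects into near-equal groups, one per condition.

    Hypothetical helper: shuffles the subjects, then deals them out
    to the conditions in turn, like dealing cards.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, subject in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(subject)
    return groups

# 65 students, identified here only by the numbers 1 to 65.
groups = assign_conditions(range(1, 66), CONDITIONS, seed=1)

for condition, members in groups.items():
    print(condition, len(members))
```

The key point is that chance, not the researcher or the student, decides who receives which treatment.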

Chapter 1 Exercises

1. A press release by the Gallup News Service says that it found 75% of Americans saying the entertainment industry should make a serious effort to reduce the amount of sex and violence in its movies, TV shows, and music. Toward the end of the article, you read: "These polls are based on telephone interviews with a randomly selected national sample of 1008 adults." What variable did this poll measure? What population do you think Gallup wants information about? What was the sample?

    Variable: whether or not the entertainment industry should reduce sex and violence.
    Population: all American adults.
    Sample: 1008 American adults.

2. Does eating oatmeal reduce the level of bad cholesterol (LDL)? Here are two ways to study this question:

    1. A researcher finds 500 adults over 40 who regularly eat oatmeal or products made from oatmeal. She matches each with a similar adult who does not regularly eat oatmeal or products made from oatmeal. She measures the LDL for each adult and compares both groups.
    2. Another researcher finds 1000 adults over 40 who do not regularly eat oatmeal or products made from oatmeal and are willing to participate in a study. She randomly assigns 500 of these to a diet that includes a daily breakfast of oatmeal. The other 500 continue their usual habits. After 6 months she compares changes in LDL levels.

    (a) Explain why the first is an observational study and the second is an experiment.

    In the first study, subjects are assigned to groups based on their own habits; treatments (eating oatmeal or not) are not imposed on them. In the second setting, a treatment is imposed: each subject is randomly assigned to eat oatmeal or to continue with his/her usual habits.

    (b) Why does the experiment give more useful information about whether oatmeal reduces LDL?

    In the observational study, there may be other factors (e.g., genetic background, other dietary habits) that make one more likely to eat oatmeal and less (or more) likely to have high LDL. This means we could not conclude that oatmeal causes lower LDL; the lower LDL may simply be a result of some other factor.
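
The reasoning in (b) can be illustrated with a small simulation. Everything in it is invented for the sketch: a hidden "health-conscious" trait lowers LDL by 15 points, oatmeal itself has no effect at all, and health-conscious adults are more likely to choose oatmeal. It also compares LDL levels directly rather than changes over 6 months, so it is a simplification of the two study designs, not a model of them.

```python
import random
from statistics import mean

def ldl_gap(n, randomized, seed):
    """Mean LDL of oatmeal eaters minus mean LDL of everyone else.

    Hypothetical simulation: oatmeal has NO true effect on LDL here,
    so any gap comes from who ends up in which group.
    """
    rng = random.Random(seed)
    oatmeal_ldl, other_ldl = [], []
    for _ in range(n):
        health_conscious = rng.random() < 0.5  # hidden factor (assumed)
        if randomized:
            # Experiment: a coin flip decides who eats oatmeal.
            eats_oatmeal = rng.random() < 0.5
        else:
            # Observational study: people choose for themselves.
            eats_oatmeal = rng.random() < (0.8 if health_conscious else 0.2)
        # Made-up LDL model: only the hidden factor matters.
        ldl = rng.gauss(130, 10) - (15 if health_conscious else 0)
        (oatmeal_ldl if eats_oatmeal else other_ldl).append(ldl)
    return mean(oatmeal_ldl) - mean(other_ldl)

observational_gap = ldl_gap(10_000, randomized=False, seed=0)
experimental_gap = ldl_gap(10_000, randomized=True, seed=0)

print("Observational gap:", round(observational_gap, 1))
print("Experimental gap:", round(experimental_gap, 1))
```

Under these assumptions the observational comparison shows oatmeal eaters with clearly lower LDL even though oatmeal does nothing, while the randomized comparison shows a gap close to zero, because random assignment spreads the hidden factor evenly across both groups.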