Project: Joint Probability (AKA, multiple probabilities)


At this point, we're ready to start dealing with more advanced probabilistic analysis. Although the single probabilities from the first project were important (and will continue to be important in certain applications), you will often need to calculate an event space and sample space that involve more than one action. To that end, the probability distributions that we will be discussing in class will be immensely helpful, but this project will introduce just what is happening when we use those distributions.

Part 1. A Familiar Data set to Introduce a New Idea

Remember the dataset from the Punxsutawney Phil problem in the last project? Of course you don't, so here it is again (a):

                            right   wrong
Early spring prediction       8       1
Late spring prediction        4       3

We're starting with this one, since the numbers are fairly manageable, and will help us to see the big idea of this project (joint probabilities) without getting us lost in arithmetic. Joint probability is used whenever you randomly select more than one element from a sample space (compare this to your last project, when every problem started with "randomly select one"). Now, if you were to look at any statistical study, you'd be certain that the sample size was greater than 1. In MTH 244, you'll learn how to decide just how big the sample should be, but, at this point, it suffices to say that sometimes we need to deal with probabilities of more than one thing at a time. What the heck, let's jump in!

Example 1. Randomly select two predictions from this CT. What's the chance they were both right?

Answer 1. Now, we know the definition of probability:

probability = (size of event space) / (size of sample space)

So, this question is a matter of finding those two numbers, and then dividing them. Let's start with the sample space. Well, imagine drawing the first prediction. There are clearly 16 ways to do this; so far, so good. How about the next prediction?
Well, since you've already taken one prediction out of the mix, there are only 15 left from which to choose. So, the question remains: how many ways can we select the first prediction AND then the second one? In other words, how do we put together the 16 and the 15? Well, you might remember, from the first project, that the word "or" means to add. In this project, that still holds true (as you will see), but we now have to deal with a new word: "and". In a nutshell, "and" means to multiply, and here's why. Imagine that the predictions are labeled 1, 2, 3, ..., 16 (it doesn't matter which is which, so long as each is labeled with one number). Using Excel, we can generate the sample space fairly quickly, as such:

(a) Actually, this dataset is only accurate from 1995 to 2011; your dataset from the first project has the most up-to-date data.

First prediction   Possible second predictions
       1           2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
       2           1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
       3           1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
       4           1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
       5           1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
       6           1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
       7           1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16
       8           1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16
       9           1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16
      10           1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16
      11           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16
      12           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16
      13           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16
      14           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16
      15           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16
      16           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

Whew! I'm kinda tired (even though Excel did all of the work). Anyway, tired or not, I have to add up all of those possibilities, and I get 240 of them. So, the size of the sample space for this question is 240. Now, back to my question from above: how could we have gotten the number 240 from 16 and 15 (remember those?) without having to draw out all of these possibilities? The answer (as you may have guessed) is to multiply: 16*15 = 240.
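The count of 240 can be double-checked with a few lines of code. What follows is only a sketch in Python (the handout itself uses Excel), and the variable names are made up for illustration. It lists every ordered (first, second) pair with no repeated label and compares the count to 16*15.

```python
from itertools import permutations

# Predictions labeled 1 through 16, as in the listing above.
labels = range(1, 17)

# Every ordered (first, second) pair in which the same label is not used twice.
ordered_pairs = list(permutations(labels, 2))

print(len(ordered_pairs))  # size of the sample space found by listing
print(16 * 15)             # size of the sample space found by multiplying
```

Both printed numbers should agree, which is the whole point: multiplying saves you from listing.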
Ta da! This multiplication idea is actually summarized pretty nicely using something called the fundamental counting principle (b) (FCP):

To find the sample (or event) space created by a joint probability, multiply the possibilities of each of the joint components.

That's it! Let's try it out on the event space from this one:

(b) Actually, it's a theorem, but no one ever calls it that. Being a theorem, it's provable, and I've done it for you on the enrichment page.

Remember that we're trying to find p(both right), so the event space must be the number of ways to select one correct prediction followed by a second. Well, there are 12 correct predictions presented in the chart. If I select one of them first, then the second one has only 11 remaining ways to be chosen, so the event space would have to be 12*11 = 132. So,

p(both right) = 132/240 = 55%.

A note about sampling: the kind of selection we did in the previous problem is called sampling without replacement; this is because you selected the first item (in this case, a prediction), then sampled successive items without reusing items already sampled. This is also called dependent sampling, since successive pulls depend on pulls that happened prior. Independent sampling (or sampling with replacement) would have occurred if we had been able to select the same prediction twice; in that case the event space would have been 12*12 = 144, and the sample space would have been 16*16 = 256, so

p(both right) = 144/256 ≈ 56%.

While dependent sampling is more realistic to use, sometimes independent sampling, which is easier to calculate, can be substituted with little to no penalty (for example, see how the two answers were nearly the same?). More on that in a bit; for now, let's practice using the FCP.

Example 2. Randomly select three predictions from this CT, without replacement. What's the chance all three were wrong?

Answer 2. Looking at the chart, there were 4 predictions that were wrong. Using the FCP, the event space would be 4*3*2 = 24. The sample space would have to be all possible ways to choose three predictions. There are still 16 of them, so the FCP tells us that the sample space is 16*15*14 = 3360. So,

p(all three predictions are wrong) = 24/3360 ≈ 0.7%.

Example 3. Randomly select three predictions from this CT without replacement. What's the chance at least one of the three is correct?

Answer 3. Interesting... what does it mean for at least one of them to be correct?
Well, they could all be correct. That counts. Or two could be correct and one could be wrong, or two could be wrong and one could be right. So, the long way to do this would be to calculate

p(at least one correct) = p(all correct OR two correct and one wrong OR one correct and two wrong)

Since "or" means to add disjoint events,

p(at least one correct) = p(all correct) + p(two correct and one wrong) + p(one correct and two wrong)

Now, before you start to feel sick (too late? Sorry about that), let's be clever about this. 100% of the time, if we select three predictions without replacement, one of four things is going to happen:
a) they will all be right;
b) two will be right and one will be wrong;
c) one will be right and two will be wrong; or
d) they will all be wrong.

This means that if I add up all four probabilities associated with these four events, they sum to 100%:

p(all correct) + p(two correct and one wrong) + p(one correct and two wrong) + p(all wrong) = 100%

Remember, too, that we're looking for p(at least one is right), which is handily covered by the first three terms of that equation:

[p(all correct) + p(two correct and one wrong) + p(one correct and two wrong)] + p(all wrong) = 100%

where the bracketed sum is p(at least one is right).

So, to be clever, all we need to do is take p(all wrong), which we found in question 2, and subtract it from 100%, leaving us with p(at least one is right):

p(at least one is right) + p(all wrong) = 100%
p(at least one is right) + 0.7% = 100%
p(at least one is right) = 99.3%

Does that seem vaguely familiar? It should; it's a skill we learned back in the first project. The events "all wrong" and "at least one right" are complements, so their probabilities must sum to 100%.

******************************************

Let's look back at the racial steering data from the first project:

New Renters    Caucasian   African American
Section A         87              8
Section B         83             34

Example 4. Randomly select 4 people from this CT, without replacement. What's the chance all were Caucasian?

Answer 4. A bit tedious, but...
Size of event space: 170*169*168*167 = 806,048,880 (wowsers!)
Size of sample space: 212*211*210*209 = 1,963,287,480 (double wowsers!!)
p(all 4 Caucasian) = 806,048,880/1,963,287,480 ≈ 41%

Example 5. Randomly select 4 people from this CT, with replacement. What's the chance all were Caucasian?

Answer 5. Way less tedious...
Size of event space: 170^4 = 835,210,000
Size of sample space: 212^4 = 2,019,963,136
p(all 4 Caucasian) = 835,210,000/2,019,963,136 ≈ 41%

So, you can see why independent sampling is more attractive; it's way easier to calculate the probabilities, for the sample and event spaces are much less cumbersome. In addition, did you notice how the answers to questions 4 and 5 were basically identical? When certain conditions are met, we can use independent sampling even though it would be more correct to use dependent sampling (for example, Gallup Polls). We'll study this more when we get into the binomial distribution; for now, I'll just keep switching back and forth between dependent and independent for practice.
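If you'd like to check Examples 4 and 5 (or Example 1) without a calculator, here is a minimal sketch in Python. The function names are made up for this illustration; math.perm(n, k) is the standard-library call that computes n*(n-1)*...*(n-k+1), which is exactly the FCP product for sampling without replacement.

```python
from math import perm

def p_all_without_replacement(favorable, total, draws):
    """Chance that every draw is 'favorable' when items are NOT reused."""
    # perm(170, 4) is 170*169*168*167, and so on.
    return perm(favorable, draws) / perm(total, draws)

def p_all_with_replacement(favorable, total, draws):
    """Chance that every draw is 'favorable' when items CAN be reused."""
    return (favorable / total) ** draws

# Example 4 (dependent) versus Example 5 (independent): 170 of 212 renters.
print(round(p_all_without_replacement(170, 212, 4), 2))
print(round(p_all_with_replacement(170, 212, 4), 2))

# Example 1 both ways: 12 of 16 predictions were right.
print(p_all_without_replacement(12, 16, 2))
print(p_all_with_replacement(12, 16, 2))
```

With 212 people the two methods agree to the nearest percent, while with only 16 predictions they differ by about a percentage point. That fits the idea that the with-replacement shortcut works best when the sample is small compared to the population.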

Example 6. Randomly select 4 people from this CT without replacement. What's the chance at least one is African American?

Answer 6. Since the event "at least one is African American" is the complement of "all are Caucasian", we can find this probability as follows:

p(at least one is African American) = 100% - p(all are Caucasian) = 100% - 41% = 59%.

Example 7. Randomly select 4 people from this CT, with replacement. What's the chance at least one lives in Section B?

Answer 7. Ha! I tried to pull one over on you, but you won't be fooled. You'll tell me, "Sean, no biggie. The complement to 'at least one lives in Section B' is 'none live in Section B', so I'll start by finding that."

Size of event space: 95^4 = 81,450,625
Size of sample space: 212^4 = 2,019,963,136
p(none live in Section B) = 81,450,625/2,019,963,136 ≈ 4%

Now that we know that, we can use the complement idea:

p(at least one lives in Section B) = 100% - p(none live in Section B) = 100% - 4% = 96%

So there it is... and you know what else I noticed? Check this out:

81,450,625/2,019,963,136 = 95^4/212^4 = (95/212)^4

So, technically, I wouldn't even have to list out the event and sample spaces. Pretty rad, eh? You make me so very proud.

Part 2. Some Probabilities without CTs (AKA, an intro to probability distributions)

Although we'll build upon this idea in the next project (and with probability distributions) quite a bit, I'd like to plant its seed in this one. Consider the following problem I needed to address in our garden during the summer of 2011:

Example 8. I've noticed that, when I plant Big Max pumpkins from seed, we have about a 60% chance of them germinating (that is, about 60% of them germinate). If I plant 3 Big Max seeds, what's the chance that at least one of them germinates?

Answer 8. This might sound trivial, until you've seen (and/or tasted) one of these beauts. Also, since our son's name is Max, well... it helps when his namesake fruit actually fruits. So, what might seem at first trivial actually has quite a bit of value (for us, at least).

Some might say, "Sean! You have over a 100% chance that you'll get a germination! Since each seed is 60% likely to germinate, and you have 3 seeds, add the 60%'s together to get 180%. The chance of seed 1 OR seed 2 OR seed 3... 'or' means to add!"

Ah, if only it were that easy! We can't add the 60%'s together, for two reasons: 1) you can't have more than a 100% chance of anything happening, and 2) yes, "or" means to add, but only with disjoint events. Just because one seed germinates doesn't mean the others can't. No, we're going to have to use a skill we learned in this project to assist us here:

p(at least one seed germinates) = 100% - p(none of the three germinate)

Since the events "at least one" and "none" are complements, this equation must be true (for reference, see questions 3, 6, and 7 above). Now, all we have to do is figure out how to find p(none of the three germinate).

This is a challenge, since we have no sample space. All we know is that, for any seed, there's a 60% chance it will germinate. Hmm... let's look at that more closely. This 60% number that I reported is the result of testing hundreds of seeds over time. How many? Not sure. However, it doesn't matter, and here's why: assuming that the three seeds I'm about to plant are representative of the previous ones I tested (no reason to think they aren't), we can envision those three seeds as part of the larger sample space of hundreds I've already tested. So, in other words, imagine a large pile of pumpkin seeds, all of which are 60% likely to germinate. Select 3. Two things to note:

1) You'll be sampling without replacement, but, since the pile of seeds is large, we'll be treating it as sampling with replacement. Why?
First, the math is easier, and second, as you remember from questions 4 and 5 above, you get essentially the same answer.

2) We don't need the sample space, and here's why:

Suppose your         at 60% germination,      which means      so, p(one seed not
sample space is      this many germinate      this many won't  germinating) is
      100                    60                     40           40/100 = 0.4
      500                   300                    200           200/500 = 0.4
    1,000                   600                    400           400/1,000 = 0.4
    5,000                 3,000                  2,000           2,000/5,000 = 0.4
   10,000                 6,000                  4,000           4,000/10,000 = 0.4

Cool! Our sample space is irrelevant, so long as it's large (more on that in class). Now, back to the task at hand:

p(at least one seed germinates) = 100% - p(none of the three germinate)

OK, so we have three seeds. We're going to be sampling with replacement, which makes things easier. Let's suppose that we had 100 seeds in our original sample space. How likely would it be to draw 3 seeds that wouldn't germinate?

p(none of the three germinate) = (40*40*40)/(100*100*100) = 64,000/1,000,000 = 0.064

All right, what would happen if our three seeds were drawn from a sample space of 500?

p(none of the three germinate) = (200*200*200)/(500*500*500) = 8,000,000/125,000,000 = 0.064

Hmmm... how about a sample space of 10,000?

p(none of the three germinate) = (4,000*4,000*4,000)/(10,000*10,000*10,000) = 64,000,000,000/1,000,000,000,000 = 0.064

Hold on here! No matter what sample size we use, we always get the same probability! Why is that? Well, here's why. Suppose your sample size is n. Then

p(none of the three germinate) = (0.4n*0.4n*0.4n)/(n*n*n)
                               = (0.4n)^3 / n^3
                               = (0.4)^3 * n^3 / n^3
                               = (0.4)^3
                               = 0.064

Oh, that's just too cool. So here's our answer:

p(at least one seed germinates) = 100% - p(none of the three germinate) = 100% - 6.4% = 93.6%

Which means that if I plant 3 Big Max seeds, about 94% of the time (in the long run), I'll get at least one to germinate. Cool.
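The "sample size doesn't matter" argument can also be checked numerically. The sketch below is just an illustration in Python; the names FAIL_RATE and p_none_germinate are invented here, and exact fractions are used so that rounding can't hide a difference between pile sizes.

```python
from fractions import Fraction

# 40% of seeds do not germinate (the complement of the 60% germination rate).
FAIL_RATE = Fraction(4, 10)

def p_none_germinate(pile_size, seeds_planted):
    """Sampling WITH replacement from a pile in which 40% of seeds are duds."""
    duds = FAIL_RATE * pile_size              # e.g. 40 duds in a pile of 100
    event_space = duds ** seeds_planted       # FCP: duds * duds * duds
    sample_space = Fraction(pile_size) ** seeds_planted
    return event_space / sample_space

for pile_size in (100, 500, 1_000, 5_000, 10_000):
    p_none = p_none_germinate(pile_size, 3)
    print(pile_size, float(p_none), float(1 - p_none))
```

Every row should show the same pair of numbers, 0.064 and 0.936, because the n^3 cancels exactly as in the algebra above.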

Example 9. If I plant 4 Big Max seeds, what's the chance that at least one of them germinates?

Answer 9.
p(at least one seed germinates) = 1 - p(none of the four germinate) = 1 - (0.4)^4 = 1 - 0.0256 = 100% - 2.56% = 97.44%

Hopefully, this makes sense: as I plant more seeds, it becomes more likely that I'll get one to germinate.

(Note: you'll need this paragraph for question 2b) Using the Meaning of Probability from the previous project, this number (97.44%) means that, if I plant four seeds each year, then I can expect that, in about 97% of the years in which I plant them, at least one seed will germinate (unfortunately, I don't know which years, just that the rate of germination for 4-seed plantings is about 97%). Of course, you can speed the process up: suppose, one year, I plant 10,000 rows of 4 seeds each. In 9,744 of those rows, I can expect at least one seed to have germinated. That also means that I can expect 256 rows to have no germination at all. That might not be sufficient for a pumpkin farmer, which is why they plant WAY more than 4 seeds per row.

Example 10. If I plant 4 Big Max seeds, what's the chance that at least one of them doesn't germinate?

Answer 10. You might be tempted to think that's just the complement of the last question, but it's not (see why?). Let's carefully set it up like the last one:

p(at least one seed doesn't germinate) = 1 - p(all four germinate) = 1 - (0.6)^4 = 1 - 0.1296 = 100% - 12.96% ≈ 87%

Pretty good chance, therefore, that at least one of those seeds isn't going to germinate, but remember, there's about a 97% chance that at least one will. One last pumpkin question, so you can see the power of this idea.

Example 11. How many Big Max seeds should I plant in order to be 99.9% certain that at least one germinates?

Answer 11. Ooooo! This is interesting. Now, we have the probability, but don't have the seeds.
Let's look back to the basic form of what we need to use:

p(at least one seed germinates) = 1 - p(none of the seeds germinate)

Now, we know that we want the left-hand side of that to be at least 99.9%:

0.999 = 1 - p(none of the seeds germinate)

I'm gonna do a little moving around now:

p(none of the seeds germinate) = 1 - 0.999
p(none of the seeds germinate) = 0.001

Interesting... we need to find out how many seeds it will take until the likelihood of none of them germinating will be 0.001 (or, for the percentage inclined, 0.1%). One way to proceed is to make a table, and use what we've learned so far:

Number of seeds   p(none of the seeds germinate)
       1          0.4 = 40%             Remember, 40% of seeds don't germinate.
       2          (0.4)^2 = 16%
       3          (0.4)^3 = 6.4%        We found this in question 8...
       4          (0.4)^4 = 2.56%       ...and this in question 9.
       5          (0.4)^5 = 1.024%
       6          (0.4)^6 = 0.4096%
       7          (0.4)^7 ≈ 0.164%      Close! But still too big.
       8          (0.4)^8 ≈ 0.0655%     There it is.

So, with 8 seeds, there's only a 0.0655% chance of none germinating, which means there's at least a 99.9% chance of at least one germinating (there's actually a 99.9345% chance).

Tables like the last one will be helpful to you in this class (and beyond!). Here's a video showing you how to make it! https://www.youtube.com/watch?v=dggmptju2ic&feature=youtu.be (Of course, if you wanted to, you could also use logarithms.)

Well, there's your introduction to probability distributions. By the time you get to them, you'll be totally ready. And wait'll you see how useful they can be!
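For anyone who would rather build that table in code than in a spreadsheet, here is one possible sketch in Python. The function names and the target parameter are assumptions made for this illustration. The first function mimics the table by adding one seed at a time; the second uses the logarithm shortcut mentioned in passing above.

```python
from math import ceil, log

def min_seeds_by_table(target, fail_rate=0.4):
    """Add seeds one at a time until p(at least one germinates) >= target."""
    seeds = 1
    while 1 - fail_rate ** seeds < target:
        seeds += 1
    return seeds

def min_seeds_by_log(target, fail_rate=0.4):
    """Solve fail_rate**n <= 1 - target for n, then round up to a whole seed."""
    return ceil(log(1 - target) / log(fail_rate))

# Rebuild the table from Answer 11, one row per seed count.
for seeds in range(1, 9):
    print(seeds, f"{0.4 ** seeds:.4%}")

print(min_seeds_by_table(0.999), min_seeds_by_log(0.999))
```

Both approaches should land on the same seed count as the table for the 99.9% target. The table method is slower but shows every intermediate row, which is handy when you want to see how quickly the "none germinate" chance shrinks.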

Just like in the last project, express answers as either a decimal or a percent. Rounding will vary depending on the context. Also, show/explain how you arrived at your answers (w). Some answers are in red to help you if you get stuck.

1. Hepatitis C causes about 10,000 deaths each year in the United States, but often lies undetected for two years after infection. A study from the University of Texas Southwestern Medical Center wanted to see if there was some connection between Hepatitis C and tattoos (or, more specifically, where folks got their tattoos). 626 randomly selected subjects were classified two ways, according to their Hep C status and tattoo status. The data from this study is summarized as follows:

                             Hepatitis C   No Hepatitis C
Tattoo From Tattoo Parlor        17              35
Tattoo Gotten Elsewhere           8              53
No Tattoos                       22             491

Let's make sure we understand the difference between sampling without and with replacement!

1. (2 points) Suppose we randomly select 3 of the folks, without replacement, from this study, and we are interested in finding p(all three have tattoos). What would be the correct way to set this problem up?
i.
ii.

2. (2 points) Now, suppose we randomly select 5 of the folks, with replacement, from this study, and we're interested in finding p(at least one has Hep C). What would be the correct way to set this problem up?
i.
ii.

Now, sampling without replacement (as you have seen, and will continue to see) is awesomely powerful. We'll discuss it in depth in everything from PowerBall and Poker to purchasing laying chickens to Gallup polls to choosing insurance plans. However, as you learned in example problem 8 above, so long as you're selecting only a small sample from a population, you can treat any problem as a "with replacement" type. This next problem is a neat one I stumbled upon while having coffee with the head of IT at COCC!

3.
COCC's Information Technology (IT) department is in charge of managing all aspects of your online COCC world: your email messages, your Document folder, your Banner account, all of it. Now, you may have noticed that computer parts break. If, say, the hard drive on which your Document folder lives breaks, well, that's pretty bad. So, in order to protect your data, COCC backs up the Documents data with more than one hard drive. Their method is called RAID (Redundant Array of Independent Disks). The key word here is "independent"; each disk operates independently of the others, so if one fails, the others don't automatically fail, although, in theory, they all could fail. According to the article "Disk Failures in the Real World" (Schroeder, Gibson, 2007), it is estimated that around 3% of the hard drives in a RAID system will fail over a given year (that is, out of every 100 RAID disks, 3 will fail, and the other 97 will work over any given year). Now, remember, a single hard disk failure is hardly a problem; RAID systems are multiply redundant. Therefore, the only thing COCC has to concern itself with is making sure at least one of the disks in the RAID works at all times. (read on!)

a. (w) (4 points) Since COCC's RAID has 6 independent hard drives, find p(at least one of COCC's RAID disks works over any given year). This is the chance that the RAID system will protect all student data at COCC. Answer: essentially 100% (99.99999993%)

b. (w) (4 points) Find p(at least one of COCC's RAID disks fails over any given year). Answer: about 17%.

c. (4 points) What does that last probability mean? Use the 17% in your result, for sure. Answer 9 will help you if you're stuck. Remember: don't use the words "probability" or "chance" (nor anything synonymous with those) in your answer! #eschewcircularreasoning

4. Refer back to Example 11.

a. (1 point) Like I did in the video for Example 11 (what do you mean you didn't watch it?!?), create a T Table from 1 seed to 20 seeds. Please also make both columns, for p(No Seeds Germinating) and p(At Least One Seed Germinating). Take a screen shot of your spreadsheet and include it! In case you need help taking easy screen shots in Windows: https://www.youtube.com/watch?v=guj1zhehseg

What is the least number of pumpkin seeds I should plant if I want to be
b. (2 points) 99.999% sure that at least one germinates?
c. (2 points) 99.99999% sure that at least one germinates?

Now, you might not care one bit about pumpkin seeds, but I'll bet you care about redundancy. I mean, that's why we have two kidneys, right? The whole point of redundancy is one that's very intuitive, and now you have a way to measure it! Cool!

5. (4 points) We'll finish up with a super applicable one that I heard after class one day: a student thanked me for letting her work with two of her friends on her assessments. I said, "You're more than welcome." And then I asked her, "What made you bring that up?" And she said, "Well, each of us remembers stuff like, 80% of the time, but it seems that we all forget random things, kinda. So, with all of us working together, we've got a good chance of at least one of us knowing what's going on for any problem."
I then asked her, "So, what's that chance?" To which she correctly gave me the answer. So you tell me: what's the chance of at least one of them knowing what's going on for any given problem?