CSI 23 LECTURE NOTES (Ojakian) Topic 1: Overview and Fundamental Background


OUTLINE (References: 1.1, 1.2, 1.3, 3.1)
1. Introduction to Statistics and Excel
2. Fundamental Terminology
3. Mean, Median, Mode and other Data Summaries
4. Random Samples
5. Topics with brief introduction: Probability, Estimation, Correlation

1. Simple Introduction to Statistics
  (a) Goal: Understand some characteristic about a population.
    i. Example: Consider the population = all NYC residents. Want to understand issues related to voting in the election for president in November 2016. (By the way, 78.59% voted for Clinton and 18.4% voted for Trump. Source: https://www.dnainfo.com/new-york/numbers/clinton-trump-president-vice-presidentevery-neighborhood-map-election-results-voting-general-primary-nyc)
    ii. Example: Consider the population = all NYC residents. Want to understand how old we are.
    iii. Example: Consider the population = all the S.U.V.s in the USA. Want to understand how safe they are.
  (b) Approach of Statistics: Focus on some parameters of the population and use a sample.
    i. Example: For the population = all NYC residents, to understand how old we are:
      A. Focus on some parameters such as: average age, percent of the population that is older than 65, etc.
      B. Select a sample to study.

2. Some Mathematical Terminology
  (a) Set:
  (b) List:
  (c) Integers:
  (d) Real Numbers:
  (e) Function:
  (f) Applying a function to every element of a list to get a new list:

  (g) EXERCISES
    PROBLEM 1. Suppose F(x) is the function which maps an integer x to 0 if it is odd and 1 if it is even. Evaluate F(100003) and F(77777774). Apply F to the list (3, 4, 4, 1, 3).
    *PROBLEM* 2. Suppose G(x) is the function which maps any real number x to the nearest integer (rounding up if x is exactly between two integers). Evaluate G(23.78) and G(100.12). Apply G to the list (0.77, 4.1, 50, 7.5).

3. Some Statistics Terminology
  (a) Population:
  (b) Individuals:
  (c) Variable (in the Statistics Sense!): Function from the population to the real numbers.
  (d)
    i. Example: Population = NYC residents.
      A. One variable is the function that maps a person to that person's height.
      B. Another variable is the function that maps a person to 0 if the person voted for Trump, 1 if the person voted for Clinton, and 2 if the person did something else.
    ii. Remark: It is a reduction of information.
  PROBLEM 3. Textbook (1.1) - 8 (6 in 5th Edition)
  *PROBLEM* 4. On a small sheet of paper, write down an example of one population and two different variables.
  *PROBLEM* 5. Write a variable or two that could apply to our class (such as "height"). Write down one you think would be interesting to understand; I will create a survey based on your responses!
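The ideas of applying a function to every element of a list (Section 2) and of a variable as a function on a population (Section 3) can be sketched in Python. Python is used here only for illustration; the course itself uses Excel. The function `square`, the list `data`, and the tiny `population` with its names and heights are all hypothetical and do not come from the notes or the textbook.

```python
# Applying a function to every element of a list to get a new list.
def square(x):
    """Hypothetical example function: maps a number x to x * x."""
    return x * x

data = [3, -1, 0, 5]                     # hypothetical list
squared = [square(x) for x in data]      # apply the function to each element
print(squared)                           # [9, 1, 0, 25]

# A statistical variable is a function from the population to the real numbers.
# Hypothetical population of three individuals with made-up heights in inches.
population = {"Ana": 64.0, "Ben": 70.5, "Cy": 68.0}

def height(person):
    """Variable: maps an individual to that individual's height."""
    return population[person]

# Applying the variable to every individual gives a list of numbers (the data).
heights = [height(person) for person in population]
print(heights)                           # [64.0, 70.5, 68.0]
```

The list `heights` keeps only one number per person, which is the "reduction of information" mentioned in the remark.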

4. Common Data Summaries: Mean, Median, Mode
  (a) Mean
  (b) Median
  (c) Mode
  (d) Remark: Sometimes "average" refers to mean and sometimes "average" refers to any of mean, median, or mode.
  (e) EXERCISES
    PROBLEM 6. Consider the data: 1, 4, 0, -2, 1.
      i. What is the mean? What is the median?
      ii. If the largest number is increased, how does this affect the mean and median?
      iii. If the smallest number is increased so that it is now the largest number, how does this affect the mean and median?
    PROBLEM 7. Textbook (3.1): 13 (5 in 5th edition)
    *PROBLEM* 8. The net worth of someone is the amount of money that person would have if they sold everything they have and then subtracted their debt. Based on the Net Worth handout, answer the following questions (round numbers to nearest 1000):
      i. Suppose there is a bar with two typical 40 year olds and three typical 50 year olds. What is the mean net worth and what is the median net worth in the bar?
      ii. Suppose there is a bar with 99 typical 40 year olds. What is the mean net worth and what is the median net worth in the bar? Now pick your favorite super rich man from the top eight; suppose he walks into the bar. Now what is the mean net worth and what is the median net worth in the bar?

5. Excel Introduction
  (a) Putting something in a box: Text, Number, or Function
  (b) Some functions:
    i. For mean use: average
    ii. For median use: median
    iii. For mode use: mode
    iv. For summing use: sum
  (c) Different worksheets.
  (d) Please!... Organize your work clearly.
  (e) Exercises
    PROBLEM 9. Go to Cengage Data (at webpage), download Heights of Pro Basketball Players from the n 30 data. Find the median. Find the mean in two ways: 1) using the average function and 2) using the sum function, but not the average function.
    *PROBLEM* 10. Go to Cengage Data (at webpage), download a data set of your choice from the n 30 data. Find the median. Find the mean in two ways: 1) using the average function and 2) using the sum function, but not the average function.
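For readers who want to check spreadsheet results outside Excel, the summaries from Sections 4 and 5 can be sketched in Python with the standard `statistics` module. The list `data` below is hypothetical and chosen only for illustration. The mean is computed in two ways, mirroring the "average" versus "sum" approaches in the Excel exercises.

```python
import statistics

# Hypothetical data, chosen only for illustration (not from the textbook).
data = [2, 7, 3, 7, 11]

mean_builtin = statistics.mean(data)     # plays the role of Excel's average
mean_by_sum = sum(data) / len(data)      # sum divided by the number of values
median = statistics.median(data)         # plays the role of Excel's median
mode = statistics.mode(data)             # plays the role of Excel's mode

print(mean_builtin, mean_by_sum, median, mode)

# Adding one very large value changes the mean a lot,
# while the median of this particular list does not change at all.
with_outlier = data + [1000]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)
print(mean_with_outlier, median_with_outlier)
```

The second half is a small version of the "rich man walks into the bar" situation: a single extreme value can pull the mean far away from the typical value while the median stays put.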

6. Data Summaries in General
  (a) Functions whose inputs are lists (Examples: mean, median, mode)
  (b) Other Data Summaries
    i. Maximum and Minimum
    ii. Range
    iii. Percentages
  (c) EXERCISES
    PROBLEM 11. Consider the data summary X that maps a list of integers to the percent of negative numbers in the list. Evaluate X(4, 0, 4, 1, 4) and X(1, 2, 3, 4, 5).
    *PROBLEM* 12. On a small sheet of paper, write down an example of another data summary (remember: its input should be a list of numbers and its output should be a single number).

7. Fundamental Idea of Statistics
  To understand a population, understand a sample of the population.
  (a) Population versus Sample. Example: All NYC residents versus this class.
  (b) Population Parameter versus Sample Statistic
    *PROBLEM* 13. I have 35 sheets of paper (each numbered 1-10). To guess the population mean, population median, and percent of 1's, choose a sample of size 5 and find the sample mean, sample median, and sample percentage. (Do with 3 different volunteers and save info on the board.)
    *PROBLEM* 14. Suppose we want to know 1) the average age of a CUNY student and 2) the percent of students 25 years and older. Let's use our class to guess.
      i. What is the population?
      ii. What is the variable?
      iii. What are the population parameters?
      iv. What is the sample?
      v. What are the sample statistics?
      vi. Calculate the sample mean and sample percent (need class data!).
      vii. How well do you think our sample statistics approximate the population parameters? What factors support accepting our approximations and what factors support rejecting our approximations?
  (c) Terminology
    i. Population mean: µ (pronounced "mew")
    ii. Sample mean: x̄ (pronounced "x bar")
    PROBLEM 15. Use the names µ and x̄ on the previous problems.
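The distinction between a population parameter and a sample statistic in Sections 6 and 7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the 35 "sheets" are generated at random by the code (they are not the actual classroom sheets), and the helper names `percent_of_ones` and `value_range` are hypothetical. Each helper is a data summary in the sense of Section 6: its input is a list of numbers and its output is a single number.

```python
import random
import statistics

def percent_of_ones(values):
    """Data summary: maps a list of numbers to the percent of entries equal to 1."""
    return 100 * sum(1 for v in values if v == 1) / len(values)

def value_range(values):
    """Data summary: maximum minus minimum."""
    return max(values) - min(values)

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 35 sheets, each carrying a number from 1 to 10.
population = [random.randint(1, 10) for _ in range(35)]

# Population parameters (in practice these are usually unknown).
mu = statistics.mean(population)
population_median = statistics.median(population)
population_percent = percent_of_ones(population)

# Sample statistics computed from a random sample of size 5.
sample = random.sample(population, 5)
x_bar = statistics.mean(sample)
sample_median = statistics.median(sample)
sample_percent = percent_of_ones(sample)

print("parameters:", mu, population_median, population_percent)
print("statistics:", x_bar, sample_median, sample_percent)
```

Changing the seed plays the role of asking a different volunteer: the parameters of a given population stay fixed, while the statistics change from sample to sample.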

8. Probability (Details: ch. 5, 6)
  (a) Probability of an event: A measure of how likely the event is, expressed as a number between 0 and 1.
  (b) Examples: Coins, Dice, Polls. Probability that Clinton would win the 2016 presidential election: 70% or 99% depending on who you asked...
  (c) Typical assumption: Equally Likely Outcomes
      Probability = Favorable / Total
  (d) EXERCISES
    PROBLEM 16. Suppose you roll a 6-sided die (with numbers 1 through 6).
      i. What is the probability of rolling a 2?
      ii. What is the probability of rolling a number larger than 2?
    *PROBLEM* 17. Suppose you choose one card from a standard deck of playing cards: 52 cards in total, with 4 suits: 13 red hearts, 13 red diamonds, 13 black clubs, and 13 black spades; in each suit there are the cards 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King.
      i. What is the probability of picking the Queen of Spades?
      ii. What is the probability of picking a 1 (i.e. an Ace)?
      iii. What is the probability of picking a diamond?
      iv. What is the probability of picking a red Jack?

9. Random Samples
  (a) Random Sample:
  (b) Random sample using Excel: randbetween
  (c) EXERCISES
    PROBLEM 18. Textbook (1.2) - 9 (5 in 5th Edition)
    *PROBLEM* 19. Which of the following ways of getting a sample from a population are random? If not completely random, how close to random does it seem and how could you correct the sample to make it random?
      i. Population = All US residents. Sample = Randomly call 100 people.
      ii. Population = All US residents that own a phone. Sample = Randomly call 100 people.
      iii. Population = All subway riders. Sample = Randomly select 100 people entering the Burnside Avenue Subway.
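Two ideas from Sections 8 and 9 can be sketched in Python: counting equally likely outcomes, and drawing a random sample. Everything specific below is an assumption made for illustration: the two-coin event, the population of 100 numbered individuals, and the helper name `probability`. The call `random.randint(1, 100)` returns a random integer from 1 to 100 inclusive, which is the same job Excel's randbetween does.

```python
import random
from fractions import Fraction

def probability(favorable, total):
    """Equally likely outcomes: Probability = Favorable / Total."""
    return Fraction(favorable, total)

# Hypothetical event: flip two fair coins and get two heads.
outcomes = [(a, b) for a in "HT" for b in "HT"]        # 4 equally likely outcomes
favorable = [o for o in outcomes if o == ("H", "H")]   # only (H, H) counts
p_two_heads = probability(len(favorable), len(outcomes))
print(p_two_heads)                                     # 1/4

# Random sample: every individual has the same chance of being chosen.
random.seed(1)  # fixed seed so the run is repeatable
population_ids = list(range(1, 101))     # hypothetical individuals numbered 1..100
one_pick = random.randint(1, 100)        # one random ID, like randbetween(1, 100)
sample_ids = random.sample(population_ids, 10)   # 10 distinct individuals
print(one_pick, sample_ids)
```

One design point: repeated calls to `random.randint` (or to randbetween) can return the same number twice, whereas `random.sample` never picks the same individual more than once.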

10. Estimation (Details: ch. 8)
  (a) Confidence interval
    i. Have some population and an unknown population parameter Q.
    ii. Choose a "confidence level": a percent, P%, between 0% and 100% (i.e. a probability between 0 and 1).
    iii. From a random sample obtain a P% confidence interval (a, b) for Q.
    iv. The probability you pick an interval (a, b) that contains Q is P%.
    v. Subtlety: The parameter Q is either in or not in (a, b). Having a confidence interval with confidence level P% means that the process that yields (a, b) has a P% chance of producing an interval containing Q.
  (b) Example: Suppose the newspaper tells you that (22.1, 25.8) is a 90% confidence interval for the average age of a college student. This means that you can be 90% confident that the average age of a college student is between 22.1 and 25.8. More subtlety: The process that yielded (22.1, 25.8) had a 90% chance of yielding an interval that contains the actual average age of a college student.
  (c) Example
    i. Population = the earlier papers with numbers 1-10. Parameter is µ.
    ii. Take confidence level 95%.
    iii. Take a random sample. For now, use Excel to obtain the confidence interval.
  (d) Using Excel to get a confidence interval.
    i. Data → Data Analysis → Descriptive Statistics → Confidence Level for Mean
    ii. Add and subtract the Confidence Level from the Sample Mean to obtain the Confidence Interval.
  PROBLEM 20. Find a confidence interval for the various samples in the earlier attempt to guess the average of the numbers on the papers.
  PROBLEM 21. Using our earlier work, find a confidence interval for the average age of a CUNY student using our class as the sample.

11. Correlation (Details: ch. 4)
  PROBLEM 22. Textbook: 4.1-7
  (a) Important principle: Correlation does not imply Causation.
  (b) Lurking variable:
  (c) EXERCISE
    *PROBLEM* 23. Textbook: 4.1-8
  (d) Moral: Be careful about declaring a cause for some phenomenon!
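The "subtlety" in Section 10, that the confidence level describes the process and not any one interval, can be checked with a small simulation. The sketch below is an illustration under stated assumptions and not the method Excel uses: the population of 10,000 numbers is hypothetical, the helper `confidence_interval` is a made-up name, and the margin is computed with a normal approximation (sample mean plus or minus z times s divided by the square root of n), so its numbers may differ slightly from what Excel reports.

```python
import math
import random
import statistics

def confidence_interval(sample, level=0.95):
    """Approximate interval for the population mean: x_bar +/- z * s / sqrt(n).

    Assumption: normal approximation; a t-based margin would be a bit wider
    for small samples.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)                         # sample standard deviation
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2) # about 1.96 for 95%
    margin = z * s / math.sqrt(n)
    return (x_bar - margin, x_bar + margin)

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values, each a number from 1 to 10.
population = [random.randint(1, 10) for _ in range(10_000)]
mu = statistics.mean(population)     # the parameter Q, known here only because we built the population

# Repeat the sampling process many times and count how often the interval contains mu.
trials = 2000
hits = 0
for _ in range(trials):
    a, b = confidence_interval(random.sample(population, 30))
    if a <= mu <= b:
        hits += 1

coverage = hits / trials
print(coverage)   # fraction of intervals that contained mu
```

Each individual interval either contains µ or it does not; the printed fraction describes the whole process and is expected to land near the chosen confidence level.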