Math HL Chapter 11 Statistics
Name:

Self-check your own progress by rating where you are.

Which describes you best?
Rating scale: I have NO idea! / I know some of this! / I know most of this! / I've got this!

Lesson 5.1 Learning Targets:
- I can identify and determine a population, sample and random sample.
- I can make and use a frequency distribution.
- I can identify and use discrete and continuous data.
- I can determine and use mid-interval values and interval width.
- I can identify upper and lower interval boundaries.
- I can find and interpret the mean and median of a set of data.
- I can find and interpret the variance and standard deviation of a set of data.

Graphs and Visual Representations:
- I can create various ways of displaying data (stem-and-leaf plots, histograms, box-and-whisker plots, circle graphs, frequency distributions and cumulative frequency distributions).
A population is any ______

A sample is ______

A variable is a ______

There are several ways of summarizing and describing data. There are two major classifications of data: ______ or ______ data.

Numerical (Quantitative) Data:
There are two types of numerical data:
- Discrete:
- Continuous:

Categorical (Qualitative) Data:

Pie Charts are used as a way of summarizing data or displaying the different values of a given ______ (______ distribution). A pie chart is a circle divided into sectors, each of which represents a particular ______. The ______ of each sector is the same ______ of the circle as the category is of the total data.
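The pie chart idea can be sketched in a few lines of Python. The category names and counts below are invented purely for illustration; the only fact relied on is that a full circle is 360 degrees, so each sector takes the same share of 360 degrees that its category takes of the total data.

```python
# Hypothetical survey counts, invented purely for illustration.
counts = {"Bus": 12, "Car": 18, "Bike": 6, "Walk": 4}


def sector_angles(category_counts):
    """Return the central angle (in degrees) of each pie-chart sector."""
    total = sum(category_counts.values())
    return {name: 360 * count / total for name, count in category_counts.items()}


angles = sector_angles(counts)
total_count = sum(counts.values())
for name, angle in angles.items():
    share = counts[name] / total_count
    print(f"{name}: {share:.0%} of the data, sector angle {angle:.0f} degrees")
```

Whatever the counts are, the sector angles should add up to 360 degrees, which is a quick way to check a hand-drawn chart.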
Bar Graphs:

Stem-and-Leaf Display Data:
Frequency Distribution:

Cumulative Frequency Distribution:

11.2: Measures of Central Tendency
Mean, median & mode

Statistic and parameter: A statistic is a descriptive measure computed from a sample of data. A parameter is a descriptive measure computed from an entire population of data.

The Mean and the Median: The most common measure of central tendency is the arithmetic mean, usually referred to simply as the mean or average. The median is the middle value of the ordered data.

Ex 5: The following are the five closing prices of the NASDAQ Index for the first business week in November 2007. This is a sample of size n = 5 for the closing prices from the entire 2007 population:

2794.83; 2810.38; 2785.18; 2825.18; 2748.76

What is the average closing price?
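One way to check a hand calculation for Ex 5 is the short Python sketch below. It uses only the five closing prices listed in the example; the variable names are illustrative choices, and the median is included only for comparison with the mean.

```python
from statistics import median

# The five NASDAQ closing prices given in Ex 5.
closing_prices = [2794.83, 2810.38, 2785.18, 2825.18, 2748.76]

n = len(closing_prices)
sample_mean = sum(closing_prices) / n      # add the values, divide by n
sample_median = median(closing_prices)     # middle value once the data is ordered

print(f"n = {n}")
print(f"mean   = {sample_mean:.3f}")
print(f"median = {sample_median:.2f}")
```

Because the value computed here comes from a sample of the 2007 closing prices, it is a statistic, not a parameter.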
Ex 6: Here is a table listing the frequency distribution of 25 families in Lower Austria that were polled in a marketing survey to list the number of liters of milk consumed during a particular week.

And this is the frequency histogram of the data.

For lists, the mode is the most common (most frequent) value. A list can have more than one mode.

For histograms, the mode is a relative maximum.
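A small sketch of finding the mode (or modes) of a list follows. The data is hypothetical and is not the Ex 6 table; it was chosen so that two values tie for most frequent, which shows how a list can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in ascending order."""
    frequency = Counter(values)
    highest = max(frequency.values())
    return sorted(value for value, count in frequency.items() if count == highest)


# Hypothetical data, invented for illustration only.
liters = [2, 3, 3, 4, 4, 5]
print(modes(liters))
```

Here both 3 and 4 occur twice, so the list is bimodal.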
Shape of the Distribution:

Symmetry:

Skewness:

Positively skewed:

Negatively skewed:

11.3: Measures of Variability
Range, Interquartile Range, Variance, and Standard Deviation

Range is a single number: Max - min = range

Variance:

Sample Variance: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Population Variance: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

What is the difference between the two equations?

Standard Deviation: Standard deviation is ______
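The two variance formulas can be compared side by side with a minimal sketch. The data set is hypothetical, and the function names are illustrative, not taken from the textbook.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def population_variance(data):
    """Sum of squared deviations from the population mean, divided by N."""
    N = len(data)
    mu = sum(data) / N
    return sum((x - mu) ** 2 for x in data) / N


# Hypothetical data, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

s_squared = sample_variance(scores)
sigma_squared = population_variance(scores)

print(f"sample variance     = {s_squared:.4f}, s     = {math.sqrt(s_squared):.4f}")
print(f"population variance = {sigma_squared:.4f}, sigma = {math.sqrt(sigma_squared):.4f}")
```

The sum of squared deviations is the same in both functions; only the divisor changes, so the sample variance always comes out slightly larger.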
Standard Deviation: $s = \sqrt{s^2}$ or $\sigma = \sqrt{\sigma^2}$

Ex - Practice: Find the variance and standard deviation of the data below. Follow the steps from the example in the text on page 489.

Quiz 1: 28  16  27  30  22  25  18  23  26

Interquartile Range and Measures of Non-central Tendency:

Interquartile Range: $Q_3 - Q_1 = \text{IQR}$

Order the data, then locate the quartiles by position: $Q_1$ is at position $\dfrac{n+1}{4}$ and $Q_3$ is at position $\dfrac{3(n+1)}{4}$

Median:

Five-Number Summary: minimum, first quartile, median, third quartile, maximum

Box-and-Whisker Plot: Created from the Five-Number Summary.
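The practice data can be checked with the sketch below. It assumes that "Quiz 1" is the row label and that the nine numbers after it are the scores, and it treats the scores as a sample. When a quartile position is not a whole number, the helper interpolates between the two neighboring values; that convention is an assumption, and the textbook's rule may differ slightly.

```python
import math
from statistics import stdev, variance


def value_at_position(ordered, position):
    """Value at a 1-based position in ordered data, interpolating between neighbors."""
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


# Assumed reading of the practice table: nine Quiz 1 scores.
quiz_scores = [28, 16, 27, 30, 22, 25, 18, 23, 26]

sample_var = variance(quiz_scores)   # divides by n - 1
sample_sd = stdev(quiz_scores)       # square root of the sample variance

ordered = sorted(quiz_scores)
n = len(ordered)
q1 = value_at_position(ordered, (n + 1) / 4)
med = value_at_position(ordered, (n + 1) / 2)
q3 = value_at_position(ordered, 3 * (n + 1) / 4)
iqr = q3 - q1
five_number_summary = (ordered[0], q1, med, q3, ordered[-1])

print(f"variance = {sample_var:.4f}, standard deviation = {sample_sd:.4f}")
print(f"five-number summary = {five_number_summary}, IQR = {iqr}")
```

The five-number summary printed at the end is exactly what is needed to draw the box-and-whisker plot.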
In your own words, what is an outlier?

- Lower Fence: $Q_1 - 1.5(\text{IQR})$
- Upper Fence: $Q_3 + 1.5(\text{IQR})$

Any point beyond the lower or upper fence is considered to be an outlier.

Ex 7: Speed limits in some European cities are set to 50 km/h. Drivers in various cities react to such limits differently. The results of a survey comparing drivers' behavior in Brussels, Vienna and Stockholm are given in the table below. Read through the solution and make sure you understand.

Shape, Center & Spread:

Empirical Rule: If the data is close to being symmetrical (bell-shaped), the following is true:
- The interval $\mu \pm \sigma$ contains approximately 68% of the data
- The interval $\mu \pm 2\sigma$ contains approximately 95% of the data
- The interval $\mu \pm 3\sigma$ contains approximately 99.7% of the data

Ex 8: The records of a large high school show the heights of their students for the year 2006.

A) Which statistics would best represent the data here? Why?
B) Calculate the mean and standard deviation.

Height (cm) $x_i$ | Number of Students $f(x_i)$ | $x_i \cdot f(x_i)$ | $(x_i - \bar{x})^2$ | $(x_i - \bar{x})^2 \cdot f(x_i)$
170 | 15 | 2550 | 51.84 | 777.6
171 | 60 | | |
172 | 90 | | |
173 | 70 | | |
174 | 50 | | |
175 | 200 | | |
176 | 180 | | |
177 | 70 | | |
178 | 120 | | |
179 | 50 | | |
180 | 110 | | |
181 | 80 | | |
182 | 90 | | |
183 | 40 | | |
184 | 20 | | |
185 | 40 | | |
186 | 10 | | |
194 | 2 | | |
196 | 3 | | |
Totals: | | | |

C) Develop a cumulative frequency graph of the data.

D) Use your result in part C to estimate the median, $Q_1$, $Q_3$, and the IQR.

E) Are there any outliers in the data? Why or why not?

F) Write a few sentences to describe the situation.
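After working Ex 8 by hand, the results can be checked with the sketch below. It uses the frequency table exactly as given and makes two assumptions: the standard deviation uses the population formula (divide by N), and the quartiles are read directly from the running totals at positions (n + 1)/4, (n + 1)/2 and 3(n + 1)/4, so values estimated from a hand-drawn cumulative frequency graph may differ slightly. The function and variable names are illustrative.

```python
import math

# Frequency table from Ex 8: height in cm -> number of students.
height_counts = {
    170: 15, 171: 60, 172: 90, 173: 70, 174: 50, 175: 200, 176: 180,
    177: 70, 178: 120, 179: 50, 180: 110, 181: 80, 182: 90, 183: 40,
    184: 20, 185: 40, 186: 10, 194: 2, 196: 3,
}

total = sum(height_counts.values())
mean = sum(x * f for x, f in height_counts.items()) / total
# Assumption: population formula, dividing by N.
variance = sum(f * (x - mean) ** 2 for x, f in height_counts.items()) / total
std_dev = math.sqrt(variance)


def value_at_rank(counts, rank):
    """Height of the student at a 1-based rank, found from the cumulative frequencies."""
    running = 0
    for height in sorted(counts):
        running += counts[height]
        if running >= rank:
            return height
    raise ValueError("rank exceeds the number of observations")


def value_at_position(counts, position):
    """Interpolate between the two ranks on either side of a fractional position."""
    lower = math.floor(position)
    low_value = value_at_rank(counts, lower)
    high_value = value_at_rank(counts, math.ceil(position))
    return low_value + (position - lower) * (high_value - low_value)


q1 = value_at_position(height_counts, (total + 1) / 4)
med = value_at_position(height_counts, (total + 1) / 2)
q3 = value_at_position(height_counts, 3 * (total + 1) / 4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [h for h in sorted(height_counts) if h < lower_fence or h > upper_fence]

print(f"N = {total}, mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
print(f"Q1 = {q1}, median = {med}, Q3 = {q3}, IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence}), outlier heights = {outliers}")
```

The mean agrees with the value of about 177.2 implied by the first completed row of the table, where $(170 - \bar{x})^2 = 51.84$.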