Statistics and Risk Management
Software Proficiency

Video URL: jukebox.esc13.net/untdeveloper/videos/software%20proficiency.mov

Vocabulary List:

Measure of Central Tendency: a single value that describes the center of a data set, such as the mean, median, or mode.

Causality: the relationship between cause and effect.

Spreadsheet: a computer application used to calculate tabulated data.

Commercial Software: proprietary software products available for a cost.

Freeware: software available for no cost and with no limitations on use.

Shareware: software available on a trial basis or with limited functionality, with the option to pay for a full version.

Contingency Table: a cross-tabulation of the frequencies of two variables assigned to rows and columns.

Histogram: a type of graph in which the widths of rectangular bars represent the limits of a class interval, or range of related values, and the heights of the rectangular bars represent the frequency of each class.
Resources:

The Importance of Business Statistics
In this article by Six Sigma, the importance of business statistics is broken down for the non-statistician. The article touches on the need for sampling and other statistical gathering strategies in the business world.
http://www.sixsigmaonline.org/six-sigma-training-certificationinformation/the-importance-of-business-statistics.html

Statistical Thinking in a Technological Environment
Through their research, Dani Ben-Zvi and Alex Friedlander have developed a method for teaching statistical thinking in a technological environment. Developed mainly for seventh through ninth grades, their method includes activities, research projects, and the thinking processes the authors use.
http://www.dartmouth.edu/~chance/teaching_aids/iase/4.ben-zvi.pdf

Spreadsheet: Statistical Functions
This site defines each of the various Microsoft Excel functions available to use in statistical calculations. The site also provides some information and examples on how to select cells and ranges for calculations.
http://www.medcalc.org/manual/statistical_functions.php
Software Proficiency Practice Test
Name:

Use the Internet to locate answers; compare with the key.

List the five types of information required in order to analyze data:
1.
2.
3.
4.
5.

List the nine statistical tests which can be completed with this software.
6.
7.
8.
9.
10.
11.
12.
13.
14.
Software Proficiency Practice Test

LOG INTO SPREADSHEET SOFTWARE

You will first need to load the Analysis ToolPak:
File, Options, Add-ins
In the Manage box, select Excel Add-ins, then click Go
In the Add-Ins available box, select the Analysis ToolPak (make sure to check the box), then click OK
The ToolPak add-in will be located under DATA, Data Analysis

Check which options are available in your ToolPak:
1. ANOVA
2. Histogram
3. Chi-Square
4. T-Test
5. Z-Test
6. Sampling

Now, look under the functions available. Go to Functions, More Functions, Statistical functions. Identify which options are available under functions, answering yes or no:
7. Chi-Square
8. Confidence
9. ANOVA
10. Z Test
11. T Test
Software Proficiency Practice Test KEY

List the five types of information required in order to analyze data:
1. Data Library
2. Dataset
3. Dependent Variable
4. Predictor Variable
5. Grouping Variable

List the nine statistical tests which can be completed with this software.
6. Descriptive
7. Histogram
8. Chi-Square Test
9. Box Plot
10. Stem and Leaf
11. Normal Quantile Plot
12. T-test/Confidence Interval
13. ANOVA
14. Correlation/Regression

Check which options are available in your ToolPak:
15. ANOVA: yes
16. Histogram: yes
17. Chi-Square: no
18. T-Test: yes
19. Z-Test: yes
20. Sampling: yes

Now, look under the functions available. Go to Functions, More Functions, Statistical functions. Identify which options are available under functions, answering yes or no:
21. Chi-Square: yes
22. Confidence: yes
23. ANOVA: no
24. Z Test: yes
25. T Test: yes
Student Assignment 8.1a Software Proficiency
Name:

Your class just took an exam worth 100 points. There are 30 students in your class. The scores were as follows.

Student  Score     Student  Score
   1      73         16      79
   2      78         17      85
   3      89         18      88
   4      78         19      87
   5      81         20      84
   6      85         21      87
   7      91         22      86
   8      97         23      79
   9      92         24      75
  10      84         25      86
  11      86         26      81
  12      84         27      83
  13      79         28      85
  14      98         29      78
  15      92         30      90

Using Excel, create a new worksheet for this week's assignment. First enter the case numbers 1-30 in Column A and the test scores in Column B. Next create a histogram type graph in the upper right of the worksheet. Save your work. You will continue to add to it all this week.
Student Assignment 8.1b Software Proficiency
Name:

Next, using the Data Analysis Add-In, create a histogram type graph in the upper right of the worksheet.

Raw data often needs to be grouped into bins before a histogram can be created. For example, the exam scores from Assignment 8.1a can be divided into bins by grade range: 6 scores between 90 and 100, 16 scores between 80 and 89, and so forth. If you leave the Bin Range blank, Excel's Histogram tool divides the raw data into evenly spaced bins between the lowest and highest values. Play with this function and learn to use it well.

Look in the Analysis section of the Data tab. Click "Data Analysis" and highlight the "Histogram" tool in the Analysis Tools box. Click "OK." You will need to specify an Input Range and an Output Range. The Bin Range can be left blank. Select "Chart Output" in the output options section to generate a histogram graph. Click "OK."

If you end up with a histogram that looks roughly like a bell curve, you have done well. The bins will be populated with what Excel thinks are good bins. Note that Excel is only counting how many scores fall into each bin; it is not curving the grades.

Again using the Data Analysis Add-In, create a pie chart type graph in the lower right of the worksheet. Hint: Use the Histogram tool again and then change it to a pie chart. Right-click on the completed histogram to change the chart type to pie.

Save your spreadsheet work. You will continue to add to it all week.
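The binning step can also be sketched outside Excel. The following is a minimal Python illustration, not part of the Excel assignment. It assumes three grade-range bins (70-79, 80-89, 90-100), which is an illustrative choice and not the set of bins Excel produces when the Bin Range is left blank. The function name count_by_bin is hypothetical.

```python
# Exam scores for students 1-30, copied from Assignment 8.1a.
scores = [73, 78, 89, 78, 81, 85, 91, 97, 92, 84, 86, 84, 79, 98, 92,
          79, 85, 88, 87, 84, 87, 86, 79, 75, 86, 81, 83, 85, 78, 90]

# Assumed bins as (lowest, highest) inclusive limits. These are chosen for
# illustration only; Excel's automatic bins will differ.
bins = [(70, 79), (80, 89), (90, 100)]


def count_by_bin(values, bin_limits):
    """Return how many values fall inside each (low, high) bin."""
    return [sum(1 for v in values if low <= v <= high)
            for low, high in bin_limits]


counts = count_by_bin(scores, bins)

# Print a simple text histogram, one row per bin.
for (low, high), n in zip(bins, counts):
    print(f"{low:3d}-{high:3d} | {'#' * n} ({n})")
```

Each row of the printout plays the role of one bar in the Excel chart: the label is the class interval and the number of # marks is the frequency.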
Student Assignment 8.1c Software Proficiency
Name:

Below are the grades on a four-point scale.

Student  Grade     Student  Grade
   1       2         16       2
   2       2         17       3
   3       3         18       3
   4       2         19       3
   5       3         20       3
   6       3         21       3
   7       4         22       3
   8       4         23       2
   9       4         24       2
  10       3         25       3
  11       3         26       3
  12       3         27       3
  13       2         28       3
  14       4         29       2
  15       4         30       4

Add the grade point assignments to Column C. Perform a Chi-Square goodness-of-fit test to see whether the observed grades are significantly different from the expected 33.33% for each group. Using the bin counts and an expected count of 10 for each of the three groups, what does =CHISQ.TEST(Actual Range, Expected Range) show? Was this an appropriate analysis test? What assumption(s) are made? Does this really mean anything?
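As a cross-check on the spreadsheet result, the goodness-of-fit calculation can be sketched in Python. This is a minimal sketch using only the standard library, under one loudly stated assumption: with three groups there are 2 degrees of freedom, and for exactly 2 degrees of freedom the chi-square p-value simplifies to exp(-x/2). That shortcut does not hold for other group counts.

```python
import math

# Grade points for students 1-30, copied from the table in Assignment 8.1c.
grades = [2, 2, 3, 2, 3, 3, 4, 4, 4, 3, 3, 3, 2, 4, 4,
          2, 3, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 4]

# Observed count of each grade, and the expected count of 10 per group
# (33.33% of 30 students) stated in the assignment.
observed = [grades.count(g) for g in (2, 3, 4)]
expected = [10, 10, 10]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for a single row of counts: number of groups minus 1.
df = len(observed) - 1

# ASSUMPTION: this closed form is valid only because df == 2.
p_value = math.exp(-chi_square / 2)

print(f"observed = {observed}, chi-square = {chi_square:.2f}, "
      f"df = {df}, p = {p_value:.4f}")
```

The printed p-value is the quantity to compare against what =CHISQ.TEST returns in the worksheet.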
Student Assignment 8.1d Software Proficiency
Name:

Below is the gender of each student.

Student  Gender    Student  Gender
   1       M         16       F
   2       M         17       M
   3       M         18       F
   4       F         19       M
   5       F         20       F
   6       F         21       M
   7       F         22       M
   8       M         23       F
   9       F         24       M
  10       F         25       M
  11       F         26       F
  12       F         27       M
  13       F         28       M
  14       M         29       M
  15       F         30       M

Using spreadsheet software, enter the Gender in Column D and use a t-test calculation to see if there is a significant difference in grades between the female group and the male group of students. Copy the C and D columns to a new worksheet and then sort all of the 2 x 30 cells by Gender. Then use the formula =T.TEST(male grade range, female grade range, 2, 2) to get a result. The third argument (2) requests a two-tailed test, and the fourth argument (2) requests a two-sample test that assumes equal variances. What does the result tell you? For fun, raise the scores for all female students and see what happens to the t-test results.
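The same comparison can be sketched in Python as a rough check on the spreadsheet. This is a minimal sketch, not the method Excel uses internally. It computes the pooled (equal-variance) two-sample t statistic, then approximates the two-tailed p-value by numerically integrating the t distribution with Simpson's rule. The function names pooled_t, t_pdf, and two_tailed_p are hypothetical, and the numerical integration is an assumption made so that only the standard library is needed.

```python
import math
from statistics import mean, variance

# Grade points (Column C) and gender (Column D) for students 1-30.
grades = [2, 2, 3, 2, 3, 3, 4, 4, 4, 3, 3, 3, 2, 4, 4,
          2, 3, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 4]
genders = "MMMFFFFMFFFFFMF" + "FMFMFMMFMMFMMMM"

male = [g for g, s in zip(grades, genders) if s == "M"]
female = [g for g, s in zip(grades, genders) if s == "F"]


def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value: 1 - 2 * integral of pdf from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * (total * h / 3))


t_stat, df = pooled_t(male, female)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.4f}, df = {df}, two-tailed p = {p_value:.4f}")
```

To try the "for fun" step, add a fixed amount to every value in the female list and call pooled_t and two_tailed_p again.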
Student Assignment 8.1e Software Proficiency
Name:

Below are the study times for each student.

Student  Minutes   Student  Minutes
   1       30        16       15
   2       15        17       45
   3       30        18       60
   4       45        19       30
   5       60        20       15
   6       45        21       45
   7       60        22       45
   8       90        23       15
   9      100        24       30
  10       60        25       30
  11       60        26       60
  12       60        27       45
  13       30        28       30
  14       90        29       45
  15      180        30      120

Using spreadsheet software, enter the study times in Column E and calculate the correlation between the study times and exam scores. Copy Columns B and E to another worksheet and use the =CORREL(Exam Score Range, Study Time Range) function. What does this tell you?
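For comparison with the spreadsheet, the Pearson correlation coefficient can be computed by hand in a few lines of Python. This is a minimal sketch using only the standard library; the function name pearson_r is hypothetical.

```python
import math

# Exam scores (Column B) and study minutes (Column E) for students 1-30.
scores = [73, 78, 89, 78, 81, 85, 91, 97, 92, 84, 86, 84, 79, 98, 92,
          79, 85, 88, 87, 84, 87, 86, 79, 75, 86, 81, 83, 85, 78, 90]
minutes = [30, 15, 30, 45, 60, 45, 60, 90, 100, 60, 60, 60, 30, 90, 180,
           15, 45, 60, 30, 15, 45, 45, 15, 30, 30, 60, 45, 30, 45, 120]


def pearson_r(xs, ys):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(scores, minutes)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship and a value near 0 indicates a weak one. Keep in mind the vocabulary term Causality: a correlation alone does not show that more study time causes higher scores.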
Explore Activity: Statistics Software vs. Spreadsheet

The spreadsheet has become the go-to tool for numerical work in the 21st century. Spreadsheet programs were designed to be programmable calculators, not tools for statistical analysis. There are numerous add-ons to spreadsheet programs that attempt to work around the limitations, but it is hard to beat actual statistical software. In your lessons, you were introduced to one free program, PSPP, which is modeled after one of the standard professional software packages. In order to proceed with this activity you will need to be able to install software on your computer (or your teacher will need permission to install it on school computers).

Option 1
Your assignments in these lessons involved using spreadsheet software to analyze data. Repeat the analyses using real statistical software.

Option 2
Find a data set that is of interest to you and then conduct an analysis using one of the statistical program offerings. Copy the data into PSPP (or a spreadsheet program) to investigate whether doctors treat overweight patients differently than normal-weight patients. This is an opportunity to see statistics used in a real-life situation.
Write a report about your set of data including:
a. A description of your data source (i.e. background on where you found the data; the question of interest you are investigating, etc.).
b. An appropriate graph.
c. Descriptive statistics output from computer software (this would include a report of the mean, standard deviation, and other relevant statistics).
d. An interpretation of the importance or relevance of the statistics with regard to your question of interest.
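The descriptive statistics in item c can be produced in many programs. As one hedged illustration, here is a minimal Python sketch using the standard library's statistics module. It uses the exam scores from Assignment 8.1a as stand-in data, which is an assumption; your own data set would replace the list.

```python
from statistics import mean, median, stdev

# Stand-in data: exam scores for students 1-30 from Assignment 8.1a.
# Replace this list with the values from your own data set.
scores = [73, 78, 89, 78, 81, 85, 91, 97, 92, 84, 86, 84, 79, 98, 92,
          79, 85, 88, 87, 84, 87, 86, 79, 75, 86, 81, 83, 85, 78, 90]

summary = {
    "n": len(scores),
    "mean": mean(scores),
    "median": median(scores),
    "std dev (sample)": stdev(scores),  # divides by n - 1, like Excel's STDEV.S
    "min": min(scores),
    "max": max(scores),
}

# Print one statistic per line for pasting into a report.
for name, value in summary.items():
    print(f"{name:>17}: {value:.2f}")
```

The sample standard deviation is used here because the 30 students are treated as a sample; if your data covers an entire population, statistics.pstdev would be the matching choice.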