First Digit Phenomenon Activity 6A: First Digit Phenomenon A Descriptive Statistical Discovery Instruction Sheet (Rev 2.5)

Introduction:

Statisticians divide their subject into two major branches.

Descriptive Statistics: The concepts and methods needed to organize and summarize data, describing it in the form of tables, graphs, or sample statistics.

Inferential Statistics: The methods for reaching decisions about a large body of data by examining only a small part of it, that is, inferring (or estimating) population characteristics from sample data.

At times we tend to downplay descriptive statistics, perhaps thinking that it plays a minor role in making decisions and judgment calls, or that it lacks the clout of inferential statistics. Specifically, this spreadsheet activity gives students an opportunity to use simple side-by-side bar graphs to aid in their discovery of the amazing First Digit Phenomenon; more generally, it helps them gain an appreciation for the shapes of sampling distributions.

This activity is based on material covered in Units 5C and 6A.

Estimated Time for Completion:

This activity could potentially be broken into two parts:
1) A gathering of population data, accompanied by the analysis.
2) An internet search, accompanied by a modification of the left-hand theoretical distribution and the ensuing discussion questions.

The total time spent on the activity is fairly variable: approximately 1 to 3 hours for the spreadsheet and analysis, depending on the depth the student pursues and the student's background with spreadsheets. Students can save a considerable amount of time because the right-hand table and graph are already set up in the template.

Objectives:

The mathematical objectives of the activity are:
1. To discover, through U.S. Census data, the distributions of both the right-hand and left-hand digits and how dramatically they differ.
2. To learn about the First Digit Phenomenon, its history, and some of the possible applications of Benford's Law.

The spreadsheet objectives of the activity are:
1. To learn to use the Excel 'left', 'right', and 'countif' functions and how to sort data.
2. To construct bar graphs and draw appropriate conclusions from the data.

Materials Required:

- Access to the internet to gather population data from the U.S. Census Bureau's website and to do a brief keyword search on "Benford's Law" and/or "first digit phenomenon".
- Access to spreadsheet software.
- The project template First_Digit_Phenomenon.xls.
- A selected state whose city and town populations you will analyze for the distribution of left-hand and right-hand digits (a states chart is provided).
- This activity handout for recording, analyzing, and discussing the results.

Activity Overview:

Most of the project can be self-driven. The following list outlines what you can do to feel satisfied with the project and confident with your spreadsheet work.

- Become familiar with the sample of a completed spreadsheet that centers on the population of Idaho's 200 towns and cities (see the last pages of this activity sheet).
- Make sure you are comfortable with the 'left', 'right', and 'countif' functions as well as the Benford's Law formula: log(1 + 1/Digit), where the logarithm is base 10.
- Be clear on why shape and center matter for the right-hand and left-hand distributions, and understand what both the heights of the curve and the area below the curve actually represent.
- Remember to use an absolute cell reference for the left digit column (column D) when building the 'countif' or the theoretical count formulas. Without absolute cell referencing, the range shifts as you fill your formula down, and you will miss the first several towns or cities.
- Use the template as a guide. It can show you how to proceed with a modest amount of effort, depending on your spreadsheet skill level. Copying a bar graph and then modifying it with a right-click, choosing "Select data...", can be extremely valuable in getting you started in the right direction.
- It may also prove useful to briefly review some of the key types of distributions, such as left-skewed, right-skewed, uniform, normal, and bi-modal, and where they are often found in life's various contexts.
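As a cross-check on the Benford's Law formula mentioned in the overview, here is a minimal Python sketch that evaluates log10(1 + 1/Digit) for each possible leading digit. Python is not part of this activity, and the function name benford_proportion is made up for illustration. The sketch only shows that the nine proportions shrink as the digit grows and that they add up to 1.

```python
import math


def benford_proportion(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    # Benford's Law: log base 10 of (1 + 1/digit)
    return math.log10(1 + 1 / digit)


# One proportion per possible leading digit
proportions = {d: benford_proportion(d) for d in range(1, 10)}

# The nine proportions should add up to 1 (up to rounding error)
total = sum(proportions.values())
```

Multiplying each proportion by the total number of towns and cities gives a Benford-based theoretical count for that digit.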
Before you begin:

Review the section titled Shapes of Distributions discussed in Unit 6A of your text.

1) What are the two branches of statistics?
(1)
(2)

2) Theorize what the distributions would approximately look like for both the right-hand and left-hand digits of the populations of towns and cities from your selected state. Show a sketch in the rectangles below.

[Sketch area: Right-hand Digit Distribution, horizontal axis labeled with digits 0 through 9]
[Sketch area: Left-hand Digit Distribution, horizontal axis labeled with digits 1 through 9]

Procedure:

Now, with your spreadsheet template ready to go, carefully follow the steps listed below. You must choose one of three options to complete this activity. Your total score depends on the option chosen.

Option 1: Regular score (7.0 points)
Option 2: 80% of the regular score (5.6 points)
Option 3: 50% of the regular score (3.5 points)

Note: States with a strikethrough line have too few cities and towns reported in the census database to make for a valid analysis.

Alabama       Hawaii      Massachusetts   New Mexico       South Dakota
Alaska        Idaho       Michigan        New York         Tennessee
Arizona       Illinois    Minnesota       North Carolina   Texas
Arkansas      Indiana     Mississippi     North Dakota     Utah
California    Iowa        Missouri        Ohio             Vermont
Colorado      Kansas      Montana         Oklahoma         Virginia
Connecticut   Kentucky    Nebraska        Oregon           Washington
Delaware      Louisiana   Nevada          Pennsylvania     West Virginia
Florida       Maine       New Hampshire   Rhode Island     Wisconsin
Georgia       Maryland    New Jersey      South Carolina   Wyoming

Option 1: Regular score (7.0 points)

1. Choose a state from the states chart shown above.
2. Use a web browser to locate the U.S. Census Bureau. Try the following link: http://www.census.gov/popest/ or contact the state's Census Customer Service and ask for a breakdown of the individual populations of each of the state's cities and towns. You are not restricted to a particular year; you may be able to obtain the latest data, which could be from 2010 or later.

3. Open the Excel file for your selected state in your spreadsheet software.
4. Copy all the cells containing the names of the towns and cities of the state along with their projected populations (that is, the data from columns A and B, excluding the headers).
5. Paste this information into columns A and B of your spreadsheet template.
6. Carefully use your Sort command to arrange the towns and cities from lowest to highest in population. (See your software's help documentation for the proper way to do this.)
7. After pasting the data into your template, you will notice that the right-hand digit of each town or city's population is automatically placed into column C, along with a frequency table to the right and a side-by-side bar graph directly linked to the table.
8. Double-click any of the right digits in column C to discover the built-in function that was used to copy the right-hand digit of the population. Double-click the Actual Count cells in column G to see how the 'countif' function is used. You can find more information about the 'countif' function in your software's help documentation. As you investigate the "Theoretical Count" cells in column H, notice that the total number of towns and cities is multiplied by 1/10, since there are 10 potential digits on the right side. On average, each digit should appear equally often.
9. Using the knowledge gained from the right-hand distribution data, table, and side-by-side bar graph, construct the same information and graph for the left digits in a parallel way. Note that the left digits are 1 through 9, so the theoretical count requires that the total number of towns and cities be multiplied by 1/9.
10. Depending on your graphing experience, it may be easier to copy and paste the bar graph and then customize it, rather than creating one from scratch for the left-hand digits. Refer to your software's help for plotting tips on making an effective side-by-side bar graph of your data based on the left-hand digits of the populations (optional).

Option 2: 80% of the regular score (5.6 points)

1. Use the data provided for the state of Idaho.
2. Do steps 4 through 10 listed above.
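For readers who want to check their spreadsheet results another way, steps 6 through 9 can be mirrored in a short Python sketch. Everything here is an assumption for illustration: the populations list is made up (it is not census data), the helper names left_digit and right_digit are hypothetical, and Counter stands in for the spreadsheet's 'countif' function.

```python
from collections import Counter

# Made-up populations, used only to illustrate the steps (not census data)
populations = [4720, 512, 31007, 1043, 9985, 1870, 12044, 2391]


def left_digit(n: int) -> int:
    """Leading digit of n, the role played by 'left' in the spreadsheet."""
    return int(str(n)[0])


def right_digit(n: int) -> int:
    """Final digit of n, the role played by 'right' in the spreadsheet."""
    return n % 10


# Step 6: sort from lowest to highest population
populations.sort()

# Steps 7 and 8: actual counts per digit, the role played by 'countif'
right_counts = Counter(right_digit(p) for p in populations)
left_counts = Counter(left_digit(p) for p in populations)

# Steps 8 and 9: uniform theoretical counts (total times 1/10 or 1/9)
n = len(populations)
right_theoretical = {d: n / 10 for d in range(10)}
left_theoretical = {d: n / 9 for d in range(1, 10)}
```

With only eight made-up values the counts are far too small to show any pattern; a real state's list of towns and cities is needed for the comparison to mean anything.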

Option 3: 50% of the regular score (3.5 points)

If time or facilities are such that individual Internet research is cumbersome for you, much of Benford's Law and/or the first digit phenomenon can be addressed collectively. The story of its discovery by American astronomer Simon Newcomb and its rediscovery by American electrical engineer Frank Benford is very interesting. Type a detailed theoretical report on the subject of this activity.

Photo credits: WWH'.gutsbv.lId.ac.uk, http://en.wikipedia.org/wiki/simon_newcomb

Analysis:

Now, with both side-by-side bar graphs completed, answer the questions below.

1. What type of distribution shape would you say fits (models) each of the following counts?
   Right Actual Count:
   Left Actual Count:
   Right Theoretical Count:
   Left Theoretical Count:
2. Which of the two types of actual digits fits its theoretical count distribution best? LEFT-HAND RIGHT-HAND (circle one)
3. Referring to the one you did not circle above, describe in a brief sentence what characteristic(s) of the shape kept it from being the best fit.

Discussion:

Before addressing the items below, modify the formula for the Left-hand Theoretical Count by using the logarithmic formula from Benford's Law. A brief Internet search on "Benford's Law" and/or "first digit phenomenon" would be very helpful.

1. Why do you suppose the actual count distributions are not as smooth as the theoretical ones?
2. After changing the left-hand theoretical count to reflect the logarithmic formula from Benford's Law, discuss how the new shape improved the fit to the actual count.
3. Why would the right-hand digit's distribution be approximately uniform (flat)?
4. Why would the left-hand digit's distribution be roughly right-skewed? (See #5 below.)
5. To better understand why there is a built-in bias toward the lower digits in the left-hand distribution, scan the sorted populations of your state from low to high. Discuss why a city, as it grows in population, would keep a left-hand digit of 1 longer than a 2, or a 2 longer than a 3, and so on. You may come to a better appreciation of the first digit phenomenon that occurs in certain kinds of data by noting that there is a 100% increase from 1 to 2 but a dramatically tapering percentage thereafter. Fill in the rest of the table and discuss how this might apply to population changes in a town or city.

   Digit        1     2      3    4    5    6    7    8    9
   % Increase   n/a   100%

6. Based on your Internet research, discuss a practical application of Benford's Law that interested you and why. Also, include what the Benford ratios are for digits 1 through 9.
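The idea behind Discussion item 5 can also be explored numerically. The following minimal Python sketch assumes a hypothetical town that grows by the same percentage at every step (an assumption for illustration, not a claim about real census data) and counts how many steps it spends under each leading digit while growing from 1,000 to 10,000. The function name and the parameters start, stop, and rate are invented for this sketch.

```python
import math


def steps_per_leading_digit(start=1_000.0, stop=10_000.0, rate=0.01):
    """Count growth steps spent under each leading digit between start and stop.

    Assumes constant percentage growth of `rate` per step (hypothetical).
    """
    counts = {d: 0 for d in range(1, 10)}
    population = start
    while population < stop:
        leading = int(str(int(population))[0])  # leading digit of the whole-number part
        counts[leading] += 1
        population *= 1 + rate
    return counts


counts = steps_per_leading_digit()
total_steps = sum(counts.values())

# Share of the growth period spent under each leading digit
shares = {d: c / total_steps for d, c in counts.items()}

# Benford's Law proportions, for comparison with the shares above
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
```

Under this constant-growth assumption, the share of time spent with each leading digit lands close to the Benford proportion for that digit, which is one way to see why lower leading digits are more common in data of this kind.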

1. The population of Idaho's 200 towns and cities Spreadsheet: