Purdue Data Summit Communication of Big Data Analytics. New SAT Predictive Validity Case Study

Purdue Data Summit 2017
Communication of Big Data Analytics
New SAT Predictive Validity Case Study
Paul M. Johnson, Ed.D.
Associate Vice President for Enrollment Management, Research & Enrollment Information Services (REIS)
Paul.Johnson@REIS.Rutgers.edu

Communication of Research to Stakeholders
- Early communication, electronically and at in-person meetings: the what, why, how, when, etc.
- Format similar to research with which academia is familiar (e.g., research article)
- Clearly communicate data as a tool to help inform their decision-making, if applicable
- Visuals to help show the meaning of statistical concepts
- Link research/data to the university and area (e.g., Enrollment Management) strategic plan that stakeholders have already helped create and support
- Communicate progress on research, results, implementation, and next steps, electronically and in person
Office of Enrollment Management

Example: Overview of Intro Meeting (academic research article approach)
- Background
- Method: Multiple Linear Regression Validity Study
- Hypotheses
- Outcomes
- Simulation & Implementation
- Questions

Example: Hypotheses
- SAT-Writing will correlate significantly with first-year (FY) GPA
- SAT-Writing & SAT-Critical Reading (CR) are heavily correlated
- Incorporating SAT-Writing and recent cohorts will improve the multiple correlation (R) with FY GPA vs. the current regression equation

Examples: Visual, Outcomes
- SAT-Writing & SAT-CR were heavily correlated (.7 to .8, where 1 is perfect), presenting multicollinearity challenges.
[Figure: Writing vs. Critical Reading]
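To make the multicollinearity point concrete, here is a minimal sketch in Python using synthetic data only. Nothing here comes from the study's actual data: the scores are random numbers generated to correlate at about .75, and the function names (`pearson_r`, `vif_two_predictors`, `multiple_r_two_predictors`) are hypothetical helpers. The sketch uses the standard two-predictor formulas for the variance inflation factor (VIF) and the multiple correlation R.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r12):
    """Variance inflation factor when a model has exactly two predictors."""
    return 1.0 / (1.0 - r12 ** 2)


def multiple_r_two_predictors(r_y1, r_y2, r12):
    """Multiple correlation R of an outcome with two predictors,
    computed from the three pairwise correlations."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Synthetic, standardized "scores" (assumption: illustration only, not study data).
rng = random.Random(0)
n = 5000
cr = [rng.gauss(0, 1) for _ in range(n)]
# Build a writing score that correlates with CR at roughly .75.
writing = [0.75 * c + math.sqrt(1 - 0.75 ** 2) * rng.gauss(0, 1) for c in cr]
# Hypothetical outcome that depends on both predictors plus noise.
gpa = [0.3 * c + 0.3 * w + rng.gauss(0, 1) for c, w in zip(cr, writing)]

r_wc = pearson_r(writing, cr)      # correlation between the two predictors
r_g_cr = pearson_r(gpa, cr)        # outcome vs. CR
r_g_w = pearson_r(gpa, writing)    # outcome vs. Writing
big_r = multiple_r_two_predictors(r_g_cr, r_g_w, r_wc)

print(f"r(Writing, CR) = {r_wc:.2f}, VIF = {vif_two_predictors(r_wc):.2f}")
print(f"r(GPA, CR) = {r_g_cr:.2f}, r(GPA, Writing) = {r_g_w:.2f}, multiple R = {big_r:.2f}")
```

With predictor correlations of .7 to .8, the two-predictor VIF is roughly 2 to 2.8, which is why adding the second score raises R only modestly and makes the individual regression weights less stable.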

Example: Discussion & Approval
- First discussed with deans during annual Enrollment Management update meetings, Summer-Fall 2008.
- Sounded good to everyone; Engineering raised some concern regarding their need to weight SAT-Math heavily.
- Following the study (Summer 2009), weights and background were sent to deans for review and their recommendations.
- Weights were changed based on the deans' requests, with less change than the data suggested.
- Noticed that SAT-Math received a significantly lower weight (e.g., less than 10% at SEBS) based on the regression.

USER CENTERED DESIGN IN DATA SCIENCE
IAN PYTLARZ, SENIOR DATA SCIENTIST
IPYTLARZ@PURDUE.EDU

Why Design Is Needed In Data Science
Many Ways Data Science Goes Wrong
- Automation instead of augmentation
  - Tendency to think about replacing people, even if this isn't the optimal solution
  - This generates a lot of fear towards data scientists
- Accidental stupidity
  - Tay: Microsoft chat bot gone horribly wrong
  - Promotion of fake news
  - Racist image recognition
"The problem with the designs of most engineers is that they are too logical. We have to accept human behavior the way it is, not the way we would wish it to be." (Don Norman, The Design of Everyday Things)
PURDUE DATA SUMMIT

Examples At Purdue: Forecast & Grades Modelling
Thinking About User Perspective
- The model predicts a student to fail if they have > 50% likelihood of doing so
  - Low bar to clear, so students end up with many predicted failures
  - This could be demoralizing to a student if they had many failure predictions
- Instead of presenting the binary output, we binned the output to only show students as in danger if they had > 80% likelihood
- We could be doing even better!
  - A small minority of students still show as failing everything without special intervention
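The difference between the raw binary output and the binned display can be sketched as follows. This is only an illustration under stated assumptions: the function names, the status labels, and the example probabilities are hypothetical, and only the 50% and 80% cutoffs come from the slide.

```python
def predicted_to_fail(p_fail, threshold=0.5):
    """Raw binary model output: predicted to fail above a 50% likelihood."""
    return p_fail > threshold


def display_status(p_fail, danger_threshold=0.8):
    """Binned display: only flag a course when failure likelihood exceeds 80%.
    The label strings are placeholders, not the wording used in Forecast."""
    return "in danger" if p_fail > danger_threshold else "no warning"


# Hypothetical failure likelihoods for one student's four courses.
course_probabilities = [0.55, 0.62, 0.85, 0.40]

binary_failures = sum(predicted_to_fail(p) for p in course_probabilities)
shown_in_danger = sum(display_status(p) == "in danger" for p in course_probabilities)

print(f"Binary output: {binary_failures} predicted failures")
print(f"Binned display: {shown_in_danger} course(s) shown as in danger")
```

For this hypothetical student the binary output predicts failure in three courses, while the binned display flags only one, which is the less demoralizing presentation the slide describes.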

Examples At Purdue: Forecast & Grades Modelling
Predictions Alone Won't Change Behavior
- The goal of Forecast is to change student behavior; we needed to bear that in mind when analyzing the model
- We need to provide information on WHY those predictions were made, to nudge behavior in the right direction
- Currently shows students the relationships between behaviors and success
  - Students can influence this!
- Again, we could be doing this even better
  - Should focus dynamically on students who have successful behaviors that they could be improving on

Next Steps at Purdue: Augmentation
A Website Will Never Replace Human Beings
- There is no automatic tool that is going to drastically alter student behaviors by itself
- Human intervention is the best way to effect change in students
- Luckily, we have humans who already do that job!
  - Advisors can be augmented with machines to improve student success [1]
- This is in progress at Purdue, both in modelling and in process improvements
[1] http://www.npr.org/sections/ed/2016/10/30/499200614/how-one-university-used-big-data-to-boost-graduation-rates

UK LEADS: Using Data Analytics to Drive Decision Making at
Craig Rudick, Executive Director of Institutional Research and Lead Data Scientist
craig.rudick@uky.edu

High School Readiness Index (HSRI) = HSGPA*10 + ACT/2
Unmet Financial Need is a major driver of student success.
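As a quick worked example of the HSRI formula above, the sketch below computes the index for a hypothetical applicant. The function name `hsri` is illustrative, and the assumption that HSGPA is on a 4.0 scale and ACT is the composite score (maximum 36) is mine, not stated on the slide.

```python
def hsri(hs_gpa, act):
    """High School Readiness Index as given on the slide: HSGPA*10 + ACT/2.
    Assumes (not stated in the source) a 4.0-scale GPA and an ACT composite score."""
    return hs_gpa * 10 + act / 2


# Hypothetical applicant: 3.5 GPA and ACT composite of 28.
example = hsri(3.5, 28)
print(example)  # 49.0

# Under the scale assumptions above, the maximum possible index is 58.0.
print(hsri(4.0, 36))  # 58.0
```

Under these assumptions a GPA point is worth 10 index points and an ACT point is worth half a point, so the GPA term (up to 40) carries more weight than the ACT term (up to 18).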

UK LEADS: Leveraging Economic Affordability for Developing Success
- Shifting resources toward need-based financial aid
- Simulate changing financial aid awards to optimize:
  - Yield
  - Retention/Progression
  - Net Tuition Revenue
  - URM/Pell/First Gen/etc.
[Diagram labels: Financial Aid; Yield; Retention; Total Enrollment; Net Tuition Revenue; Demographics/Diversity]

Designing effective analyses for decision support:
- Use the simplest methods possible
- Create tools others can use
- A picture is worth a thousand algorithms
- Focus algorithm output on useful decision points
- Pilots and test cases to prove validity
- Make all your underlying data available