Purdue Data Summit 2017
Communication of Big Data Analytics
New SAT Predictive Validity Case Study
Paul M. Johnson, Ed.D.
Associate Vice President for Enrollment Management, Research & Enrollment Information Services (REIS)
Paul.Johnson@REIS.Rutgers.edu
Communication of Research to Stakeholders
- Communicate early, electronically and at in-person meetings: the what, why, how, when, etc.
- Use a format similar to research with which academia is familiar (e.g., a research article).
- Clearly communicate data as a tool to help inform stakeholders' decision-making, if applicable.
- Use visuals to help show the meaning of statistical concepts.
- Link research/data to the university and area (e.g., Enrollment Management) strategic plans that stakeholders have already helped create and support.
- Communicate progress on research, results, implementation, and next steps, electronically and in person.
Office of Enrollment Management
Example: Overview of Intro Meeting (academic research article approach)
- Background
- Method: Multiple Linear Regression Validity Study
- Hypotheses
- Outcomes
- Simulation & Implementation
- Questions
Example: Hypotheses
- SAT-Writing will correlate significantly with FY GPA.
- SAT-Writing and SAT-Critical Reading (CR) are heavily correlated.
- Incorporating SAT-Writing and recent cohorts will improve the multiple correlation (R) with FY GPA relative to the current regression equation.
Example: Visual, Outcomes
- SAT-Writing and SAT-CR were heavily correlated (0.7 to 0.8, where 1 is perfect), presenting multicollinearity challenges.
[Figure labels: Writing, Critical Reading]
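To make the multicollinearity point concrete, here is a minimal sketch of how the correlation between two predictors translates into a variance inflation factor (VIF). It assumes a model with exactly two predictors, where VIF = 1 / (1 - r^2); the actual study likely used more predictors, so this is illustrative only. The score lists and the function names `pearson_r` and `vif_two_predictors` are hypothetical and not taken from the presentation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(r):
    """VIF for a model with exactly two predictors whose correlation is r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical scores, invented for illustration only (not study data).
writing = [480, 520, 560, 600, 640, 700]
critical_reading = [500, 510, 590, 580, 660, 690]

r = pearson_r(writing, critical_reading)
print(f"sample r = {r:.2f}, VIF = {vif_two_predictors(r):.2f}")

# VIF at the two ends of the correlation range quoted on the slide.
for r_slide in (0.7, 0.8):
    print(f"r = {r_slide}: VIF = {vif_two_predictors(r_slide):.2f}")
```

Under this two-predictor assumption, correlations of 0.7 to 0.8 roughly double or nearly triple the variance of each coefficient estimate, which is one way to show stakeholders why individual weights for the two sections are hard to pin down.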
Example: Discussion & Approval
- First discussed with deans during meetings in annual Enrollment Management updates, Summer-Fall 2008.
- Sounded good to everyone; Engineering raised some concern regarding their need to weight SAT-Math heavily.
- Following the study (Summer 2009), weights and background were sent to deans for review and their recommendations.
- Weights were changed based on the deans' requests, with less change than the data suggested.
- Noticed that SAT-Math received a significantly lower weight (e.g., less than 10% at SEBS) based on the regression.
USER CENTERED DESIGN IN DATA SCIENCE
IAN PYTLARZ
SENIOR DATA SCIENTIST
IPYTLARZ@PURDUE.EDU
Why Design Is Needed In Data Science
Many Ways Data Science Goes Wrong
- Automation instead of augmentation
  - Tendency to think about replacing people, even if this isn't the optimal solution
  - This generates a lot of fear towards data scientists
- Accidental stupidity
  - Tay: Microsoft chat bot gone horribly wrong
  - Promotion of fake news
  - Racist image recognition
"The problem with the designs of most engineers is that they are too logical. We have to accept human behavior the way it is, not the way we would wish it to be." (Don Norman, The Design of Everyday Things)
PURDUE DATA SUMMIT
Examples At Purdue: Forecast & Grades Modelling
Thinking About User Perspective
- The model predicts a student to fail if they have > 50% likelihood of doing so
  - Low bar to clear, so students end up with many predicted failures
  - This could be demoralizing to a student if they had many failure predictions
- Instead of presenting the binary output, we binned the output to only show students as in danger if they had > 80% likelihood
- We could be doing even better!
  - A small minority of students still show as failing everything without special intervention
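The difference between the raw model rule and the stricter display rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Forecast implementation: the two thresholds come from the slide, while the function names, course labels, and probabilities are hypothetical.

```python
FAIL_THRESHOLD = 0.50    # raw model rule described on the slide
DANGER_THRESHOLD = 0.80  # stricter rule used for what students are shown


def model_predicts_fail(p_fail):
    """Binary model output: failure is predicted above 50% likelihood."""
    return p_fail > FAIL_THRESHOLD


def show_in_danger(p_fail):
    """Display rule: flag a course only above 80% likelihood of failure."""
    return p_fail > DANGER_THRESHOLD


# Hypothetical failure probabilities for one student (illustration only).
course_probs = {
    "COURSE_A": 0.55,
    "COURSE_B": 0.62,
    "COURSE_C": 0.85,
    "COURSE_D": 0.30,
}

raw_flags = [c for c, p in course_probs.items() if model_predicts_fail(p)]
shown_flags = [c for c, p in course_probs.items() if show_in_danger(p)]

print("model predicts failure in:", raw_flags)
print("shown to student as in danger:", shown_flags)
```

With these made-up numbers the student would see one warning instead of three, which is the user-perspective effect the slide describes.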
Examples At Purdue: Forecast & Grades Modelling
Predictions Alone Won't Change Behavior
- The goal of Forecast is to change student behavior; we needed to bear that in mind when analyzing the model
- We need to provide information on WHY those predictions were made, to nudge behavior in the right direction
- Currently shows students the relationships between behaviors and success
  - Students can influence this!
- Again, we could be doing this even better
  - Should focus dynamically on students who have successful behaviors that they could be improving on
Next Steps at Purdue: Augmentation
A Website Will Never Replace Human Beings
- There is no automatic tool that is going to drastically alter student behaviors by itself
- Human intervention is the best way to effect change in students
- Luckily, we have humans who already do that job!
  - Advisors can be augmented with machines to improve student success [1]
- This is in progress at Purdue, both in modelling and in process improvements
[1] http://www.npr.org/sections/ed/2016/10/30/499200614/how-one-university-used-big-data-to-boost-graduation-rates
UK LEADS: Using Data Analytics to Drive Decision Making at
Craig Rudick, Executive Director of Institutional Research and Lead Data Scientist
craig.rudick@uky.edu
High School Readiness Index (HSRI) = HSGPA*10 + ACT/2
Unmet Financial Need is a major driver of student success.
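As a worked example of the HSRI formula above, the following sketch computes the index for one student. It assumes HSGPA is on a 4.0 scale and ACT is the composite score; the slide does not state either, and the function name `hsri` and the sample values are hypothetical.

```python
def hsri(hs_gpa, act):
    """High School Readiness Index as given on the slide: HSGPA*10 + ACT/2.

    Assumes hs_gpa is on a 4.0 scale and act is the composite score;
    neither scale is stated in the source.
    """
    return hs_gpa * 10 + act / 2


# Hypothetical student: 3.5 GPA and ACT 28 gives 35 + 14.
example_index = hsri(3.5, 28)
print(example_index)  # 49.0
```

Under those assumptions the GPA term contributes up to 40 points and the ACT term up to 18, so GPA carries roughly twice the weight of the test score in the index.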
UK LEADS: Leveraging Economic Affordability for Developing Success
- Shifting resources toward need-based financial aid
- Simulate changing financial aid awards to optimize:
  - Yield
  - Retention/Progression
  - Net Tuition Revenue
  - URM/Pell/First Gen/etc.
[Diagram labels: Financial Aid, Yield, Retention, Total Enrollment, Net Tuition Revenue, Demographics/Diversity]
Designing effective analyses for decision support:
- Use the simplest methods possible
- Create tools others can use
- A picture is worth a thousand algorithms
- Focus algorithm output on useful decision points
- Pilots and test cases to prove validity
- Make all your underlying data available