A Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and

A Decision Tree Analysis of the Transfer Student
Emma Gunu, MS, Research Analyst
Robert M Roe, PhD, Executive Director of Institutional Research and Planning

Overview
- Motivation for Analyses
- Analyses and Results
  - Descriptive
  - Logistic Regression
  - Decision Tree Analysis
    - 1st Year Retention
    - 6 Year Graduation Rate
- Impact of 2 Year Degree on Performance at CMU
- Conclusion

Motivation
- CMU typically enrolls between 1,400 and 1,500 transfer students each year
- Transfer students constitute nearly 25% of all new students
- In the past, we treated all new transfers as a single cohort, ignoring the number of transfer credits
- Given the limited resources of the Office of Student Success, can we identify specific groups for outreach (intervention)?

Entry Credentials by Class

Class of entry          % of transfers      Mean transfer hours   Mean transfer GPA
                        (2011-2013 mean)    (2006-2014)
Freshman (<26 hrs)      21.2%               17.1                  2.90
Sophomore (27-56 hrs)   50.3%               41.1                  3.04
Junior (57-86 hrs)      25.9%               66.0                  3.24

First Year Performance

Class of entry   % persisting into 2nd year
Fresh            70.5%
Soph             77.5%
Junior           84.3%

[Figure: First Term GPA by Entry Level (Fresh, Soph, Junior); bar labels 3.02, 2.76, 2.46, listed in the order extracted from the chart]

LOESS of Persistence by Transfer Hours
[Figure: LOESS curve of persistence against transfer hours; x-axis from 0 to 125 hours]

Graduation by Level of Entry
[Figure: last 3-year mean % graduating in each year (1st through 6th year), one series per class of entry (Freshman, Sophomore, Junior); y-axis from 0% to 90%]

Impact of 2 Year Degree At CC
Next we looked at the impact of obtaining a 2 year degree on persistence, graduation, and GPA.

Impact of 2 Year Degree

Outcome   Degree   No Degree
Pers2     85.7%    74.6%
Grad4     67.2%    43.8%
Grad6     75.8%    58.8%

Pers2 = persisted to 2nd year; Grad4 and Grad6 = graduated within 4 and 6 years.

Graduation in 4yrs

Graduated within 4 years   No Associate Degree   Associate Degree
No                         1006 (56.2%)          190 (32.8%)
Yes                        784 (43.8%)           389 (67.2%)

χ² = 95.714, p < .001

However, persistence and graduation are confounded with number of transfer hours.
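The chi-square statistic on this slide can be checked directly from the four cell counts. The sketch below assumes the reported value is the plain Pearson statistic without a continuity correction; the function name is illustrative and not taken from the presentation.

```python
# Counts from the slide's 2x2 table.
# Rows: graduated within 4 years (No, Yes).
# Columns: (No Associate Degree, Associate Degree).
observed = [[1006, 190],
            [784, 389]]


def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


chi2 = pearson_chi_square(observed)
print(round(chi2, 3))
```

Under that assumption the result should land very close to the reported 95.714.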

Logistic Regression - Graduation in 4yrs

Variable           B        S.E.    df   Sig.    Exp(B)
TRANHRS            0.028    0.003   1    0.000   1.029
Prior_2yr_Degree   0.28     0.122   1    0.021   1.323
Constant           -1.197   0.106   1    0.000   0.302

a. Variable(s) entered on step 1: TRANHRS, Prior_2yr_Degree.
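To make the coefficients concrete, the sketch below turns them into predicted probabilities of graduating within 4 years. It assumes Prior_2yr_Degree is coded 1 for students with a 2 year degree and 0 otherwise, which the slide does not state; the function name is hypothetical.

```python
import math

# Coefficients as reported on the slide (log-odds scale).
B_CONSTANT = -1.197
B_TRANHRS = 0.028
B_PRIOR_2YR_DEGREE = 0.28


def prob_grad_in_4(transfer_hours, has_2yr_degree):
    """Predicted probability of graduating within 4 years.

    Assumption: Prior_2yr_Degree is coded 1 = has degree, 0 = no degree.
    """
    degree = 1 if has_2yr_degree else 0
    logit = B_CONSTANT + B_TRANHRS * transfer_hours + B_PRIOR_2YR_DEGREE * degree
    return 1.0 / (1.0 + math.exp(-logit))


# Compare students with and without a degree at the same number of transfer hours.
for hours in (30, 60):
    without = prob_grad_in_4(hours, False)
    with_degree = prob_grad_in_4(hours, True)
    print(hours, round(without, 3), round(with_degree, 3))
```

Holding transfer hours fixed, the degree term multiplies the odds of graduating by exp(0.28), which is the Exp(B) of about 1.32 shown in the table.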

Results
The impact of a 2 year degree on persistence to second year and graduation in 4 or 6 years was significant beyond the impact of number of transfer hours.

Best Way to Classify Students Who are at Risk?
- Performance varies by transfer hours, but what about other factors?
- We could use logistic regression to determine which factors are predictive
- However, logistic regression is not useful for determining which groups of students are at risk

Factors That Likely Impact Persistence
Variables:
- Full or part time student
- Transfer GPA
- Total transfer hours
- Transfer College Year
- Low Income Status
- First Generation Status

Logistic - Stepwise
Variables in the Equation

Step     Variable   B        S.E.   Wald      df   Sig.   Exp(B)
Step 1a  TRANGPA    .796     .054   217.894   1    .000   2.216
         Constant   -1.123   .161   48.490    1    .000   .325
Step 2b  TRANGPA    .710     .055   166.387   1    .000   2.034
         TRANHRS    .011     .001   57.435    1    .000   1.011
         Constant   -1.294   .163   62.740    1    .000   .274
Step 3c  FULLPART   .739     .128   33.566    1    .000   2.095
         TRANGPA    .702     .055   161.845   1    .000   2.017
         TRANHRS    .012     .001   68.157    1    .000   1.012
         Constant   -2.020   .207   94.866    1    .000   .133
Step 4d  LOWINC     -.324    .067   23.659    1    .000   .723
         FULLPART   .733     .128   32.890    1    .000   2.081
         TRANGPA    .700     .055   160.898   1    .000   2.014
         TRANHRS    .013     .001   75.412    1    .000   1.013
         Constant   -1.968   .208   89.593    1    .000   .140

a. Variable(s) entered on step 1: TRANGPA.
b. Variable(s) entered on step 2: TRANHRS.
c. Variable(s) entered on step 3: FULLPART.
d. Variable(s) entered on step 4: LOWINC.
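The Exp(B) column is the exponential of B, read as the multiplicative change in the odds of the outcome per one-unit increase in the variable. A short sketch using the step 4 values from the table; the 10-hour comparison at the end is an illustrative calculation, not a figure from the presentation.

```python
import math

# Step 4 rows from the table: variable -> (B, reported Exp(B)).
step4 = {
    "LOWINC": (-0.324, 0.723),
    "FULLPART": (0.733, 2.081),
    "TRANGPA": (0.700, 2.014),
    "TRANHRS": (0.013, 1.013),
    "Constant": (-1.968, 0.140),
}

# Recompute Exp(B) from B for every row.
odds_ratios = {name: math.exp(b) for name, (b, _) in step4.items()}

for name, (b, reported) in step4.items():
    print(f"{name:9s} B={b:+.3f} exp(B)={odds_ratios[name]:.3f} reported={reported:.3f}")

# Illustration: odds multiplier for 10 additional transfer hours,
# holding the other variables fixed.
ten_hour_multiplier = math.exp(10 * step4["TRANHRS"][0])
print(round(ten_hour_multiplier, 3))
```

Read this way, each additional point of transfer GPA roughly doubles the odds in the step 4 model, while the per-hour effect of transfer hours is small but accumulates over many hours.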

Decision Trees for Outreach

Decision Tree Models
- There are several types of decision tree models
- Here we chose the Chi-square Automatic Interaction Detection (CHAID) model over Classification and Regression Trees (CRT)
- CRT makes only binary splits, so with CRT, GPA might have to split several times before it is refined enough to be predictive

Decision Tree Models
- These models can be used simply to classify a set of data (e.g., what is the best way to classify our transfer students in terms of retention factors?)
- Or they can be used for prediction (e.g., can we flag new transfer students who are at risk, or not at risk, for persistence?)

CHAID
- The procedure creates tree-based models that determine how variables best combine to explain the outcome in a given dependent variable
- The dependent variable here is a binary response: Retained vs Not Retained
- Predictor variables can be any combination of variable types (continuous or categorical)
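The core idea behind a CHAID split can be sketched in a few lines: cross-tabulate each candidate predictor against the outcome, run a chi-square test, and split on the predictor with the smallest p-value. This is a deliberately simplified illustration, restricted to binary predictors and omitting the category merging and Bonferroni adjustment that full CHAID performs. The records and field names below are hypothetical toy data, not CMU data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def best_binary_split(rows, predictors, outcome):
    """Return (p_value, predictor) for the predictor most associated with the outcome."""
    results = []
    for name in predictors:
        counts = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
        for row in rows:
            counts[(row[name], row[outcome])] += 1
        stat = chi_square_2x2(counts[(0, 0)], counts[(0, 1)],
                              counts[(1, 0)], counts[(1, 1)])
        results.append((p_value_df1(stat), name))
    return min(results)  # smallest p-value wins


# Hypothetical toy records (1 = yes, 0 = no).
rows = (
    [{"low_income": 1, "first_gen": 0, "retained": 0}] * 30
    + [{"low_income": 1, "first_gen": 1, "retained": 1}] * 20
    + [{"low_income": 0, "first_gen": 0, "retained": 1}] * 40
    + [{"low_income": 0, "first_gen": 1, "retained": 0}] * 10
)

p, chosen = best_binary_split(rows, ["low_income", "first_gen"], "retained")
print(chosen, p)
```

In a full tree the same selection is repeated within each resulting node, which is how different secondary predictors can emerge in different branches.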

Method
- Start by selecting a subset of the data for training
- Use the model to predict a new set of data
- Here we trained on a 70% subset of the data and applied the model to the remaining 30%
- Check the misclassification rate and its standard error to assess predictive accuracy
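The holdout procedure described above can be sketched as follows. This is a minimal illustration with hypothetical student IDs and a fixed random seed; the presentation does not say how its 70% sample was drawn.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and test samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def misclassification_rate(actual, predicted):
    """Share of cases where the predicted class differs from the actual class."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Hypothetical student IDs standing in for the real records.
student_ids = list(range(1000))
train, test = train_test_split(student_ids)
print(len(train), len(test))
```

The model is grown on `train` only, and `misclassification_rate` is then computed on `test`, so the error estimate reflects students the model has not seen.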

CHAID Analysis For First Year Retention
- Start by classifying, to determine whether classification is even possible
- If yes, build a prediction model
Input variables:
- Full or part time student
- Transfer GPA
- Total transfer hours
- Transfer College Year
- Low Income Status
- First Generation Status
All were selected by the model.

Persist to 2nd Year
[Figure: CHAID tree for persistence to 2nd year; the first split is on transfer GPA, with branches <=2.41, (2.41-3.06], (3.06-3.35], (3.35-3.73], >3.73]
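For anyone applying the tree to new students, the first-level split amounts to a lookup of transfer GPA against four cut points. A small sketch, assuming the intervals are closed on the right as the bracket notation on the slide suggests; the function and constant names are illustrative.

```python
from bisect import bisect_left

# Cut points of the first-level split on transfer GPA, taken from the slide.
CUT_POINTS = [2.41, 3.06, 3.35, 3.73]
LABELS = ["<=2.41", "(2.41-3.06]", "(3.06-3.35]", "(3.35-3.73]", ">3.73"]


def gpa_bin(transfer_gpa):
    """Return the first-level tree branch for a transfer GPA.

    bisect_left places a value equal to a cut point in the lower bin,
    which matches right-closed intervals such as (2.41-3.06].
    """
    return LABELS[bisect_left(CUT_POINTS, transfer_gpa)]


for gpa in (2.0, 2.41, 3.2, 3.9):
    print(gpa, gpa_bin(gpa))
```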


Which Nodes are Important?
Gains for Nodes (Training)

Node   Index
5      115.1%
25     114.5%
30     114.2%
28     110.8%
31     104.6%
23     103.9%
26     103.2%
22     102.5%
16     101.0%
27     98.6%
8      98.0%
29     95.5%
32     93.3%
17     89.1%
12     88.4%
24     86.7%
20     85.2%
21     85.2%
9      80.4%
18     72.5%
6      70.0%

- Node 5: Transfer GPA > 3.73; 90.1% persist to 2nd year
- Node 6: Transfer GPA <= 2.41 and Transfer Hrs <= 23; 54.8% persist to 2nd year
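The index in a gains table is the node's rate for the target category divided by the overall rate, expressed as a percentage. The sketch below reproduces the two annotated nodes, assuming the overall rate is the training-sample persistence rate of 4469 out of 5711 students from the deck's classification table; the function name is illustrative.

```python
# Overall training-sample persistence rate, from the deck's classification table.
overall_rate = 4469 / (1242 + 4469)


def gain_index(node_rate, overall):
    """Node persistence rate relative to the overall rate, as a percentage."""
    return 100.0 * node_rate / overall


node5_index = gain_index(0.901, overall_rate)  # Transfer GPA > 3.73
node6_index = gain_index(0.548, overall_rate)  # Transfer GPA <= 2.41, Transfer Hrs <= 23
print(round(node5_index, 1), round(node6_index, 1))
```

An index above 100% marks a node whose students persist at a higher rate than transfers overall; nodes well below 100%, such as node 6, are the natural candidates for outreach.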

Model Evaluation

Classification

Sample     Observed             Predicted      Predicted   Percent
                                Not Persist2   Persist2    Correct
Training   Not Persist2         0              1242        0.0%
           Persist2             0              4469        100.0%
           Overall Percentage   0.0%           100.0%      78.3%
Test       Not Persist2         0              563         0.0%
           Persist2             0              1917        100.0%
           Overall Percentage   0.0%           100.0%      77.3%

Risk

Sample     Estimate   Std. Error
Training   .217       .005
Test       .227       .008

Growing Method: CHAID
Dependent Variable: Persisted to 2nd Year
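The risk estimates follow directly from the classification counts. Because the table shows every student classified as persisting, the misclassified cases are exactly the students who did not persist, and the percent correct equals the persistence base rate. The sketch below assumes the reported standard error is the usual binomial one, sqrt(p(1 - p)/n), which reproduces the rounded values on the slide; the function name is illustrative.

```python
import math

# (misclassified, total) per sample, read from the classification table.
samples = {
    "Training": (1242, 1242 + 4469),
    "Test": (563, 563 + 1917),
}


def risk_and_se(misclassified, total):
    """Misclassification risk and its binomial standard error (assumed formula)."""
    risk = misclassified / total
    se = math.sqrt(risk * (1.0 - risk) / total)
    return risk, se


estimates = {name: risk_and_se(*counts) for name, counts in samples.items()}
for name, (risk, se) in estimates.items():
    print(f"{name:8s} risk={risk:.3f} se={se:.3f}")
```

The similar risk in the training and test samples indicates the tree is not overfitting, although accuracy at the level of individual students is no better than predicting that everyone persists. The value of the tree here lies in the differing persistence rates across nodes.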

Summary of Persistence Decision Tree
- Transfer GPA is most predictive
- Other predictive factors vary by transfer GPA
- For clarity and ease of understanding for those who use the information, we created the following tables:

Retention at Risk Table
[Figure: grid of transfer GPA (4.0 down to 2.0, in steps of 0.1) by transfer credits (>60 down to 2, in steps of 2). Cell labels, whose positions in the grid are not recoverable: 6.2%, 10.5%, 3.3%, 17.4%, 18.3%, 4.5%, 1.5%, 5.0%, 3.5%. Annotations: "Low income in this range at higher risk." (two regions) and "First Generation in this range at higher risk."]

Decision Tree Analysis For 6 Year Graduation Rate
Input variables:
- Full or part time student
- Transfer GPA
- Total transfer hours
- Transfer College Year
- Low Income Status
- First Generation Status
All variables except full vs part time were selected by the model.

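6 Year Graduation
[Figure: CHAID tree for 6 year graduation; the first split is on transfer GPA, with branches <=2.30, (2.30-2.69], (2.69-2.99], (2.99-3.29], >3.29]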

Summary of 6 Year Graduation Decision Tree
- For low transfer GPA (<=2.3), first generation students are at greater risk
- For the second lowest GPA range (2.3-2.69], transfer credit hours and 2 vs 4 year are important
- For the middle GPA range (2.69-3.29], 2 vs 4 year and transfer credit hours are important
- For the highest GPA range (>3.29), transfer hours, 2 vs 4 year, and low income are important

Grad at Risk Table
[Figure: grid of transfer GPA (4.0 down to 2.0, in steps of 0.1) by transfer credits (>60 down to 2, in steps of 2). Cell labels, whose positions in the grid are not recoverable: 15.0%, 7.9%, 10.0%. Annotations: "CC in this range at higher risk.", "CC transfer in this range at higher risk.", and "First Generation in this range at higher risk."]

Summary
- The more refined assessment of transfer students revealed some interesting findings
- Using decision tree analyses on transfer data, we are able to identify very specific groups at risk (and groups likely to succeed)
- We have presented these results to the retention subcommittee, the strategic enrollment group, and the vice president of enrollment services

Questions?