Backend Features in TechLauncher Common Assessment Process Platform. James Volis, u5370515. Supervisor: Shayne Flint


Structure: 1. Background 2. Task 3. Approach 4. Results 5. Conclusions

Background

TechLauncher

Also known as...

COMP3100, COMP3500, COMP3550, COMP4500, COMP8715

~37 different projects, ~240 students enrolled. Big course.

So what? It's a big course.

The issues?

Subjectivity.

No two students are the same!

No two pieces of work are the same!

(hopefully not)

A huge amount of data.

A lot of marking is required!

Effort is required for marking!

Problems remain...

How do you mark different pieces of work to the same standard?

How do you give timely feedback, given the large amount of marking?

Task

Simple solution to the marking problem...

Get the students to do the work for you!

Otherwise known as Peer Assessment

However, this brings up other issues, namely...

Using students to mark themselves!

How can you assess if they are giving good feedback?

How can you assess what is good feedback?

And then there is the challenge,

How do you filter the good feedback from the bad?

How do you filter out all the good feedback and give it back to the students in time?

Approach

Data can be sorted into two classes (Jin 2016):

Actionable Feedback

Contains an executable suggestion

"You should try doing this"

Negative tone that suggests improvement

"No risk assessment plan"

Feedback that suggests improvement.

Descriptive Feedback

Detailing what has been done

"You have done X, Y and Z well"

Compliments

"Good job dude! Great work!"

Anything else.

But we still have too much data to classify by hand!

Solution?

Use a machine learning algorithm to do it for us.

What type of machine learning algorithm should we use?

Support Vector Machine

[Figure: SVM illustration. By Alisneaky, SVG version by User:Zirguezi (own work), CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0), via Wikimedia Commons]

Found in a previous report to be the most effective for this problem (Jin 2016)
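To make the classification step concrete, here is a minimal sketch of an SVM text classifier in Python. The project itself reuses an existing KNIME SVM workflow (Jin 2016); this scikit-learn version only illustrates the same idea, and the file names and the "rationale"/"label" columns are assumptions.

# Minimal sketch only: the actual project runs a KNIME SVM workflow
# (Jin 2016). File names and column names here are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hand-labelled rationale responses form the training set.
train = pd.read_excel("training_set.xlsx")  # columns: rationale, label

# TF-IDF turns free-text rationales into feature vectors; a linear SVM
# then learns to separate Actionable from Descriptive responses.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train["rationale"], train["label"])

# Predict the class of a new, unlabelled response.
print(model.predict(["You should try adding a risk assessment plan."]))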

This year in TechLauncher,

Using Tag Reports

Students select tags that match deliverables,

Examples: Collecting user feedback, Collecting evidence, Requirements managed, Prioritising work

They also leave a rationale for their tags,

My focus is the rationale responses.

A large number of responses:

473 reports * 5 responses each = 2365 responses to sort fortnightly

Tools Used

Utilising the SVM workflow created last year (Jin 2016)

Why continue with this?

Open-source. Easy to use. Sufficient performance.

Data Storage

Why .xlsx over a database?

The output of the tag report forms is .xlsx. It can be used directly as an input, and it gives less technical people (tutors) a GUI.

Data Manipulation

Why Python?

Easy to set up. Easy to use. Easy to document.

Results

How did this go?

Training sets were created based on a random selection of raw data,

Data was sorted and put into Excel (training sets and raw data),

KNIME produced a new spreadsheet as output, containing the predictions,

The output was then matched with the initial tag report,

End result: Tag report with predictions.
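As a sketch of that matching step, assuming the two spreadsheets share a response ID column (the file and column names here are hypothetical):

# Hypothetical sketch of matching KNIME's prediction output back onto
# the original tag report; file and column names are assumptions.
import pandas as pd

tag_report = pd.read_excel("tag_report.xlsx")
predictions = pd.read_excel("knime_output.xlsx")  # KNIME's output spreadsheet

# Join each prediction onto its original tag report row.
result = tag_report.merge(predictions[["response_id", "prediction"]],
                          on="response_id", how="left")
result.to_excel("tag_report_with_predictions.xlsx", index=False)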

How effective was the SVM?

Using a Confusion Matrix

n = total sample size | Predicted: True | Predicted: False
Actual: True | True Positives (TP) | False Negatives (FN)
Actual: False | False Positives (FP) | True Negatives (TN)

Adapting this to TechLauncher data

Suggested Confusion Matrix for Peer Assessment Responses (n = number of peer assessment responses)
 | Predicted: Actionable | Predicted: Descriptive
Actual: Actionable | Actionable correctly predicted | Actionable predicted as Descriptive
Actual: Descriptive | Descriptive predicted as Actionable | Descriptive correctly predicted

Actual Confusion Matrix for Peer Assessment Responses (n = 2110)
 | Predicted: Actionable | Predicted: Descriptive
Actual: Actionable | 527 | 108
Actual: Descriptive | 118 | 1357

Metrics for Confusion Matrix performance

Accuracy: how often did the classifier predict correctly? Accuracy = (TP + TN) / total sample size

Precision: what proportion of positive predictions were correct? Precision = TP / (TP + FP)

Recall: what proportion of actual positives were predicted correctly? Recall = TP / (TP + FN)
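Plugging the actual confusion matrix from above into these three formulas reproduces the figures on the next slide; a quick worked check in Python, treating Actionable as the positive class:

# Worked check using the confusion matrix reported above
# (Actionable is the positive class).
TP, FN, FP, TN = 527, 108, 118, 1357
n = TP + FN + FP + TN                 # 2110 responses

accuracy = (TP + TN) / n              # ~0.89
precision = TP / (TP + FP)            # ~0.82
recall = TP / (TP + FN)               # ~0.83
print(accuracy, precision, recall)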

Performance

Accuracy ~89%, Precision ~82%, Recall ~83%

The classifier is relatively good.

The models may need to be improved slightly before being used for grading.

Conclusions

Where do we go from here?

Students will receive an extra field in their Peer Assessment feedback,

The AI Feedback field will state whether each piece of feedback is actionable or descriptive,

Students will be given a form to check whether the AI has classified feedback correctly,

Students' corrections will help improve the AI's performance.
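One way this correction loop could look in code, as a minimal sketch (the correction form's actual format and the file names are assumptions): student-verified labels are appended to the training set before the next training run.

# Minimal sketch of the correction loop; file and column names are
# hypothetical, and the real platform's mechanism may differ.
import pandas as pd

train = pd.read_excel("training_set.xlsx")
corrections = pd.read_excel("student_corrections.xlsx")  # rationale, label

# Student-verified labels become extra ground truth for the next run
# of the classifier (e.g. re-running the KNIME workflow).
train = pd.concat([train, corrections], ignore_index=True)
train.to_excel("training_set.xlsx", index=False)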

Other uses

Giving statistics on types of feedback given

A student would be able to see how much good feedback they have given,

Making it part of the marking scheme.

More iterations are needed at this stage.

Summary
Background: TechLauncher is really big and hard to mark. How do we mark it in time?
Task: Classify the feedback given as Actionable & Descriptive.
Approach: Using a machine learning algorithm: a Support Vector Machine.
Results: Performance metrics are relatively good.
Conclusions: Students will help correct classifications; statistics from predictions in future; more iterations needed to improve performance.

Questions?

References
Jin, Zi (2016). Using Peer Assessment Data to Help Improve Teaching and Learning Outcomes. Bachelor of Advanced Computing (Honours) thesis, The Australian National University.