An introduction to the AI Tutor project: ongoing research on big data and artificial intelligence in education
Dr. Baoping Li

Introduction of the ICT Center in China
The ICT Center of China focuses on research and practice in two areas:
- integrating ICTs into teaching and learning
- big data mining and AI in education
A learning platform named Smart Learning Partner (SLP) was developed to support this research and practice.

AI Tutor Project

Vision: Learning Assistant first, then Learning Partner, and finally AI Tutor
To build a comprehensive simulation of the knowledge, emotion, cognition and social networks of young children and teenagers, so as to provide an "intelligent tutor" service with natural language interaction. This is to be achieved by collecting data and understanding both the general rules and the individual characteristics of young people's development.

Vision: to found a future school with organizational innovation
- To found an "Internet+"-supported future school to explore organizational innovation;
- To integrate the AI tutor's online teaching service with offline teaching to achieve personalized education;
- To promote project-based, exploratory, cross-disciplinary learning and develop students' innovative spirit and hands-on skills;
- To provide personalized public services via the Internet.

Vision: Online and offline school environment
An open, mobile, social and distributed environment, connected to the smart cognitive network and to a personalized development space. This ecological environment is not a fragmented learning space but a network connected to the global community. Learning is not limited to the classroom and the school; it is a lifelong, all-round and on-demand practice.

Backgrounds & Aims
- Artificial intelligence is moving from science fiction into everyday life and continues to influence industries such as consumer electronics, e-commerce, media, transportation, and healthcare. Education is the next opportunity!
- The Chinese government has also announced its intention to prioritize the development of AI as part of its national development plan.
- Aim: to provide an innovative platform for international research cooperation, for understanding and investigating how AI could reinvent the future of education from both teaching and learning perspectives.

Scopes
- AI-driven knowledge base construction, knowledge graph construction and ontology construction;
- AI-driven knowledge tracing, educational data mining and learning analytics;
- AI-driven learner emotion recognition and affective computing;
- AI-driven new generation of student models and adaptive learning systems;
- AI-driven automatic question generation, automatic question answering and automatic short answer grading;
- AI-driven problem-solving ability assessment;
- AI-driven prediction of student academic performance and achievement;

Scopes (continued)
- AI-driven recommender systems for student career development;
- AI-driven intelligent teaching robots and agents;
- AI-driven interactive teaching with natural language processing techniques;
- Ethics and law for AI-driven teaching and learning;
- Large-scale educational data storage, processing and transformation;
- Any other relevant AI techniques applicable to the education domain.

Supports
- Grants of up to US$50,000, depending on the project.
- AICFE will assign at least one researcher to collaborate with the grant recipients.
- We may also provide research engineers or assistants to carry out the system implementation.

Completion & Publication
- Reports
- Seminars
- Publish at least one journal paper (indexed by SCI or SSCI)
- Publish at least one top conference paper
- Patents and system prototypes are also strongly encouraged
- Duration: 1-2 years

Proposal Submission Deadline
- July 30, 2017 (1st stage)
- November 30, 2017 (2nd stage)

More Information
- Handout in your bag
- Website: http://aic-fe.bnu.edu.cn/en/
- Contact: Sylvia Gao & Victor Lu
- Email: aitutor@bnu.edu.cn

Objectives of SLP
- Data collection during the entire learning process
- Model construction for knowledge and capability
- Diagnosis and treatment of learning obstacles
- Identification and enhancement of disciplinary advantages

Data Analysis Framework
[Diagram; labels listed in the order they appear]
- Assessment, Assignment, Practice, Online learning
- Data mining: Clustering, Classification, Frequent mode, Outlier, Correlation analysis, Discriminant analysis, Comparison and summary, Trend analysis, Deviation analysis, Pattern discovery
- Education quality map, Service supplier, United front of educational resources and services
- Data collection, Datamation, Intelligent recommendation engine
- Online interaction, Works, Video recording, Classroom interaction, Mobile interaction, Wearable device, Intelligent device, Sensor network, Information system
- Coding analysis, Text analysis, Discourse analysis, Pattern recognition, Voice recognition, Image analysis, Video analysis
- Modeling, Individual diagnostic report

Research on Educational Knowledge Graph Dr. Hepeng Cheng

Educational Knowledge Graph
Objective
- To construct a knowledge graph of K-12 education
Background
- Knowledge base of the AI Tutor
Applications
- Knowledge-state-based student profiling
- Intelligent personalized recommendation of learning resources
System Output
- A knowledge graph that fuses domain expertise and artificial intelligence
- Automatic exam paper generation for given concepts
- Student profiles of knowledge states based on performance data
- Personalized educational resource recommendation based on student profiles

Task 1: Knowledge Graph Construction Data Model

Task 1: Knowledge Graph Construction
Objective
Fill in the content of the knowledge graph according to the designed data model. More specifically, this includes:
- Subject concepts and the prerequisite relations between them
- Linking subject concepts with textbooks and questions
- Linking subject concepts with learning objectives
- Linking to students and teachers
Data Sources
- Traditional teaching material: textbooks, lecture notes, curriculum standards
- Online education platform: learning logs, teacher-student interaction, forum data
- Internet data: Wikipedia data
Output
- A knowledge graph that fuses domain expertise and artificial intelligence
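To make the Task 1 data model more concrete, here is a minimal sketch of how subject concepts, prerequisite relations and question links might be stored. The slides do not show the actual schema, so the class name `KnowledgeGraph`, its fields, and the sample concepts and question id are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy container for subject concepts and their links (hypothetical schema)."""
    prerequisites: dict = field(default_factory=dict)  # concept -> set of prerequisite concepts
    questions: dict = field(default_factory=dict)      # concept -> set of question ids

    def add_prerequisite(self, concept, prerequisite):
        # Register both concepts so that every concept appears as a key.
        self.prerequisites.setdefault(concept, set()).add(prerequisite)
        self.prerequisites.setdefault(prerequisite, set())

    def link_question(self, concept, question_id):
        self.questions.setdefault(concept, set()).add(question_id)

    def all_prerequisites(self, concept):
        """Return every direct and indirect prerequisite of a concept."""
        seen = set()
        stack = list(self.prerequisites.get(concept, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.prerequisites.get(current, ()))
        return seen


# Made-up sample data for illustration only.
kg = KnowledgeGraph()
kg.add_prerequisite("fractions", "division")
kg.add_prerequisite("division", "multiplication")
kg.link_question("fractions", "Q17")
print(sorted(kg.all_prerequisites("fractions")))  # ['division', 'multiplication']
```

Links to textbooks, learning objectives, students and teachers could be added as further dictionaries in the same style.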

Task 2: Knowledge Graph Analysis
Objective
- Use a small set of questions to examine students' knowledge states over a large set of subject concepts
Subtasks
- Subset selection: find a subset of subject concepts that covers the entire set of given subject concepts
- Paper generation: how can a paper be built from given subject concepts and their related questions?
Output
- An algorithm that generates an exam paper for testing, based on given subject concepts
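One possible reading of the paper generation subtask is a set-cover problem: pick few questions so that every target concept is tested at least once. The sketch below uses the standard greedy set-cover heuristic; the slides do not say which algorithm the project uses, and the function name `generate_paper` and the sample data are hypothetical.

```python
def generate_paper(target_concepts, question_concepts):
    """Greedily pick questions until every target concept is covered.

    question_concepts maps a question id to the set of concepts it tests.
    Returns (chosen question ids, concepts that no question covers).
    """
    uncovered = set(target_concepts)
    paper = []
    while uncovered:
        # Pick the question covering the most uncovered concepts;
        # iterating in sorted order breaks ties by question id.
        best = max(sorted(question_concepts),
                   key=lambda q: len(question_concepts[q] & uncovered))
        gained = question_concepts[best] & uncovered
        if not gained:
            break  # the remaining concepts have no matching question
        paper.append(best)
        uncovered -= gained
    return paper, uncovered


# Made-up question bank for illustration only.
bank = {
    "Q1": {"fractions", "division"},
    "Q2": {"division"},
    "Q3": {"decimals", "fractions"},
    "Q4": {"percentages"},
}
targets = {"fractions", "division", "decimals", "percentages"}
print(generate_paper(targets, bank))  # (['Q1', 'Q3', 'Q4'], set())
```

The greedy heuristic does not guarantee the smallest possible paper, but it is simple and usually close; a real system would also need to balance difficulty and question types.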

Task 3: Educational Application
Application 1: Student Profiling
Objective
- Monitor and represent students' knowledge states based on the knowledge graph, students' particulars and performance data
Challenge
- Performance data is sparse: only a limited amount is available, so students' performance on subject concepts without data must be predicted
Output
- Students' profiles of knowledge states
Application 2: Smart Recommendation
Objective
- Recommend learning resources based on student profiles as well as learning objectives
Challenge
- Matching and coverage between subject concepts and questions
- Matching and coverage between subject concepts and learning resources
Output
- Recommended resources
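The profiling challenge, estimating knowledge states for concepts that have no performance data, can be illustrated with a deliberately simple baseline: borrow the scores of neighbouring concepts in the prerequisite graph. This is an assumption made for illustration, not the project's method; the function name `fill_missing_mastery`, the default value and the sample data are hypothetical.

```python
def fill_missing_mastery(observed, prerequisites, default=0.5):
    """Estimate mastery for concepts without performance data.

    observed: concept -> score in [0, 1] from real performance data
    prerequisites: concept -> list of prerequisite concepts
    An unobserved concept receives the mean score of its observed
    neighbours (its prerequisites and the concepts that depend on it),
    or `default` when no neighbour has data.
    """
    neighbours = {}
    for concept, prereqs in prerequisites.items():
        neighbours.setdefault(concept, set())
        for prereq in prereqs:
            neighbours[concept].add(prereq)
            neighbours.setdefault(prereq, set()).add(concept)

    profile = dict(observed)
    for concept, adjacent in neighbours.items():
        if concept not in observed:
            scores = [observed[n] for n in adjacent if n in observed]
            profile[concept] = sum(scores) / len(scores) if scores else default
    return profile


# Made-up student data: no performance data exists for "division".
observed = {"multiplication": 1.0, "fractions": 0.5}
prerequisites = {"fractions": ["division"], "division": ["multiplication"]}
print(fill_missing_mastery(observed, prerequisites)["division"])  # 0.75
```

A production system would more likely use a knowledge tracing model, which the project lists among its research scopes.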

System Workflow

Accomplishment and Collaboration
Accomplishment: half done with Task 1
Finished:
- Subject concept extraction (first round; may be updated iteratively in the future)
- Prerequisite relations identified manually
- Learning objectives linked with certain key subject concepts
- Subject concepts linked with several sets of exam questions
Remaining:
- Linking subject concepts with textbooks
- Linking subject concepts with more questions
- Linking subject concepts with teachers and students
Potential Collaboration
- Knowledge graph construction: share data and resources to enrich our knowledge graph content
- Knowledge graph analysis: work on certain graph analyses together
- Educational application: develop certain educational applications on top of the knowledge graph and its analysis

Automatic question generation based on a semantic network
Dr. Lishan Zhang

Common question generation techniques
- Generation based on plain text. For example:
  John bought some fruits
  (ROOT (S (NP (NNP John)) (VP (VBD bought) (NP (DT some) (NNS fruits)))))
  -> Who bought some fruits?
  -> What did John buy?
- Generation based on a semantic network or ontology for a specific domain (we adopt this methodology).
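The plain-text example (John bought some fruits) can be sketched in a few lines: parse the bracketed tree, pull out the subject, verb and object, and fill two question templates. This only handles the simple NP-VBD-NP shape shown on the slide; the function names and the `LEMMAS` lookup table are hypothetical stand-ins for a real parser and lemmatizer.

```python
# Hypothetical lookup table; a real system would use a lemmatizer.
LEMMAS = {"bought": "buy"}


def parse_tree(text):
    """Parse a bracketed constituency tree into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        label = tokens[pos + 1]  # tokens[pos] is the opening bracket
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)
                children.append(child)
            else:
                children.append(tokens[pos])
                pos += 1
        return (label, children), pos + 1

    tree, _ = read(0)
    return tree


def words(node):
    """Collect the leaf words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1] for w in words(child)]


def questions_from_tree(tree):
    """Generate two wh-questions from a ROOT -> S -> NP VP(VBD NP) tree."""
    sentence = tree[1][0]
    subject_np, verb_phrase = sentence[1]
    verb_node, object_np = verb_phrase[1]
    subject = " ".join(words(subject_np))
    verb = " ".join(words(verb_node))
    obj = " ".join(words(object_np))
    return [f"Who {verb} {obj}?", f"What did {subject} {LEMMAS[verb]}?"]


tree = parse_tree("(ROOT (S (NP (NNP John)) (VP (VBD bought) (NP (DT some) (NNS fruits)))))")
print(questions_from_tree(tree))
# ['Who bought some fruits?', 'What did John buy?']
```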

The workflow for question generation
[Workflow diagram] Knowledge base (semantic network) -> Construct question patterns -> Generate questions in natural language -> Evaluate and improve the question patterns by looking at the generated questions. Feedback loop: evaluate and improve the question patterns, and improve the standards of the knowledge base.

The domains
Chinese reading comprehension
- Expository text reading specifically
- Aims to improve students' understanding of the text
Photosynthesis in Biology
- Aims to help teachers generate shallow questions
- Aims to assess students' understanding of basic concepts

Question generation for expository text reading comprehension
Text coding schema:
- The object of the expository text is classified into two types.
- The way the object is described is classified into ten types.
The plain text is transformed into a semantic network.

Question generation in photosynthesis
- Each concept is classified as a process or an instance.
- The knowledge in this domain is transformed into a semantic network.
- The aim is to generate questions such as:
  Where does photosynthesis take place?
  What does photosynthesis produce?
  How is light used in photosynthesis?
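Pattern-based generation from a semantic network can be illustrated with a toy example: each relation type in the network is paired with a question pattern, and the objects of matching triples become the expected answers. The project itself uses OWL, Jena and Java; the plain-Python triples, relation names and patterns below are hypothetical simplifications.

```python
# Hypothetical toy semantic network: (subject, relation, object) triples.
TRIPLES = [
    ("photosynthesis", "takes_place_in", "chloroplasts"),
    ("photosynthesis", "produces", "glucose"),
    ("photosynthesis", "produces", "oxygen"),
    ("photosynthesis", "uses", "light"),
]

# Hypothetical question patterns, one per relation type.
PATTERNS = {
    "takes_place_in": "Where does {subject} take place?",
    "produces": "What does {subject} produce?",
}


def generate_questions(triples, patterns):
    """Return a dict mapping each generated question to its list of answers."""
    answers = {}
    for subject, relation, obj in triples:
        if relation in patterns:  # relations without a pattern are skipped
            question = patterns[relation].format(subject=subject)
            answers.setdefault(question, []).append(obj)
    return answers


for question, expected in generate_questions(TRIPLES, PATTERNS).items():
    print(question, expected)
# Where does photosynthesis take place? ['chloroplasts']
# What does photosynthesis produce? ['glucose', 'oxygen']
```

Because the answers come from the same triples as the questions, the output can feed directly into an automatic grader.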

Technologies being used
- The OWL standard is adopted to describe the semantic network.
- The Jena API, together with SPARQL, is used to access the coded semantic network.
- The generation program is being implemented in Java.
- The program traverses the semantic network recursively to find all the relations that fit a question pattern.
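The recursive access to the semantic network can be sketched as a depth-first traversal that collects every edge of a given relation type reachable from a starting concept. The real program is written in Java on top of Jena; this Python version, its adjacency-list format and the sample network are assumptions made only to show the idea.

```python
def find_relations(network, start, relation, visited=None):
    """Recursively collect every (source, target) pair linked by `relation`
    that is reachable from `start`.

    network maps a node to a list of (relation, target) edges.
    """
    if visited is None:
        visited = set()
    if start in visited:
        return []  # guard against cycles in the network
    visited.add(start)
    matches = []
    for edge_relation, target in network.get(start, []):
        if edge_relation == relation:
            matches.append((start, target))
        matches.extend(find_relations(network, target, relation, visited))
    return matches


# Hypothetical fragment of a photosynthesis network.
network = {
    "photosynthesis": [("has_stage", "light reactions"), ("has_stage", "Calvin cycle")],
    "light reactions": [("produces", "ATP"), ("produces", "oxygen")],
    "Calvin cycle": [("produces", "glucose")],
}
print(find_relations(network, "photosynthesis", "produces"))
# [('light reactions', 'ATP'), ('light reactions', 'oxygen'), ('Calvin cycle', 'glucose')]
```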

Connection with the auto-grading component
Both a question and its answer can be generated from the semantic network, which facilitates the grading of student answers.
[Diagram] Semantic network -> Generation engine -> generates Questions and Key words in the correct answer; the Auto-grading engine compares the key words with the Student answer -> Graded answer
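The comparison step between the key words of the correct answer and the student answer might look like the following sketch. The slides do not describe the actual grading rule, so the scoring formula (fraction of key words found) and the function name `keyword_score` are assumptions.

```python
import re


def keyword_score(student_answer, keywords):
    """Return the fraction of expected key words found in the student's answer."""
    answer_words = set(re.findall(r"\w+", student_answer.lower()))
    if not keywords:
        return 0.0
    found = [k for k in keywords if k.lower() in answer_words]
    return len(found) / len(keywords)


# Key words would come from the semantic network, e.g. for
# "What does photosynthesis produce?"
keywords = ["glucose", "oxygen"]
print(keyword_score("It makes glucose and oxygen.", keywords))  # 1.0
print(keyword_score("It produces oxygen.", keywords))           # 0.5
```

Exact word matching is brittle against synonyms and spelling errors.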

Expected results
Once a teacher has authored a semantic network, our system can automatically generate questions, auto-grade students' answers, assess students' competence, give feedback to students, and adaptively select the next question.

Automated Assessment System for Short Answer Questions
Dr. Xi Yang

We Need Automatic Grading
1. Assessing students' acquired knowledge is one of the key aspects of a teacher's job. Assessments are important for teachers because they provide insight into how effective their teaching has been. However, assessment is a monotonous, repetitive and time-consuming job, often seen as an unrewarding overhead.
2. Open-ended questions that seek students' constructed responses are commonly used in educational institutions. They reveal students' ability to integrate, synthesize, design, and communicate their ideas in natural language.
3. With the growth of e-learning, MOOCs and online testing, automatic grading has attracted increasing discussion.

Breakthrough on Reading Comprehension
- Different question types target different levels of cognitive skill.
- Different question types correspond to different levels of openness.
- There is little research on the automatic grading of reading comprehension.

Datasets
Collecting data is a significant part of our research.
- Chinese data: two experienced teachers labeled the Chinese answers independently and then reached agreement on the final labels.
- English data: we selected five datasets from the Kaggle Automated Student Assessment Prize: Short Answer Scoring (ASAP-SAS), based on the definition of reading comprehension.

Data Overview
Problem      Avg. words   Total samples   Score level   Language
CRCC1        39           2579            0-4           Chinese
CRCC2        33           2571            0-2           Chinese
CRCC3        26           2382            0-3           Chinese
CRCC4        27           2458            0-4           Chinese
CRCC5        31           2538            0-3           Chinese
ASAP-SAS3    47           2297            0-2           English
ASAP-SAS4    40           2033            0-2           English
ASAP-SAS7    41           2398            0-2           English
ASAP-SAS8    52           2398            0-2           English
ASAP-SAS9    49           2397            0-2           English

Algorithm
Input -> Segmentation -> Embedding -> LSTM -> Grading

Answer Preprocessing & Word Embedding
[Diagram labels] Word Segmentation; CBOW model; Answer Text -> Self Vector; Wikipedia Corpus (External Knowledge) -> Wiki Vector; Knowledge Adaptation Vector
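For readers unfamiliar with CBOW (continuous bag-of-words), the model learns word vectors by predicting each word from its surrounding context words. The sketch below only shows how the (context, target) training pairs are formed from a segmented answer; the window size, the function name `cbow_pairs` and the sample tokens are illustrative assumptions, and the neural network that consumes the pairs is omitted.

```python
def cbow_pairs(tokens, window=2):
    """Build (context words, target word) training pairs for a CBOW model."""
    pairs = []
    for i, target in enumerate(tokens):
        # Context = up to `window` words on each side of the target.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


# A segmented answer (made-up example) with a window of one word per side.
for context, target in cbow_pairs(["plants", "use", "light", "energy"], window=1):
    print(context, "->", target)
# ['use'] -> plants
# ['plants', 'light'] -> use
# ['use', 'energy'] -> light
# ['light'] -> energy
```

Training on a large external corpus such as Wikipedia and then continuing training on the student answers is one common way to transfer external knowledge into the embeddings.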

LSTM Extracts Semantic Information
- Standard bag-of-words model: ignores the word order, so "Who I am" and "Who am I" look the same.
- LSTM-based deep sequence model: considers the word order and processes the word sequence.
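The "Who I am" versus "Who am I" contrast is easy to verify: a bag-of-words representation is identical for both sentences, while an ordered sequence, which is what an LSTM consumes, is not. The helper names below are hypothetical.

```python
from collections import Counter


def bag_of_words(sentence):
    """Word counts only; the order of the words is discarded."""
    return Counter(sentence.lower().split())


def word_sequence(sentence):
    """Ordered tuple of words, as a sequence model would read them."""
    return tuple(sentence.lower().split())


a, b = "Who I am", "Who am I"
print(bag_of_words(a) == bag_of_words(b))    # True: the two are indistinguishable
print(word_sequence(a) == word_sequence(b))  # False: the order differs
```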

Experiment and Conclusion
- Ten datasets (5 Chinese and 5 English)
- Two baselines: 1. Logistic Regression 2. Support Vector Machine
- Evaluation: accuracy
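Accuracy here means the share of answers whose predicted score level equals the teacher-assigned one. A minimal sketch, with a hypothetical function name and made-up scores:

```python
def accuracy(predicted, gold):
    """Fraction of predictions that exactly match the gold score levels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Made-up score levels for four answers.
print(accuracy([2, 0, 1, 2], [2, 1, 1, 2]))  # 0.75
```

Accuracy treats every wrong score level as equally wrong: predicting 0 for an answer worth 4 counts the same as predicting 3.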

Results on Accuracy
Dataset        DJDT     HSVM     LSTM+self   LSTM+web corpus   LSTM+KA
CRCC1          0.5106   0.5482   0.5979      0.5805            0.6134
CRCC2          0.6036   0.6585   0.7487      0.7242            0.7379
CRCC3          0.8564   0.8862   0.6511      0.8061            0.8229
CRCC4          0.5020   0.5867   0.5533      0.5725            0.5911
CRCC5          0.7574   0.7660   0.6942      0.7443            0.7738
ASAP-SAS3      0.4789   0.4698   0.4806      0.4885            0.4898
ASAP-SAS4      0.6385   0.7358   0.4550      0.7688            0.7742
ASAP-SAS7      0.6343   0.6626   0.5605      0.6684            0.6493
ASAP-SAS8      0.5234   0.5988   0.5868      0.6222            0.6322
ASAP-SAS9      0.6237   0.6442   0.6834      0.6458            0.6926
Avg accuracy   0.6126   0.6557   0.6011      0.6623            0.6748

Analysis
- HSVM is the better of the two baselines for automatic reading comprehension grading. Statistical machine learning models still work in some reading comprehension grading tasks without rubrics.
- Pretrained word vectors alone are of limited benefit to the LSTM approaches. In addition, the quality of the trained vectors depends on the volume of data, so word embeddings trained only on the student answers may not perform better.
- More importantly, the experimental results show that transferring external knowledge into the word embeddings through knowledge adaptation can help improve the performance of the model.

Conclusion
- We propose a deep-learning-based method for automatic Chinese reading comprehension grading.
- Our method does not rely on any target answer, because a target answer is not always available for open-ended reading comprehension questions.
- In our framework, CBOW and LSTM are combined to extract semantic information automatically and to take effective account of the word order in student responses.
- Additionally, through knowledge adaptation, external knowledge is transferred to the present corpus by means of fine-tuning.
- Experiments on ten datasets demonstrate the performance improvement gained by introducing external knowledge.

THANK YOU!