A SURVEY ON EDUCATIONAL DATA MINING AND RESEARCH TRENDS


KAAV INTERNATIONAL JOURNAL OF SCIENCE, ENGINEERING & TECHNOLOGY
A Refereed Blind Peer Review Quarterly Journal
KIJSET/JUL-SEP (2017)/VOL-4/ISS-3/A15, Page No. 84-89, ISSN: 2348-5477, Impact Factor (2017) 6.9101
www.kaavpublications.org

MEHTA SMRUTI HEMANT KUMAR
Research Scholar, Pacific Academy of Higher Education and Research University, Udaipur

DR. NIMESH I. MODI
I/C H.O.D., Hemchandracharya North Gujarat University, Patan

Abstract

Data mining extracts information from large amounts of data. The goal of an educational organization is to give quality education to its students. One way to achieve the highest level of quality in the higher education system is to predict students' performance. This paper surveys the data mining techniques available for this purpose, including classification, clustering, association rules and prediction.

Keywords: Data Mining, Educational Data Mining, Knowledge Discovery from Data (KDD), Data Mining Techniques

1. Introduction

Higher educational institutions face very strong competition, and each aims to gain an advantage over its competitors. To achieve this, an institution has to reach the highest level of quality and satisfy its customers. One way to reach the highest level of quality in the higher education system is to predict accurately the success of students in the institution. To achieve this, a prediction model is built using any one of several approaches.

Students and professors are the valuable assets of these institutions. To remain competitive in the educational domain, institutions need knowledge that supports better assessment, evaluation, planning and decision making. This knowledge can be acquired from the historical and operational data that reside in the database of the educational institution.
Data mining techniques can be used for this purpose, to extract knowledge from large data sets. Data mining can be applied in areas such as finance, banking, telecommunication, industry, medicine, education, marketing, surveillance, fraud detection, statistical analysis, engineering and sales. Data mining is the process of extracting previously unknown, valid, potentially useful and hidden patterns from large data sets (Connolly, 1999) [1]. The amount of data stored in educational databases is increasing rapidly. Efficient data mining techniques are required in order to benefit from such large volumes of data and to find hidden relationships between variables (Han and Kamber, 2006) [4].

Data mining is sometimes also called Knowledge Discovery in Databases (KDD). The primary goal of data mining is to uncover hidden information. Knowledge Discovery in Databases refers to the overall process of extracting useful information from large data sets, where the data are stored in databases, data warehouses or other information repositories. The process interacts with the user or with a knowledge base. Knowledge Discovery in Databases is used to find new knowledge in a database, which is then used in the decision-making process.

1.1 Steps of KDD

The knowledge discovery process is depicted in the following figure.

Figure 1: Steps of KDD

KDD is an iterative sequence of the following steps [12]:

1. Develop an understanding of the application domain and identify the goal.
2. Create a target dataset: select a dataset, or focus on a subset of samples or variables, on which to make discoveries.
3. Data cleaning and preprocessing: remove noise and outliers; collect the information necessary to model or account for noise; handle missing data; account for time-sequence information.
4. Data reduction and projection: find useful features to represent the data relative to the goal; apply dimensionality reduction or transformation to reduce the number of variables; identify invariant representations.
5. Selection of the appropriate data mining task: summarization, classification, regression, clustering, etc.
6. Selection of the data mining algorithm(s): choose methods to search for patterns; decide which models and parameters may be appropriate; match the method to the goal of the KDD process.
7. Data mining: search for patterns of interest in one or more representational forms.
8. Interpretation and visualization: interpret the mined patterns; visualize the extracted patterns and models; visualize the data given the extracted models.
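The KDD steps listed above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the student records, the field names (`attendance`, `test_mark`, `passed`) and the helper functions are hypothetical and do not come from the surveyed papers, and the "mining" step is reduced to a simple summarization task.

```python
from statistics import mean

# Hypothetical raw records; field names and values are illustrative only.
RAW_RECORDS = [
    {"id": 1, "attendance": 92, "test_mark": 78, "hostel": "A", "passed": True},
    {"id": 2, "attendance": 55, "test_mark": 41, "hostel": "B", "passed": False},
    {"id": 3, "attendance": None, "test_mark": 66, "hostel": "A", "passed": True},
    {"id": 4, "attendance": 88, "test_mark": 950, "hostel": "C", "passed": True},
    {"id": 5, "attendance": 61, "test_mark": 48, "hostel": "B", "passed": False},
    {"id": 6, "attendance": 97, "test_mark": 85, "hostel": "C", "passed": True},
]


def select_target(records, fields):
    """Step 2: keep only the variables relevant to the goal."""
    return [{f: r[f] for f in fields} for r in records]


def clean(records):
    """Step 3: drop records with missing values or marks outside 0-100."""
    return [
        r for r in records
        if all(v is not None for v in r.values()) and 0 <= r["test_mark"] <= 100
    ]


def reduce_features(records):
    """Step 4: project two numeric attributes onto a single score."""
    return [
        {"score": (r["attendance"] + r["test_mark"]) / 2, "passed": r["passed"]}
        for r in records
    ]


def mine(records):
    """Steps 5-7: summarization task, mean score per outcome class."""
    groups = {}
    for r in records:
        groups.setdefault(r["passed"], []).append(r["score"])
    return {outcome: mean(scores) for outcome, scores in groups.items()}


def run_kdd(records):
    """Chain steps 2-7; step 1 (the goal) is fixed by the chosen fields."""
    target = select_target(records, ["attendance", "test_mark", "passed"])
    return mine(reduce_features(clean(target)))


if __name__ == "__main__":
    # Step 8: interpretation, here just a printed summary.
    for outcome, avg in run_kdd(RAW_RECORDS).items():
        print(f"passed={outcome}: mean score {avg:.1f}")
```

In practice each step would be revisited several times, since the process is iterative; the linear chain in `run_kdd` is a simplification.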

Data mining includes fitting models to, or determining patterns from, observed data. The fitted models play the role of inferred knowledge. Deciding whether or not a model reflects useful knowledge is part of the overall KDD process, for which subjective human judgment is usually required. Data mining consists of five major elements [11]:

1) Extract, convert and load transaction data into the data warehouse system.
2) Store and manage the data in a multidimensional database system.
3) Provide access to the data to information technology professionals and business analysts.
4) Analyze the data using application software.
5) Present the data in a useful form, such as a table or graph.

2. Educational data mining

Baker and Yacef (2009) [5] define educational data mining as follows: "Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in."

During the EDM process, the raw data coming from an educational system are converted into useful information that can have a substantial impact on educational practice. In recent years EDM has become an active research area for researchers all over the world. EDM is the process of developing techniques and methods, such as prediction, clustering, classification and association rule mining, for exploring the different types of data held in educational databases, and of using those methods to understand students. The main application areas of educational data mining are predicting student performance, enrolment management, grouping students, student profiling, planning and scheduling, user modeling, and detecting cheating in online examinations. The main objective of any higher educational system is to improve the quality of education, and data mining techniques can be used to accomplish this goal.
Educational data mining offers several advantages to the higher educational system: decreasing the student drop-out rate; increasing the student promotion, retention and transition rates; increasing the educational improvement ratio; improving student learning outcomes; maximizing the efficiency of the educational system; and reducing the cost of system processes. To achieve these goals, a data mining system can provide insights to decision makers in the higher educational system.

The higher education system involves different groups of users or participants, each of which views education-related information according to its own mission, vision and objectives. The users/stakeholders of higher education can be classified as follows [16]:

1. Learners / Students: to personalize e-learning, to recommend activities to learners, to provide learning tasks that could further improve their learning, and to suggest interesting learning experiences to the students.
2. Educators / Teachers / Instructors: to detect which students require support, to predict student performance, to classify learners into groups, to find a learner's regular as well as irregular patterns, to find the most frequently made mistakes, and to analyze students' learning and behavior.
3. Course Developers / Educational Researchers: to compare data mining techniques in order to recommend the most useful one for each task, and to develop specific data mining tools for educational purposes.
4. System Administrators / Network Administrators: to utilize available resources more effectively, to enhance educational program offers, and to determine the effectiveness of the distance learning approach.

3. Data Mining Techniques

Data mining techniques are used to process large amounts of data in order to discover hidden patterns and relationships. These patterns are helpful in decision making.
Data mining techniques include algorithms for classification, regression, association, prediction, clustering and time series analysis. These techniques are used for knowledge discovery from databases.

3.1 Classification

Classification is a classic data mining technique based on machine learning. A classification technique maps data into a set of predefined classes described by a model [10]. Classification uses decision trees, neural networks and classification rules (IF-THEN). For example, classification rules can be applied to the past records of students who left for university, in order to evaluate them.

3.2 Clustering

A cluster is a collection of similar data objects; dissimilar objects belong to other clusters. Clustering is a way of finding similarities between data according to their characteristics. The technique is based on unsupervised learning (i.e. the desired output for a given input is not known). Example applications include image processing, pattern recognition and city planning [14].

3.3 Prediction

Prediction techniques discover the relationship between one or more independent variables and the dependent variables [8]. In data mining, the independent variables are attributes that are already known, and the dependent (response) variables are what we want to predict. A prediction model is based on continuous or ordered values.

3.4 Regression

Regression is used to map a data item to a real-valued prediction variable [2]. In other words, regression can be adapted for prediction. In regression techniques the target values are known. For example, a child's behavior can be predicted from family history.

3.5 Time Series Analysis

Time series analysis is the process of using statistical techniques to model and explain a time-dependent series of data points. Time series forecasting is a method of using a model to generate predictions (forecasts) of future events based on known past events. A typical example is stock market data.

3.6 Association Rule

Association rule mining is a technique for identifying specific relationships among data.
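To make the idea of an association rule concrete, the following minimal sketch scores rules of the form A -> C over a handful of transactions. It rests on assumptions not found in the source: the transactions (sets of courses a student failed) are invented, and rule quality is measured with the standard support and confidence measures, which the surveyed text does not define. The function names are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions: the set of courses each student failed.
TRANSACTIONS = [
    {"maths", "physics"},
    {"maths", "physics", "chemistry"},
    {"maths"},
    {"physics", "chemistry"},
    {"maths", "physics"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def pair_rules(transactions, min_support=0.3, min_confidence=0.7):
    """Enumerate single-item rules A -> C that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, c in combinations(items, 2):
        # Each unordered pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, c), (c, a)):
            s = support({lhs, rhs}, transactions)
            if s < min_support:
                continue
            conf = confidence({lhs}, {rhs}, transactions)
            if conf >= min_confidence:
                rules.append((lhs, rhs, s, conf))
    return rules


if __name__ == "__main__":
    for lhs, rhs, s, conf in pair_rules(TRANSACTIONS):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={conf:.2f}")
```

Enumerating every pair is only workable for a few items; practical miners such as Apriori prune candidate itemsets by support before generating rules.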
Association rule mining is useful for identifying students' failure patterns [6], parameters related to the admission process, migration, the contribution of alumni, student assessment and the correlation between different groups of students, and for guiding a search for a better-fitting transfer model of student learning [3].

3.7 Sequence Discovery

Sequence discovery uncovers relationships among data [2]. The input is a set of objects, each associated with its own timeline of events. Examples include scientific experiments, natural disasters and the analysis of DNA sequences.

4. Literature Survey

Brijesh Kumar Baradwaj and Saurabh Pal [7] set out to demonstrate the capabilities of data mining techniques in the context of higher education by offering a data mining model for the higher education system in the university. In their research, a classification task is used to evaluate student performance; of the many approaches available for data classification, they use the decision tree method. Information such as attendance, class test, seminar and assignment marks was collected from the students' previous database in order to predict performance at the end of the semester. The study also helps to identify students who need special attention, so as to reduce the fail ratio and take appropriate action before the next semester examination.

Pooja Gulati and Dr. Archana Sharma [9] highlight how the quality of education can be improved with the help of educational data mining. Educational systems currently face a number of issues, and data mining provides a set of techniques that can help them overcome these issues and enhance the quality of education. One significant fact in higher learning institutions is the explosive growth of educational data, which are increasing rapidly without any benefit to management. The main objective of any higher educational institution is to improve the quality of managerial decisions and to impart quality education.
Predicting students' success in a higher learning institution is one way to reach the highest level of quality in the higher education system.

Komal S. Sahedani and Prof. B Supriya Reddy [12] state that the goal of institutions is to give quality education to their students. One way to achieve the highest level of quality in the higher education system is to discover knowledge that supports prediction of the enrolment of students in a particular course. In their data-driven data mining model, the knowledge exists in the data but is not in a form that humans can understand. Educational data mining is an emerging discipline that focuses on applying data mining tools and techniques to educationally related data. Their paper focuses exclusively on ways in which data mining is used to improve student success and the processes directly related to student learning.

Dr. K M Alaskar, Prof. Prashant G. Tandale and Prof. A A Basade [13] discuss one of the biggest challenges that the higher education system faces today: improving the quality of managerial decisions. One way to meet this challenge is to provide the managerial system with new knowledge about educational processes and entities. This knowledge can be extracted, using data mining techniques, from the historical and operational data that reside in the educational organization's databases. Their paper uses data mining to analyze student feedback on the curriculum. The authors collected the data from students by questionnaire in order to find relationships between behavioral factors of students. The results indicate that data mining techniques provide effective tools for student feedback analysis, and show how data mining can be used in higher education to predict students' acceptance of the curriculum and of changes to it.

S. Lakshmi Prabha and Dr. A. R. Mohamed Shanavas [15] present broad areas in which educational data mining can be applied to e-learning. The application areas discussed in their paper are user modeling, user grouping or profiling, domain modeling and trend analysis.
Their experiment is carried out on 6th grade student logs collected from MathsTutor for mensuration. Identifying the knowledge level of students and grouping them accordingly makes it easier for the teacher to concentrate on the areas in which weak students need help.

Smita and Priti Sharma [14] describe data mining as extracting knowledge from large amounts of data stored in various databases. They survey various data mining techniques, including classification, association, correlation, clustering and neural networks. Their paper also reviews applications of data mining in the education sector, marketing, fraud detection, manufacturing and telecommunication. The main objective of data mining techniques is to discover knowledge from active data.

D. Fatima, Dr. Sameen Fatima and Dr. A. V. Krishna Prasad [17] study the application of data mining to the analysis of data generated by various information systems that support learning or education. They also deal with EDM applications that have an actual impact on the future of learning and teaching. The EDM applications discussed in their paper include improving student models, discovering or improving models of the knowledge structure of the domain, studying the pedagogical support provided by learning software, and scientific discovery about learning and learners.

Hardeep Kaur [18] discusses various data mining techniques such as classification, clustering and association rule mining. Each technique has its own importance according to its role. Data mining has applications in many fields, including education, science and engineering, healthcare and business. The paper covers the basics of educational data mining and focuses mainly on the applications of data mining in the field of education.
The applications of data mining in the education sector include analysis and visualization of data, predicting student performance, enrolment management, grouping students, student profiling, planning and scheduling, user modeling, organization of the syllabus, and detecting cheating in online examinations.

Sen and Umesh Kumar [16] emphasize different learning settings: the offline (traditional) educational system, web mining/e-learning, and intelligent tutoring systems. By adopting all of these, students and institutions can enhance and enrich the acquisition of knowledge in the academic curriculum. To apply educational data mining effectively, they use data mining tools and techniques such as classification, association rules, clustering and decision trees. Their paper is a review of the state of the art in EDM, and the study is intended to help students, teachers and institutions to enhance performance and productivity effectively. In their paper, different classification methods are used to predict the performance of students.

5. Conclusion

This paper has discussed data mining, educational data mining and data mining techniques. The main goal of any institution is to improve the quality of education, and data mining techniques can be used for this purpose. Data mining methods are useful for understanding students' behavior and measuring their performance.

References

[1] Connolly, T., Begg, C. and Strachan, A. (1999), Database Systems: A Practical Approach to Design, Implementation, and Management (3rd Ed.), Harlow: Addison-Wesley, 687.
[2] Dunham, M. H., Data Mining: Introductory and Advanced Topics, Prentice Hall, 2002.
[3] Freyberger, J., Heffernan, N. and Ruiz, C. (2004), Using association rules to guide a search for best fitting transfer models of student learning, Workshop on Analyzing Student-Tutor Interactions Logs to Improve Educational Outcomes at ITS Conference.
[4] Han, J. and Kamber, M. (2006), Data Mining: Concepts and Techniques, 2nd edition, The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor.
[5] Baker, R. and Yacef, K. (2009), The State of Educational Data Mining in 2009: A Review and Future Visions, Journal of Educational Data Mining, 1(1): 3-17.
[6] Oladipupo, O. O. and Oyelade, O. J. (2009), Knowledge Discovery from Students' Result Repository: Association Rule Mining Approach, International Journal of Computer Science & Security, Vol. 4, No. 2, pp. 199-207, 2009.
[7] Brijesh Kumar Baradwaj, Saurabh Pal, Mining Educational Data to Analyze Students' Performance, IJACSA, Vol. 2, No. 6, 2011.
[8] Aakanksha Bhatnagar, Shweta P. Jadye, Madan Mohan Nagar, Data Mining Techniques & Distinct Applications: A Literature Review, International Journal of Engineering Research & Technology (IJERT), Vol. 1, Issue 9, November 2012.
[9] Pooja Gulati, Dr. Archana Sharma, Educational Data Mining for Improving Educational Quality, IJCSITS, Vol. 2, No. 3, June 2012.
[10] Rajni Jindal, Malaya Dutta Borah, A Survey on Educational Data Mining and Research Trends, IJDMS, Vol. 5, No. 3, June 2013.
[11] Nikita Jain, Vishal Srivastava, Data Mining Techniques: A Survey Paper, IJRET: International Journal of Research in Engineering and Technology, Volume 02, Issue 11, Nov. 2013.
[12] Komal S. Sahedani, Prof. B Supriya Reddy, A Review: Mining Educational Data to Forecast Failure of Engineering Students, IJARCSSE, Volume 3, Issue 12, December 2013.
[13] Dr. K M Alaskar, Prof. Prashant G. Tandale, Prof. A A Basade, Data Mining Applications in Higher Education, Proceedings of National Conference on Emerging Trends: Innovations and Challenges in IT, 19-20 April 2013.
[14] Smita, Priti Sharma, Use of Data Mining in Various Field: A Survey Paper, IOSR-JCE, Volume 16, Issue 3, pp. 18-21, May-Jun. 2014.
[15] S. Lakshmi Prabha, Dr. A. R. Mohamed Shanavas, Educational Data Mining Applications, ORAJ, Vol. 1, No. 1, August 2014.
[16] Sen, Umesh Kumar, A Brief Review Status of Educational Data Mining, IJARCST, Vol. 3, Issue 1, Jan.-Mar. 2015.
[17] D. Fatima, Dr. Sameen Fatima, Dr. A. V. Krishna Prasad, A Survey on Research Work in Educational Data Mining, IOSR-JCE, Volume 17, Issue 2, pp. 43-49, Mar.-Apr. 2015.
[18] Hardeep Kaur, A Review of Applications of Data Mining in the Field of Education, IJARCCE, Vol. 4, Issue 4, April 2015.