Analysis of students' study activities in virtual learning environments using data mining methods

Ukio Technologinis ir Ekonominis Vystymas, ISSN: 1392-8619 (Print), 1822-3613 (Online). Journal homepage: http://www.tandfonline.com/loi/tted20

To cite this article: Saulius Preidys & Leonidas Sakalauskas (2010) Analysis of students' study activities in virtual learning environments using data mining methods, Ukio Technologinis ir Ekonominis Vystymas, 16:1, 94-108

To link to this article: https://doi.org/10.3846/tede.2010.06

Published online: 21 Oct 2010.

Technological and Economic Development of Economy
Baltic Journal on Sustainability
2010, 16(1): 94-108

ANALYSIS OF STUDENTS' STUDY ACTIVITIES IN VIRTUAL LEARNING ENVIRONMENTS USING DATA MINING METHODS

Saulius Preidys 1, Leonidas Sakalauskas 2
Institute of Mathematics and Informatics, Akademijos g. 4, LT-08663 Vilnius, Lithuania
E-mail: 1 s.preidys@vvf.viko.lt; 2 sakal@ktl.mii.lt
Received 6 April 2009; accepted 11 December 2009

Abstract. This article applies data mining methods to the analysis of learners' behaviour on the distance learning platform BlackBoard Vista (BlackBoard 2008). When planning a distance learning course, instructors have to keep in mind that students study in different ways: some read the learning materials from beginning to end, some look only at the topics they find unclear, some start with the discussions, etc. Therefore, after analysing the learning factors and identifying a learner's style, it is possible to prepare individualized learning materials and to choose a suitable way of presenting the course. Organizing studies in this way would improve their quality and lead to better results. The research observed the behaviour and results of 528 students in 15 distance learning courses. Using the clustering method, 3 styles of learners using virtual learning environments (VLE) were identified, and work methods are proposed for students with regard to those styles. In addition, the research aims to find the factors that influence students' final evaluations.

Keywords: distance education, data mining, e-learning, virtual learning environments, clustering, online learners' behaviour.

Reference to this paper should be made as follows: Preidys, S.; Sakalauskas, L. 2010. Analysis of students' study activities in virtual learning environments using data mining methods, Technological and Economic Development of Economy 16(1): 94-108.

ISSN 1392-8619 print / ISSN 1822-3613 online, http://www.tede.vgtu.lt, doi: 10.3846/tede.2010.06

1. Introduction

Distance teaching and learning environments give instructors opportunities to observe students' learning behaviour. Analysing these observations makes it possible to provide adaptive feedback, customized assessment, and more personalized attention by dynamically monitoring and tracking learners' online behaviour (Hung and Zhang 2008).

Different testing systems are used to establish a learner's behaviour in the virtual learning environment. Brusilovsky distinguishes two methods for establishing students' learning behaviour: through communication and automatically (Brusilovsky 1996). With the first method, learners are examined using special questionnaires and are assigned to appropriate models according to their answers. The second method avoids questionnaires: the student's learning behaviour is studied by analysing and comparing similarities and differences between learners' behaviour. These data are then analysed, and recommendations are developed from the observed results. The latter approach yields more objective evaluations and decisions because it avoids inaccurate questionnaire answers.

Some virtual learning environments (VLE) are suitable for automatic investigation of learners' behaviour. The most popular of these are BlackBoard Vista (BlackBoard 2008) and Moodle (Moodle 2008). VLEs are widely used for presenting learning materials as well as for discussions among learners. Such a tool enables a teacher not only to present learning materials flexibly but also to let learners take part in common discussions and synchronous chats, create blogs, review video recordings of lectures, use e-mail, etc. It is also a powerful tool for tracking students' activities and interpreting the results.

2. Related work

Research in distance education has recently become more significant. Some authors relate their research to quality assurance and to improving different teaching methods (Davies et al. 2001; Daukilas et al. 2008; Targamadzė and Petrauskienė 2008). Many researchers deal with asynchronous text-based discussion in distance education.
The quality of distance education relies mainly on the design and quality of the learning material and on the quality of communication (in a broad sense) between the student, on the one hand, and the tutor and the educational institution, on the other (Patriarcheas and Xenos 2009). Some authors (Lin et al. 2009; Dringus and Ellis 2005) use content analysis to study a discussion forum. Using text mining methods, the authors identify genres of online discussion. The results of this experiment help instructors monitor the online activities that occur in discussion forums. Perhaps the most promising source of automatically gathered online learning data is the learning software itself, particularly the VLE. Pre-processing the original event data is a very important part of this work (Black et al. 2008).

Other authors use mathematical methods, e.g. statistical analysis and data mining, to create software agents for VLE (Castro et al. 2007; Mamčenko and Šileikienė 2006). The agents help to analyse student data in the asynchronous learning portion of the information system, to find plagiarism, to identify students' learning styles, and to form work groups.

Digital plagiarism is a growing problem for educators in the information era. Many researchers use data mining methods to improve current plagiarism detection software. Some authors, working with text mining methods, present a new architecture for a plagiarism detection tool that can work with many different kinds of digital submissions, from plain or formatted texts to audio podcasts (Butakov and Scherbinin 2008; Zini et al. 2006). They have developed an anti-plagiarism toolbox which searches for plagiarism in local and in global databases (the internet, by using Microsoft Live Search). This toolbox may be integrated into the open source VLE Moodle. Other scholars use visualization to find plagiarism in automated student assessments (Graven and MacKinnon 2008), or work on making plagiarism detection systems faster and more reliable (Mozgovoy et al. 2007).

Several researchers analyse students' activities in virtual learning environments and recommend to tutors ways of presenting individualized learning materials to students (Romero et al. 2008; Chen 2008). For example, Schiaffino et al. (2008) present eTeacher, an intelligent agent that provides personalized assistance to e-learning students. eTeacher observes a student's behaviour while he/she is taking online courses and automatically builds the student's profile. This profile comprises the student's learning style and information about the student's performance, such as exercises done, topics studied and exam results. In their approach, a student's learning style is automatically detected from the student's actions in an e-learning system using Bayesian networks. Sun et al. (2008) proposed a grouping method to help teachers improve group learning in e-learning by first establishing effective groups with rules based on data mining, and then facilitating student interaction using a system that monitors members' communication status.

Some authors use classification methods to establish students' learning styles. There are mainly five learning style models that are studied in the engineering science education literature.
These learning style models are the Myers-Briggs Type Indicator (MBTI), Kolb's model, the Felder and Silverman learning style model (FSLSM), the Herrmann Brain Dominance Instrument (HBDI), and the Dunn and Dunn model (Özpolat and Akar 2009). Working with the FSLSM, the authors found that the match ratio between the learning characteristics obtained with their proposed learner model and those obtained from the questionnaires traditionally used for learning style assessment is high for most dimensions of learning style. Graf and Kinshuk (2006) propose an approach to detect learning styles in VLE based on the behaviour of learners during an on-line course. They provide a practical example by extending the open-source VLE Moodle with the proposed learning style detector tool.

In this article, the authors build on previous research and on data about students' real activities accumulated in the commercial virtual learning environment Blackboard Vista, and discuss how study quality depends on students' activities in the VLE. Using the clustering method, the authors distinguished three groups of user activity, which have not been discussed earlier, and described the activities of these users in the VLE in detail.

3. Software used in research

In this research, the most popular VLE, Blackboard Vista Enterprise (BlackBoard 2008), was explored. Most of the largest Lithuanian universities and colleges use this software, as do institutions in other countries. Typically, a VLE tracks students' activities in log files. In BlackBoard Vista, however, all students' activities are accumulated in ORACLE databases, which are available to system administrators. Users and course instructors can later retrieve statistical data: records of students' access to different study resources, time durations, etc.

The new tool for data tracking, called PowerSight Kit (VistaPowerSightKit 2008), was updated in 2008. System administrators can accumulate and analyse data using this tool. They can not only observe students' work but also make suggestions to course instructors and to the research staff of an educational institution who study how distance learning courses are delivered and supported. Analysing these data yields more information than the VLE tracking tool alone.

The PowerSight Kit tool accumulates data on students' behaviour in the VLE in related tables. These hold information on modules, templates, users, their rights, students' activities, their grades, etc. In this research, 3 tables were used: student information (RPT_PERSON), their grades (RPT_EXT_GRADEBOOK) and their activities during the study period (RPT_TRACKING). The relationships are shown in Figure 1. The PowerSight Kit tool reads part of the information from the ORACLE database online. Some information is transferred to separate tables, so it is one day old.

Fig. 1. Relationships of the PowerSight Kit tool (tables shown: RPT_LEARNING_CONTEXT_SIZE, RPT_TEMPLATE, RPT_LEARNING_CONTEXT, RPT_MEMBER, RPT_TRACKING, RPT_GRADEBOOK, RPT_ADDRESS, RPT_PERSON, RPT_EXT_GRADEBOOK, RPT_EXTRACT_LOG, RPT_PHONE)

4. The process of data mining in distance learning

The application of data mining methods in distance learning consists of four steps (Fig. 2):
1. Data accumulation.
2. Preparation of the data accumulated.
3. Application of data mining methods to the selected data.
4. Interpretation and analysis of the selected data (Romero et al. 2008).

Other authors distinguish five steps of data mining in distance learning data processing, adding visualisation to those mentioned above.

Fig. 2. Steps of data mining application methods in distance learning (1. Data accumulation; 2. Preparation of the data accumulated; 3. Application of data mining methods to selected data; 4. Interpretation and analysis of the selected data)

4.1. Data accumulation

It is not advisable to work directly on the ORACLE database because of data safety and server load. When working with a database containing millions of records, even a careless SQL query can seriously slow down the server. Therefore, after evaluating the possible risks, all data necessary for this research were transferred to another database on a local server for further processing. The transfer was done with a purpose-written script in the PHP programming language. The script performs the necessary calculations and transfers the existing data so that data mining methods can be applied later. The prepared data are sent to a MySQL database and saved for subsequent analysis. In the next steps the data are encoded into formats suitable for data mining programs such as STATISTICA, WEKA or DBMINER. The data transformation steps are shown in Figure 3.

Fig. 3. Data transformation steps (stages shown: VISTA ORACLE, read-only; MySQL, full access; WEKA, STATISTICA, DB MINER; consumers: teachers, students)
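The transfer step described above can be illustrated with a minimal sketch. It is not the authors' PHP script: it is written in Python, uses two in-memory SQLite databases as stand-ins for the read-only ORACLE source and the local MySQL copy, and assumes hypothetical column names (person_id, tool, seconds). The only point it demonstrates is reading the source in bounded batches so that no single query result has to be held in memory at once.

```python
import sqlite3


def copy_tracking(source, target, batch_size=1000):
    """Copy tracking rows from a source connection to a local working
    database in batches. Table and column names are hypothetical."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS tracking_local "
        "(person_id INTEGER, tool TEXT, seconds INTEGER)"
    )
    cursor = source.execute("SELECT person_id, tool, seconds FROM rpt_tracking")
    copied = 0
    while True:
        rows = cursor.fetchmany(batch_size)  # read a bounded number of rows
        if not rows:
            break
        target.executemany("INSERT INTO tracking_local VALUES (?, ?, ?)", rows)
        copied += len(rows)
    target.commit()
    return copied


# Stand-in for the production database (assumption: three toy rows).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE rpt_tracking (person_id INTEGER, tool TEXT, seconds INTEGER)"
)
source.executemany(
    "INSERT INTO rpt_tracking VALUES (?, ?, ?)",
    [(1, "CONTENT_PAGE", 120), (1, "CHAT", 30), (2, "ASSESSMENT", 300)],
)

# Stand-in for the local working database.
target = sqlite3.connect(":memory:")
copied = copy_tracking(source, target, batch_size=2)
print(copied)
```

In a real deployment the two connections would come from the respective database drivers; the batching loop itself would stay the same.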

4.2. Preparation of selected data

Using the prepared software, a flat table was formed from the above-mentioned tables, and data mining methods were applied to it. Only the VLE tools that reveal the peculiarities of students' activities were selected. Some tools were excluded because they were used very rarely. Table 1 lists the selected indicators of students' activities according to which the learning styles and activities of students were defined.

Table 1. VLE tools indicating the peculiarities of students' behaviour

No  Indicator            Meaning
1   FINAL_GRADE          Average of all students' final grades of the course
2   CON_COUNT            Number of connections of students to the course
3   TIME                 Time spent in VLE
4   ANNOUNCEMENT         The working time of user with announcement tools
5   ASSESSMENT           The working time of user with assessment tools
6   ASSIGNMENTS          The working time of user with assignment tools
7   CALENDAR             The working time of user with calendar tools
8   CHAT                 The working time of user with chat tools
9   CONTENT_PAGE         The working time of user with the learning material
10  DISCUSSION           The working time of user with discussion tools
11  FILE_MANAGER         The working time of user with file manager tools
12  LEARNING_OBJECTIVES  The working time of user with the tools of learning objectives
13  MAIL                 The working time of user with mail tools
14  MEDIA_LIBRARY        The working time of user with media library tools: vocabulary, video, and other learning materials
15  MY_GRADES            The working time of user with My Grades tools
16  MY_WEBCT             The working time of user with MyWebct tools
17  NOTES                The working time of user with Notes tools
18  ORGANIZER            The working time of user with the contents of learning material
19  SYLLABUS             The working time of user with the syllabus of the course
20  STUDENT_BOOKMARKS    The working time of user with student bookmark tools
21  TRACKING             Tracking tools for own and other users' activities
22  WEB_LINKS            The working time of user with Web Links tools: hyperlinks to other learning material or web pages
23  WHO_IS_ONLINE        The working time of user with Who is online tools: search for online users
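How per-event tracking records might be folded into one flat row per student can be sketched as follows. This is an illustration only, not the software used in the study: the event tuple layout (person_id, session_id, tool, seconds) and the rule that CON_COUNT is the number of distinct sessions are assumptions, and only four of the tool indicators are included to keep the example short.

```python
from collections import defaultdict

# Subset of the indicators, chosen only for brevity.
TOOLS = ["CONTENT_PAGE", "ASSESSMENT", "ORGANIZER", "DISCUSSION"]


def build_flat_table(events, grades):
    """Aggregate (person_id, session_id, tool, seconds) events into one
    row per student. Field layout and CON_COUNT rule are assumptions."""
    totals = defaultdict(lambda: defaultdict(int))  # person -> tool -> seconds
    sessions = defaultdict(set)                     # person -> session ids
    for person_id, session_id, tool, seconds in events:
        totals[person_id][tool] += seconds
        sessions[person_id].add(session_id)

    table = []
    for person_id in sorted(grades):
        per_tool = totals[person_id]
        row = {
            "FINAL_GRADE": grades[person_id],
            "CON_COUNT": len(sessions[person_id]),
            "TIME": sum(per_tool.values()),  # time over all tools seen
        }
        for tool in TOOLS:
            row[tool] = per_tool.get(tool, 0)  # unused tools count as 0
        table.append(row)
    return table


# Toy data, not taken from the study.
events = [
    (1, "s1", "CONTENT_PAGE", 100),
    (1, "s1", "ASSESSMENT", 50),
    (1, "s2", "CONTENT_PAGE", 25),
    (2, "s3", "DISCUSSION", 10),
]
grades = {1: 10, 2: 7}
flat = build_flat_table(events, grades)
for row in flat:
    print(row)
```

Keeping unused tools as explicit zeros matters for the later clustering step, because every student then has a vector of the same length.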

In pre-processing, a new table is created that holds the data on all students' activities in the VLE (Table 2). Before being entered into the table, the data are processed so that data mining methods can be applied later.

Table 2. Final table of students' activities

Finalgrade  Concount  Time   Contentpage  Assessment  Organizer  Assignments  Calendar  Learning objectives  Discussions
10          25        11485  38           35          47         0            0         0                    0
10          24        22910  53           50          49         0            0         0                    0
10          16        15685  21           6           39         0            0         0                    0
10          26        29130  50           61          57         0            6         0                    1
10          33        46720  1            139         148        45           5         0                    27
10          44        44877  5            84          114        1            0         0                    1

4.3. Application of data mining methods to selected data

In this research, the clustering method was applied to study the learning behaviour of students. The result of clustering is a subdivision of objects into separate groups of similar objects. In distance learning, the objects may be users, events, sessions, pages, activities, etc. The WEKA and STATISTICA data mining and statistical analysis packages were used for data clustering. The most widely used algorithm, K-means, was applied. This algorithm clusters the data by minimising the objective function

$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2,$$

where $\left\| x_i^{(j)} - c_j \right\|^2$ is the squared distance between the point $x_i^{(j)}$ and the cluster centre $c_j$ (MacQueen 1967).

The data were grouped into 3 clusters by means of V-fold cross-validation. The results of this clustering are shown in Figure 4. For a deeper analysis we calculated the number of students in each cluster as well as the average of their final evaluations. These results are presented in Table 3.

Table 3. Results of clustering

Cluster  Number of students  Percentage  Average final evaluation
1        44                  8.33333     7.568182
2        360                 68.18182    7.836111
3        124                 23.48485    7.604839

The average final evaluations of the clusters clearly differ only slightly from one another. To obtain more information, we examined each cluster separately using additional criteria.
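The objective J above and the per-cluster summary of Table 3 can be made concrete with a small, dependency-free sketch. The study itself used WEKA and STATISTICA; the code below is only an illustrative implementation of the standard K-means iteration (Lloyd's algorithm) with min-max normalization, caller-supplied initial centres, and toy data that are not taken from the paper.

```python
def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centres, max_iter=100):
    """Alternate assignment and centre update until the centres stop
    changing (or max_iter is reached). Returns labels, centres and J."""
    centres = [list(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(len(centres)), key=lambda j: squared_distance(p, centres[j]))
            for p in points
        ]
        # Update step: each centre becomes the mean of its members.
        new_centres = []
        for j, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(old)  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    objective = sum(squared_distance(p, centres[lab]) for p, lab in zip(points, labels))
    return labels, centres, objective


def cluster_summary(labels, grades):
    """Count, percentage and mean grade per cluster (as in Table 3)."""
    summary = {}
    for j in sorted(set(labels)):
        member_grades = [g for g, lab in zip(grades, labels) if lab == j]
        summary[j] = {
            "count": len(member_grades),
            "percentage": 100.0 * len(member_grades) / len(labels),
            "mean_grade": sum(member_grades) / len(member_grades),
        }
    return summary


# Toy activity vectors and grades (assumptions, for illustration only).
raw = [[0, 0], [0, 1], [10, 10], [10, 11]]
grades = [10, 8, 7, 6]
points = min_max_normalize(raw)
labels, centres, objective = k_means(points, [points[0], points[2]])
summary = cluster_summary(labels, grades)
print(labels, summary)
```

The same percentage formula reproduces the Percentage column of Table 3: 44, 360 and 124 students out of 528 give 8.33%, 68.18% and 23.48% respectively.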

Fig. 4. Clustering of students' activities using the K-means algorithm (vertical axis: normalized means, 0.0 to 0.9; series: Cluster 1, Cluster 2, Cluster 3; horizontal axis labels: final_grade, time, assessment, assignments, learning_objectives, syllabus, media_library, my_webct, tracking, who_is_online, file_manager, chat)

4.4. Analysis and interpretation of the results obtained

As Figure 4 shows, the students from cluster 1 are the most active, spending the most time in the VLE. Nevertheless, their final evaluations are the lowest of the three clusters, although the difference is hardly noticeable. To find the reasons for this, a deeper analysis of the data from the 1st cluster is needed. Table 4 shows the VLE tools grouped according to their influence on the quality of studies.

Table 4. The influence of VLE tools on the quality of studies

Columns: Important; Average importance; Unimportant
Tools listed: content_page, assessment, organizer, assignments, calendar, learning_objectives, discussion, syllabus, web_links, media_library, my_grades, my_webct, student_bookmarks, tracking, notes, who_is_online, announcement, file_manager, mail, chat

Figure 5 shows the time spent in the VLE by the students from the 1st cluster. It is evident from Figure 5 that only 4 students spent more than half of their time using important VLE tools: studying materials, taking assessments, etc. The rest of the time was spent irrationally: students checked their personal achievements, looked at who was online, took part in discussions, etc. The conclusion may therefore be drawn that a high level of VLE activity does not guarantee good study performance. Purposeful, logical use of the important tools is an important criterion for course developers and instructors as well. A course instructor who has this information can optimise course navigation and the time students spend in the VLE, and can redirect students' work in a purposeful direction; otherwise students sometimes use all available tools without considering their purpose and usefulness (Romero et al. 2008).

The largest group of students is in cluster 2. The members of this cluster are the least active VLE visitors, but the average of their final evaluations is the highest (Table 3). Again, to explore these data, a statistical analysis of this cluster is needed. Figure 6 shows students' activities according to different VLE objects, which are divided into 3 groups as in the previous example. The diagram shows that the students of this cluster use both important and unimportant VLE tools. Some students spent more than 70-80% of their time using the important VLE tools for their studies. Only a small part of their time was spent on unimportant tools. Thus, it is possible to state that the students of the 2nd cluster use study materials with clearly predetermined aims. They did not spend their time browsing everywhere. On the contrary, they found the necessary study materials and tools and used only them. Hence we may draw one more conclusion: the students from this cluster do not use many VLE tools (discussions, chats, etc.). They prefer to study in the traditional way.

Fig. 5. Analysis of the activities of students from the 1st cluster according to the effectiveness of their activities (vertical axis: share of time, 0% to 100%; series: Important, Average importance, Unimportant)
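The per-student share of time spent on each importance group, which underlies Figures 5 and 6, can be computed as in the sketch below. The grouping passed in the example is an assumption made for illustration: it uses the three tools named as important and a hypothetical "other" group, not the full three-way grouping of Table 4.

```python
def time_shares(tool_seconds, groups):
    """Return the fraction of a student's total VLE time spent in each
    group of tools. Tools outside every group still count in the total,
    so the shares may sum to less than 1."""
    total = sum(tool_seconds.values())
    shares = {name: 0.0 for name in groups}
    if total == 0:
        return shares  # a student with no recorded time
    for name, tools in groups.items():
        shares[name] = sum(tool_seconds.get(t, 0) for t in tools) / total
    return shares


# Hypothetical grouping and toy timings, for illustration only.
groups = {
    "important": {"content_page", "assessment", "organizer"},
    "other": {"chat", "my_grades"},
}
student = {
    "content_page": 50,
    "assessment": 61,
    "organizer": 57,
    "chat": 20,
    "my_grades": 12,
}
shares = time_shares(student, groups)
print(shares)

# A student counts as using the VLE purposefully here if more than half
# of the time goes to important tools (threshold taken from the text).
purposeful = shares["important"] > 0.5
```

Applying the same function to every student in a cluster and counting those above the 0.5 threshold gives the kind of statement made about the 1st cluster.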

Fig. 6. Analysis of activity effectiveness of the students from the 2nd cluster (vertical axis: share of time, 0% to 100%; series: Important, Average importance, Unimportant)

In distance education the instructor's role is very important. The instructor affects the quality and the final result of a distance learning course. If the instructor is active in the study process, tracking students' activities, participating in and promoting students' discussions, then the results of the course are better than those of courses whose students are neglected. Therefore, to find out why the students from the 2nd cluster were so inactive but achieved the best results, we additionally studied the activities of the course instructors. Their activities were divided into the following groups:
1. Students' stimulation.
2. Course development and renewal.
3. Course administration.
4. Course testing.

Students' stimulation. This group includes the instructors whose activities stimulate students to use VLE tools, participate in discussions, develop new topics, reply to e-mails, etc.

Course development. This group consists of instructors whose activities are devoted to renewing the course and supplementing it with additional elements, for example adding new links to external resources, new chats, files, etc.

Course administration. This group includes instructors whose activities are necessary for course administration: reading e-mails, reviewing study materials, evaluating students' assignments and tests, etc.

Course testing. This group includes instructors who imitate students' activities, such as testing assessments and assignments and reviewing materials and other resources. In this way a course instructor can be sure that a student will see the result as the course developer planned it.

Figure 7 illustrates the activities, by group, of the course instructors who took part in the distance learning courses. As Figure 7 shows, most of the instructors' time is devoted to course administration. Only a very small part of their time was spent on stimulating students and applying active learning methods. Thus it is possible to state that students' passivity was influenced by the instructors' passivity.

Fig. 7. Activities of the instructor who supervised the distance learning course (pie chart; legend: Students' stimulation, Course development and renewal, Course administration, Course testing; segment values: 1%, 92%, 2%, 5%)

The diagram in Figure 8 shows that instructors applied very different levels of promoting activity. In some courses students were stimulated very actively; in other courses students were left on their own. This may be the reason for the different achievements of students in different clusters: conscientious students continued their studies independently, while the participation of other students was passive and generally more incidental.

Fig. 8. Students' support activities in different courses (vertical axis: 0 to 250)

When preparing the study materials, course developers and instructors have to evaluate the group of learners and concentrate not only on traditional ways of presenting the course (materials prepared for printing, assessments) but also on stimulating the students of this group by involving them in active learning. This would lead to better results and would not disappoint the students who choose virtual learning courses.

The activities of the students from the 3rd cluster are very similar to those of the 2nd cluster, but these students are a bit more active. The students belonging to the 3rd cluster use the important VLE tools less actively and pay a bit more attention to less important tools, but they do not spend their time on unimportant tools (Figure 9). Representatives of this cluster have the greatest potential, but this potential is not being used. These students can concentrate on the most important course elements and are interested in new ICT learning tools, but without support from the course instructor they lose the opportunity to achieve better results. Better performance can be expected if such students are noticed and active teaching methods are applied to them.

5. Conclusions and further investigation

Having performed this research and processed its results, we can conclude that distance learning is a wide application area for data mining. Based on learners' activities in the VLE, the k-means clustering method was applied and the participants were divided into 3 clusters. The authors examined and described in detail the users' behaviour in each cluster. Since learners are individuals, the best results could be achieved via individualised teaching methods (Chen 2008). Data mining methods can serve this purpose. Referring to the articles of other authors (Özpolat and Akar 2009; Graf and Kinshuk 2006; Chen 2008) and to the research carried out, the authors developed prerequisites for computer agents that present individual materials to each learner.
100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% 2 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 10 10 Unimportant Average importance Important Fig. 9. Analysis of activity effectiveness of the students from the 3 rd cluster
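The k-means clustering step mentioned in the conclusions can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two activity features, the sample values, and the farthest-first seeding (chosen here only to make the run deterministic) are all assumptions.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding: start from the first point, then repeatedly
    add the point that lies farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return [list(c) for c in centroids]


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Made-up activity features per student, for illustration only:
# (logins per week, forum posts per week).
students = [
    (2, 0), (3, 1), (2, 1),
    (10, 4), (11, 5), (9, 4),
    (20, 12), (22, 11), (21, 13),
]
centroids, labels = kmeans(students, k=3)
print(labels)
```

With real VLE logs the features would normally be brought to a comparable scale before clustering, because Euclidean distance is otherwise dominated by the feature with the largest range.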

This research established that learners' activities in the VLE are highly influenced by the tutor's activities. It was observed that learners were more active in those courses where tutors put more effort into activating them. It was also established that course tutors spend most of their time on course administration rather than on activating students in the VLE. Using data mining methods, the authors plan to develop another group of computer-based agents that provide information about students' activities and suggest to course tutors adequate actions for activating students.

Once the learners' styles and activities have been clarified, it is possible to proceed with further research on the quality of the distance learning course. Other data mining methods, such as classification, association rules, neural networks, etc., can be applied, too. But even in the case of an excellent distance learning module, the work on its preparation takes only about 30% of the total work on the course. The remaining part is high-quality course delivery, qualified tutoring, students' involvement, etc.

Several outliers occurred while working with the research data. Statistical analysis and data mining of outliers is also a possible source of useful information for course instructors, for example, for recognising unconcerned students (Mozgovoy et al. 2007), etc.

An intelligent web system using data mining methods might be developed on the basis of this research. After analysing the existing data and evaluating the students' and instructors' activities, it could suggest the most suitable teaching methods to the course instructor. This system could optimise the learning process for students and indicate ways of achieving the best results.

References

Black, E. W.; Dawson, K.; Priem, J. 2008. Data for free: Using LMS activity logs to measure community in online courses, Internet and Higher Education 11: 65–70. doi:10.1016/j.iheduc.2008.03.002.
BlackBoard. 2008. Available from Internet: <http://www.blackboard.com/> [accessed 18 December 2008].
Brusilovsky, P. 1996. Methods and techniques of adaptive hypermedia, User Modeling and User-Adapted Interaction 6(2–3): 87–129. doi:10.1007/bf00143964.
Butakov, S.; Scherbinin, V. 2008. The toolbox for local and global plagiarism detection, Computers & Education 52: 781–788. doi:10.1016/j.compedu.2008.12.001.
Castro, F.; Vellido, A.; Nebot, A.; Mugica, F. 2007. Applying data mining techniques to e-learning problems, Studies in Computational Intelligence (SCI) 62: 183–221. doi:10.1007/978-3-540-71974-8_8.
Chen, C.-M. 2008. Intelligent web-based learning system with personalized learning path guidance, Computers & Education 51: 787–814. doi:10.1016/j.compedu.2007.08.004.
Daukilas, S.; Kačinienė, I.; Vaišnorienė, D.; Vaščilavidienė, V. 2008. Factors that impact quality of e-teaching/learning technologies in higher education, The Quality of Higher Education 5: 132–151.
Davies, G.; Cover, C. F.; Lawrence-Fowle, W.; Guzdia, M. 2001. Quality in distance education, in The Proceedings of the 31st Annual Frontiers in Education Conference, 2001, vol. 02. Washington: IEEE Computer Society, T4F T41. doi:10.1109/fie.2001.963657.
Dringus, L. P.; Ellis, T. 2005. Using data mining as a strategy for assessing asynchronous discussion forums, Computers & Education 45: 141–160. doi:10.1016/j.compedu.2004.05.003.

Information about Bb Learning System VistaPowerSightKit. Available from Internet: <http://www.blackboard.com/docs/as/bb%20learning%20system%20-%20datasheet%20-%20vistapowersightkit.pdf> [accessed 11 October 2008].
Graf, S.; Kinshuk. 2006. An approach for detecting learning styles in learning management systems, in The Proceedings of the Sixth IEEE International Conference on Advanced Learning Technologies. Washington: IEEE Computer Society, 161–163. doi:10.1109/icalt.2006.1652395.
Graven, O. H.; MacKinnon, L. M. 2008. A consideration of the use of plagiarism tools for automated student assessment, IEEE Transactions on Education 51(2): 212–219. doi:10.1109/te.2007.914940.
Hung, J.-L.; Zhang, K. 2008. Revealing online learning behaviors and activity patterns and making predictions with data mining techniques in online teaching, MERLOT Journal of Online Learning and Teaching 4(4): 426–437.
Lin, F.-R.; Hsieh, L.-S.; Chuang, F.-T. 2009. Discovering genres of online discussion threads via text mining, Computers & Education 52: 481–495. doi:10.1016/j.compedu.2008.10.005.
MacQueen, J. B. 1967. Some methods for classification and analysis of multivariate observations, in Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1. Berkeley: University of California Press, 281–297.
Mamčenko, J.; Šileikienė, I. 2006. Intelligent data analysis of e-learning system based on data warehouse, OLAP and data mining technologies, in The Proceedings of the 5th WSEAS International Conference on Education and Educational Technology (EDU 06), Tenerife, Canary Islands, Spain, December 16–18, 2006 [CD]. Tenerife: WSEAS, 171–175.
Moodle. 2008. Available from Internet: <http://moodle.com/> [accessed 18 December 2008].
Mozgovoy, M.; Karakovskiy, S.; Klyuev, V. 2007. Fast and reliable plagiarism detection system, in The 37th ASEE/IEEE Frontiers in Education Conference, Milwaukee, October 10–13, 2007. Milwaukee: IEEE Computer Society, 1–4. doi:10.1109/fie.2007.4417860.
Özpolat, E.; Akar, G. B. 2009. Automatic detection of learning styles for an e-learning system, Computers & Education 53(2): 355–367. doi:10.1016/j.compedu.2009.02.018.
Patriarcheas, K.; Xenos, M. 2009. Modelling of distance education forum: Formal languages as interpretation methodology of messages in asynchronous text-based discussion, Computers & Education 52: 438–448. doi:10.1016/j.compedu.2008.09.013.
Romero, C.; Ventura, S.; Garcia, E. 2008. Data mining in course management systems: Moodle case study and tutorial, Computers & Education 51: 368–384. doi:10.1016/j.compedu.2007.05.016.
Schiaffino, S.; Garcia, P.; Amandi, A. 2008. eTeacher: providing personalized assistance to e-learning students, Computers & Education 51: 1744–1754. doi:10.1016/j.compedu.2008.05.008.
Sun, P.-C.; Chen, H. K.; Lin, T.-C.; Wang, F.-S. 2008. A design to promote group learning in e-learning: Experiences from the field, Computers & Education 50: 661–677. doi:10.1016/j.compedu.2006.07.008.
Targamadzė, A.; Petrauskienė, R. 2008. The quality of distance learning in the situation of technological change, The Quality of Higher Education 5: 74–93.
Zini, M.; Fabbri, M.; Moneglia, M.; Panunzi, A. 2006. Plagiarism detection through multilevel text comparison, in The Proceedings of the Second International Conference on Automated Production of Cross Media Content for Multi-Channel Distribution. Washington: IEEE Computer Society, 181–185. doi:10.1109/axmedis.2006.40.

Analysis of the activities of students studying in a virtual learning environment using data mining methods

S. Preidys, L. Sakalauskas

Summary

Before planning to develop and deliver a distance learning course, developers must take into account that people study in different ways: some start reading the provided material in order, others review only the parts they do not understand, still others move on to virtual discussions, and so on. Therefore, once the learning actions have been analysed and the student's style identified, personalised learning material can be provided and better course delivery methods selected. Organising teaching in this way would improve the quality of studies and make it possible to achieve better results. This article examines the application of data mining methods to analysing the behaviour of students who use the virtual learning environment BlackBoard Vista (BlackBoard 2008).

Keywords: virtual learning environment, distance learning, data mining, clustering, behaviour of distance study users.

Saulius PREIDYS. Assoc. Prof. at the Dept of Information Technologies, Head of the Centre of Distance Education, Vilnius College of Higher Education, PhD student at the Institute of Mathematics and Informatics (from 2008), Vilnius. Author of 5 distance courses, member of the National Association of Distance Education of Lithuania and of the Lithuanian Computer Society. Research interests: methodology of distance education, web technology, data mining methods.

Leonidas SAKALAUSKAS. Doctor Habil., Professor. Department of Operational Research, Institute of Mathematics and Informatics. PhD (Candidate of Technical Sciences) (1974), Kaunas University of Technology. Doctor Habil. (2000), Institute of Mathematics and Informatics. Professor (2005). Research visits to the International Centre of Theoretical Physics (ICTP) (Italy, 1996, 1998) and the High Performance Computing Center CINECA (Italy, 2007, 2008). He is a member of the New York Academy of Sciences (1997), vice-president of the Lithuanian Operational Research Society (2001), Elected Member of the International Statistical Institute (2002), member of the International Association for Official Statistics (2001), and member of the European Working Groups on Continuous Optimization, Financial Modelling and Multicriterial Decisions. Author of more than 120 scientific articles. Research interests: continuous optimization, stochastic approximation, data mining, Monte Carlo method, optimal design.