Integrated Copy-Paste Checking: Design and Services


Narayanan Kulathuramaiyer (1), Bilal Zaka (2), Denis Helic (3)
(1) Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, Malaysia
(1,2,3) Institute for Information Systems and Computer Media, Graz University of Technology, Austria
{nara, bzaka, dhelic}@iicm.edu

Abstract

Advances in technology have made academic cheating far too easy for learners. Furthermore, the World-Wide-Web has brought about a widespread culture of easy access to all sorts of information, reducing the need for learners to perform diligent research or study. E-learning systems therefore need to monitor and check students' reading and writing, while guiding them towards learning the proper skills. This paper describes the architecture and design of an integrated Copy-Paste checking system aimed at addressing these concerns.

1. Introduction

Advances in technology, particularly the Web, have made it far too easy for learners to engage in unethical practices. The over-reliance of students on Web resources such as Wikipedia and Google, without considering their reliability, has become a major concern. In current teaching-learning systems, problem situations in student learning processes are largely identified only after they have occurred. This approach is ineffective in addressing the needs of future learning environments.

A learning ecosystem called ICARE [1] has been proposed to overcome these problems and to serve as a model for future e-learning systems. The term ecosystem as used here refers to a community of learners and instructors, interacting cooperatively and collaboratively, supported by a technology-enabled learning environment. Details of the ICARE concept can be found in [2].
Key ideas of this ecosystem are summarized as follows:

- Support students in being actively engaged in creating their own intellectual property and in respecting other people's copyright.
- Provide a guided environment to foster constructive learning practices and critical thinking, whereby students can structure their reading and construct written works systematically.
- Support the acquisition of basic reading-writing skills, i.e. paraphrasing, summarizing and referencing accurately.
- Support process management to structure course-work as a series of process steps. Student learning can then be continuously checked and assessed at each stage.
- Incorporate preventive measures to make sure that unwanted versions of copy-and-paste cannot happen (or are drastically minimized).
- Incorporate viable technologies to minimize the supervision effort required from instructors.

This paper further describes the architecture and design of an integrated Copy-Paste checking system, which is a key innovation of ICARE.

2. Overview of the ICARE Ecosystem

The ICARE ecosystem is being built on top of WBT-Master [3] [4], a sophisticated e-learning system that supports the definition of multiple learning scenarios, project-based administration of e-learning, and interactive classroom management activities such as mentoring, brainstorming and project management [4]. An overview of the ICARE ecosystem is shown in Figure 1. It adopts a learner-centered approach in supporting learners' acquisition of knowledge and skills. Multiple learning scenarios are used to implement the alternative learning modes made available to students.

A number of services are required to facilitate learning management in ICARE. These can be separated into basic services, such as K-Card management, profile management and sessions management, and application-specific services, which include concept discovery management, expertise finding, knowledge visualization and Copy-Paste administration. Current work at Graz University of Technology includes the development of a number of these Web services, as described in [6], [7], [8], [9].

Figure 1: Overview of ICARE Ecosystem
[Figure labels: Student, User Directory, Content Space, Learner K-Maps, Scenarios (Self-Guided, Collaborative, Mentoring), Web Services, Personal Space, Shared Space]

A rigorous academic reading and writing process is thus enforced in a guided environment. Each student maintains a portfolio, which characterizes and represents all learning outcomes, recognized student works and achievements. An e-diary is employed so that student contributions (individual and collaborative) can be aggregated, captured, assessed and reflected upon. Internal processes and states of the learners can then be represented and augmented with systemic input to provide deeper insights into their learning.

The ecosystem allows students to demonstrate their learning and understanding via knowledge maps (K-Maps). K-Maps can be associated with any particular stage in the learning process, enabling instructors to determine the actual learning of students and to make an informed judgment on possible cheating. The ecosystem also allows the instructor or mentor to identify areas that students found difficult to understand or assimilate. The Copy-Paste checking subsystem has been integrated directly into the e-learning environment in such a way that it directly monitors student activities and provides appropriate feedback to instructors, mentors and the students themselves.
We will describe the overall system architecture of ICARE before focusing on the Copy-Paste checking facility.

3. Architecture of ICARE

A mashup [5] is an emerging application development paradigm on the Web. It enables the rapid development of flexible applications built upon a collection of Web services. A mashup architecture has been proposed for ICARE, built upon a set of Web services as shown in Figure 2.

Figure 2: Mashup based Architecture of ICARE
[Application Specific Services: Concept learning management, K-Flow management, Expertise Finder, Learner Activity Tracking, Student Contribution Assessment, K-Visualisation Manager, Copy-Paste Handler, Peer Manager]
[Basic Services: K-Card management, Profile management, Activity Log manager, Sessions manager, Annotations manager, Similarity Checker, Active Doc. manager, Data Synchronization]
[Other labels: E-Environment, Mentoring, Self, Scenarios, Evaluation, Collaboration, Student; Objects, Course Folders, Activity Logs, K-Maps]

A number of databases are employed in the realization of the e-learning ecosystem to support learning management, student activity tracking and knowledge visualization. We will now focus on the layering of the Web services that supports the integrated Copy-Paste checking facility.

Figure 3 describes the layering applied in the design of a sophisticated mashup, with a focus on the checking of Copy-Paste activities. The data access and checking service runs in the background while students are engaged in their learning activities. Access to both internal databases and the Web is required to find material relevant to a particular student work. In the context of Copy-Paste checking, this involves finding documents that the student may have copied from or that are conceptually similar to the student's current work. An integration service then consolidates the data fragments from multiple sources.
An ontology is consulted to identify the information that is most relevant to a student's task or context of work. The ontology is also required to check for semantic-level conceptual document similarity. Once information considered relevant has been pushed to the user in a non-obtrusive manner, the system continuously analyses the feedback of users (students, peers, mentors, etc.) to determine the value of the information supplied. Based on a deeper analysis, students are advised and supported appropriately. The results of the analysis are then visualized and presented to the student.

[Figure labels: Publication Metadata, Timeline Map, Geographical Map, ACM Categories, User Profile Input, Author Homepage, Citation Index, Publication Ontology, User Publication Mashup, Discovered Links, Links Into the Future Mashup]

Figure 3: Layered Web services Architecture
[Figure labels: World-Wide-Web, Ontology, Scenario, Data Access, Integrated Copy-Paste, Similarity, Context-based Visualisation, Course Db and Logs]

As opposed to the plagiarism detection methods currently used in educational institutions [10], [11], a conceptual similarity checking approach [12], [13] has been adopted. We will now review the mechanism for integrating Copy-Paste checking and the design of the similarity checking services.

4. Mechanism for Integrating Copy-Paste

Any activity in a learning scenario in ICARE can be attached to an event that is tracked by the system in order to invoke the Copy-Paste manager service. Figure 5 illustrates the processes involved in the integrated copy-paste checking. The Copy-Paste manager requests a similarity checking service to perform similarity detection on the selected database(s). The results of this check are then passed on to an aggregator service, which compiles the findings and forwards them to the Copy-Paste manager. The aggregator is able to compile the results of a group of students and present statistical information as required for the activity. The Copy-Paste manager informs the associated learning object, which then provides feedback to the students. It is also responsible for preparing a report for the instructor and mentor(s).

Figure 5: Design of Integrated Copy-Paste
[Figure labels: Activity, Tracked Event, Copy-Paste Manager, Aggregator, Similarity, Tools suite, Tool Selection, Web, Other Related sources, Response, Report to instructor and mentor; Q = query, A = answer]

An experimental copy-paste checking feature is being incorporated into WBT-Master. An instructor is able to define the form of copy-paste checking to be integrated into a selected learning activity. The interface for defining the checking, as applied to an information supply application, is shown in Figure 6. The key component that supports information supply is the similarity checking facility. The similarity detection capability has been provided as a Web service. We will now explore how this service is further applied to perform Copy-Paste checking.
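The flow described in Section 4 (a tracked event invokes the Copy-Paste manager, which queries a similarity checking service per selected database, hands the results to an aggregator, and returns feedback plus a report) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and function names, the feedback strings, the 0.5 threshold and the toy word-overlap service are all hypothetical and are not taken from ICARE or WBT-Master.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical service signature: (student text, database name) -> score in [0, 1].
SimilarityService = Callable[[str, str], float]


@dataclass
class CheckResult:
    student: str
    scores: Dict[str, float]  # database name -> similarity score

    @property
    def highest(self) -> float:
        return max(self.scores.values(), default=0.0)


def aggregate(results: List[CheckResult]) -> Dict[str, float]:
    """Compile group-level statistics from the individual check results."""
    highs = [r.highest for r in results]
    return {
        "students": len(results),
        "mean_highest": mean(highs) if highs else 0.0,
        "max_highest": max(highs, default=0.0),
    }


class CopyPasteManager:
    """Invoked by a tracked activity event (hypothetical interface)."""

    def __init__(self, service: SimilarityService, databases: List[str],
                 threshold: float = 0.5) -> None:
        self.service = service
        self.databases = databases
        self.threshold = threshold  # assumed cut-off, not from the paper

    def on_tracked_event(self, submissions: Dict[str, str]) -> Dict[str, object]:
        # One similarity query per student and selected database.
        results = [
            CheckResult(student, {db: self.service(text, db) for db in self.databases})
            for student, text in submissions.items()
        ]
        # Feedback goes back to students; the aggregate goes to instructor/mentors.
        feedback = {
            r.student: "review needed" if r.highest >= self.threshold else "ok"
            for r in results
        }
        return {"feedback": feedback, "report": aggregate(results)}


# Toy stand-in for the similarity Web service: word overlap (Jaccard) with a tiny corpus.
TOY_CORPUS = {"course_db": "the web has made copying easy", "web": "unrelated text"}


def toy_service(text: str, db: str) -> float:
    a, b = set(text.lower().split()), set(TOY_CORPUS[db].split())
    return len(a & b) / len(a | b) if a | b else 0.0


if __name__ == "__main__":
    manager = CopyPasteManager(toy_service, ["course_db", "web"])
    print(manager.on_tracked_event({
        "s1": "the web has made copying easy",
        "s2": "my own original sentence",
    }))
```

Keeping the similarity service behind a plain callable mirrors the paper's point that the checking capability is a separate Web service; a real deployment would replace `toy_service` with a remote call.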

Figure 6: Information Supply System [14]

5. Similarity Detection Service for Copy-Paste

Text similarity is a basic function that determines the degree of similarity between a text to be evaluated and documents found either on the Web or in an internal student database. This system makes use of an enhanced conceptual text similarity detection approach. As opposed to traditional plagiarism detection tools, which check for copying only when students submit their work, our approach allows an instructor to verify student inputs at any point in their learning activity and to request a service that computes a score reflecting the students' ability to carry out a task. The system can also provide students with immediate feedback that informs them of a proper way of performing an academic task [7].

This service takes as input a text that needs to be checked for similarity against either existing documents in the internal collection or the Web. An index is first constructed for each public and private collection to be used for document similarity checking. Text in a source document is first broken down into moderately sized segments called fingerprints. These fingerprints are used as search queries to identify a list of suspected similar documents from the available search services. The size of the text snippets used as fingerprints is twenty words, which is similar to the size of snippets returned by search engines such as Google and Yahoo. Snippet sizes can, however, be varied to extract either coarse-grained or fine-grained source text segments as required.

The fingerprints are compared for similarity with suspected documents via matching performed on text snippets. A report is then presented to highlight the matched fingerprints and the corresponding degree of copy. Document similarity is then computed based on the extent of matching fingerprints in a target document. A primary list of suspected documents is first constructed based on matching fingerprints. Further similarity checking is then performed on the primary list at the document level, taking into consideration the number and order of matching fingerprints within a document.

Traditional plagiarism detection techniques employ word-based or text-hash-based similarity checking. The proposed approach, on the other hand, first normalizes the text to extract the root forms of words. We employ synsets in WordNet [15] to enable deeper concept-level checking, which also reduces the size of the term set representation. The process of Copy-Paste checking is illustrated in Figure 7 as a set of Web services.

Figure 7: Similarity checking Web Services

Text being compared for similarity is processed using a Part of Speech (POS) tagger to determine the syntactic form of each word. Each word is then normalized into its most common or generic form based on the WordNet synonym dictionary. The normalized texts are represented as term vectors, created in a common vocabulary space for the text segments being compared. The similarity between the term vectors of normalized text is calculated using an angular measure (dot product) of the vectors for the queried text and a search result. The normalized form of the texts allows checks based on normalized root concepts in addition to the typical word-based matching. Figure 8 shows the text similarity checking approach used in ICARE.

Figure 8: Similarity checking approach

The Copy-Paste detection design shown in Figure 7 is a distributed application that consists of composite Web application services to search both the Internet and shared document sources. A Service Oriented Architecture (SOA) is employed to support and connect a set of heterogeneous applications made up of open components. Collaborative units (Web services) work transparently within a single computing environment. Web services are developed as standardized application component units built on standardized, platform-independent XML to enable a customized user application. The asynchronous operation of the Web services is simulated via the use of a search proxy. Figure 9 shows the design of the similarity detection system, with Web service components for document discovery and similarity analysis.

Figure 9: Document Search and Copy Process Design
[Figure labels: SOAP Call, Selected Document, Finger Printing & Signature Making, Fingerprints, WWW Search Services, Search snippets, Analysis, Internal Databases]

6. Conclusion

The current state of e-learning and unchecked student expressions call for a focus on learners with just-in-time learning support. This paper described the architecture of ICARE, with details of its realization as a mashup. It further described the design aspects of integrating a Copy-Paste checking facility into an e-learning system. The Web services based design demonstrates a means of overcoming a limitation of current e-learning systems, which delay the checking of infringements until students submit their assignments. The modular approach applied in the design of ICARE allows these ideas to be adapted to other e-learning systems beyond WBT-Master. The specification of the overall architecture and design also serves as a basis for the design considerations of future e-learning systems.

7. References

[1] N. Kulathuramaiyer, H. Maurer, Coping with the Copy-Paste Syndrome, Proceedings of E-Learn 2007, AACE, Calgary, Canada, 2007.
[2] N. Kulathuramaiyer, H. Maurer, Addressing Copy-Paste with ICARE, accepted for publication in National University Journal, USA, June 2008.
[3] WBT Master White Paper, Retrieved from http://coronet.iicm.tugraz.at on March 25, 2008.
[4] D. Helic, H. Krottmaier, H. Maurer, N. Scerbakov, Implementing Project-Based in WBT Systems, Proceedings of E-Learn 2003, AACE, USA, 2003, pp. 2189-2196.
[5] N. Kulathuramaiyer, Mashups: Emerging Application Development Paradigm for a Digital Journal, Journal of Universal Computer Science, Vol. 13, No. 4, 2007, pp. 531-543. http://www.jucs.org/jucs_13_4/mashups_emerging_application_development
[6] M.T. Afzal, N. Kulathuramaiyer, H. Maurer, Creating Links into the Future, Journal of Universal Computer Science, Vol. 13, No. 9, 2007, pp. 1234-1245. http://www.jucs.org/jucs_13_9/creating_links_into_the
[7] B. Zaka, H. Maurer, Service Oriented Information Supply Model for Knowledge Workers, Proceedings of I-Know 07, 2007, pp. 432-439.
[8] S. Khan, N. Kulathuramaiyer, H. Maurer, Applications of Mashups for a Digital Journal, accepted for publication in Journal of Universal Computer Science.
[9] S.C. Ong, N. Kulathuramaiyer, A.W. Yeo, Automated Discovery of Concepts from Text, Proceedings of the IEEE/ACM/WIC Conference on Web Intelligence, 2006, pp. 1046-1049.
[10] Turnitin, Retrieved from http://www.turnitin.com on December 9, 2007.
[11] Mydropbox, Retrieved from http://www.mydropbox.com on December 9, 2007.
[12] H. Maurer, F. Kappe, B. Zaka, Plagiarism: a Survey, Journal of Universal Computer Science, Vol. 12, No. 8, 2006, pp. 1050-1084.
[13] N. Kulathuramaiyer, H. Maurer, Fighting Plagiarism and IPR Violation: Why is it so Important?, Learned Publishing, Vol. 20, No. 4, 2007, pp. 13-19.
[14] Information Supply Engine, Retrieved from www.cpdnet.org/isdemo on March 25, 2008.
[15] G. Miller, WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA. http://wordnet.princeton.
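As an illustration of the similarity computation outlined in Section 5, the following minimal sketch splits text into twenty-word fingerprints, normalizes words to a generic form, builds term vectors and compares them with an angular measure. Several parts are assumptions rather than the authors' implementation: the angular measure is taken to be cosine similarity (the dot product of length-normalized vectors), fingerprints are cut as consecutive non-overlapping segments, a small hand-written synonym table (`TOY_SYNONYMS`) stands in for POS tagging and WordNet synset lookup, the 0.8 match threshold is arbitrary, and the order of matching fingerprints is ignored.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical stand-in for WordNet-based normalization: word -> generic form.
TOY_SYNONYMS = {
    "automobile": "car",
    "cars": "car",
    "purchased": "buy",
    "bought": "buy",
    "buys": "buy",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fingerprints(text: str, size: int = 20) -> List[str]:
    """Split text into consecutive segments of at most `size` words."""
    words = tokenize(text)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def term_vector(text: str) -> Counter:
    """Term-frequency vector over normalized (generic-form) words."""
    return Counter(TOY_SYNONYMS.get(w, w) for w in tokenize(text))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Dot product of the two vectors divided by the product of their lengths."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def document_similarity(source: str, candidate: str,
                        size: int = 20, threshold: float = 0.8) -> float:
    """Fraction of source fingerprints that match some candidate fingerprint."""
    src = fingerprints(source, size)
    cand = [term_vector(f) for f in fingerprints(candidate, size)]
    if not src or not cand:
        return 0.0
    matched = sum(
        1 for f in src
        if max(cosine_similarity(term_vector(f), c) for c in cand) >= threshold
    )
    return matched / len(src)


if __name__ == "__main__":
    a = "She bought a car"
    b = "She purchased an automobile"
    print(cosine_similarity(term_vector(a), term_vector(b)))
```

In this sketch the two example sentences share no verb or noun at the word level, yet their normalized vectors overlap on three of four terms, which is the effect the concept-level normalization is meant to achieve. A production version would also need the document-level pass over the primary suspect list that accounts for the number and order of matches.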