Tutorial - Case studies. A Multilayer Perceptron for a classification task (neural network): comparison of TANAGRA, SIPINA and WEKA.

Subject

A Multilayer Perceptron for a classification task (neural network): comparison of TANAGRA, SIPINA and WEKA. To train a neural network, we follow these steps: import the dataset; select the discrete target attribute and the continuous input attributes; split the dataset into a learning set and a test set; choose and parameterize the learning algorithm; execute the learning process; evaluate the performance of the model on the test set.

Dataset

We use IONOSPHERE.ARFF from the UCI Irvine repository (ARFF is the WEKA file format). The attributes are standardized. There are 351 examples, 33 continuous descriptors, and a binary class attribute.

Training a neural network with TANAGRA

Dataset importation

We click on the FILE/NEW menu in order to create a new diagram and import the dataset.
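Outside these GUI tools, the same file can be inspected programmatically. Below is a minimal Python sketch, assuming the scipy and pandas packages and a local copy of the file named ionosphere.arff (the file name is an assumption, not part of the tutorial):

```python
# Minimal sketch: load the WEKA ARFF file and check its dimensions.
# Assumes scipy/pandas are installed and the file is "ionosphere.arff".
from scipy.io import arff
import pandas as pd

data, meta = arff.loadarff("ionosphere.arff")  # parse the ARFF format
df = pd.DataFrame(data)

# Per the tutorial: 351 examples, 33 continuous descriptors, 1 class attribute.
print(df.shape)           # expected (351, 34) for this version of the file
print(meta.names()[-1])   # the class attribute is the last column
```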

Splitting the dataset into learning and test sets

In the next step, we split the dataset into a learning set, which is used to compute the neural network weights, and a test set, which is used to evaluate the model's performance. We add the SAMPLING component; we use 66% of the examples for the learning phase.

Selecting the class and the predictive attributes

We add the DEFINE STATUS component to the diagram using the shortcut in the toolbar; we set CLASS as TARGET and all continuous attributes as INPUT.
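The split and the role assignment have a direct programmatic analogue. A sketch with scikit-learn (not part of the tutorial; it reuses the DataFrame df from the loading sketch above, and the class column name "class" is an assumption):

```python
# Sketch of the 66% learning / 34% test split and of the TARGET/INPUT roles.
from sklearn.model_selection import train_test_split

X = df.drop(columns=["class"])         # continuous INPUT attributes
y = df["class"].str.decode("utf-8")    # discrete TARGET attribute (bytes -> str)

# train_size=0.66 mirrors the SAMPLING component: 231 learning / 120 test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.66, random_state=1
)
print(len(X_train), len(X_test))       # 231 120
```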

Learning algorithm

We want to add a Multilayer Perceptron to the diagram. In the first step, we add a learning implementation algorithm (SUPERVISED LEARNING from the META-SPV LEARNING tab). In the second step, we embed a learning method algorithm in the first one, i.e. the MULTILAYER PERCEPTRON from the SPV LEARNING tab.

Setting the parameters

There are several kinds of parameters. The first are the neural architecture parameters (NETWORK tab): we use one hidden layer with two neurons. The next are the learning parameters (LEARNING tab): we set the LEARNING RATE to 0.15. We can also define a validation set; this sub-sample is used to compute the error rate on a part of the learning set that is not used for the computation of the weights. In this analysis, we do not use a validation set (the VALIDATION SET PROPORTION parameter remains at 0.15 but is not used). Last, because the descriptors are already standardized, we set ATTRIBUTE TRANSFORMATION to NONE.
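For readers who prefer code, these settings map roughly onto scikit-learn's MLPClassifier (an analogue chosen for this write-up, not what TANAGRA runs internally; the optimizer and initialization differ, so the numbers will not match exactly). A sketch, reusing X_train and y_train from the split sketch:

```python
# Rough analogue of the TANAGRA settings: one hidden layer of two neurons,
# learning rate 0.15, no attribute transformation (data already standardized).
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(
    hidden_layer_sizes=(2,),   # NETWORK tab: one hidden layer, two neurons
    solver="sgd",              # plain gradient descent, closest to a classic MLP
    learning_rate_init=0.15,   # LEARNING tab: LEARNING RATE = 0.15
    max_iter=100,              # STOPPING RULE tab: MAX ITERATION (epochs)
    random_state=1,
)
mlp.fit(X_train, y_train)
print("resubstitution error rate:", 1 - mlp.score(X_train, y_train))
```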

In the last tab, STOPPING RULE, we set the parameters that stop the learning process: MAX ITERATION is the maximum number of epochs; ERROR RATE THRESHOLD stops the learning process if the resubstitution error rate falls below this threshold. It is also possible to stop the learning phase when the validation error rate stagnates over GAP TEST STAGNATION epochs, but this is not a very reliable option; check it only if you are confident about the behavior of the neural network.
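The stagnation rule can be pictured as the following loop: train one epoch at a time and stop when the validation error has not improved for GAP TEST STAGNATION consecutive epochs. An illustrative sketch (not TANAGRA's code; X_tr/y_tr and X_val/y_val stand for hypothetical training and validation sub-samples):

```python
# Illustrative stagnation-based stopping rule: halt when the validation
# error rate has not improved for `gap` consecutive epochs.
from sklearn.neural_network import MLPClassifier

gap, best_err, since_best = 20, float("inf"), 0
model = MLPClassifier(hidden_layer_sizes=(2,), solver="sgd",
                      learning_rate_init=0.15, max_iter=1,
                      warm_start=True, random_state=1)  # one epoch per fit()

for epoch in range(5000):                  # MAX ITERATION acts as a safety cap
    model.fit(X_tr, y_tr)                  # warm_start=True resumes training
    err = 1 - model.score(X_val, y_val)    # validation error rate
    if err < best_err:
        best_err, since_best = err, 0
    else:
        since_best += 1
    if since_best >= gap:                  # stagnation: stop the learning phase
        break
```

Reading the results

We select the VIEW menu; the weights are computed and a new window appears.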

In the first part of the window, we see a summary of the network parameters and the resubstitution confusion matrix (error rate = 0.026). We know that this estimation of the error rate is often highly optimistic. In the second part of the window, the weights of the network are displayed; we can copy and paste these values into a spreadsheet.

The ATTRIBUTE CONTRIBUTION part computes the error rate of the perceptron when one attribute is removed. TANAGRA compares this error rate with the error rate of the whole model in order to evaluate the importance of each attribute in the prediction. For instance, we see in this table that the error rate of the whole perceptron is 0.026. If we remove the a01 attribute -- i.e. we use the average of the attribute instead of the true values -- the error rate becomes 0.1169. The difference is 0.0909; if we use a statistical comparison between these two proportions, the t-value is 8.6868, which seems highly significant.

In the last part of the window, ERROR RATE DECREASING shows the error rate during the learning process. We can copy and paste this table into a spreadsheet and create a graphical representation of the error rate progression.
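The ATTRIBUTE CONTRIBUTION computation can be approximated in a few lines: replace each attribute in turn by its mean, recompute the resubstitution error, and compare the two proportions. A sketch (the naive z-statistic below is a simplification; TANAGRA's t-value may be computed differently, e.g. on paired predictions):

```python
# Sketch of the ATTRIBUTE CONTRIBUTION idea, reusing mlp/X_train/y_train.
import numpy as np

n = len(X_train)
base_err = 1 - mlp.score(X_train, y_train)   # 0.026 in the tutorial

for col in X_train.columns:
    X_mod = X_train.copy()
    X_mod[col] = X_mod[col].mean()           # "remove" the attribute
    err = 1 - mlp.score(X_mod, y_train)
    p = (base_err + err) / 2                 # pooled proportion
    se = np.sqrt(2 * p * (1 - p) / n)
    z = (err - base_err) / se if se > 0 else 0.0
    print(f"{col}: error={err:.4f} diff={err - base_err:.4f} z={z:.2f}")
```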

[Chart: resubstitution (train) error rate vs. epoch, decreasing over 100 epochs.]

Evaluating the network on a test set

We want to compute the test error rate on the 120 remaining examples. We add the DEFINE STATUS component to the diagram again. We set CLASS as TARGET and the prediction of the neural network (PRED_SPVINSTANCE_1) as INPUT.

Then, we add the TEST component (SPV LEARNING ASSESSMENT tab). The error rate must be computed on the test set (the unselected examples). We click on the VIEW menu; the test error rate is 0.125.
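In code, this TEST step amounts to comparing the stored predictions with the true classes of the unselected examples. A sketch, reusing mlp and the test sub-sample (the scikit-learn numbers will differ from TANAGRA's 0.125):

```python
# Sketch of the TEST step: confusion matrix and error rate on the test set.
from sklearn.metrics import confusion_matrix

y_pred = mlp.predict(X_test)           # analogue of PRED_SPVINSTANCE_1
print(confusion_matrix(y_test, y_pred))
print("test error rate:", (y_pred != y_test).mean())
```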

Modifying the network parameters

We can improve the power of the neural network by modifying the number of neurons in the hidden layer. We set this parameter to 10; a priori, we should obtain a more efficient network. We click again on the VIEW menu. The resubstitution error rate is 0.0206.

When we click on the VIEW menu of the TEST component, the test error rate is 0.1083. Because the test set is small, the results suffer from high variability; this difference is not really significant. We have also tried other algorithms, such as a linear Support Vector Machine and Linear Discriminant Analysis. The following screenshot shows their accuracy on the same test set.
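The "not really significant" claim is easy to check with the usual binomial approximation: with 120 test examples, the standard error of an error rate near 0.12 is about 0.03, larger than the observed difference of about 0.017. A quick sketch (the two models share the same test set, so this independent-samples bound is only a rough one):

```python
# Back-of-the-envelope check: is 0.125 vs 0.1083 meaningful on 120 examples?
import math

n = 120
p1, p2 = 0.125, 0.1083                     # 2-neuron vs 10-neuron test errors
se1 = math.sqrt(p1 * (1 - p1) / n)         # ~0.030
se2 = math.sqrt(p2 * (1 - p2) / n)         # ~0.028
diff = p1 - p2                             # ~0.017
se_diff = math.sqrt(se1**2 + se2**2)       # ~0.041: the gap is within noise
print(diff, se_diff)
```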

Training a neural network with SIPINA

Importing the dataset

In order to import the dataset, we click on the FILE/OPEN menu.

Splitting the dataset

We want to use 66% of the dataset as the learning set. We select the ANALYSIS / SELECT ACTIVE EXAMPLES menu and choose the RANDOM SAMPLING option. The subset sizes appear in a window.

Defining the class and the predictive attributes

We click on the ANALYSIS / DEFINE CLASS ATTRIBUTE menu in order to define the role of the attributes. We use drag-and-drop to define the TARGET and the INPUT attributes.

The attribute selection appears in the left part of the window, with the type of each attribute displayed.

Learning algorithm and parameter settings

The INDUCTION METHOD / STANDARD ALGORITHM menu enables us to choose the learning algorithm. We select the NEURAL NETWORK tab and click on the MULTILAYER PERCEPTRON method.

When we click on the OK button, a new dialog box appears in which we can set the architecture of the perceptron and the training parameters. Note that we choose a high MAX ITERATION (5000). This does not matter, because in SIPINA we can watch the error rate decrease and stop the learning process interactively.

Learning process

We select the ANALYSIS / LEARNING menu. A new window appears in which we can follow the error rate progression. A STOP button enables us to stop the processing.

The error rate evolution chart shows the error rate progression; we obtain an error rate of 0.009 at iteration 624. The confusion matrix is in the right part of the window. The STOP PROCESS button is very important: we can stop the processing when we think no significant improvement can be obtained in the remaining iterations. In the bottom part of the window, selecting a neuron displays its associated weights.

Test error rate

In order to apply the prediction model to the test set, we click on the ANALYSIS / TEST menu. In the subsequent dialog box, we set the following option. The confusion matrix appears in a new window; the test error rate is 0.0917.

Using a validation set

Because we can follow the learning process in SIPINA, using a validation set is especially interesting: we can stop the learning process when the validation error rate stops decreasing. The learning set is thus split into two parts: the first, called the training set, is used to compute the weights of the network; the second, called the validation set, is used for an honest evaluation of the error rate. We close all the windows with the WINDOW / CLOSE ALL menu. In order to include a validation set in the learning process, we select a new algorithm: INDUCTION METHOD / STANDARD ALGORITHM menu, MULTILAYER PERCEPTRON (TEST ERROR RATE CONTROL) option. The learning set size is 231. We use 70% of it as the training set (70% of 231 = 161 examples) and the remainder as the validation set (231 - 161 = 70 examples).
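This validation-controlled training has a close scikit-learn analogue (again an analogue for this write-up, not SIPINA's implementation; the stagnation tolerance is an assumed value):

```python
# Sketch of validation-controlled training: 70% of the learning set trains
# the weights, 30% monitors the error and triggers early stopping.
from sklearn.neural_network import MLPClassifier

mlp_val = MLPClassifier(
    hidden_layer_sizes=(2,),
    solver="sgd",
    learning_rate_init=0.15,
    max_iter=5000,              # high cap, as in SIPINA
    early_stopping=True,        # stop when the validation score stagnates
    validation_fraction=0.30,   # 70/30 split of the learning set (161 / 70)
    n_iter_no_change=20,        # tolerated stagnation, in epochs (assumed value)
    random_state=1,
)
mlp_val.fit(X_train, y_train)
print("stopped after", mlp_val.n_iter_, "epochs")
```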

We click on the ANALYSIS / LEARNING menu in order to execute the learning process. Two curves now appear in the chart: the training error rate and the validation error rate. In some cases, the validation error rate may increase when overfitting occurs. The confusion matrix on the test set gives the following results (ANALYSIS / TEST menu).

Training a neural network with WEKA

When we execute WEKA (http://www.cs.waikato.ac.nz/ml/weka/), a dialog box appears, which allows us to choose the execution mode of the software. We select the KNOWLEDGE FLOW mode. We used version 3.5.1 in this tutorial.

Importing the dataset

The ARFF LOADER component enables us to import the dataset.

Splitting the dataset

The TRAINTEST SPLITMAKER (EVALUATION tab) enables us to split the dataset into a learning set and a test set.

We connect the ARFF LOADER component to this new component; we use the DATASET connection.

Learning algorithm and parameters

In WEKA, the last column of the dataset is the default class attribute; the other columns are the predictive attributes. If our dataset does not have this configuration, we must use the CLASS ASSIGNER component. The supervised learning methods are in the CLASSIFIERS tab. We add the MULTILAYER PERCEPTRON component to the diagram. We click on the CONFIGURE menu in order to set the right parameters: two neurons in the hidden layer (HIDDENLAYERS); a learning rate of 0.15 (LEARNING RATE); no attribute transformation (NORMALIZE ATTRIBUTES = FALSE); a maximum of 100 iterations (TRAINING TIME); and no validation set (VALIDATION SET SIZE = 0). We connect the TRAINTEST SPLITMAKER to this new component twice, using the training set (1) and the test set (2) connections.

In order to visualize the weights of the network, we add the TEXT VIEWER component from the VISUALIZATION tab. We use the TEXT connection.

To launch the learning process, we click on the START LOADING menu of the ARFF LOADER component (1). To display the results, we click on the SHOW RESULTS menu of the TEXT VIEWER (2). The weights of the Multilayer Perceptron appear in a new window.

Test error rate

We must add two new components to the diagram in order to apply the network to the test set and visualize the confusion matrix. First, we add the CLASSIFIER PERFORMANCE EVALUATOR (EVALUATION tab) to the diagram, using the BATCH CLASSIFIER connection. Second, we add a new TEXT VIEWER component to visualize the results (TEXT connection).
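The whole KNOWLEDGE FLOW pipeline can also be scripted. Below is a sketch using the python-weka-wrapper3 package, which drives WEKA from Python; this is an assumption of this write-up, and both the wrapper API and the option flags of weka.classifiers.functions.MultilayerPerceptron (-H hidden layers, -L learning rate, -N epochs, -V validation set size) should be checked against your WEKA version. Disabling attribute normalization needs an extra flag not shown here.

```python
# Sketch of the WEKA flow via python-weka-wrapper3 (assumed API; check docs).
import weka.core.jvm as jvm
from weka.core.converters import Loader
from weka.classifiers import Classifier, Evaluation
from weka.core.classes import Random

jvm.start()
data = Loader(classname="weka.core.converters.ArffLoader").load_file("ionosphere.arff")
data.class_is_last()                       # last column is the class by default

mlp = Classifier(classname="weka.classifiers.functions.MultilayerPerceptron",
                 options=["-H", "2", "-L", "0.15", "-N", "100", "-V", "0"])

train, test = data.train_test_split(66.0, Random(1))  # TRAINTEST SPLITMAKER
mlp.build_classifier(train)                # learning phase
evl = Evaluation(train)
evl.test_model(mlp, test)                  # CLASSIFIER PERFORMANCE EVALUATOR
print(evl.summary())                       # includes the error rate on the test set
jvm.stop()
```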

We must execute the learning process again (START LOADING in the ARFF LOADER component). We click on the SHOW RESULTS menu of the second TEXT VIEWER in order to display the results. There are 120 examples in the test set; the test error rate is 18.33%.

Conclusion

We see in this tutorial that the logic of training and evaluating a neural network is the same whatever software is used. Implementing a perceptron is ultimately rather simple; interpreting the results, in particular understanding the weights of the network, is definitely more complicated.