Neural Network Model of the Backpropagation Algorithm

Rudolf Jakša
Department of Cybernetics and Artificial Intelligence, Technical University of Košice, Letná 9, 042 00 Košice, Slovakia
jaksa@neuron.tuke.sk

Miroslav Karák
Department of Cybernetics and Artificial Intelligence, Technical University of Košice, Letná 9, 042 00 Košice, Slovakia
bracek@mizu.sk

Abstract

We apply a neural network to model the neural network learning algorithm itself. The process of weight updating in a neural network is observed and stored into a file. Later, this data is used to train another network, which will then be able to train neural networks by imitating the trained algorithm. We use the backpropagation algorithm both for training and for sampling the training process. We imitate the training of the network as a whole: all the weights and weight changes of the multilayer neural network are processed in parallel in order to model mutual dependencies between weights. Experimental results are provided.

Keywords: metalearning, learning to learn, error backpropagation.

1 Introduction

Adaptive or optimizing learning algorithms might be used in the neural network learning or machine learning domains. Instead of fixed learning algorithms, these algorithms improve their own learning performance over time, or they develop particular learning methods from scratch. This type of learning algorithm is known as the metalearning or learning-to-learn approach. Works by Jürgen Schmidhuber and Sepp Hochreiter [2] [3] are representative of recent research in this area, and a more comprehensive overview is given by Sebastian Thrun in [4]. Thrun defines learning to learn as the ability of an algorithm to improve its performance at each next task with experience from previous tasks [4]. Schmidhuber emphasizes the ability of the learner to evaluate and compare learning methods and the course of learning, and to use this evaluation to select a proper learning strategy [2].

We can recognize the following paradigms among metalearning approaches: similarity exploitation, adaptation of learning parameters, and discovery of a learning algorithm. Particular methods might focus on any of these paradigms, or on all of them. Similarity exploitation is the idea that a group of tasks shares some similarity which, once learned, might speed up the learning of other tasks. We can simply learn the sequence of tasks to exploit their similarity, but some mechanism for distinguishing task-specific knowledge from common cross-task knowledge should improve the performance. Adaptation of learning parameters might be done by some metalearning algorithm operating above the learning algorithm. This paradigm is based more on learning about learning than on the learning-to-learn idea; however, knowledge about learning is a step towards learning to learn. Discovery of a learning algorithm is the design of a learning algorithm from scratch. This is learning the learning rather than learning to learn; here the focus shifts from adaptation to learning.

Metalearning algorithms might be based on reinforcement learning or on supervised learning. In the reinforcement learning case, the learner, through trial-and-error experience, improves not only its performance on some particular task, but also its ability to learn. This can be achieved by treating the learning algorithm as a part of the solved task; that is, learning is one of the actions of the learner. In the supervised learning scenario, learning and metalearning are usually treated as independent processes.

2 Backpropagation Imitation

In this section we describe the modelling of the error backpropagation algorithm. We want to train a neural network to train another neural network. A neural network model of the error backpropagation algorithm should be able to train neural networks in a similar manner as the original backpropagation algorithm does. To obtain such a model we sample the training process of backpropagation learning and then try to imitate it. This is a simple and at the same time general approach to metalearning. It consists of the following sequence:

1. train an arbitrary neural network with the error backpropagation algorithm and sample the learning process,
2. train the learning network to imitate the original learning algorithm,
3. train an arbitrary neural network using the learning network.

Consider a multilayer neural network with neuron activations x_i, link weights w_ij, biases θ_i, and neuron activation functions f_i(in_i):

$$x_i = f_i(in_i), \qquad in_i = \sum_{j=1}^{M} w_{ij} x_j + \theta_i \qquad (1)$$

The in_i is the input into the i-th neuron and M is the number of links connecting into the i-th neuron. The error J in the supervised learning mode is defined as:

$$J^p = \frac{1}{2} \sum_{i=1}^{N} \left( ev_i^p - x_i^p \right)^2 \qquad (2)$$

The p is the index of the data pattern, N is the number of output neurons of the neural network, and ev_i^p is the expected output of the i-th neuron on the p-th pattern. For simplicity, we will omit the pattern index p later. The gradient-based error-minimizing adaptation of weights follows:

$$\Delta w_{ij} = -\gamma \frac{\partial J}{\partial w_{ij}} = -\gamma \frac{\partial J}{\partial in_i} \frac{\partial in_i}{\partial w_{ij}} = \gamma \, \delta_i x_j \qquad (3)$$

The weight w_ij links the j-th neuron into the i-th neuron, γ is the learning rate constant, and δ_i is defined as:

$$\delta_i = -\frac{\partial J}{\partial in_i} = -\frac{\partial J}{\partial x_i} \frac{\partial x_i}{\partial in_i} = -\frac{\partial J}{\partial x_i} f'(in_i) \qquad (4)$$

The f'(in_i) is the derivative of the activation function f(in_i). For output neurons we get:

$$\delta_i = -\frac{\partial J}{\partial x_i} f'(in_i) = (ev_i - x_i) f'(in_i) \qquad (5)$$

For neurons in hidden layers we get:

$$\delta_i = -f'(in_i) \sum_{h=1}^{N_h} \frac{\partial J}{\partial in_h} \frac{\partial in_h}{\partial x_i}
= -f'(in_i) \sum_{h=1}^{N_h} \frac{\partial J}{\partial in_h} \frac{\partial}{\partial x_i} \sum_{l=1}^{N_l} w_{hl} x_l
= -f'(in_i) \sum_{h=1}^{N_h} \frac{\partial J}{\partial in_h} w_{hi}
= f'(in_i) \sum_{h=1}^{N_h} \delta_h w_{hi} \qquad (6)$$

The N_h is the number of links going out of the i-th neuron and h is the index of these links and of the corresponding neurons. The N_l is the number of neurons which have connections into the h-th neurons (see Fig. 1). Rule (6) is the error backpropagation rule, defining the backward propagation of the error through the network. Rule (3) defines the weight changes minimizing this error, and rule (5) sets the base for the error minimization.

Figure 1: Neuron indices for rule (6).

The error backpropagation algorithm is defined by rules (1), (2), (3), (5), and (6). To model this algorithm we may sample the variables: w, Δw, θ, δ, x, ev, J, in, and f'(in). Some of these variables can be derived from others, so the full set of them is not necessary. We can model either the rule (6) alone, or the full set of rules. When modelling the full set of rules, interactions in the whole network may be processed in the model. When modelling rule (6) only, only the neighborhood of a particular link is considered.
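To make step 1 concrete, the following sketch runs plain error backpropagation, rules (1), (3), (5) and (6), on the small 2-1-1 network used later in the experiments and records, at every update, the sampled state together with the weight and bias changes. It is only a minimal illustration under assumed settings (sigmoid activations, γ = 0.3, 500 cycles, plain text output file), not the paper's actual implementation.

```python
# Step 1 (sketch): sample the backpropagation training of a small 2-1-1 network
# (two inputs, one hidden neuron, one output), as in Fig. 2.  Activation,
# learning rate, cycle count and file format are assumptions, not the paper's.
import numpy as np

def f(a):            # sigmoid activation
    return 1.0 / (1.0 + np.exp(-a))

def f_prime(a):      # derivative of the sigmoid
    s = f(a)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
w1, w2, v = rng.uniform(-0.5, 0.5, 3)   # weights of the basic network
hb, yb = rng.uniform(-0.5, 0.5, 2)      # biases
gamma = 0.3                             # learning rate (assumed value)

# Boolean AND training data (Table 1)
data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]

records = []                            # 9 sampled inputs + 5 sampled changes
for cycle in range(500):                # number of training cycles (assumed)
    for x1, x2, ev in data:
        # forward pass, rule (1)
        in_h = w1 * x1 + w2 * x2 + hb
        h = f(in_h)
        in_y = v * h + yb
        y = f(in_y)
        # backward pass, rules (5) and (6)
        delta_y = (ev - y) * f_prime(in_y)
        delta_h = f_prime(in_h) * delta_y * v
        # weight and bias changes, rule (3)
        dw1, dw2 = gamma * delta_h * x1, gamma * delta_h * x2
        dv = gamma * delta_y * h
        dhb, dyb = gamma * delta_h, gamma * delta_y
        # sample the state of the network and the changes produced by backprop
        records.append([w1, w2, v, hb, yb, x1, x2, y, ev,
                        dw1, dw2, dv, dhb, dyb])
        # apply the changes
        w1 += dw1; w2 += dw2; v += dv; hb += dhb; yb += dyb

np.savetxt("backprop_samples.txt", np.array(records))
```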

The number of inputs and outputs of the learning network is equal to the number of sampled variables. The outputs are the changes of the weights w, and possibly of the biases θ. To use the learning network, rules (6), (5), and (3) of the original backpropagation algorithm have to be replaced with the outputs of this learning network.

3 Experiments

Consider the neural network in Fig. 2 with two inputs, one hidden neuron, and one output. It has three weights and two biases, whose changes we will try to approximate with the learning network. Thus, we will sample these changes while learning with the backpropagation algorithm. Then, we will train the learning network to approximate them. Besides these changes, we will sample all three weights and two biases, the two inputs, the one output, and the one expected value on the output. This gives 9 inputs and 5 outputs for the learning network. Such a learning network with two hidden neurons is shown in Fig. 3. The number of hidden neurons is arbitrary; it might depend on the tasks learned and on the complexity of the original training algorithm, in our case error backpropagation.

Figure 2: Simple network to be trained. The x1 and x2 are inputs; y is the output; w1, w2, and v are weights; hb and yb are biases; h is the hidden neuron activation.

In the 1st experiment we will train the network from Fig. 2 to approximate the boolean function AND (Tab. 1). This is a simple task and the networks learn quickly. Parameters of the training of the basic network are: γ = .3, the number of training cycles is 5. Parameters of the training of the learning network are: γ = .98, with two hidden neurons; the network topology is the same as in Fig. 3.

Figure 3: Learning network with two hidden neurons for the training of the network from Fig. 2. Inputs are the variables describing the state of the neural network in Fig. 2, and outputs are the changes of them provided by the learning algorithm.

    AND            OR
    x1 x2  y       x1 x2  y
     0  0  0        0  0  0
     0  1  0        0  1  1
     1  0  0        1  0  1
     1  1  1        1  1  1

Table 1: Training data for the boolean AND and OR functions for the network in Fig. 2.
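A possible realization of step 2 is sketched below: a small regressor with two hidden neurons is fitted to map the 9 sampled state variables to the 5 recorded changes. The scikit-learn MLPRegressor and its settings are stand-ins chosen for brevity; the paper trains its learning network with its own backpropagation implementation.

```python
# Step 2 (sketch): train the learning network to imitate the recorded
# backpropagation updates.  Regressor choice and hyperparameters are assumed.
import numpy as np
from sklearn.neural_network import MLPRegressor

samples = np.loadtxt("backprop_samples.txt")   # produced by the step-1 sketch
X = samples[:, :9]    # w1, w2, v, hb, yb, x1, x2, y, ev
Y = samples[:, 9:]    # dw1, dw2, dv, dhb, dyb

learning_net = MLPRegressor(hidden_layer_sizes=(2,),  # two hidden neurons, as in Fig. 3
                            activation="tanh",
                            solver="sgd",
                            learning_rate_init=0.01,
                            max_iter=5000,
                            random_state=0)
learning_net.fit(X, Y)
print("imitation MSE:", np.mean((learning_net.predict(X) - Y) ** 2))
```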

The training history for the learning network is shown in Fig. 4. A comparison of the training of the basic network using the backpropagation algorithm and using the learning network is shown in Fig. 5. The learning network achieved better convergence than the original backpropagation algorithm. However, the performance of the learning network also depends on its own training; it is prone to overfitting. Also note that implementations of training using the learning network and using the error backpropagation algorithm may differ in speed. In our case, in this experiment, backpropagation training took .48 seconds, while training with the learning network took 3.78 seconds.

In the 2nd experiment we will use a basic network without a hidden layer. This is sufficient for the AND-function approximation. We will have 2 inputs, 1 output, and no hidden neurons in the basic network, and 7 inputs and 3 outputs in the learning network. The training history for this learning network is shown in Fig. 6. A comparison of the training of the basic network without hidden neurons using the backpropagation algorithm and using the learning network is shown in Fig. 7. The performance of the learning network in this setup is comparable to the performance of the backpropagation algorithm, although it is slightly worse than in the 1st experiment with hidden neurons.

Figure 4: Error of the training of the learning network for the basic network from Fig. 2.

Figure 5: Error of the training of the basic network from Fig. 2 using the error backpropagation algorithm and using the learning network from Fig. 3.

Figure 6: Error of the training of the learning network for the basic network without hidden neurons.

Figure 7: Error of the training of the basic network without hidden neurons using the error backpropagation algorithm and using the learning network.
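Step 3 can be sketched as follows, continuing the two previous snippets: instead of computing rules (3), (5) and (6), each update of the basic network is obtained by feeding its current state into the trained learning network and applying the predicted changes. The network shape, data and reporting are the same assumptions as before.

```python
# Step 3 (sketch): train a fresh 2-1-1 basic network with the learning network
# from the step-2 sketch (`learning_net`) instead of error backpropagation.
import numpy as np

def f(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(1)
w1, w2, v = rng.uniform(-0.5, 0.5, 3)
hb, yb = rng.uniform(-0.5, 0.5, 2)

data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]   # boolean AND again

for cycle in range(500):
    J = 0.0
    for x1, x2, ev in data:
        h = f(w1 * x1 + w2 * x2 + hb)
        y = f(v * h + yb)
        J += 0.5 * (ev - y) ** 2
        # rules (3), (5), (6) are replaced by the learning network's outputs
        state = np.array([[w1, w2, v, hb, yb, x1, x2, y, ev]])
        dw1, dw2, dv, dhb, dyb = learning_net.predict(state)[0]
        w1 += dw1; w2 += dw2; v += dv; hb += dhb; yb += dyb
    if cycle % 100 == 0:
        print(f"cycle {cycle}: J = {J:.4f}")
```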

In the 3rd experiment we will increase the number of hidden neurons. We will have 2 inputs, 1 output, and 4 hidden neurons in the basic network, and 21 inputs, 17 outputs, and 3 hidden neurons in the learning network. The task is the OR-function approximation. The training history for this learning network is shown in Fig. 8. A comparison of the training of the basic network with 4 hidden neurons using the backpropagation algorithm and using the learning network is shown in Fig. 9. The performance of the learning network in this setup is again better than that of the original error backpropagation algorithm.

In the 4th experiment we will try a more difficult classification task. Fig. 12 depicts the training and testing data sets. The network topologies are: inputs x and y, outputs "inner square" and "outer square", and 5 hidden neurons for the basic network; and 38 inputs, 27 outputs, and 3 hidden neurons for the learning network. The training history for this learning network is shown in Fig. 10. A comparison of the training of the basic network using the backpropagation algorithm and using the learning network is shown in Fig. 11. Training with the learning network diverges on this task, while training with the backpropagation algorithm stalled at some error level but did not diverge.

Figure 8: Error of the training of the learning network for the basic network with 4 hidden neurons.

Figure 9: Error of the training of the basic network with 4 hidden neurons using the error backpropagation algorithm and using the learning network.

Figure 10: Error of the training of the learning network for the basic network for the square classification task.

Figure 11: Error of the training of the basic network for the square classification task using the error backpropagation algorithm and using the learning network.
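The input and output sizes of the learning network used above follow directly from the topology of the basic network: there is one output per weight and bias change, and the inputs are those same weights and biases plus the basic network's inputs, outputs and expected outputs (with the hidden activations apparently sampled as well in the square classification experiment). The helper below is a sketch of this bookkeeping; the include_hidden flag encodes that assumption.

```python
# Sketch: learning-network input/output sizes implied by the sampling scheme.
def learning_net_io(n_in, n_hidden, n_out, include_hidden=False):
    """Return (learning-net inputs, learning-net outputs) for a basic network."""
    if n_hidden > 0:
        n_weights = n_in * n_hidden + n_hidden * n_out
    else:
        n_weights = n_in * n_out            # no hidden layer: direct connections
    n_biases = n_hidden + n_out
    n_changes = n_weights + n_biases        # outputs: one change per weight/bias
    n_state = n_changes + n_in + 2 * n_out  # + inputs, outputs, expected outputs
    if include_hidden:
        n_state += n_hidden                 # + hidden activations (assumption)
    return n_state, n_changes

print(learning_net_io(2, 1, 1))                       # 1st experiment: (9, 5)
print(learning_net_io(2, 0, 1))                       # 2nd experiment: (7, 3)
print(learning_net_io(2, 4, 1))                       # 3rd experiment: (21, 17)
print(learning_net_io(2, 5, 2, include_hidden=True))  # 4th experiment: (38, 27)
```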

Figure 12: Training and testing sets for the classification task. The task is to classify points in the space by their x and y coordinates, according to whether they fall into the inner square or not.

4 Analysis

The AND/OR-function approximation tasks with hidden units show good results when trained with the learning network. The ability of the learning network to outperform the backpropagation algorithm seems promising. In the future, knowledge from several training algorithms might be used to train the learning network in order to get even better performance by exploiting the best of all of these algorithms. A trial-and-error mode might further be used when training the learning network to improve its performance further, possibly beyond the reach of conventional learning algorithms.

The slightly worse performance in the experiments without hidden units points to the nonlinear character of either the learning rules of the backpropagation algorithm or the neural network which we train. The problem with the application of our approach to the more complex classification task might be of a similar character as the instability of an overtrained learning network. The chance of instability of learning with the learning network is an inherent property of this approach. The better performance with simpler networks favours modelling only the rule (6) of the backpropagation algorithm, instead of modelling all the rules, which is what we did in the experiments. To further investigate the neural network modelling of the backpropagation algorithm, modelling of rule (6) and a deeper analysis of the variable set for algorithm sampling might help.

5 Conclusion

Using a neural network model of the backpropagation algorithm to train neural networks is a viable approach. Novel methods of performance tuning of the learning algorithm become possible when using this model. There is, however, a risk of learning instability with this approach, and the actual modelling of backpropagation can be done in several different modes.

References

[1] M. Karák, Metalearning methods for neural networks (in Slovak), MS Thesis, Technical University of Košice, (5). neuron.tuke.sk/~jaksa/theses

[2] J. Schmidhuber, J. Zhao, and M. Wiering, Simple principles of metalearning, Technical Report IDSIA-69-96, IDSIA, (1996). citeseer.ist.psu.edu/schmidhuber96simple.html

[3] S. Hochreiter, A. S. Younger, and P. R. Conwell, Learning to Learn Using Gradient Descent, Lecture Notes in Computer Science, vol. 2130, (2001). citeseer.ist.psu.edu/hochreiter01learning.html

[4] S. Thrun, Learning To Learn: Introduction. citeseer.ist.psu.edu/article/thrun96learning.html

[5] J. Schmidhuber, Evolutionary Principles in Self-Referential Learning, Diploma Thesis, Technische Universität München, (1987). www.idsia.ch/~juergen/diploma.html

[6] J. Schmidhuber, On Learning How to Learn Learning Strategies, Technical Report FKI-198-94, Fakultät für Informatik, Technische Universität München, (1994). citeseer.ist.psu.edu/schmidhuber95learning.html

[7] J. Schmidhuber, A General Method for Incremental Self-Improvement and Multi-agent Learning in Unrestricted Environments, In X. Yao (Ed.), Evolutionary Computation: Theory and Applications, Scientific Publ. Co., Singapore, (1996). citeseer.ist.psu.edu/article/schmidhuber96general.html

[8] J. Schmidhuber, A Neural Network That Embeds Its Own Meta-Levels, In Proc. of the International Conference on Neural Networks '93, San Francisco, IEEE, (1993). citeseer.ist.psu.edu/schmidhuber93neural.html