Some applications of MLPs trained with backpropagation
MACHINE LEARNING / APRENENTATGE (A)
Lluís A. Belanche, Year 2010/11

Sonar target recognition (Gorman and Sejnowski, 1988)
- A two-layer backprop network trained to distinguish between sonar signals reflected by rocks and by metal cylinders at the bottom of Chesapeake Bay
- 60 input units, 2 output units
- Input patterns based on the Fourier transform of the raw time signal
- Varying numbers of hidden units were tried: {0, 3, 12, 24}
- Best performance is obtained with 12 hidden units (close to 100% training-set accuracy)
- Test-set accuracy: 85-90%
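
A minimal numpy sketch of the 60-12-2 architecture described above, with sigmoid units and squared-error backpropagation. The learning rate, initialization, and stand-in data are illustrative assumptions, not the original experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hid, n_out = 60, 12, 2          # best-performing size from the slide
W1 = rng.normal(0, 0.1, (n_in, n_hid))
b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_out))
b2 = np.zeros(n_out)
lr = 0.1                                 # assumed learning rate

def train_step(x, t):
    """One backprop step on a single (input, one-hot target) pair."""
    global W1, b1, W2, b2
    h = sigmoid(x @ W1 + b1)             # hidden activations
    y = sigmoid(h @ W2 + b2)             # output activations
    # Squared-error gradients pushed back through the sigmoids
    dy = (y - t) * y * (1 - y)
    dh = (dy @ W2.T) * h * (1 - h)
    W2 -= lr * np.outer(h, dy)
    b2 -= lr * dy
    W1 -= lr * np.outer(x, dh)
    b1 -= lr * dh
    return y

# Random stand-in data; real inputs would be Fourier-based spectral
# envelopes of the sonar returns.
x = rng.random(n_in)
t = np.array([1.0, 0.0])                 # "rock" vs. "metal cylinder"
for _ in range(100):
    y = train_step(x, t)
```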

NETTalk (Sejnowski & Rosenberg, 1987, "Parallel Networks that Learn to Pronounce English Text", Complex Systems 1, 145-168)
- A project for pronouncing English text: for each character, the network should give the code of the corresponding phoneme
- A stream of words is given to the network, along with the phoneme pronunciation of each word in symbolic form
- A speech-generation device is used to convert the phonemes to sound
- The same character is pronounced differently in different contexts: Head, Beach, Leech, Sketch

NETTalk: the architecture
- Input is a rolling window of 7 characters: 7 x 29 possible characters = 203 binary inputs
- 80 neurons in one hidden layer
- 26 output neurons (one for each phoneme code)
- 16,240 weights in the first layer; 2,080 in the second
- A 203-80-26 two-layer network
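
A sketch of the input encoding just described: a sliding window of 7 characters, each one-hot coded over 29 symbols, giving the 203 binary inputs. The particular 29-symbol alphabet used here (26 letters plus space and two punctuation marks) is an assumption; the original symbol set may differ.

```python
import numpy as np

ALPHABET = list("abcdefghijklmnopqrstuvwxyz") + [" ", ",", "."]
assert len(ALPHABET) == 29
INDEX = {c: i for i, c in enumerate(ALPHABET)}

def encode_window(text, center):
    """203-bit input vector for the 7-character window around `center`."""
    x = np.zeros(7 * 29)
    for k in range(-3, 4):                       # three characters each side
        pos = center + k
        ch = text[pos] if 0 <= pos < len(text) else " "
        x[(k + 3) * 29 + INDEX[ch]] = 1.0
    return x

# The 203-80-26 network itself is then just two weight matrices:
W1 = np.zeros((203, 80))   # 16,240 weights in the first layer
W2 = np.zeros((80, 26))    # 2,080 weights in the second

x = encode_window("the cat sat", center=4)       # phoneme target: the "c"
```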

NETTalk: Training results
- Training set: a database of 1,024 words
- After 10 epochs the network produces intelligible speech; after 50 epochs, 95% accuracy is achieved
- Generalization: 78% accuracy on a continuation of the training text
- Since three characters on each side are not always enough to determine the correct pronunciation, 100% accuracy cannot be obtained

The learning process:
- The network gradually performs better and better discrimination; it sounds like a child learning to talk
- Damaging the network produced graceful degradation, with rapid recovery on retraining
- Analysis of the hidden neurons reveals that some of them represent meaningful properties of the input (e.g., vowels vs. consonants)

NETTalk: Comparison to Rule-Based Systems
- Generalization of NETTalk: only 78% accuracy
- Tools based on hand-coded linguistic rules (e.g., DECtalk) achieve much higher accuracy, but those rules were developed over a decade and were worth thousands of dollars
- NETTalk was a flagship demonstration that converted many scientists, particularly psychologists, to neural network research
- The data for NETTalk used to be found at: http://homepages.cae.wisc.edu/~ece539/data/nettalk/

Zipcode Recognition (Y. LeCun, 1990)

Normalize Digits First

Feature Detectors

Network Structure

Atypical Data Recognized

Further Details and Results
- ~10,000 digits from the U.S. mail were used to train and test the system
- ZIP codes on envelopes were first located and segmented by a separate system (a difficult task in itself)
- Weight sharing was used to constrain the number of free parameters: 1,256 units + 30,060 links + 1,000 biases, but only 9,760 free parameters
- An accelerated version of backprop (a pseudo-Newton rule) was used
- Trained on 7,300 digits, tested on 2,000
- Error rate of ~1% on the training set, ~5% on the test set
- If marginal cases were rejected (two or more outputs approximately the same), the error dropped to ~1% with 12% of inputs rejected (see the sketch after this list)
- The "optimal brain damage" technique was used to prune unnecessary weights; after removing weights and retraining, the network had only ~1/4 as many free parameters as before, but better performance: 99% accuracy with a 9% rejection rate
- Achieved state of the art in digit recognition
- Much problem-specific knowledge was built into the network architecture; preprocessing of the input data was crucial to success
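
The "marginal case" rejection rule above can be made concrete as follows: reject a digit whenever the two largest output activations are approximately equal. The margin threshold is an illustrative assumption.

```python
import numpy as np

def classify_or_reject(outputs, margin=0.1):
    """Return the predicted digit, or None to reject an ambiguous input."""
    top2 = np.sort(outputs)[-2:]           # two largest activations
    if top2[1] - top2[0] < margin:         # outputs too close: ambiguous
        return None
    return int(np.argmax(outputs))

# Rejected: the activations for digits 2 and 3 are nearly tied.
print(classify_or_reject(np.array([.05, .1, .8, .75, 0, 0, 0, 0, 0, 0])))  # None
# Accepted: a clear winner.
print(classify_or_reject(np.array([.05, .1, .9, .2, 0, 0, 0, 0, 0, 0])))   # 2
```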

ALVINN: Autonomous Land Vehicle In a Neural Network (Pomerleau, 1996)
- Network-controlled steering of a car on a winding road
- Network inputs: a 30 x 32 pixel image from a video camera and an 8 x 32 gray-scale image from a range finder
- 29 hidden units
- 45 output units arranged in a line corresponding to the steering angle (see the decoding sketch below)
- Achieved speeds of up to 70 mph for 90 minutes on highways outside of Pittsburgh
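
One way to read a steering command out of 45 output units arranged in a line is an activation-weighted average (center of mass) over the units' positions. The original system's exact decoding scheme and the angle range assumed here are illustrative, not taken from the paper.

```python
import numpy as np

def steering_angle(outputs, max_angle_deg=30.0):
    """Decode an angle from a line of output activations (assumed +/-30 deg)."""
    positions = np.linspace(-max_angle_deg, max_angle_deg, len(outputs))
    weights = outputs / outputs.sum()      # normalize activations
    return float(positions @ weights)      # activation-weighted position

# A "hill" of activation centered on unit 28 (right of center) decodes
# to a positive angle, i.e., steer right.
outputs = np.exp(-0.5 * ((np.arange(45) - 28) / 2.0) ** 2)
print(steering_angle(outputs))
```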

ALVINN: Enhancing Training
- The training set is collected by having a human drive the vehicle, but the human is too good: the data contain almost no examples of recovering from a bad position
- Solution: rotate each image to create additional views
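
A sketch of that augmentation idea. The slide mentions rotating each image; a simple horizontal shift is used here as a stand-in transform, and the shift-to-angle correction factor and sign convention are assumptions.

```python
import numpy as np

def augment(image, angle, shift_px, degrees_per_px=1.5):
    """Shift a (30, 32) camera image sideways and correct the steering target
    so the network learns to steer back toward the lane center."""
    shifted = np.roll(image, shift_px, axis=1)      # crude horizontal shift
    corrected = angle - shift_px * degrees_per_px   # assumed sign convention
    return shifted, corrected

# Create extra off-course views from one human-driven frame.
image = np.zeros((30, 32))
views = [augment(image, angle=0.0, shift_px=s) for s in (-3, -1, 1, 3)]
```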

Face Recognition (Mitchell, 1997)
- 90% accuracy at learning head pose and recognizing 1 of 20 faces (more info at http://www.cs.cmu.edu/~tom/faces.html)

Some additional examples