An Analysis of Classification Algorithms in Offline Handwritten Digit Recognition


Logan A. Helms, Jonathon Daniele

Abstract — The construction and implementation of computerized systems capable of classifying visual input with speed and accuracy comparable to that of the human brain has remained an open problem in computer science for over 40 years. Here, two classification methods for offline handwritten digit recognition are analyzed: the naïve Bayes classifier and the feedforward neural network. The analysis presented in this paper suggests that a feedforward neural network combined with adaptive methods can achieve better accuracy than a naïve Bayes classifier when used as the classification algorithm in offline handwritten digit recognition.

Keywords — Backpropagation, Classification Algorithms, Feedforward Neural Networks, Handwritten Digit Recognition, Machine Learning, Naïve Bayes Classifier.

I. INTRODUCTION

Much progress has been made in the last several years in the area of machine learning techniques for pattern recognition [1]. However, computerized systems continue to lag behind human perceptual performance. The main message of this paper is that an artificial neural network (ANN) built using adaptive methods can achieve greater accuracy than a naïve Bayes classifier when used as the classification algorithm in an offline handwritten digit recognition system. This is made possible by relying on adaptive methods rather than hand-crafted feature extraction techniques [1]. Feature extraction is commonly defined as determining what information is most relevant to a given problem [2]. Feature extraction admits only relevant inputs into the system, creating the best model for the given information [2]. While this might be the ideal situation, hand-crafting feature extraction algorithms for the variability of real-world problems is a unique and daunting task for each problem [1].
Many studies of pattern recognition are dedicated to feature extraction techniques for particular problems [1]. However, much of the success in handwritten digit recognition can be attributed to advances in machine learning techniques and to large training sets [1]. The availability of large training sets has allowed designers to focus more on real-world problems and less on hand-crafted feature extraction algorithms [1].

II. OFFLINE HANDWRITTEN DIGIT RECOGNITION

There are two states in which a handwriting recognition classification technique may be tested: an online state or an offline state. In an online system, the classifier receives not only the image itself but additional information such as stylus/pen pressure, multiple connected characters, and stroke direction. The classification is also expected to occur in real time, or near real time, and to provide immediate results. An offline system is usually trained and tested on individual characters, examining a far more limited group of possible feature sets, and is not expected to provide such immediate feedback. The relevance of offline handwriting recognition is evident in check reading systems in banks worldwide. Check reading systems verify the amount and other pertinent handwritten information that previously required a human, saving banks time and money [1].

III. CLASSIFICATION ALGORITHMS

A. Naïve Bayes Classifier

A.1 Bayes' theorem and conditional probability

Bayes' theorem allows a probability model to be constructed from known outcomes, as shown here:

(1) p(B|A) = p(A|B) p(B) / p(A)

However, this formula only accounts for the probability of a single feature.
A naïve approach allows the model to be constructed with a set of features for each class instead of a single feature, given as

(2) p(A|{f_1, …, f_n}) ∝ p(f_1|A) p(f_2|A) ⋯ p(f_n|A) p(A)

By using a set of known features from a set of known classes, we may arrive at the probability of a feature occurring in a given class by a simple summation of the feature occurrences divided by the summation of the class occurrences (within the training set):

(3) p(F_k|A) = n(F_k ∧ A) / n(A)

where n(·) counts occurrences over the training samples.

This paper was submitted for review on December 3. The authors are undergraduate students in the Department of Computer Science, University of North Carolina Wilmington, Wilmington, NC.
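The counting model of (2) and (3) can be sketched in a few lines of code. This is only an illustration, not the implementation evaluated in this paper: it assumes binary (present/absent) features, and it adds add-one (Laplace) smoothing as a common guard against zero counts, which the text above does not specify.

```python
import math
from collections import defaultdict

def train_nb(samples, labels):
    """Build class priors and per-feature conditionals by counting,
    as in (2) and (3), with add-one smoothing to avoid zero counts."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in zip(samples, labels):
        class_counts[label] += 1
        for i, present in enumerate(features):
            if present:
                feature_counts[label][i] += 1
    n_features = len(samples[0])
    total = len(labels)
    priors = {c: n / total for c, n in class_counts.items()}
    cond = {c: [(feature_counts[c][i] + 1) / (class_counts[c] + 2)
                for i in range(n_features)]
            for c in class_counts}
    return priors, cond

def classify_nb(features, priors, cond):
    """Return the class maximizing log p(A) + sum_i log p(f_i | A)."""
    best, best_score = None, float("-inf")
    for c, prior in priors.items():
        score = math.log(prior)
        for i, present in enumerate(features):
            p = cond[c][i]
            score += math.log(p if present else 1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best
```

Working in log space avoids the numeric underflow that multiplying hundreds of small probabilities (one per pixel) would otherwise cause.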

A.2 The independence assumption and naïve Bayes classification

Assuming the independence of features within a class is the key distinction responsible for the unexpectedly accurate results of a properly applied naïve Bayes classifier (NBC). The NBC model makes the independence assumption in order to build a conditional probability set for each feature, given that the feature is part of some class. While this assumption would seem to be violated in many real-world applications, it has consistently produced strong results in practice [3].

A.3 Some statistical requirements

It is important to note that, where a reasonable expectation of even class distribution exists, as is the case with the set of digits [0-9], a decision should be made as to how an uneven distribution within the training set may be corrected. An appropriate training set, such as the MNIST database used here, maintains a relatively even distribution between classes and avoids introducing erroneous probability summations. Because class imbalance has serious implications for the accuracy of the feature probabilities, and thus for the efficacy of the constructed model, it remains important to examine the training set in light of known statistical models and the body of previous research. That said, the rising interest in machine learning techniques, the vastly lower cost of computing equipment compared to just a decade ago, and the high availability of equipment capable of both maintaining and making use of large, curated data sets have all contributed to a marked rise in the quality of, and access to, accurate training data [1].

A.4 Additional considerations

While the assumption of feature independence provides at least the appearance of greater freedom in implementation, it introduces a new set of issues.
For example, in high-variability systems, the inclusion of extraneous features or noise when building a model may introduce enough error to drop the accuracy of an NBC to nearly the level of random guessing [4]. Similarly, normalization beyond size and color features may also be required prior to applying either training or testing methodologies. In the case of the MNIST data set, while all images are contained within the same 28 x 28 grid of pixels, with grayscale values in [0, 255], the characters are not deslanted to align with the same axis [1]. By allowing each feature to be considered independent and therefore valid for inclusion, the NBC makes no distinction between informative and uninformative features, placing the onus on the designer either to hand-pick feature sets or to construct a methodology to do so prior to training. As the purpose of a computational approach to classification is, overall, a reduction in complexity and effort relative to the task, this additional complexity raises the question of comparative effort between classification techniques, i.e., an NBC versus an artificial neural network. While the NBC algorithm itself may be simple, the complexity of data preparation became far greater than the complexity required for the ANN constructed as part of this research.

Fig. 1. A model of an artificial neuron.

Feature extraction or identification is almost certainly a necessary step prior to training the classifier, i.e., part of the preprocessing. Similarly, when examining the testing set it is important to utilize the same techniques as when preprocessing the training data. This distinction between preparing a database and preparing the input for training or testing is often overlooked or lumped together under the same heading. It may even be included as part of the classification algorithm itself.
However, the authors did find a clear distinction between the preparation of the data set (done by LeCun), the preprocessing required for application of the classification techniques (required for implementation), and the application of the classification algorithms themselves. The distinctions and reasoning are further examined in Section V.

Computational complexity for an NBC is a function of the number of features being extracted and the number of classes. Consider a training set consisting of n distinct classes, each class having k features. In order to build the model for classification, it is necessary to make k passes n times. With N = n·k total counts, this reduces to simply O(N).

B. Artificial Neural Networks

One of the most common classification methods for pattern recognition is the neural network. ANNs for pattern recognition, such as the feedforward neural network used in this case study, are processors that map inputs nonlinearly to outputs [2]. ANNs achieve this through the nonlinear processing performed in their neurons [2].

B.1 Artificial Neuron

The neuron is the basic element of an ANN. The artificial neuron is inspired by neurons in biological nervous systems, in particular the human brain [5]. Fig. 1 shows an illustration of the artificial neuron considered in this paper. Biological neurons receive electrical or chemical signals through synapses that either excite or inhibit the signal being sent to the neuron [5]. In an attempt to mimic this observation, ANNs allow a positive weight to represent an excitatory synaptic connection and a negative weight to represent an inhibitory one [5]. Inputs are multiplied by synaptic weights and summed for each neuron, as modeled by (4):

(4) Σ_{i=0}^{m} x_i w_i

The sum is then passed to an activation function, φ, which produces the neuron's output. The activation function is a nonlinear sigmoidal function that is both continuous and monotonic, with an upper and a lower bound [2]. A frequent choice of activation function is of the form

(5) φ(x) = 1 / (1 + e^(−x))

where x is the sum in (4). This function has an upper bound of 1 and a lower bound of 0, as depicted in Fig. 2.

B.2 Feedforward Neural Network

Feedforward neural networks are networks that propagate from input neurons to output neurons in one direction, without cycles, as shown in Fig. 3. There can be any number of hidden layers with any number of neurons. A feedforward neural network for classification will have one output neuron per class [2]. Input neurons accept one input and do not have an activation function; an input neuron's input is therefore essentially its output. The input neurons' outputs are propagated to a hidden layer of neurons through the synaptic weights that connect the input layer to the hidden layer. Each neuron in the hidden layer sums its weighted inputs and then computes its activation function to produce the hidden neuron's output. Outputs from neurons in the hidden layer are propagated to the next layer of neurons as inputs. This process continues until the neurons in the output layer have produced an output, which is considered the network's response to the given input vector [5].

Training: There are two types of training for neural networks, supervised and unsupervised. For the ANN in this paper, supervised training is used. Supervised training is the process of presenting the network with an input vector and the target for that input vector. The target is used to calculate the error signal for each output neuron:

(6) δ_pj = (T_pj − O_pj) O_pj (1 − O_pj)

where T_pj is the target for output neuron j for pattern p, and O_pj is the actual output of output neuron j for pattern p [5].
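The neuron computations of (4) and (5) translate directly into code. The sketch below is illustrative only (it is not the network trained for this paper), and the function names are the authors' of this edit, not the original implementation's:

```python
import math

def sigmoid(x):
    # eq. (5): logistic activation, continuous, monotonic, bounded in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights):
    # eq. (4): weighted sum of the inputs, then the activation function
    s = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(s)
```

For example, a zero weighted sum yields an output of exactly 0.5, the midpoint of the sigmoid's range.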
Similarly, the error signal for each hidden neuron is calculated as

(7) δ_pj = O_pj (1 − O_pj) Σ_k δ_pk W_kj

where δ_pk is the error signal of the post-synaptic neuron k and W_kj is the synaptic weight from pre-synaptic neuron j to post-synaptic neuron k [5]. The error signal for each post-synaptic neuron is used to adjust the weights that connect the pre-synaptic neurons to the post-synaptic neurons. The synaptic weight adjustments are computed in the form

(8) ΔW_ji(t) = η δ_pj O_pi

where η is the learning rate for the network, δ_pj is the jth post-synaptic neuron's error signal, and O_pi is the output of the ith pre-synaptic neuron [5]. The learning rate η is a network variable that controls the speed at which the weights are adjusted [2]. Once the weight adjustments have been computed, the weights are updated as

(9) W_ji(t + 1) = W_ji(t) + ΔW_ji(t)

where W_ji(t + 1) is the updated weight connecting the ith pre-synaptic neuron to the jth post-synaptic neuron. It is worth mentioning that all weights are adjusted simultaneously [2].

Fig. 2. A graph of a sigmoidal function.

Fig. 3. A feedforward neural network.

The computational complexity of one propagation through a feedforward neural network depends on the number of layers and the number of neurons in each layer. Consider a network with an input vector of size m connected to a single hidden layer with n neurons and an output layer with p neurons. Each of the n neurons in the hidden layer sums the weighted inputs from each of the m inputs of the input vector. Each of the p neurons in the output layer sums the weighted inputs from each of the n outputs of the hidden layer. Thus the computational complexity of a forward propagation through the described network is on the order of O(m + m·n + n·p). The computational complexity of backpropagating the error through the same network is on the order of O(n·p + m·n + m).
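The training rules (6) through (9) can likewise be sketched as straightforward transcriptions of the equations. Again this is only an illustration under the notation above, not the trained network reported in Section IV:

```python
def output_delta(target, output):
    # eq. (6): error signal for an output neuron
    return (target - output) * output * (1.0 - output)

def hidden_delta(output, deltas_post, weights_post):
    # eq. (7): error signal for a hidden neuron, summing each
    # post-synaptic delta weighted by the connecting synapse
    back = sum(d * w for d, w in zip(deltas_post, weights_post))
    return output * (1.0 - output) * back

def weight_update(w, eta, delta_post, output_pre):
    # eqs. (8) and (9): delta-rule adjustment applied to the weight
    return w + eta * delta_post * output_pre
```

Note how the O_pj (1 − O_pj) factor in (6) and (7) is the derivative of the sigmoid in (5), which is why a bounded, differentiable activation is required.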

IV. RESULTS AND COMPARISON

A. Database: MNIST set

Fig. 4. Samples from MNIST database.

The database used to train and test the classifiers in this analysis was constructed from two NIST databases of handwritten digits [1]. One of the NIST databases contained samples collected from Census Bureau employees, while the other contained samples collected from high school students [1]. Understandably, the samples from Census Bureau employees were cleaner than those from high school students. The designers of the MNIST database mixed these two NIST databases to build a reliable training and test set [1]. Each sample is centered in a 28x28-pixel grayscale image. The grayscale values range from 0 to 255 and are stored as bytes in the MNIST database; thus there are 784 bytes per sample. The training set contains 60,000 samples, while the test set contains 10,000 samples. The MNIST database, along with the details of the designers' work, is available online.

B. Results

Naïve Bayes classifier: The authors' application of a naïve Bayes classification technique has, at this time, achieved a success rate of only 13.76%. The distribution of classes in the data is essentially even, so this result is barely higher than a random assignment of labels. However, this does not discredit the efficacy of such classification outright. Testing with the WEKA suite of tools for statistical analysis and classification, a naïve Bayes classifier achieved 69.65% accuracy, and a multinomial naïve Bayes classifier (MNBC) achieved 83.65% accuracy. Summary tables for the NBC and MNBC from WEKA may be found at the end of this paper in Fig. 5 and Fig. 6, respectively. The WEKA toolkit is freely available online. The problem in using an NBC almost certainly arises from the number of non-discrete features in each digit [6].
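For reference, the MNIST image files described above use a simple IDX binary layout: a 16-byte big-endian header (magic number 2051 for image files, then the image count, row count, and column count) followed by one unsigned byte per pixel, image after image. A minimal reader might look like the sketch below; this is an assumption-laden illustration, not necessarily the loader used in this work:

```python
import struct

def read_idx_images(buf):
    """Read the MNIST IDX image layout: a 16-byte big-endian header
    (magic 2051, image count, rows, cols) followed by one unsigned
    byte per pixel, row-major, one image after another."""
    magic, count, rows, cols = struct.unpack_from(">IIII", buf, 0)
    if magic != 2051:
        raise ValueError("not an IDX image file")
    size = rows * cols  # 784 bytes per sample for 28x28 images
    return [list(buf[16 + k * size : 16 + (k + 1) * size])
            for k in range(count)]
```

Each returned image is a flat list of 784 grayscale values in [0, 255], matching the per-sample byte count given above.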
The time to build the system model for the implemented NBC was approximately 3 seconds; the NBC in the WEKA toolkit took approximately 4.7 seconds, while the MNBC took approximately 0.5 seconds. As is evident from the rising accuracy of the tested methodologies, an NBC may be quite successful; however, the required preprocessing remains the same as, or greater than, that of the simple NBC implemented for this paper.

Feedforward neural network: The other classification algorithm tested was a feedforward neural network trained with backpropagation, as described in Section III. Two versions were compared: a network with two layers of weights (one hidden layer) and a network with three layers of weights (two hidden layers). Both networks were trained for 20 epochs using the entire MNIST training set [1]. The learning rate was 0.25 for all 20 epochs. Accuracy on the MNIST test set was 84.01% for the two-layer network with 300 hidden neurons, and 97.60% for the three-layer network with 300 neurons in the first hidden layer and 100 neurons in the second. Training time for both networks was also tested. Each network could be trained in less than an hour per epoch, which amounts to less than 18 hours of CPU time to train either network for 20 epochs. It is worth mentioning that training time is dependent upon the designer's implementation and is largely irrelevant to end users [1].

V. CONCLUSIONS

In considering implementations for this research, the decision was made to keep as close as possible to a naïve Bayes approach and to classify the MNIST data set as-is. To perform an objective analysis, it is important to begin with common ground between classification techniques; thus, making observations based on the state in which the MNIST data set originally resides was a focus.
While the ANN may have required more effort strictly to set up the classification algorithm, it did not require the extensive preprocessing and feature extraction needed to prepare the data for use by the NBC. The separation of the processes required to prepare data for classification delimits a clear distinction in utility between the classification methodologies, above and beyond the accuracy and training time that are commonly discussed. Using the MNIST data set as the first stage of data processing sets a common ground for the further effort required in designing and implementing an NBC and an ANN. Generalizing available data to a common form encourages methodologies that function with minimal customization to a specific problem. With that consideration, measuring the complexity of implementation between an NBC and an ANN should include any preprocessing necessary for a minimally successful application of a technique, let alone for comparable levels of accuracy. The neural network showed strong performance on the data with no further preprocessing, while the NBC did not. In a similar line of reasoning, while both classifiers are fully functional once trained, the NBC's testing data requires the same preprocessing as its training data. This means that, while the processing required by the neural network grows

linearly with any testing data, the naïve Bayes classifier's processing will also grow by whatever factor is introduced by the preprocessing, which is often substantial. The final area of consideration, time complexity between classification algorithms, does show the largest advantage of a naïve Bayes classifier. The complexity of the implemented ANN, given earlier in this paper, is simply the complexity of propagating through the network. This is much higher than that of the NBC, whose complexity is on the order of the number of features multiplied by the number of classes. However, in testing both classifiers on the 10,000 testing samples from the MNIST data set, actual runtimes were minimal in both cases. The strength of neural networks will continue to grow as the size and availability of training databases continue to grow [1]. Although the same may be said of a naïve Bayes classifier, as the training data increases the neural network remains in a better position to predict results with higher accuracy and no further preprocessing, while the naïve Bayes classifier's requirements for preprocessing, such as feature extraction and noise suppression, grow linearly along with the training data. The neural network's ability to function on data without the overhead of preprocessing also applies to the testing data under examination. Taken together, these advantages lead to the conclusion that a feedforward neural network retains the greater advantage.

REFERENCES

[1] Y. LeCun, "Gradient-Based Learning Applied to Document Recognition," in Proc. IEEE, vol. 86, Red Bank, NJ, Nov. 1998.
[2] S. Samarasinghe, "Neural Networks for Nonlinear Pattern Recognition," in Neural Networks for Applied Sciences and Engineering, 1st ed. Boca Raton: Auerbach, 2007.
[3] H. Zhang, "Exploring Conditions for the Optimality of Naïve Bayes," in International Journal of Pattern Recognition and Artificial Intelligence, vol. 19, New Brunswick, CAN, Mar. 2005.
[4] L.
Jiang, "Learning Instance Weighted Naïve Bayes from Labeled and Unlabeled Data," in Journal of Intelligent Information Systems, vol. 38, Wuhan, Hubei, China, Feb. 2012.
[5] G. A. Tagliarini, "Optimization Using Neural Networks," in IEEE Transactions on Computers, vol. 40, Clemson, SC, Jul. 1991.
[6] N. Rooney, D. Patterson, and M. Galushka, "A Comprehensive Review of Recursive Naïve Bayes Classifiers," in Intelligent Data Analysis, vol. 8, Newtownabbey, UK, Jan. 2004.

Fig. 5. Summary for WEKA NBC.

Fig. 6. Summary for WEKA MNBC.


More information

A study of speaker adaptation for DNN-based speech synthesis

A study of speaker adaptation for DNN-based speech synthesis A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,

More information

Speech Emotion Recognition Using Support Vector Machine

Speech Emotion Recognition Using Support Vector Machine Speech Emotion Recognition Using Support Vector Machine Yixiong Pan, Peipei Shen and Liping Shen Department of Computer Technology Shanghai JiaoTong University, Shanghai, China panyixiong@sjtu.edu.cn,

More information

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction INTERSPEECH 2015 Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa Department of Computer

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

WHEN THERE IS A mismatch between the acoustic

WHEN THERE IS A mismatch between the acoustic 808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,

More information

Applications of data mining algorithms to analysis of medical data

Applications of data mining algorithms to analysis of medical data Master Thesis Software Engineering Thesis no: MSE-2007:20 August 2007 Applications of data mining algorithms to analysis of medical data Dariusz Matyja School of Engineering Blekinge Institute of Technology

More information

AUTOMATED FABRIC DEFECT INSPECTION: A SURVEY OF CLASSIFIERS

AUTOMATED FABRIC DEFECT INSPECTION: A SURVEY OF CLASSIFIERS AUTOMATED FABRIC DEFECT INSPECTION: A SURVEY OF CLASSIFIERS Md. Tarek Habib 1, Rahat Hossain Faisal 2, M. Rokonuzzaman 3, Farruk Ahmed 4 1 Department of Computer Science and Engineering, Prime University,

More information

Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition

Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and

More information

Reducing Features to Improve Bug Prediction

Reducing Features to Improve Bug Prediction Reducing Features to Improve Bug Prediction Shivkumar Shivaji, E. James Whitehead, Jr., Ram Akella University of California Santa Cruz {shiv,ejw,ram}@soe.ucsc.edu Sunghun Kim Hong Kong University of Science

More information

Axiom 2013 Team Description Paper

Axiom 2013 Team Description Paper Axiom 2013 Team Description Paper Mohammad Ghazanfari, S Omid Shirkhorshidi, Farbod Samsamipour, Hossein Rahmatizadeh Zagheli, Mohammad Mahdavi, Payam Mohajeri, S Abbas Alamolhoda Robotics Scientific Association

More information

Issues in the Mining of Heart Failure Datasets

Issues in the Mining of Heart Failure Datasets International Journal of Automation and Computing 11(2), April 2014, 162-179 DOI: 10.1007/s11633-014-0778-5 Issues in the Mining of Heart Failure Datasets Nongnuch Poolsawad 1 Lisa Moore 1 Chandrasekhar

More information

Classification Using ANN: A Review

Classification Using ANN: A Review International Journal of Computational Intelligence Research ISSN 0973-1873 Volume 13, Number 7 (2017), pp. 1811-1820 Research India Publications http://www.ripublication.com Classification Using ANN:

More information

PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES

PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES Po-Sen Huang, Kshitiz Kumar, Chaojun Liu, Yifan Gong, Li Deng Department of Electrical and Computer Engineering,

More information

The Good Judgment Project: A large scale test of different methods of combining expert predictions

The Good Judgment Project: A large scale test of different methods of combining expert predictions The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview

More information

Using focal point learning to improve human machine tacit coordination

Using focal point learning to improve human machine tacit coordination DOI 10.1007/s10458-010-9126-5 Using focal point learning to improve human machine tacit coordination InonZuckerman SaritKraus Jeffrey S. Rosenschein The Author(s) 2010 Abstract We consider an automated

More information

Large vocabulary off-line handwriting recognition: A survey

Large vocabulary off-line handwriting recognition: A survey Pattern Anal Applic (2003) 6: 97 121 DOI 10.1007/s10044-002-0169-3 ORIGINAL ARTICLE A. L. Koerich, R. Sabourin, C. Y. Suen Large vocabulary off-line handwriting recognition: A survey Received: 24/09/01

More information

Speaker Identification by Comparison of Smart Methods. Abstract

Speaker Identification by Comparison of Smart Methods. Abstract Journal of mathematics and computer science 10 (2014), 61-71 Speaker Identification by Comparison of Smart Methods Ali Mahdavi Meimand Amin Asadi Majid Mohamadi Department of Electrical Department of Computer

More information

Calibration of Confidence Measures in Speech Recognition

Calibration of Confidence Measures in Speech Recognition Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE

More information

ScienceDirect. A Framework for Clustering Cardiac Patient s Records Using Unsupervised Learning Techniques

ScienceDirect. A Framework for Clustering Cardiac Patient s Records Using Unsupervised Learning Techniques Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 98 (2016 ) 368 373 The 6th International Conference on Current and Future Trends of Information and Communication Technologies

More information

Learning to Schedule Straight-Line Code

Learning to Schedule Straight-Line Code Learning to Schedule Straight-Line Code Eliot Moss, Paul Utgoff, John Cavazos Doina Precup, Darko Stefanović Dept. of Comp. Sci., Univ. of Mass. Amherst, MA 01003 Carla Brodley, David Scheeff Sch. of Elec.

More information

Generative models and adversarial training

Generative models and adversarial training Day 4 Lecture 1 Generative models and adversarial training Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University What is a generative model?

More information

Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures

Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures Alex Graves and Jürgen Schmidhuber IDSIA, Galleria 2, 6928 Manno-Lugano, Switzerland TU Munich, Boltzmannstr.

More information

arxiv: v1 [cs.lg] 3 May 2013

arxiv: v1 [cs.lg] 3 May 2013 Feature Selection Based on Term Frequency and T-Test for Text Categorization Deqing Wang dqwang@nlsde.buaa.edu.cn Hui Zhang hzhang@nlsde.buaa.edu.cn Rui Liu, Weifeng Lv {liurui,lwf}@nlsde.buaa.edu.cn arxiv:1305.0638v1

More information

Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration

Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration INTERSPEECH 2013 Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration Yan Huang, Dong Yu, Yifan Gong, and Chaojun Liu Microsoft Corporation, One

More information

DIRECT ADAPTATION OF HYBRID DNN/HMM MODEL FOR FAST SPEAKER ADAPTATION IN LVCSR BASED ON SPEAKER CODE

DIRECT ADAPTATION OF HYBRID DNN/HMM MODEL FOR FAST SPEAKER ADAPTATION IN LVCSR BASED ON SPEAKER CODE 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) DIRECT ADAPTATION OF HYBRID DNN/HMM MODEL FOR FAST SPEAKER ADAPTATION IN LVCSR BASED ON SPEAKER CODE Shaofei Xue 1

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

Soft Computing based Learning for Cognitive Radio

Soft Computing based Learning for Cognitive Radio Int. J. on Recent Trends in Engineering and Technology, Vol. 10, No. 1, Jan 2014 Soft Computing based Learning for Cognitive Radio Ms.Mithra Venkatesan 1, Dr.A.V.Kulkarni 2 1 Research Scholar, JSPM s RSCOE,Pune,India

More information

Data Fusion Through Statistical Matching

Data Fusion Through Statistical Matching A research and education initiative at the MIT Sloan School of Management Data Fusion Through Statistical Matching Paper 185 Peter Van Der Puttan Joost N. Kok Amar Gupta January 2002 For more information,

More information

Learning From the Past with Experiment Databases

Learning From the Past with Experiment Databases Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University

More information

Beyond the Pipeline: Discrete Optimization in NLP

Beyond the Pipeline: Discrete Optimization in NLP Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We

More information

Beyond Classroom Solutions: New Design Perspectives for Online Learning Excellence

Beyond Classroom Solutions: New Design Perspectives for Online Learning Excellence Educational Technology & Society 5(2) 2002 ISSN 1436-4522 Beyond Classroom Solutions: New Design Perspectives for Online Learning Excellence Moderator & Sumamrizer: Maggie Martinez CEO, The Training Place,

More information

Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks

Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com

More information

Twitter Sentiment Classification on Sanders Data using Hybrid Approach

Twitter Sentiment Classification on Sanders Data using Hybrid Approach IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders

More information

Lecture 1: Basic Concepts of Machine Learning

Lecture 1: Basic Concepts of Machine Learning Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010

More information

Arabic Orthography vs. Arabic OCR

Arabic Orthography vs. Arabic OCR Arabic Orthography vs. Arabic OCR Rich Heritage Challenging A Much Needed Technology Mohamed Attia Having consistently been spoken since more than 2000 years and on, Arabic is doubtlessly the oldest among

More information

arxiv: v1 [cs.lg] 15 Jun 2015

arxiv: v1 [cs.lg] 15 Jun 2015 Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy arxiv:1506.04477v1 [cs.lg] 15 Jun 2015 Sang-Woo Lee Min-Oh Heo School of Computer Science and

More information

Linking Task: Identifying authors and book titles in verbose queries

Linking Task: Identifying authors and book titles in verbose queries Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,

More information

Discriminative Learning of Beam-Search Heuristics for Planning

Discriminative Learning of Beam-Search Heuristics for Planning Discriminative Learning of Beam-Search Heuristics for Planning Yuehua Xu School of EECS Oregon State University Corvallis,OR 97331 xuyu@eecs.oregonstate.edu Alan Fern School of EECS Oregon State University

More information

Early Model of Student's Graduation Prediction Based on Neural Network

Early Model of Student's Graduation Prediction Based on Neural Network TELKOMNIKA, Vol.12, No.2, June 2014, pp. 465~474 ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013 DOI: 10.12928/TELKOMNIKA.v12i2.1603 465 Early Model of Student's Graduation Prediction

More information

Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models

Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models Navdeep Jaitly 1, Vincent Vanhoucke 2, Geoffrey Hinton 1,2 1 University of Toronto 2 Google Inc. ndjaitly@cs.toronto.edu,

More information

Analysis of Speech Recognition Models for Real Time Captioning and Post Lecture Transcription

Analysis of Speech Recognition Models for Real Time Captioning and Post Lecture Transcription Analysis of Speech Recognition Models for Real Time Captioning and Post Lecture Transcription Wilny Wilson.P M.Tech Computer Science Student Thejus Engineering College Thrissur, India. Sindhu.S Computer

More information

Using Web Searches on Important Words to Create Background Sets for LSI Classification

Using Web Searches on Important Words to Create Background Sets for LSI Classification Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract

More information

MTH 141 Calculus 1 Syllabus Spring 2017

MTH 141 Calculus 1 Syllabus Spring 2017 Instructor: Section/Meets Office Hrs: Textbook: Calculus: Single Variable, by Hughes-Hallet et al, 6th ed., Wiley. Also needed: access code to WileyPlus (included in new books) Calculator: Not required,

More information

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering

More information

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Tomi Kinnunen and Ismo Kärkkäinen University of Joensuu, Department of Computer Science, P.O. Box 111, 80101 JOENSUU,

More information

TD(λ) and Q-Learning Based Ludo Players

TD(λ) and Q-Learning Based Ludo Players TD(λ) and Q-Learning Based Ludo Players Majed Alhajry, Faisal Alvi, Member, IEEE and Moataz Ahmed Abstract Reinforcement learning is a popular machine learning technique whose inherent self-learning ability

More information

On Human Computer Interaction, HCI. Dr. Saif al Zahir Electrical and Computer Engineering Department UBC

On Human Computer Interaction, HCI. Dr. Saif al Zahir Electrical and Computer Engineering Department UBC On Human Computer Interaction, HCI Dr. Saif al Zahir Electrical and Computer Engineering Department UBC Human Computer Interaction HCI HCI is the study of people, computer technology, and the ways these

More information

On-Line Data Analytics

On-Line Data Analytics International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob

More information

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,

More information

CHAPTER 4: REIMBURSEMENT STRATEGIES 24

CHAPTER 4: REIMBURSEMENT STRATEGIES 24 CHAPTER 4: REIMBURSEMENT STRATEGIES 24 INTRODUCTION Once state level policymakers have decided to implement and pay for CSR, one issue they face is simply how to calculate the reimbursements to districts

More information

Mining Association Rules in Student s Assessment Data

Mining Association Rules in Student s Assessment Data www.ijcsi.org 211 Mining Association Rules in Student s Assessment Data Dr. Varun Kumar 1, Anupama Chadha 2 1 Department of Computer Science and Engineering, MVN University Palwal, Haryana, India 2 Anupama

More information

AQUA: An Ontology-Driven Question Answering System

AQUA: An Ontology-Driven Question Answering System AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.

More information

On the Formation of Phoneme Categories in DNN Acoustic Models

On the Formation of Phoneme Categories in DNN Acoustic Models On the Formation of Phoneme Categories in DNN Acoustic Models Tasha Nagamine Department of Electrical Engineering, Columbia University T. Nagamine Motivation Large performance gap between humans and state-

More information

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.

More information

Notes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1

Notes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1 Notes on The Sciences of the Artificial Adapted from a shorter document written for course 17-652 (Deciding What to Design) 1 Ali Almossawi December 29, 2005 1 Introduction The Sciences of the Artificial

More information

Neuro-Symbolic Approaches for Knowledge Representation in Expert Systems

Neuro-Symbolic Approaches for Knowledge Representation in Expert Systems Published in the International Journal of Hybrid Intelligent Systems 1(3-4) (2004) 111-126 Neuro-Symbolic Approaches for Knowledge Representation in Expert Systems Ioannis Hatzilygeroudis and Jim Prentzas

More information

An empirical study of learning speed in backpropagation

An empirical study of learning speed in backpropagation Carnegie Mellon University Research Showcase @ CMU Computer Science Department School of Computer Science 1988 An empirical study of learning speed in backpropagation networks Scott E. Fahlman Carnegie

More information

Mathematics process categories

Mathematics process categories Mathematics process categories All of the UK curricula define multiple categories of mathematical proficiency that require students to be able to use and apply mathematics, beyond simple recall of facts

More information

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important

More information

Deep Neural Network Language Models

Deep Neural Network Language Models Deep Neural Network Language Models Ebru Arısoy, Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran IBM T.J. Watson Research Center Yorktown Heights, NY, 10598, USA {earisoy, tsainath, bedk, bhuvana}@us.ibm.com

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

Forget catastrophic forgetting: AI that learns after deployment

Forget catastrophic forgetting: AI that learns after deployment Forget catastrophic forgetting: AI that learns after deployment Anatoly Gorshechnikov CTO, Neurala 1 Neurala at a glance Programming neural networks on GPUs since circa 2 B.C. Founded in 2006 expecting

More information