Ensemble Methods for Handwritten Text Line Recognition Systems


2005 IEEE International Conference on Systems, Man and Cybernetics, Waikoloa, Hawaii, October 10-12, 2005

Roman Bertolami and Horst Bunke
Institute of Computer Science and Applied Mathematics, University of Bern, Neubrückstrasse 10, CH-3012 Bern, Switzerland
{bertolam,

Abstract - This paper investigates the generation and use of classifier ensembles for offline handwritten text recognition. The ensembles are derived from the integration of a language model into the hidden Markov model based recognition system. The word sequences output by the ensemble members are aligned and combined according to the ROVER framework. The task addressed is extreme in that a large number of word classes is involved; moreover, the recognisers do not produce single output classes but sequences of classes. Experiments conducted on the IAM database show that the ensemble methods produce statistically significant improvements in word level accuracy compared to the base recogniser.

Keywords: Classifier Ensemble Methods, Handwritten Text Line Recognition, Statistical Language Model, Hidden Markov Model.

1 Introduction

Today, offline handwriting recognition is still a field with many open challenges. Good recognition rates are achieved for character or numeral recognition, where the number of classes is rather small. But as the number of classes increases, for example in isolated word recognition, recognition rates drop significantly. An even more difficult task is the recognition of general handwritten text lines or sentences. Here, the lexicon usually contains a huge number of word classes, and the correct number of words in the image is unknown in advance, which leads to additional errors. In this field, recognition rates between 50% and 80% are reported in the literature, depending on the experimental setup [6, 12, 14, 15, 23].
Classifier ensemble methods are used in many pattern recognition problems to improve classification accuracy. Various experiments have shown that a set of classifiers has the potential to achieve better results than the best single classifier [10]. Two major problems have to be solved when using ensemble methods. First, it is desirable to create multiple classifiers automatically from one given base classifier. Second, an adequate combination strategy must be found to benefit from the different classifiers' results. Various ensemble methods, e.g. boosting, bagging, and random subspace, have successfully been applied to handwritten character [4, 13] or word recognition [2, 3]. A potentially large, but fixed, number of classes is considered in each of these methods, and the final result is selected among the results produced by the ensemble members. However, most of these methods cannot be applied to text line recognition, because the output of a text line recognition system is a sequence of word classes instead of just a single word class. We do not want to select a word sequence produced by just one of the ensemble members as the final result, but rather derive the final result from the outputs of multiple ensemble members by appropriate combination. Therefore, to combine multiple text line recognisers, an alignment procedure based on dynamic programming or similar techniques has to be applied in the first step of the combination. Once the results are aligned, various voting algorithms can be used to derive the final recognition result. To the knowledge of the authors, no paper on multiple classifier systems for offline handwritten text line recognition has been published yet. Related work in handwritten digit and word recognition as well as continuous speech recognition is surveyed in the following paragraph. A framework called StrCombo has been presented in [20] for numeric string recognition.
This graph-based combination approach uses each geometric segment of the individual recognisers as a node in a graph. The best path through this graph then provides the final recognition result. In isolated word recognition, another approach has been proposed in [19]. Here, no word classes are used; instead, words are treated as sequences of character classes, and a combination framework based on a weighted opinion pool is presented. A system called ROVER has been proposed in [1] in the domain of continuous speech recognition. The main goal of this system is to reduce the word error rate by aligning and combining the results of multiple speech recognisers. An extension of the ROVER system, where language model information supports the combination process, has been presented in [11]. In this paper, ensemble methods are applied to offline handwritten text line recognition for the first time. We propose a new method to automatically generate ensemble members out of one base classifier. The proposed method is based on the integration of a statistical language model. Another novel feature is the alignment of the output results before combination procedures, such as voting, are applied.

[Figure 1. Normalisation of the handwritten text line image. The first line shows the original image, while the normalised image is shown on the second line.]

The application we are dealing with in this paper is extremely difficult from two different points of view. First, a classification task involving a very large number of classes (more than 12,000 in the experiments described in Sect. 4) is considered. Secondly, the output to be produced by our system is not a single class, but a sequence of classes of variable length with no given segmentation of the input signal. The remaining part of the paper is organised as follows. Section 2 introduces novel methods to generate ensembles of recognisers from a base recogniser, which can then be combined as described in Sect. 3. Experiments and results are presented in Sect. 4, whereas conclusions are drawn in the last section of the paper.

2 Generation of ensembles

To obtain ensembles of classifiers we first build a hidden Markov model (HMM) based text line recogniser, which we then use as the base recogniser. Multiple recognition results are then produced by the integration of a statistical language model. The proposed strategy can only be applied if a language model supports the recognition process. In contrast to other ensemble generation methods, e.g. boosting or bagging, our strategy would not work with a character or isolated word recognition system. However, for these tasks multiple classifier approaches have been proposed before [2, 3, 4, 13].

2.1 Handwritten text line recogniser

Based on the system described in [7], we create an offline handwritten text line recognition system which we then use as the base recogniser.
The system can be divided into three parts: preprocessing and feature extraction, HMM based recognition, and postprocessing. The handwriting images are normalised with respect to skew, slant, and baseline position in the preprocessing phase. This normalisation reduces the impact of different writing styles. An example of the normalisation procedure is shown in Fig. 1. After preprocessing, each image of a handwritten text line is converted into a sequence of feature vectors. A sliding window, moving one pixel per step from left to right over a line of text, is used for this purpose. A number of features are extracted at each position of the sliding window. Further details about preprocessing and feature extraction are provided in [7]. For the HMM based recognition process, each character is modelled by an HMM with a linear topology. The character HMMs have an individual number of states [22], and a mixture of twelve Gaussians is used to model the output distribution in each state. The HMMs are trained with the Baum-Welch algorithm [9]. The Viterbi decoding procedure with integrated language model support is used to perform the recognition [16]. During the postprocessing phase, a confidence measure is computed for each recognised word. The confidence measure is derived from alternative candidate word sequences according to the procedure presented in [21].

2.2 Multiple recognition results

Once we have created the base recognition system, we are able to generate multiple recognisers by the integration of a language model. In the field of speech recognition, it has been shown that those parts of a recognised word sequence that are very sensitive to the specific integration of the language model are often recognised incorrectly. For these parts we try to find alternative interpretations to improve the recognition accuracy.
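To make this concrete: the decoder scores a word sequence W as log p(X|W) + α log p(W) + mβ (formalised below), so different weightings produce different winners. The following toy sketch uses two hypothetical candidates with invented log scores (the real system rescores full lattices, cf. Sect. 2.2) to show how varying α can flip the result:

```python
# Hypothetical candidate transcriptions with invented log scores:
# W -> (log p(X|W) from HMM decoding, log p(W) from the bigram LM, m words).
candidates = {
    ("the", "going-out", "of", "the", "love"): (-110.0, -18.0, 5),
    ("the", "going-out", "of", "the", "lack"): (-112.0, -14.0, 5),
}

def best(alpha, beta):
    """Return the candidate maximising log p(X|W) + alpha*log p(W) + m*beta."""
    return max(candidates,
               key=lambda W: candidates[W][0]
                             + alpha * candidates[W][1]
                             + candidates[W][2] * beta)

# A small alpha lets the HMM score dominate; a large alpha favours the LM.
# (beta only matters when the candidates differ in length m.)
print(best(alpha=0.1, beta=0.0)[-1])  # -> love
print(best(alpha=1.0, beta=0.0)[-1])  # -> lack
```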
The integration of the language model into an HMM based recognition system is accomplished according to the following formula:

Ŵ = argmax_W { log p(X|W) + α log p(W) + mβ }    (1)

We try to find the most likely word sequence Ŵ = (w_1, ..., w_m) for a given observation sequence X. The likelihood p(X|W), computed by the HMM decoding, is combined with the likelihood p(W), provided by the language model. Two parameters α and β are required to control the combination, because the HMM decoding as well as the language model produce merely approximations of probabilities. The parameter α, called the Grammar Scale Factor, weights the impact of the statistical language model. The parameter β is called the Word Insertion Penalty and controls the segmentation rate of the recogniser. If we select n different parameter pairs (α_i, β_i) (i = 1, ..., n), we obtain n different recognition systems, which produce n recognition results from the same input, i.e. the same image of a handwritten text line. For the selection of m parameter pairs (α_i, β_i), m ≤ n, which will be used to build an ensemble, we propose three different strategies:

m best: We measure the performance of n different parameter pairs on a validation set and select the m pairs that individually perform best.

forward search: We select the m best performing pairs on a

validation set by a forward search procedure. We start with the best performing pair (α_1, β_1). Then we iteratively determine (α_i, β_i) by measuring the performance of the combination ((α_1, β_1), ..., (α_{i-1}, β_{i-1}), (α, β)) for each of the remaining pairs (α, β); the best performing pair is then used as (α_i, β_i). The procedure is terminated once i = m.

backward search: The starting point is the set given by the n best performing pairs on the validation set. Iteratively, we leave one of the members out and measure the performance, and we select the best performing subset to continue with. The procedure terminates once the best performing subset of m pairs has been found.

To reduce the computational cost, the HMM based recogniser does not produce only single recognised word sequences, but whole recognition lattices [23]. These lattices contain the part of the search space that has been explored during the Viterbi decoding step. This means that we have to apply the Viterbi decoding step only once instead of m times. A lattice rescoring procedure using the different values of α and β produces the different recognition results. An example of multiple recognition results for the handwritten text "the going-out of the land", produced by different α and β values, is shown in Fig. 2.

[Figure 2. Example of multiple recognition results produced by different values of α and β; the hypotheses vary between "he", "the", and "be" for the first word and between "love" and "lack" for the last word.]

3 Combination of multiple results

To combine the multiple recognition results we apply the Recogniser Output Voting Error Reduction (ROVER) framework [1]. This system was developed in the domain of speech recognition and was first used to combine multiple continuous speech recognisers. The ROVER system can be divided into two modules, the alignment module and the voting module.
These modules will be described next.

3.1 Alignment module

In the alignment module we have to find an alignment of n word sequences. Because finding the optimal solution to this problem is NP-complete [18], we use an iterative and approximate solution. This means that the word sequences are aligned incrementally. At the beginning, the first two sequences are aligned using a standard string matching algorithm [17]. The result of this alignment is a Word Transition Network (WTN). The third word sequence is then aligned with this WTN, resulting in a new WTN, which is then aligned with the fourth word sequence, and so on. We refer to [1] for further details. In general, this iterative alignment procedure does not deliver an optimal solution, because the result is affected by the order in which the word sequences are considered. In practice, however, the suboptimal alignment produced by the ROVER alignment module often provides an adequate solution at much lower computational cost. An example of this alignment algorithm is shown in Fig. 3. The image of the handwritten text "Face ( ours is on" is analysed by four different recognisers, which results in four different word sequences W_1, W_2, W_3, and W_4. None of these word sequences is a correct transcription. In the first step, W_1 and W_2 are aligned in WTN_1. Next, W_3 is aligned with WTN_1. Subsequently, W_4 is added, resulting in WTN_3. If we now select the correct path through WTN_3, we can perfectly transcribe the handwritten text image.

[Figure 3. Example of iteratively aligning the four recognition results W_1 = "Face court is on", W_2 = "Race course is on", W_3 = "Face ( ours it on", and W_4 = "Face ( ours if on" into the word transition networks WTN_1 = W_1 + W_2, WTN_2 = WTN_1 + W_3, and WTN_3 = WTN_2 + W_4, with ε denoting null transitions.]

3.2 Voting module

The voting module combines the different word sequences once they are aligned in a WTN.
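Returning to the alignment step: the incremental WTN construction of Sect. 3.1 can be sketched as below. This is a simplified toy with unit edit costs and first-match tie breaking ([1, 17] define the actual costs), run on the first three sequences of the Fig. 3 example; "@" stands in for the null transition ε.

```python
# Toy ROVER-style iterative alignment. A WTN is a list of "slots";
# each slot holds one word per already-aligned sequence ("@" = epsilon).
EPS = "@"

def align(wtn, seq):
    """Align a word sequence against a WTN with Levenshtein-style DP."""
    n, m = len(wtn), len(seq)
    # D[i][j]: cost of aligning the first i slots with the first j words.
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i
    for j in range(1, m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if seq[j - 1] in wtn[i - 1] else 1
            D[i][j] = min(D[i - 1][j - 1] + sub,  # word joins the slot
                          D[i - 1][j] + 1,        # sequence skips this slot
                          D[i][j - 1] + 1)        # new slot for this word
    # Backtrace, building the merged WTN back-to-front.
    merged, i, j = [], n, m
    width = len(wtn[0]) if wtn else 0  # number of sequences aligned so far
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                D[i][j] == D[i-1][j-1] + (0 if seq[j-1] in wtn[i-1] else 1)):
            merged.append(wtn[i - 1] + [seq[j - 1]])
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i - 1][j] + 1:
            merged.append(wtn[i - 1] + [EPS])            # epsilon for this sequence
            j, i = j, i - 1
        else:
            merged.append([EPS] * width + [seq[j - 1]])  # epsilon for the others
            j -= 1
    merged.reverse()
    return merged

def build_wtn(sequences):
    wtn = [[w] for w in sequences[0]]
    for seq in sequences[1:]:
        wtn = align(wtn, seq)
    return wtn

W = [["Face", "court", "is", "on"],
     ["Race", "course", "is", "on"],
     ["Face", "(", "ours", "it", "on"]]
for slot in build_wtn(W):  # prints the five slots of WTN_2 from Fig. 3
    print(slot)
```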
The goal is to identify the best scoring word sequence in the WTN and extract it as the final result. The decisions are made independently for each segment of the WTN; thus, the adjacent segments have no effect on the current decision. Each decision depends only on the number m of recognition outputs, the number of occurrences, m_w, of a word w, and the confidence measure, c_w, of word w. The confidence measure c_w is defined as the maximum confidence measure among all occurrences of w at the current position in the WTN. For each possible word class w, we calculate the score s_w as follows:

s_w = λ (m_w / m) + (1 − λ) c_w    (2)

We then select for the current segment the word class w with the highest score s_w. To apply Eq. 2 we have to determine the value of λ experimentally. The parameter λ weights the impact of the number of occurrences against the confidence measure. Additionally, we have to experimentally determine the confidence measure c_ε for null transition arcs, because no confidence score is associated with a null transition ε. For this purpose we probe various values of λ and c_ε on a validation set.

[Figure 4. Validation of different values of the grammar scale factor (α) and the word insertion penalty (β), measured in word level accuracy [%].]

[Figure 5. Validation of the number of ensemble members for the different selection strategies (m best, forward search, backward search) compared to the base recogniser, measured in word level accuracy.]

4 Experiments and results

Each of the experiments reported in this section makes use of the base recogniser described in Sect. 2. This offline handwritten text line recogniser is supported by a statistical bigram language model extracted from the LOB corpus [5]. We considered a writer independent task, where none of the writers in the test set has been used for the training or the validation of the system. The text lines used in the experiments originate from the IAM database [8], which consists of a large number of handwritten English forms. We use 6,166 text lines written by 283 writers to train the HMMs of the base recogniser. The ensemble methods are validated on 941 text lines written by 43 writers, whereas the test set contains 1,863 text lines from 128 writers. Training, validation, and test set are disjoint, i.e. each writer has contributed to only one set.
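Before turning to the results, here is a toy sketch of the segment-wise vote of Eq. 2; the λ and c_ε values and the confidence scores are illustrative, not the validated ones, and "@" again stands for ε.

```python
# Toy segment-wise ROVER vote (Eq. 2): score each word class in a WTN slot
# by lambda * m_w / m + (1 - lambda) * c_w, with an assumed confidence
# c_eps for the null transition "@". All numeric values are invented.
EPS = "@"

def vote(slot, conf, lam=0.5, c_eps=0.2):
    """slot: aligned words, one per recogniser; conf: word -> max confidence."""
    m = len(slot)
    best, best_score = None, float("-inf")
    for w in set(slot):
        c_w = c_eps if w == EPS else conf[w]
        s_w = lam * slot.count(w) / m + (1 - lam) * c_w
        if s_w > best_score:
            best, best_score = w, s_w
    return best

slot = ["Face", "Race", "Face", "Face"]
conf = {"Face": 0.6, "Race": 0.9}
# "Face": 0.5*0.75 + 0.5*0.6 = 0.675 beats "Race": 0.5*0.25 + 0.5*0.9 = 0.575
print(vote(slot, conf))  # -> Face
```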
The lexicon contains 12,502 words and is given by the union of the training, validation, and test sets. The character HMMs of the base recogniser are trained with the Baum-Welch algorithm. This algorithm iteratively optimises the parameters of the character models and is a special instance of the Expectation-Maximisation algorithm. Once the base recogniser is trained, we measure the performance of a large number of (α, β) value pairs on the validation set. The result of this measurement is shown in Fig. 4. We then choose the 24 best performing pairs to be considered for combination. Next, we apply the three selection strategies for (α_i, β_i), described in Sect. 2.2, where i = 1, ..., m. We do not only have to determine which (α, β) values should be used for combination, but also the number m of ensemble members, i.e. of (α, β) pairs, to be used. For this purpose we measure the performance for all possible values of m on the validation set. Simultaneously, we optimise the parameters λ and c_ε. The result of this validation procedure is shown in Fig. 5 for each of the selection strategies. Finally, we measure the performance of the optimised systems on the test set. No forward or backward tuning of recognisers is done directly on the test set. The performance of the systems on the test set is as follows. The base recognition system attains a word level accuracy of 67.35%. The m best strategy achieves a word level accuracy of 67.82%, whereas the forward search strategy achieves 68.03%. The backward search strategy reaches a word level accuracy of 68.09%. The latter is the best performing strategy, and it achieves a statistically significant improvement of 0.74% over the base recogniser at a significance level of 95%. Statistical significance was calculated from the text line level accuracies by applying a statistical z-test.
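The paper does not spell out which z-test variant was used; one plausible reading is a pooled two-proportion z-test on the line level accuracies, sketched here with invented counts of correctly recognised lines (not the paper's numbers):

```python
from math import sqrt

def two_proportion_z(k1, k2, n):
    """Pooled two-proportion z statistic for k1/n vs k2/n correct text lines."""
    p1, p2 = k1 / n, k2 / n
    p = (k1 + k2) / (2 * n)           # pooled proportion
    se = sqrt(p * (1 - p) * (2 / n))  # standard error of the difference
    return (p2 - p1) / se

# Invented counts out of the 1,863 test lines:
z = two_proportion_z(820, 880, 1863)
print(round(z, 2), z > 1.645)  # one-sided threshold for a 95% level
```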
The results on the test set are summarised in Tab. 1.

Table 1. Results on the test set.
Selection strategy    Word level accuracy
Base recogniser       67.35%
M best                67.82%
Forward search        68.03%
Backward search       68.09%

5 Conclusions

We have proposed a novel method to generate and combine ensemble members for offline handwritten text line recognition. Handwritten text line recognition is an extremely challenging field, due to the existence of a large number of word classes, and because the correct number of words in a text line is unknown in advance. Multiple recognisers were generated by the integration of a statistical language model into the hidden Markov model based text line recognition system. We presented three different strategies for ensemble member selection. The results of the individual ensemble members were combined according to the ROVER combination scheme. First, an iterative alignment algorithm is applied to align the results in a single word transition network. Second, the final result is built by extracting the best scoring transcription from the word transition network. In the experiments, conducted on a large set of text lines extracted from the IAM database, the proposed ensemble methods were able to achieve improvements in word level accuracy when compared to the base recogniser. The absolute improvement, which is less than 1%, is moderate. However, this phenomenon is common in handwriting recognition, where often enormous effort is needed to achieve any improvement at all. Nevertheless, we note that the obtained improvement is statistically significant. Future work will include the consideration of additional base recognition systems as well as the investigation of other alignment and voting strategies.

Acknowledgement

This research was supported by the Swiss National Science Foundation (Nr ). Additional funding was provided by the Swiss National Science Foundation NCCR program Interactive Multimodal Information Management (IM)2 in the Individual Project Scene Analysis.

References

[1] J. Fiscus. A post-processing system to yield reduced word error rates: Recognizer output voting error reduction.
In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Santa Barbara, pages ,
[2] P. Gader, M. Mohamed, and J. Keller. Fusion of handwritten word classifiers. Pattern Recognition Letters, 17: ,
[3] S. Günter and H. Bunke. Ensembles of classifiers for handwritten word recognition. International Journal on Document Analysis and Recognition, 5(4): ,
[4] T. Huang and C. Suen. Combination of multiple experts for the recognition of unconstrained handwritten numerals. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17:90-94,
[5] S. Johansson, E. Atwell, R. Garside, and G. Leech. The Tagged LOB Corpus, User's Manual. Norwegian Computing Center for the Humanities, Bergen, Norway,
[6] G. Kim, V. Govindaraju, and S. Srihari. Architecture for handwritten text recognition systems. In S.-W. Lee, editor, Advances in Handwriting Recognition, pages . World Scientific Publ. Co.,
[7] U.-V. Marti and H. Bunke. Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system. International Journal of Pattern Recognition and Artificial Intelligence, 15:65-90,
[8] U.-V. Marti and H. Bunke. The IAM-database: an English sentence database for offline handwriting recognition. International Journal on Document Analysis and Recognition, 5:39-46,
[9] L. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. of the IEEE, 77(2): ,
[10] F. Roli, J. Kittler, and T. Windeatt, editors. Proc. of the 5th International Workshop on Multiple Classifier Systems, Cagliari, Italy,
[11] H. Schwenk and J. Gauvain. Combining multiple speech recognizers using voting & language model information. In International Conference on Speech and Language Processing (ICSLP), Beijing, China, pages ,
[12] A. Senior and A. Robinson. An off-line cursive handwriting recognition system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3): ,
[13] K. Sirlantzkis, M. Fairhurst, and M. Hoque. Genetic algorithm for multiple classifier configuration: A case study in character recognition. In J. Kittler and F. Roli, editors, 2nd International Workshop on Multiple Classifier Systems (MCS), Cambridge, England, pages ,
[14] A. Vinciarelli, S. Bengio, and H. Bunke. Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6): ,

[15] A. Vinciarelli and J. Luettin. Off-line cursive script recognition based on continuous density HMM. In 7th International Workshop on Frontiers in Handwriting Recognition, Amsterdam, The Netherlands, pages ,
[16] A. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2): ,
[17] R. Wagner and M. Fischer. The string-to-string correction problem. Journal of the ACM, 21(1): ,
[18] L. Wang and T. Jiang. On the complexity of multiple sequence alignment. Journal of Computational Biology, 1(4): ,
[19] W. Wang, A. Brakensiek, and G. Rigoll. Combination of multiple classifiers for handwritten word recognition. In 8th International Workshop on Frontiers in Handwriting Recognition, Niagara-on-the-Lake, Canada, pages ,
[20] X. Ye, M. Cheriet, and C. Y. Suen. StrCombo: combination of string recognizers. Pattern Recognition Letters, 23: ,
[21] M. Zimmermann, R. Bertolami, and H. Bunke. Rejection strategies for offline handwritten sentence recognition. In 17th International Conference on Pattern Recognition, Cambridge, England, volume 2, pages ,
[22] M. Zimmermann and H. Bunke. Hidden Markov model length optimization for handwriting recognition systems. In 8th International Workshop on Frontiers in Handwriting Recognition, Niagara-on-the-Lake, Canada, pages ,
[23] M. Zimmermann and H. Bunke. Optimizing the integration of a statistical language model in HMM based offline handwriting text recognition. In 17th International Conference on Pattern Recognition, Cambridge, England, volume 2, pages ,


More information

Rule Learning With Negation: Issues Regarding Effectiveness

Rule Learning With Negation: Issues Regarding Effectiveness Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United

More information

IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH

IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH 2009 423 Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition George

More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction INTERSPEECH 2015 Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa Department of Computer

More information

A study of speaker adaptation for DNN-based speech synthesis

A study of speaker adaptation for DNN-based speech synthesis A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,

More information

Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape

Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Koshi Odagiri 1, and Yoichi Muraoka 1 1 Graduate School of Fundamental/Computer Science and Engineering, Waseda University,

More information

Improvements to the Pruning Behavior of DNN Acoustic Models

Improvements to the Pruning Behavior of DNN Acoustic Models Improvements to the Pruning Behavior of DNN Acoustic Models Matthias Paulik Apple Inc., Infinite Loop, Cupertino, CA 954 mpaulik@apple.com Abstract This paper examines two strategies that positively influence

More information

Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition

Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and

More information

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Hua Zhang, Yun Tang, Wenju Liu and Bo Xu National Laboratory of Pattern Recognition Institute of Automation, Chinese

More information

PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES

PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES Po-Sen Huang, Kshitiz Kumar, Chaojun Liu, Yifan Gong, Li Deng Department of Electrical and Computer Engineering,

More information

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering

More information

A Handwritten French Dataset for Word Spotting - CFRAMUZ

A Handwritten French Dataset for Word Spotting - CFRAMUZ A Handwritten French Dataset for Word Spotting - CFRAMUZ Nikolaos Arvanitopoulos School of Computer and Communication Sciences (IC) Ecole Polytechnique Federale de Lausanne (EPFL) nick.arvanitopoulos@epfl.ch

More information

Unsupervised Acoustic Model Training for Simultaneous Lecture Translation in Incremental and Batch Mode

Unsupervised Acoustic Model Training for Simultaneous Lecture Translation in Incremental and Batch Mode Unsupervised Acoustic Model Training for Simultaneous Lecture Translation in Incremental and Batch Mode Diploma Thesis of Michael Heck At the Department of Informatics Karlsruhe Institute of Technology

More information

BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING

BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING Gábor Gosztolya 1, Tamás Grósz 1, László Tóth 1, David Imseng 2 1 MTA-SZTE Research Group on Artificial

More information

Analysis of Speech Recognition Models for Real Time Captioning and Post Lecture Transcription

Analysis of Speech Recognition Models for Real Time Captioning and Post Lecture Transcription Analysis of Speech Recognition Models for Real Time Captioning and Post Lecture Transcription Wilny Wilson.P M.Tech Computer Science Student Thejus Engineering College Thrissur, India. Sindhu.S Computer

More information

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz

More information

AUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION

AUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 11/2007, ISSN 1642-6037 Marek WIŚNIEWSKI *, Wiesława KUNISZYK-JÓŹKOWIAK *, Elżbieta SMOŁKA *, Waldemar SUSZYŃSKI * HMM, recognition, speech, disorders

More information

A Neural Network GUI Tested on Text-To-Phoneme Mapping

A Neural Network GUI Tested on Text-To-Phoneme Mapping A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis

More information

Chinese Language Parsing with Maximum-Entropy-Inspired Parser

Chinese Language Parsing with Maximum-Entropy-Inspired Parser Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art

More information

Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition

Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Yanzhang He, Eric Fosler-Lussier Department of Computer Science and Engineering The hio

More information

CS Machine Learning

CS Machine Learning CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

More information

Off-line handwritten Thai name recognition for student identification in an automated assessment system

Off-line handwritten Thai name recognition for student identification in an automated assessment system Griffith Research Online https://research-repository.griffith.edu.au Off-line handwritten Thai name recognition for student identification in an automated assessment system Author Suwanwiwat, Hemmaphan,

More information

INVESTIGATION OF UNSUPERVISED ADAPTATION OF DNN ACOUSTIC MODELS WITH FILTER BANK INPUT

INVESTIGATION OF UNSUPERVISED ADAPTATION OF DNN ACOUSTIC MODELS WITH FILTER BANK INPUT INVESTIGATION OF UNSUPERVISED ADAPTATION OF DNN ACOUSTIC MODELS WITH FILTER BANK INPUT Takuya Yoshioka,, Anton Ragni, Mark J. F. Gales Cambridge University Engineering Department, Cambridge, UK NTT Communication

More information

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Tomi Kinnunen and Ismo Kärkkäinen University of Joensuu, Department of Computer Science, P.O. Box 111, 80101 JOENSUU,

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview

More information

INPE São José dos Campos

INPE São José dos Campos INPE-5479 PRE/1778 MONLINEAR ASPECTS OF DATA INTEGRATION FOR LAND COVER CLASSIFICATION IN A NEDRAL NETWORK ENVIRONNENT Maria Suelena S. Barros Valter Rodrigues INPE São José dos Campos 1993 SECRETARIA

More information

Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition

Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition Seltzer, M.L.; Raj, B.; Stern, R.M. TR2004-088 December 2004 Abstract

More information

Linking Task: Identifying authors and book titles in verbose queries

Linking Task: Identifying authors and book titles in verbose queries Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X

The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,

More information

AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS

AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS R.Barco 1, R.Guerrero 2, G.Hylander 2, L.Nielsen 3, M.Partanen 2, S.Patel 4 1 Dpt. Ingeniería de Comunicaciones. Universidad de Málaga.

More information

Deep Neural Network Language Models

Deep Neural Network Language Models Deep Neural Network Language Models Ebru Arısoy, Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran IBM T.J. Watson Research Center Yorktown Heights, NY, 10598, USA {earisoy, tsainath, bedk, bhuvana}@us.ibm.com

More information

WHEN THERE IS A mismatch between the acoustic

WHEN THERE IS A mismatch between the acoustic 808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,

More information

CSL465/603 - Machine Learning

CSL465/603 - Machine Learning CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am

More information

Axiom 2013 Team Description Paper

Axiom 2013 Team Description Paper Axiom 2013 Team Description Paper Mohammad Ghazanfari, S Omid Shirkhorshidi, Farbod Samsamipour, Hossein Rahmatizadeh Zagheli, Mohammad Mahdavi, Payam Mohajeri, S Abbas Alamolhoda Robotics Scientific Association

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

Cooperative evolutive concept learning: an empirical study

Cooperative evolutive concept learning: an empirical study Cooperative evolutive concept learning: an empirical study Filippo Neri University of Piemonte Orientale Dipartimento di Scienze e Tecnologie Avanzate Piazza Ambrosoli 5, 15100 Alessandria AL, Italy Abstract

More information

Disambiguation of Thai Personal Name from Online News Articles

Disambiguation of Thai Personal Name from Online News Articles Disambiguation of Thai Personal Name from Online News Articles Phaisarn Sutheebanjard Graduate School of Information Technology Siam University Bangkok, Thailand mr.phaisarn@gmail.com Abstract Since online

More information

Longest Common Subsequence: A Method for Automatic Evaluation of Handwritten Essays

Longest Common Subsequence: A Method for Automatic Evaluation of Handwritten Essays IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 6, Ver. IV (Nov Dec. 2015), PP 01-07 www.iosrjournals.org Longest Common Subsequence: A Method for

More information

SARDNET: A Self-Organizing Feature Map for Sequences

SARDNET: A Self-Organizing Feature Map for Sequences SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu

More information

Australian Journal of Basic and Applied Sciences

Australian Journal of Basic and Applied Sciences AENSI Journals Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Journal home page: www.ajbasweb.com Feature Selection Technique Using Principal Component Analysis For Improving Fuzzy C-Mean

More information

University of Groningen. Systemen, planning, netwerken Bosman, Aart

University of Groningen. Systemen, planning, netwerken Bosman, Aart University of Groningen Systemen, planning, netwerken Bosman, Aart IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Learning From the Past with Experiment Databases

Learning From the Past with Experiment Databases Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University

More information

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers October 31, 2003 Amit Juneja Department of Electrical and Computer Engineering University of Maryland, College Park,

More information

Investigation on Mandarin Broadcast News Speech Recognition

Investigation on Mandarin Broadcast News Speech Recognition Investigation on Mandarin Broadcast News Speech Recognition Mei-Yuh Hwang 1, Xin Lei 1, Wen Wang 2, Takahiro Shinozaki 1 1 Univ. of Washington, Dept. of Electrical Engineering, Seattle, WA 98195 USA 2

More information

A Case Study: News Classification Based on Term Frequency

A Case Study: News Classification Based on Term Frequency A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center

More information

Rule Learning with Negation: Issues Regarding Effectiveness

Rule Learning with Negation: Issues Regarding Effectiveness Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX

More information

OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS

OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,

More information

Phonetic- and Speaker-Discriminant Features for Speaker Recognition. Research Project

Phonetic- and Speaker-Discriminant Features for Speaker Recognition. Research Project Phonetic- and Speaker-Discriminant Features for Speaker Recognition by Lara Stoll Research Project Submitted to the Department of Electrical Engineering and Computer Sciences, University of California

More information

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion

More information

Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology

Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology ISCA Archive SUBJECTIVE EVALUATION FOR HMM-BASED SPEECH-TO-LIP MOVEMENT SYNTHESIS Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano Graduate School of Information Science, Nara Institute of Science & Technology

More information

Knowledge Transfer in Deep Convolutional Neural Nets

Knowledge Transfer in Deep Convolutional Neural Nets Knowledge Transfer in Deep Convolutional Neural Nets Steven Gutstein, Olac Fuentes and Eric Freudenthal Computer Science Department University of Texas at El Paso El Paso, Texas, 79968, U.S.A. Abstract

More information

Welcome to. ECML/PKDD 2004 Community meeting

Welcome to. ECML/PKDD 2004 Community meeting Welcome to ECML/PKDD 2004 Community meeting A brief report from the program chairs Jean-Francois Boulicaut, INSA-Lyon, France Floriana Esposito, University of Bari, Italy Fosca Giannotti, ISTI-CNR, Pisa,

More information

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract

More information

Test Effort Estimation Using Neural Network

Test Effort Estimation Using Neural Network J. Software Engineering & Applications, 2010, 3: 331-340 doi:10.4236/jsea.2010.34038 Published Online April 2010 (http://www.scirp.org/journal/jsea) 331 Chintala Abhishek*, Veginati Pavan Kumar, Harish

More information

On-Line Data Analytics

On-Line Data Analytics International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob

More information

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,

More information

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Amit Juneja and Carol Espy-Wilson Department of Electrical and Computer Engineering University of Maryland,

More information

ACOUSTIC EVENT DETECTION IN REAL LIFE RECORDINGS

ACOUSTIC EVENT DETECTION IN REAL LIFE RECORDINGS ACOUSTIC EVENT DETECTION IN REAL LIFE RECORDINGS Annamaria Mesaros 1, Toni Heittola 1, Antti Eronen 2, Tuomas Virtanen 1 1 Department of Signal Processing Tampere University of Technology Korkeakoulunkatu

More information

Switchboard Language Model Improvement with Conversational Data from Gigaword

Switchboard Language Model Improvement with Conversational Data from Gigaword Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword

More information

NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches

NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches Yu-Chun Wang Chun-Kai Wu Richard Tzong-Han Tsai Department of Computer Science

More information

A NOVEL SCHEME FOR SPEAKER RECOGNITION USING A PHONETICALLY-AWARE DEEP NEURAL NETWORK. Yun Lei Nicolas Scheffer Luciana Ferrer Mitchell McLaren

A NOVEL SCHEME FOR SPEAKER RECOGNITION USING A PHONETICALLY-AWARE DEEP NEURAL NETWORK. Yun Lei Nicolas Scheffer Luciana Ferrer Mitchell McLaren A NOVEL SCHEME FOR SPEAKER RECOGNITION USING A PHONETICALLY-AWARE DEEP NEURAL NETWORK Yun Lei Nicolas Scheffer Luciana Ferrer Mitchell McLaren Speech Technology and Research Laboratory, SRI International,

More information

DIRECT ADAPTATION OF HYBRID DNN/HMM MODEL FOR FAST SPEAKER ADAPTATION IN LVCSR BASED ON SPEAKER CODE

DIRECT ADAPTATION OF HYBRID DNN/HMM MODEL FOR FAST SPEAKER ADAPTATION IN LVCSR BASED ON SPEAKER CODE 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) DIRECT ADAPTATION OF HYBRID DNN/HMM MODEL FOR FAST SPEAKER ADAPTATION IN LVCSR BASED ON SPEAKER CODE Shaofei Xue 1

More information

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.

More information

Twitter Sentiment Classification on Sanders Data using Hybrid Approach

Twitter Sentiment Classification on Sanders Data using Hybrid Approach IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders

More information

Ensemble Technique Utilization for Indonesian Dependency Parser

Ensemble Technique Utilization for Indonesian Dependency Parser Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id

More information

Softprop: Softmax Neural Network Backpropagation Learning

Softprop: Softmax Neural Network Backpropagation Learning Softprop: Softmax Neural Networ Bacpropagation Learning Michael Rimer Computer Science Department Brigham Young University Provo, UT 84602, USA E-mail: mrimer@axon.cs.byu.edu Tony Martinez Computer Science

More information

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working

More information

Seminar - Organic Computing

Seminar - Organic Computing Seminar - Organic Computing Self-Organisation of OC-Systems Markus Franke 25.01.2006 Typeset by FoilTEX Timetable 1. Overview 2. Characteristics of SO-Systems 3. Concern with Nature 4. Design-Concepts

More information

Semi-Supervised Face Detection

Semi-Supervised Face Detection Semi-Supervised Face Detection Nicu Sebe, Ira Cohen 2, Thomas S. Huang 3, Theo Gevers Faculty of Science, University of Amsterdam, The Netherlands 2 HP Research Labs, USA 3 Beckman Institute, University

More information

Netpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models

Netpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models 1 Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models James B.

More information

The Good Judgment Project: A large scale test of different methods of combining expert predictions

The Good Judgment Project: A large scale test of different methods of combining expert predictions The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania

More information

Training and evaluation of POS taggers on the French MULTITAG corpus

Training and evaluation of POS taggers on the French MULTITAG corpus Training and evaluation of POS taggers on the French MULTITAG corpus A. Allauzen, H. Bonneau-Maynard LIMSI/CNRS; Univ Paris-Sud, Orsay, F-91405 {allauzen,maynard}@limsi.fr Abstract The explicit introduction

More information