
ISCA Archive, SLTU-2014, St. Petersburg, Russia, May 2014

SEQUENCE MEMOIZER BASED LANGUAGE MODEL FOR RUSSIAN SPEECH RECOGNITION

Daria Vazhenina, Konstantin Markov
The University of Aizu, Japan

ABSTRACT

In this paper, we propose a novel language model for Russian large vocabulary speech recognition based on the sequence memoizer modeling technique. The sequence memoizer is a long-span text dependency model and was initially proposed for character language modeling. Here, we use it to build a word-level language model (LM) for ASR. We compare its performance with the recurrent neural network (RNN) LM, which also models long-span word dependencies. A number of experiments were carried out using various amounts of training data and different text data arrangements. According to our experimental results, the sequence memoizer LM outperforms the recurrent neural network and standard 3-gram LMs in terms of perplexity, while the RNN LM achieves a better word error rate. The lowest word error rate is achieved by combining all three language models together using linear interpolation.

Index Terms: sequence memoizer, advanced language modeling, inflective languages

1. INTRODUCTION

Although the underlying speech technology is mostly language-independent, differences between languages with respect to their structure and grammar have a substantial effect on the performance of automatic speech recognition (ASR) systems. Research in the ASR area has traditionally focused on several main languages, such as English, French, Spanish, Chinese or Japanese, and some other languages, especially eastern European ones, have received much less attention.

The Russian language belongs to the Slavic branch of the Indo-European group of languages, which are characterized by a complex mechanism of word formation and flexible word order. Word relations within a sentence are marked by inflections and grammatical categories such as gender, number, person, case, etc. [1]. Sentence structure is not restricted by hard grammatical rules as in the English, German or Arabic languages. These two factors greatly reduce the predictive power of conventional n-gram language models (LMs). Nevertheless, current Russian large vocabulary continuous speech recognition (LVCSR) systems usually rely on conventional n-grams [2-6]. An improved bi-gram model was proposed in [7], where the counts of some existing n-grams are increased after syntactic analysis of the training data. Long-distance dependencies between words are identified and added as new bi-gram counts for building 2-gram and 3-gram LMs. This allowed the word error rate of a speech recognition system with a 204K-word dictionary to be reduced from 27.5% to 26.9%.

In conventional n-gram language models, prediction of the next word is usually conditioned on just a few preceding words, which is clearly insufficient to capture semantics. Recently, the recurrent neural network (RNN) LM was proposed for better prediction of sequential data using longer context dependency [8]. The RNN LM allows effective processing of word sequences of arbitrary length, which overcomes the main n-gram drawback, the dependency on only a few consecutive words. In [9], the performance of this model was compared with many other state-of-the-art language models, such as the structured LM, the random forest LM and several types of neural network LMs, for the English language. It significantly outperformed all of them, both in terms of perplexity and WER. In [10], an RNN LM was implemented in a Russian LVCSR system.
Using a 40M-word training corpus, the standalone RNN LM showed better performance than a factored language model and the baseline 3-gram LM. The best relative WER reduction of 7.4% was achieved using interpolation of all 3 models.

The sequence memoizer (SM), proposed in [11], is a hierarchical Bayesian model that is able to capture long-range dependencies and power-law characteristics. The next word in this model is conditionally dependent on all previous words in the given sequence. In [11], models were built at the character level, with the space character serving as the word-end symbol. The performance of the SM language model was evaluated by perplexity on the APNews dataset, which consists of 14M words and has a vocabulary size of about 18K words. It showed improvement over standard 4-gram, hierarchical Pitman-Yor 4-gram and conventional neural network LMs. To our knowledge, there have been no previous attempts to use the SM language model in a speech recognition task.

This paper describes our implementation of the sequence memoizer for Russian LVCSR with a vocabulary of 100K words. We investigated the influence of different training corpus sizes and text data arrangements on the language model performance. The SM is compared with the RNN LM, which also allows modeling of unbounded-depth sequences. Both language modeling techniques are applied via n-best rescoring. While the SM LM achieved the lowest perplexity, the best result in terms of WER was obtained by interpolating the conventional 3-gram LM with both the SM and RNN LMs.

2. SEQUENCE MEMOIZER

The formulation of the sequence memoizer is based on an unbounded-depth hierarchical Pitman-Yor process. Hierarchical Bayesian language models have succeeded in achieving performance comparable to state-of-the-art n-gram LMs smoothed with modified Kneser-Ney (MKN) smoothing. A hierarchical Pitman-Yor process (HPYP) LM, initially introduced in [12], is a type of Bayesian language model based on the Pitman-Yor (PY) process that has been shown to improve perplexity over the MKN smoothed n-gram LM. In the HPYP LM, given a context u consisting of a sequence of n previous words, let G_u(w) be a distribution over words w with a Pitman-Yor process as a prior:

    G_u \mid d_u, \theta_u, G_{\sigma(u)} \sim \mathrm{PY}(d_u, \theta_u, G_{\sigma(u)}),    (1)

where d_u is a discount parameter, \theta_u is a strength parameter, and \sigma(u) is the suffix of u consisting of the (n-1) most recent previous words. Since the base distribution G_{\sigma(u)} is unknown as well, a prior of the same form (1) with parameters (d_{\sigma(u)}, \theta_{\sigma(u)}, G_{\sigma(\sigma(u))}) is recursively placed over it. This recursion is repeated until we reach G_\varepsilon, the distribution of the current word given the empty context. The prior for this distribution has the form

    G_\varepsilon \mid d_0, \theta_0, G_0 \sim \mathrm{PY}(d_0, \theta_0, G_0),    (2)

where the base distribution G_0 is assumed to be uniform over the vocabulary. The sequence memoizer is essentially an implementation of such an unbounded-depth HPYP LM [13]. In this case, the strength parameter \theta_u is set equal to 0. Then, the predictive distribution of a word given its previous context u takes the form

    P(w \mid u) = \frac{c_u(w) - d_u t_u(w)}{c_u} + \frac{d_u t_u}{c_u} P(w \mid \sigma(u)),    (3)

where c_u(w) is the count of draws of word w with the context being u; c_u is the total count of draws with context u; t_u(w) is the count of draws of word w with context u for which the recursion to the shorter context suffix \sigma(u), i.e. to G_{\sigma(u)}, was applied; and t_u is the total count of draws with context u for which the recursion to G_{\sigma(u)} was applied. If context u is not present in the context tree, then the longest suffix of u that is present is used, \sigma(u) or \sigma(\sigma(u)) and so on.

When building the model over very long sequences, a large number of recursions of form (1) might be required, which raises the computational cost a lot. To reduce the size of the model, all non-branching, non-leaf nodes are integrated out, leaving a finite number of nodes in a compact context tree. Figure 1 shows the graphical model instantiated by an example sequence of integers. Note that in this SM compact context tree, nodes that are not branching nodes and are not associated with observed data are already integrated out. In the corresponding non-compact tree, the same path would contain additional intermediate nodes; when these nodes are marginalized out, the parameters in form (1) are adjusted accordingly.

Figure 1. Sequence memoizer compact context tree.

Inference in the SM model is performed by recursive application of the Chinese restaurant process in the same way as for the HPYP LM. A detailed inference scheme for the discount parameters d_u and the word arrangement variables c_u(w), c_u, t_u(w) and t_u is described in [14]. To calculate perplexity for this model, the predictive distribution of form (3) is used as the probability of a word given its context, P(w | u).
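To make the recursion in (3) concrete, the sketch below computes the predictive probability of a word by backing off through successively shorter context suffixes and uses it to score a test sequence by perplexity. This is our own illustration, not code from the SM toolkit [18]; the count statistics c_u(w), c_u, t_u(w), t_u and the discounts d_u are assumed to be already estimated (e.g., from one sampling iteration, as described in Section 3.2).

    import math

    # Hypothetical per-context statistics as used in Eq. (3):
    #   counts[u][w] = c_u(w),  tables[u][w] = t_u(w),  discount[u] = d_u,
    # where a context u is a tuple of preceding words, most recent word last.
    counts, tables, discount = {}, {}, {}

    def predictive_prob(word, context, base_prob):
        """P(word | context) from Eq. (3) with the strength parameter set to 0,
        backing off to the longest suffix of the context present in the model."""
        while context and context not in counts:
            context = context[1:]              # drop the oldest word
        if context not in counts:
            return base_prob                   # unseen empty context: uniform G_0
        c_uw = counts[context].get(word, 0)
        t_uw = tables[context].get(word, 0)
        c_u = sum(counts[context].values())
        t_u = sum(tables[context].values())
        d_u = discount[context]
        shorter = predictive_prob(word, context[1:], base_prob) if context else base_prob
        return (c_uw - d_u * t_uw) / c_u + (d_u * t_u / c_u) * shorter

    def perplexity(words, vocab_size):
        """Perplexity of a word sequence, conditioning each word on all of
        its preceding words (unbounded context)."""
        log_sum = 0.0
        for i, w in enumerate(words):
            log_sum += math.log(predictive_prob(w, tuple(words[:i]), 1.0 / vocab_size))
        return math.exp(-log_sum / len(words))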
3. EXPERIMENTS

3.1. Databases and feature extraction

Our text corpus contains 41M words with a vocabulary size of about 100K words. This corpus was assembled from recent news articles published on the freely available Internet sites of several on-line Russian newspapers. We split the corpus into a 40M-word train set and a test set consisting of 1M words. For the experiments with different corpus sizes, we separated 10M, 20M and 30M words from the full train set and used them as smaller train sets.
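A minimal sketch of how such a split can be produced, assuming the corpus is stored one sentence per line (file names, encoding and the exact splitting procedure are our assumptions, not details from the paper):

    # Carve a 1M-word test set and nested train subsets of roughly 10M, 20M,
    # 30M and 40M words out of a sentence-per-line corpus (sizes illustrative).
    def read_sentences(path):
        with open(path, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]

    def take_words(sentences, start, n_words):
        """Collect sentences from index `start` until about n_words are gathered."""
        taken, count, i = [], 0, start
        while i < len(sentences) and count < n_words:
            taken.append(sentences[i])
            count += len(sentences[i].split())
            i += 1
        return taken, i

    sentences = read_sentences("russian_news_corpus.txt")
    test, train_start = take_words(sentences, 0, 1_000_000)
    for size_m in (10, 20, 30, 40):
        subset, _ = take_words(sentences, train_start, size_m * 1_000_000)
        with open(f"train_{size_m}M.txt", "w", encoding="utf-8") as out:
            out.write("\n".join(subset))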

In our ASR experiments, we used the SPIIRAS [16] and GlobalPhone [3] Russian speech databases. The speech data are collected in clean acoustic conditions. In total, there are utterances pronounced by 165 speakers (86 male and 79 female), with a duration of about 38 hours. The speech test data consist of 10% of the GlobalPhone recordings, pronounced by 5 male and 5 female speakers not used for acoustic model (AM) training. The speech signal was parameterized with energy, 12 MFCCs and their first and second order derivatives. The AM consists of 5342 tied states with 16-mixture GMMs as output models. Our speech decoder (Julius ver. 4.2 [17]) produces a 500-best hypothesis list, which we use for rescoring with the SM and RNN LMs. The SM LMs were built using the Java version of the Sequence Memoizer toolkit [18], and the RNN LMs were implemented using the RNNLM toolkit (v.0.3b) [15].

3.2. Experimental results

When modeling long-span word dependencies across sentence boundaries, sequence modeling would strongly depend on the sentence order in the training data. In many cases, a text corpus consists of sentences that are unconnected in meaning, because some sentences are eliminated after data pre-processing. Thus, we can assume that our initial data are shuffled. To find out how the performance of the model depends on the train data order, we built models using shuffled and sorted data. Here, we used random shuffling as well as sorting by sentence length in increasing and decreasing order.

Our sequence memoizer model is built using the word as the atomic unit, unlike previous attempts built using symbols. In this case, the vocabulary size of the model increases significantly, from 128 to 100K. Because of the large vocabulary, more sampling iterations would probably be necessary for more efficient parameter estimation. However, increasing the number of sampling iterations did not noticeably change the model performance, while the computation time increased a lot. Thus, we used one sampling iteration for building our SM models. For the RNN LM evaluation, we used the optimal parameters identified in [10]: 150 hidden nodes and 1000 classes. We used train data sorted in increasing order of sentence length in all experiments with RNN, since the performance did vary significantly depending on the train data order.

3.2.1. Test sequence length experiment

In this experiment, we used the small train set of 10M words. From the rest of the text data, we selected test sets of 1M words each, so that the average sentence length in these sets differs. In total, all test sets contain 2.8M words. Our baseline 3-gram was trained using the 10M-word train set as is. The RNN LM was trained using the same set sorted in increasing order of sentence length (RNN-fs). In order to investigate the effect of sentence order on the SM LM performance, we randomly shuffled the training set 6 times (SM-1 to SM-6), as well as sorted it in increasing (SM-fs) and decreasing (SM-bs) order. The perplexities obtained on all test sets are summarized in Table 1.

Table 1. Perplexities obtained using test sets with various average sentence lengths and a train set of 10M words (models: SM-1 to SM-6, SM-fs, SM-bs, RNN-fs and 3-gram; rows: test sets grouped by average sentence length, plus all test sets combined).

The performance of the SM varies over a very wide range depending on the train data order. The shuffled models SM-1 and SM-5 and the model sorted in decreasing order, SM-bs, outperform both the 3-gram and RNN LMs over all test sets. For all models, perplexity improves as the average sentence length increases.
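The data arrangements compared above (six random shuffles SM-1 to SM-6, and the length-sorted SM-fs and SM-bs variants) are straightforward to reproduce. The following sketch is our illustration of those orderings, with each training sentence held as one string:

    import random

    def train_data_arrangements(sentences, n_shuffles=6, seed=0):
        """Produce the orderings compared in this experiment: several random
        shuffles (SM-1..SM-6) and sorts by sentence length in increasing
        (SM-fs) and decreasing (SM-bs) order."""
        rng = random.Random(seed)
        arrangements = {}
        for i in range(1, n_shuffles + 1):
            shuffled = list(sentences)
            rng.shuffle(shuffled)
            arrangements[f"SM-{i}"] = shuffled
        by_length = sorted(sentences, key=lambda s: len(s.split()))
        arrangements["SM-fs"] = by_length          # increasing sentence length
        arrangements["SM-bs"] = by_length[::-1]    # decreasing sentence length
        return arrangements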
3.2.2. Performance with increasing size of the train corpus

In [19], it was reported that with lots of training data the improvements provided by many advanced modeling techniques almost disappear. To investigate the influence of an increasing amount of train data, we used 4 train sets of 10, 20, 30 and 40 million words and the test set described in Section 3.1. We chose both models built using sorted data, as well as the two SM LMs built using shuffled data that showed better performance in the previous experiment, SM-1 and SM-5. In the same manner, we built SM LMs using the 20M and 30M train sets. Full-size models were trained using the sorted 40M train data. RNN LMs were built using each train set separately, with the same parameters as identified in Section 3.2. The perplexities obtained using the various train set sizes are summarized in Table 2. In the last column, the relative improvement in perplexity obtained by the SM LM with the lowest perplexity over the baseline 3-gram is presented. We can observe that the relative improvement does not disappear with increasing size of the train data; it stays at the same level for the 20M, 30M and 40M-word train sets. The lowest perplexities were obtained by the SM-bs model, built with data sorted in decreasing order of sentence length, for the train sets of 20M, 30M and 40M words.

Table 2. Perplexities of models built using various sizes of train sets (rows: 10M, 20M, 30M and 40M words; columns: SM-1, SM-5, SM-fs, SM-bs, RNN-fs, 3-gram, and relative improvement in %).

3.2.3. Speech recognition evaluation of interpolated models

Next, we evaluated speech recognition performance using the models trained on the 40M data set, selected on the basis of the perplexity results of the previous experiment. In Table 3, the speech recognition performance is presented for the SM, RNN and 3-gram LMs, as well as for their linear interpolations. Although the SM outperforms both the 3-gram and the RNN in terms of perplexity, its standalone speech recognition performance is worse than that of the RNN and 3-gram LMs. Nevertheless, a relative WER improvement of 5.3% was achieved using linear interpolation of all 3 models.

Table 3. WERs of standalone and interpolated models built using the 40M train set (standalone SM-fs, SM-bs, RNN-fs and 3-gram models, and the interpolations SM + RNN, SM/RNN + 3-gram and SM + RNN + 3-gram).
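The interpolated entries in Table 3 combine the three LMs linearly during the 500-best rescoring. The sketch below shows one way such a combination step can look; the per-word log-probabilities, the interpolation weights and the LM scale factor are assumptions for illustration, not values reported in this paper.

    import math

    def interpolated_logprob(word_logps, weights=(0.4, 0.4, 0.2)):
        """Sentence log10-probability under a per-word linear interpolation of
        three LMs. word_logps holds one (log10 P_SM, log10 P_RNN, log10 P_3gram)
        tuple per word of the hypothesis; the weights are illustrative."""
        total = 0.0
        for lp_sm, lp_rnn, lp_ng in word_logps:
            p = (weights[0] * 10 ** lp_sm
                 + weights[1] * 10 ** lp_rnn
                 + weights[2] * 10 ** lp_ng)
            total += math.log10(p)
        return total

    def rescore_nbest(hypotheses, lm_scale=10.0):
        """Re-rank an n-best list with the interpolated LM. Each hypothesis is
        a dict such as {"words": [...], "am_score": acoustic log-score,
        "word_logps": per-word LM log10-probabilities for SM, RNN and 3-gram}."""
        return max(hypotheses,
                   key=lambda h: h["am_score"] + lm_scale * interpolated_logprob(h["word_logps"]))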

3.2.4. Random sentence generation from the SM, RNN and 3-gram LMs

To test the ability of the models to generate valid sentences, we used the SM-bs, RNN-fs and 3-gram models trained on the 40M train set. Table 4 shows example data generated by each LM, given here with their approximate translations, since it is not possible to make an unambiguous translation of grammatically incorrect, meaningless sentences. It is easy to see that the examples generated by the SM LM are grammatically correct with an appropriate choice of words. Note that the RNN LM generated very long sentences, failing to split word sequences into sentences of appropriate length.

Table 4. Examples of data generated by models trained on the 40M train set (approximate English translations).
SM: "Team of thirty firemen is trying to extinguish fire." / "One hundred eighty thousand rubles were assigned to conduct the campaign this year."
RNN: "Do not repeat that in Geneva could not keep the past and intends to finally his work essay motorcycles which he can happen now ..." / "Support region Russian threat of a new level of complexity it goes together."
3-gram: "However others fired joint his predecessor we have economic sanctions."

3.2.5. Training time comparison for SM and RNN LMs

Finally, we compared the SM and RNN LMs in terms of training time using text data sets of different sizes. Here, we used train data sorted in descending order to train the SM. Figure 2 shows that the SM training time increases almost linearly, which is an optimistic result for further experiments with more data.

Figure 2. Training time of the SM and RNN LMs built using various train sets.

4. CONCLUSION

As far as we know, this is the first attempt to apply the sequence memoizer language model to a speech recognition task. Similar to [11], we observed a reduction in perplexity using the sequence memoizer language model. Nevertheless, its standalone WER is worse than that of the RNN LM. Experiments with interpolation with other models show only negligible improvement when the SM scores are also included. Also, our experiment with data generation shows that the SM is able to capture dependencies within a sentence and produce grammatically correct and meaningful sentences. More work needs to be done to determine whether the SM model can be successfully applied to the ASR task.

5. REFERENCES

[1] P. Cubberley, "Russian: A Linguistic Introduction", Cambridge University Press, 2002.
[2] E.W. Whittaker, P.C. Woodland, "Comparison of language modelling techniques for Russian and English", In: Proc. ICSLP, 1998.
[3] S. Stüker, T. Schultz, "A grapheme based speech recognition system for Russian", In: Proc. SPECOM, St. Petersburg, Russia, Sep 2004.
[4] D. Vazhenina, K. Markov, "Phoneme set selection for Russian speech recognition", In: Proc. IEEE NLP-KE, Tokushima, Japan, Nov 2011.
[5] L. Lamel, S. Courcinous, J.L. Gauvain, Y. Josse, V.B. Le, "Transcription of Russian conversational speech", In: Proc. SLTU, Cape Town, South Africa, May 2012.
[6] Y. Titov, K. Kilgour, S. Stüker, A. Waibel, "The 2011 KIT QUAERO Speech-to-Text System for Russian", In: Proc. SPECOM, Kazan, Russia, Sep 2011.
[7] A. Karpov, K. Markov, I. Kipyatkova, D. Vazhenina, A. Ronzhin, "Large vocabulary Russian speech recognition using syntactico-statistical language modeling", Speech Communication, vol. 56, 2014.
[8] T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, S. Khudanpur, "Recurrent neural network based language model", In: Proc. INTERSPEECH, Makuhari, Japan, Sep 2010.
[9] T. Mikolov, A. Deoras, S. Kombrink, L. Burget, J. Cernocky, "Empirical evaluation and combination of advanced language modeling techniques", In: Proc. INTERSPEECH, Florence, Italy, Aug 2011.
[10] D. Vazhenina, K. Markov, "Evaluation of advanced language modelling techniques for Russian LVCSR", In: Proc. SPECOM, Pilsen, Czech Republic, Sep 2013.
[11] F. Wood, J. Gasthaus, C. Archambeau, L. James, Y.W. Teh, "The Sequence Memoizer", Communications of the ACM, vol. 54, no. 2, 2011.
[12] Y.W. Teh, "A hierarchical Bayesian language model based on Pitman-Yor processes", In: Proc. Annual Meeting of the ACL, Sydney, Australia, Jul 2006.
[13] F. Wood, C. Archambeau, J. Gasthaus, L.F. James, Y.W. Teh, "A stochastic memoizer for sequence data", In: Proc. ICML, 2009.
[14] Y.W. Teh, "A Bayesian Interpretation of Interpolated Kneser-Ney", Technical Report TRA2/06, School of Computing, NUS, 2006.
[15] T. Mikolov, S. Kombrink, L. Burget, J. Cernocky, S. Khudanpur, "Extensions of recurrent neural network language model", In: Proc. ICASSP, Prague, Czech Republic, May 2011.
[16] O. Jokisch, A. Wagner, R. Sabo, R. Jaeckel, N. Cylwik, et al., "Multilingual speech data collection for the assessment of pronunciation and prosody in a language learning system", In: Proc. SPECOM, St. Petersburg, Russia, June 2009.
[17] A. Lee, T. Kawahara, "Recent development of open-source speech recognition engine Julius", In: Proc. APSIPA ASC, Sapporo, Japan, Oct 2009.
[18] Sequence memoizer toolkit, code/sequencememoizer/.
[19] J.T. Goodman, "A bit of progress in language modeling", Technical Report MSR-TR-2001-72, Microsoft Research, 2001.
