Refine Decision Boundaries of a Statistical Ensemble by Active Learning

Dingsheng Luo (a) and Ke Chen (b,*)

(a) National Laboratory on Machine Perception and Center for Information Science, Peking University, Beijing 100871, China
(b) School of Computer Science, The University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom

Abstract

For pattern classification, the decision boundaries of a statistical ensemble are constructed gradually through a divide-and-conquer procedure based on resampling techniques. Hence a resampling criterion critically governs the process of forming the final decision boundaries. Motivated by active learning ideas, we propose an alternative resampling criterion based on the zero-one loss measure, in which all the patterns in the training set are ranked by their difficulty for classification, no matter whether a pattern has been misclassified or not. Our resampling criterion, incorporated into AdaBoost, has been applied to benchmark handwritten digit recognition and text-independent speaker identification tasks. Comparative results demonstrate that our method refines decision boundaries and therefore yields better generalization performance.

I. INTRODUCTION

Recent studies show that statistical ensemble learning is an effective way of improving the generalization capability of a learning system. For pattern classification, a statistical ensemble method constructs the decision boundaries gradually through a divide-and-conquer procedure. In this procedure, less accurate decision boundaries are first constructed to classify patterns roughly so that informative patterns can be found. By means of the informative patterns, the rough decision boundaries are gradually improved as the ensemble grows. The final decision boundaries are not fixed until error-free performance is obtained on the training set. From the perspective of statistical learning, the process of constructing decision boundaries in statistical ensemble learning can be interpreted as exploring decision boundaries of large margins [1], which leads to good generalization performance.

For growing a statistical ensemble, a resampling criterion plays a crucial role in selecting data to construct decision boundaries. In general, most existing resampling criteria are based on traditional error-based measures with respect to a distribution over examples, and only the misclassified portion of the training patterns is considered in the subsequent resampling. Unlike these criteria, a so-called pseudo-loss error measure has been proposed for data selection in AdaBoost, where all the patterns are considered during resampling [2]. Since the pseudo-loss measure not only focuses on the hardest patterns, which are misclassified, but also considers the patterns that are correctly classified, better generalization performance has been obtained [2] through the proper use of more information. Although all the patterns are considered for resampling under the pseudo-loss measure, the patterns that are correctly classified are treated as equally important. Our early studies in active learning indicate that those patterns may play different roles in the construction of decision boundaries even though all of them are correctly classified [3]. By further distinguishing between them with a learning algorithm, better generalization performance has been achieved.
Motivated by the aforementioned work [2],[3], we propose a novel resampling criterion on the basis of the zero-one loss for minimum-error-rate classification [4]. The proposed criterion provides a unified measure for detecting informative patterns among all the patterns in the training set, no matter whether a pattern is misclassified or not. In particular, the patterns that are correctly classified are also ranked by their difficulty for classification, which leads to an active data selection procedure over all the patterns, in contrast to traditional error-based resampling criteria. We have applied our resampling criterion within AdaBoost to tackle two real-world classification problems, optical character recognition and speaker identification, by means of benchmark databases. Comparative results demonstrate that our method refines the decision boundaries of AdaBoost and hence yields better generalization performance.

The rest of the paper is organized as follows. Section II presents the motivation and our resampling criterion. Section III describes the system used for the simulations and reports comparative results. Conclusions are drawn in the last section.

II. ACTIVE RESAMPLING CRITERION

In this section, we first present the motivation for the use of active learning in a resampling criterion and then propose an active resampling criterion to construct statistical ensembles for pattern classification.

A. Motivation

A pattern classification problem can be described as follows. Given a training set of n examples S = {<x_1, C_1>, <x_2, C_2>, ..., <x_n, C_n>}, where x_i is an instance drawn from some space X and C_i ∈ C (C = {1, ..., M}) is the class label associated with x_i, the learning problem for classification is to find, based on the training set S, a classifier that is expected to make maximally correct predictions for any instance x ∈ X.

* He is now with the Department of Computation, UMIST, Manchester M60 1QD, United Kingdom.

There are various kinds of classifiers that do not output the pure 1-of-M representation but instead offer a confidence for each class. Such classifiers can be converted into a probabilistic form by the following transformation:

    ŷ_i(x) = e^{(y_i + 1)/2} / \sum_{j=1}^{M} e^{(y_j + 1)/2},   i = 1, ..., M,    (1)

where y_i is the ith output component of the classifier. From the probabilistic point of view, each ŷ_i can be interpreted as the probability that the input pattern (being tested) belongs to a specific class. For decision-making, the maximum a posteriori (MAP) rule is applied such that

    C* = arg max_{1 ≤ j ≤ M} ŷ_j(x).    (2)

By the MAP rule, traditional statistical ensemble methods, e.g. boosting, divide training patterns into two categories, easy and hard, based on whether a correct class label is assigned to a pattern. Apparently, the MAP decision rule is suitable, and indeed necessary, for testing an unknown pattern. When such a rule is used in the training stage, however, it incurs losses of useful information. Fig. 1 depicts an example of such a problem. Two patterns belonging to the same class (class 5) have both been correctly classified by a classifier. However, the two patterns convey unequal information: in terms of the probabilistic justification, the classifier is more likely to produce the correct label for the pattern shown in Fig. 1(a) than for that shown in Fig. 1(b). In other words, the pattern shown in Fig. 1(b) is more informative, given that it tends to be closer to the decision boundaries [3]. Unfortunately, previous resampling criteria in statistical ensemble methods merely focus on the misclassified patterns and fail to consider the distinction among patterns that have been correctly classified. Motivated by our previous work in active learning, where such information has turned out to be useful for refining decision boundaries [3], we propose a resampling criterion that finds all possible informative patterns for the construction of statistical ensemble classifiers, which is expected to provide a more active data selection procedure for refining the decision boundaries.

B. Active Difficulty Measure

According to Bayesian decision theory [4], the zero-one loss function provides a criterion to obtain the minimum error rate, or equivalently to make the maximally correct prediction. Under the zero-one loss criterion, an ideal classifier always outputs the correct 1-of-M representation, where the ith component corresponds to class C_i, such that for x ∈ C_i,

    d_j(x) = 1 if j = i,  and  d_j(x) = 0 if j ≠ i.    (3)

In other words, the ith element of such an ideal output vector is one while all other elements are zero. Fig. 2 shows an example where a pattern belonging to class 5 is perfectly classified (cf. Fig. 1).

Fig. 1. The outputs of two patterns belonging to the same class.
Fig. 2. The ideal output of a probabilistic classifier for a pattern belonging to class 5.
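As a concrete illustration of the probabilistic conversion in (1) and the MAP rule in (2), the minimal sketch below (Python with NumPy) may be helpful; it is an illustrative sketch rather than the implementation used in this work, and the exact exponent form simply mirrors (1) as written above.

import numpy as np

def to_posteriors(y):
    # Map raw classifier outputs y (length M) to pseudo-posteriors as in (1).
    scores = np.exp((np.asarray(y, dtype=float) + 1.0) / 2.0)
    return scores / scores.sum()

def map_decision(y_hat):
    # MAP rule (2): choose the class with the largest pseudo-posterior.
    # Classes are labelled 1..M, hence the +1 on the zero-based index.
    return int(np.argmax(y_hat)) + 1

# Example: raw outputs of a classifier that is fairly confident about class 5
y = [-0.9, -0.8, -1.0, -0.7, 0.9]
y_hat = to_posteriors(y)
print(y_hat, map_decision(y_hat))   # prints the posterior vector and the label 5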

Obviously, the ideal output vector plays a reference role in the detection of informative patterns. Naturally, we treat the divergence between a practical output vector, e.g. those in Fig. 1, and its corresponding ideal output vector as a difficulty measure that determines how difficult a pattern is to classify correctly. For convenience, we use the Euclidean distance in this paper, such that the divergence for a pattern x is defined as

    div(ŷ(x), d(x)) = \sum_{j=1}^{M} [d_j - ŷ_j]^2.    (4)

Without difficulty, we can prove the following fact: 0 ≤ div(ŷ, d) ≤ 2. The divergence measure unifies two circumstances, i.e., the misclassified case and the confidence of the correctly classified case, in terms of how difficult a pattern is to classify. A misclassified pattern must be treated as the most difficult one, while a pattern that is classified correctly is assigned a probability or confidence to indicate its difficulty in an uncertain way. To do so, we define a probabilistic difficulty measure that carries out the above consideration as follows:

    P_difficulty = 1                  if div(ŷ(x), d(x)) > 1,
                   div(ŷ(x), d(x))    otherwise.    (5)

As a consequence, the above difficulty measure provides an alternative resampling criterion: all the misclassified patterns can always be selected to form new training subsets, while a pattern that is classified correctly also has a chance, depending upon its divergence defined in (4), to be added to those training subsets for the next round of training. In comparison with the existing difficulty measures used in statistical ensemble learning, our measure in (5) is more active in finding informative patterns. Therefore, we name our measure the active difficulty measure. When our measure is inserted into AdaBoost to replace the original one, we accordingly call the modified version active AdaBoost, to distinguish it from the original. In addition, our active resampling criterion does not introduce a higher computational load, given that the divergence computation in (4) is similar to the computation for finding misclassified patterns by the MAP rule in (2).
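For concreteness, the following minimal sketch (Python with NumPy; an illustration, not the implementation used in this work) computes the divergence in (4) and the active difficulty measure in (5) for a batch of patterns, and then turns the per-pattern difficulties into a resampling distribution; the final normalization step is an illustrative assumption rather than a detail given here.

import numpy as np

def divergence(y_hat, target):
    # Squared Euclidean divergence (4) between the posterior vector and the
    # ideal 1-of-M output d(x); its value lies in [0, 2].
    return float(np.sum((np.asarray(target, dtype=float) - np.asarray(y_hat, dtype=float)) ** 2))

def active_difficulty(y_hat, target):
    # Active difficulty measure (5): misclassified patterns (divergence > 1)
    # receive the maximum difficulty 1, correctly classified ones keep their divergence.
    d = divergence(y_hat, target)
    return 1.0 if d > 1.0 else d

def resampling_distribution(posteriors, targets):
    # Normalise per-pattern difficulties into a distribution for drawing the
    # next training subset (assumes at least one pattern has non-zero difficulty).
    p = np.array([active_difficulty(y, t) for y, t in zip(posteriors, targets)])
    return p / p.sum()

With such a distribution, the training subset for the next round can be drawn so that misclassified patterns are always favoured while confidently classified patterns are revisited only occasionally.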
III. SIMULATIONS

In order to evaluate the effectiveness of the proposed method, we have applied active AdaBoost to two real-world pattern classification tasks, text-independent speaker identification and handwritten digit recognition. For comparison, we also apply the original AdaBoost [5] to the same problems. In this section, we first briefly introduce the two benchmark problems. Then we describe the VQ classification system used in our simulations. Finally, we report comparative results.

A. Text-Independent Speaker Identification

Speaker identification is the process of automatically identifying a person's identity from his/her voice. By text-independent, it is meant that such an identification process is carried out regardless of the linguistic content conveyed in the utterances. For the simulations in speaker identification, the 10-session (S01-S10) KING speech corpus is adopted. The database of 51 speakers was collected partly in New Jersey and partly in San Diego, and each session was recorded over both a wide-band (WB) and a narrow-band (NB) channel. There is a significant difference between sessions S01-S05 and S06-S10. Thus, the long temporal span, which results in voice aging, and the two distinct recording channels, which lead to miscellaneous variations, make this a desirable corpus for studying the mismatch problem. In our simulations, all experiments are grouped into two categories in terms of the two sets corresponding to the different channels. Each category furthermore contains two groups of experiments. Thus, we have four groups of experiments in our simulations, denoted WB-1, WB-2, NB-1 and NB-2, respectively. This elaboration is intended to introduce different mismatch conditions into the different groups of experiments, such that WB-2 < WB-1 << NB-2 < NB-1. In other words, there is the least mismatch in WB-2, while the mismatch in NB-1 is the severest. Note that such an elaborate design has been shown to introduce different mismatch conditions [6]. In our simulations, the standard spectral analysis is first performed and then Mel-scaled cepstral feature vectors are extracted for training a classifier.

B. Handwritten Digit Recognition

Handwritten digit recognition is the process of recognizing a handwritten digit from its image. Similar to utterances in speaker recognition, a handwritten digit may take hugely varied forms due to distinct writing styles. Therefore, there is also miscellaneous mismatch between training data and testing data. For the simulations in handwritten digit recognition, we choose the benchmark database MNIST [7], which contains 60,000 examples for training and 10,000 examples for testing. Each digit instance is a two-dimensional binary image whose size in MNIST has been normalized to an image patch of 28x28 pixels without altering its aspect ratio. In order to reduce the dimension and overcome the curse of dimensionality, we first use wavelet techniques to obtain the low-frequency component of the image and then discard the pixels located around the image boundaries, given that those pixels are far less informative for classification. Finally, 12x12 images are obtained for the simulations, where each image is transformed into a 144-dimensional vector. Such a preprocessing procedure is highly consistent with the previous work [7].
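A rough sketch of this preprocessing is given below, assuming a single-level 2-D Haar transform (via PyWavelets) and a one-pixel border crop; the particular wavelet and crop are illustrative assumptions, since the exact choices are not specified here.

import numpy as np
import pywt   # PyWavelets

def preprocess_digit(image_28x28):
    # 28x28 digit image -> 144-dimensional feature vector.
    # Keep only the low-frequency (approximation) component of a 2-D wavelet transform.
    low_freq, _ = pywt.dwt2(np.asarray(image_28x28, dtype=float), 'haar')  # 14x14
    cropped = low_freq[1:13, 1:13]   # discard a one-pixel border -> 12x12
    return cropped.reshape(-1)       # flatten to a 144-dimensional vector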

C. VQ Classification System and Its Ensemble

As a baseline system, the vector quantization (VQ) technique has been selected for pattern classification [8]. The idea underlying a VQ-based classifier is to create a codebook for the data of the same class in the training set. The codebook, consisting of several codewords, encodes the inherent characteristics of that class of data; such a codebook is thus viewed as a model that characterizes the class. The training phase of a VQ-based classification system builds a codebook for every class by means of a clustering algorithm. In the testing phase, a VQ-based classification system works as follows. When an unknown pattern arrives, the distance between its feature vector and all the codewords belonging to the different classes is evaluated in terms of the same similarity measure as defined in the clustering algorithm used for codebook production. A decision is then made by this similarity test, and the pattern is labeled with the class whose codebook has the shortest distance to the pattern. VQ classification techniques have been widely applied in speaker recognition, where one speaker identity is characterized by a VQ codebook [9]. We also apply the VQ technique to the handwritten digit recognition task; similarly, the characteristics of each digit are modeled by a corresponding codebook. A VQ classifier is thus used both as a component classifier in an ensemble and as an individual baseline system for comparison. In our simulations, AdaBoost has been adopted with different reweighting techniques, including ours, to construct an ensemble VQ classifier. This constructive process is repeated until satisfactory performance is achieved on the training set. On the other hand, a combination strategy plays an important role in the integration of the classifiers trained on the generated training sets. In our simulations, we employ the arithmetic averaging rule as our strategy for combining component classifiers, as suggested by our previous empirical studies [6].
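A minimal sketch of such a VQ classification scheme is given below, assuming scikit-learn's KMeans as a stand-in for the LBG clustering step and the Euclidean distance as the similarity measure; it is an illustration rather than the exact system used in the simulations.

import numpy as np
from sklearn.cluster import KMeans

def train_codebooks(features, labels, n_codewords=64):
    # Build one codebook (set of codewords) per class with a clustering step.
    codebooks = {}
    for c in np.unique(labels):
        km = KMeans(n_clusters=n_codewords, n_init=10).fit(features[labels == c])
        codebooks[c] = km.cluster_centers_
    return codebooks

def vq_classify(x, codebooks):
    # Label x with the class whose codebook contains the closest codeword.
    def min_dist(codebook):
        return np.min(np.linalg.norm(codebook - x, axis=1))
    return min(codebooks, key=lambda c: min_dist(codebooks[c]))

In the ensemble, several such component classifiers are trained on the resampled subsets and combined; under the arithmetic averaging rule, their per-class scores are simply averaged before the final MAP decision.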
D. Simulation Results

In our simulations, VQ with the standard LBG algorithm [8] is employed to build the class models, where each VQ codebook consists of 64 codewords. In addition, an ensemble of eleven VQ classifiers always yields satisfactory results on the training sets of the two benchmark databases. Due to limited space, we therefore report only the final generalization performance of the ensemble, although the evolving performance is available as the ensemble grows.

Fig. 3 shows the overall generalization performance for speaker identification produced by the baseline system, the original AdaBoost and our active AdaBoost on the WB and NB testing sets, respectively. It is evident from Fig. 3 that our active resampling criterion performs very well, given that the active AdaBoost system outperforms both the baseline system and the original AdaBoost system in all four experiments on the WB-1, WB-2, NB-1 and NB-2 sets, where different mismatch conditions are designed for testing the generalization performance. Comparing ours further with the other two methods, the error reduction rates in WB-1 and NB-1, where the severer mismatch conditions are involved, are better than those of their counterparts in WB-2 and NB-2, respectively.

Fig. 3. The generalization performance on speaker identification. (a) Results on the WB sets. (b) Results on the NB sets.

Thus, our simulations clearly demonstrate that, along with the use of the information carried in the misclassified patterns, the use of the additional information conveyed in the patterns that are classified correctly further refines the decision boundaries of an ensemble classifier against mismatch. Fig. 4 illustrates the results for handwritten digit recognition for the digits 0 to 9. Similarly, active AdaBoost consistently outperforms the baseline system and the original AdaBoost for all ten digits. It is worth mentioning that for several digits the baseline system achieves error-free training performance at an early stage of the ensemble growing. Due to the use of an error-based resampling criterion, the original AdaBoost then does not grow the ensemble any further. In contrast, our active resampling criterion makes the ensemble grow further by taking advantage of the additional information conveyed in the patterns that are classified correctly, which leads to the effect of refining the decision boundaries, as shown in Fig. 4.

Fig. 4. Comparative results for the handwritten digit recognition problem. (a)-(j) Results corresponding to the ten digits from 0 to 9.

The results also indicate that the idea underlying our method is highly consistent with the use of active data selection for refining the decision boundaries of a strong classifier [3].

IV. CONCLUSION

In this paper, we have presented an alternative resampling criterion for the active selection of informative patterns to construct a statistical ensemble classifier. In comparison with the existing error-based resampling criteria in statistical ensemble learning, our criterion makes better use of the information conveyed in the training patterns. Comparative results on two real-world problems based on AdaBoost, along with others obtained with different statistical ensemble methods (e.g. [10],[11]) and not reported here, demonstrate that for pattern classification our method yields better generalization performance by refining decision boundaries with the additional information conveyed in the training patterns.

REFERENCES

[1] R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee, "Boosting the margin: A new explanation for the effectiveness of voting methods," The Annals of Statistics, vol. 26, pp. 1651-1686, 1998.
[2] Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," Proceedings of the International Conference on Machine Learning, pp. 148-156, 1996.
[3] L. Wang, K. Chen, and H. Chi, "Capture interspeaker information with a neural network for speaker identification," IEEE Transactions on Neural Networks, vol. 13, 2002.
[4] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification (2nd Edition), Wiley-Interscience, 2001.
[5] Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, pp. 119-139, 1997.
[6] D. S. Luo and K. Chen, "A comparative study of statistical ensemble methods on mismatch conditions," Proceedings of the International Joint Conference on Neural Networks, pp. 59-64.
[7] The MNIST database of handwritten digits, URL: http://yann.lecun.com/exdb/mnist/
[8] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Transactions on Communications, vol. 28, pp. 84-95, 1980.
[9] F. K. Soong, A. E. Rosenberg, L. R. Rabiner, and B. H. Juang, "A vector quantization approach to speaker identification," Proceedings of the International Conference on Acoustics, Speech and Signal Processing, 1985.
[10] R. E. Schapire, "The strength of weak learnability," Machine Learning, vol. 5, pp. 197-227, 1990.
[11] C. Y. Ji and S. Ma, "Combinations of weak classifiers," IEEE Transactions on Neural Networks, vol. 8, pp. 32-42, 1997.
