Recognizing Natural Emotions in Speech, Having Two Classes


Niels Visser
University of Twente
P.O. Box 217, 7500AE Enschede
The Netherlands

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 15th Twente Student Conference on IT, June 20th, 2011, Enschede, The Netherlands. Copyright 2011, University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science.

ABSTRACT

Emotion recognition is a useful way to filter important information, especially in situations where statistics are kept to track one's performance during work time, such as for workers in a call centre (for example, you might want to record a call when a customer sounds angry, so you can track the patience of an employee). However, little research has been conducted on how emotion recognition methods behave in practice, compared to the work that has been done using data from actors. This paper focuses on the input data (sound samples), requiring them to be natural instead of acted. It shows that training and testing on the same dataset yields high accuracy rates (around 90% on acted data, and between 60% and 70% on natural data). When cross-trained and tested with the acted set as training data and the natural set as testing data, the results hardly differ from randomly labelling the samples. However, if the natural set is used as training data and the acted set as testing data, the SMO and AdaBoostM1 classifiers yield very high accuracy rates (78.6% and 74.2% respectively), while the Naïve Bayes classifier simply classifies all samples as anger/frustration.

1. INTRODUCTION

A lot of research has been conducted on emotion recognition in speech during the past few years [1, 11, 13], and some of these experiments have already shown accuracy rates of more than 90% [4, 7, 12]. However, the datasets used consist mostly of acted data [1], so speech-based emotion recognition methods (a combination of a feature set and a classification algorithm, hereafter referred to as ERM) that achieve a high accuracy rate on these datasets may achieve much lower accuracy rates in practice. A reason to believe this is that the currently used datasets could contain exaggerated or otherwise distorted emotions (for the data in these sets is acted), so the speech characteristics of these datasets may differ from non-acted (natural) data.

One of the reasons why datasets containing acted data are used is that no human bias plays a role in the annotation of the sound samples, because an actor is instructed to play a certain emotion. (Of course, the quality of datasets with acted data then depends on the skills of the actor, rather than on the skills of the annotators.) How we deal with the problem of human bias is explained in section 3 of this document. Another reason to choose acted data over natural data is that it is easier to obtain high-quality samples: you can, for example, define the number of speakers, the length of the samples and the intensity of the emotions yourself, rather than having to search for suitable samples (which is a very time-consuming process).
A few experiments have been conducted on emotion recognition in natural speech, and they vary widely in accuracy. Depending on the dataset, accuracies between 37% and 94.75% have been found using MFCC and pitch features, Gaussian mixture models as classifiers, and samples from Swedish telephone services and English meetings [9], or using Teager energy operators and MFCC as a feature set, Gaussian mixture models and neural networks as classifiers, and two different datasets: SUSAS and ORI [4]. Remarkable in the latter research is that the two datasets score very differently, even though both contain emotions recorded in non-acted environments. SUSAS (containing three classes: high stress, moderate stress and neutral) scores between 61% and 95% (note that 33% would statistically be expected when the emotion recognition method does not work at all and just randomly classifies the samples). ORI (containing angry, happy, anxious, dysphoric and neutral) yielded a much lower score: the accuracy of the ERM ranged from 37% to 57% on this dataset. This difference could be caused by different levels of arousal between the two datasets. [4]

It should be noted that both datasets contain more than two different emotions. In this document, we will use two classes ("anger/frustration" and "other") instead of one class per emotion ("other" will consist of multiple emotions). An example of an application where this setup could be used is deciding whether or not to record a phone call in a call centre. In this case, only the presence or absence of one set of emotions (anger or frustration) may be relevant; this way, the system can detect upset customers and record how their problem is handled by the employee. Another possible application is computer gaming: when the difficulty is too high, the system may reduce it when the player gets frustrated or angry. (Of course, voice interaction with the program is needed in this case.)

There are well-tested ERMs available which yield high accuracy rates on acted data [12, 4, 7]. Tests on natural data show high accuracy rates in some cases [9, 4] and lower rates in others [4]. These studies use more than two (sets of) emotions, while there are applications where only two (sets of) emotions are needed because of their binary nature (like the call centre application). We may assume that the accuracy rate of a classifier, when offered two sets of emotions, differs from a situation where it is offered each emotion separately, because the classifier might generalize features over different emotions, or could get confused by the differences in features between emotions. Therefore, we formulate the following research question:

- How do classifiers that yield high accuracy rates on a dataset containing multiple emotions perform when having to choose from two sets of emotions?

It is clear that, even when a natural dataset is used, there is much difference in accuracy rates between datasets [9, 4]. We can say that the dataset is an important factor in the success of an ERM. We might wonder how comparable different datasets with the same (sets of) emotions are, and what the best way is to train a classifier. Therefore, we formulate two further questions:

- What is the difference in accuracy rate of emotion recognition methods when using a dataset consisting of audio samples recorded in non-acted situations, as opposed to a dataset in which all the data is acted, in recognizing anger/frustration and other emotions?

- Is it possible to use features from an acted dataset to build a model that is able to recognize the same (sets of) emotions in a natural environment?

In this paper, a corpus containing 50% anger/frustration and 50% other emotions will be tested against a set of three different ERMs. The exact composition of the corpus can be found in section 3.1 of this paper. The samples classified as other may contain any emotion, but are limited to the speech of one person at a time.

2. APPROACH

This research focuses mainly on the dataset used for the ERMs, meaning that ERMs which have already been used in emotion recognition will be reused. The main parts of the research are:

- Composing a natural corpus, as there are no natural corpora freely available
- Acquiring an acted corpus for comparison with the natural corpus
- Training on the natural corpus, testing on the natural corpus
- Training on the natural corpus, testing on the acted corpus
- Training on the acted corpus, testing on the natural corpus
- Training on the acted corpus, testing on the acted corpus

(Note that the terms corpus and dataset are used interchangeably in this paper.) This way, we can see how uniform both datasets are (by training and testing on the same dataset) and how similar both datasets are (by training on one dataset and testing on the other). We might also be able to draw conclusions on how best to choose a training set (taking into account that it is useful to be able to train on acted data, since acted data is easier to obtain). A minimal sketch of these four train/test scenarios is given below.
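To make the evaluation protocol concrete, the sketch below runs the four scenarios with scikit-learn stand-ins for the Weka classifiers used in this paper; the randomly generated feature matrices, the feature count of 1582 (the size of the INTERSPEECH 2010 feature set) and the exact classifier implementations are illustrative assumptions, so the printed accuracies are meaningless.

```python
# A sketch of the four train/test scenarios under stated assumptions:
# scikit-learn stands in for Weka, and random numbers stand in for the
# real openSMILE features.
import numpy as np
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fake_corpus(n_samples, n_features=1582):
    """Placeholder for a real feature matrix; 1 = anger/frustration, 0 = other."""
    return rng.normal(size=(n_samples, n_features)), rng.integers(0, 2, n_samples)

datasets = {"natural": fake_corpus(110), "acted": fake_corpus(252)}
classifiers = {
    "SVM (SMO analogue)": SVC(),
    "NaiveBayes": GaussianNB(),
    "AdaBoostM1 analogue": AdaBoostClassifier(),
}

for train_name, (X_tr, y_tr) in datasets.items():
    for test_name, (X_te, y_te) in datasets.items():
        for clf_name, clf in classifiers.items():
            if train_name == test_name:
                # Within-corpus scenario: 10-fold cross-validation (section 3.1).
                acc = cross_val_score(clf, X_tr, y_tr, cv=10).mean()
            else:
                # Cross-corpus scenario: train on one corpus, test on the other.
                acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
            print(f"train={train_name:8s} test={test_name:8s} {clf_name}: {acc:.1%}")
```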
3. METHODS

3.1 Natural Dataset

Sources. Since audio from call centre calls is not freely available, alternative sources that contain similar emotions are used: the two Dutch television programmes De rijdende rechter and TROS Radar. De rijdende rechter is a programme in which quarrels are settled in a court-like way, and TROS Radar is a programme in which unsatisfied consumers are able to criticize the companies concerned. Since there are two opposing parties in both programmes, and the cases dealt with are most often personal, we may expect emotions like anger and frustration to be abundant. Further, the emotions of the participants are not acted; therefore, only samples from the participants of the programmes, excluding the presenters, are used. These points make these two programmes an excellent source for our sound samples.

From these programmes, a total of 176 audio samples is extracted, each containing the voice of one speaker; 99 of these samples come from episodes of De rijdende rechter and 77 from TROS Radar. Before selection, the sound samples found to be eligible range from 1.18 s to 15.96 s, with an average of 5.42 s and a standard deviation of 3.13 s. The relatively high standard deviation can be explained by the differing content of the samples (sometimes utterances of 2 seconds contain enough information for humans to detect a certain emotion) and by the constraint that only a single voice may be present in a sample (sometimes it is not possible to find samples of the same speaker longer than a few seconds without another voice in them). The criteria by which the samples are selected during the first selection (extraction from the programmes) are:

- Only one speaker at a time
- The presence of words (samples containing only non-verbal sounds are excluded)
- Balance between the two classes (for example: if there are 20 suitable samples containing anger or frustration and 100 suitable samples containing other emotions, not all of these 100 samples will be selected)

Annotation. To minimize the personal influence on the labelling of the sound samples, a group of six independent people was asked to rate each of the samples. Before the list of samples was presented to the annotators, it was shuffled to ensure that not all samples of one programme were played consecutively; otherwise, people might rate the samples relative to the programme they originate from. The annotators were given two options when rating each sample: anger/frustration and other (as can be seen in figure 1).

Figure 1: Annotation GUI

The agreement on each sample was calculated by dividing the number of votes for the option with the most votes by the total number of annotators (six). For example: 6/6 means that all annotators agreed on one emotion, while 5/6 means that one annotator disagreed. (A minimal sketch of this computation is shown below.)
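As an illustration, the agreement measure amounts to a majority-vote fraction; the sketch below is a minimal implementation, and the vote encoding is an assumption rather than a description of the annotation tool shown in figure 1.

```python
# Sketch of the agreement measure: the share of the six annotators that voted
# for the majority option.
from collections import Counter

def agreement(votes):
    """votes: one label per annotator, e.g. 'anger/frustration' or 'other'."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)

print(agreement(["anger/frustration"] * 5 + ["other"]))
# -> ('anger/frustration', 0.8333...), i.e. 5/6 agreement
```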
The results are shown in table 1.

Table 1: Agreements on the sound samples

  Agreement:      6/6   5/6   4/6   3/6
  # of samples:   [per-group counts lost in the source text]

(Please note that 2/6 equals 4/6, 1/6 equals 5/6 and 0/6 equals 6/6 by definition in this context, since we do not split the table by emotion and the annotators could only choose from two options.) Of these samples, the clearest groups (the groups with the least disagreement) are used to form a dataset of at least 100 samples; in this case, these are the groups 6/6 and 5/6. This minimum is set to ensure that the classifier has some basis to train and test on. The classifier will train and test in folds of 10% (roughly 10 samples, in this case), so that it uses the remaining roughly 90 samples to train on. This is repeated 10 times, so that every sample is trained upon 9 times and tested upon exactly once. (A sketch of this fold protocol is given below.)
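The fold protocol can be sketched as follows, assuming scikit-learn's StratifiedKFold as a stand-in for Weka's cross-validation; with the final 110-sample corpus described below, each test fold holds 11 samples rather than exactly 10.

```python
# Sketch of the 10-fold protocol: every sample is trained upon 9 times and
# tested upon exactly once.
import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array([1] * 54 + [0] * 56)  # 54 anger/frustration, 56 other
X = np.zeros((len(y), 1))          # placeholder features

times_tested = np.zeros(len(y), dtype=int)
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True).split(X, y):
    times_tested[test_idx] += 1
    # A real run would fit the classifier on train_idx and score it on test_idx.

assert (times_tested == 1).all()   # each sample ends up in exactly one test fold
```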
The composition of the corpus is shown in table 2.

Table 2: Composition of the corpus

  Emotion:         Anger/frustration      Other
  Source \ group   6/6    5/6             6/6    5/6
  Rechter          [per-cell counts lost in the source text]
  Radar
  Total:           54 (over both groups)  56 (over both groups)

As can be seen in table 2, the corpus consists of 54 samples labelled as anger/frustration and 56 samples labelled as other. The difference between the target ratio of anger/frustration : other = 1 : 1 and the actual ratio of the corpus (1 : 0.964) is negligible.

3.2 Acted Dataset

As acted dataset the German Emo-DB is chosen, because of its strongly acted nature (10 speakers, who each speak 10 sentences in 7 different emotions) and its easy accessibility (it is available for download online). This dataset contains 127 samples of anger, and about 40 to 70 samples of each other emotion (anxiety/fear, disgust, happiness, boredom, neutral and sadness). To create a balanced corpus containing anger and other, we randomly chose 21 samples from each of the other six emotions and randomly deleted one sample from the anger samples, so that we obtain a symmetrical corpus of 252 samples (126 anger, 126 other).

3.3 Feature Extraction

As feature set, we combine Mel-frequency cepstral coefficients (MFCC), voice quality, intensity (loudness), pitch (F0) and spectral (energy) features. MFCC is chosen since it has proven itself a decent feature set when it comes to emotion recognition [9, 8]; the other features have been identified as related to the expression of emotional states [10]. The features are extracted using the open-source program openSMILE, using the configuration by Florian Eyben previously used in the INTERSPEECH 2010 challenge. This configuration is packed with the openSMILE download, available at opensmile.sourceforge.net; openSMILE was used as the official feature extractor in that challenge. [3] (A sketch of how such an extraction can be invoked is given below.)
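For illustration, a feature extraction run along these lines could be scripted as below. The SMILExtract binary and the -C/-I/-O options come from the openSMILE documentation, but the exact configuration file name (IS10_paraling.conf) and the paths are assumptions; the paper only states that the INTERSPEECH 2010 configuration shipped with openSMILE was used.

```python
# Hypothetical openSMILE invocation for one directory of sound samples.
import subprocess
from pathlib import Path

CONFIG = "config/IS10_paraling.conf"  # assumed INTERSPEECH 2010 config name

for wav in sorted(Path("samples").glob("*.wav")):
    subprocess.run(
        ["SMILExtract",
         "-C", CONFIG,            # which feature set to extract
         "-I", str(wav),          # input sound sample
         "-O", "features.arff",   # ARFF output (one row per sample), readable by Weka
         "-instname", wav.stem],  # row identifier in the ARFF file
        check=True,
    )
```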
3.4 Classification

The classification algorithms used in this paper are:

- SVM (SMO from Weka) [6]
- Bayesian classifiers (NaiveBayes from Weka) [2]
- AdaBoostM1 (from Weka) [5]

These classifiers have been chosen because of their earlier use in emotion recognition: from [5] we can see that SMO, Naïve Bayes and AdaBoostM1 achieve high accuracy rates compared to the rest, which is the reason these three are chosen. The tool Weka is used for running the classifiers, with the default parameters for each classification algorithm as specified in Weka. Before we train and test on different datasets, all values in both sets are normalized using z-scores, so that the mean of every feature is zero and its standard deviation equals one. (A sketch of this normalization and classification step is given below.)
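A minimal sketch of this step, assuming scikit-learn analogues of the three Weka classifiers with default parameters and random numbers in place of the extracted features:

```python
# Sketch: z-score normalization followed by the three classifiers.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(110, 1582)), rng.integers(0, 2, 110)
X_test, y_test = rng.normal(size=(252, 1582)), rng.integers(0, 2, 252)

# z-score each set on its own, as described above: per feature, zero mean
# and unit standard deviation.
X_train = StandardScaler().fit_transform(X_train)
X_test = StandardScaler().fit_transform(X_test)

for name, clf in [("SVM (SMO analogue)", SVC()),
                  ("NaiveBayes", GaussianNB()),
                  ("AdaBoostM1 analogue", AdaBoostClassifier())]:
    clf.fit(X_train, y_train)
    print(f"{name}: {clf.score(X_test, y_test):.1%}")
```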

4. RESULTS

We may expect that training and testing on the same dataset yields the highest accuracy rates, because it is likely that samples with the same emotion within one dataset are most alike, since they originate from the same (type of) source. The results, however, show otherwise:

Table 3: Summary of the results

                           Test set: Natural        Test set: Acted
  Training set: Natural    SMO: 71.8%               SMO: 78.6%
                           Bayesian: 70.0%          Bayesian: 50.0%
                           AdaBoost: 61.8%          AdaBoost: 74.2%
  Training set: Acted      SMO: 54.5%               SMO: 92.4%
                           Bayesian: 55.5%          Bayesian: 85.7%
                           AdaBoost: 56.3%          AdaBoost: 90.1%

As can be seen in table 3, accuracy rates are indeed high when training and testing on the same dataset. However, there are two remarkable results: SMO and AdaBoostM1 yield an even higher accuracy when trained on the natural set and tested on the acted set than when trained and tested on the natural set. This is remarkable because the two datasets originate from entirely different sources, and because testing in the opposite direction (training on acted data, testing on natural data) yields accuracies only negligibly higher than chance. Detailed results are available in appendix A.

Therefore, we take a closer look at the classifications of these two classifiers. The dataset that is tested on can be split into three different sets of classes: emotions, speakers and spoken sentences. It is possible that one set can easily be recognized by a classifier, while others, as would have been expected based on the results when testing the other way around, are more difficult to recognize.

There are a few classes that differ notably from the rest. The results show that boredom, neutral, sadness and anger are recognized best (min 81.0%, max 100%) by both classifiers. Worst are anxiety/fear (SMO, 33.3%) and happiness (AdaBoostM1, 23.8%). Anger is most likely the main cause of the high overall accuracy rate, since 81.7% of the anger samples have been correctly classified. There is only one speaker for whom both classifiers yield an accuracy higher than 80%, namely 83.3% (SMO) and 87.5% (AdaBoostM1). Since these values are close to the overall accuracy rates of both classifiers, the results of this speaker are most likely irrelevant. The same applies to the different sentences: there are two sentences that are classified correctly more than 80% of the time. Detailed results can be found in appendix B.

5. CONCLUSIONS AND DISCUSSION

First, we look at the results when classifiers are trained and tested on the same dataset. In this case, it shows that when an acted dataset is used, accuracy rates reach around 90%; when a natural dataset is used, rates vary from 60% to around 70%. This difference may be explained by the fact that an acted dataset is probably more constant than a natural dataset, meaning that samples annotated as containing the same emotion are very much alike, in contrast to samples in a natural dataset. Looking at the achievements of the different classifiers, there is not one that performs much better than the rest when the situations are considered separately. However, SMO performs best in both situations, while the classifier that performs worst is not the same in both. We may conclude that there is a slight difference between the achievements of the classifiers, with SMO as the most suitable for recognizing emotions with the current feature set when training and testing on the same dataset.

Second, we look at the results when one dataset is used for training and another for testing.
It shows that the achievements of the classifiers drop radically: there is not much difference between randomly labelling the samples and letting one of the classifiers do the work. Since the conclusions stated above show that both datasets do contain information about the two classes, there are a few possible explanations. Different definitions of the emotions may have been used when annotating the datasets. This may be a fault, but it can also be an indication that there are multiple definitions of the same emotion (maybe even cultural differences, since a Dutch and a German dataset are used). If it is the latter, it may be interesting for future research to determine how significant these differences are, and how many different classes of the same emotion exist when looking at different cultures.

But there are two remarkable results: SMO and AdaBoostM1 both score very high when training on the natural dataset and testing on the acted set. When looking closer at how the different samples are classified by both classifiers, there are no single factors assignable as the cause of these remarkably high accuracy rates. (In this situation we might expect that a few classes or other aspects are rated abnormally high (between 95 and 100 per cent), and that the score of the remaining classes does not differ much from random labelling (around 55%, as seen when training on the acted set and testing on the natural set).) But this is not the case; instead, most of the classes are rated with about the same accuracy as the overall accuracy of both classifiers. Therefore, we may conclude that in this case and similar cases (i.e. with datasets originating from the same types of sources as used here), natural data are good input for building a model that needs to classify acted data.

During the introduction, three questions were formulated. The first question was: How do classifiers that yield high accuracy rates on a dataset containing multiple emotions perform when having to choose from two sets of emotions? We may say that the accuracy of the classifiers depends heavily on the dataset that is being used. Earlier research showed that accuracy rates up to 94.75% can be reached. However, [5] shows that, using the three classifiers SMO, Naïve Bayes and AdaBoostM1, accuracy rates of respectively 71.42%, 66.67% and 71.42% have been reached; that research uses prosodic and acoustic features and an acted dataset with multiple emotions. From this, we may conclude that when using the two sets anger/frustration and other instead of delight, flow, confusion and frustration (as used by [5]), much higher accuracy rates can be obtained.

The second question was: What is the difference in accuracy rate of emotion recognition methods when using a dataset consisting of audio samples recorded in non-acted situations, as opposed to a dataset in which all the data is acted, in recognizing anger/frustration and other emotions? We may answer this question by noting that there is a significant difference in accuracy rate between datasets with acted data and datasets with natural data: datasets with acted data are much more constant (meaning that samples labelled with the same emotion share more mutual characteristics). For this reason, the accuracy rate on a set of acted data is higher than that on a natural dataset.

The third question was: Is it possible to use features from an acted dataset to build a model that is able to recognize the same (sets of) emotions in a natural environment? As stated before in this conclusion, it is likely that there is a difference in the definition of an emotion when annotating different datasets. In this case, the natural set differed from the acted set in such a way that it is not feasible to train on an acted set and use the obtained model in a natural environment. However, from the results obtained when training on the natural dataset and testing on the acted dataset, we can see that it is possible to obtain very high accuracies (even higher than when training and testing on a natural set) when training on a natural set and testing on an acted set. Therefore, we may expect that when acted sources are chosen more specifically to serve a natural goal, it is possible to obtain higher accuracy rates, since the results show that there are mutual characteristics between the acted and natural datasets.

6. FUTURE WORK

6.1 Other sets of classes

In this research, anger and frustration are grouped together, as is the rest of the emotions. The results have shown that with this binary partition, very high accuracy rates can be obtained. But of course there are applications that need other classes as input, and it might be that anger and frustration are very distinct from other emotions, which makes it easier for a classifier to distinguish between the two classes. It might be interesting to see whether the same applies to other groups of emotions.

6.2 Cultural definition differences

Since a Dutch and a German dataset are used in this research, it is possible that cultural differences in the definition of emotions play a role in the annotation of both datasets. These differences can become important when ERMs are used and sold internationally. Therefore, it might be interesting to look at how significant these differences are, if they are present at all. This can be done by letting groups of different nationalities annotate the same set of data.

6.3 Use other natural or acted sources

The results obtained from training on acted data and testing on natural data versus training on natural data and testing on acted data show remarkable differences. Since the cause of these differences is not yet entirely clear (there are some mutual characteristics), it can be interesting to see whether the same differences appear when using another natural dataset with the same acted dataset, or the same natural dataset with another acted dataset.
REFERENCES

[1] Ayadi, M.E., Kamel, M.S., Karray, F., Survey on speech emotion recognition: Features, classification schemes, and databases, Pattern Recognition 44 (2011)
[2] Barra-Chicote, R., Fernandez, F., Lutfi, S.L., Lucas-Cuesta, J.M., Macias-Guarasa, J., Montero, J.M., San-Segundo, R., Pardo, J.M., Acoustic Emotion Recognition using Dynamic Bayesian Networks and Multi-Space Distributions, 10th Annual Conference of the International Speech Communication Association (2009)
[3] Eyben, F., Wöllmer, M., Schuller, B., openSMILE: The Munich Versatile and Fast Open-Source Audio Feature Extractor, Proceedings of the 18th International Conference on Multimedia 2010 (2010)
[4] He, L., Lech, M., Maddage, N., Memon, S., Emotion Recognition in Spontaneous Speech within Work and Family Environments, Proceedings of the 3rd International Conference on Bioinformatics and Biomedical Engineering, pp. 1-4 (2009)
[5] Hoque, M.E., Yeasin, M., Louwerse, M.M., Robust Recognition of Emotion from Speech, Lecture Notes in Computer Science, Volume 4133/2006 (2006)
[6] Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K., Improvements to Platt's SMO Algorithm for SVM Classifier Design, Neural Computation (2001)
[7] Lin, Y.L., Wei, G., Speech Emotion Recognition based on HMM and SVM, 2005 International Conference on Machine Learning and Cybernetics (ICMLC) (2005)
[8] Luengo, I., Navas, E., Hernáez, I., Sánchez, J., Automatic Emotion Recognition using Prosodic Parameters, 9th European Conference on Speech Communication and Technology (2005)
[9] Neiberg, D., Elenius, K., Laskowski, K., Emotion Recognition in Spontaneous Speech Using GMMs, INTERSPEECH 2006 - ICSLP (2006)
[10] Nwe, T.L., Foo, S.W., De Silva, L.C., Speech emotion recognition using hidden Markov models, Speech Communication 41 (2003)
[11] Sebe, N., Cohen, I., Gevers, T., Huang, T.S., Multimodal Approaches for Emotion Recognition: A Survey, Internet Imaging VI, Proceedings of the SPIE, Volume 5670 (2004)
[12] Wu, S., Falk, T.H., Chan, W.Y., Automatic speech emotion recognition using modulation spectral features, Speech Communication, Volume 53, Issue 5 (2010)
[13] Zeng, Z., Pantic, M., Roisman, G.I., Huang, T.S., A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions, IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (2009)

APPENDIX A: OVERVIEW OF THE RESULTS OF ALL TESTED SCENARIOS

(For each scenario, the original appendix gives per-classifier confusion matrices over the classes A = anger/frustration and B = other, with the columns representing the classes as assigned by the classifier and the rows the classes as annotated; only the accuracies survive in this copy.)

Scenario 1 (Training: EmoDB, Testing: EmoDB):   SMO 92.4%, NaiveBayes 85.7%, AdaBoostM1 90.1%
Scenario 2 (Training: Natural, Testing: Natural): SMO 71.8%, NaiveBayes 70.0%, AdaBoostM1 61.8%
Scenario 3 (Training: EmoDB, Testing: Natural):  SMO 54.5%, NaiveBayes 55.5%, AdaBoostM1 56.3%
Scenario 4 (Training: Natural, Testing: EmoDB):  SMO 78.6%, NaiveBayes 50.0%, AdaBoostM1 74.2%

APPENDIX B: COMPARISON OF SMO AND ADABOOSTM1 WHEN TRAINING ON NATURAL SET AND TESTING ON ACTED SET

The tables should be read as follows: rightly classified : wrongly classified = percentage of rightly classified samples.

Results by emotion

                 SMO (78.6%)      AdaBoostM1 (74.2%)
  Anxiety/Fear   7:14 = 33.3%     12:9 = 57.1%
  Disgust        18:3 = 85.7%     13:8 = 61.9%
  Happiness      9:12 = 42.9%     5:16 = 23.8%
  Boredom        21:0 = 100%      17:4 = 81.0%
  Neutral        21:0 = 100%      17:4 = 81.0%
  Sadness        19:2 = 90.5%     20:1 = 95.2%
  Anger          103:23 = 81.7%   103:23 = 81.7%

Results by speaker

                 SMO (78.6%)      AdaBoostM1 (74.2%)
  03 (male)      24:5 = 82.8%     22:7 = 75.9%
  08 (female)    23:2 = 92.0%     16:9 = 64.0%
  09 (female)    20:4 = 83.3%     21:3 = 87.5%
  10 (male)      9:4 = 69.2%      8:5 = 61.5%
  11 (male)      16:8 = 66.7%     17:7 = 70.8%
  12 (male)      11:5 = 68.8%     13:3 = 81.3%
  13 (female)    26:4 = 86.7%     22:8 = 73.3%
  14 (female)    24:7 = 77.4%     23:8 = 74.2%
  15 (male)      21:7 = 75.0%     22:6 = 78.6%
  16 (female)    24:8 = 75.0%     23:9 = 71.9%

Results by sentence

                 SMO (78.6%)      AdaBoostM1 (74.2%)
  A01            16:3 = 84.2%     16:3 = 84.2%
  A02            26:5 = 83.9%     21:10 = 67.7%
  A04            20:4 = 83.3%     21:3 = 87.5%
  A05            22:4 = 84.6%     19:7 = 73.1%
  A07            17:8 = 68.0%     16:9 = 64.0%
  B01            19:6 = 76.0%     19:6 = 76.0%
  B02            16:7 = 69.6%     19:4 = 82.6%
  B03            21:9 = 70.0%     21:9 = 70.0%
  B09            21:4 = 84.0%     16:9 = 64.0%
  B10            20:4 = 83.3%     19:5 = 79.2%


More information

AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS

AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS R.Barco 1, R.Guerrero 2, G.Hylander 2, L.Nielsen 3, M.Partanen 2, S.Patel 4 1 Dpt. Ingeniería de Comunicaciones. Universidad de Málaga.

More information

Speech Recognition by Indexing and Sequencing

Speech Recognition by Indexing and Sequencing International Journal of Computer Information Systems and Industrial Management Applications. ISSN 215-7988 Volume 4 (212) pp. 358 365 c MIR Labs, www.mirlabs.net/ijcisim/index.html Speech Recognition

More information

MINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES

MINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES MINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES THE PRESIDENTS OF THE UNITED STATES Project: Focus on the Presidents of the United States Objective: See how many Presidents of the United States

More information

Identifying Novice Difficulties in Object Oriented Design

Identifying Novice Difficulties in Object Oriented Design Identifying Novice Difficulties in Object Oriented Design Benjy Thomasson, Mark Ratcliffe, Lynda Thomas University of Wales, Aberystwyth Penglais Hill Aberystwyth, SY23 1BJ +44 (1970) 622424 {mbr, ltt}

More information

On the Combined Behavior of Autonomous Resource Management Agents

On the Combined Behavior of Autonomous Resource Management Agents On the Combined Behavior of Autonomous Resource Management Agents Siri Fagernes 1 and Alva L. Couch 2 1 Faculty of Engineering Oslo University College Oslo, Norway siri.fagernes@iu.hio.no 2 Computer Science

More information

CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY

CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY FALL 2017 COURSE SYLLABUS Course Instructors Kagan Kerman (Theoretical), e-mail: kagan.kerman@utoronto.ca Office hours: Mondays 3-6 pm in EV502 (on the 5th floor

More information

Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape

Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Koshi Odagiri 1, and Yoichi Muraoka 1 1 Graduate School of Fundamental/Computer Science and Engineering, Waseda University,

More information

Introduction to Questionnaire Design

Introduction to Questionnaire Design Introduction to Questionnaire Design Why this seminar is necessary! Bad questions are everywhere! Don t let them happen to you! Fall 2012 Seminar Series University of Illinois www.srl.uic.edu The first

More information

Using EEG to Improve Massive Open Online Courses Feedback Interaction

Using EEG to Improve Massive Open Online Courses Feedback Interaction Using EEG to Improve Massive Open Online Courses Feedback Interaction Haohan Wang, Yiwei Li, Xiaobo Hu, Yucong Yang, Zhu Meng, Kai-min Chang Language Technologies Institute School of Computer Science Carnegie

More information

Using Web Searches on Important Words to Create Background Sets for LSI Classification

Using Web Searches on Important Words to Create Background Sets for LSI Classification Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract

More information

Utilizing Soft System Methodology to Increase Productivity of Shell Fabrication Sushant Sudheer Takekar 1 Dr. D.N. Raut 2

Utilizing Soft System Methodology to Increase Productivity of Shell Fabrication Sushant Sudheer Takekar 1 Dr. D.N. Raut 2 IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 04, 2014 ISSN (online): 2321-0613 Utilizing Soft System Methodology to Increase Productivity of Shell Fabrication Sushant

More information