Recognizing Natural Emotions in Speech, Having Two Classes
Niels Visser University of Twente P.O. Box 217, 7500AE Enschede The Netherlands ABSTRACT Emotion recognition is a useful way to filter important information, especially in situations where statistics are kept to track one's performance during work time, such as for workers in a call centre (for example, a call might be recorded when a customer sounds angry, so that the patience of an employee can be tracked). However, little research has been conducted on how emotion recognition methods behave in practice, compared to the work that has been done using data from actors. This paper focuses on the input data (sound samples), requiring them to be natural instead of acted. It shows that training and testing on the same dataset yields high accuracy rates (around 90% on acted data, and between 60% and 70% on natural data). When cross-tested with the acted set as training data and the natural set as testing data, the results barely differ from randomly labelling the samples. However, if the natural set is used as training data and the acted set as testing data, the SVM and AdaBoost classifiers yield very high accuracy rates (78.6% and 74.2% respectively), while the Bayesian classifier only classifies the samples as anger/frustration. 1. INTRODUCTION A lot of research has been conducted on emotion recognition in speech during the past few years. [1, 11, 13] Some of these experiments have already shown accuracy rates of more than 90%. [4, 7, 12] However, the datasets used consist mostly of acted data [1], so the same speech-based emotion recognition methods (a combination of a feature set and a classification algorithm, hereafter referred to as ERM) that have a high accuracy rate on these datasets may achieve much lower accuracy rates in practice.
One reason why we might believe this could be true is that the currently used datasets could contain exaggerated or otherwise unnatural emotions (since the data in these sets is acted), so the speech characteristics of these datasets may differ from non-acted (natural) data.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 15th Twente Student Conference on IT, June 20th, 2011, Enschede, The Netherlands. Copyright 2011, University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science.

One of the reasons why datasets containing acted data are used is that no human bias plays a role in the annotation of the sound samples, because an actor is instructed to play a certain emotion. (Of course, the quality of datasets with acted data then depends on the skills of the actor, rather than on the skills of the annotators.) How we deal with the problem of human bias is explained in section 3 of this document. Another reason to choose acted data over natural data is that it is easier to obtain high-quality samples: the number of speakers, the length of the samples and the intensity of the emotions, for example, can be defined in advance, rather than having to search for suitable samples (which is a very time-consuming process). A few experiments have been conducted on emotion recognition in natural speech, and their accuracies vary widely. Depending on the dataset, accuracies between 37% and 94.75% have been found using MFCC and pitch features, Gaussian mixture models as classifiers and samples from Swedish telephone services and English meetings.
[9] Another study used Teager energy operators and MFCC as a feature set, Gaussian mixture models and neural networks as classifiers and two different datasets: SUSAS and ORI. [4] Remarkably, the two datasets score very differently in this research, even though both contain emotions recorded in non-acted environments. SUSAS (containing three classes: high stress, moderate stress and neutral) scores between 61% and 95% (note that 33% would statistically be expected when the emotion recognition method does not work at all and just randomly classifies the samples). ORI (containing angry, happy, anxious, dysphoric and neutral) yielded a much lower score: the accuracy of the ERM ranged from 37% to 57% on this dataset. This difference could be caused by different levels of arousal between the two datasets. [4] It should be noted that both datasets contain more than two different emotions. In this document, we will use two classes (anger/frustration and other) instead of one class per emotion (other will consist of multiple emotions). An example of an application where this setup could be used is deciding whether or not to record a phone call in a call centre. In this case, only the presence or absence of one set of emotions (anger or frustration) may be relevant. This way, the system can detect upset customers and record how their problem is handled by the employee. Another possible application is computer gaming: for example, the system may reduce the difficulty when the player gets frustrated or angry. (Of course, voice interaction with the program is needed in this case.)
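The call-centre application described above amounts to a simple decision rule on top of a two-class classifier. A minimal sketch, in which `classify_segment` is a hypothetical stand-in for any trained ERM (the threshold and the "arousal" feature are illustrative assumptions, not the paper's method):

```python
# Hypothetical sketch of the call-centre trigger: record a call as soon
# as any segment of it is classified as anger/frustration.

def classify_segment(features):
    # Placeholder decision; a real system would apply a trained model here.
    return "anger/frustration" if features.get("arousal", 0.0) > 0.7 else "other"

def should_record(segments):
    """Return True as soon as any segment is classified as anger/frustration."""
    return any(classify_segment(s) == "anger/frustration" for s in segments)

call = [{"arousal": 0.2}, {"arousal": 0.9}, {"arousal": 0.4}]
print(should_record(call))  # → True: an angry-sounding segment triggers recording
```

The same rule, inverted, would drive the gaming application: lower the difficulty while `should_record`-style detection reports frustration.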
There are well-tested ERMs available that yield high accuracy rates on acted data. [12, 4, 7] Tests on natural data show high accuracy rates in some cases [9, 4] and lower rates in others [4]. These studies use more than two (sets of) emotions, while there are applications where only two (sets of) emotions are needed because of their binary nature (like the call centre application). We may assume that the accuracy rate of a classifier, when offered two sets of emotions, differs from a situation where it is offered each emotion separately, because the classifier might generalize features over different emotions or could get confused by the difference in features between emotions. Therefore, we formulate the following question:
- How do classifiers that yield high accuracy rates on a data set containing multiple emotions perform when having to choose from two sets of emotions?
It is clear that, even when natural data sets are used, there is much difference in accuracy rates between data sets. [9, 4] We can say that the data set is an important factor in the success of an ERM. We might wonder how comparable different datasets with the same (sets of) emotions are, and what the best way is to train a classifier. Therefore, we formulate the following questions:
- What will be the difference in accuracy rate of emotion recognition methods when using a dataset consisting of audio samples recorded in non-acted situations as opposed to a dataset in which all the data is acted, in recognizing anger/frustration and other emotions?
- Is it possible to use features from an acted dataset to build a model that is able to recognize the same (sets of) emotions in a natural environment?
In this paper, a corpus containing 50% anger/frustration and 50% other emotions will be tested with a set of three different ERMs. The exact composition of the corpus can be found in section 3.1 of this paper.
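The two-class setup described above boils down to relabelling every emotion into anger/frustration or other. A minimal sketch (the emotion labels follow the Emo-DB inventory used later in the paper; the mapping function itself is an illustrative assumption):

```python
# Illustrative relabelling of a multi-emotion corpus into the two classes
# used in this paper: "anger/frustration" vs "other".

TARGET = {"anger", "frustration"}

def to_binary(label):
    """Collapse a fine-grained emotion label into one of the two classes."""
    return "anger/frustration" if label in TARGET else "other"

labels = ["anger", "boredom", "happiness", "anger", "sadness"]
print([to_binary(l) for l in labels])
```

Any classifier trained on the relabelled data then faces exactly the binary decision the call centre application needs.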
The emotions classified as other may contain any emotion, but the samples are limited to containing the speech of one person at a time. 2. APPROACH This research focuses mainly on the datasets used for the ERMs, meaning that ERMs that have already been used in emotion recognition will be reused. The main parts of the research are:
- Composing a natural corpus, as there are no natural corpora freely available
- Acquiring an acted corpus for comparison with the natural corpus
- Training on the natural corpus, testing on the natural corpus
- Training on the natural corpus, testing on the acted corpus
- Training on the acted corpus, testing on the natural corpus
- Training on the acted corpus, testing on the acted corpus
(Note that the terms corpus and dataset are used interchangeably in this paper.) This way, we can see how uniform both data sets are (by training and testing on the same data set) and how similar both data sets are (by training on one dataset and testing on the other). We might also be able to draw conclusions on how best to choose a training set (taking into account that it is useful to train on acted data, since acted data is easier to obtain). 3. METHODS 3.1 Natural Dataset Sources Since audio from call centre calls is not freely available, alternative sources that contain similar emotions are used: the two Dutch programmes De rijdende rechter and TROS Radar. De rijdende rechter is a programme in which quarrels are settled in a court-like way, and TROS Radar is a programme in which unsatisfied consumers can criticize the companies concerned. Since there are two opposing parties in both programmes, and the cases dealt with are most often personal, we may expect emotions like anger and frustration to be abundant. Furthermore, the emotions of the participants are not acted; therefore, only samples from the participants of the programmes, excluding the presenters, are used.
These points make these two programmes an excellent source for our sound samples. From these programmes, a total of 176 audio samples is extracted, each containing the voice of one speaker. Before selection, the eligible sound samples range from 1.18s to 15.96s in length, with an average of 5.42s and a standard deviation of 3.13s. The relatively high standard deviation can be explained by the varying content of the samples (sometimes utterances of 2 seconds contain enough information for humans to detect a certain emotion) and by the constraint that only a single voice may be present in a sample (sometimes it is not possible to find samples longer than a few seconds of the same speaker without another voice in them). The criteria on which the samples are selected during the first selection (extraction from the programmes) are:
- Only one speaker at a time
- The presence of words (samples containing only non-verbal sounds are excluded)
- Balance between the two classes (for example: if there are 20 suitable samples containing anger or frustration and 100 suitable samples containing other emotions, not all of these 100 samples will be selected)
99 of these samples are extracted from episodes of De rijdende rechter and 77 from TROS Radar. Annotation To minimize personal influence on the labelling of the sound samples, a group of six independent people was asked to rate each of the samples. Before the list of samples was presented to the annotators, it was shuffled to ensure that not all samples of one programme were played consecutively; otherwise, people might rate the samples relative to the programme they originate from. The annotators were given two options when rating each sample: anger/frustration and other (as can be seen in figure 1). The agreement of each
sample was calculated by dividing the number of votes for the option with the most votes by the total number of annotators (six). For example: 6/6 means that all annotators agreed on one emotion, while 5/6 means that one annotator disagreed. The results are shown in table 1.

Figure 1: Annotation GUI

Table 1: Agreements on the sound samples
Agreement:     6/6   5/6   4/6   3/6
# of samples:

(Please note that 2/6 equals 4/6, 1/6 equals 5/6 and 0/6 equals 6/6 by definition in this context, since we do not split the table by emotion and there are only two options that the test subjects may choose from.) Of these samples, the clearest groups (the groups with the least disagreement) are used to form a dataset of at least 100 samples; in this case, these are the groups 6/6 and 5/6. This minimum is set to ensure that the classifier has some base to train and test on. (The classifier trains/tests in folds of 10%, so that it uses 90% of the samples to train on and 10% to test on. This is repeated 10 times, so that every sample is trained upon 9 times and tested upon 1 time.) The composition of the corpus is as follows:

Table 2: Composition of the corpus
Emotion:         Anger/frustration   Other
Source \ group   6/6   5/6           6/6   5/6
Rechter:
Radar:
Total:

As can be seen above, the corpus consists of 54 samples labelled as anger/frustration and 56 samples labelled as other. The difference between the target ratio anger/frustration : other = 1 : 1 and the actual ratio of the corpus (1 : 0.964) is negligible.

3.2 Acted Dataset As acted dataset, the German Emo-DB is chosen, because of its strongly acted nature (10 speakers, who all speak 10 sentences in 7 different emotions) and its easy accessibility (it is available for download at ). This dataset contains 127 samples containing anger, and about 40 to 70 samples of each other emotion (anxiety/fear, disgust, happiness, boredom, neutral and sadness). To create a balanced corpus containing anger and other, we randomly chose 21 samples from each of the other six emotions and randomly deleted one sample from the samples containing anger, resulting in a balanced corpus of 252 samples.

3.3 Feature Extraction As a feature set, we combine Mel-frequency cepstral coefficients (MFCC), voice quality, intensity (loudness), pitch (F0) and spectral features (energy). MFCC is chosen since it has proven itself a decent feature set when it comes to emotion recognition [9, 8]; the other features have been identified as related to the expression of emotional states. [10] The features are extracted using the open-source program openSMILE, using the configuration by Florian Eyben previously used in the INTERSPEECH 2010 challenge. This configuration is packaged with the openSMILE download, available at opensmile.sourceforge.net. openSMILE is used as the official feature extractor in the INTERSPEECH challenge. [3]

3.4 Classification The classification algorithms used in this paper are:
- SVM (SMO from Weka) [6]
- Naïve Bayes (from Weka) [2]
- AdaBoostM1 [5]
These classifiers have been chosen because of their earlier use in emotion recognition: from [5] we can see that they yield high accuracy rates compared to the rest. The tool Weka is used for running the classifiers, with the default parameters for each classification algorithm as specified in Weka. Before we train and test on the different data sets, all values in both sets are normalized using z-scores, so that the mean of every feature is zero and its standard deviation equals one.
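The z-score normalization described above can be sketched per feature column without any external library (the pitch values below are made up for illustration):

```python
# Minimal z-score normalization: shift a feature column to mean 0 and
# scale it to standard deviation 1, as applied to all features before
# training and testing.

import math

def zscore(column):
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / len(column)
    return [(x - mean) / math.sqrt(var) for x in column]

pitch = [120.0, 180.0, 150.0]  # illustrative F0 values in Hz
print([round(v, 3) for v in zscore(pitch)])  # → [-1.225, 1.225, 0.0]
```

Normalizing both corpora this way matters especially for the cross-corpus scenarios, since the natural and acted recordings come from different recording conditions.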
4. RESULTS We may expect that training and testing on the same dataset yields the highest accuracy rates, because the samples with the same emotions in the same data set are likely most alike, since they originate from the same (type of) source. The results, however, show otherwise:

Table 3: Summary of the results
                        Test set: Natural     Test set: Acted
Training set: Natural   SVM: 71.8%            SVM: 78.6%
                        Bayesian: 70.0%       Bayesian: 50.0%
                        AdaBoost: 61.8%       AdaBoost: 74.2%
Training set: Acted     SVM: 54.5%            SVM: 92.4%
                        Bayesian: 55.5%       Bayesian: 85.7%
                        AdaBoost: 56.3%       AdaBoost: 90.1%

As can be seen in table 3, accuracy rates are indeed high when trained and tested on the same dataset. However, there are two remarkable results: SVM and AdaBoost yield an even higher accuracy when trained on the natural set and tested on the acted set than when trained and tested on the natural set. This is remarkable because the two datasets originate from entirely different sources, and because testing in the opposite direction (training on acted data, testing on natural data) yields values only negligibly higher than chance. Detailed results are available in appendix A. Therefore, we take a closer look at the classifications of these two classifiers. The dataset that is tested on can be split into three different sets of classes: emotions, speakers and spoken sentences. It is possible that one set can easily be recognized by a classifier, while others, as would have been expected based on the results when testing the other way around, are more difficult to recognize. A few classes differ notably from the rest. The results show that boredom, neutral, sadness and anger are best recognized (min. 81.0%, max. 100%) by both classifiers. Worst recognized are anxiety/fear (SVM, 33.3%) and happiness (AdaBoost, 23.8%). Anger is most likely the main cause of the high overall accuracy rate, since 81.7% of the respective samples have been correctly classified.
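Accuracy figures like the per-class ones above can be recomputed from the raw correct:wrong counts reported in appendix B. A small sketch (the counts shown are taken from appendix B; the helper function is ours):

```python
# Per-class accuracy from (correct, wrong) counts, as used in appendix B,
# where "103:23 = 81.7%" means 103 correct and 23 wrong classifications.

def accuracy(correct, wrong):
    return correct / (correct + wrong)

# Illustrative per-emotion counts for one classifier, from appendix B:
per_class = {"boredom": (21, 0), "anxiety/fear": (7, 14), "anger": (103, 23)}
for emotion, (c, w) in per_class.items():
    print(emotion, round(100 * accuracy(c, w), 1), "%")
```

Note that with a balanced two-class test set, an accuracy of 50% (as the Bayesian classifier reaches in the natural-to-acted scenario) is exactly what a classifier achieves by assigning every sample to one class.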
There is only one speaker for whom both classifiers yield an accuracy higher than 80%: 83.3% (SVM) and 87.5% (AdaBoost). Since these values are close to the overall accuracy rates of both classifiers, the results of this speaker are most likely irrelevant. The same applies to the different sentences: there are two sentences that are classified correctly more than 80% of the time. Detailed results can be found in appendix B. 5. CONCLUSIONS AND DISCUSSION First, we look at the results when classifiers are trained and tested on the same dataset. When an acted dataset is used, accuracy rates reach around 90%; when a natural dataset is used, rates vary from 60% to around 70%. This difference may be explained by the fact that an acted dataset is probably more consistent than a natural dataset, meaning that samples annotated as containing the same emotion are very much alike, in contrast to samples in a natural dataset. Looking at the achievements of the different classifiers, no single one performs much better than the rest when each situation is viewed separately. However, SVM performs best in both situations, while the worst-performing classifier is not the same in both situations. We may conclude that there is a slight difference between the achievements of the classifiers, with SVM as the most suitable for recognizing emotions with the current feature set when training and testing on the same dataset. Second, we look at the results when one dataset is used for training and another for testing. The achievements of the classifiers drop radically: there is not much difference between randomly labelling the samples and letting one of the classifiers do the work. Since the conclusions stated above show that both datasets do contain information about the two classes, there are a few possible explanations. Different definitions of the emotions may have been used when annotating the datasets.
This may be a fault, but it can also be an indication that there are multiple definitions of the same emotion (perhaps even cultural differences, since a Dutch and a German dataset are used). If it is the latter, it may be interesting for future research to determine how significant these differences are, and how many different classes of the same emotion exist across different cultures. But there are two remarkable results: SVM and AdaBoost both score very high when training on the natural data set and testing on the acted set. When looking closer at how the different samples are classified by both classifiers, there are no individual features assignable as the cause of these remarkably high accuracy rates. (In this situation we might expect that a few classes or other aspects are rated abnormally high (between 95 and 100 per cent), and that the score of the rest of the classes does not differ much from random labelling (around 55%, as seen when training on the acted set and testing on the natural set).) But this is not the case; instead, most of the classes are rated with about the same accuracy as the overall accuracy of both classifiers. Therefore, we may conclude that in this case, and in similar cases, natural data are good input for building a model that needs to classify acted data. During the introduction, three questions were formulated. The first question was: How do classifiers that yield high accuracy rates on a data set containing multiple emotions perform when having to choose from two sets of emotions? We may say that the accuracy of the classifiers depends heavily on the data set that is being used. Earlier research showed that accuracy rates up to 94.75% can be reached. However, [5] shows that, using the three classifiers SVM, Naïve Bayes and AdaBoostM1, accuracy rates of respectively 71.42%, 66.67% and 71.42% have been reached.
The research uses prosodic and acoustic features and an acted dataset with multiple emotions. From this, we may conclude that when using the two sets anger/frustration and other instead of delight, flow,
confusion and frustration (as used by [5]), much higher accuracy rates can be obtained. The second question is: What will be the difference in accuracy rate of emotion recognition methods when using a dataset consisting of audio samples recorded in non-acted situations as opposed to a dataset in which all the data is acted, in recognizing anger/frustration and other emotions? We may answer this question by noting that there is a significant difference in accuracy rate between data sets with acted data and data sets with natural data: data sets with acted data are much more consistent (meaning that samples labelled with the same emotion share more mutual characteristics). For this reason, the accuracy rate on a set of acted data is higher than that on a natural data set. The third question is: Is it possible to use features from an acted dataset to build a model that is able to recognize the same (sets of) emotions in a natural environment? As stated before in this conclusion, it is likely that there is a difference in the definition of an emotion when annotating different data sets. In this case, the natural set differed from the acted set in such a way that it is not feasible to train on an acted set and use the obtained model in a natural environment. However, the results obtained when training on the natural data set and testing on the acted data set show that it is possible to obtain very high accuracies (even higher than when training and testing on a natural set) when training on a natural set and testing on an acted set. Therefore, we may expect that when acted sources are chosen more specifically to serve a natural goal, it is possible to obtain higher accuracy rates, since the results show that there are mutual characteristics between the acted and natural data sets. 6. FUTURE WORK 6.1 Other sets of classes In this research, anger and frustration are grouped together, as are the rest of the emotions.
The results have shown that very high accuracy rates can be obtained using this binary partition. But of course there are applications that need other classes as input, and it might be that anger and frustration are very distinct from other emotions, which makes it easier for a classifier to distinguish between the two classes. It might be interesting to see if the same applies to other groups of emotions. 6.2 Cultural definition differences Since a Dutch and a German data set are used in this research, it is possible that cultural differences in the definition of emotions play a role in the annotation of both data sets. These differences can play an important role when ERMs are used and sold internationally. Therefore, it might be interesting to look at how significant these differences are, if they are present. This can be done by letting groups of different nationalities annotate the same set of data. 6.3 Use other natural or acted sources The results obtained from training on acted data and testing on natural data versus training on natural data and testing on acted data show remarkable differences. Since the cause of these differences is not yet entirely clear (there appear to be some mutual characteristics), it can be interesting to see whether the same differences appear when using another natural data set with the same acted data set, or the same natural data set with another acted data set. REFERENCES [1] Ayadi, M.E., Kamel, M.S., Karray, F., Survey on speech emotion recognition: Features, classification schemes, and databases, Pattern Recognition 44 (2011) [2] Barra-Chicote, R., Fernandez, F., Lutfi, S.L., Lucas-Cuesta, J.M., Macias-Guarasa, J., Montero, J.M., San-Segundo, R., Pardo, J.
M., Acoustic Emotion Recognition using Dynamic Bayesian Networks and Multi-Space Distributions, 10th Annual Conference of the International Speech Communication Association (2009) [3] Eyben, F., Wöllmer, M., Schuller, B., openSMILE: The Munich Versatile and Fast Open-Source Audio Feature Extractor, Proceedings of the 18th International Conference on Multimedia 2010 (2010) [4] He, L., Lech, M., Maddage, N., Memon, S., Emotion Recognition in Spontaneous Speech within Work and Family Environments, Proceedings of the 3rd International Conference on Bioinformatics and Biomedical Engineering, pp. 1-4 (2009) [5] Hoque, M.E., Yeasin, M., Louwerse, M.M., Robust Recognition of Emotion from Speech, Lecture Notes in Computer Science, Volume 4133/2006 (2006) [6] Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K., Improvements to Platt's SMO Algorithm for SVM Classifier Design, Neural Computation (2001) [7] Lin, Y.L., Wei, G., Speech Emotion Recognition based on HMM and SVM, 2005 International Conference on Machine Learning and Cybernetics, ICMLC (2005) [8] Luengo, I., Navas, E., Hernáez, I., Sánchez, J., Automatic Emotion Recognition using Prosodic Parameters, 9th European Conference on Speech Communication and Technology (2005) [9] Neiberg, D., Elenius, K., Laskowski, K., Emotion Recognition in Spontaneous Speech Using GMMs, INTERSPEECH ICSLP (2006) [10] Nwe, T.L., Foo, S.W., De Silva, L.C., Speech emotion recognition using hidden Markov models, Speech Communication 41 (2003) [11] Sebe, N., Cohen, I., Gevers, T., Huang, T.S., Multimodal Approaches for Emotion Recognition: A Survey, Internet Imaging VI.
Proceedings of the SPIE, Volume 5670 (2004) [12] Wu, S., Falk, T.H., Chan, W.Y., Automatic speech emotion recognition using modulation spectral features, Speech Communication, Volume 53, Issue 5 (2010) [13] Zeng, Z., Pantic, M., Roisman, G.I., Huang, T.S., A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions, IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (2009)
APPENDIX A: OVERVIEW OF THE RESULTS OF ALL TESTED SCENARIOS
The columns represent the classes as classified by the classifiers; the rows represent the classes as annotated by the annotators. In all scenarios, A = anger/frustration and B = other.

Scenario 1: Training: EmoDB, Testing: EmoDB
SVM accuracy: 92.4%
Naïve Bayes accuracy: 85.7%
AdaBoostM1 accuracy: 90.1%

Scenario 2: Training: Natural, Testing: Natural
SVM accuracy: 71.8%
Naïve Bayes accuracy: 70.0%
AdaBoostM1 accuracy: 61.8%

Scenario 3: Training: EmoDB, Testing: Natural
SVM accuracy: 54.5%
Naïve Bayes accuracy: 55.5%
AdaBoostM1 accuracy: 56.3%

Scenario 4: Training: Natural, Testing: EmoDB
SVM accuracy: 78.6%
Naïve Bayes accuracy: 50.0%
AdaBoostM1 accuracy: 74.2%
APPENDIX B: COMPARISON OF SVM AND ADABOOSTM1 WHEN TRAINING ON THE NATURAL SET AND TESTING ON THE ACTED SET
The tables should be read as follows: right classified : wrong classified = percentage of rightly classified samples.

Results by emotion
              SVM (78.6%)      AdaBoostM1 (74.2%)
Anxiety/Fear  7:14 = 33.3%     12:9 = 57.1%
Disgust       18:3 = 85.7%     13:8 = 61.9%
Happiness     9:12 = 42.9%     5:16 = 23.8%
Boredom       21:0 = 100%      17:4 = 81.0%
Neutral       21:0 = 100%      17:4 = 81.0%
Sadness       19:2 = 90.5%     20:1 = 95.2%
Anger         103:23 = 81.7%   103:23 = 81.7%

Results by speaker
              SVM (78.6%)      AdaBoostM1 (74.2%)
03 (male)     24:5 = 82.8%     22:7 = 75.9%
08 (female)   23:2 = 92.0%     16:9 = 64.0%
09 (female)   20:4 = 83.3%     21:3 = 87.5%
10 (male)     9:4 = 69.2%      8:5 = 61.5%
11 (male)     16:8 = 66.7%     17:7 = 70.8%
12 (male)     11:5 = 68.8%     13:3 = 81.3%
13 (female)   26:4 = 86.7%     22:8 = 73.3%
14 (female)   24:7 = 77.4%     23:8 = 74.2%
15 (male)     21:7 = 75.0%     22:6 = 78.6%
16 (female)   24:8 = 75.0%     23:9 = 71.9%

Results by sentence
              SVM (78.6%)      AdaBoostM1 (74.2%)
A01           16:3 = 84.2%     16:3 = 84.2%
A02           26:5 = 83.9%     21:10 = 67.7%
A04           20:4 = 83.3%     21:3 = 87.5%
A05           22:4 = 84.6%     19:7 = 73.1%
A07           17:8 = 68.0%     16:9 = 64.0%
B01           19:6 = 76.0%     19:6 = 76.0%
B02           16:7 = 69.6%     19:4 = 82.6%
B03           21:9 = 70.0%     21:9 = 70.0%
B09           21:4 = 84.0%     16:9 = 64.0%
B10           20:4 = 83.3%     19:5 = 79.2%
More informationAustralian Journal of Basic and Applied Sciences
AENSI Journals Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Journal home page: www.ajbasweb.com Feature Selection Technique Using Principal Component Analysis For Improving Fuzzy C-Mean
More informationA new Dataset of Telephone-Based Human-Human Call-Center Interaction with Emotional Evaluation
A new Dataset of Telephone-Based Human-Human Call-Center Interaction with Emotional Evaluation Ingo Siegert 1, Kerstin Ohnemus 2 1 Cognitive Systems Group, Institute for Information Technology and Communications
More informationLearning From the Past with Experiment Databases
Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University
More informationACOUSTIC EVENT DETECTION IN REAL LIFE RECORDINGS
ACOUSTIC EVENT DETECTION IN REAL LIFE RECORDINGS Annamaria Mesaros 1, Toni Heittola 1, Antti Eronen 2, Tuomas Virtanen 1 1 Department of Signal Processing Tampere University of Technology Korkeakoulunkatu
More informationWord Segmentation of Off-line Handwritten Documents
Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department
More informationIEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH 2009 423 Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition George
More informationSwitchboard Language Model Improvement with Conversational Data from Gigaword
Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword
More informationNetpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models
Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models 1 Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models James B.
More informationIndian Institute of Technology, Kanpur
Indian Institute of Technology, Kanpur Course Project - CS671A POS Tagging of Code Mixed Text Ayushman Sisodiya (12188) {ayushmn@iitk.ac.in} Donthu Vamsi Krishna (15111016) {vamsi@iitk.ac.in} Sandeep Kumar
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationWHEN THERE IS A mismatch between the acoustic
808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,
More informationSpeech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines
Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Amit Juneja and Carol Espy-Wilson Department of Electrical and Computer Engineering University of Maryland,
More informationInternational Journal of Computational Intelligence and Informatics, Vol. 1 : No. 4, January - March 2012
Text-independent Mono and Cross-lingual Speaker Identification with the Constraint of Limited Data Nagaraja B G and H S Jayanna Department of Information Science and Engineering Siddaganga Institute of
More informationA Web Based Annotation Interface Based of Wheel of Emotions. Author: Philip Marsh. Project Supervisor: Irena Spasic. Project Moderator: Matthew Morgan
A Web Based Annotation Interface Based of Wheel of Emotions Author: Philip Marsh Project Supervisor: Irena Spasic Project Moderator: Matthew Morgan Module Number: CM3203 Module Title: One Semester Individual
More informationExperiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling
Experiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling Notebook for PAN at CLEF 2013 Andrés Alfonso Caurcel Díaz 1 and José María Gómez Hidalgo 2 1 Universidad
More informationPhonetic- and Speaker-Discriminant Features for Speaker Recognition. Research Project
Phonetic- and Speaker-Discriminant Features for Speaker Recognition by Lara Stoll Research Project Submitted to the Department of Electrical Engineering and Computer Sciences, University of California
More informationReducing Features to Improve Bug Prediction
Reducing Features to Improve Bug Prediction Shivkumar Shivaji, E. James Whitehead, Jr., Ram Akella University of California Santa Cruz {shiv,ejw,ram}@soe.ucsc.edu Sunghun Kim Hong Kong University of Science
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationTHE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS
THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS ELIZABETH ANNE SOMERS Spring 2011 A thesis submitted in partial
More informationClass-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification
Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Tomi Kinnunen and Ismo Kärkkäinen University of Joensuu, Department of Computer Science, P.O. Box 111, 80101 JOENSUU,
More informationSemi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration
INTERSPEECH 2013 Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration Yan Huang, Dong Yu, Yifan Gong, and Chaojun Liu Microsoft Corporation, One
More informationBUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING
BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING Gábor Gosztolya 1, Tamás Grósz 1, László Tóth 1, David Imseng 2 1 MTA-SZTE Research Group on Artificial
More informationCS 446: Machine Learning
CS 446: Machine Learning Introduction to LBJava: a Learning Based Programming Language Writing classifiers Christos Christodoulopoulos Parisa Kordjamshidi Motivation 2 Motivation You still have not learnt
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationOnline Updating of Word Representations for Part-of-Speech Tagging
Online Updating of Word Representations for Part-of-Speech Tagging Wenpeng Yin LMU Munich wenpeng@cis.lmu.de Tobias Schnabel Cornell University tbs49@cornell.edu Hinrich Schütze LMU Munich inquiries@cislmu.org
More informationDigital Signal Processing: Speaker Recognition Final Report (Complete Version)
Digital Signal Processing: Speaker Recognition Final Report (Complete Version) Xinyu Zhou, Yuxin Wu, and Tiezheng Li Tsinghua University Contents 1 Introduction 1 2 Algorithms 2 2.1 VAD..................................................
More informationWE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT
WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationGenerative models and adversarial training
Day 4 Lecture 1 Generative models and adversarial training Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University What is a generative model?
More informationSpeech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence
INTERSPEECH September,, San Francisco, USA Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence Bidisha Sharma and S. R. Mahadeva Prasanna Department of Electronics
More informationOn-the-Fly Customization of Automated Essay Scoring
Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,
More informationAutomatic Pronunciation Checker
Institut für Technische Informatik und Kommunikationsnetze Eidgenössische Technische Hochschule Zürich Swiss Federal Institute of Technology Zurich Ecole polytechnique fédérale de Zurich Politecnico federale
More informationSoftware Maintenance
1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories
More informationNCEO Technical Report 27
Home About Publications Special Topics Presentations State Policies Accommodations Bibliography Teleconferences Tools Related Sites Interpreting Trends in the Performance of Special Education Students
More informationSpeech Translation for Triage of Emergency Phonecalls in Minority Languages
Speech Translation for Triage of Emergency Phonecalls in Minority Languages Udhyakumar Nallasamy, Alan W Black, Tanja Schultz, Robert Frederking Language Technologies Institute Carnegie Mellon University
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationHow to Judge the Quality of an Objective Classroom Test
How to Judge the Quality of an Objective Classroom Test Technical Bulletin #6 Evaluation and Examination Service The University of Iowa (319) 335-0356 HOW TO JUDGE THE QUALITY OF AN OBJECTIVE CLASSROOM
More informationDesign Of An Automatic Speaker Recognition System Using MFCC, Vector Quantization And LBG Algorithm
Design Of An Automatic Speaker Recognition System Using MFCC, Vector Quantization And LBG Algorithm Prof. Ch.Srinivasa Kumar Prof. and Head of department. Electronics and communication Nalanda Institute
More informationNew Features & Functionality in Q Release Version 3.2 June 2016
in Q Release Version 3.2 June 2016 Contents New Features & Functionality 3 Multiple Applications 3 Class, Student and Staff Banner Applications 3 Attendance 4 Class Attendance 4 Mass Attendance 4 Truancy
More informationEyebrows in French talk-in-interaction
Eyebrows in French talk-in-interaction Aurélie Goujon 1, Roxane Bertrand 1, Marion Tellier 1 1 Aix Marseille Université, CNRS, LPL UMR 7309, 13100, Aix-en-Provence, France Goujon.aurelie@gmail.com Roxane.bertrand@lpl-aix.fr
More informationScienceDirect. A Framework for Clustering Cardiac Patient s Records Using Unsupervised Learning Techniques
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 98 (2016 ) 368 373 The 6th International Conference on Current and Future Trends of Information and Communication Technologies
More informationExpressive speech synthesis: a review
Int J Speech Technol (2013) 16:237 260 DOI 10.1007/s10772-012-9180-2 Expressive speech synthesis: a review D. Govind S.R. Mahadeva Prasanna Received: 31 May 2012 / Accepted: 11 October 2012 / Published
More informationSemi-Supervised Face Detection
Semi-Supervised Face Detection Nicu Sebe, Ira Cohen 2, Thomas S. Huang 3, Theo Gevers Faculty of Science, University of Amsterdam, The Netherlands 2 HP Research Labs, USA 3 Beckman Institute, University
More informationChapters 1-5 Cumulative Assessment AP Statistics November 2008 Gillespie, Block 4
Chapters 1-5 Cumulative Assessment AP Statistics Name: November 2008 Gillespie, Block 4 Part I: Multiple Choice This portion of the test will determine 60% of your overall test grade. Each question is
More informationDOMAIN MISMATCH COMPENSATION FOR SPEAKER RECOGNITION USING A LIBRARY OF WHITENERS. Elliot Singer and Douglas Reynolds
DOMAIN MISMATCH COMPENSATION FOR SPEAKER RECOGNITION USING A LIBRARY OF WHITENERS Elliot Singer and Douglas Reynolds Massachusetts Institute of Technology Lincoln Laboratory {es,dar}@ll.mit.edu ABSTRACT
More informationA New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation
A New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation SLSP-2016 October 11-12 Natalia Tomashenko 1,2,3 natalia.tomashenko@univ-lemans.fr Yuri Khokhlov 3 khokhlov@speechpro.com Yannick
More informationIntroduction to Psychology
Course Title Introduction to Psychology Course Number PSYCH-UA.9001001 SAMPLE SYLLABUS Instructor Contact Information André Weinreich aw111@nyu.edu Course Details Wednesdays, 1:30pm to 4:15pm Location
More informationA GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING
A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING Yong Sun, a * Colin Fidge b and Lin Ma a a CRC for Integrated Engineering Asset Management, School of Engineering Systems, Queensland
More informationMMOG Subscription Business Models: Table of Contents
DFC Intelligence DFC Intelligence Phone 858-780-9680 9320 Carmel Mountain Rd Fax 858-780-9671 Suite C www.dfcint.com San Diego, CA 92129 MMOG Subscription Business Models: Table of Contents November 2007
More informationUniversiteit Leiden ICT in Business
Universiteit Leiden ICT in Business Ranking of Multi-Word Terms Name: Ricardo R.M. Blikman Student-no: s1184164 Internal report number: 2012-11 Date: 07/03/2013 1st supervisor: Prof. Dr. J.N. Kok 2nd supervisor:
More informationUnvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition
Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Hua Zhang, Yun Tang, Wenju Liu and Bo Xu National Laboratory of Pattern Recognition Institute of Automation, Chinese
More informationCase study Norway case 1
Case study Norway case 1 School : B (primary school) Theme: Science microorganisms Dates of lessons: March 26-27 th 2015 Age of students: 10-11 (grade 5) Data sources: Pre- and post-interview with 1 teacher
More informationSTUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH
STUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH Don McAllaster, Larry Gillick, Francesco Scattone, Mike Newman Dragon Systems, Inc. 320 Nevada Street Newton, MA 02160
More informationThe 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X
The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,
More information16.1 Lesson: Putting it into practice - isikhnas
BAB 16 Module: Using QGIS in animal health The purpose of this module is to show how QGIS can be used to assist in animal health scenarios. In order to do this, you will have needed to study, and be familiar
More informationArtificial Neural Networks written examination
1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14
More informationTCH_LRN 531 Frameworks for Research in Mathematics and Science Education (3 Credits)
Frameworks for Research in Mathematics and Science Education (3 Credits) Professor Office Hours Email Class Location Class Meeting Day * This is the preferred method of communication. Richard Lamb Wednesday
More informationThe NICT/ATR speech synthesis system for the Blizzard Challenge 2008
The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 Ranniery Maia 1,2, Jinfu Ni 1,2, Shinsuke Sakai 1,2, Tomoki Toda 1,3, Keiichi Tokuda 1,4 Tohru Shimizu 1,2, Satoshi Nakamura 1,2 1 National
More informationSARDNET: A Self-Organizing Feature Map for Sequences
SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationInstructor: Mario D. Garrett, Ph.D. Phone: Office: Hepner Hall (HH) 100
San Diego State University School of Social Work 610 COMPUTER APPLICATIONS FOR SOCIAL WORK PRACTICE Statistical Package for the Social Sciences Office: Hepner Hall (HH) 100 Instructor: Mario D. Garrett,
More informationA Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language
A Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language Z.HACHKAR 1,3, A. FARCHI 2, B.MOUNIR 1, J. EL ABBADI 3 1 Ecole Supérieure de Technologie, Safi, Morocco. zhachkar2000@yahoo.fr.
More informationSpeaker recognition using universal background model on YOHO database
Aalborg University Master Thesis project Speaker recognition using universal background model on YOHO database Author: Alexandre Majetniak Supervisor: Zheng-Hua Tan May 31, 2011 The Faculties of Engineering,
More informationDetecting English-French Cognates Using Orthographic Edit Distance
Detecting English-French Cognates Using Orthographic Edit Distance Qiongkai Xu 1,2, Albert Chen 1, Chang i 1 1 The Australian National University, College of Engineering and Computer Science 2 National
More informationMemory-based grammatical error correction
Memory-based grammatical error correction Antal van den Bosch Peter Berck Radboud University Nijmegen Tilburg University P.O. Box 9103 P.O. Box 90153 NL-6500 HD Nijmegen, The Netherlands NL-5000 LE Tilburg,
More informationPractice Examination IREB
IREB Examination Requirements Engineering Advanced Level Elicitation and Consolidation Practice Examination Questionnaire: Set_EN_2013_Public_1.2 Syllabus: Version 1.0 Passed Failed Total number of points
More information1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all
Human Communication Science Chandler House, 2 Wakefield Street London WC1N 1PF http://www.hcs.ucl.ac.uk/ ACOUSTICS OF SPEECH INTELLIGIBILITY IN DYSARTHRIA EUROPEAN MASTER S S IN CLINICAL LINGUISTICS UNIVERSITY
More informationAUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS
AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS R.Barco 1, R.Guerrero 2, G.Hylander 2, L.Nielsen 3, M.Partanen 2, S.Patel 4 1 Dpt. Ingeniería de Comunicaciones. Universidad de Málaga.
More informationSpeech Recognition by Indexing and Sequencing
International Journal of Computer Information Systems and Industrial Management Applications. ISSN 215-7988 Volume 4 (212) pp. 358 365 c MIR Labs, www.mirlabs.net/ijcisim/index.html Speech Recognition
More informationMINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES
MINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES THE PRESIDENTS OF THE UNITED STATES Project: Focus on the Presidents of the United States Objective: See how many Presidents of the United States
More informationIdentifying Novice Difficulties in Object Oriented Design
Identifying Novice Difficulties in Object Oriented Design Benjy Thomasson, Mark Ratcliffe, Lynda Thomas University of Wales, Aberystwyth Penglais Hill Aberystwyth, SY23 1BJ +44 (1970) 622424 {mbr, ltt}
More informationOn the Combined Behavior of Autonomous Resource Management Agents
On the Combined Behavior of Autonomous Resource Management Agents Siri Fagernes 1 and Alva L. Couch 2 1 Faculty of Engineering Oslo University College Oslo, Norway siri.fagernes@iu.hio.no 2 Computer Science
More informationCHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY
CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY FALL 2017 COURSE SYLLABUS Course Instructors Kagan Kerman (Theoretical), e-mail: kagan.kerman@utoronto.ca Office hours: Mondays 3-6 pm in EV502 (on the 5th floor
More informationLip reading: Japanese vowel recognition by tracking temporal changes of lip shape
Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Koshi Odagiri 1, and Yoichi Muraoka 1 1 Graduate School of Fundamental/Computer Science and Engineering, Waseda University,
More informationIntroduction to Questionnaire Design
Introduction to Questionnaire Design Why this seminar is necessary! Bad questions are everywhere! Don t let them happen to you! Fall 2012 Seminar Series University of Illinois www.srl.uic.edu The first
More informationUsing EEG to Improve Massive Open Online Courses Feedback Interaction
Using EEG to Improve Massive Open Online Courses Feedback Interaction Haohan Wang, Yiwei Li, Xiaobo Hu, Yucong Yang, Zhu Meng, Kai-min Chang Language Technologies Institute School of Computer Science Carnegie
More informationUsing Web Searches on Important Words to Create Background Sets for LSI Classification
Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract
More informationUtilizing Soft System Methodology to Increase Productivity of Shell Fabrication Sushant Sudheer Takekar 1 Dr. D.N. Raut 2
IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 04, 2014 ISSN (online): 2321-0613 Utilizing Soft System Methodology to Increase Productivity of Shell Fabrication Sushant
More information