Agreement and Disagreement Utterance Detection in Conversational Speech by Extracting and Integrating Local Features

INTERSPEECH 2015

Agreement and Disagreement Utterance Detection in Conversational Speech by Extracting and Integrating Local Features

Atsushi Ando 1, Taichi Asami 1, Manabu Okamoto 1, Hirokazu Masataki 1, Sumitaka Sakauchi 1
1 NTT Media Intelligence Laboratories, NTT Corporation, Japan
{ando.atsushi,asami.taichi,okamoto.manabu,masataki.hirokazu,sakauchi.sumitaka}@lab.ntt.co.jp

Abstract

This paper presents a novel framework to automatically detect agreement and disagreement utterances in natural conversation. Such a function is critical for conversation understanding tasks such as meeting summarization. One of the difficulties of agreement and disagreement utterance detection in natural conversation is ambiguity in the utterance unit. Utterances are usually segmented by short pauses. However, in conversations, multiple sentences are often uttered in one breath. Such utterances exhibit the characteristics of agreement and disagreement only in some parts, not in the whole utterance. This makes conventional methods problematic since they assume each utterance is just one sentence and extract global features from the whole utterance. To deal with this problem, we propose a detection framework that utilizes only local prosodic/lexical features. The local features are extracted from short windows that cover just a few words. Posteriors of agreement, disagreement and others are estimated window-by-window and integrated to yield a final decision. Experiments on free discussion speech show that the proposed method, through its use of local features, offers significantly higher accuracy in detecting agreement and disagreement utterances.

Index Terms: agreement and disagreement utterance detection, paralinguistics, conversational speech, local features

1. Introduction

One of the important applications of automatic speech recognition is to extract the structure of a conversation. The results should help the participants to recall what they discussed. However, conversations contain many extraneous utterances and it is inefficient to show each and every utterance. Hence the automatic detection of decision-making utterances such as agreement and disagreement is important. It is also useful for conversation understanding tasks like automatic meeting summarization [1].

In this paper, our purpose is the automatic detection of agreement and disagreement utterances in natural conversation. Detection of agreement and disagreement utterances is regarded as the task of dividing continuous speech into utterances and classifying each into one of three classes: agreement, disagreement and others. Several conventional methods have been proposed to support this task. Hillard et al. [2] proposed a method of dividing speech into utterances based on short pauses and of classifying utterances using lexical and prosodic features of each utterance. Galley et al. [3] introduced adjacency pairs to this method to consider inter-utterance relations. Hahn et al. [4] used a contrast classifier to deal with the label imbalance problem. Wang et al. [5, 6] used CRFs to model sentence-to-sentence context and Bousmalis et al. [7] utilized social attitudes such as head actions and body postures.

One difficulty of this task is the ambiguity posed by the utterance units. Conventional methods are usually based on utterances segmented by pause length. However, participants often speak continuously and multiple sentences are uttered in one breath. Such utterances exhibit agreement or disagreement in parts of the utterance, not the entire utterance.
This renders conventional methods questionable since they assume each utterance has just one sentence and that the characteristics of agreement and disagreement appear throughout the whole utterance. Accordingly, they use global features extracted from the whole utterance.

To solve this problem, Germesin et al. [8] proposed an approach that attempts to improve utterance segmentation. It identifies criteria that can split speech into utterances that consist of just one sentence per utterance. They used the result of dialog label estimation based on lexical criteria to split utterances. One good point of this approach is that it allows us to use conventional methods in classifying each utterance. However, identifying the ideal criteria is problematic because conversational speech often does not follow strict grammar rules, and speech recognition errors directly trigger utterance splitting errors.

Hence we take another approach to this problem. We capture the local characteristics of an utterance and utilize them to detect agreement and disagreement utterances. This approach is also attractive because conventional pause-based speech splitting methods can still be used. However, existing studies do not confirm whether agreement and disagreement can be detected from local features or how to integrate local characteristics into utterance-level results.

In this paper, we propose a new agreement and disagreement utterance detection framework based on local prosodic/lexical features. Posteriors of agreement, disagreement and others are estimated for each short window that covers several words. The method proposed herein employs prosodic features, lexical features, and combinations of both as local features to calculate posteriors. The final decision of each utterance is obtained by integrating these posteriors. Experiments on free discussion speech show that the proposed method significantly improves the detection accuracy of agreement and disagreement utterances. These results indicate that local changes in both prosody and lexicon are effective for detecting agreement and disagreement utterances.

2. Dataset

In this section, we describe the dataset used in this research and the ratio of utterances that exhibit agreement and disagreement in only some parts of each utterance.

We plan to apply our research to Japanese conversation, but there is no Japanese conversational speech dataset labeled with agreement, disagreement and others. Thus we newly collected Japanese conversational speech and manually labeled the utterances.

[Figure 1: Frequency distribution of the agreement and disagreement utterances whose interval labels cover only some part of the utterance.]

Simulated conversations were held and we recorded the speech uttered by the participants. An overview of the simulated conversations is given below. Each conversation had two or four participants. A subject for discussion was given and participants selected their positions, either approval or disapproval, to ensure a balance in positions. All the participants argued their position alternately. After that, a 10-minute discussion and a 5-minute conclusion followed. We recorded the speech made in the discussion and conclusion parts. The participants were four males and four females. There was no interference while recording speech because the conversations were conducted using a video conference system and each participant occupied a different soundproof booth. Slightly over twenty hours of conversational speech were recorded. After speech segmentation, we were left with 2987 utterances occupying 7.2 hours. We define an utterance as a period of speech that has no pauses of greater than 0.5 seconds, which is the same as a spurt in [2].

Two types of labels were given to the utterances: utterance labels and interval labels. An utterance label was given to each utterance. An interval label was given to each interval in which labelers perceived agreement or disagreement characteristics. Both labels have three classes: agreement, disagreement and others. Utterances and intervals labeled as neither agreement nor disagreement were regarded as labeled others. However, utterances with only a single word were regarded as backchannels and were not labeled. There were three labelers, none of whom participated in any conversation. We call the common utterance labels and interval labels assigned by two or more labelers majority utterance labels and majority interval labels.

We investigated the characteristics of the labels in this dataset as follows. First, we used the Kappa coefficient [9] to measure the consistency of labels between labelers. Average Kappa coefficients between labelers were 0.47 for the utterance labels and 0.48 for the interval labels. On the other hand, those between the majority labels and the three labelers were 0.71 for utterance labels and 0.64 for interval labels. Using the majority labels as correct labels therefore seems more valid than using the labels of any of the three labelers in isolation. Hence majority labels are used hereafter. Second, we examined the utterance rates of agreement, disagreement, backchannel, and others. The rates were 9%, 7%, 26%, and 58%, respectively. These appearance rates are similar to those of the datasets used in previous works [2-4], which indicates that this dataset has the same characteristics and thus is reliable. Finally, we determined the rate of utterances exhibiting agreement and disagreement in only some parts. Figure 1 shows the frequency distribution of the agreement and disagreement utterances whose interval labels cover only some part of the utterance. The horizontal axis plots the rate of interval label length, calculated as the sum of the interval label lengths divided by the whole utterance length in each agreement and disagreement utterance. For example, the rightmost value, 100%, means that the interval label covers the whole utterance.

Figure 1 shows that in less than half of the agreement and disagreement utterances the interval labels cover the whole utterance, and that in over 40% of agreement and over 30% of disagreement utterances the interval labels occupy less than half of the whole utterance length. These results demonstrate that it is important to deal with utterances that exhibit agreement or disagreement in only some parts of the utterance.
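As a concrete illustration of how these dataset statistics can be computed, here is a minimal Python sketch. The paper does not specify a toolkit; scikit-learn's cohen_kappa_score is assumed as one possible Kappa implementation, and all names and values are illustrative. It derives majority labels from two-or-more-labeler agreement, computes pairwise Kappa between labelers, and computes the Figure 1 coverage rate, i.e. the summed interval-label length divided by the utterance length.

```python
from collections import Counter
from sklearn.metrics import cohen_kappa_score  # assumed toolkit; the paper does not specify one


def majority_label(labels_from_three_labelers):
    """Label assigned by two or more labelers, or None when all three disagree."""
    label, count = Counter(labels_from_three_labelers).most_common(1)[0]
    return label if count >= 2 else None


def interval_coverage_rate(utterance_start, utterance_end, interval_labels):
    """Figure 1 statistic: summed interval-label length over the whole utterance length."""
    covered = sum(end - start for start, end in interval_labels)
    return covered / (utterance_end - utterance_start)


# Toy example: utterance labels from three labelers for four utterances.
labeler_a = ["agreement", "others", "disagreement", "others"]
labeler_b = ["agreement", "others", "others", "others"]
labeler_c = ["agreement", "disagreement", "disagreement", "others"]
print(cohen_kappa_score(labeler_a, labeler_c))                            # pairwise labeler consistency
print([majority_label(t) for t in zip(labeler_a, labeler_b, labeler_c)])  # majority labels
print(interval_coverage_rate(10.0, 14.0, [(10.2, 11.0), (12.5, 13.2)]))   # -> 0.375
```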
3. Proposed method

We propose a new agreement and disagreement utterance detection method based on local characteristics. Our method consists of two steps: a local class estimation step and an utterance class estimation step. In the first step, we estimate a class in each local window by using local features. In the second step, we integrate all the results of the first step in an utterance to estimate an utterance class. Figure 2 shows the overview of the proposed method.

[Figure 2: Overview of the proposed method.]

To use local features effectively, it is important to set the local window length appropriately so as to estimate short-term agreement and disagreement. Long windows risk containing two or more classes in one interval, which decreases the accuracy of agreement and disagreement estimation. Note that even humans need a certain length of speech to judge agreement and disagreement; very short windows are inappropriate. Taking this discussion into consideration, the proposed method uses intervals of several words as local windows.

3.1. Overview

Detecting agreement and disagreement utterances is the task of estimating an utterance class L for each utterance S:

\hat{L} = \arg\max_{L} P(L \mid S)    (1)

L̂ is the estimated class and there are three classes: agreement, disagreement, and others. We assume that agreement and disagreement appear in several continuous word intervals.

We represent the subintervals of the utterance S as s_1, ..., s_K. These correspond to the intervals of the words w_1, ..., w_K included in the utterance, where K is the total number of words in the utterance. In the local class estimation step, local features f_k are extracted from a short window covering {s_{k-N}, ..., s_k, ..., s_{k+N}} and utilized to obtain the estimated local class l̂_k and the local class posteriors:

\hat{l}_k = \arg\max_{l} P(l \mid f_k)    (2)

N is a parameter that controls the local window length. The set of local classes is the same as the set of utterance classes. The posteriors on the right side of Eq. (2) are trained from local features and correct local classes. Correct local classes are made from the interval labels: the class that is dominant in each word interval is regarded as the correct local class. An example of making correct local classes is shown in Figure 3.

[Figure 3: An example of making correct local classes from manually annotated interval labels and words.]

In the utterance class estimation step, we obtain the estimated utterance class L̂ by integrating all the estimated local classes and the local class posteriors. We represent the local class posteriors obtained in the k-th short window as p_k, which includes the posteriors of agreement, disagreement and others. All the estimated local classes and local class posteriors are represented as l̂ = {l̂_1, ..., l̂_K} and P = {p_1, ..., p_K}.

\hat{L} = \arg\max_{L} P(L \mid \Phi(\hat{l}, P))    (3)

Φ(l̂, P) means taking the following statistics of these values: the total number of local windows, the occurrences of each class, and the mean and standard deviation of the posteriors of each class. The posteriors in Eq. (3) are trained from manually annotated utterance labels and local class estimation results.
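As a concrete illustration of this two-step procedure, the minimal Python sketch below (not the authors' implementation) builds the local windows, obtains per-window posteriors from an assumed local classifier with a scikit-learn-style predict_proba interface, and forms the Φ statistics for an assumed utterance-level classifier. extract_features, local_clf and utterance_clf are placeholders to be supplied, and the utterance classifier is assumed to use integer-encoded classes.

```python
import numpy as np

CLASSES = ["agreement", "disagreement", "others"]


def local_windows(word_intervals, n):
    """Window {s_{k-N}, ..., s_k, ..., s_{k+N}} around every word interval s_k
    (truncated at the utterance boundaries)."""
    windows = []
    for k in range(len(word_intervals)):
        lo, hi = max(0, k - n), min(len(word_intervals), k + n + 1)
        windows.append(word_intervals[lo:hi])
    return windows


def integration_features(local_posteriors):
    """Phi(l_hat, P): number of local windows, per-class counts of the locally
    estimated classes, and mean/std of the posteriors of each class."""
    p = np.asarray(local_posteriors)                # shape (K, 3)
    local_classes = p.argmax(axis=1)                # Eq. (2): l_hat_k = argmax_l P(l | f_k)
    counts = [int(np.sum(local_classes == c)) for c in range(len(CLASSES))]
    return np.concatenate([[len(p)], counts, p.mean(axis=0), p.std(axis=0)])


def detect_utterance_class(word_intervals, extract_features, local_clf, utterance_clf, n=3):
    """Two-step decision: local posteriors per window, then Eq. (3) on the Phi statistics."""
    feats = np.asarray([extract_features(w) for w in local_windows(word_intervals, n)])
    posteriors = local_clf.predict_proba(feats)     # P(l | f_k) for every window
    phi = integration_features(posteriors).reshape(1, -1)
    return CLASSES[int(utterance_clf.predict(phi)[0])]
```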
3.2. Features

We use both prosodic and lexical features as local features for the detection of agreement and disagreement.

Prosodic features are obtained by concatenating prosodic statistics calculated in each word interval. The reason we use them is that they can express prosodic characteristics in greater detail than features comprised of prosodic statistics calculated over a combination of several intervals. The amount of training data available for local label estimation is larger than that available for utterance label estimation, which enables us to use such detailed prosodic features. The prosodic statistics calculated from each word interval are shown in Table 1. These statistics are used in analyses of emphasized speech [10, 11], which are related to our research. We use not only word-unit statistics but also phoneme-unit statistics because it has been shown that agreement and disagreement are delineated by changes at the phoneme level in human-machine voice interaction [12], and those changes are also likely to be present in human-to-human interaction such as conversations. F0 statistics are extracted from vowel intervals only, and F0 values are normalized for each speaker and each conversation in order to normalize across speakers.

Table 1: Statistics of prosodic values calculated by word interval.

Type      | Unit          | Feature
F0        | word          | mean, std, min, max, slope, range of F0 over the word interval
F0        | first phoneme | mean, std, min, max, slope, range of F0 over the first phoneme interval of the word
F0        | last phoneme  | mean, std, min, max, slope, range of F0 over the last phoneme interval of the word
Intensity | word          | mean, std, min, max, slope, range of intensity over the word interval
Intensity | first phoneme | mean, std, min, max, slope, range of intensity over the first phoneme interval of the word
Intensity | last phoneme  | mean, std, min, max, slope, range of intensity over the last phoneme interval of the word
Duration  | word          | duration, speech rate of the word
Duration  | first phoneme | duration of the first phoneme of the word
Duration  | last phoneme  | duration of the last phoneme of the word
Pause     | word          | pause between the word and the previous word

Lexical features used in this method are similar to those in [2]. They consist of the number of agreement and disagreement keywords, and the perplexities and posteriors of 2-gram LMs trained on each of the three label classes. Agreement and disagreement keywords are those that appear more than five times and whose frequency of being assigned to the agreement or disagreement class, divided by the frequency of all appearances, is greater than 0.6. These features are calculated from the word sequence in the local feature window. The same training set used to train the estimators in Eq. (2) and Eq. (3) is employed to obtain the keywords and 2-gram LMs.
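To make the prosodic side of Table 1 concrete, the following minimal sketch (Python with NumPy assumed; all names are illustrative) computes the per-interval statistics for one word, with the slope taken from a least-squares line fit over the frame-level values. Frame-level F0/intensity values for the word and its first/last phonemes, as well as the duration and pause values, are assumed to be given, e.g. from a forced alignment and an openSMILE-style extractor.

```python
import numpy as np


def prosodic_stats(frame_values):
    """mean, std, min, max, slope, range over one word (or phoneme) interval."""
    v = np.asarray(frame_values, dtype=float)
    slope = np.polyfit(np.arange(len(v)), v, deg=1)[0] if len(v) > 1 else 0.0
    return [v.mean(), v.std(), v.min(), v.max(), slope, v.max() - v.min()]


def word_prosodic_features(f0, f0_first, f0_last,
                           intensity, intensity_first, intensity_last,
                           duration, speech_rate, first_phone_dur, last_phone_dur, pause):
    """Concatenate the Table 1 statistics for one word: F0 and intensity statistics over
    the word, its first phoneme and its last phoneme, plus the duration and pause values."""
    feats = []
    for values in (f0, f0_first, f0_last, intensity, intensity_first, intensity_last):
        feats.extend(prosodic_stats(values))
    feats.extend([duration, speech_rate, first_phone_dur, last_phone_dur, pause])
    return np.asarray(feats)


# Toy example: frame-level values for one word; the first/last phonemes take the
# first/last three frames of the word.
f0 = [180, 184, 188, 190, 189, 187, 183, 180]
intensity = [60, 62, 63, 64, 63, 62, 60, 59]
features = word_prosodic_features(f0, f0[:3], f0[-3:], intensity, intensity[:3], intensity[-3:],
                                  duration=0.32, speech_rate=4.0,
                                  first_phone_dur=0.12, last_phone_dur=0.12, pause=0.15)
print(features.shape)   # 6 intervals x 6 statistics + 5 duration/pause values = 41 dimensions
```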

4. Experiments

To evaluate the proposed method, we conducted experiments on detecting agreement and disagreement utterances in conversations. 10-fold cross validation was used in the experiments. Of the local class labels present in the dataset, 1323 (3.6%) were agreement and 2918 (7.9%) were disagreement. The same training set was used to train the posteriors in Eq. (2) and Eq. (3). That is, the local label estimator was trained first, and then the integrator was trained using the local class estimation results and the utterance labels. In both training steps, oversampling [13] was used because of the imbalance in the training classes. We used hand transcripts to obtain words and word intervals, but these are usually not available in practice; using speech recognition output is hence future work. Neural networks were used in both the local class estimation step and the utterance class estimation step, with two hidden layers of 256 nodes and one hidden layer of 32 nodes, respectively. The F0 and intensity values in each frame were extracted by openSMILE [14] with a 50 ms frame length and a 10 ms frame shift. We used [2], which relies on global features to detect agreement and disagreement utterances, as the baseline method.
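The paper does not name the toolkit used for these networks; as one possible realization under that caveat, the sketch below (Python, scikit-learn and NumPy assumed, all names illustrative) sets up the two classifiers with the stated layer sizes and a naive random-oversampling routine standing in for the oversampling of [13].

```python
import numpy as np
from collections import Counter
from sklearn.neural_network import MLPClassifier


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class examples until every class matches the largest class."""
    rng = np.random.default_rng(seed)
    features, labels = np.asarray(features), np.asarray(labels)
    largest = max(Counter(labels).values())
    indices = []
    for cls, count in Counter(labels).items():
        cls_idx = np.where(labels == cls)[0]
        indices.extend(cls_idx)
        indices.extend(rng.choice(cls_idx, size=largest - count, replace=True))
    indices = np.asarray(indices)
    return features[indices], labels[indices]


# Local class estimator (trained first, on local features and correct local classes):
# two hidden layers of 256 nodes.
local_clf = MLPClassifier(hidden_layer_sizes=(256, 256))

# Utterance-level integrator (trained afterwards, on the Phi statistics of the local
# estimation results and the utterance labels): one hidden layer of 32 nodes.
utterance_clf = MLPClassifier(hidden_layer_sizes=(32,))
```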
The accuracies of agreement and disagreement utterance detection and of local label estimation are shown in Table 2 and Table 3, respectively.

[Table 2: Detection accuracies of agreement and disagreement utterances (Total / Agree / Disagr.) for Pros, Lex, and Pros+Lex features, comparing the baseline with the proposed method at window lengths 1 (self only), 3 (self ± 1 word), 5 (self ± 2 words), 7 (self ± 3 words), and 9 (self ± 4 words).]

[Table 3: Estimation accuracies of local labels (Total / Agree / Disagr.) for Pros, Lex, and Pros+Lex features with the same window lengths.]

Pros, Lex, and Pros+Lex denote the results achieved by using only prosodic features, only lexical features, and both prosodic and lexical features, respectively. Window length is the number of words covered by a local window. Bold indicates the maximum value in each column. The accuracy with lexical features only was not calculated for the case where the window covers only a single word, since 2-gram perplexity and posteriors cannot be calculated there.

Looking at the results with prosodic features, every total accuracy of the proposed method is better than that of the baseline. The maximum improvement over the baseline is 10.1%, obtained when using 3 prior and 3 following words. This indicates that the local prosodic characteristics of an utterance are effective in detecting agreement and disagreement. Focusing on the window length, increasing the length of the local window tends to raise accuracy, but when the window includes 4 prior and 4 following words, utterance accuracy decreases. We believe the reasons are that a long window often includes intervals unrelated to agreement/disagreement and that the increase in feature dimensionality makes the classifier less robust. These results show that the local prosodic characteristics accompanying agreement and disagreement have a relatively long duration, on the order of several word intervals, but that excessively long windows decrease detection accuracy. This corresponds with the discussion in Section 3.

We also examine the results with the lexical features. The utterance label accuracies of the proposed method with lexical features exceeded the baseline accuracy by over 20%. The improvements over the baseline given by the proposed method with lexical features are greater than those with prosodic features. These results indicate that local lexical characteristics are also effective in detecting agreement and disagreement utterances. Utterance estimation accuracy decreases as the window widens, but local estimation accuracy increases. This indicates that the current method of integrating the local results is not optimal with lexical features.

The proposed method with both lexical and prosodic features also demonstrated utterance label estimation performance superior to the baseline. However, its local label estimation accuracy decreased as the window widened; this result differs from those with lexical or prosodic features only. This indicates that the simple combination of prosodic and lexical features described herein may not be suitable, and consideration of more advanced combination methods, such as variable subinterval lengths, is a remaining problem.

5. Conclusions

In this paper, we proposed a new agreement and disagreement utterance detection framework for conversational speech that uses local prosodic/lexical features. To detect agreement and disagreement utterances whose characteristics appear in only some part of the utterance, we utilize local features extracted from short windows that cover several words. Local labels are estimated from those features, and the posteriors of the local labels are integrated to estimate the utterance label. Experiments on free discussion speech showed that the proposed method improves the accuracy of detecting agreement and disagreement utterances. Its good performance is due to the use of the local prosodic/lexical characteristics of utterances.

One piece of future work is improving the integration of the local estimation results. The method proposed herein uses the mean and standard deviation of the local results and so does not utilize sequential information, which would seem to be effective in detecting agreement and disagreement utterances. Other future work includes considering better combinations of prosodic and lexical features, evaluation with automatic speech recognition results, and evaluation with other conversational speech datasets used in previous studies.

6. References

[1] C. Lai and S. Renals, "Incorporating Lexical and Prosodic Information at Different Levels for Meeting Summarization," in Proc. of INTERSPEECH 2014.
[2] D. Hillard, M. Ostendorf, and E. Shriberg, "Detection of Agreement vs. Disagreement in Meetings: Training with Unlabeled Data," in Proc. of HLT-NAACL 2003, vol. 2.
[3] M. Galley, K. McKeown, J. Hirschberg, and E. Shriberg, "Identifying Agreement and Disagreement in Conversational Speech: Use of Bayesian Networks to Model Pragmatic Dependencies," in Proc. of the 42nd Annual Meeting of the ACL.
[4] S. Hahn, R. Ladner, and M. Ostendorf, "Agreement/Disagreement Classification: Exploiting Unlabeled Data using Contrast Classifiers," in Proc. of HLT-NAACL.
[5] W. Wang, S. Yaman, K. Precoda, C. Richey, and G. Raymond, "Detection of Agreement and Disagreement in Broadcast Conversation," in Proc. of the 49th Annual Meeting of the ACL.
[6] W. Wang, K. Precoda, C. Richey, and G. Raymond, "Identifying Agreement/Disagreement in Conversational Speech: A Cross-lingual Study," in Proc. of INTERSPEECH 2011.
[7] K. Bousmalis, L. P. Morency, and M. Pantic, "Modeling hidden dynamics of multimodal cues for spontaneous agreement and disagreement recognition," in Proc. of Automatic Face & Gesture Recognition and Workshops.
[8] S. Germesin and T. Wilson, "Agreement detection in multiparty conversation," in Proc. of the International Conference on Multimodal Interfaces, pp. 7-14.
[9] S. Siegel and N. J. Castellan, Nonparametric Statistics for the Behavioral Sciences, McGraw-Hill.
[10] V. K. R. Sridhar, A. Nenkova, S. Narayanan, and D. Jurafsky, "Detecting prominence in conversational speech: pitch accent, givenness and focus," in Proc. of Speech Prosody.
[11] E. Strangert, "Emphasis by Pausing," in Proc. of the 15th ICPhS.
[12] S. Fujie, D. Yagi, H. Kikuchi, and T. Kobayashi, "Prosody based Attitude Recognition with Feature Selection and Its Application to Spoken Dialog System as Para-Linguistic Information," in Proc. of ICSLP 2004, vol. 4.
[13] N. Japkowicz, "The Class Imbalance Problem: Significance and Strategies," in Proc. of the 2000 International Conference on Artificial Intelligence.
[14] F. Eyben, M. Wöllmer, and B. Schuller, "opensmile - the Munich versatile and fast open-source audio feature extractor," in Proc. of ACM Multimedia.
