Towards an Evolutionary Computational Approach to Articulatory Vocal Synthesis with PRAAT

Size: px
Start display at page:

Download "Towards an Evolutionary Computational Approach to Articulatory Vocal Synthesis with PRAAT"

Transcription

1 Towards an Evolutionary Computational Approach to Articulatory Vocal Synthesis with PRAAT Jared Drayton and Eduardo Miranda Interdisciplinary Centre for Computer Music Research, Plymouth University, UK Abstract. This paper presents our current work into developing an evolutionary computing approach to articulatory speech synthesis. Specifically, we implement genetic algorithms to find optimised parameter combinations for the re-synthesis of a vowel using the articulatory synthesiser PRAAT. Our framework analyses the target sound using Fast Fourier Transform (FFT) to obtain formant information, which is then harnessed in a fitness function applied to a real valued genetic algorithm using a generation size of 75 sounds over 50 generations. In this paper, we present three differently configured genetic algorithms (GAs) and offer a comparison of their suitability for elevating the average fitness of the re-synthesised sounds. Keywords: Articulatory Vocal Synthesis, Vocal Synthesis, Evolutionary Computing, Speech, PRAAT, Genetic Algorithms 1 Introduction Computing technology has advanced at a rapid frequency over the last eighty years. As computers are becoming more ubiquitous in our everyday lives, the need to communicate with our technology is increasing. Speech synthesis is the artificial production of human speech and features in an increasing amount of our digital devices. We can see the use of speech synthesis in a wide span of technologies, ranging from car GPS navigation to video games. Currently, there are three main approaches to artificially producing speech: concatenative synthesis, formant synthesis and articulatory synthesis. Out of these three, concatenative synthesis is the approach that dominates. Concatenative speech synthesis is a sound synthesis approach where small sound units of pre-recorded speech are selected from a database, and sequenced together to produce a target sound or sound sequence. This approach currently offers the highest amount of intelligibility and naturalness when compared to the other techniques available. Because the technique relies on the arranging of sound recordings from human speakers, it bypasses some of the drawbacks inherent in other methods; for example, the unnatural timbre of formant synthesis,

2 or an imperfect physical model used in articulatory synthesis. However, there are a number of limitations on concatenative synthesis systems that result from its reliance on pre-recorded speech. The corpus of sounds that concatenative synthesis relies on is finite, and the segments themselves cannot be modified extensively without negatively impacting the quality and naturalness of the sound. This severely limits the capacity to modify prosody in relation to the text given. Therefore, to account for different types of prosody, it must be accounted for in the creation of the original corpus. Articulatory synthesis is widely considered to have the biggest potential out of all current speech synthesis techniques [1]. However, as it stands, articulatory speech synthesis is largely unexploited and undeveloped. This is largely attributed to the difficulty of producing a robust articulatory Text To Speech (TTS) system that can perform on a par with existing concatenative solutions. This is due to the highly complex and non-linear relationship between parameters and the resultant sound. There have been a number of different approaches attempted for extracting vocal tract area functions, or articulatory movements. These range from using methods of imaging the vocal apparatus during speech (using machines such as an X-Ray [2] or MRI [3]) to attaching sensors to the articulators themselves. Inversion of parameters from the original audio has also been attempted [4]. In this paper we present a framework for developing an evolutionary computing approach to articulatory speech synthesis together with some initial results. The primary motivation for this research is to explore approaches to an automatic system of obtaining vocal tract area functions from recorded speech data. This is highly desirable in furthering the field of articulatory synthesis and the field of speech synthesis in general. 2 Background 2.1 Articulatory Synthesis Articulatory synthesis is a physical modelling approach to sound synthesis. These physical models emulate the physiology of the human vocal apparatus. They simulate how air exhaled from the lungs is modified by the glottis and larynx, then propagated through the vocal tract and further modified by the articulators such as the tongue and lips. Control of this synthesis method is achieved by passing numerical values to parameters that correspond to individual muscles or muscles groupings. Therefore, any set of parameter values can be thought of as describing an articulatory configuration or articulatory movement i.e. describing a vocal area tract function. There are a number of different approaches when it comes to the design of articulatory synthesisers. Some synthesisers favour the use of a simplified periodic signal in place of physically modelling the larynx. This allows the fundamental frequency or pitch to be defined manually, and decrease the complexity of the model. By not attempting to simulate the lungs and larynx, the realism in terms

3 Fig. 1. Mid-Sagittal View of the Human Vocal Apparatus of phonetic quality is reduced. Breathing patterns also have a great impact on prosody, and are also essential for accurately modelling of fricatives and plosive speech sounds. 2.2 Evolutionary Computation Within the field of evolutionary computing, there is a group of heuristic search techniques known as evolutionary algorithms that draw inspiration from the neodarwinian paradigm of evolution and survival of the fittest. These evolutionary algorithms work on an iterative process of generating solutions and testing their fitness and suitability using a fitness funtion, then combining genetic material from the fittest candidates. This is done by using genetic operators such as selection, crossover and mutation. Genetic Algorithms were developed by John Holland and put forward in the seminal text Adaptation in Natural and Artificial Systems [5]. They have been employed in a variety of different optimisation tasks [6], especially in tasks where the search space is large and not well understood. The approach of using evolutionary computing for non-linear sound synthesis applications is not a new concept, and has been explored by a number of researchers. Several different EC techniques are given in Evolutionary Computer Music [7] for musical applications, with chapters 5-7 specifically implementing GA s. Parameter matching with Frequency Modulation (FM) synthesis has also been explored [8], [9].

4 3 Methods The articulatory synthesiser used in this project is the PRAAT articulatory synthesiser [10]. PRAAT is a multi-functional software package with tools for a large range of speech analysis and synthesis tasks developed by Paul Boersma and David Weenink [11]. Whilst having a fully-fledged graphical user interface, PRAAT also provides the ability to use its own scripting language, allowing the majority of operations to be executed autonomously. This functionality allows a genetic algorithm to be implemented in conjunction with PRAAT without a great deal of retrofitting that would be required with other available synthesisers such as VTDemo or Vocal Tract Lab 2. Additionally the provided analysis tools for speech make the integration of an appropriate fitness function highly convenient, and minimises the need for using external tools. The physical model constraints are configured to use an adult male speaker. Control of the synthesiser is done by passing a configuration file that contains a list of all parameters for the model. The model used in PRAAT has 29 parameters that can be specified. Therefore the encoding approach taken in this GA is a real value representation, with each individual stored as a vector of 28 numbers. Each number of the vector represents a parameter and can take any value in the range -1 x 1. Where p 1 = Interarytenoid, p 2 = Cricothyroid, p 3 = Vocalis, p 4 = Thyroarytenoid etc. [ p1 p 2 p 3 p 4... p n ] Therefore a randomly generated individual may be initialised with parameter values such as Interarytenoid = 0.82, Cricothyroid = -0.2, Vocalis = -0.48, Thyroarytenoid = 0.1 etc. [ 0.82, 0.2, 0.48, pn ] Only one parameter uses prior knowledge. The lungs are set to a predefined value of 0.15 at the beginning of the articulation, then 0.0 at 0.1 seconds. PRAAT automatically interpolates values between these two discrete settings. The reasoning behind this choice is that unlike the other parameters, the lungs parameter needs to be changed over time to provide energy or excitation of the vocal folds necessary for phonation. This parameter is kept the same for every individual generated and is not altered by any GA operations, hence the reason individuals are represented as vectors with a length of 28, and not 29. The fitness function is implemented by using a FFT for analysis of features of the target sound. Four frequencies are extracted from each sound. The first is the fundamental frequency or pitch of the sound. The next three frequencies are the first three formants produced. This analysis is performed on the target vowel sound, and then subsequently performed on each individual sound in every generation. Fitness is based on the differences between the four frequencies in the target sound, and the respective frequencies in each individual. A penalty is introduced for each formant that is not present in the candidate s solution, which replaces the difference in frequency with a large arbitrary value (10,000).

5 The natural state of the fitness function in this application is a minimisation function, as the goal is to minimise the differences between features of two sounds rather than maximise some sort of profit. It is therefore necessary to scale the fitness for each individual in order to implement a fitness proportional selection scheme. This is achieved through dividing one by each candidate s fitness. Because of the inherent use of stochastic processes in Genetic Algorithms, any analysis of results must take this into account. Each experiment is run multiple times to ensure that there is no bias due to the stochastic process. Pseudorandom numbers are used for any stochastic processes and are provided by the built-in random function in Python, which uses a Mersenne twist algorithm. Mutation of allele values is done using a Gaussian distribution where µ = 0 and σ = Any mutation that results in allele values outside of the constraints -1 x 1 are rounded to 1.0 or Results Three differently configured genetic algorithms are presented, with the results displayed using a performance graph, as shown in Fig.2, 3 and 4. These graphs show the average fitness of the population at every generation, and the fitness of the best candidate from each population at every generation. All experiments are carried out with the same size populations and number of generations, with number of generations set to 50 and a population size of 75. The generation at 0 on each performance graph is the initial randomly generated population without having any genetic operations performed. This would be a random voice configuration. 4.1 Experiment 1 - Elitism Operator The first run is not strictly a genetic algorithm, as there is no exchange of genetic material between individuals generated in each population. It is more akin to a hybrid parallel evolutionary strategy. This first experiment was a proof of concept to demonstrate the ability that the fitness function worked as it should, and that it could differentiate. As displayed in Fig.2, the results of this show the average fitness of each population steadily increasing. This exploratory measure confirms the basic viability and behaviour in this domain. 4.2 Experiment 2 - Fitness Proportional Selection with One Point Crossover This experiment sees the introduction of fitness proportional selection (FPS), combined with one point crossover for exchange of genetic material between candidate solutions. An example of one point crossover is shown below. Two parent candidates are selected by making two calls to the FPS function which returns a candidate for each call. A random crossover point is generated and used to combine values

6 Fig. 2. Performance Graph of Using Only an Elitism Operator from both parents before and after this point. For this example a shortened range of 10 parameters is used. P arent1 [ 0.25, 0.32, 0.97, 0.6, 0.23, 0.5, 0.31, 0.89, 0.4, 0.93 ] P arent2 [ 0.4, 0.24, 0.64, 0.35, 0.51, 0.7, 0.93, 0.19, 0.83, 0.18 ] After one point crossover with a value of three, an offspring or child candidate is created by combining the first three allele values from parent 1 and then the last seven values from parent 2. This then creates a new solution containing genetic material from both parent candidates. Offspring [ 0.25, 0.32, 0.97, 0.35, 0.51, 0.7, 0.93, 0.19, 0.83, 0.18 ] This process if then repeated until a new population has been generated. In Fig.3, it shows the average fitness converges much quicker than when using just elitism. However there is stagnation of genetic diversity from around the 23rd generation where both the average fitness and best candidate of each generation shows very little change. 4.3 Experiment 3 - Fitness Proportional Selection with One Point Crossover and Mutation Here, the FPS operator is kept for selection. A mutation operator is also implemented with each allele value having a probability of 0.1 to mutate.

7 Fig. 3. Performance Graph Using Fitness Proportional Selection and One Point Crossover As it can be observed in Fig.4 there is a rapid improvement in average fitness, after which it settles down into smaller fluctuations. The fitness of the best candidate at each point improves more slowly early on, but more consistently. The voice model is clearly converging towards the target sound. 5 Discussions 5.1 Observations It is clear that the elitism operator - because there is no combination of candidates - causes the average of each population to move steadily towards the target, but will not generate new candidates outside of a random search. Removing the elitism operator and replacing it with a fitness proportional selection operator, as done in Experiment 2, causes the rapid increase in the optimisation of the average population but leads to a stagnation of genetic diversity. With the incorporation of the mutation operator in experiment 3, the average fitness fluctuated more than in the previous experiments, but this also produced the best results with respect to the fittest candidate in each population. In general GAs seem to be able to optimise the PRAAT parameters.

8 Fig. 4. Performance graph using Fitness Proportional Selection, One Point Crossover and Mutation operators. 5.2 Future Work As this is an early work in progress there are several areas identified for extensive improvement and future research, these will be focused on the following. Fitness Function: The fitness function is a crucial aspect of any genetic algorithm. This is especially true when using a multimodal one. As it is the only metric that can guide the search process, it is therefore imperative that it accurately represents the suitability of a candidate solution. If the fitness of an individual is miss-represented, then regions of the search space may be exploited that are not conducive to finding good solutions. A number of shortcomings are clear with the current fitness function. For example the current analysis takes a FFT using a window size equal to the entire length of the sound. This does not account for things such as vocal breaks, intensity of phonation, modulation of pitch etc. Genetic Operator Additions: The fitness proportional selection scheme, when applied to other optimisation tasks, has been found to be in certain cases an inferior operator. Therefore, a Rank-Based Selection to bring a change in selection pressure will be implemented. Trials with different crossover operators will also be explored, such as uniform crossover and two point crossover. Genetic Algorithm Parameters: The relationship between number of generations, population size, and mutation rate will all have a large impact

9 on convergence and population diversity. As the synthesis of each individual is computationally expensive, minimising the total number of individuals in a run is desirable. Further experiments with different mutation rates, population sizes and number of generations need to be taken to ascertain optimal values. To conclude, our results indicate that GAs are a viable method of optimisation for the parameters of PRAAT, and therefore articulatory synthesis. A substantial number of improvements have been identified, which when implemented may improve the robustness and effectiveness of the genetic algorithm for use in mapping sounds to articulatory configurations. References 1. Shadle, C., Damper, R.: Prospects for articulatory synthesis: A position paper. In: 4th ISCA Tutorial and Research Workshop. (2002) 2. Schroeter, J., Sondhi, M.: Techniques for estimating vocal-tract shapes from the speech signal. IEEE Transactions on Speech and Audio Processing 2(1) (1994) Kim, Y.c., Kim, J., Proctor, M., Toutios, A., Nayak, K., Lee, S., Narayanan, S.: Toward Automatic Vocal Tract Area Function Estimation from Accelerated Threedimensional Magnetic Resonance Imaging. In: ISCA Workshop on Speech Production in Automatic Speech Recognition, Lyon, France (2013) Busset, J., Laprie, Y., Cnrs, L., Botanique, J.: Acoustic-to-articulatory inversion by analysis-by-synthesis using cepstral coefficients. In: ICA - 21st International Congress on Acoustics. Volume (2013) 5. Holland, J.H.: Adaptation in natural and artificial systems: An introductory analysis with applications to biology, control, and artificial intelligence. U Michigan Press (1975) 6. Goldberg, D.E., Others: Genetic algorithms in search, optimization, and machine learning. Volume 412. Addison-wesley Reading Menlo Park (1989) 7. Miranda, E.R., Al Biles, J.: Evolutionary computer music. Springer (2007) 8. Horner, A., Beauchamp, J., Haken, L.: Machine tongues XVI: Genetic algorithms and their application to FM matching synthesis. Computer Music Journal (1993) Mitchell, T.J.: An exploration of evolutionary computation applied to frequency modulation audio synthesis parameter optimisation. PhD thesis, University of the West of England (2010) 10. Boersma, P.: Praat, a system for doing phonetics by computer. Glot international 5(9/10) (2001) Boersma, P.: Functional phonology: Formalizing the interactions between articulatory and perceptual drives. Holland Academic Graphics/IFOTT (1998)

Laboratorio di Intelligenza Artificiale e Robotica

Laboratorio di Intelligenza Artificiale e Robotica Laboratorio di Intelligenza Artificiale e Robotica A.A. 2008-2009 Outline 2 Machine Learning Unsupervised Learning Supervised Learning Reinforcement Learning Genetic Algorithms Genetics-Based Machine Learning

More information

Laboratorio di Intelligenza Artificiale e Robotica

Laboratorio di Intelligenza Artificiale e Robotica Laboratorio di Intelligenza Artificiale e Robotica A.A. 2008-2009 Outline 2 Machine Learning Unsupervised Learning Supervised Learning Reinforcement Learning Genetic Algorithms Genetics-Based Machine Learning

More information

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Tomi Kinnunen and Ismo Kärkkäinen University of Joensuu, Department of Computer Science, P.O. Box 111, 80101 JOENSUU,

More information

Human Emotion Recognition From Speech

Human Emotion Recognition From Speech RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati

More information

AUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION

AUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 11/2007, ISSN 1642-6037 Marek WIŚNIEWSKI *, Wiesława KUNISZYK-JÓŹKOWIAK *, Elżbieta SMOŁKA *, Waldemar SUSZYŃSKI * HMM, recognition, speech, disorders

More information

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should

More information

Phonetics. The Sound of Language

Phonetics. The Sound of Language Phonetics. The Sound of Language 1 The Description of Sounds Fromkin & Rodman: An Introduction to Language. Fort Worth etc., Harcourt Brace Jovanovich Read: Chapter 5, (p. 176ff.) (or the corresponding

More information

1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all

1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all Human Communication Science Chandler House, 2 Wakefield Street London WC1N 1PF http://www.hcs.ucl.ac.uk/ ACOUSTICS OF SPEECH INTELLIGIBILITY IN DYSARTHRIA EUROPEAN MASTER S S IN CLINICAL LINGUISTICS UNIVERSITY

More information

Design Of An Automatic Speaker Recognition System Using MFCC, Vector Quantization And LBG Algorithm

Design Of An Automatic Speaker Recognition System Using MFCC, Vector Quantization And LBG Algorithm Design Of An Automatic Speaker Recognition System Using MFCC, Vector Quantization And LBG Algorithm Prof. Ch.Srinivasa Kumar Prof. and Head of department. Electronics and communication Nalanda Institute

More information

Audible and visible speech

Audible and visible speech Building sensori-motor prototypes from audiovisual exemplars Gérard BAILLY Institut de la Communication Parlée INPG & Université Stendhal 46, avenue Félix Viallet, 383 Grenoble Cedex, France web: http://www.icp.grenet.fr/bailly

More information

Quarterly Progress and Status Report. VCV-sequencies in a preliminary text-to-speech system for female speech

Quarterly Progress and Status Report. VCV-sequencies in a preliminary text-to-speech system for female speech Dept. for Speech, Music and Hearing Quarterly Progress and Status Report VCV-sequencies in a preliminary text-to-speech system for female speech Karlsson, I. and Neovius, L. journal: STL-QPSR volume: 35

More information

Seminar - Organic Computing

Seminar - Organic Computing Seminar - Organic Computing Self-Organisation of OC-Systems Markus Franke 25.01.2006 Typeset by FoilTEX Timetable 1. Overview 2. Characteristics of SO-Systems 3. Concern with Nature 4. Design-Concepts

More information

Speaker Identification by Comparison of Smart Methods. Abstract

Speaker Identification by Comparison of Smart Methods. Abstract Journal of mathematics and computer science 10 (2014), 61-71 Speaker Identification by Comparison of Smart Methods Ali Mahdavi Meimand Amin Asadi Majid Mohamadi Department of Electrical Department of Computer

More information

Learning Methods for Fuzzy Systems

Learning Methods for Fuzzy Systems Learning Methods for Fuzzy Systems Rudolf Kruse and Andreas Nürnberger Department of Computer Science, University of Magdeburg Universitätsplatz, D-396 Magdeburg, Germany Phone : +49.39.67.876, Fax : +49.39.67.8

More information

A study of speaker adaptation for DNN-based speech synthesis

A study of speaker adaptation for DNN-based speech synthesis A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,

More information

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers October 31, 2003 Amit Juneja Department of Electrical and Computer Engineering University of Maryland, College Park,

More information

Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence

Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence INTERSPEECH September,, San Francisco, USA Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence Bidisha Sharma and S. R. Mahadeva Prasanna Department of Electronics

More information

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Hua Zhang, Yun Tang, Wenju Liu and Bo Xu National Laboratory of Pattern Recognition Institute of Automation, Chinese

More information

The dilemma of Saussurean communication

The dilemma of Saussurean communication ELSEVIER BioSystems 37 (1996) 31-38 The dilemma of Saussurean communication Michael Oliphant Deparlment of Cognitive Science, University of California, San Diego, CA, USA Abstract A Saussurean communication

More information

Speech Emotion Recognition Using Support Vector Machine

Speech Emotion Recognition Using Support Vector Machine Speech Emotion Recognition Using Support Vector Machine Yixiong Pan, Peipei Shen and Liping Shen Department of Computer Technology Shanghai JiaoTong University, Shanghai, China panyixiong@sjtu.edu.cn,

More information

Consonants: articulation and transcription

Consonants: articulation and transcription Phonology 1: Handout January 20, 2005 Consonants: articulation and transcription 1 Orientation phonetics [G. Phonetik]: the study of the physical and physiological aspects of human sound production and

More information

International Journal of Advanced Networking Applications (IJANA) ISSN No. :

International Journal of Advanced Networking Applications (IJANA) ISSN No. : International Journal of Advanced Networking Applications (IJANA) ISSN No. : 0975-0290 34 A Review on Dysarthric Speech Recognition Megha Rughani Department of Electronics and Communication, Marwadi Educational

More information

Cooperative evolutive concept learning: an empirical study

Cooperative evolutive concept learning: an empirical study Cooperative evolutive concept learning: an empirical study Filippo Neri University of Piemonte Orientale Dipartimento di Scienze e Tecnologie Avanzate Piazza Ambrosoli 5, 15100 Alessandria AL, Italy Abstract

More information

The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access

The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access Joyce McDonough 1, Heike Lenhert-LeHouiller 1, Neil Bardhan 2 1 Linguistics

More information

While you are waiting... socrative.com, room number SIMLANG2016

While you are waiting... socrative.com, room number SIMLANG2016 While you are waiting... socrative.com, room number SIMLANG2016 Simulating Language Lecture 4: When will optimal signalling evolve? Simon Kirby simon@ling.ed.ac.uk T H E U N I V E R S I T Y O H F R G E

More information

Speaker recognition using universal background model on YOHO database

Speaker recognition using universal background model on YOHO database Aalborg University Master Thesis project Speaker recognition using universal background model on YOHO database Author: Alexandre Majetniak Supervisor: Zheng-Hua Tan May 31, 2011 The Faculties of Engineering,

More information

Speech Recognition at ICSI: Broadcast News and beyond

Speech Recognition at ICSI: Broadcast News and beyond Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI

More information

Perceptual scaling of voice identity: common dimensions for different vowels and speakers

Perceptual scaling of voice identity: common dimensions for different vowels and speakers DOI 10.1007/s00426-008-0185-z ORIGINAL ARTICLE Perceptual scaling of voice identity: common dimensions for different vowels and speakers Oliver Baumann Æ Pascal Belin Received: 15 February 2008 / Accepted:

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Speech Communication Session 2aSC: Linking Perception and Production

More information

Linking Task: Identifying authors and book titles in verbose queries

Linking Task: Identifying authors and book titles in verbose queries Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,

More information

Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for

Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for Email Marilyn A. Walker Jeanne C. Fromer Shrikanth Narayanan walker@research.att.com jeannie@ai.mit.edu shri@research.att.com

More information

Quarterly Progress and Status Report. Voiced-voiceless distinction in alaryngeal speech - acoustic and articula

Quarterly Progress and Status Report. Voiced-voiceless distinction in alaryngeal speech - acoustic and articula Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Voiced-voiceless distinction in alaryngeal speech - acoustic and articula Nord, L. and Hammarberg, B. and Lundström, E. journal:

More information

PRAAT ON THE WEB AN UPGRADE OF PRAAT FOR SEMI-AUTOMATIC SPEECH ANNOTATION

PRAAT ON THE WEB AN UPGRADE OF PRAAT FOR SEMI-AUTOMATIC SPEECH ANNOTATION PRAAT ON THE WEB AN UPGRADE OF PRAAT FOR SEMI-AUTOMATIC SPEECH ANNOTATION SUMMARY 1. Motivation 2. Praat Software & Format 3. Extended Praat 4. Prosody Tagger 5. Demo 6. Conclusions What s the story behind?

More information

On Developing Acoustic Models Using HTK. M.A. Spaans BSc.

On Developing Acoustic Models Using HTK. M.A. Spaans BSc. On Developing Acoustic Models Using HTK M.A. Spaans BSc. On Developing Acoustic Models Using HTK M.A. Spaans BSc. Delft, December 2004 Copyright c 2004 M.A. Spaans BSc. December, 2004. Faculty of Electrical

More information

Designing a Rubric to Assess the Modelling Phase of Student Design Projects in Upper Year Engineering Courses

Designing a Rubric to Assess the Modelling Phase of Student Design Projects in Upper Year Engineering Courses Designing a Rubric to Assess the Modelling Phase of Student Design Projects in Upper Year Engineering Courses Thomas F.C. Woodhall Masters Candidate in Civil Engineering Queen s University at Kingston,

More information

Python Machine Learning

Python Machine Learning Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

More information

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion

More information

Software Development: Programming Paradigms (SCQF level 8)

Software Development: Programming Paradigms (SCQF level 8) Higher National Unit Specification General information Unit code: HL9V 35 Superclass: CB Publication date: May 2017 Source: Scottish Qualifications Authority Version: 01 Unit purpose This unit is intended

More information

International Journal of Computational Intelligence and Informatics, Vol. 1 : No. 4, January - March 2012

International Journal of Computational Intelligence and Informatics, Vol. 1 : No. 4, January - March 2012 Text-independent Mono and Cross-lingual Speaker Identification with the Constraint of Limited Data Nagaraja B G and H S Jayanna Department of Information Science and Engineering Siddaganga Institute of

More information

WHEN THERE IS A mismatch between the acoustic

WHEN THERE IS A mismatch between the acoustic 808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,

More information

Evolution of Symbolisation in Chimpanzees and Neural Nets

Evolution of Symbolisation in Chimpanzees and Neural Nets Evolution of Symbolisation in Chimpanzees and Neural Nets Angelo Cangelosi Centre for Neural and Adaptive Systems University of Plymouth (UK) a.cangelosi@plymouth.ac.uk Introduction Animal communication

More information

Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology

Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology ISCA Archive SUBJECTIVE EVALUATION FOR HMM-BASED SPEECH-TO-LIP MOVEMENT SYNTHESIS Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano Graduate School of Information Science, Nara Institute of Science & Technology

More information

Speaker Recognition. Speaker Diarization and Identification

Speaker Recognition. Speaker Diarization and Identification Speaker Recognition Speaker Diarization and Identification A dissertation submitted to the University of Manchester for the degree of Master of Science in the Faculty of Engineering and Physical Sciences

More information

The IRISA Text-To-Speech System for the Blizzard Challenge 2017

The IRISA Text-To-Speech System for the Blizzard Challenge 2017 The IRISA Text-To-Speech System for the Blizzard Challenge 2017 Pierre Alain, Nelly Barbot, Jonathan Chevelu, Gwénolé Lecorvé, Damien Lolive, Claude Simon, Marie Tahon IRISA, University of Rennes 1 (ENSSAT),

More information

Knowledge-Based - Systems

Knowledge-Based - Systems Knowledge-Based - Systems ; Rajendra Arvind Akerkar Chairman, Technomathematics Research Foundation and Senior Researcher, Western Norway Research institute Priti Srinivas Sajja Sardar Patel University

More information

Reinforcement Learning by Comparing Immediate Reward

Reinforcement Learning by Comparing Immediate Reward Reinforcement Learning by Comparing Immediate Reward Punit Pandey DeepshikhaPandey Dr. Shishir Kumar Abstract This paper introduces an approach to Reinforcement Learning Algorithm by comparing their immediate

More information

Axiom 2013 Team Description Paper

Axiom 2013 Team Description Paper Axiom 2013 Team Description Paper Mohammad Ghazanfari, S Omid Shirkhorshidi, Farbod Samsamipour, Hossein Rahmatizadeh Zagheli, Mohammad Mahdavi, Payam Mohajeri, S Abbas Alamolhoda Robotics Scientific Association

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

Multiagent Simulation of Learning Environments

Multiagent Simulation of Learning Environments Multiagent Simulation of Learning Environments Elizabeth Sklar and Mathew Davies Dept of Computer Science Columbia University New York, NY 10027 USA sklar,mdavies@cs.columbia.edu ABSTRACT One of the key

More information

A Pipelined Approach for Iterative Software Process Model

A Pipelined Approach for Iterative Software Process Model A Pipelined Approach for Iterative Software Process Model Ms.Prasanthi E R, Ms.Aparna Rathi, Ms.Vardhani J P, Mr.Vivek Krishna Electronics and Radar Development Establishment C V Raman Nagar, Bangalore-560093,

More information

Business Analytics and Information Tech COURSE NUMBER: 33:136:494 COURSE TITLE: Data Mining and Business Intelligence

Business Analytics and Information Tech COURSE NUMBER: 33:136:494 COURSE TITLE: Data Mining and Business Intelligence Business Analytics and Information Tech COURSE NUMBER: 33:136:494 COURSE TITLE: Data Mining and Business Intelligence COURSE DESCRIPTION This course presents computing tools and concepts for all stages

More information

Rule discovery in Web-based educational systems using Grammar-Based Genetic Programming

Rule discovery in Web-based educational systems using Grammar-Based Genetic Programming Data Mining VI 205 Rule discovery in Web-based educational systems using Grammar-Based Genetic Programming C. Romero, S. Ventura, C. Hervás & P. González Universidad de Córdoba, Campus Universitario de

More information

A comparison of spectral smoothing methods for segment concatenation based speech synthesis

A comparison of spectral smoothing methods for segment concatenation based speech synthesis D.T. Chappell, J.H.L. Hansen, "Spectral Smoothing for Speech Segment Concatenation, Speech Communication, Volume 36, Issues 3-4, March 2002, Pages 343-373. A comparison of spectral smoothing methods for

More information

Lecture 1: Basic Concepts of Machine Learning

Lecture 1: Basic Concepts of Machine Learning Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010

More information

Major Milestones, Team Activities, and Individual Deliverables

Major Milestones, Team Activities, and Individual Deliverables Major Milestones, Team Activities, and Individual Deliverables Milestone #1: Team Semester Proposal Your team should write a proposal that describes project objectives, existing relevant technology, engineering

More information

Artificial Neural Networks written examination

Artificial Neural Networks written examination 1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14

More information

On the Formation of Phoneme Categories in DNN Acoustic Models

On the Formation of Phoneme Categories in DNN Acoustic Models On the Formation of Phoneme Categories in DNN Acoustic Models Tasha Nagamine Department of Electrical Engineering, Columbia University T. Nagamine Motivation Large performance gap between humans and state-

More information

A Neural Network GUI Tested on Text-To-Phoneme Mapping

A Neural Network GUI Tested on Text-To-Phoneme Mapping A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis

More information

EGRHS Course Fair. Science & Math AP & IB Courses

EGRHS Course Fair. Science & Math AP & IB Courses EGRHS Course Fair Science & Math AP & IB Courses Science Courses: AP Physics IB Physics SL IB Physics HL AP Biology IB Biology HL AP Physics Course Description Course Description AP Physics C (Mechanics)

More information

ISFA2008U_120 A SCHEDULING REINFORCEMENT LEARNING ALGORITHM

ISFA2008U_120 A SCHEDULING REINFORCEMENT LEARNING ALGORITHM Proceedings of 28 ISFA 28 International Symposium on Flexible Automation Atlanta, GA, USA June 23-26, 28 ISFA28U_12 A SCHEDULING REINFORCEMENT LEARNING ALGORITHM Amit Gil, Helman Stern, Yael Edan, and

More information

Evolutive Neural Net Fuzzy Filtering: Basic Description

Evolutive Neural Net Fuzzy Filtering: Basic Description Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:

More information

Word Segmentation of Off-line Handwritten Documents

Word Segmentation of Off-line Handwritten Documents Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department

More information

On-Line Data Analytics

On-Line Data Analytics International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob

More information

Softprop: Softmax Neural Network Backpropagation Learning

Softprop: Softmax Neural Network Backpropagation Learning Softprop: Softmax Neural Networ Bacpropagation Learning Michael Rimer Computer Science Department Brigham Young University Provo, UT 84602, USA E-mail: mrimer@axon.cs.byu.edu Tony Martinez Computer Science

More information

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,

More information

Document number: 2013/ Programs Committee 6/2014 (July) Agenda Item 42.0 Bachelor of Engineering with Honours in Software Engineering

Document number: 2013/ Programs Committee 6/2014 (July) Agenda Item 42.0 Bachelor of Engineering with Honours in Software Engineering Document number: 2013/0006139 Programs Committee 6/2014 (July) Agenda Item 42.0 Bachelor of Engineering with Honours in Software Engineering Program Learning Outcomes Threshold Learning Outcomes for Engineering

More information

The NICT/ATR speech synthesis system for the Blizzard Challenge 2008

The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 Ranniery Maia 1,2, Jinfu Ni 1,2, Shinsuke Sakai 1,2, Tomoki Toda 1,3, Keiichi Tokuda 1,4 Tohru Shimizu 1,2, Satoshi Nakamura 1,2 1 National

More information

TD(λ) and Q-Learning Based Ludo Players

TD(λ) and Q-Learning Based Ludo Players TD(λ) and Q-Learning Based Ludo Players Majed Alhajry, Faisal Alvi, Member, IEEE and Moataz Ahmed Abstract Reinforcement learning is a popular machine learning technique whose inherent self-learning ability

More information

Unit purpose and aim. Level: 3 Sub-level: Unit 315 Credit value: 6 Guided learning hours: 50

Unit purpose and aim. Level: 3 Sub-level: Unit 315 Credit value: 6 Guided learning hours: 50 Unit Title: Game design concepts Level: 3 Sub-level: Unit 315 Credit value: 6 Guided learning hours: 50 Unit purpose and aim This unit helps learners to familiarise themselves with the more advanced aspects

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

Integrating simulation into the engineering curriculum: a case study

Integrating simulation into the engineering curriculum: a case study Integrating simulation into the engineering curriculum: a case study Baidurja Ray and Rajesh Bhaskaran Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, New York, USA E-mail:

More information

MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE

MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE Master of Science (M.S.) Major in Computer Science 1 MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE Major Program The programs in computer science are designed to prepare students for doctoral research,

More information

Alpha provides an overall measure of the internal reliability of the test. The Coefficient Alphas for the STEP are:

Alpha provides an overall measure of the internal reliability of the test. The Coefficient Alphas for the STEP are: Every individual is unique. From the way we look to how we behave, speak, and act, we all do it differently. We also have our own unique methods of learning. Once those methods are identified, it can make

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

Software Maintenance

Software Maintenance 1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories

More information

Courses in English. Application Development Technology. Artificial Intelligence. 2017/18 Spring Semester. Database access

Courses in English. Application Development Technology. Artificial Intelligence. 2017/18 Spring Semester. Database access The courses availability depends on the minimum number of registered students (5). If the course couldn t start, students can still complete it in the form of project work and regular consultations with

More information

GACE Computer Science Assessment Test at a Glance

GACE Computer Science Assessment Test at a Glance GACE Computer Science Assessment Test at a Glance Updated May 2017 See the GACE Computer Science Assessment Study Companion for practice questions and preparation resources. Assessment Name Computer Science

More information

Mandarin Lexical Tone Recognition: The Gating Paradigm

Mandarin Lexical Tone Recognition: The Gating Paradigm Kansas Working Papers in Linguistics, Vol. 0 (008), p. 8 Abstract Mandarin Lexical Tone Recognition: The Gating Paradigm Yuwen Lai and Jie Zhang University of Kansas Research on spoken word recognition

More information

Circuit Simulators: A Revolutionary E-Learning Platform

Circuit Simulators: A Revolutionary E-Learning Platform Circuit Simulators: A Revolutionary E-Learning Platform Mahi Itagi Padre Conceicao College of Engineering, Verna, Goa, India. itagimahi@gmail.com Akhil Deshpande Gogte Institute of Technology, Udyambag,

More information

Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape

Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Koshi Odagiri 1, and Yoichi Muraoka 1 1 Graduate School of Fundamental/Computer Science and Engineering, Waseda University,

More information

Level 6. Higher Education Funding Council for England (HEFCE) Fee for 2017/18 is 9,250*

Level 6. Higher Education Funding Council for England (HEFCE) Fee for 2017/18 is 9,250* Programme Specification: Undergraduate For students starting in Academic Year 2017/2018 1. Course Summary Names of programme(s) and award title(s) Award type Mode of study Framework of Higher Education

More information

Specification of the Verity Learning Companion and Self-Assessment Tool

Specification of the Verity Learning Companion and Self-Assessment Tool Specification of the Verity Learning Companion and Self-Assessment Tool Sergiu Dascalu* Daniela Saru** Ryan Simpson* Justin Bradley* Eva Sarwar* Joohoon Oh* * Department of Computer Science ** Dept. of

More information

Reducing Features to Improve Bug Prediction

Reducing Features to Improve Bug Prediction Reducing Features to Improve Bug Prediction Shivkumar Shivaji, E. James Whitehead, Jr., Ram Akella University of California Santa Cruz {shiv,ejw,ram}@soe.ucsc.edu Sunghun Kim Hong Kong University of Science

More information

Bluetooth mlearning Applications for the Classroom of the Future

Bluetooth mlearning Applications for the Classroom of the Future Bluetooth mlearning Applications for the Classroom of the Future Tracey J. Mehigan, Daniel C. Doolan, Sabin Tabirca Department of Computer Science, University College Cork, College Road, Cork, Ireland

More information

Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration

Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration INTERSPEECH 2013 Semi-Supervised GMM and DNN Acoustic Model Training with Multi-system Combination and Confidence Re-calibration Yan Huang, Dong Yu, Yifan Gong, and Chaojun Liu Microsoft Corporation, One

More information

Automatic Pronunciation Checker

Automatic Pronunciation Checker Institut für Technische Informatik und Kommunikationsnetze Eidgenössische Technische Hochschule Zürich Swiss Federal Institute of Technology Zurich Ecole polytechnique fédérale de Zurich Politecnico federale

More information

A SURVEY OF FUZZY COGNITIVE MAP LEARNING METHODS

A SURVEY OF FUZZY COGNITIVE MAP LEARNING METHODS A SURVEY OF FUZZY COGNITIVE MAP LEARNING METHODS Wociech Stach, Lukasz Kurgan, and Witold Pedrycz Department of Electrical and Computer Engineering University of Alberta Edmonton, Alberta T6G 2V4, Canada

More information

PRODUCT COMPLEXITY: A NEW MODELLING COURSE IN THE INDUSTRIAL DESIGN PROGRAM AT THE UNIVERSITY OF TWENTE

PRODUCT COMPLEXITY: A NEW MODELLING COURSE IN THE INDUSTRIAL DESIGN PROGRAM AT THE UNIVERSITY OF TWENTE INTERNATIONAL CONFERENCE ON ENGINEERING AND PRODUCT DESIGN EDUCATION 6 & 7 SEPTEMBER 2012, ARTESIS UNIVERSITY COLLEGE, ANTWERP, BELGIUM PRODUCT COMPLEXITY: A NEW MODELLING COURSE IN THE INDUSTRIAL DESIGN

More information

LEGO MINDSTORMS Education EV3 Coding Activities

LEGO MINDSTORMS Education EV3 Coding Activities LEGO MINDSTORMS Education EV3 Coding Activities s t e e h s k r o W t n e d Stu LEGOeducation.com/MINDSTORMS Contents ACTIVITY 1 Performing a Three Point Turn 3-6 ACTIVITY 2 Written Instructions for a

More information

Problems of the Arabic OCR: New Attitudes

Problems of the Arabic OCR: New Attitudes Problems of the Arabic OCR: New Attitudes Prof. O.Redkin, Dr. O.Bernikova Department of Asian and African Studies, St. Petersburg State University, St Petersburg, Russia Abstract - This paper reviews existing

More information

A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING

A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING A GENERIC SPLIT PROCESS MODEL FOR ASSET MANAGEMENT DECISION-MAKING Yong Sun, a * Colin Fidge b and Lin Ma a a CRC for Integrated Engineering Asset Management, School of Engineering Systems, Queensland

More information

Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I

Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I Session 1793 Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I John Greco, Ph.D. Department of Electrical and Computer Engineering Lafayette College Easton, PA 18042 Abstract

More information

SOFTWARE EVALUATION TOOL

SOFTWARE EVALUATION TOOL SOFTWARE EVALUATION TOOL Kyle Higgins Randall Boone University of Nevada Las Vegas rboone@unlv.nevada.edu Higgins@unlv.nevada.edu N.B. This form has not been fully validated and is still in development.

More information

SARDNET: A Self-Organizing Feature Map for Sequences

SARDNET: A Self-Organizing Feature Map for Sequences SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu

More information

Expressive speech synthesis: a review

Expressive speech synthesis: a review Int J Speech Technol (2013) 16:237 260 DOI 10.1007/s10772-012-9180-2 Expressive speech synthesis: a review D. Govind S.R. Mahadeva Prasanna Received: 31 May 2012 / Accepted: 11 October 2012 / Published

More information

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Amit Juneja and Carol Espy-Wilson Department of Electrical and Computer Engineering University of Maryland,

More information

Analyzing the Usage of IT in SMEs

Analyzing the Usage of IT in SMEs IBIMA Publishing Communications of the IBIMA http://www.ibimapublishing.com/journals/cibima/cibima.html Vol. 2010 (2010), Article ID 208609, 10 pages DOI: 10.5171/2010.208609 Analyzing the Usage of IT

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

Self Study Report Computer Science

Self Study Report Computer Science Computer Science undergraduate students have access to undergraduate teaching, and general computing facilities in three buildings. Two large classrooms are housed in the Davis Centre, which hold about

More information

University of Groningen. Systemen, planning, netwerken Bosman, Aart

University of Groningen. Systemen, planning, netwerken Bosman, Aart University of Groningen Systemen, planning, netwerken Bosman, Aart IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information