A STUDY ON THE EFFECT OF THE NEIGHBOR PHONEMES IN NATURAL SYNTHESIS OF SPEECH
Ceylon Journal of Science (Physical Sciences) 18 (2014)
Computer Science

H.M.L.N.K. Herath 1 and J.V. Wijayakulasooriya 2
1 Postgraduate Institute of Science, University of Peradeniya, Sri Lanka.
2 Department of Electronic and Electrical Engineering, Faculty of Engineering, University of Peradeniya, Sri Lanka.
(*Corresponding authors: 1 lakminiherath0@gmail.com, 2 jan@ee.pdn.ac.lk)
(Received: 13 January 2014 / Accepted after revision: 16 June 2014)

ABSTRACT
Natural synthesis of speech requires identifying the minute variations of a phoneme during reproduction, which are affected by many factors. This paper presents an empirical study of the correlations between consecutive phonemes in a speech signal. The short /a/ phoneme was selected for the study. To examine the effect of neighboring phonemes more clearly, words consisting of three or four phonemes were chosen. Then the correlations between all possible pairs were calculated by comparing one cycle of each /a/ sound taken from words starting with the same phoneme. Furthermore, one cycle was taken from each of three different places (the start, middle and end of the /a/ phoneme), and the correlations between different pairs were calculated. The correlation values clearly show that the middle phoneme follows the preceding phoneme's energy to build a smooth articulation between the two phonemes, as well as within the /a/ phoneme itself.

University of Peradeniya 2014

INTRODUCTION
Speech synthesis is the artificial production of human speech. One of the main focus areas in speech synthesis research is reducing the amount of data needed to synthesize speech while maintaining acceptable quality. In the recent past, more emphasis has been given to improving the naturalness of synthesized speech. In this regard, many methods, ranging from low bit rate to high bit rate methods, have been proposed (Bristow-Johnson, 1996).
However, natural synthesis of speech, the holy grail of the field, remains a challenging task, particularly for low bit rate applications. There are two main computer-based speech synthesis techniques. The first is concatenative synthesis (wavetable synthesis in music), which stores raw waveforms corresponding to each phoneme in a database called a wavetable and concatenates them according to the phonemes to be synthesized (Holmes and Holmes, 2001; Smith, 2006). Although this method produces more natural speech than the mathematical coding based models, the high capacity needed for storing the speech and the high bit rates involved in transmitting it are the main concerns. In contrast, mathematical coding based techniques such as Linear Predictive Coding (LPC), which is based on Auto Regressive (AR) modeling of speech, significantly reduce the bit rate. However, the speech is then modeled as the response of a Linear Time Invariant (LTI) system to an input excitation signal. The problem with an LTI system is the occurrence of audible discontinuities at phoneme boundaries, which makes the synthetic speech sound unnatural.

Time varying nature of phonemes
Speech does not simply consist of a string of target articulations linked by simple movements between them (Ohala, 1993). In fact, the articulation of individual sound segments or phonemes is almost always influenced by the articulation of neighboring
segments, often to the point of considerable overlapping of articulator activities (Ohala, 1993). A phoneme is the smallest contrastive unit in the sound system of a language. Phonemes are combined with other phonemes to form meaningful units such as words or morphemes. Without appropriate transitions between phonemes, the resulting speech sounds unnatural and is hard to understand.

In 1933, Menzerath and Lacerda (Hardcastle and Hewlett, 1999) popularized the term co-articulation. It was coined to denote instances where two successive sounds were articulated together. Many decades of experimental phonetic research have produced a large literature on the topic. The elementary fact highlighted here is that co-articulation is manifested in a temporal overlap between any two channels recruited by different phonemes. In the most basic articulatory model, the Locus theory (Delattre, 1969), each phoneme has a single ideal articulatory target for each contrastive articulator, independent of the neighboring phonemes (Phung et al., 2011). Under the effects of neighboring phonemes, the transition between two phonemes is described as the movement between the two ideal targets of the phonemes. The Kozhevnikov-Chistovich model shows co-articulation within a syllable but not across syllables (Phung et al., 2011). Although many co-articulation models have been proposed, there is still a lack of simple models that are easy to implement in speech applications and that work directly with acoustic data (Phung et al., 2011).

Most mathematical speech synthesis models assume that the changes between phonemes are time invariant. In other words, the parameters of a phoneme do not change with time. Linear systems in reality produce their outputs as a linear combination of their current and previous inputs and their previous outputs (Tatham and Morton, 2005). But the nature of the transition between phonemes is time variant.
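The contrast between the two assumptions can be written compactly. The following is a sketch in common textbook notation and is not taken from the paper: the symbols s[n] (speech sample), u[n] (excitation), a_k (predictor coefficients), G (gain) and the model order p are all assumptions introduced here for illustration.

```latex
% Time-invariant AR (LPC-style) model: the coefficients a_k and gain G
% are constant over the modeled segment.
s[n] = \sum_{k=1}^{p} a_k \, s[n-k] + G \, u[n]

% Time-variant counterpart: the coefficients a_k[n] and gain G[n]
% are allowed to change with the sample index n.
s[n] = \sum_{k=1}^{p} a_k[n] \, s[n-k] + G[n] \, u[n]
```

In the first form the spectral shape implied by the a_k stays fixed for the duration of the segment; in the second it can drift from sample to sample, which is the behavior the paper attributes to natural phoneme transitions.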
Figures 1 and 2 show how the formant values change from one phoneme to another in time-invariant and time-variant systems, respectively. If the changes between phonemes are time invariant, then the formant contours should be constant throughout the duration of a phoneme, as shown in Figure 1. However, in natural speech, the formant values vary from one phoneme to another as well as within the phoneme, as shown in Figure 2. The objective of this study is to examine the linear time-variant effect of neighboring phonemes by calculating Pearson's correlation between phonemes.

Figure 1: Formant values in time invariant system
Figure 2: Formant values in time variant system

METHOD
Of the nearly forty-four phonemes in the English language, the short /a/ phoneme was studied in this research. Recording phoneme sounds separately was infeasible, so words that include the short /a/ sound were selected for the recording. To examine the effect of neighboring phonemes more clearly, words consisting of three or four phonemes were chosen. From the recorded words, the /a/ phoneme was extracted separately. The segmentation of the short /a/ was conducted manually by inspecting the time waveform and listening to the segmented phoneme. Then the Pearson's correlation coefficient (Wikipedia, 2014) between all possible pairs of different words was calculated by comparing one
cycle of each /a/ sound. In this case, pairs of words starting with the same phoneme as well as pairs of words starting with different phonemes were considered. In addition, one cycle was taken from each of three different places (the start, middle and end of the /a/ phoneme), and the correlation between different pairs was calculated. A hypothesis test was conducted to find the significance of the correlation values. Sound processing and the statistical calculations were done using the MATLAB software.

RESULTS AND DISCUSSION
The following correlation values were obtained by comparing short /a/ sounds from words starting with the same phoneme (Table 1). The same procedure was repeated with a different starting phoneme, and similar results were obtained. As shown in Table 1, every pair of words starting with the same phoneme has a correlation value greater than 0.75, and in the Pearson's correlation hypothesis tests all pairs of /a/ phonemes obtained p-values close to 0. This shows that all calculated pairwise correlations are statistically significant.

The same experiment was conducted by changing the first phoneme of the word without changing the last phoneme. According to Figure 3, for /a/ sounds extracted from words starting with different phonemes but ending with the same phoneme /t/, the correlation values are less than [...]. The p-values obtained for these pairwise correlations are also close to 0. This indicates moderate positive correlations between words starting with different phonemes. Several experiments were conducted by changing the last phoneme, and similar results were obtained. This shows that the /a/ waveforms of words starting with the same phoneme are more strongly correlated than the /a/ waveforms of words starting with different phonemes.
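The pairwise comparison described above can be sketched in a few lines. The authors used MATLAB; what follows is a minimal pure-Python illustration and not their implementation. The synthetic waveforms, the word labels attached to them, the resampling to a common length and all function names are assumptions made here so that the example is self-contained. In particular, the paper does not state how cycles of different lengths were aligned before correlation.

```python
import math


def resample(cycle, n):
    """Linearly interpolate one waveform cycle to exactly n samples."""
    if len(cycle) < 2 or n < 2:
        raise ValueError("need at least two samples")
    last = len(cycle) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)      # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(cycle[lo] * (1.0 - frac) + cycle[hi] * frac)
    return out


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return num / den


def correlation_matrix(cycles, n=200):
    """Correlate every pair of cycles after resampling them to n samples."""
    resampled = {name: resample(c, n) for name, c in cycles.items()}
    return {
        (a, b): pearson_r(resampled[a], resampled[b])
        for a in resampled
        for b in resampled
    }


def synthetic_cycle(length, harmonic_weight):
    """Hypothetical stand-in for one pitch cycle: fundamental + 2nd harmonic."""
    return [
        math.sin(2 * math.pi * i / length)
        + harmonic_weight * math.sin(4 * math.pi * i / length)
        for i in range(length)
    ]


# Invented data: two cycles with similar shape and one with a different shape.
cycles = {
    "bad": synthetic_cycle(180, 0.30),
    "bag": synthetic_cycle(200, 0.35),
    "cat": synthetic_cycle(190, 0.90),
}
r = correlation_matrix(cycles)
for pair in [("bad", "bag"), ("bad", "cat")]:
    print(pair, round(r[pair], 3))
```

With real recordings, each entry of `cycles` would hold the samples of one manually segmented cycle of the /a/ sound. A significance test such as the one reported in the paper would be an additional step and is not included in this sketch.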
Thus, the relationship between the first phoneme and the following phoneme (vowel) of a word was stronger than the relationship between the middle phoneme (vowel) and the next phoneme. The short /a/ waveform depends on the previous phoneme; that is, the preceding phoneme has a clear impact on the sound of the following phoneme.

Figure 3: Correlation values of comparing short /a/ sounds which are starting with different phonemes and ending with phoneme t.

According to Figure 4, the correlation values between the /a/ sound of the word "bad" and the short /a/ sounds of other words starting with the letter B were more than 0.7. That means the similarities between the waveforms (one cycle) are greater than 50%. Most of them have correlation values of more than 0.85, which means the similarities of some of the waveforms exceed 75%. But for words starting with different phonemes, the correlation values are less than 0.8, and some are less than 0.5. This means that the relationship between /a/ sounds depends on the preceding phoneme.

Figure 4: Correlation values of comparing the "bad" /a/ sound with short /a/ sounds of words starting with the letter B and with different letters

Figure 5 shows the average correlation values of different words, considering three cycles of the /a/ phoneme taken from different places: one cycle near the first letter, a middle cycle, and a cycle from the end of the /a/ phoneme.

Table 1: Pearson's correlation values of comparing short /a/ sound words, which start from phoneme B
Words compared: bad, bag, ban, bat, back, band, bank, batch, badge, bask, bang, bash

Figure 5: Average correlation value of the /a/ phoneme of different words, extracting the cycles from three different places

When the cycles taken from different places are compared, Figure 5 shows that the average correlation value of the starting cycle was always less than that of the middle cycle, which implies that the front cycles of the /a/ sound are clearly affected by the previous phoneme. This is because the starting cycle lies within the transition region between the two neighboring phonemes. By the middle cycle the /a/ waveform has stabilized, so the average correlation value was much greater than the previous values. During the transition to the next phoneme, the correlation values vary from word to word, but all the values were less than the middle correlation values. Figure 5 indicates that there is a time-variant linear relationship between the neighboring phonemes as well as within the phoneme.

CONCLUSION
The underlying approach is to investigate the effect of the correlation between consecutive phonemes in natural synthesis of speech. This study illustrates that when the starting phoneme changes, the correlation values of the following phoneme also change significantly. Therefore, there is a smooth linear time-variant transition between consecutive phonemes. In addition, the study points out that the middle phoneme has different correlation values within the phoneme when the start, middle and end waveforms are compared. This shows that there is a smooth variation within the /a/ phoneme itself. Thus, the correlation values clearly show that the middle phoneme follows the preceding phoneme's energy to build the articulation between the two phonemes smoothly.
The study concludes that the time-variant nature of neighboring phonemes, as well as of the phoneme itself, should be strongly considered when modeling more natural speech in mathematical coding based low bit rate models.

REFERENCES
Alan O Cinnéide (2008) Linear Prediction: The Technique, Its Solution and Application to Speech. DIT Internal Technical Report.
Bristow-Johnson, R. (1996) Wavetable Synthesis 101, A Fundamental Perspective. In 101st AES Convention (Los Angeles, California), Audio Engineering Society (AES), Preprint No
Delattre, P. (1969) Coarticulation and the Locus Theory. Studia Linguistica 23(1): 1-26.
Holmes, J. and Holmes, W. (2001) Speech Synthesis and Recognition, Second Edition. Taylor & Francis, London, UK. 287
Hardcastle, W. J. and Hewlett, N. (1999) Coarticulation: Theory, Data and Techniques. Cambridge University Press.
Ohala, J.J. (1993) Coarticulation and phonology. University of Alberta and University of California, Berkeley. Language and Speech 36;
Phung, T., Luong, M. C. and Akagi, M. (2012) On the Stability of Spectral Targets under Effects of Coarticulation. International Journal of Computer and Electrical Engineering, Vol. 4, No. 4, ( )
Phung, T., Luong, M. C. and Akagi, M. (2011) An Investigation on Perceptual Line Spectral Frequency (PLP-LSF) Target Stability against the Vowel Neutralization Phenomenon. 3rd International Conference on Signal Acquisition and Processing (ICSAP 2011):
Rabiner, L. and Juang, B. H. (1993) Fundamentals of Speech Recognition. Prentice Hall International, 497
Smith, J. (2006) History and Practice of Digital Sound Synthesis. CCRMA, Stanford University, Lecture notes in AES 2006
Shannon, M., Zen, H. and Byrne, W. (2013) Autoregressive Models for Statistical Parametric Speech Synthesis. IEEE Transactions on Audio, Speech, and Language Processing, vol. 21 (3); ( )
Tatham, M. and Morton, K. (2005) Development in Speech Synthesis. John Wiley & Sons Ltd, England, Chapter 4, pg
Taylor, P. (2009) Text-to-Speech Synthesis. Cambridge University Press. (total pages)
Phones, Phonemes, Allophones and Phonological Rules, accessed
nd_dependence, accessed in
More informationControl Tutorials for MATLAB and Simulink
Control Tutorials for MATLAB and Simulink Last updated: 07/24/2014 Author Information Prof. Bill Messner Carnegie Mellon University Prof. Dawn Tilbury University of Michigan Asst. Prof. Rick Hill, PhD
More informationSpeaker Identification by Comparison of Smart Methods. Abstract
Journal of mathematics and computer science 10 (2014), 61-71 Speaker Identification by Comparison of Smart Methods Ali Mahdavi Meimand Amin Asadi Majid Mohamadi Department of Electrical Department of Computer
More informationDetailed course syllabus
Detailed course syllabus 1. Linear regression model. Ordinary least squares method. This introductory class covers basic definitions of econometrics, econometric model, and economic data. Classification
More informationModeling function word errors in DNN-HMM based LVCSR systems
Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford
More informationParallel Evaluation in Stratal OT * Adam Baker University of Arizona
Parallel Evaluation in Stratal OT * Adam Baker University of Arizona tabaker@u.arizona.edu 1.0. Introduction The model of Stratal OT presented by Kiparsky (forthcoming), has not and will not prove uncontroversial
More informationJournal of Phonetics
Journal of Phonetics 40 (2012) 595 607 Contents lists available at SciVerse ScienceDirect Journal of Phonetics journal homepage: www.elsevier.com/locate/phonetics How linguistic and probabilistic properties
More informationLinguistics 220 Phonology: distributions and the concept of the phoneme. John Alderete, Simon Fraser University
Linguistics 220 Phonology: distributions and the concept of the phoneme John Alderete, Simon Fraser University Foundations in phonology Outline 1. Intuitions about phonological structure 2. Contrastive
More informationSpeaker recognition using universal background model on YOHO database
Aalborg University Master Thesis project Speaker recognition using universal background model on YOHO database Author: Alexandre Majetniak Supervisor: Zheng-Hua Tan May 31, 2011 The Faculties of Engineering,
More informationModeling function word errors in DNN-HMM based LVCSR systems
Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford
More informationA New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation
A New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation SLSP-2016 October 11-12 Natalia Tomashenko 1,2,3 natalia.tomashenko@univ-lemans.fr Yuri Khokhlov 3 khokhlov@speechpro.com Yannick
More informationOPAC and User Perception in Law University Libraries in the Karnataka: A Study
ISSN 2229-5984 (P) 29-5576 (e) OPAC and User Perception in Law University Libraries in the Karnataka: A Study Devendra* and Khaiser Nikam** To Cite: Devendra & Nikam, K. (20). OPAC and user perception
More informationPROFESSIONAL TREATMENT OF TEACHERS AND STUDENT ACADEMIC ACHIEVEMENT. James B. Chapman. Dissertation submitted to the Faculty of the Virginia
PROFESSIONAL TREATMENT OF TEACHERS AND STUDENT ACADEMIC ACHIEVEMENT by James B. Chapman Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment
More information1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature
1 st Grade Curriculum Map Common Core Standards Language Arts 2013 2014 1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature Key Ideas and Details
More informationAn Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English
Linguistic Portfolios Volume 6 Article 10 2017 An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English Cassy Lundy St. Cloud State University, casey.lundy@gmail.com
More informationQuantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur)
Quantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur) 1 Interviews, diary studies Start stats Thursday: Ethics/IRB Tuesday: More stats New homework is available
More informationHoughton Mifflin Reading Correlation to the Common Core Standards for English Language Arts (Grade1)
Houghton Mifflin Reading Correlation to the Standards for English Language Arts (Grade1) 8.3 JOHNNY APPLESEED Biography TARGET SKILLS: 8.3 Johnny Appleseed Phonemic Awareness Phonics Comprehension Vocabulary
More informationTHE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS
THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF MATHEMATICS ASSESSING THE EFFECTIVENESS OF MULTIPLE CHOICE MATH TESTS ELIZABETH ANNE SOMERS Spring 2011 A thesis submitted in partial
More informationVimala.C Project Fellow, Department of Computer Science Avinashilingam Institute for Home Science and Higher Education and Women Coimbatore, India
World of Computer Science and Information Technology Journal (WCSIT) ISSN: 2221-0741 Vol. 2, No. 1, 1-7, 2012 A Review on Challenges and Approaches Vimala.C Project Fellow, Department of Computer Science
More informationLecture 9: Speech Recognition
EE E6820: Speech & Audio Processing & Recognition Lecture 9: Speech Recognition 1 Recognizing speech 2 Feature calculation Dan Ellis Michael Mandel 3 Sequence
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationAnalyzing the Usage of IT in SMEs
IBIMA Publishing Communications of the IBIMA http://www.ibimapublishing.com/journals/cibima/cibima.html Vol. 2010 (2010), Article ID 208609, 10 pages DOI: 10.5171/2010.208609 Analyzing the Usage of IT
More informationEdinburgh Research Explorer
Edinburgh Research Explorer Personalising speech-to-speech translation Citation for published version: Dines, J, Liang, H, Saheer, L, Gibson, M, Byrne, W, Oura, K, Tokuda, K, Yamagishi, J, King, S, Wester,
More informationBody-Conducted Speech Recognition and its Application to Speech Support System
Body-Conducted Speech Recognition and its Application to Speech Support System 4 Shunsuke Ishimitsu Hiroshima City University Japan 1. Introduction In recent years, speech recognition systems have been
More informationPREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES
PREDICTING SPEECH RECOGNITION CONFIDENCE USING DEEP LEARNING WITH WORD IDENTITY AND SCORE FEATURES Po-Sen Huang, Kshitiz Kumar, Chaojun Liu, Yifan Gong, Li Deng Department of Electrical and Computer Engineering,
More informationLearning Methods for Fuzzy Systems
Learning Methods for Fuzzy Systems Rudolf Kruse and Andreas Nürnberger Department of Computer Science, University of Magdeburg Universitätsplatz, D-396 Magdeburg, Germany Phone : +49.39.67.876, Fax : +49.39.67.8
More information12- A whirlwind tour of statistics
CyLab HT 05-436 / 05-836 / 08-534 / 08-734 / 19-534 / 19-734 Usable Privacy and Security TP :// C DU February 22, 2016 y & Secu rivac rity P le ratory bo La Lujo Bauer, Nicolas Christin, and Abby Marsh
More informationModern TTS systems. CS 294-5: Statistical Natural Language Processing. Types of Modern Synthesis. TTS Architecture. Text Normalization
CS 294-5: Statistical Natural Language Processing Speech Synthesis Lecture 22: 12/4/05 Modern TTS systems 1960 s first full TTS Umeda et al (1968) 1970 s Joe Olive 1977 concatenation of linearprediction
More informationThe Use of Statistical, Computational and Modelling Tools in Higher Learning Institutions: A Case Study of the University of Dodoma
International Journal of Computer Applications (975 8887) The Use of Statistical, Computational and Modelling Tools in Higher Learning Institutions: A Case Study of the University of Dodoma Gilbert M.
More informationBuilding Text Corpus for Unit Selection Synthesis
INFORMATICA, 2014, Vol. 25, No. 4, 551 562 551 2014 Vilnius University DOI: http://dx.doi.org/10.15388/informatica.2014.29 Building Text Corpus for Unit Selection Synthesis Pijus KASPARAITIS, Tomas ANBINDERIS
More informationInternational Journal of Advanced Networking Applications (IJANA) ISSN No. :
International Journal of Advanced Networking Applications (IJANA) ISSN No. : 0975-0290 34 A Review on Dysarthric Speech Recognition Megha Rughani Department of Electronics and Communication, Marwadi Educational
More informationMining Association Rules in Student s Assessment Data
www.ijcsi.org 211 Mining Association Rules in Student s Assessment Data Dr. Varun Kumar 1, Anupama Chadha 2 1 Department of Computer Science and Engineering, MVN University Palwal, Haryana, India 2 Anupama
More informationYoung Enterprise Tenner Challenge
Young Enterprise Tenner Challenge Evaluation Report 2014/15 Supported by Young Enterprise Our vision we want every young person in the UK to leave education with the knowledge, skills and attitudes to
More information