PERCEPTUAL RESTORATION OF INTERMITTENT SPEECH USING HUMAN SPEECH-LIKE NOISE
rd International Congress on Sound & Vibration (ICSV), Athens, Greece, 0- July 06

Mitsunori Mizumachi, Shouma Imanaga
Kyushu Institute of Technology, - Sensui-cho, Tobata-ku, Kitakyushu, Fukuoka 80-80, Japan. mizumach@ecs.kyutech.ac.jp

Toshiharu Horiuchi
KDDI R&D Laboratories, Inc., -- Ohara, Fujimino, Saitama, 6-80 Japan.

Mobile phones have caused an explosive increase in packet traffic, which in turn causes a serious problem: packet loss. Packet loss concealment is indispensable for achieving smooth speech communication. On the client side, waveform substitution is a popular packet loss concealment method and has been standardized by the ITU. The ITU-T G.7 method conceals packet loss by inserting a phase-adjusted, amplitude-attenuated copy of the previous packet into each break, but it cannot deal with long-term breaks over 60 ms, that is, burst loss. The authors have previously proposed an alternative packet loss concealment method that relies on a human auditory capability. This method does not aim at restoring the waveform of the intermittent speech signal, but achieves perceptual restoration relying on phonemic restoration. When a gap in an intermittent speech signal is filled with a loud arbitrary signal, we can listen to the restored speech smoothly even if some segments of the original speech signal are completely lost. There is a trade-off between the smoothness of the restored speech and the noisiness of the gap-filling signal. Previously, the gap-filling signal was composed of a harmonic complex and ambient noises. In this paper, a human speech-like noise is used as the gap-filling signal instead. The speech-like noise can be prepared by repeatedly overlapping short-term human speech signals. It is confirmed that the proposed gap-filling signal succeeds in reducing its noisiness.

Introduction

Digital speech communication faces a serious problem.
A rapid increase in packet traffic causes packet loss, and recently long-term packet loss, that is, burst loss, has seriously degraded the quality of speech communication. Packet loss concealment is indispensable for achieving stress-free speech communication. Waveform substitution [] is one of the most popular packet loss concealment approaches, and ITU-T has standardized it for VoIP speech communication []. Model-based waveform regeneration [] is also a well-known approach, but it requires rather high computational cost. These packet loss concealment methods assume short-term packet loss and cannot cope with burst loss. For example, the ITU-T G.7 method cannot conceal burst loss whose duration exceeds 60 ms. The authors have proposed an alternative, perceptual restoration of packet loss based on an auditory illusion [,, 6]. Interestingly, an intermittent speech signal can be perceived smoothly when the gaps are filled with noise. This auditory illusion is called the phonemic restoration effect [7]. When the gap in the intermittent speech signal is filled with a wideband signal at a signal-to-noise ratio below -0 dB, we can hear the intermittent speech smoothly even if some segments of the original speech signal are completely lost [8]. The authors have proposed
the less-harsh ambient noise with the speech-like harmonic complex as the gap-filling signal [, ]. The feasibility of the proposed method has been confirmed under quiet and noisy conditions [6]. In this paper, the gap-filling signal is further improved using a human speech-like noise, which can be prepared by overlapping speech signals. The speech-like noise is expected to decrease the noisiness and incongruity of the insertion signal. The characteristics of human speech-like noises are investigated by a listening test, because they vary depending on the number of overlapping speech signals, among other factors. The proposed gap-filling signal is subjectively evaluated in comparison with the previously proposed method [6].

Table : Experimental conditions for optimizing human speech-like noise.
  Target speech: Japanese sentences uttered by a male speaker
  Duration of speech break (packet loss): 0 ms
  Target speech to insertion noise ratio (SIR): dB, 0 dB, - dB
  Individuality of human speech-like noise: speaker-dependent and speaker-independent
  Number of overlapping speech signals: ,,, 0,, 0, 0

Perceptual restoration of intermittent speech

Packet loss is perceptually concealed relying on the phonemic restoration effect [7]. To achieve perceptual restoration, it is important to design a reasonable gap-filling signal, which should increase the smoothness of the resultant restored speech and decrease the noisiness of the gap-filling signal itself. The gap-filling signal is designed based on static and dynamic characteristics of speech. A broadband signal is suitable as the gap-filling signal in order to satisfy the masking potential rule [8]. The authors have confirmed that the phonemic restoration effect occurs when the low-frequency components of the target speech are masked by ambient noise such as air conditioner noise [].
The air conditioner noise is mixed with a harmonic complex, which aims at masking the higher-order harmonic components of speech. The gap-filling signal has been further modified in consideration of the dynamic characteristics of speech. It has been confirmed that temporal variation of the gap-filling signal contributes to decreasing the noisiness of the insertion signal [].

Perceptual restoration using human speech-like noise

Human speech-like noise

A human speech-like noise is prepared by overlapping short-term speech signals. Its characteristics vary with the number of overlaps. If fewer than ten speech signals are added, we perceive the speech-like noise as overlapping speech. When hundreds of speech signals are overlapped, the resultant speech-like noise becomes a stationary noise whose frequency characteristics correspond to the long-term average of speech. The human speech-like noise can be prepared using speech signals uttered by a single speaker or by multiple speakers. A variety of human speech-like noises can be designed based on speaker dependence, gender dependence, language dependence, and so on. The International Speech Test Signal (ISTS), which was developed for testing hearing aids using speech signals uttered by a single multilingual speaker [9], is one of the well-known speech-like noises. The ISTS is composed of female speech materials in American English, Arabic, Chinese, French, German, and Spanish. In this study, a language-dependent speech-like noise is considered suitable for packet loss concealment, so the main concerns are speaker individuality and the number of overlaps.
Figure : Feasibility of speaker-dependent speech-like insertion for restoring intermittent speech. (Plot of mean opinion score against target speech to speech-like insertion noise ratio, with one series per number of overlaps of human speech-like noise.)

Figure : Feasibility of speaker-independent speech-like insertion for restoring intermittent speech. (Plot of mean opinion score against target speech to speech-like insertion noise ratio, with one series per number of overlaps of human speech-like noise.)

Subjective optimization of human speech-like noise

Speech signals are divided into segments with a duration of 00 ms, and then a part of the speech signal with a duration of 0 ms is randomly cut out from each segment. A speech-like noise is prepared by adding the designated number of different speech segments, each with a duration of 0 ms. Subjective evaluation is carried out on the restoration of intermittent speech with the speech-like noises as insertion signals. The feasibility of the speech-like noises was subjectively examined with the five-grade mean opinion score (MOS). Students with normal hearing participated in the listening test and gave a MOS twice for each restored speech in random order. Experimental conditions are summarized in Table. Figures and show results for speaker-dependent and speaker-independent gap-filling signals, respectively, under no background noise. There is no significant difference attributable to speaker dependency. Strictly speaking, the most suitable number of overlaps differs depending on the speech to insertion noise ratio. On the whole, it is suggested that the number of overlaps should be more than 0 and that up to 0 may be enough.
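The segment-overlapping procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `make_speech_like_noise`, the random choice of which segments to use, and the sample counts in the demo are all hypothetical, and the lengths are illustrative placeholders rather than the durations used in the experiments. Random samples stand in for recorded speech.

```python
import random


def make_speech_like_noise(speech, n_overlap, segment_len, excerpt_len, seed=0):
    """Sum n_overlap short excerpts, each cut at random from a different segment.

    All lengths are in samples. Choosing the segments at random is an
    assumption of this sketch.
    """
    if excerpt_len > segment_len:
        raise ValueError("excerpt_len must not exceed segment_len")
    rng = random.Random(seed)
    # Divide the speech into consecutive, non-overlapping segments.
    segments = [speech[i:i + segment_len]
                for i in range(0, len(speech) - segment_len + 1, segment_len)]
    if n_overlap > len(segments):
        raise ValueError("not enough speech segments for the requested overlap")
    # Pick n_overlap different segments and add one random excerpt from each.
    noise = [0.0] * excerpt_len
    for seg in rng.sample(segments, n_overlap):
        start = rng.randint(0, segment_len - excerpt_len)
        for k in range(excerpt_len):
            noise[k] += seg[start + k]
    return noise


if __name__ == "__main__":
    # Stand-in signal: uniform random samples instead of recorded speech.
    stand_in = random.Random(1)
    speech = [stand_in.uniform(-1.0, 1.0) for _ in range(40_000)]
    noise = make_speech_like_noise(speech, n_overlap=20,
                                   segment_len=1600, excerpt_len=320)
    print(len(noise))
```

The sum is left unnormalized here because the insertion level is set separately through the target speech to insertion noise ratio. Passing speech from one talker or from several talkers would give the speaker-dependent and speaker-independent variants, respectively.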
Table : Experimental conditions for performance evaluation.
  Target speech: Japanese sentences uttered by three male speakers
  Background noise: Station yard noise [0]
  Duration of speech break (packet loss): 0 ms
  Target speech to insertion noise ratio (SIR): - dB
  Target speech to background noise ratio (SNR): 9 dB, 6 dB, dB, and no background noise
  Individuality of human speech-like noise: speaker-dependent and speaker-independent
  Number of overlapping speech signals: 0

Performance evaluation

Procedure

The feasibility of the proposed speech-like noise is examined in comparison with the previously proposed method, which employs a mixture of a harmonic complex and ambient noise as the insertion signal [6]. Restored intermittent speech signals were subjectively evaluated with the five-grade MOS on the smoothness of the restored speech, the noisiness of the insertion signal, and a comprehensive evaluation. Listening tests were carried out with participants who were student volunteers with normal hearing, under the experimental conditions in Table.

Experimental results

Experimental results are given in Figs.,, and. Concerning the smoothness of the restored speech, the speaker-independent speech-like noises are superior to the speaker-dependent insertion noises in restoring intermittent speech. The speaker-dependent insertion noises could not gain an advantage over the previously proposed insertion signals [6]. On the other hand, the speaker-dependent speech-like noises significantly succeed in reducing their noisiness compared with both the speaker-independent speech-like noises and the previously proposed insertion signals. This result did not depend on the level of the background noise. Figure indicates that the speaker-dependent speech-like noise is generally suitable as an insertion signal for restoring intermittent speech under noisy environments.
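For readers unfamiliar with the scoring, a five-grade MOS for one condition is simply the mean of all listener ratings collected for that condition. The sketch below shows that aggregation under stated assumptions: the function name `mean_opinion_score` and the condition labels are hypothetical, and the ratings in the demo are made-up illustrative values, not data from this study.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the mean rating per condition.

    ratings maps a condition label to a list of integer grades from 1 to 5,
    pooled over listeners and repeated presentations.
    """
    scores = {}
    for condition, grades in ratings.items():
        if not grades:
            raise ValueError(f"no ratings for condition {condition!r}")
        if any(g not in (1, 2, 3, 4, 5) for g in grades):
            raise ValueError(f"grades for {condition!r} must be integers 1..5")
        scores[condition] = mean(grades)
    return scores


if __name__ == "__main__":
    # Made-up grades for illustration only.
    demo = {
        "condition A": [4, 5, 4, 3],
        "condition B": [2, 3, 2, 2],
    }
    for condition, score in mean_opinion_score(demo).items():
        print(condition, score)
```

Significance testing between conditions is outside the scope of this sketch, since the statistical procedure is not described in the text.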
Perspective for practical application

In a practical situation, a speaker-dependent speech-like noise can be prepared from the received speech signals just after speech communication is established. Then, once packet loss occurs, the prepared speech-like noise is immediately substituted for the lost packets. The proposed method also has an advantage over conventional waveform substitution methods, such as the ITU-T G.7 method, in reducing computational cost.
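The substitution step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `rms`, `scale_to_sir`, and `fill_gaps` are hypothetical, the speech to insertion noise ratio is assumed to be defined on RMS levels as 20 log10(speech RMS / noise RMS), and looping the prepared noise when a gap is longer than the noise is an assumption of this sketch.

```python
import math


def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def scale_to_sir(noise, speech_rms, sir_db):
    """Scale noise so that 20*log10(speech_rms / rms(noise)) equals sir_db.

    A negative sir_db makes the insertion noise louder than the speech.
    """
    target_rms = speech_rms / (10 ** (sir_db / 20))
    gain = target_rms / rms(noise)
    return [gain * v for v in noise]


def fill_gaps(received, lost, noise):
    """Replace every sample flagged as lost with the prepared noise."""
    if len(received) != len(lost):
        raise ValueError("received and lost must have the same length")
    out = list(received)
    j = 0
    for i, is_lost in enumerate(lost):
        if is_lost:
            out[i] = noise[j % len(noise)]  # loop the noise if the gap is long
            j += 1
    return out


if __name__ == "__main__":
    received = [0.5, -0.5, 0.0, 0.0, 0.5, -0.5]
    lost = [False, False, True, True, False, False]
    good = [s for s, flag in zip(received, lost) if not flag]
    insertion = scale_to_sir([1.0, -1.0], speech_rms=rms(good), sir_db=-6.0)
    print(fill_gaps(received, lost, insertion))
```

Because the noise is prepared once and then only copied into the gaps, the per-packet work in this sketch is a table lookup, which is consistent with the low computational cost claimed for the method.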
Figure : MOS on smoothness of restored speech. (Plot of MOS (smoothness) against target speech to background noise ratio, including the no-background-noise condition, for four cases: without restoration, previous method, speaker-dependent speech-like noise, and speaker-independent speech-like noise.)

Figure : MOS on noisiness of insertion signal. (Plot of MOS (noisiness) against target speech to background noise ratio for the same four cases.)

Figure : MOS on comprehensive evaluation. (Plot of MOS (comprehensive evaluation) against target speech to background noise ratio for the same four cases.)
Conclusions

A perceptual restoration method for intermittent speech, which assumes burst loss of packets, is improved using a human speech-like noise. The perceptual characteristics of human speech-like noises vary depending on the number of overlapping speech signals. It is confirmed that an overlap of 0 speech segments is enough for restoring intermittent speech. As a result of the subjective evaluation, it is suggested that a speaker-dependent speech-like noise is the most suitable under practical noisy environments. Future work includes performance evaluation of the proposed method under various conditions of speech communication.

Acknowledgement

This work was partly supported by a JSPS KAKENHI grant. The authors thank Professor Christian Giguère for fruitful suggestions.

REFERENCES

. Goodman, D. J., Lockhart, G., Wasem, O. and Wong, W. C. Waveform substitution techniques for recovering missing speech segments in packet voice communications, IEEE Trans. Acoust., Speech and Signal Process., (6), 0 8, (986).
. ITU Recommendation G.7 Appendix I, (999), A high quality low-complexity algorithm for packet loss concealment with G.7.
. Chen, Y. L. and Chen, B. S. Model-based multi-rate representation of speech signals and its application to recovery of missing speech packets, IEEE Trans. on Speech and Audio Process., (), 0, (997).
. Mizumachi, M., Ohga, K., Fujii, M. and Horiuchi, T. Restoration of intermittent speech with composite gap-filling schemes relying on human auditory capability, Proc. ICSV9, paper ID: 08, (0).
. Mizumachi, M., Motomura, S., Takakura, T. and Horiuchi, T. Restoration of intermittent speech based on human auditory capability and temporal characteristics of speech, Proc. ICSV0, paper ID: 7, (0).
6. Mizumachi, M., Motomura, S., Takakura, T. and Horiuchi, T. Perceptual restoration of intermittent speech under noisy environments, Proc. ICSV, paper ID: 00, (0).
7. Warren, R. M. Perceptual restoration of missing speech sounds, Science, 67, 9 9, (970).
8. Kashino, M. Phonemic restoration: The brain creates missing speech sounds, Acoustical Science and Technology, 7(6), 8, (006).
9. Holube, I., Fredelake, S., Vlaming, M. and Kollmeier, B. Development and analysis of an international speech test signal (ISTS), Int. J. Audiol., 9(), 89 90, (00).
0. Kawai, K., Fujimoto, K., Iwase, T., Yasuoka, H., Sakuma, T. and Hidaka, Y. Development of a sound source database for environmental/architectural acoustics: Introduction of smile 00 (sound material in living environment 00), Proc. International Congress on Acoustics, pp. 6 6, (00).
Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology
ISCA Archive SUBJECTIVE EVALUATION FOR HMM-BASED SPEECH-TO-LIP MOVEMENT SYNTHESIS Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano Graduate School of Information Science, Nara Institute of Science & Technology
More informationRobust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction
INTERSPEECH 2015 Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa Department of Computer
More informationUnvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition
Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Hua Zhang, Yun Tang, Wenju Liu and Bo Xu National Laboratory of Pattern Recognition Institute of Automation, Chinese
More informationSpeech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence
INTERSPEECH September,, San Francisco, USA Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence Bidisha Sharma and S. R. Mahadeva Prasanna Department of Electronics
More informationBody-Conducted Speech Recognition and its Application to Speech Support System
Body-Conducted Speech Recognition and its Application to Speech Support System 4 Shunsuke Ishimitsu Hiroshima City University Japan 1. Introduction In recent years, speech recognition systems have been
More informationSpeech Emotion Recognition Using Support Vector Machine
Speech Emotion Recognition Using Support Vector Machine Yixiong Pan, Peipei Shen and Liping Shen Department of Computer Technology Shanghai JiaoTong University, Shanghai, China panyixiong@sjtu.edu.cn,
More informationNoise-Adaptive Perceptual Weighting in the AMR-WB Encoder for Increased Speech Loudness in Adverse Far-End Noise Conditions
26 24th European Signal Processing Conference (EUSIPCO) Noise-Adaptive Perceptual Weighting in the AMR-WB Encoder for Increased Speech Loudness in Adverse Far-End Noise Conditions Emma Jokinen Department
More informationClass-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification
Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Tomi Kinnunen and Ismo Kärkkäinen University of Joensuu, Department of Computer Science, P.O. Box 111, 80101 JOENSUU,
More informationWHEN THERE IS A mismatch between the acoustic
808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,
More informationAUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION
JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 11/2007, ISSN 1642-6037 Marek WIŚNIEWSKI *, Wiesława KUNISZYK-JÓŹKOWIAK *, Elżbieta SMOŁKA *, Waldemar SUSZYŃSKI * HMM, recognition, speech, disorders
More informationSegregation of Unvoiced Speech from Nonspeech Interference
Technical Report OSU-CISRC-8/7-TR63 Department of Computer Science and Engineering The Ohio State University Columbus, OH 4321-1277 FTP site: ftp.cse.ohio-state.edu Login: anonymous Directory: pub/tech-report/27
More informationSpeech Recognition at ICSI: Broadcast News and beyond
Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI
More informationSTUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH
STUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH Don McAllaster, Larry Gillick, Francesco Scattone, Mike Newman Dragon Systems, Inc. 320 Nevada Street Newton, MA 02160
More informationQuarterly Progress and Status Report. Voiced-voiceless distinction in alaryngeal speech - acoustic and articula
Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Voiced-voiceless distinction in alaryngeal speech - acoustic and articula Nord, L. and Hammarberg, B. and Lundström, E. journal:
More informationMandarin Lexical Tone Recognition: The Gating Paradigm
Kansas Working Papers in Linguistics, Vol. 0 (008), p. 8 Abstract Mandarin Lexical Tone Recognition: The Gating Paradigm Yuwen Lai and Jie Zhang University of Kansas Research on spoken word recognition
More informationAuthor's personal copy
Speech Communication 49 (2007) 588 601 www.elsevier.com/locate/specom Abstract Subjective comparison and evaluation of speech enhancement Yi Hu, Philipos C. Loizou * Department of Electrical Engineering,
More informationIEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 3, MARCH 2009 423 Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition George
More informationSpeech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers
Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers October 31, 2003 Amit Juneja Department of Electrical and Computer Engineering University of Maryland, College Park,
More informationUTD-CRSS Systems for 2012 NIST Speaker Recognition Evaluation
UTD-CRSS Systems for 2012 NIST Speaker Recognition Evaluation Taufiq Hasan Gang Liu Seyed Omid Sadjadi Navid Shokouhi The CRSS SRE Team John H.L. Hansen Keith W. Godin Abhinav Misra Ali Ziaei Hynek Bořil
More information1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all
Human Communication Science Chandler House, 2 Wakefield Street London WC1N 1PF http://www.hcs.ucl.ac.uk/ ACOUSTICS OF SPEECH INTELLIGIBILITY IN DYSARTHRIA EUROPEAN MASTER S S IN CLINICAL LINGUISTICS UNIVERSITY
More informationA Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language
A Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language Z.HACHKAR 1,3, A. FARCHI 2, B.MOUNIR 1, J. EL ABBADI 3 1 Ecole Supérieure de Technologie, Safi, Morocco. zhachkar2000@yahoo.fr.
More informationAutomatic segmentation of continuous speech using minimum phase group delay functions
Speech Communication 42 (24) 429 446 www.elsevier.com/locate/specom Automatic segmentation of continuous speech using minimum phase group delay functions V. Kamakshi Prasad, T. Nagarajan *, Hema A. Murthy
More informationMULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY
MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract
More informationThe NICT/ATR speech synthesis system for the Blizzard Challenge 2008
The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 Ranniery Maia 1,2, Jinfu Ni 1,2, Shinsuke Sakai 1,2, Tomoki Toda 1,3, Keiichi Tokuda 1,4 Tohru Shimizu 1,2, Satoshi Nakamura 1,2 1 National
More informationRevisiting the role of prosody in early language acquisition. Megha Sundara UCLA Phonetics Lab
Revisiting the role of prosody in early language acquisition Megha Sundara UCLA Phonetics Lab Outline Part I: Intonation has a role in language discrimination Part II: Do English-learning infants have
More informationRhythm-typology revisited.
DFG Project BA 737/1: "Cross-language and individual differences in the production and perception of syllabic prominence. Rhythm-typology revisited." Rhythm-typology revisited. B. Andreeva & W. Barry Jacques
More informationA comparison of spectral smoothing methods for segment concatenation based speech synthesis
D.T. Chappell, J.H.L. Hansen, "Spectral Smoothing for Speech Segment Concatenation, Speech Communication, Volume 36, Issues 3-4, March 2002, Pages 343-373. A comparison of spectral smoothing methods for
More informationListening and Speaking Skills of English Language of Adolescents of Government and Private Schools
Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools Dr. Amardeep Kaur Professor, Babe Ke College of Education, Mudki, Ferozepur, Punjab Abstract The present
More informationHuman Factors Engineering Design and Evaluation Checklist
Revised April 9, 2007 Human Factors Engineering Design and Evaluation Checklist Design of: Evaluation of: Human Factors Engineer: Date: Revised April 9, 2007 Created by Jon Mast 2 Notes: This checklist
More informationLip reading: Japanese vowel recognition by tracking temporal changes of lip shape
Lip reading: Japanese vowel recognition by tracking temporal changes of lip shape Koshi Odagiri 1, and Yoichi Muraoka 1 1 Graduate School of Fundamental/Computer Science and Engineering, Waseda University,
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationHuman Emotion Recognition From Speech
RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati
More informationA Coding System for Dynamic Topic Analysis: A Computer-Mediated Discourse Analysis Technique
A Coding System for Dynamic Topic Analysis: A Computer-Mediated Discourse Analysis Technique Hiromi Ishizaki 1, Susan C. Herring 2, Yasuhiro Takishima 1 1 KDDI R&D Laboratories, Inc. 2 Indiana University
More informationUnderstanding and Supporting Dyslexia Godstone Village School. January 2017
Understanding and Supporting Dyslexia Godstone Village School January 2017 By then end of the session I will: Have a greater understanding of Dyslexia and the ways in which children can be affected by
More informationThe Effect of Discourse Markers on the Speaking Production of EFL Students. Iman Moradimanesh
The Effect of Discourse Markers on the Speaking Production of EFL Students Iman Moradimanesh Abstract The research aimed at investigating the relationship between discourse markers (DMs) and a special
More informationMalicious User Suppression for Cooperative Spectrum Sensing in Cognitive Radio Networks using Dixon s Outlier Detection Method
Malicious User Suppression for Cooperative Spectrum Sensing in Cognitive Radio Networks using Dixon s Outlier Detection Method Sanket S. Kalamkar and Adrish Banerjee Department of Electrical Engineering
More information2005 National Survey of Student Engagement: Freshman and Senior Students at. St. Cloud State University. Preliminary Report.
National Survey of Student Engagement: Freshman and Senior Students at St. Cloud State University Preliminary Report (December, ) Institutional Studies and Planning National Survey of Student Engagement
More informationQuarterly Progress and Status Report. VCV-sequencies in a preliminary text-to-speech system for female speech
Dept. for Speech, Music and Hearing Quarterly Progress and Status Report VCV-sequencies in a preliminary text-to-speech system for female speech Karlsson, I. and Neovius, L. journal: STL-QPSR volume: 35
More informationADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION
ADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION Mitchell McLaren 1, Yun Lei 1, Luciana Ferrer 2 1 Speech Technology and Research Laboratory, SRI International, California, USA 2 Departamento
More informationThe Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access
The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access Joyce McDonough 1, Heike Lenhert-LeHouiller 1, Neil Bardhan 2 1 Linguistics
More informationAnalysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion
More informationP. Belsis, C. Sgouropoulou, K. Sfikas, G. Pantziou, C. Skourlas, J. Varnas
Exploiting Distance Learning Methods and Multimediaenhanced instructional content to support IT Curricula in Greek Technological Educational Institutes P. Belsis, C. Sgouropoulou, K. Sfikas, G. Pantziou,
More informationData Fusion Models in WSNs: Comparison and Analysis
Proceedings of 2014 Zone 1 Conference of the American Society for Engineering Education (ASEE Zone 1) Data Fusion s in WSNs: Comparison and Analysis Marwah M Almasri, and Khaled M Elleithy, Senior Member,
More informationA Cross-language Corpus for Studying the Phonetics and Phonology of Prominence
A Cross-language Corpus for Studying the Phonetics and Phonology of Prominence Bistra Andreeva 1, William Barry 1, Jacques Koreman 2 1 Saarland University Germany 2 Norwegian University of Science and
More informationSARDNET: A Self-Organizing Feature Map for Sequences
SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu
More informationAn Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English
Linguistic Portfolios Volume 6 Article 10 2017 An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English Cassy Lundy St. Cloud State University, casey.lundy@gmail.com
More informationUSER ADAPTATION IN E-LEARNING ENVIRONMENTS
USER ADAPTATION IN E-LEARNING ENVIRONMENTS Paraskevi Tzouveli Image, Video and Multimedia Systems Laboratory School of Electrical and Computer Engineering National Technical University of Athens tpar@image.
More informationIntra-talker Variation: Audience Design Factors Affecting Lexical Selections
Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and
More informationMeriam Library LibQUAL+ Executive Summary
Meriam Library LibQUAL+ Executive Summary Meriam Library LibQUAL+ Executive Summary Page 2 ABOUT THE SURVEY LibQUAL+ is a survey designed to measure users perceptions and expectations of library service
More informationPreparing for the oral. GCSEs in Arabic, Greek, Japanese & Russian
Preparing for the oral GCSEs in Arabic, Greek, Japanese & Russian Before entering candidates What centres need to know Check that you have an appropriate teacher available within the assessment window
More informationCLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction
CLASSIFICATION OF PROGRAM Critical Elements Analysis 1 Program Name: Macmillan/McGraw Hill Reading 2003 Date of Publication: 2003 Publisher: Macmillan/McGraw Hill Reviewer Code: 1. X The program meets
More informationOn-Line Data Analytics