
CURRICULUM VITAE

Zhen-Hua Ling ( 凌震华 )
Associate Professor
National Engineering Laboratory of Speech and Language Information Processing
University of Science and Technology of China
Hefei, Anhui, P.R. China, 230027
E-mail: zhling@ustc.edu.cn

Research Interests

My research interests include speech signal processing, speech synthesis, voice conversion, speech analysis and speech coding. My current research focuses on statistical model based speech synthesis methods.

Education Experiences

06/2002 B.E. degree in electronic information engineering, University of Science and Technology of China (USTC), Hefei, China
06/2005 M.S. degree in signal and information processing, University of Science and Technology of China (USTC), Hefei, China
06/2008 Ph.D. degree in signal and information processing, University of Science and Technology of China (USTC), Hefei, China

Research Experiences

02/2011~present, associate professor at University of Science and Technology of China
08/2012~08/2013, visiting scholar at University of Washington, U.S.A.
07/2008~02/2011, postdoctoral researcher at University of Science and Technology of China
10/2007~04/2008, Marie Curie Fellow at the Centre for Speech Technology Research (CSTR), University of Edinburgh, U.K.
09/2002~06/2008, research assistant at iflytek Speech Lab, University of Science and Technology of China, Hefei, China

Research Projects

National Natural Science Foundation of China, "Hierarchical Speech Synthesis Method Combining Speech Production Mechanism and Statistical Acoustic Modeling" (Grant No. 61273032), 2013.01-2016.12, Committee of NNSFC, PI;

National Natural Science Foundation of China / Royal Society of Edinburgh Joint Project, "Unified articulatory-acoustic modelling for flexible and controllable speech synthesis" (Grant No. 61111130120), 2011.01-2012.12, Committee of NNSFC, PI;

National Natural Science Foundation of China, "Statistical speech synthesis with articulatory modeling" (Grant No. 60905010), 01-2012.12, Committee of NNSFC, PI;

China Postdoctoral Science Foundation, "Speech synthesis based on automatic evaluation of synthetic performance" (Grant No. 20090450823), 2009.07-10, Committee of China Postdoctoral Science Foundation, PI;

Sub-project of Hi-Tech Research and Development Program of China, "Key technique research and product development for multi-lingual speech synthesis" (Grant No. 2006AA010104), 2006.12-10, Ministry of Science and Technology of China;

Hi-Tech Research and Development Program of China, "HMM-based expressive and multi-lingual speech synthesis" (Grant No. 2006AA01Z137), 2006.10-2009.10, Ministry of Science and Technology of China;

National Natural Science Foundation of China, "Expressive and Multi-Lingual Prosodic Modelling" (Grant No. 60475015), 2005.01-2007.12, Committee of NNSFC.

Awards

Key techniques and applied development platform for intelligent speech interaction, the Second Prize of the National Science and Technology Progress Award (2011)
IEEE Signal Processing Society Young Author Best Paper Award (2010)
Key techniques and application platform for intelligent speech interaction, the First Prize of Science and Technology Progress of Anhui Province (2008)
Chinese and English speech evaluation techniques for language learning, the Second Prize of Electronic and Information Science Progress, Chinese Institute of Electronics (2008)
President Scholarship of Chinese Academy of Sciences (June 2007)
Outstanding Master Graduate Student of University of Science and Technology of China (June 2005)
Guanghua Scholarship of University of Science and Technology of China (June 2003)

Academic Services

Associate Editor of IEEE/ACM Transactions on Audio, Speech, and Language Processing, Jan. 2014 ~ Jan. 2017
ISCA (International Speech Communication Association) Communication Committee Member, 2014
Program Committee Member of the 8th ISCA Speech Synthesis Workshop, 2013
Scientific Committee Member of the 6th International Conference on Speech Prosody, 2012
Session chair of ICASSP (2014) and Interspeech (2014, 2012)
Reviewer for international journals and conferences: IEEE Transactions on Audio, Speech, and Language Processing; Speech Communication; Computer Speech and Language; Information Sciences; Journal of Signal Processing Systems; ICASSP; Interspeech; ISCSLP; Speech Prosody; etc.

Publications

[International Journals]

[1] Zhen-Hua Ling, Shi-Yin Kang, Heiga Zen, Andrew Senior, Mike Schuster, Xiao-Jun Qian, Helen Meng, Li Deng, "Deep Learning for Acoustic Modeling in Parametric Speech Generation," IEEE Signal Processing Magazine, accepted.

[2] Ling-Hui Chen, Zhen-Hua Ling, Li-Juan Liu, and Li-Rong Dai, "Voice Conversion Using Deep Neural Networks with Layer-Wise Generative Training," IEEE Transactions on Audio, Speech, and Language Processing, accepted.

[3] Xian-Jun Xia, Zhen-Hua Ling, Yuan Jiang, and Li-Rong Dai, "HMM-based Unit Selection Speech Synthesis Using Log Likelihood Ratios Derived from Perceptual Data", Speech Communication, vol. 63-64, pp. 27-37, 2014.

[4] Chen-Yu Yang, Zhen-Hua Ling, and Li-Rong Dai, "Unsupervised Prosodic Labeling of Speech Synthesis Databases Using Context-Dependent HMMs", IEICE Transactions on Information and Systems, vol. E97-D, no. 6, pp. 1449-1460, 2014.

[5] Zhen-Hua Ling, Li Deng, and Dong Yu, "Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2129-2139, 2013.

[6] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi, Articulatory Control of HMM-based Parametric Speech Synthesis using Feature-Space-Switched Multiple Regression, IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 1, pp. 207-219, 2013.

[7] Zhen-Hua Ling, and Li-Rong Dai, Minimum Kullback-Leibler Divergence Parameter Generation for HMM-based Speech Synthesis, IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 5, pp. 1492-1502, 2012.

[8] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi, An analysis of HMM-based prediction of articulatory movements, Speech Communication, vol. 52, no. 10, pp. 834-846,

[9] Heng Lu, Zhen-Hua Ling, Li-Rong Dai, and Ren-Hua Wang, Cross-Validation and Minimum Generation Error based Decision Tree Pruning for HMM-based Speech Synthesis, Computational Linguistics and Chinese Language Processing, vol. 15, no. 1, pp. 61-76, March

[10] Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, and Ren-Hua Wang, Integrating articulatory features into HMM-based parametric speech synthesis, IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1171-1185, 2009. (IEEE Signal Processing Society 2010 Young Author Best Paper Award)

[11] Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhen-Hua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, and Steve Renals, Robust speaker-adaptive HMM-based text-to-speech synthesis, IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1208-1230, 2009.

[Domestic Journals]

[1] Ming-Qi Cai, Zhen-Hua Ling, and Li-Rong Dai, "Research on HMM-based Articulatory Movement Prediction for Chinese," Chinese Journal of Data Acquisition and Processing, vol. 29, no. 2, pp. 204-210, 2014. (in Chinese)

[2] Yang Song, Zhen-Hua Ling, and Li-Rong Dai, Optimization method for unit selection speech synthesis based on synthesis quality predictions, Journal of Tsinghua University (Sci & Tech), vol. 53, no. 6, pp. 762-766, 2011. (in Chinese)

[3] Ling-Hui Chen, Zhen-Hua Ling, and Li-Rong Dai, "Voice Conversion Based on Speaker Independent Model", Chinese Journal of Pattern Recognition and Artificial Intelligence, vol. 26, no. 3, pp. 254-259, 2013. (in Chinese)

[4] Yuan-Ping Zhang, Zhen-Hua Ling, Li-Rong Dai, and Qing-Feng Liu, "Improved decision tree based method for English prosodic phrase boundary prediction," Chinese Journal of Application Research of Computers, vol. 29, no. 8, pp. 2921-2925, 2012. (in Chinese)

[5] Hai-Bo Liu, Hui Li, and Zhen-Hua Ling, "The research on pitch extraction method for voice activity detection based on periodic decomposition," Journal of University of Science and Technology of China, vol. 42, no. 2, pp. 106-111, 2012. (in Chinese)

[6] Yu Hu, Zhen-Hua Ling, Ren-Hua Wang, and Li-Rong Dai, "Acoustic Statistical Modeling Based Speech Synthesis Technologies," Journal of Chinese Information Processing, vol. 25, no. 6, pp. 127-136, 2011. (in Chinese)

[7] Chen-Yu Yang, Li-Xin Zhu, Zhen-Hua Ling, and Li-Rong Dai, Automatic phrase boundary labeling for a Mandarin TTS corpus using the Viterbi decoding algorithm, Journal of Tsinghua University (Sci & Tech), vol. 51, no. 9, pp. 1276-1281, 2011. (in Chinese)

[8] Hang Liu, Zhen-Hua Ling, Wu Guo, and Li-Rong Dai, An improved cross-language model adaptation method for speech synthesis, Chinese Journal of Pattern Recognition and Artificial Intelligence, vol. 24, no. 4, 2011. (in Chinese)

[9] Heng Lu, Zhen-Hua Ling, Ming Lei, Li-Rong Dai, and Ren-Hua Wang, Minimum generation error based optimization of HMM model clustering for speech synthesis, Chinese Journal of Pattern Recognition and Artificial Intelligence, vol. 23, no. 6, pp. 822-828, (in Chinese)

[10] Huan-Huan Zhao, Zhen-Hua Ling, Ren-Hua Wang, and Li-Rong Dai, MAP-based speaker adaptation in speech synthesis, Chinese Journal of Data Acquisition and Processing, vol. 25, no. 4, pp. 495-499, (in Chinese)

[11] Ming Lei, Zhen-Hua Ling, and Li-Rong Dai, Minimum generation error training based on perceptually weighted line spectral pair distance for statistical parametric speech synthesis, Chinese Journal of Pattern Recognition and Artificial Intelligence, vol. 23, no. 4, pp. 572-579, (in Chinese)

[12] Ren-Hua Wang, Li-Rong Dai, Zhen-Hua Ling, and Yu Hu, Trainable unit selection speech synthesis under statistical framework, Chinese Science Bulletin, vol. 54, no. 11, pp. 1963-1969, 2009.

[13] Zhen-Hua Ling, and Ren-Hua Wang, Statistical acoustic model based unit selection algorithm for speech synthesis, Chinese Journal of Pattern Recognition and Artificial Intelligence, vol. 21, no. 3, pp. 280-284, 2008. (in Chinese)

[14] Wei Zhang, Zhen-Hua Ling, Guo-Ping Hu, and Ren-Hua Wang, A synthesis instance pruning approach based on virtual non-uniform replacements, Tsinghua Science and Technology, vol. 13, no. 4, pp. 515-521, 2008.

[15] Ren-Hua Wang, Li-Rong Dai, Yu Hu, and Zhen-Hua Ling, Acoustic statistical modeling based new generation speech synthesis technology, Journal of University of Science and Technology of China, vol. 38, no. 7, pp. 725-734, 2008. (in Chinese)

[16] Bin Zhou, Li-Rong Dai, Zhen-Hua Ling, and Ren-Hua Wang, Novel glottal analyzing algorithm for natural utterance, Chinese Journal of Data Acquisition and Processing, vol. 20, no. 3, pp. 297-301, 2005. (in Chinese)

[17] Zhen-Hua Ling, Zhi-Wei Shuang, Bin Zhou, Ren-Hua Wang, A wideband speech coding algorithm based on adaptive interpolation of weighted spectrum, Chinese Journal of Data Acquisition and Processing, vol. 20, no. 1, pp. 28-33, 2005. (in Chinese)

[18] Dong-Lai Zhu, Ren-Hua Wang, Zhen-Hua Ling, and Wei Li, Putonghua prosodic word pitch model based on HMM, Chinese Journal of Acoustics, vol. 27, no. 6, pp. 523-528, 2002. (in Chinese)

[International Conferences]

[1] Ling-Hui Chen, Zhen-Hua Ling, Yi-Qing Zu, Run-Qiang Yan, Yuan Jiang, Xian-Jun Xia, Ying Wang, "The USTC System for Blizzard Challenge 2014", in Blizzard Challenge Workshop, 2014.

[2] Ming-Qi Cai, Zhen-Hua Ling, and Li-Rong Dai, "Formant-Controlled Speech Synthesis Using Hidden Trajectory Model", Interspeech, pp. 1529-1533, 2014.

[3] Xin Wang, Zhen-Hua Ling, and Li-Rong Dai, "Concept-to-Speech Generation by Integrating Syntagmatic Features into HMM-Based Speech Synthesis", Interspeech, pp. 2942-2946, 2014.

[4] Ling-Hui Chen, Zhen-Hua Ling, and Li-Rong Dai, "Voice Conversion Using Generative Trained Deep Neural Networks with Multiple Frame Spectral Envelopes", Interspeech, pp. 2313-2317, 2014.

[5] Xiang Yin, Ming Lei, Yao Qian, Frank K. Soong, Lei He, Zhen-Hua Ling, and Li-Rong Dai, "Modeling DCT Parameterized F0 Trajectory at Intonation Phrase Level with DNN or Decision Tree", Interspeech, pp. 2273-2277, 2014.

[6] Ling-Hui Chen, Tuomo Raitio, Cassia Valentini-Botinhao, Junichi Yamagishi, and Zhen-Hua Ling, "DNN-based stochastic postfilter for HMM-based speech synthesis", Interspeech, pp. 1954-1958, 2014.

[7] Li Gao, Zhen-Hua Ling, Ling-Hui Chen, Li-Rong Dai, "Improving F0 Prediction Using Bidirectional Associative Memories and Syllable-Level F0 Features for HMM-based Mandarin Speech Synthesis", in ISCSLP, 2014.

[8] Yu-Sheng Sun, Zhen-Hua Ling, Xiang Yin, Li-Rong Dai, "Integrating Global Variance of Log Power Spectrum Derived from LSPs into MGE Training for HMM-Based Parametric Speech Synthesis", in ISCSLP, 2014.

[9] Xiang Yin, Zhen-Hua Ling, and Li-Rong Dai, "Spectral Modeling Using Neural Autoregressive Distribution Estimators for Statistical Parametric Speech Synthesis", in ICASSP, pp. 3852-3856, 2014.

[10] Li-Juan Liu, Ling-Hui Chen, Zhen-Hua Ling, and Li-Rong Dai, "Using Bidirectional Associative Memories for Joint Spectral Envelope Modeling in Voice Conversion", in ICASSP, pp. 7934-7938, 2014.

[11] Ling-Hui Chen, Zhen-Hua Ling, Yuan Jiang, Yang Song, Xian-Jun Xia, Yi-Qing Zu, Run-Qiang Yan, and Li-Rong Dai, "The USTC System for Blizzard Challenge 2013", in Blizzard Challenge Workshop, 2013.

[12] Maria Astrinaki, Alexis Moinet, Junichi Yamagishi, Korin Richmond, Zhen-Hua Ling, Simon King, and Thierry Dutoit, "Mage - Reactive articulatory feature control of HMM-based parametric speech synthesis", in 8th ISCA Speech Synthesis Workshop, pp. 207-211, 2013.

[13] Ling-Hui Chen, Zhen-Hua Ling, Yan Song, and Li-Rong Dai, Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion, in Interspeech, pp. 3052-3056, 2013.

[14] Korin Richmond, Zhen-Hua Ling, Junichi Yamagishi, Benigno Uria, On the Evaluation of Inversion Mapping Performance in the Acoustic Domain, in Interspeech, pp. 1012-1016, 2013.

[15] Zhen-Hua Ling, Li Deng, and Dong Yu, "Modeling Spectral Envelopes Using Restricted Boltzmann Machines for Statistical Parametric Speech Synthesis", in ICASSP, pp. 7825-7829, 2013.

[16] Chen-Yu Yang, Zhen-Hua Ling, and Li-Rong Dai, "Unsupervised Prosodic Phrase Boundary Labeling of Mandarin Speech Synthesis Database Using Context-Dependent HMM", in ICASSP, pp. 6875-6879, 2013.

[17] Xin Wang, Zhen-Hua Ling, and Li-Rong Dai, "Cross-Stream Dependency Modeling Using Continuous F0 Model for HMM-Based Speech Synthesis", in Proc. of ISCSLP, 2012.

[18] Xian-Jun Xia, Zhen-Hua Ling, Chen-Yu Yang, Li-Rong Dai, "Improved Unit Selection Speech Synthesis Method Utilizing Subjective Evaluation Results On Synthetic Speech", in Proc. of ISCSLP, 2012.

[19] Ming-Qi Cai, Zhen-Hua Ling, and Li-Rong Dai, Target-filtering model based articulatory movement prediction for articulatory control of HMM-based speech synthesis, in Proc. of the 11th International Conference on Signal Processing, 2012.

[20] Zhen-Hua Ling, Xian-Jun Xia, Yang Song, Chen-Yu Yang, Ling-Hui Chen, and Li-Rong Dai, "The USTC System for Blizzard Challenge 2012", in Proc. of Blizzard Challenge workshop, 2012.

[21] Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis, Interspeech 2012.

[22] Xiang Yin, Zhen-Hua Ling, Ming Lei, Li-Rong Dai, Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis, Interspeech 2012.

[23] Ling-Hui Chen, Chen-Yu Yang, Zhen-Hua Ling, Yuan Jiang, Li-Rong Dai, Yu Hu, and Ren-Hua Wang, The USTC system for Blizzard Challenge 2011, in Proc. of Blizzard Challenge workshop, 2011.

[24] Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-based Speech Synthesis, in Proc. of Interspeech, pp. 117-120, 2011.

[25] Ling-Hui Chen, Yoshihiko Nankaku, Heiga Zen, Keiichi Tokuda, Zhen-Hua Ling, Li-Rong Dai, Estimation of Window Coefficients for Dynamic Feature Extraction for HMM based Speech Synthesis, in Proc. of Interspeech, pp. 1801-1804, 2011.

[26] Ming Lei, Junichi Yamagishi, Korin Richmond, Zhen-Hua Ling, Simon King, Li-Rong Dai, Formant-controlled HMM-based Speech Synthesis, in Proc. of Interspeech, pp. 2777-2780, 2011.

[27] Heng Lu, Zhen-Hua Ling, Li-Rong Dai, Ren-Hua Wang, Building HMM based Unit-Selection Speech Synthesis System Using Synthetic Speech Naturalness Evaluation Score, in Proc. of ICASSP, pp. 5352-5355, 2011.

[28] Ming Lei, Zhen-Hua Ling, Li-Rong Dai, Preserve Ordering Property of Generated LSPs for Minimum Generation Error Training in HMM-based Speech Synthesis, in Proc. of ICASSP, pp. 4712-4715, 2011.

[29] Ling-Hui Chen, Zhen-Hua Ling, Li-Rong Dai, Non-Parallel Training for Voice Conversion based on FT-GMM, in Proc. of ICASSP, pp. 5116-5119, 2011.

[30] Zhen-Hua Ling, Zhi-Guo Wang, Li-Rong Dai, Statistical Modeling of Syllable-Level F0 Features for HMM-based Unit Selection Speech Synthesis, in Proc. of ISCSLP, pp. 144-147,

[31] Ling-Hui Chen, Zhen-Hua Ling, Wu Guo, Li-Rong Dai, GMM-based Voice Conversion with Explicit Modelling on Feature Transform, in Proc. of ISCSLP, pp. 364-368,

[32] Chen-Yu Yang, Zhen-Hua Ling, Heng Lu, Wu Guo, Li-Rong Dai, Automatic Phrase Boundary Labeling for Mandarin TTS Corpus Using Context-Dependent HMM, in Proc. of ISCSLP 2010, pp. 374-377.

[33] Tian-Yi Zhao, Zhen-Hua Ling, Ming Lei, Li-Rong Dai, Qing-Feng Liu, Minimum Generation Error Training for HMM-based Prediction of Articulatory Movements, in Proc. of ISCSLP, pp. 99-102,

[34] Ming Lei, Yi-Jian Wu, Zhen-Hua Ling, Li-Rong Dai, Investigation of Prosodic F0 Layers in Hierarchical F0 Modeling for HMM-based Speech Synthesis, in Proc. of International Conference on Signal Processing, pp. 613-616,

[35] Zhen-Hua Ling, Yu Hu, and Li-Rong Dai, Global Variance Modeling on the Log Power Spectrum of LSPs for HMM-based Speech Synthesis, in Proc. of Interspeech, pp. 825-828,

[36] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi, HMM-based Text-to-Articulatory Movement Prediction and Analysis of Critical Articulators, in Proc. of Interspeech, pp. 2194-2197,

[37] Heng Lu, Zhen-Hua Ling, Si Wei, Li-Rong Dai, and Ren-Hua Wang, Automatic Error Detection for Unit Selection Speech Synthesis Using Log Likelihood Ratio based SVM Classifier, in Proc. of Interspeech, pp. 162-165,

[38] Ming Lei, Yi-Jian Wu, Frank K. Soong, Zhen-Hua Ling, and Li-Rong Dai, A Hierarchical F0 Modeling Method for HMM-based Speech Synthesis, in Proc. of Interspeech, pp. 2170-2173,

[39] Yuan Jiang, Zhen-Hua Ling, Ming Lei, Cheng-Cheng Wang, Heng Lu, Yu Hu, Li-Rong Dai, Ren-Hua Wang, The USTC system for Blizzard Challenge 2010, in Proc. of Blizzard Challenge workshop,

[40] Ming Lei, Zhen-Hua Ling, and Li-Rong Dai, Minimum generation error training with weighted Euclidean distance on LSP for HMM-based speech synthesis, in Proc. of ICASSP, pp. 4230-4233,

[41] Cheng-Cheng Wang, Zhen-Hua Ling, and Li-Rong Dai, Asynchronous F0 and spectrum modeling for HMM-based speech synthesis, in Proc. of Interspeech, pp. 404-407, 2009.

[42] Heng Lu, Zhen-Hua Ling, Ming Lei, Cheng-Cheng Wang, Huan-Huan Zhao, Ling-Hui Chen, Yu Hu, Li-Rong Dai, and Ren-Hua Wang, The USTC system for Blizzard Challenge 2009, in Proc. of Blizzard Challenge workshop, 2009.

[43] Long Qin, Yi-Jian Wu, Zhen-Hua Ling, and Ren-Hua Wang, Model adaptation for HMM-based speech synthesis under minimum generation error criterion, in Proc. of IEEE International Symposium on Multimedia, pp. 539-544, 2008.

[44] Zhen-Hua Ling, Wei Zhang, and Ren-Hua Wang, Cross-stream dependency modeling for HMM-based speech synthesis, in Proc. of ISCSLP, pp. 5-8, 2008.

[45] Chen-Cheng Wang, Zhen-Hua Ling, Bu-Fan Zhang, and Li-Rong Dai, Multi-layer F0 modeling for HMM-based speech synthesis, in Proc. of ISCSLP, pp. 129-132, 2008.

[46] Heng Lu, Zhen-Hua Ling, Si Wei, Yu Hu, Li-Rong Dai, and Ren-Hua Wang, Heteronym verification for Mandarin speech synthesis, in Proc. of ISCSLP, pp. 137-140, 2008.

[47] Wei Zhang, Zhen-Hua Ling, and Li-Rong Dai, Constructing scalable TTS system based on corpus approach, in Proc. of IEEE International Conference on Cybernetics and Intelligent Systems, pp. 230-235, 2008.

[48] Zhen-Hua Ling, Heng Lu, Guo-Ping Hu, Li-Rong Dai, and Ren-Hua Wang, The USTC entry for Blizzard Challenge 2008, in Proc. of Blizzard Challenge workshop, 2008.

[49] Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, and Ren-Hua Wang, Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge, in Proc. of Interspeech, pp. 573-576, 2008.

[50] Junichi Yamagishi, Zhen-Hua Ling, and Simon King, Robustness of HMM-based speech synthesis, in Proc. of Interspeech, pp. 581-584, 2008.

[51] Zhen-Hua Ling, and Ren-Hua Wang, Minimum unit selection error training for HMM-based unit selection speech synthesis system, in Proc. of ICASSP, pp. 3949-3952, 2008.

[52] Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang, and Li-Rong Dai, Minimum generation error linear regression based model adaptation for HMM-based speech synthesis, in Proc. of ICASSP, pp. 3953-3956, 2008.

[53] Zhen-Hua Ling, Long Qin, Heng Lu, Yu Gao, Li-Rong Dai, Ren-Hua Wang, Yuan Jiang, Zhi-Wei Zhao, Jin-Hui Yang, Jie Chen, and Guo-Ping Hu, The USTC and iflytek speech synthesis systems for Blizzard Challenge 2007, in Proc. of Blizzard Challenge workshop, 2007.

[54] Zhen-Hua Ling, and Ren-Hua Wang, HMM-based unit selection combining Kullback-Leibler divergence with likelihood criterion, in Proc. of ICASSP, pp. 1245-1248, 2007.

[55] Long Qin, Zhen-Hua Ling, Yi-Jian Wu, Bu-Fan Zhang, and Ren-Hua Wang, HMM-based emotional speech synthesis using average emotion model, in Proc. of International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 233-240, 2006.

[56] Bu-Fan Zhang, Zhen-Hua Ling, Long Qin, and Renhua Wang, Applying SFC model for Chinese expressive speech synthesis, in Proc. of ISCSLP, 2006.

[57] Zhen-Hua Ling, Yi-Jian Wu, Yu-Ping Wang, Long Qin, and Ren-Hua Wang, USTC system for Blizzard Challenge 2006 - an improved HMM-based speech synthesis method, in Proc. of Blizzard Challenge workshop, 2006.

[58] Zhen-Hua Ling, and Ren-Hua Wang, HMM-based unit selection using frame sized speech segments, in Proc. of Interspeech, pp. 2034-2037, 2006.

[59] Long Qin, Yi-Jian Wu, Zhen-Hua Ling, and Ren-Hua Wang, Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix format, in Proc. of Interspeech, pp. 2250-2253, 2006.

[60] Zhen-Hua Ling, Yu Hu, and Ren-Hua Wang, A novel source analysis method by matching spectral characters of LF model with STRAIGHT spectrum, in Proc. of the First International Conference on Affective Computing & Intelligent Interaction (ACII), Lecture Notes in Computer Science, vol. 3784, pp. 441-448, 2005.

[61] Yu-Ping Wang, Zhen-Hua Ling, and Ren-Hua Wang, Emotional speech synthesis based on improved codebook mapping voice conversion, in Proc. of the ACII, Lecture Notes in Computer Science, vol. 3784, pp. 374-381, 2005.

[62] Long Qin, Gao-Peng Chen, Zhen-Hua Ling, and Li-Rong Dai, An improved spectral and prosodic transformation method in STRAIGHT-based voice conversion, in Proc. of ICASSP, vol. 1, pp. 21-24, 2005.

[63] Zhen-Hua Ling, Yu-Ping Wang, Yu Hu, and Ren-Hua Wang, Modeling glottal effect on the spectral envelope of STRAIGHT using mixture of Gaussians, in Proc. of ISCSLP, pp. 73-76, 2004.

[64] Zhen-Hua Ling, Yu Hu, Zhi-Wei Shuang, and Ren-Hua Wang, Compression of speech database by feature separation and pattern clustering using STRAIGHT, in Proc. of Interspeech, pp. 766-769, 2004.

[65] Zhi-Wei Shuang, Zi-Xiang Wang, Zhen-Hua Ling, and Renhua Wang, A novel voice conversion system based on codebook mapping with phoneme-tied weighting, in Proc. of Interspeech, pp. 1197-1200, 2004.

[66] Zhen-Hua Ling, Yu Hu, Zhi-Wei Shuang, and Ren-Hua Wang, Decision tree based unit pre-selection in Mandarin Chinese synthesis, in Proc. of ISCSLP, 2002.

[67] Zhi-Wei Shuang, Yu Hu, Zhen-Hua Ling, and Ren-Hua Wang, A miniature Chinese TTS system based on tailored corpus, in Proc. of ICSLP, pp. 2389-2392, 2002.

[Domestic Conferences]

[1] Jiang Yuan, Shuang-Hua Zhu, Zhen-Hua Ling and Li-Rong Dai, Research on Improving Methods for HMM Based Unit Selection Speech Synthesis, in Proc. of the 11th National Conference on Man-Machine Speech Communication, 2011. (in Chinese)

[2] Zhen-Hua Ling, Yu Hu, An experimental study on the similarity performance of HMM-based parametric speech synthesis, in Proc. of the 10th National Conference on Man-Machine Speech Communication, 2009.

[3] Huan-Huan Zhao, Zhen-Hua Ling, Long Qin, Ren-Hua Wang, and Li-Rong Dai, Eigenvoice based model adaptation method for voice conversion in speech synthesis, in Proc. of the 9th National Conference on Man-Machine Speech Communication, 2007. (in Chinese)

[4] Yu Gao, Zhen-Hua Ling, Li-Rong Dai, and Ren-Hua Wang, An improved sinusoidal speech analysis-synthesis method, in Proc. of the 9th National Conference on Man-Machine Speech Communication, 2007. (in Chinese)

[5] Wei Zhang, Zhen-Hua Ling, Guo-Ping Hu, and Ren-Hua Wang, Synthesis instances pruning approach based on virtual non-uniform replacing, in Proc. of the 9th National Conference on Man-Machine Speech Communication, 2007.

[6] Heng Lu, Wei Zhang, Zhen-Hua Ling, Ren-Hua Wang, and Li-Rong Dai, An HMM-based speech synthesis system using multi-Gaussian modeling and selection, in Proc. of the 9th National Conference on Man-Machine Speech Communication, 2007. (in Chinese)

[7] Bin Zhou, Zhen-Hua Ling, Zhi-Wei Shuang, and Ren-Hua Wang, Research on speech synthesizer based on inverse filtering and LF model, in Proc. of the 7th National Conference on Man-Machine Speech Communication, 2003. (in Chinese)

[8] Ren-Hua Wang, Yu Hu, Wei Li, and Zhen-Hua Ling, Corpus based Chinese speech synthesis system using decision tree, in Proc. of the 6th National Conference on Man-Machine Speech Communication, 2001. (in Chinese)