The Features of Vowel /E/ Pronounced by Chinese Learners

International Journal of Signal Processing Systems Vol. 4, No. 6, December 2016

Yasukazu Kanamori
Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute, Japan
Email: kanamori@ist.aichi-pu.ac.jp

Guoxing Fang
Toyota Communication Systems Co., Ltd, Nagoya, Japan
Email: k-ho@toyota-cs.com

Manuscript received August 8, 2015; revised June 17, 2016.
2016 Int. J. Sig. Process. Syst. doi: 10.18178/ijsps.4.6.523-527

Abstract

In this paper, we investigate the features of syllables containing the vowels /e/, /u/ and /ü/ as pronounced by learners of Chinese. We mainly discuss how pronunciation errors in the vowel /e/ are influenced by the position of the syllable in a two-syllable word, by the tone, and by the consonant. The results show, first, that the position within the two-syllable word does not affect the pronunciation of the vowel /e/ and, second, that the pronunciation error rate is highest when the syllable carries the third tone. In order to judge the pronunciation objectively, we use the value of the first formant frequency in the beginning, middle and ending parts of the vowel /e/. The proposed method is confirmed by a correct rate of 80.1% when its decisions are compared with the perception results for pronunciations by native Chinese students and Japanese students learning Chinese, using 54 Chinese words.

Index Terms: pronunciation, vowel /e/, Japanese student, verification, first formant frequency F1

I. INTRODUCTION

Recently, with the economic development and internationalization of China, the number of learners of Chinese has increased rapidly. However, teaching Chinese pronunciation faces many difficulties because of the complex phonetics [1]. Foreign language learners often make pronunciation errors. In particular, when the target language contains phonemes that are not found in the learners' native language, learners replace these phonemes with ones that exist in their native language. Some katakana-English dictionaries are a good example for Japanese. Hence, many researchers have directly investigated the characteristics of Chinese pronunciation [2]-[4] and put forward information processing techniques [5], [6]. Single-vowel analysis and Computer Assisted Language Learning (CALL) [7] have been actively researched. However, most of this research concerns the pronunciation of consonants and prosody. There is very little research on the vowels of two-syllable words pronounced by Japanese students learning Chinese. Automatic detection of these errors is an essential technique in CALL systems [1].

Chinese characters are composed of vowels, consonants and tones. The single-syllable word is the basic unit of Chinese. It is composed of a consonant and a vowel. There are basically six single vowels: /a/, /o/, /e/, /i/, /u/ and /ü/. There are also diphthongs and triphthongs, which are composed of single vowels. Therefore, learners who want to pronounce Chinese correctly must, first of all, pronounce the single vowels correctly. In addition, Chinese has four normal tones and a neutral tone, which is pronounced short and light and is usually called the zero-tone.

In this paper, we investigated the features of syllables containing the vowels /e/, /u/ and /ü/ as pronounced by learners of Chinese. The single vowels /e/, /u/ and /ü/ are generally considered difficult for Japanese students learning Chinese to pronounce. We investigated the pronunciation of /e/, /u/ and /ü/ in two-syllable words. We found that /e/ is the most frequently mispronounced single vowel among /e/, /u/ and /ü/. Therefore, we study the single vowel /e/ in detail. We mainly discuss how pronunciation errors in the vowel /e/ are influenced by the position in a two-syllable word, by the tone, and by the consonant [8].

II. CHINESE AUDIO MATERIAL

In order to investigate the state of pronunciation of learners of Chinese, we recorded 126 two-syllable words. The speakers are nine Japanese students who have learned Chinese for 2 or 3 years. In this paper, "2" and "3" denote Japanese students who have learned Chinese for 2 and 3 years, respectively; however, we do not consider the difference between the year levels. For comparison with the Japanese students, we also recorded the same words from 5 native Chinese speakers. There are 54 words containing the vowel /e/, 42 words containing the vowel /ü/, and 13 words containing the vowel /u/. The words were chosen so that the vowel is balanced between the front and rear positions. Table I shows the details of the word configuration. Audio data from the Japanese students learning Chinese and from the native Chinese speakers were recorded using a digital audio recorder (Roland R-09HR) under the following conditions: sampling frequency = 48 kHz, number of quantization bits = 16 bit.

Five native Chinese speakers were asked to listen to the data and give their evaluation results. We used a three-level evaluation of the pronunciation. The accuracy of pronunciation is shown in Table I. From Table I, we can see that the error rate of the vowel /e/ is near 50% and is higher than the others, so it is necessary to analyze the vowel /e/ in more detail.

TABLE I. CONFIGURATION OF THE WORD

Vowel   Number of syllables   Number of words   Position of syllable   Error rate [%]
e       55                    54                first: 27              49
                                                second: 28             55
ü       42                    42                first: 20              18
                                                second: 22             2
u       13                    13                first: 7               13
                                                second: 6              7

III. ANALYSIS OF VOWEL /E/ PRONOUNCED BY JAPANESE STUDENTS LEARNING CHINESE

Five native Chinese speakers listened to the audio data pronounced by the learners of Chinese. Each vowel in a two-syllable word with two vowels was evaluated on three levels: wrong, ambiguous and correct pronunciation. Fig. 1 shows the pronunciation error rate of /e/ for the learners of Chinese. From Fig. 1, we can see that the error rate of 6 of the 9 speakers is greater than 50%, which implies that it is difficult to pronounce the vowel /e/ correctly. In Fig. 1, the symbols 3A to 3E, with the initial 3, indicate the 5 speakers who are 3-year undergraduate students, and the symbols 2A to 2D, with the initial 2, indicate the 2-year students. The difference between the grades is not clear. Fig. 2 shows the error rate of the vowel /e/ located in the first syllable of a two-syllable word. From Fig. 2, we found that, for six students, there is only a small influence on the error rate when /e/ is in the first syllable position. For the other three students, the ratio is a little higher in the second syllable.

Figure 1. Error rate of vowel /e/

Figure 2. Error rate of first syllable with /e/ to whole word

IV. RELATIONSHIP BETWEEN TONE AND PRONUNCIATION OF VOWEL /E/

Fig. 3 shows the relationship between the tone and the error rate of the pronunciation of /e/. From Fig. 3, we find that the error rate is largest for the third tone and smallest, at lower than 15%, for the zero-tone. This means that the zero-tone is relatively easier to pronounce than the third tone. Fig. 4 shows how the error rate of /e/ is influenced by the tones for each individual speaker. The error rate varies from person to person. However, for most of the Japanese students learning Chinese, the error rates are low for the first tone and the second tone and then rise at the third tone. From Fig. 3 and Fig. 4, the error rate of the third tone is 62.7%, which is the highest among the tones discussed. This result reflects that the third tone itself is the most difficult of the Chinese tones to pronounce. Combined with /e/, it is considered even more difficult to pronounce. The lowest error rate of /e/ is for the zero-tone, and its average error rate is only 14.7%. The reason lies in the characteristics of the syllables involved: for example, the special syllable 的 [de], and the neutral tone that is often used when a syllable is followed by the same syllable in the word. For almost every speaker, it is lower than the error rates of the other tones.

Figure 3. Relationship between error rate and tone of /e/

Figure 4. The influence of tones for individual speaker

The error rate (ER) of /e/ for each preceding consonant is shown in Table II. It is highest when the consonant /r/ is combined with the vowel /e/, while the error rate is the lowest
when the consonant /zh/ is combined with the vowel /e/. This can be explained by considering that Japanese students are not used to pronouncing the combination of /e/ and /r/, because this combination hardly exists in the Japanese language, whereas Japanese students are familiar with zero-tone pronunciation and, furthermore, it is easy for Japanese students to pronounce the consonant /zh/ when it is located in the second syllable of a word. Fig. 5 shows the error rates of Table II in detail.

TABLE II. ERROR RATE (ER) OF /E/ FOR EACH PRECEDING CONSONANT

Cons.   Numb   ER (%)   First syllable / second syllable: Numb, ER (%)
c       4      42.9     2, 44; 2, 41.8
g       18     50.4     9, 46.8; 9, 54.1
k       13     55.5     4, 44.4; 8, 62.4
h       5      54.4     4, 46.1; 1, 62.7
l       1      33.3     1, 33.3
r       8      79.3     5, 73.3; 3, 89.3
zh      3      5.6      3, 5.6
ch      2      54       1, 54.7; 1, 53.3
sh      1      54.2     1, 54.2

Figure 5. Error rate of vowel /e/ for each consonant

In this study, the consonants can be divided into two groups, i.e. those in the first syllable and those in the second syllable. The relationship between the error rate of these consonants and the vowel /e/ has been investigated, and the results are shown in Fig. 6. From the figure, we find that the second consonant presents a higher average error rate than the first one [4].

Figure 6. Comparison of the first and the second consonants

V. DISTINCTION OF PRONUNCIATION

In order to distinguish automatically, using a computer as a tool, whether the vowel /e/ is pronounced correctly or not, we have investigated the difference between the pronunciation of native Chinese speakers and Japanese students. The vowel /e/ section is characterized by the first formant frequency (F1). In order to investigate the variation within the vowel /e/, we quantify the data as follows:
1. Calculation of the overall F1 of the vowel /e/ interval,
2. Division of the vowel /e/ into three sections (beginning, middle, ending) using each 4 frames,
3. Calculation of the average F1 of each interval.
Fig. 7 shows an example of the extraction of F1 for a vowel pronounced by a Japanese student. The analysis conditions for the formant are shown in Table III.

Figure 7. First formant of /e/ vowel

TABLE III. ANALYSIS CONDITION OF FORMANT FREQUENCY

Sampling frequency   16kHz
Frame length         41 Points
Shift interval       25 Points
Window type          Hamming
LPC order            24

VI. VARIATION IN THE F1 FOR EACH TONE

Fig. 8 shows the analysis results for the first tone as pronounced by 2 native Chinese speakers and 2 Japanese students. The vowel /e/ pronounced by the two Japanese students A and B was evaluated as correct by perception. The first formant frequencies of the two native Chinese speakers rise from the beginning to the ending for the first tone of /e/. On the other hand, the formants of the Japanese students remain almost unchanged.

Figure 8. Comparison between native Chinese speakers and Japanese students for the first-tone of /e/

Fig. 9 shows the analysis results for the second tone. Japanese student A was evaluated at the ambiguous level, and student B at the mistaken level. For student A, the value does not change between the beginning and the middle and goes up from
the middle to the ending. For student B, the value goes down from the middle to the ending. The values of the native Chinese speakers keep rising from the beginning through the middle to the ending.

Figure 9. Comparison between native Chinese speakers and Japanese students in the second-tone of /e/

TABLE IV. DISCRIMINATION RULE

Discrimination rule       Decision result
R1 > 0 and R2 > 0         Correct pronunciation
R1 < 0 or R2 < 0          Mistaken pronunciation

We use Table IV to distinguish the pronunciation of the vowel /e/. Here, R1 is the first formant frequency (F1) value of the middle minus that of the beginning, and R2 is the F1 value of the ending minus that of the middle.

VII. VERIFICATION OF THE PROPOSED METHOD

Fig. 10 shows the analysis results for the pronunciations of 54 words obtained from 4 Japanese students and 2 native Chinese speakers. In Fig. 10, the data of the native Chinese speakers and of the correct pronunciations by learners of Chinese lie mostly in the upper right corner, while the data of the incorrect pronunciations by Japanese students are distributed in the lower left corner. An accuracy rate of 80.1% is obtained.

An effort was made to investigate the vowel /e/, since it is the most difficult of the vowels for Japanese students learning Chinese to pronounce. From the investigations, we found that the error rate of /e/ is higher than 50%. By discussing how pronunciation errors in the vowel /e/ are influenced by the position in a two-syllable word, by the tone, and by the consonant, we reached the following conclusions: 1) the position within the word has little effect on the accuracy of pronunciation, 2) the error rate is lowest for the zero-tone and highest for the third tone, 3) the error rate is lowest when the consonant /zh/ is combined with the vowel /e/ and highest when the consonant /r/ is combined with the vowel /e/. The proposed method, which uses R1 and R2 to distinguish the state of the utterance, obtained an accuracy rate of 80.1%.

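As an illustration of the rule in Table IV, the following Python sketch computes the section averages of F1, the differences R1 and R2, and the agreement between the rule and perceptual labels. It is not the authors' implementation. The function names, the split of the vowel into three equal parts, the treatment of a difference of exactly zero as a mistake, the reduction of the perceptual evaluation to two labels, and the F1 values in the example are all assumptions made for this sketch.

```python
from statistics import mean


def section_means(f1_track):
    """Average F1 (Hz) over the beginning, middle and ending of a vowel.

    Assumption: the per-frame F1 track is split into three roughly equal
    parts; the paper's exact frame counts per section are not reproduced.
    """
    n = len(f1_track)
    if n < 3:
        raise ValueError("need at least three F1 frames")
    third = n // 3
    beginning = f1_track[:third]
    middle = f1_track[third:n - third]
    ending = f1_track[n - third:]
    return mean(beginning), mean(middle), mean(ending)


def r_values(f1_track):
    """R1 = middle minus beginning, R2 = ending minus middle."""
    b, m, e = section_means(f1_track)
    return m - b, e - m


def judge(f1_track):
    """Table IV: correct only if F1 rises in both steps.

    Assumption: a difference of exactly zero counts as a mistake,
    a case the rule does not specify.
    """
    r1, r2 = r_values(f1_track)
    return "correct" if r1 > 0 and r2 > 0 else "mistaken"


def agreement_rate(tracks, perceived):
    """Percentage of utterances where the rule matches the perceptual label."""
    hits = sum(judge(t) == p for t, p in zip(tracks, perceived))
    return 100.0 * hits / len(tracks)


if __name__ == "__main__":
    # Hypothetical F1 tracks in Hz, invented for the example.
    rising = [520, 530, 545, 560, 580, 600, 615, 630, 650]
    falling = [600, 600, 602, 601, 600, 598, 580, 560, 540]
    print(r_values(rising), judge(rising))
    print(r_values(falling), judge(falling))
    print(agreement_rate([rising, falling], ["correct", "mistaken"]))
```

A point (R1, R2) in the upper right quadrant of Fig. 10 corresponds to the "correct" branch of judge, so the same two numbers can be used both for plotting and for the decision.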
Figure 10. Distributions of R1 and R2

VIII. CONCLUSION

We investigated the characteristics of syllables containing the vowels /e/, /u/ and /ü/ in Chinese pronunciation. Making a fully automatic self-learning system is our next subject in the near future.

REFERENCES

[1] Z. Zhang and S. Makino, "Chinese vowel recognition of using formant," Acoustical Society of Japan, vol. 47, no. 4, 1991.
[2] Y. Kanamori, "The characteristics of Chinese vowel an and ang for Japanese learner," in Proc. 18th International Congress on Acoustics, 2004.
[3] Y. Kanamori and T. Tokoro, "The feature extraction and discrimination of Chinese aspirated and un-aspirated affrications," GESTS International Transaction on Computer Science and Engineering, vol. 8, no. 1, pp. 15-111, May 2005.
[4] S. C. Tseng, K. Kuei, and P. C. Tsou, "Acoustic characteristics of vowels and plosives/affricates of Mandarin-speaking hearing-impaired children," Clinical Linguistics & Phonetics, vol. 25, no. 9, pp. 784-83, 2011.
[5] M. Eskenazi, "An overview of spoken language technology for education," Speech Communication, vol. 51, pp. 832-844, 2009.
[6] T. Zhao, T. Zhao, A. Hoshino, M. Suzuki, N. Minematsu, and K. Hirose, "Automatic Chinese pronunciation error detection using SVM trained with structural features," in Proc. Spoken Language Technology Workshop, 2012, pp. 473-476.
[7] T. Takagi, A. Hattori, and M. Komiya, "A Chinese language learning system with visualization and speech correction for prosody," IEICE Trans., vol. J88-D-I, no. 2, pp. 478-487, 2005.
[8] X. Yang and F. Gao, "An acoustics experiment of vowel duration in Chinese," Bulletin of Hokkaido Bunkyo University, vol. 29, pp. 65-79, 2005.
[9] S. Hiki and K. Imaizumi, "A CAI system for self-teaching Chinese tones based on their acoustical properties," IEICE Technical Report, Sp25-41, 2005.

Yasukazu Kanamori received the B.S. degree in electric engineering from Nanjing University of Science and Technology, Nanjing, China, in 1982. He then received the M.D. and Ph.D. degrees from the Graduate School of Engineering, Utsunomiya University, Utsunomiya, Japan, in 1990 and 1996. From 1990 to 1993 and from 1996 to 2000, he was an Assistant Professor at Utsunomiya University and at Nara Institute of Science and Technology, respectively. From October 2000 to March 2002, he was a Visiting Researcher at Advanced Telecommunications Research Institute International, Spoken Language Translation Research Laboratories (ATR-SLT). He is currently an Associate Professor at the Graduate School of Information Science and Technology, Aichi Prefectural University. His research interests include speech and audio signal processing, foreign language learning assistance, and dialect feature analysis. Dr. Kanamori is a member of IEICE and ASJ of Japan.

Guoxing Fang received the B.S. degree from the College of Engineering, Chubu University, Nagoya, Japan, in 2009. He then received the M.S. degree from the Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute, Japan, in 2012. He is currently an engineer at Toyota Communication Systems Co., Ltd. He became interested in foreign language learning when he was a student.