A Knowledge based Approach Using Fuzzy Inference Rules for Vowel Recognition

Journal of Convergence Information Technology, Vol. 3, No. 1, March 2008

A Knowledge based Approach Using Fuzzy Inference Rules for Vowel Recognition

Hrudaya Ku. Tripathy*1, B.K. Tripathy*2 and Pradip K. Das*3
*1 Institute of Advanced Computer and Research, Prajukti Bihar, Rayagada (Orissa), India
*2 School of Computing Sciences, VIT University, Vellore, Tamil Nadu, India
*3 Department of Computer Science & Engineering, Indian Institute of Technology Guwahati, North Guwahati (Assam), India

Abstract

Automatic speech recognition by machine is one of the most efficient methods for man-machine communication. Because the speech waveform is nonlinear and variable, speech recognition requires a great deal of intelligence and fault tolerance in the pattern recognition algorithms. Accurate vowel recognition forms the backbone of most successful speech recognition systems. A collection of techniques exists to extract the relevant features from the steady-state regions of vowels, in both the time and frequency domains. This paper introduces fuzzy techniques that allow the classification of imprecise vowel data. By incorporating acoustic attributes, the system acquires the capacity to correctly classify imprecise speech data input. Experimental results show that the fuzzy system's performance is vastly improved over a standard Mel frequency cepstral coefficient (MFCC) feature analysis of vowel recognition. Speech recognition is a particularly difficult classification problem, due to differences in voice frequency (amongst speakers) and variations in pronunciation.

Keywords: Fuzzy Logic, Fuzzy Inference, Vowel, Speech Recognition

1. Introduction

Automatic speech recognition by machine has been a part of science fiction for many years. The early attempts were made in the 1950s by various researchers. In 1952, Davis, Biddulph and Balashek [1] designed the first isolated digit recognizer for a single speaker at Bell Laboratories.
This system used a simple pattern matching method with templates for each of the digits. Matching was performed with two parameters: a frequency cut based on separating the spectrum of the spoken digit into two bands, and a fundamental frequency estimated by zero-crossing counting. The 1970s and 1980s were very active periods for speech recognition, with a series of important milestones: pattern recognition algorithms were applied to template-based isolated word recognition; continuous speech from large vocabularies was understood by using high-level knowledge to compensate for errors in phonetic approaches; speech analysis based on Linear Predictive Coding (LPC) was used instead of conventional methods such as FFT and filter banks; statistical models such as Hidden Markov Models (HMMs) were developed for continuous speech recognition; and neural networks (back propagation, learning vector quantization) with efficient learning algorithms were proposed for speech pattern matching.

Vowels are generally well defined in the spectral domain. As such, they contribute significantly to the ability to recognize speech, both for human listeners and for speech recognizers. For vowels, the speech behavior can be considered as a point that moves in parameter space as the articulatory system changes. Standard HMMs using cepstral coefficients with their derivatives cannot effectively model these trajectories, especially for vowels [2]. In recent years speech recognition technology has begun to enter the real world, and more and more advanced algorithms have been adopted in this area. The paper [3] presented an improved vowel detection and segmentation scheme. The vowel region of the wave file is extracted after perceptually checking the vowel in question. This wave file is converted into a text data file and filtered for DC shift to remove internal machine noise. LP analysis, followed by computation of cepstral coefficients and weighting, is done to form the feature vectors. A series of experiments were reported in which vowel segments were chosen carefully based on perception. It was reported that the vowel recognition scores were better than those of the standard procedures. In this paper, fuzzy logic techniques have been applied to the classification of vowels for speech recognition, a field that is growing and developing very fast.

2. Introduction to Speech Sounds

2.1 Speech Production

Speech sound is produced by a set of well-controlled movements of the various speech apparatus. Figure 1 shows a schematic cross-section through the vocal tract. The vocal tract is the primary acoustic tube: the region of the mouth cavity bounded by the vocal cords and the lips. As air is expelled from the lungs, the vocal cords are tensed and caused to vibrate by the airflow. The frequency of this oscillation is called the fundamental frequency, and it depends on the length, tension and mass of the vocal cords. During this process, the shape of the vocal tube is changed by the different positions of the velum, tongue, jaw and lips [4]. The average length of the vocal tract for an adult male is about 17 cm, and its cross-sectional area can vary in its outer section from 0 to about 20 cm². Therefore the vocal tract, as an acoustic resonator, produces variable resonant frequencies as its shape and size are adjusted. A resonant frequency is called a formant frequency, or simply a formant. The nasal tract is an auxiliary acoustic tube that can be acoustically coupled with the vocal tract to produce nasal sounds. Not only the shape of the vocal tract but also the type of excitation produces the various speech sounds: besides the airflow from the lungs, the excitation can come from other sources, such as fricative excitation and whispered excitation [5].

Figure 1. Schematic view of the human speech apparatus

2.2 Fundamental Speech Recognition Techniques

Classification of Speech Recognizers

Automatic speech recognition can be classified into a number of different categories depending on different issues:

1. The manner in which a user speaks. There are usually three recognition modes based on the speaking manner:
Isolated word recognition: the user speaks individual words or phrases from a specified vocabulary. Isolated word recognition is suitable for command recognition.
Connected word recognition: the user speaks a fluent sequence of words with short pauses between them, in which each word is from a specified vocabulary (e.g., zip codes, phone numbers).
Continuous speech recognition: the speaker can speak fluently with a large vocabulary.

2. The number of users:
Speaker dependent: the users of the recognition system consist of a single speaker or a set of known speakers.
Speaker independent: arbitrary users will use the ASR system.

Speaker adaptive: the system customizes its responses to each individual speaker while it is in use.

3. The size of the recognition vocabulary:
A small vocabulary system only provides recognition capability for a small number of words.
A large vocabulary system is capable of recognizing words from a vocabulary containing up to 1000 words.

3. Fuzzy Logic

3.1 Background

Fuzzy sets were introduced by Zadeh [6] in 1965 as a new way to represent and manipulate data with uncertainty and fuzziness. In the old paradigm, fuzziness was considered unfavorable because of the expectation of scientific precision and accuracy. However, a fuzzy interpretation of data is a natural and intuitive way to formulate and solve many problems in everyday life. For example, expressions with uncertainty like "hot coffee", "heavy objects", and "warm weather" are fuzzy interpretations. Although both fuzzy sets and statistical theory can deal with uncertainty, fuzzy sets differ from statistical models in some ways. Probabilities represent the likelihood of a certain event with a distribution over all events, while a fuzzy set represents the applicability of an element to the set. In other words, fuzziness captures the kind of uncertainty found in the meanings of many words in human thinking.

3.2 Fuzzy Sets and Fuzzy Logic

Fuzzy sets are a super-set of classical sets. In a fuzzy set, each element is associated with a real value in the closed unit interval [0,1] that represents its degree of membership in the set. In classical crisp sets, however, an element's membership can only be "0" or "1". When all elements of a set have either complete membership or complete non-membership, the fuzzy set reduces to a crisp set [7]. Suppose a fuzzy set A is a subset of a space X that admits partial membership.
It is defined as the set of ordered pairs A = {(x, m_A(x))}, where x ∈ X and 0 ≤ m_A(x) ≤ 1. Every fuzzy set consists of three parts: a horizontal axis x specifying the population of the set, a vertical membership axis m_A(x) specifying the membership degree of each element, and the surface itself, which provides a one-to-one connection between the elements and their corresponding membership degrees.

3.3 Fuzzy System

Fuzzy systems use fuzzy set theory to deal with fuzzy or non-fuzzy information. Generally, a fuzzy system consists of a fuzzification subsystem, a fuzzy inference engine, a fuzzy rule base and a defuzzifier, as shown in Figure 2. The fuzzy rule base and the fuzzy inference engine are the core of the fuzzy-rule-based system. A fuzzy rule can be expressed by a set of fuzzy inference rules of the form "IF x is A THEN y is B" [8], [9]. The inference engine then implements a fuzzy inference algorithm to determine the fuzzy output from the inference rules and the inputs. Fuzzification is followed by the central stage of fuzzy system operation, fuzzy inferencing. This phase (Figure 2) makes use of the knowledge base, i.e., it applies the fuzzy production rules by aggregating their fuzzy premises, according to the fuzzy inference system model [10]. Note that a given input may simultaneously be a member of more than one set within a single fuzzy region. The inference engine interacts with the rule base and uses the inputs to determine which rules are applicable. The outputs are a set of fuzzy sets defined on the universe of possible outputs, which are then defuzzified to generate crisp outputs.

Figure 2. A typical fuzzy rule-based system (fuzzification subsystem, fuzzy inference engine, fuzzy rule base, defuzzification)

This way of inferencing is frequently found in the literature, in fuzzy logic systems for decision support and for the control of various systems. This was the original idea exploited in this paper.
Such a system can be exploited for speech recognition because it handles non-numerical parameters. The conceptual extension over classic inferencing and control systems lies in the absence of an analytic description of the system. The first approaches to extending control systems based on Zadeh's concept of fuzzy sets originate from Mamdani [8], who introduced a fuzzy logic controller whose control algorithm was based on simple rules. The approximate reasoning of such a fuzzy system converts knowledge represented by incomplete (fuzzy) information and fuzzy rules into non-fuzzy (numeric) outputs. In order to model human reasoning mechanisms, Lotfi Zadeh introduced the fuzzy extension of conventional inferencing systems (fuzzy logic systems, FLS), which, besides quantitative aspects, also includes the logic of inexact and incomplete information, operations and inferencing rules. In order to combine them with the heuristic formulation inside such systems, the so-called if-then rules, the numerical values of the mathematical descriptions had to be symbolically interpreted.

4. Methodology and Preprocessing

The methodology of the proposed knowledge-based fuzzy inference rule is shown in Figure 3. The speech waveform data consists of 3700 pitch periods from 200 utterances containing 10 repetitions of 5 different vowels spoken by 20 male speakers. All the speakers are from different parts of India. The vowels were recorded with a carrier sentence ("I say aaaaa now", "I say oooo now", etc.) as per the IPA standard. All the speech utterances used in the present study were recorded in an air-conditioned lab in the presence of a number of students. Recording software was used with a sampling rate of 22050 Hz, mono channel and 16-bit resolution.

Figure 3. Block diagram of the proposed knowledge-based approach using fuzzy inference rules for vowel recognition (input vowel, conversion to raw data, pitch extraction, acoustic analysis to form fuzzy sets, fuzzy inference algorithm, recognized vowel)

The raw speech data is extracted from the respective recorded wave files and stored in text format. The raw data of each wave file is then normalized by its absolute maximum value. An algorithm was developed to accurately detect a pitch period on the basis of the highest positive sample value in the data, scanning backwards to find the lowest positive value preceding the located highest value. The position of this lowest positive sample is the pitch start position. Similarly, after obtaining the pitch start position, the data is scanned forward from the first highest positive sample value to the next highest positive sample value. The pitch end position is appropriately found on the basis of the succeeding pitch start position. The same method is then followed to select the next pitch period. In this way, raw data for 8 to 10 pitch periods from the steady part of the waveform is stored in a text file.

For each pitch period, the number of positive-going curves is calculated and the number of peaks present in all such curves is found. The data corresponding to the pitch period is isolated. Next, the number of peaks and the number of samples in the first positive-going curve of the pitch period are computed, as shown in Figure 4. This is done for all vowels. The algorithm is designed so that pitch periods with fewer than 4 samples in the first positive-going curve are ignored; this ensures that most invalid pitch periods are removed from consideration.

Figure 4. Formulation of fuzzy sets by analysis of the pitch period of a speech signal

4.1 Acoustic analysis using Fuzzy inference

We describe the vowel recognition classifier algorithm based on the number of samples and the number of peaks present in the first positive-going curve. Along with this, the total number of peaks located within one pitch period is also used. The proposed fuzzy inference algorithm classifies the spoken English vowels using a hierarchical scheme. Initially, the algorithm classifies a given input vowel into two groups, {/o/,/u/} and {/a/,/i/,/e/}. From the first group, /o/ and /u/ are recognized separately. Next, it differentiates the given vowel between /a/ and {/i/,/e/}. Finally, the vowel is classified as either /i/ or /e/. This tree-type heuristic classification is based on the acoustic analysis of the vowel waveforms.

Heuristically, this was verified by plotting graphs of a large number of pitch periods over all utterances; the feature values fall within a particular range for each vowel. For example, for the vowel /a/, the number of samples in the first positive-going curve was between 1 and 19, the number of peaks in the same curve was from 1 to 3, and the total number of peaks in all positive-going curves of a whole pitch period was from 1 to 8, as shown in Figure 5.

Figure 5. Graphs of the fuzzy set analysis of vowel /a/ (number of samples in the first positive curve per category of samples; total number of peaks in all positive curves of a pitch per category of peaks)

Similarly, this characterization is computed for the remaining vowels /i/, /e/, /o/ and /u/, on the basis of information extracted from 3700 pitch periods of 5 vowels uttered by 20 Indian speakers from different ethnic backgrounds.

    if ((number of samples in first positive curve >= 19) &&
        (number of peaks in first positive curve <= 3) &&
        (total no of peaks in all positive curves <= 8)) {
        if (number of samples in first positive curve <= 26)
            printf("Vowel is O");
        else
            printf("Vowel is U");
    }
    if ((number of samples in first positive curve < 19) &&
        (number of peaks in first positive curve <= 3) &&
        (total no of peaks in all positive curves <= 10))
        printf("Vowel is A");
    if ((number of peaks in first positive curve <= 4) &&
        (total no of peaks in all positive curves > 10) &&
        (total no of peaks in all positive curves <= 20))
        printf("Vowel is E");
    if ((number of samples in first positive curve >= 10) &&
        (number of peaks in first positive curve > 4))
        printf("Vowel is I");

Figure 6. Fuzzy inference algorithm for vowel classification

4.2 Results

On the basis of the logic developed in the fuzzy inference algorithm explained in Figure 6, a set of 3700 pitch periods of 5 vowels spoken by 20 adult male speakers was used to test the classification system for its correctness. Figure 7 shows the output for the recognition of the 5 vowels. It should be noted that some pitch periods were removed from the classification scheme because the number of samples detected in the first positive-going curve was less than 4, as described earlier. From Figure 8, the recognition scores for /u/ and /o/ are found to be 99% and 93% respectively. Vowel /a/ has been correctly recognized with an accuracy of 92%. It is observed that for /e/ and /i/ the recognition scores are lower, at 82% and 86% respectively. Experimental results show that the fuzzy system's performance is vastly improved over a standard Mel frequency cepstral coefficient (MFCC) feature analysis of vowel recognition, as in [3].

Figure 7. Output for the recognition of the vowels

Figure 8. Vowel recognition accuracy

5. Conclusion

A knowledge-based fuzzy system for speech recognition, which finds useful practical applications, has been considered in this paper. The identification of the basic acoustic parameters, the established fuzzy inference rules, and the way they classify vowels are also illustrated.

6. References

[1] K. Davis, R. Biddulph, and S. Balashek, "Automatic recognition of spoken digits", Journal of the Acoustical Society of America, Vol. 23, 1952.
[2] Y. Gong, "Stochastic Trajectory Modeling and Sentence Searching for Continuous Speech Recognition", IEEE Trans. on Speech and Audio Processing, Vol. 5, No. 1, January 1997.
[3] K. Saravanakumar, Hrudaya Ku. Tripathy, Pradip K. Das, "An Improved Wave Segmentation Scheme for Vowel Recognition", Proceedings of the National Conference on Communication Technologies (NCCT-2006), Mepco Schlenk Engg. College, Tamil Nadu, 2006.
[4] Lawrence Rabiner, Biing-Hwang Juang, Fundamentals of Speech Recognition, Prentice Hall, Englewood Cliffs.
[5] Joseph P. Campbell, Jr., "Speaker recognition: A tutorial", Proceedings of the IEEE, Vol. 85, No. 9, September.
[6] L. A. Zadeh, "Fuzzy sets", Information and Control, 1965.
[7] Hui Ping, "Isolated Word Speech Recognition Using Fuzzy Neural Techniques", thesis submitted to the College of Graduate Studies and Research, University of Windsor, Windsor, Ontario, Canada.
[8] Lynn Yaling Cai, Hon Keung Kwan, "Fuzzy classifications using fuzzy inference networks", IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, Vol. 28, No. 3, June 1998.
[9] Hon Keung Kwan, Yaling Cai, and Bin Zhang, "Membership function learning in fuzzy classification", International Journal of Electronics, Vol. 74, No. 6.
[10] Miloš Manić, Dragan Cvetković, Momir Praščević, "Intelligibility Speech Estimation Using Fuzzy Logic Inferencing", FACTA UNIVERSITATIS, Vol. 1, No. 4, 1999.

Authors' bios

Hrudaya Ku. Tripathy is presently working as Asst. Professor in the Dept. of Computer Science & Engineering at the Institute of Advanced Computer & Research, Rayagada, Orissa, India. He has 10 years of teaching experience in UG/PG courses. He has an M.Tech from the Indian Institute of Technology Guwahati, and is pursuing his Doctorate at Berhampur University, Berhampur, Orissa, India.

B.K. Tripathy is presently working as Professor in the School of Computing Sciences at VIT University, Vellore, India. He has around 27 years of teaching experience in UG/PG courses. He has an M.Tech from the University of Poona, Pune, India, and a completed PhD. He has published more than 50 research papers in national and international journals.

Pradip K. Das is presently working as Asst. Professor in the Dept. of Computer Science & Engineering at the Indian Institute of Technology Guwahati. He has around 20 years of teaching experience in both UG and PG classes. He has an M.Sc from Delhi University and a completed PhD. He has published more than 30 research papers in national and international journals.


More information

L16: Speaker recognition

L16: Speaker recognition L16: Speaker recognition Introduction Measurement of speaker characteristics Construction of speaker models Decision and performance Applications [This lecture is based on Rosenberg et al., 2008, in Benesty

More information

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion

More information

Pass Phrase Based Speaker Recognition for Authentication

Pass Phrase Based Speaker Recognition for Authentication Pass Phrase Based Speaker Recognition for Authentication Heinz Hertlein, Dr. Robert Frischholz, Dr. Elmar Nöth* HumanScan GmbH Wetterkreuz 19a 91058 Erlangen/Tennenlohe, Germany * Chair for Pattern Recognition,

More information

Discriminative Learning of Feature Functions of Generative Type in Speech Translation

Discriminative Learning of Feature Functions of Generative Type in Speech Translation Discriminative Learning of Feature Functions of Generative Type in Speech Translation Xiaodong He Microsoft Research, One Microsoft Way, Redmond, WA 98052 USA Li Deng Microsoft Research, One Microsoft

More information

Music Genre Classification Using MFCC, K-NN and SVM Classifier

Music Genre Classification Using MFCC, K-NN and SVM Classifier Volume 4, Issue 2, February-2017, pp. 43-47 ISSN (O): 2349-7084 International Journal of Computer Engineering In Research Trends Available online at: www.ijcert.org Music Genre Classification Using MFCC,

More information

Expert System for Heart Problems

Expert System for Heart Problems Expert System for Heart Problems M. Eswara Rao Asst. Professor, TP Institute of Science & Tech., Komatipalli, Bobbili. haieswar2020@gmail.com Dr. S. Govinda Rao, Scientist (Statistics) ANGR Agrl. University,RARS,

More information

Natural Speech Synthesizer for Blind Persons Using Hybrid Approach

Natural Speech Synthesizer for Blind Persons Using Hybrid Approach Procedia Computer Science Volume 41, 2014, Pages 83 88 BICA 2014. 5th Annual International Conference on Biologically Inspired Cognitive Architectures Natural Speech Synthesizer for Blind Persons Using

More information

Session 1: Gesture Recognition & Machine Learning Fundamentals

Session 1: Gesture Recognition & Machine Learning Fundamentals IAP Gesture Recognition Workshop Session 1: Gesture Recognition & Machine Learning Fundamentals Nicholas Gillian Responsive Environments, MIT Media Lab Tuesday 8th January, 2013 My Research My Research

More information

Evolutive Neural Net Fuzzy Filtering: Basic Description

Evolutive Neural Net Fuzzy Filtering: Basic Description Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:

More information

Research Article A Robotic Voice Simulator and the Interactive Training for Hearing-Impaired People

Research Article A Robotic Voice Simulator and the Interactive Training for Hearing-Impaired People Hindawi Publishing Corporation Journal of Biomedicine and Biotechnology Volume 28, Article ID 7682, 7 pages doi:1.11/28/7682 Research Article A Robotic Voice Simulator and the Interactive Training for

More information

Segmentation and Recognition of Handwritten Dates

Segmentation and Recognition of Handwritten Dates Segmentation and Recognition of Handwritten Dates y M. Morita 1;2, R. Sabourin 1 3, F. Bortolozzi 3, and C. Y. Suen 2 1 Ecole de Technologie Supérieure - Montreal, Canada 2 Centre for Pattern Recognition

More information

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers October 31, 2003 Amit Juneja Department of Electrical and Computer Engineering University of Maryland, College Park,

More information

Table 1: Classification accuracy percent using SVMs and HMMs

Table 1: Classification accuracy percent using SVMs and HMMs Feature Sets for the Automatic Detection of Prosodic Prominence Tim Mahrt, Jui-Ting Huang, Yoonsook Mo, Jennifer Cole, Mark Hasegawa-Johnson, and Margaret Fleck This work presents a series of experiments

More information

Development of Web-based Vietnamese Pronunciation Training System

Development of Web-based Vietnamese Pronunciation Training System Development of Web-based Vietnamese Pronunciation Training System MINH Nguyen Tan Tokyo Institute of Technology tanminh79@yahoo.co.jp JUN Murakami Kumamoto National College of Technology jun@cs.knct.ac.jp

More information

VOICE RECOGNITION SYSTEM: SPEECH-TO-TEXT

VOICE RECOGNITION SYSTEM: SPEECH-TO-TEXT VOICE RECOGNITION SYSTEM: SPEECH-TO-TEXT Prerana Das, Kakali Acharjee, Pranab Das and Vijay Prasad* Department of Computer Science & Engineering and Information Technology, School of Technology, Assam

More information

THE USE OF A FORMANT DIAGRAM IN AUDIOVISUAL SPEECH ACTIVITY DETECTION

THE USE OF A FORMANT DIAGRAM IN AUDIOVISUAL SPEECH ACTIVITY DETECTION THE USE OF A FORMANT DIAGRAM IN AUDIOVISUAL SPEECH ACTIVITY DETECTION K.C. van Bree, H.J.W. Belt Video Processing Systems Group, Philips Research, Eindhoven, Netherlands Karl.van.Bree@philips.com, Harm.Belt@philips.com

More information

Learning words from sights and sounds: a computational model. Deb K. Roy, and Alex P. Pentland Presented by Xiaoxu Wang.

Learning words from sights and sounds: a computational model. Deb K. Roy, and Alex P. Pentland Presented by Xiaoxu Wang. Learning words from sights and sounds: a computational model Deb K. Roy, and Alex P. Pentland Presented by Xiaoxu Wang Introduction Infants understand their surroundings by using a combination of evolved

More information

VOICE RECOGNITION SECURITY SYSTEM USING MEL-FREQUENCY CEPSTRUM COEFFICIENTS

VOICE RECOGNITION SECURITY SYSTEM USING MEL-FREQUENCY CEPSTRUM COEFFICIENTS Vol 9, Suppl. 3, 2016 Online - 2455-3891 Print - 0974-2441 Research Article VOICE RECOGNITION SECURITY SYSTEM USING MEL-FREQUENCY CEPSTRUM COEFFICIENTS ABSTRACT MAHALAKSHMI P 1 *, MURUGANANDAM M 2, SHARMILA

More information

Speech and speech processing / April 7, 2005 Ted Gibson

Speech and speech processing / April 7, 2005 Ted Gibson Speech and speech processing 9.59 / 24.905 April 7, 2005 Ted Gibson The structure of language Sound structure: phonetics and phonology cat = /k/ + /æ/ + /t/ eat = /i/ + /t/ rough = /r/ + /^/ + /f/ Language

More information

RECENT TOPICS IN SPEECH RECOGNITION RESEARCH AT NTT LABORATORIES

RECENT TOPICS IN SPEECH RECOGNITION RESEARCH AT NTT LABORATORIES RECENT TOPICS IN SPEECH RECOGNITION RESEARCH AT NTT LABORATORIES Sadaoki Furui, Kiyohiro Shikano, Shoichi Matsunaga, Tatsuo Matsuoka, Satoshi Takahashi, and Tomokazu Yamada NTT Human Interface Laboratories

More information

Discriminative Learning of Feature Functions of Generative Type in Speech Translation

Discriminative Learning of Feature Functions of Generative Type in Speech Translation Discriminative Learning of Feature Functions of Generative Type in Speech Translation Xiaodong He Microsoft Research, One Microsoft Way, Redmond, WA 98052 USA Li Deng Microsoft Research, One Microsoft

More information

A New Kind of Dynamical Pattern Towards Distinction of Two Different Emotion States Through Speech Signals

A New Kind of Dynamical Pattern Towards Distinction of Two Different Emotion States Through Speech Signals A New Kind of Dynamical Pattern Towards Distinction of Two Different Emotion States Through Speech Signals Akalpita Das Gauhati University India dasakalpita@gmail.com Babul Nath, Purnendu Acharjee, Anilesh

More information

Support of Contextual Classifier Ensembles Design

Support of Contextual Classifier Ensembles Design Proceedings of the Federated Conference on Computer Science and Information Systems pp. 1683 1689 DOI: 10.15439/2015F353 ACSIS, Vol. 5 Support of Contextual Classifier Ensembles Design Janina A. Jakubczyc

More information

Spoken Language Identification Using Hybrid Feature Extraction Methods

Spoken Language Identification Using Hybrid Feature Extraction Methods JOURNAL OF TELECOMMUNICATIONS, VOLUME 1, ISSUE 2, MARCH 2010 11 Spoken Language Identification Using Hybrid Feature Extraction Methods Pawan Kumar, Astik Biswas, A.N. Mishra and Mahesh Chandra Abstract

More information

Variation of Vowels when Preceding Voiced And Voiceless Consonant in Sundanese

Variation of Vowels when Preceding Voiced And Voiceless Consonant in Sundanese International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 6, Issue 9 (September 2017), PP.13-20 Variation of Vowels when Preceding Voiced And Voiceless

More information

Sanjib Das Department of Computer Science, Sukanta Mahavidyalaya, (University of North Bengal), India

Sanjib Das Department of Computer Science, Sukanta Mahavidyalaya, (University of North Bengal), India Speech Recognition Technique: A Review Sanjib Das Department of Computer Science, Sukanta Mahavidyalaya, (University of North Bengal), India ABSTRACT Speech is the primary, and the most convenient means

More information

VOICE CONVERSION BY PROSODY AND VOCAL TRACT MODIFICATION

VOICE CONVERSION BY PROSODY AND VOCAL TRACT MODIFICATION VOICE CONVERSION BY PROSODY AND VOCAL TRACT MODIFICATION K. Sreenivasa Rao Department of ECE, Indian Institute of Technology Guwahati, Guwahati - 781 39, India. E-mail: ksrao@iitg.ernet.in B. Yegnanarayana

More information

II. SID AND ITS CHALLENGES

II. SID AND ITS CHALLENGES Call Centre Speaker Identification using Telephone and Data Lerato Lerato and Daniel Mashao Dept. of Electrical Engineering, University of Cape Town Rondebosch 7800, Cape Town, South Africa llerato@crg.ee.uct.ac.za,

More information

Segment-Based Speech Recognition

Segment-Based Speech Recognition Segment-Based Speech Recognition Introduction Searching graph-based observation spaces Anti-phone modelling Near-miss modelling Modelling landmarks Phonological modelling Lecture # 16 Session 2003 6.345

More information

Tamil Speech Recognition Using Hybrid Technique of EWTLBO and HMM

Tamil Speech Recognition Using Hybrid Technique of EWTLBO and HMM Tamil Speech Recognition Using Hybrid Technique of EWTLBO and HMM Dr.E.Chandra M.Sc., M.phil., PhD 1, S.Sujiya M.C.A., MSc(Psyc) 2 1. Director, Department of Computer Science, Dr.SNS Rajalakshmi College

More information

The sounds of language

The sounds of language The sounds of language Phonetics Chapter 4 1 Recap Language vs. other communicative systems Universal characteristics of language Displacement Arbitrariness Productivity Cultural transmission Duality 2

More information

Babble Noise: Modeling, Analysis, and Applications Nitish Krishnamurthy, Student Member, IEEE, and John H. L. Hansen, Fellow, IEEE

Babble Noise: Modeling, Analysis, and Applications Nitish Krishnamurthy, Student Member, IEEE, and John H. L. Hansen, Fellow, IEEE 1394 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 7, SEPTEMBER 2009 Babble Noise: Modeling, Analysis, and Applications Nitish Krishnamurthy, Student Member, IEEE, and John

More information

Low-Audible Speech Detection using Perceptual and Entropy Features

Low-Audible Speech Detection using Perceptual and Entropy Features Low-Audible Speech Detection using Perceptual and Entropy Features Karthika Senan J P and Asha A S Department of Electronics and Communication, TKM Institute of Technology, Karuvelil, Kollam, Kerala, India.

More information

ADDIS ABABA UNIVERSITY COLLEGE OF NATURAL SCIENCE SCHOOL OF INFORMATION SCIENCE. Spontaneous Speech Recognition for Amharic Using HMM

ADDIS ABABA UNIVERSITY COLLEGE OF NATURAL SCIENCE SCHOOL OF INFORMATION SCIENCE. Spontaneous Speech Recognition for Amharic Using HMM ADDIS ABABA UNIVERSITY COLLEGE OF NATURAL SCIENCE SCHOOL OF INFORMATION SCIENCE Spontaneous Speech Recognition for Amharic Using HMM A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE

More information

Foreign Accent Classification

Foreign Accent Classification Foreign Accent Classification CS 229, Fall 2011 Paul Chen pochuan@stanford.edu Julia Lee juleea@stanford.edu Julia Neidert jneid@stanford.edu ABSTRACT We worked to create an effective classifier for foreign

More information

Synthesizer control parameters. Output layer. Hidden layer. Input layer. Time index. Allophone duration. Cycles Trained

Synthesizer control parameters. Output layer. Hidden layer. Input layer. Time index. Allophone duration. Cycles Trained Allophone Synthesis Using A Neural Network G. C. Cawley and P. D.Noakes Department of Electronic Systems Engineering, University of Essex Wivenhoe Park, Colchester C04 3SQ, UK email ludo@uk.ac.essex.ese

More information

An Intelligent Speech Recognition System for Education System

An Intelligent Speech Recognition System for Education System An Intelligent Speech Recognition System for Education System Vishal Bhargava, Nikhil Maheshwari Department of Information Technology, Delhi Technological Universit y (Formerl y DCE), Delhi visha lb h

More information

Adaptation of Mamdani Fuzzy Inference System Using Neuro - Genetic Approach for Tactical Air Combat Decision Support System

Adaptation of Mamdani Fuzzy Inference System Using Neuro - Genetic Approach for Tactical Air Combat Decision Support System Adaptation of Mamdani Fuzzy Inference System Using Neuro - Genetic Approach for Tactical Air Combat Decision Support System Cong Tran 1, Lakhmi Jain 1, Ajith Abraham 2 1 School of Electrical and Information

More information

Overview of Speech Recognition and Recognizer

Overview of Speech Recognition and Recognizer Overview of Speech Recognition and Recognizer Research Article Authors 1Dr. E. Chandra, 2 Dony Joy Address for Correspondence: 1 Director, Dr.SNS Rajalakshmi College of Arts & Science, Coimbatore 2 Research

More information

VARIATION BETWEEN PALATAL VOICED FRICATIVE AND PALATAL APPROXIMANT IN URDU SPOKEN LANGUAGE

VARIATION BETWEEN PALATAL VOICED FRICATIVE AND PALATAL APPROXIMANT IN URDU SPOKEN LANGUAGE 46 VARIATION BETWEEN PALATAL VOICED FRICATIVE AND PALATAL APPROXIMANT IN URDU SPOKEN LANGUAGE SHERAZ BASHIR 1. INTRODUCTION Urdu is the national language of Pakistan. It has most of the common vocalic

More information

SYNTHESIS OF ORAL AND NASAL VOWELS OF URDU

SYNTHESIS OF ORAL AND NASAL VOWELS OF URDU 94 Center for Research in Urdu Language Processing SYNTHESIS OF ORAL AND NASAL VOWELS OF URDU MUHAMMAD KHURRAM RIAZ ABSTRACT The following oral and nasal vowels of Urdu were synthesized (i, æ, u,, i, æ,

More information

Speaker Indexing Using Neural Network Clustering of Vowel Spectra

Speaker Indexing Using Neural Network Clustering of Vowel Spectra International Journal of Speech Technology 1,143-149 (1997) @ 1997 Kluwer Academic Publishers. Manufactured in The Netherlands. Speaker Indexing Using Neural Network Clustering of Vowel Spectra DEB K.

More information

Speech Assessment for the Classification of Hypokinetic Dysarthria in Parkinson s Disease

Speech Assessment for the Classification of Hypokinetic Dysarthria in Parkinson s Disease Speech Assessment for the Classification of Hypokinetic Dysarthria in Parkinson s Disease Abdul Haleem Butt 2012 Master Thesis Computer Engineering Nr: DEGREEPROJECT Computer Engineering Program Reg Number

More information

Sentiment Analysis of Speech

Sentiment Analysis of Speech Sentiment Analysis of Speech Aishwarya Murarka 1, Kajal Shivarkar 2, Sneha 3, Vani Gupta 4,Prof.Lata Sankpal 5 Student, Department of Computer Engineering, Sinhgad Academy of Engineering, Pune, India 1-4

More information

Review of Algorithms and Applications in Speech Recognition System

Review of Algorithms and Applications in Speech Recognition System Review of Algorithms and Applications in Speech Recognition System Rashmi C R Assistant Professor, Department of CSE CIT, Gubbi, Tumkur,Karnataka,India Abstract- Speech is one of the natural ways for humans

More information

AUTOMATIC SONG-TYPE CLASSIFICATION AND SPEAKER IDENTIFICATION OF NORWEGIAN ORTOLAN BUNTING (EMBERIZA HORTULANA) VOCALIZATIONS

AUTOMATIC SONG-TYPE CLASSIFICATION AND SPEAKER IDENTIFICATION OF NORWEGIAN ORTOLAN BUNTING (EMBERIZA HORTULANA) VOCALIZATIONS AUTOMATIC SONG-TYPE CLASSIFICATION AND SPEAKER IDENTIFICATION OF NORWEGIAN ORTOLAN BUNTING (EMBERIZA HORTULANA) VOCALIZATIONS Marek B. Trawicki & Michael T. Johnson Marquette University Department of Electrical

More information

Speaker Identification System using Autoregressive Model

Speaker Identification System using Autoregressive Model Research Journal of Applied Sciences, Engineering and echnology 4(1): 45-5, 212 ISSN: 24-7467 Maxwell Scientific Organization, 212 Submitted: September 7, 211 Accepted: September 3, 211 Published: January

More information

A Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language

A Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language A Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language Z.HACHKAR 1,3, A. FARCHI 2, B.MOUNIR 1, J. EL ABBADI 3 1 Ecole Supérieure de Technologie, Safi, Morocco. zhachkar2000@yahoo.fr.

More information

Monitoring Soft Palate Movements in Speech

Monitoring Soft Palate Movements in Speech Paper Delivered at the 8lst Meeting of The Acoustical Society of America Washington, D.C. April 23, 1971 Monitoring Soft Palate Movements in Speech JOHN J. OHALA, Department of Linguistics, University

More information

An Artificial Neural Network Approach for User Class-Dependent Off-Line Sentence Segmentation

An Artificial Neural Network Approach for User Class-Dependent Off-Line Sentence Segmentation An Artificial Neural Network Approach for User Class-Dependent Off-Line Sentence Segmentation César A. M. Carvalho and George D. C. Cavalcanti Abstract In this paper, we present an Artificial Neural Network

More information

Evaluation and Comparison of Performance of different Classifiers

Evaluation and Comparison of Performance of different Classifiers Evaluation and Comparison of Performance of different Classifiers Bhavana Kumari 1, Vishal Shrivastava 2 ACE&IT, Jaipur Abstract:- Many companies like insurance, credit card, bank, retail industry require

More information

AIR FORCE INSTITUTE OF TECHNOLOGY

AIR FORCE INSTITUTE OF TECHNOLOGY SPEECH RECOGNITION USING THE MELLIN TRANSFORM THESIS Jesse R. Hornback, Second Lieutenant, USAF AFIT/GE/ENG/06-22 DEPARTMENT OF THE AIR FORCE AIR UNIVERSITY AIR FORCE INSTITUTE OF TECHNOLOGY Wright-Patterson

More information

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Amit Juneja and Carol Espy-Wilson Department of Electrical and Computer Engineering University of Maryland,

More information

Performance improvement in automatic evaluation system of English pronunciation by using various normalization methods

Performance improvement in automatic evaluation system of English pronunciation by using various normalization methods Proceedings of 20 th International Congress on Acoustics, ICA 2010 23-27 August 2010, Sydney, Australia Performance improvement in automatic evaluation system of English pronunciation by using various

More information

Linguistic Phonetics Fall 2005

Linguistic Phonetics Fall 2005 MIT OpenCourseWare http://ocw.mit.edu 24.963 Linguistic Phonetics Fall 25 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 24.963 Linguistic Phonetics

More information

Statistical Modeling of Pronunciation Variation by Hierarchical Grouping Rule Inference

Statistical Modeling of Pronunciation Variation by Hierarchical Grouping Rule Inference Statistical Modeling of Pronunciation Variation by Hierarchical Grouping Rule Inference Mónica Caballero, Asunción Moreno Talp Research Center Department of Signal Theory and Communications Universitat

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Speech Communication Session 4pSCa: Auditory Feedback in Speech Production

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume, 213 http://acousticalsociety.org/ ICA 213 Montreal Montreal, Canada 2-7 June 213 Speech Communication Session 2aSC: Linking Perception and Production (er Session)

More information

An Exploratory Study of Emotional Speech Production using Functional Data Analysis Techniques

An Exploratory Study of Emotional Speech Production using Functional Data Analysis Techniques An Exploratory Study of Emotional Speech Production using Functional Data Analysis Techniques Sungbok Lee 1,2, Erik Bresch 1, Shrikanth Narayanan 1,2,3 University of Southern California Viterbi School

More information

Description of the articulation of consonants of English

Description of the articulation of consonants of English Description of the articulation of consonants of English Chia-Lin Hsieh, Yi-Shan Chiu Chapter One Introduction In 1996, the Education Innovation Council suggested that the Ministry of Education should

More information

Convolutional Neural Networks for Speech Recognition

Convolutional Neural Networks for Speech Recognition IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL 22, NO 10, OCTOBER 2014 1533 Convolutional Neural Networks for Speech Recognition Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang,

More information

Speaker Recognition. Speaker Diarization and Identification

Speaker Recognition. Speaker Diarization and Identification Speaker Recognition Speaker Diarization and Identification A dissertation submitted to the University of Manchester for the degree of Master of Science in the Faculty of Engineering and Physical Sciences

More information

Hidden Markov Model-based speech synthesis

Hidden Markov Model-based speech synthesis Hidden Markov Model-based speech synthesis Junichi Yamagishi, Korin Richmond, Simon King and many others Centre for Speech Technology Research University of Edinburgh, UK www.cstr.ed.ac.uk Note I did not

More information

Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge

Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge 218 Bengio, De Mori and Cardin Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge Y oshua Bengio Renato De Mori Dept Computer Science Dept Computer Science McGill University

More information