Intra-speaker variation and units in human speech perception and ASR

Size: px
Start display at page:

Download "Intra-speaker variation and units in human speech perception and ASR"

Transcription

1 SRIV - ITRW on Speech Recognition and Intrinsic Variation May 20, 2006 Toulouse Intra-speaker variation and units in human speech perception and ASR Richard Wright University of Washington, Dept. of Linguistics rawright@u.washington.edu

2 Talk Outline Word recognition task 2 types of variation Sources of Inter-speaker Sources of Intra-speaker Human speech perception and variation Importance of features for perception Implications for ASR

3 Word recognition task: spontaneous speech Buckeye corpus: The Ohio State University Depts. of Psychology, Computer Science, and Linguistics collaboration Conversational speech (informal interviews) high quality recordings 40 speakers from Columbus, Ohio all from Columbus (2 accent groups) stratified for age (under 30, over 40) and sex class not controlled for orthographically transcribed and phonetically labeled freely available (with reasonable restrictions)

4 Word recognition task: single word Hz 0 ms 127

5 Word recognition task: two words Hz 0 ms 149

6 Word recognition task: three words Hz 0 ms 240

7 Word recognition task: four words Hz 0 ms 682

8 Word recognition task: six words Hz 0 ms 1276

9 Word recognition task: whole 5000 sentence 0 Hz 0 ms 2035

10 Word recognition task: summary The task demonstrates 3 aspects of human speech perception and word recognition that are still difficult for ASR to emulate: 1) Humans are able to use partial information to entertain a set of possible word candidates simultaneously without introducing confusions 2) Humans can recover gracefully from errors 3) Humans adapt dynamically to variation The task also demonstrates that humans use a combination of top-down and bottom up strategies in recognizing words

11 Types of variation Important advances in speech perception research: Variation in input is an integral part of perceptual category formation: variation is information not noise [1] Inter-speaker and intra-speaker variation are quite different in their causes and in their acoustic characteristics

12 Inter-speaker variation Results from two types of factors physiologic and anatomic factors [2] [3] size of vocal tract vocal fold mass and morphology mass and movement characteristics of articulators Social and experiential factors [4] gender (as opposed to sex) regional accent class affiliation native language, dialect, exposure to other languages

13 Inter-speaker variation Example: sex and gender differences in male and female speech are the results of both physiologic (sex) and sociologic (gender) factors While male-female acoustic differences are predicted from vocal tract differences they: emerge in children s speech well before the onset of puberty produces differences in vocal tract size [5] are greater than predicted by vocal tract differences [6] vary systematically by language [6]

14 Inter-speaker variation Largely static over the duration of a conversation Talkers generally don t change their gender, age, accent over the duration of a conversation Most of the dimensions are not unique to a single speaker but represent large sections of the population Larger corpora with appropriate samples of the population have brought improvements Better language models have also brought improvements (more appropriate phone sets or multiple word pronunciations) etc. [e.g. 7]

15 Inter-speaker variation In speech perception: we are able to understand a wide variety of accents that we have no experience with as long as they are similar to ones we know the greater the similarity to speakers we are familiar with, the lower the latency and the higher the accuracy an abrupt change from one talker to the next (even within accents) [8]

16 Inter-speaker variation In speech perception: as our experience with intra-speaker variables decreases (or as environmental noise increases) we rely on a coarser coding of the input implies a featural rather than strictly phone based lexical representation [9] [10]

17 Intra-speaker variation Multiple factors sociolinguistic [4] style shifts task: spontaneous speech, read sentences, etc. attitude of the speaker to the audience accent shifts as group affiliation shifts the accent may as well

18 Intra-speaker variation Continuous relationship between formality of the task and reduction in speech [11] least formal most reduced most formal most hyperarticulated casual conversation with a friend conversation with a stranger interviews formal speaking read texts read words in isolation

19 Intra-speaker variation Multiple factors Information based: the more predictable, the more reduced discourse: as a word s information load decreases, it becomes more reduced [12] [13] first introduced into the discourse, least reduced focus construction, less reduction lexical: base probabilities function words bear a much lower informational load than content words word frequency/familiarity confusability (perceptual similarity to other words)

20 Intra-speaker variation Multiple factors Information based factors interact with levels of formality at any one level of formality there are varying degrees of reduction based on informational factors the less formal the speech the greater the effect of discourse and lexical factors greater variability in pronunciation in spontaneous conversations than in read texts most of the variation is sub-phone in nature (reduction, increased coarticulation, etc.)

21 Intra-speaker variation Human speech perception Sociolinguistic and informational variation isn t noise, it s information: Humans rely on it to interpret the meaning of the utterance in its social context Humans use it to understand which words are important to the overall meaning of the utterance it encodes higher level syntactic and semantic structure Humans adapt dynamically to the variation delayed decisions underspecified inputs to lexical decision

22 Speech Perception Perceptual constancy in the face of highly variable input [1] the invariance is in the behavioral response not in the signal listeners use partial, feature-based information, to make lexical decisions: phones are filled in later listeners gain significant advantages from experience with specific talkers highly detailed representations include both linguistic information and non-linguistic information in the representation of lexical items indexical information listeners use both top-down and bottom-up information in interpreting utterances

23 Speech Perception Human perceptual behavior is best modeled by feature-based representation at the lexical level

24 Speculation on ASR Use of phonological features at the acoustic modeling stage provide [14] increased robustness in noise ability to adapt to highly variable inputs Use of phonological features at the lexical level [15] more efficient pronunciation modeling necessary for adaptation to variation in the signal

25 But which features? SPE features (and typical descendents from linguistics) hybrid system: acoustic, articulatory, phonological developed in mid 50s for 2 purposes (Jacobsen, Fant & Halle, 1957) universal system (language independent) of phoneme classification lexical contrasts (aspects of sounds that minimally differentiate words) describing allophonic variation within languages (at the time phoneme variants predicted by phonetic environment), grouping of sounds by patterns of variation (Natural Classes) good for abstract classification tasks - part of language model for grouping words by coarse similarity not particularly realistic model of human speech perception, and probably not the ideal features for ASR feature extraction

26 Conclusions Features play an important role in human speech perception Hold promise for ASR Caution about which features one chooses let the features fit the task

27 References 1. L. C. Nygaard, M. Sommers, and D. B. Pisoni. Effects of stimulus variability on perception and representation of spoken words in memory. Perception and Psychophysics, vol. 57, pp , D. H. Klatt and L. Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers. Journal of the Acoustical Society of America, vol. 87, pp , J. Gonzáles. Formant frequencies and body size of speaker: a weak relationship in adult humans. Journal of Phonetics, 32, , P. Foulkes and G. Docherty. The social life of phonetics and phonology. Journal of Phonetics, in press, C. Hasek,S. Singh, & T. Murry. Acoustic attributes of preadolescent voices. Journal of the Acoustical Society of America, 68, , K. Johnson. Resonance in an exemplar-based lexicon: The emergence of social identity and phonology. Journal of Phonetics, in press, D. Jurafsky, W. Ward, Z. Jianping, K. Herold, Y. Xiuyang, and Z. Sen, What Kind of Pronunciation Variation is Hard for Triphones to Model? Proceedings ICASSP 2001, Salt Lake City, USA, vol. 1, pp , R.E. Remez, J.M. Fellowes and P.E. Rubin. Talker identification based on phonetic information. Journal of Experimental Psychology. Human Perception and Performance, vol 23, no. 3, pp , R. Herman and D. B. Pisoni. Perception of elliptical speech by an adult hearing impaired listener with a cochlear implant: some preliminary findings on coarse-coding in speech perception. Research on Spoken Language Processing: Progress Report, vol. 24. Bloomington, IN: Indiana University, G. Webster and R. Wright, R. Noise, attention and context: some problems for a cue-based approach to speech perception. In N. Niedzielski (Ed.) Speech perception in context: Beyond acoustic pattern matching. New Jersey: LEA. (46 pages), forthcoming. 11. A. Bell. Language style as audience design. Language in Society 13, 2, H.P. Grice. Presupposition and conversational implicature. In Radical Pragmatics, ed. P. Cole, pp New York: Academic Press, Reprinted in Studies in the Ways of Words, ed. H. P. Grice, pp Cambridge, MA: Harvard University Press (1989) 13. A. Bell, D. Jurafsky, E. Fosler-Lussier, C. Girand, M. Gregory, D. Gildea. Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. 14. K. Kirchoff, Robust speech recognition using articulatory features, PhD Thesis, University of Bielefeld, Germany, R. Bates. Speaker Dynamics as a Source of Pronunciation Variability for Continuous Speech Recognition Models, Ph.D. dissertation, University of Washington, Seattle, Washington, USA, R. Jakobson, G. Fant, and M. Halle. Preliminaries to speech analysis. The distinctive features and their correlates. Acoustics Laboratory, Massachusetts Inst. of Technology, Technical Report No. 13. MIT press, seventh edition, 1967.

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and

More information

English Language and Applied Linguistics. Module Descriptions 2017/18

English Language and Applied Linguistics. Module Descriptions 2017/18 English Language and Applied Linguistics Module Descriptions 2017/18 Level I (i.e. 2 nd Yr.) Modules Please be aware that all modules are subject to availability. If you have any questions about the modules,

More information

The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access

The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access The Perception of Nasalized Vowels in American English: An Investigation of On-line Use of Vowel Nasalization in Lexical Access Joyce McDonough 1, Heike Lenhert-LeHouiller 1, Neil Bardhan 2 1 Linguistics

More information

Mandarin Lexical Tone Recognition: The Gating Paradigm

Mandarin Lexical Tone Recognition: The Gating Paradigm Kansas Working Papers in Linguistics, Vol. 0 (008), p. 8 Abstract Mandarin Lexical Tone Recognition: The Gating Paradigm Yuwen Lai and Jie Zhang University of Kansas Research on spoken word recognition

More information

Speech Recognition at ICSI: Broadcast News and beyond

Speech Recognition at ICSI: Broadcast News and beyond Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI

More information

Program Matrix - Reading English 6-12 (DOE Code 398) University of Florida. Reading

Program Matrix - Reading English 6-12 (DOE Code 398) University of Florida. Reading Program Requirements Competency 1: Foundations of Instruction 60 In-service Hours Teachers will develop substantive understanding of six components of reading as a process: comprehension, oral language,

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

SEGMENTAL FEATURES IN SPONTANEOUS AND READ-ALOUD FINNISH

SEGMENTAL FEATURES IN SPONTANEOUS AND READ-ALOUD FINNISH SEGMENTAL FEATURES IN SPONTANEOUS AND READ-ALOUD FINNISH Mietta Lennes Most of the phonetic knowledge that is currently available on spoken Finnish is based on clearly pronounced speech: either readaloud

More information

Rachel E. Baker, Ann R. Bradlow. Northwestern University, Evanston, IL, USA

Rachel E. Baker, Ann R. Bradlow. Northwestern University, Evanston, IL, USA LANGUAGE AND SPEECH, 2009, 52 (4), 391 413 391 Variability in Word Duration as a Function of Probability, Speech Style, and Prosody Rachel E. Baker, Ann R. Bradlow Northwestern University, Evanston, IL,

More information

Quarterly Progress and Status Report. VCV-sequencies in a preliminary text-to-speech system for female speech

Quarterly Progress and Status Report. VCV-sequencies in a preliminary text-to-speech system for female speech Dept. for Speech, Music and Hearing Quarterly Progress and Status Report VCV-sequencies in a preliminary text-to-speech system for female speech Karlsson, I. and Neovius, L. journal: STL-QPSR volume: 35

More information

Universal contrastive analysis as a learning principle in CAPT

Universal contrastive analysis as a learning principle in CAPT Universal contrastive analysis as a learning principle in CAPT Jacques Koreman, Preben Wik, Olaf Husby, Egil Albertsen Department of Language and Communication Studies, NTNU, Trondheim, Norway jacques.koreman@ntnu.no,

More information

Perceptual scaling of voice identity: common dimensions for different vowels and speakers

Perceptual scaling of voice identity: common dimensions for different vowels and speakers DOI 10.1007/s00426-008-0185-z ORIGINAL ARTICLE Perceptual scaling of voice identity: common dimensions for different vowels and speakers Oliver Baumann Æ Pascal Belin Received: 15 February 2008 / Accepted:

More information

Dyslexia/dyslexic, 3, 9, 24, 97, 187, 189, 206, 217, , , 367, , , 397,

Dyslexia/dyslexic, 3, 9, 24, 97, 187, 189, 206, 217, , , 367, , , 397, Adoption studies, 274 275 Alliteration skill, 113, 115, 117 118, 122 123, 128, 136, 138 Alphabetic writing system, 5, 40, 127, 136, 410, 415 Alphabets (types of ) artificial transparent alphabet, 5 German

More information

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers October 31, 2003 Amit Juneja Department of Electrical and Computer Engineering University of Maryland, College Park,

More information

Linguistics. Undergraduate. Departmental Honors. Graduate. Faculty. Linguistics 1

Linguistics. Undergraduate. Departmental Honors. Graduate. Faculty. Linguistics 1 Linguistics 1 Linguistics Matthew Gordon, Chair Interdepartmental Program in the College of Arts and Science 223 Tate Hall (573) 882-6421 gordonmj@missouri.edu Kibby Smith, Advisor Office of Multidisciplinary

More information

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Hua Zhang, Yun Tang, Wenju Liu and Bo Xu National Laboratory of Pattern Recognition Institute of Automation, Chinese

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Speech Communication Session 2aSC: Linking Perception and Production

More information

Phonological and Phonetic Representations: The Case of Neutralization

Phonological and Phonetic Representations: The Case of Neutralization Phonological and Phonetic Representations: The Case of Neutralization Allard Jongman University of Kansas 1. Introduction The present paper focuses on the phenomenon of phonological neutralization to consider

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

Consonants: articulation and transcription

Consonants: articulation and transcription Phonology 1: Handout January 20, 2005 Consonants: articulation and transcription 1 Orientation phonetics [G. Phonetik]: the study of the physical and physiological aspects of human sound production and

More information

Florida Reading Endorsement Alignment Matrix Competency 1

Florida Reading Endorsement Alignment Matrix Competency 1 Florida Reading Endorsement Alignment Matrix Competency 1 Reading Endorsement Guiding Principle: Teachers will understand and teach reading as an ongoing strategic process resulting in students comprehending

More information

Segregation of Unvoiced Speech from Nonspeech Interference

Segregation of Unvoiced Speech from Nonspeech Interference Technical Report OSU-CISRC-8/7-TR63 Department of Computer Science and Engineering The Ohio State University Columbus, OH 4321-1277 FTP site: ftp.cse.ohio-state.edu Login: anonymous Directory: pub/tech-report/27

More information

Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty

Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty Julie Medero and Mari Ostendorf Electrical Engineering Department University of Washington Seattle, WA 98195 USA {jmedero,ostendor}@uw.edu

More information

CEFR Overall Illustrative English Proficiency Scales

CEFR Overall Illustrative English Proficiency Scales CEFR Overall Illustrative English Proficiency s CEFR CEFR OVERALL ORAL PRODUCTION Has a good command of idiomatic expressions and colloquialisms with awareness of connotative levels of meaning. Can convey

More information

Linguistics. The School of Humanities

Linguistics. The School of Humanities Linguistics The School of Humanities Ch a i r Nancy Niedzielski Pr o f e s s o r Masayoshi Shibatani Stephen A. Tyler Professors Emeriti James E. Copeland Philip W. Davis Sydney M. Lamb Associate Professors

More information

SOUND STRUCTURE REPRESENTATION, REPAIR AND WELL-FORMEDNESS: GRAMMAR IN SPOKEN LANGUAGE PRODUCTION. Adam B. Buchwald

SOUND STRUCTURE REPRESENTATION, REPAIR AND WELL-FORMEDNESS: GRAMMAR IN SPOKEN LANGUAGE PRODUCTION. Adam B. Buchwald SOUND STRUCTURE REPRESENTATION, REPAIR AND WELL-FORMEDNESS: GRAMMAR IN SPOKEN LANGUAGE PRODUCTION by Adam B. Buchwald A dissertation submitted to The Johns Hopkins University in conformity with the requirements

More information

Effects of Open-Set and Closed-Set Task Demands on Spoken Word Recognition

Effects of Open-Set and Closed-Set Task Demands on Spoken Word Recognition J Am Acad Audiol 17:331 349 (2006) Effects of Open-Set and Closed-Set Task Demands on Spoken Word Recognition Cynthia G. Clopper* David B. Pisoni Adam T. Tierney Abstract Closed-set tests of spoken word

More information

Journal of Phonetics

Journal of Phonetics Journal of Phonetics 40 (2012) 595 607 Contents lists available at SciVerse ScienceDirect Journal of Phonetics journal homepage: www.elsevier.com/locate/phonetics How linguistic and probabilistic properties

More information

STUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH

STUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH STUDIES WITH FABRICATED SWITCHBOARD DATA: EXPLORING SOURCES OF MODEL-DATA MISMATCH Don McAllaster, Larry Gillick, Francesco Scattone, Mike Newman Dragon Systems, Inc. 320 Nevada Street Newton, MA 02160

More information

An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English

An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English Linguistic Portfolios Volume 6 Article 10 2017 An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English Cassy Lundy St. Cloud State University, casey.lundy@gmail.com

More information

Corpus Linguistics (L615)

Corpus Linguistics (L615) (L615) Basics of Markus Dickinson Department of, Indiana University Spring 2013 1 / 23 : the extent to which a sample includes the full range of variability in a population distinguishes corpora from archives

More information

Quarterly Progress and Status Report. Voiced-voiceless distinction in alaryngeal speech - acoustic and articula

Quarterly Progress and Status Report. Voiced-voiceless distinction in alaryngeal speech - acoustic and articula Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Voiced-voiceless distinction in alaryngeal speech - acoustic and articula Nord, L. and Hammarberg, B. and Lundström, E. journal:

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all

1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all Human Communication Science Chandler House, 2 Wakefield Street London WC1N 1PF http://www.hcs.ucl.ac.uk/ ACOUSTICS OF SPEECH INTELLIGIBILITY IN DYSARTHRIA EUROPEAN MASTER S S IN CLINICAL LINGUISTICS UNIVERSITY

More information

Perceived speech rate: the effects of. articulation rate and speaking style in spontaneous speech. Jacques Koreman. Saarland University

Perceived speech rate: the effects of. articulation rate and speaking style in spontaneous speech. Jacques Koreman. Saarland University 1 Perceived speech rate: the effects of articulation rate and speaking style in spontaneous speech Jacques Koreman Saarland University Institute of Phonetics P.O. Box 151150 D-66041 Saarbrücken Germany

More information

Rhythm-typology revisited.

Rhythm-typology revisited. DFG Project BA 737/1: "Cross-language and individual differences in the production and perception of syllabic prominence. Rhythm-typology revisited." Rhythm-typology revisited. B. Andreeva & W. Barry Jacques

More information

Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology

Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano. Graduate School of Information Science, Nara Institute of Science & Technology ISCA Archive SUBJECTIVE EVALUATION FOR HMM-BASED SPEECH-TO-LIP MOVEMENT SYNTHESIS Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano Graduate School of Information Science, Nara Institute of Science & Technology

More information

Review in ICAME Journal, Volume 38, 2014, DOI: /icame

Review in ICAME Journal, Volume 38, 2014, DOI: /icame Review in ICAME Journal, Volume 38, 2014, DOI: 10.2478/icame-2014-0012 Gaëtanelle Gilquin and Sylvie De Cock (eds.). Errors and disfluencies in spoken corpora. Amsterdam: John Benjamins. 2013. 172 pp.

More information

CLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction

CLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction CLASSIFICATION OF PROGRAM Critical Elements Analysis 1 Program Name: Macmillan/McGraw Hill Reading 2003 Date of Publication: 2003 Publisher: Macmillan/McGraw Hill Reviewer Code: 1. X The program meets

More information

A Cross-language Corpus for Studying the Phonetics and Phonology of Prominence

A Cross-language Corpus for Studying the Phonetics and Phonology of Prominence A Cross-language Corpus for Studying the Phonetics and Phonology of Prominence Bistra Andreeva 1, William Barry 1, Jacques Koreman 2 1 Saarland University Germany 2 Norwegian University of Science and

More information

Improved Effects of Word-Retrieval Treatments Subsequent to Addition of the Orthographic Form

Improved Effects of Word-Retrieval Treatments Subsequent to Addition of the Orthographic Form Orthographic Form 1 Improved Effects of Word-Retrieval Treatments Subsequent to Addition of the Orthographic Form The development and testing of word-retrieval treatments for aphasia has generally focused

More information

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines

Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines Amit Juneja and Carol Espy-Wilson Department of Electrical and Computer Engineering University of Maryland,

More information

age, Speech and Hearii

age, Speech and Hearii age, Speech and Hearii 1 Speech Commun cation tion 2 Sensory Comm, ection i 298 RLE Progress Report Number 132 Section 1 Speech Communication Chapter 1 Speech Communication 299 300 RLE Progress Report

More information

The Up corpus: A corpus of speech samples across adulthood

The Up corpus: A corpus of speech samples across adulthood The Up corpus: A corpus of speech samples across adulthood Susanne Gahl, Emily Cibelli, Kathleen Hall, Ronald Sprouse Affiliation: Linguistics Department University of California at Berkeley 1203 Dwinelle

More information

THE PERCEPTION AND PRODUCTION OF STRESS AND INTONATION BY CHILDREN WITH COCHLEAR IMPLANTS

THE PERCEPTION AND PRODUCTION OF STRESS AND INTONATION BY CHILDREN WITH COCHLEAR IMPLANTS THE PERCEPTION AND PRODUCTION OF STRESS AND INTONATION BY CHILDREN WITH COCHLEAR IMPLANTS ROSEMARY O HALPIN University College London Department of Phonetics & Linguistics A dissertation submitted to the

More information

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification

Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Class-Discriminative Weighted Distortion Measure for VQ-Based Speaker Identification Tomi Kinnunen and Ismo Kärkkäinen University of Joensuu, Department of Computer Science, P.O. Box 111, 80101 JOENSUU,

More information

Phonological Processing for Urdu Text to Speech System

Phonological Processing for Urdu Text to Speech System Phonological Processing for Urdu Text to Speech System Sarmad Hussain Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences, B Block, Faisal Town, Lahore,

More information

ACCREDITATION STANDARDS

ACCREDITATION STANDARDS ACCREDITATION STANDARDS Description of the Profession Interpretation is the art and science of receiving a message from one language and rendering it into another. It involves the appropriate transfer

More information

PHONETIC DISTANCE BASED ACCENT CLASSIFIER TO IDENTIFY PRONUNCIATION VARIANTS AND OOV WORDS

PHONETIC DISTANCE BASED ACCENT CLASSIFIER TO IDENTIFY PRONUNCIATION VARIANTS AND OOV WORDS PHONETIC DISTANCE BASED ACCENT CLASSIFIER TO IDENTIFY PRONUNCIATION VARIANTS AND OOV WORDS Akella Amarendra Babu 1 *, Ramadevi Yellasiri 2 and Akepogu Ananda Rao 3 1 JNIAS, JNT University Anantapur, Ananthapuramu,

More information

Speech Emotion Recognition Using Support Vector Machine

Speech Emotion Recognition Using Support Vector Machine Speech Emotion Recognition Using Support Vector Machine Yixiong Pan, Peipei Shen and Liping Shen Department of Computer Technology Shanghai JiaoTong University, Shanghai, China panyixiong@sjtu.edu.cn,

More information

Demonstration of problems of lexical stress on the pronunciation Turkish English teachers and teacher trainees by computer

Demonstration of problems of lexical stress on the pronunciation Turkish English teachers and teacher trainees by computer Available online at www.sciencedirect.com Procedia - Social and Behavioral Sciences 46 ( 2012 ) 3011 3016 WCES 2012 Demonstration of problems of lexical stress on the pronunciation Turkish English teachers

More information

Arabic Orthography vs. Arabic OCR

Arabic Orthography vs. Arabic OCR Arabic Orthography vs. Arabic OCR Rich Heritage Challenging A Much Needed Technology Mohamed Attia Having consistently been spoken since more than 2000 years and on, Arabic is doubtlessly the oldest among

More information

Multi-Lingual Text Leveling

Multi-Lingual Text Leveling Multi-Lingual Text Leveling Salim Roukos, Jerome Quin, and Todd Ward IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 {roukos,jlquinn,tward}@us.ibm.com Abstract. Determining the language proficiency

More information

THE INFLUENCE OF TASK DEMANDS ON FAMILIARITY EFFECTS IN VISUAL WORD RECOGNITION: A COHORT MODEL PERSPECTIVE DISSERTATION

THE INFLUENCE OF TASK DEMANDS ON FAMILIARITY EFFECTS IN VISUAL WORD RECOGNITION: A COHORT MODEL PERSPECTIVE DISSERTATION THE INFLUENCE OF TASK DEMANDS ON FAMILIARITY EFFECTS IN VISUAL WORD RECOGNITION: A COHORT MODEL PERSPECTIVE DISSERTATION Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

More information

The pronunciation of /7i/ by male and female speakers of avant-garde Dutch

The pronunciation of /7i/ by male and female speakers of avant-garde Dutch The pronunciation of /7i/ by male and female speakers of avant-garde Dutch Vincent J. van Heuven, Loulou Edelman and Renée van Bezooijen Leiden University/ ULCL (van Heuven) / University of Nijmegen/ CLS

More information

AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC PP. VI, 282)

AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC PP. VI, 282) B. PALTRIDGE, DISCOURSE ANALYSIS: AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC. 2012. PP. VI, 282) Review by Glenda Shopen _ This book is a revised edition of the author s 2006 introductory

More information

Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence

Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence INTERSPEECH September,, San Francisco, USA Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence Bidisha Sharma and S. R. Mahadeva Prasanna Department of Electronics

More information

The Effect of Discourse Markers on the Speaking Production of EFL Students. Iman Moradimanesh

The Effect of Discourse Markers on the Speaking Production of EFL Students. Iman Moradimanesh The Effect of Discourse Markers on the Speaking Production of EFL Students Iman Moradimanesh Abstract The research aimed at investigating the relationship between discourse markers (DMs) and a special

More information

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.

More information

Lecture Notes in Artificial Intelligence 4343

Lecture Notes in Artificial Intelligence 4343 Lecture Notes in Artificial Intelligence 4343 Edited by J. G. Carbonell and J. Siekmann Subseries of Lecture Notes in Computer Science Christian Müller (Ed.) Speaker Classification I Fundamentals, Features,

More information

The Common European Framework of Reference for Languages p. 58 to p. 82

The Common European Framework of Reference for Languages p. 58 to p. 82 The Common European Framework of Reference for Languages p. 58 to p. 82 -- Chapter 4 Language use and language user/learner in 4.1 «Communicative language activities and strategies» -- Oral Production

More information

Learners Use Word-Level Statistics in Phonetic Category Acquisition

Learners Use Word-Level Statistics in Phonetic Category Acquisition Learners Use Word-Level Statistics in Phonetic Category Acquisition Naomi Feldman, Emily Myers, Katherine White, Thomas Griffiths, and James Morgan 1. Introduction * One of the first challenges that language

More information

Role of Pausing in Text-to-Speech Synthesis for Simultaneous Interpretation

Role of Pausing in Text-to-Speech Synthesis for Simultaneous Interpretation Role of Pausing in Text-to-Speech Synthesis for Simultaneous Interpretation Vivek Kumar Rangarajan Sridhar, John Chen, Srinivas Bangalore, Alistair Conkie AT&T abs - Research 180 Park Avenue, Florham Park,

More information

The Journey to Vowelerria VOWEL ERRORS: THE LOST WORLD OF SPEECH INTERVENTION. Preparation: Education. Preparation: Education. Preparation: Education

The Journey to Vowelerria VOWEL ERRORS: THE LOST WORLD OF SPEECH INTERVENTION. Preparation: Education. Preparation: Education. Preparation: Education VOWEL ERRORS: THE LOST WORLD OF SPEECH INTERVENTION The Journey to Vowelerria An adventure across familiar territory child speech intervention leading to uncommon terrain vowel errors, Ph.D., CCC-SLP 03-15-14

More information

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier

Analysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion

More information

Dialog Act Classification Using N-Gram Algorithms

Dialog Act Classification Using N-Gram Algorithms Dialog Act Classification Using N-Gram Algorithms Max Louwerse and Scott Crossley Institute for Intelligent Systems University of Memphis {max, scrossley } @ mail.psyc.memphis.edu Abstract Speech act classification

More information

COMPUTER INTERFACES FOR TEACHING THE NINTENDO GENERATION

COMPUTER INTERFACES FOR TEACHING THE NINTENDO GENERATION Session 3532 COMPUTER INTERFACES FOR TEACHING THE NINTENDO GENERATION Thad B. Welch, Brian Jenkins Department of Electrical Engineering U.S. Naval Academy, MD Cameron H. G. Wright Department of Electrical

More information

REVIEW OF CONNECTED SPEECH

REVIEW OF CONNECTED SPEECH Language Learning & Technology http://llt.msu.edu/vol8num1/review2/ January 2004, Volume 8, Number 1 pp. 24-28 REVIEW OF CONNECTED SPEECH Title Connected Speech (North American English), 2000 Platform

More information

MASN: 1 How would you define pragmatics today? How is it different from traditional Greek rhetorics? What are its basic tenets?

MASN: 1 How would you define pragmatics today? How is it different from traditional Greek rhetorics? What are its basic tenets? International Journal of Language Studies Volume 9, Number 3, July 2015, pp. **-** Pragmatics: The state of the art (An online interview with Keith Allan) Keith ALLAN, Monash University, Australia M. A.

More information

Eyebrows in French talk-in-interaction

Eyebrows in French talk-in-interaction Eyebrows in French talk-in-interaction Aurélie Goujon 1, Roxane Bertrand 1, Marion Tellier 1 1 Aix Marseille Université, CNRS, LPL UMR 7309, 13100, Aix-en-Provence, France Goujon.aurelie@gmail.com Roxane.bertrand@lpl-aix.fr

More information

Stages of Literacy Ros Lugg

Stages of Literacy Ros Lugg Beginning readers in the USA Stages of Literacy Ros Lugg Looked at predictors of reading success or failure Pre-readers readers aged 3-53 5 yrs Looked at variety of abilities IQ Speech and language abilities

More information

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words, A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994

More information

On the Formation of Phoneme Categories in DNN Acoustic Models

On the Formation of Phoneme Categories in DNN Acoustic Models On the Formation of Phoneme Categories in DNN Acoustic Models Tasha Nagamine Department of Electrical Engineering, Columbia University T. Nagamine Motivation Large performance gap between humans and state-

More information

A study of speaker adaptation for DNN-based speech synthesis

A study of speaker adaptation for DNN-based speech synthesis A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,

More information

Process Evaluations for a Multisite Nutrition Education Program

Process Evaluations for a Multisite Nutrition Education Program Process Evaluations for a Multisite Nutrition Education Program Paul Branscum 1 and Gail Kaye 2 1 The University of Oklahoma 2 The Ohio State University Abstract Process evaluations are an often-overlooked

More information

Language Acquisition by Identical vs. Fraternal SLI Twins * Karin Stromswold & Jay I. Rifkin

Language Acquisition by Identical vs. Fraternal SLI Twins * Karin Stromswold & Jay I. Rifkin Stromswold & Rifkin, Language Acquisition by MZ & DZ SLI Twins (SRCLD, 1996) 1 Language Acquisition by Identical vs. Fraternal SLI Twins * Karin Stromswold & Jay I. Rifkin Dept. of Psychology & Ctr. for

More information

Teaching ideas. AS and A-level English Language Spark their imaginations this year

Teaching ideas. AS and A-level English Language Spark their imaginations this year Teaching ideas AS and A-level English Language Spark their imaginations this year We ve put together this handy set of teaching ideas so you can explore new ways to engage your AS and A-level English Language

More information

Linguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis

Linguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis International Journal of Arts Humanities and Social Sciences (IJAHSS) Volume 1 Issue 1 ǁ August 216. www.ijahss.com Linguistic Variation across Sports Category of Press Reportage from British Newspapers:

More information

Letter-based speech synthesis

Letter-based speech synthesis Letter-based speech synthesis Oliver Watts, Junichi Yamagishi, Simon King Centre for Speech Technology Research, University of Edinburgh, UK O.S.Watts@sms.ed.ac.uk jyamagis@inf.ed.ac.uk Simon.King@ed.ac.uk

More information

UC Berkeley Dissertations, Department of Linguistics

UC Berkeley Dissertations, Department of Linguistics UC Berkeley Dissertations, Department of Linguistics Title Phonetic and Social Selectivity in Speech Accommodation Permalink https://escholarship.org/uc/item/1mb4n1mv Author Babel, Molly Publication Date

More information

Revisiting the role of prosody in early language acquisition. Megha Sundara UCLA Phonetics Lab

Revisiting the role of prosody in early language acquisition. Megha Sundara UCLA Phonetics Lab Revisiting the role of prosody in early language acquisition Megha Sundara UCLA Phonetics Lab Outline Part I: Intonation has a role in language discrimination Part II: Do English-learning infants have

More information

Reading Horizons. A Look At Linguistic Readers. Nicholas P. Criscuolo APRIL Volume 10, Issue Article 5

Reading Horizons. A Look At Linguistic Readers. Nicholas P. Criscuolo APRIL Volume 10, Issue Article 5 Reading Horizons Volume 10, Issue 3 1970 Article 5 APRIL 1970 A Look At Linguistic Readers Nicholas P. Criscuolo New Haven, Connecticut Public Schools Copyright c 1970 by the authors. Reading Horizons

More information

Minimalism is the name of the predominant approach in generative linguistics today. It was first

Minimalism is the name of the predominant approach in generative linguistics today. It was first Minimalism Minimalism is the name of the predominant approach in generative linguistics today. It was first introduced by Chomsky in his work The Minimalist Program (1995) and has seen several developments

More information

Acoustic correlates of stress and their use in diagnosing syllable fusion in Tongan. James White & Marc Garellek UCLA

Acoustic correlates of stress and their use in diagnosing syllable fusion in Tongan. James White & Marc Garellek UCLA Acoustic correlates of stress and their use in diagnosing syllable fusion in Tongan James White & Marc Garellek UCLA 1 Introduction Goals: To determine the acoustic correlates of primary and secondary

More information

Voice conversion through vector quantization

Voice conversion through vector quantization J. Acoust. Soc. Jpn.(E)11, 2 (1990) Voice conversion through vector quantization Masanobu Abe, Satoshi Nakamura, Kiyohiro Shikano, and Hisao Kuwabara A TR Interpreting Telephony Research Laboratories,

More information

A Case Study: News Classification Based on Term Frequency

A Case Study: News Classification Based on Term Frequency A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center

More information

One major theoretical issue of interest in both developing and

One major theoretical issue of interest in both developing and Developmental Changes in the Effects of Utterance Length and Complexity on Speech Movement Variability Neeraja Sadagopan Anne Smith Purdue University, West Lafayette, IN Purpose: The authors examined the

More information

Curriculum Vitae. Sara C. Steele, Ph.D, CCC-SLP 253 McGannon Hall 3750 Lindell Blvd., St. Louis, MO Tel:

Curriculum Vitae. Sara C. Steele, Ph.D, CCC-SLP 253 McGannon Hall 3750 Lindell Blvd., St. Louis, MO Tel: Curriculum Vitae Sara C. Steele, Ph.D, CCC-SLP 253 McGannon Hall 3750 Lindell Blvd., St. Louis, MO 63108 Tel: 314-977-2941 ssteele1@slu.edu Education Ph.D., Speech and Hearing Science, University of Illinois

More information

Approaches to Teaching Second Language Writing Brian PALTRIDGE, The University of Sydney

Approaches to Teaching Second Language Writing Brian PALTRIDGE, The University of Sydney Approaches to Teaching Second Language Writing Brian PALTRIDGE, The University of Sydney This paper presents a discussion of developments in the teaching of writing. This includes a discussion of genre-based

More information

Phonological encoding in speech production

Phonological encoding in speech production Phonological encoding in speech production Niels O. Schiller Department of Cognitive Neuroscience, Maastricht University, The Netherlands Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands

More information

Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition

Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Yanzhang He, Eric Fosler-Lussier Department of Computer Science and Engineering The hio

More information

Noise-Adaptive Perceptual Weighting in the AMR-WB Encoder for Increased Speech Loudness in Adverse Far-End Noise Conditions

Noise-Adaptive Perceptual Weighting in the AMR-WB Encoder for Increased Speech Loudness in Adverse Far-End Noise Conditions 26 24th European Signal Processing Conference (EUSIPCO) Noise-Adaptive Perceptual Weighting in the AMR-WB Encoder for Increased Speech Loudness in Adverse Far-End Noise Conditions Emma Jokinen Department

More information

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,

More information

LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES. Judith Gaspers and Philipp Cimiano

LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES. Judith Gaspers and Philipp Cimiano LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES Judith Gaspers and Philipp Cimiano Semantic Computing Group, CITEC, Bielefeld University {jgaspers cimiano}@cit-ec.uni-bielefeld.de ABSTRACT Semantic parsers

More information

Handbook for Graduate Students in TESL and Applied Linguistics Programs

Handbook for Graduate Students in TESL and Applied Linguistics Programs Handbook for Graduate Students in TESL and Applied Linguistics Programs Section A Section B Section C Section D M.A. in Teaching English as a Second Language (MA-TESL) Ph.D. in Applied Linguistics (PhD

More information

LINGUISTICS. Learning Outcomes (Graduate) Learning Outcomes (Undergraduate) Graduate Programs in Linguistics. Bachelor of Arts in Linguistics

LINGUISTICS. Learning Outcomes (Graduate) Learning Outcomes (Undergraduate) Graduate Programs in Linguistics. Bachelor of Arts in Linguistics Stanford University 1 LINGUISTICS Courses offered by the Department of Linguistics are listed under the subject code LINGUIST on the Stanford Bulletin's ExploreCourses web site. Linguistics is the study

More information

Beginning primarily with the investigations of Zimmermann (1980a),

Beginning primarily with the investigations of Zimmermann (1980a), Orofacial Movements Associated With Fluent Speech in Persons Who Stutter Michael D. McClean Walter Reed Army Medical Center, Washington, D.C. Stephen M. Tasko Western Michigan University, Kalamazoo, MI

More information

Human Factors Engineering Design and Evaluation Checklist

Human Factors Engineering Design and Evaluation Checklist Revised April 9, 2007 Human Factors Engineering Design and Evaluation Checklist Design of: Evaluation of: Human Factors Engineer: Date: Revised April 9, 2007 Created by Jon Mast 2 Notes: This checklist

More information

Problems of the Arabic OCR: New Attitudes

Problems of the Arabic OCR: New Attitudes Problems of the Arabic OCR: New Attitudes Prof. O.Redkin, Dr. O.Bernikova Department of Asian and African Studies, St. Petersburg State University, St Petersburg, Russia Abstract - This paper reviews existing

More information