Transcription of Vol.1.3: The Physiology of Speech

[Slide: 1] Holistic Emotive Practices Vol. 1 Part 3: The Physiology of Speech
Hello. This is Brian McPherson. Welcome to Part 3 of Holistic Emotive Practices, Volume 1. In this video I will discuss the physiology of individual speech sounds, called phonemes. To understand the scientific evidence for connecting individual speech sounds to an emotional value, you have to be familiar with the physiology of speech production, since physiology determines the emotional significance of a phoneme.

[Slide: 2] Three Universal Vowels: /â/ as in father, /û/ as in moon, and /î/ as in see.
One of the most important concepts concerning speech sounds is the existence of three primary vowel sounds. These three vowels, /â/ as in father, /û/ as in moon, and /î/ as in see, are found in every language. Some languages have only these three vowels. The sounds may vary slightly from one language to another; for example, the exact quality of the /â/ or the /û/ may shift because of slight variations in the rounding of the lips.

[Slide: 3] Vowel Production Space (Figure with 3 primary vowels)
However, these three vowels hold the corner positions in a more or less triangular vowel production space for every language. Linguists define vowel production space by the position of the jaw and the relative position of the lips. To produce an /â/ the jaw is lower than it is for saying any other vowel sound. To produce an /û/ the lips must protrude farther than for any other vowel. The muscle action to do this keeps the jaw drawn up. An /î/ requires the lips to be drawn back and the jaw kept high.

[Slide: 3a] (adds ae and o to figure)
Other vowels, such as the ae in say and the o in slow, fit somewhere in the middle of the vowel production space, with the jaw not so low and the lips not so far out or back. For these reasons I consider the /â/, /û/, and /î/ primary vowels and all other vowels secondary or non-primary vowels.

[Slide: 4] Vowel Perception Space (Figure with 3 primary vowels)
The three primary vowels also hold the corner positions in a triangular vowel perception space. Linguists use the first and second formants of vowel sounds to define vowel perception space. Formants are bands of frequencies associated with vowels. When our vocal cords vibrate they produce a number of frequencies. Not all of the frequencies escape the mouth. The mouth blocks some and passes others through. The shape of the mouth determines which frequencies, or formants, pass through. In essence, the mouth acts like a filter, allowing certain frequencies to pass and blocking others. Those frequencies which escape the mouth and are heard come in bands which are called formants. We can identify a vowel sound by its first and second formants.
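
The idea of identifying a vowel by its first and second formants can be sketched in a few lines of Python. This is only an illustration, not something taken from the talk: the formant values are approximate, commonly cited averages for adult male speakers of American English and should be treated as assumptions, and the names `PRIMARY_VOWELS` and `identify_vowel` are hypothetical.

```python
import math

# Approximate (F1, F2) values in Hz for the three primary vowels.
# Illustrative assumptions only; the talk itself gives no numbers.
PRIMARY_VOWELS = {
    "â (father)": (730.0, 1090.0),
    "û (moon)": (300.0, 870.0),
    "î (see)": (270.0, 2290.0),
}


def identify_vowel(f1, f2):
    """Return the primary vowel whose (F1, F2) point is closest to the input."""
    return min(
        PRIMARY_VOWELS,
        key=lambda vowel: math.dist((f1, f2), PRIMARY_VOWELS[vowel]),
    )


# A measurement with a high first formant and a low second formant
# lands nearest the /â/ corner.
print(identify_vowel(700, 1150))  # prints: â (father)
```

A real system would compare formants on a perceptual frequency scale and adjust for differences between speakers; plain distance in Hz is used here only to keep the sketch short.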

If you plot the primary vowels in a space defined by first and second formants, the three points form a triangle.

[Slide: 4a] (adds ae and o to figure)
When the first and second formants of other vowels are plotted, they fall somewhere in the middle of the vowel perception space, just as in vowel production space. This strong correlation between vowel production space and vowel perception space arises because mouth shape determines both.

[Slide: 5] Consonants result from constricting part of the vocal tract.
We form vowels with an open, unobstructed vocal tract. We constrict the vocal tract to create consonants. I will examine consonants resulting from vocal tract constrictions based on the individual physiological components that make up our speech articulators. The primary articulators include the tongue, lips, and jaw. Each of these three components gets paired with a primary vowel. Two of these pairings have clear connections. The jaw lines up with the /â/, since to say the /â/ the jaw assumes a lower position than for any other vowel. The lips pair with the /û/ because of the position of the lips when we utter an /û/. That leaves one major articulator and one primary vowel and gives us our third pairing: the /î/ with the tongue.

[Slide: 6] Consonant formed by constricting the jaw from an /â/ position: /r/
If you constrict the vocal tract by raising the jaw from an /â/ position, you can create an /r/ sound. You could also hear an /l/ sound if you raise the jaw and keep the tongue from rising quite as far, in other words, let it flatten. In fact, the /r/ and the /l/ are so close in their acoustic parameters that some languages do not distinguish between them. However, to make an /l/ the tongue has to flatten somewhat compared to the /r/. So if we want to consider consonants created only through constricting the jaw, the only one is /r/.

[Slide: 7] Consonants formed by constricting the lips from a /û/ position: /b/, /p/, /m/, /w/.
We say several consonants using only the lips to constrict the vocal tract from an /û/ position. The nasal /m/, the sound for the letter m, is one of the most common consonants. It is found in almost every language. You form the /m/ sound by simply closing the lips and keeping them closed while you release air through the nose and vibrate the vocal cords. Two stop consonants, the /p/ and /b/, are also formed with the lips. To make each of these sounds the lips close momentarily before reopening. The /w/ sound is the final sound formed with the lips. To make the /w/ sound you must constrict the lips to form a small opening through which the air rushes.

[Slide: 8] Types of consonants formed with the tongue: Stops, Nasals, Fricatives, & Glides
More consonants result from tongue constriction than from all other constrictions combined. We can classify the consonants formed using the tongue into four main categories: stops, nasals, fricatives, and glides. We will examine each of these groups.

[Slide: 9] Stop consonants made with the tongue (includes chart with point of articulation and voicing for 6 stop consonants: alveolar d, t; palatal j, ch; velar k, g)
To make stop consonants the tongue momentarily stops the flow of air before releasing its position and letting air past. We distinguish these stops on two parameters: point of articulation and voicing. The point of articulation refers to the particular spot where you place the tongue in order to stop the air. If the tongue stops the airflow by touching the alveolar ridge, a point just above the upper teeth, you say either a /t/ or /d/. When the tongue touches the roof of the mouth, the palate, we can speak a /ch/ or /j/. Raising the back of the tongue to the velum yields either a /k/ or /g/. As you probably noticed, for each point of articulation we can say a pair of consonants. In each of these pairs one consonant is voiced and the other unvoiced. A voiced consonant results whenever the vocal cords begin to vibrate before the blockage of air is released by the tongue. If the blockage is removed prior to the vibration of the vocal cords, then the consonant is said to be unvoiced. The voiced stops formed by the tongue include /d/, /j/, and /g/. The unvoiced are the other three, the /t/, /ch/, and /k/.

[Slide: 10] Nasal Consonants made with the tongue: /n/, /ng/
English has three nasal consonants, sounds in which the air flows through the nose. We already mentioned one, the /m/, which is formed by constricting the lips. The other two result from tongue constrictions. To make the /n/ the tongue presses against the roof of the mouth.
The /ng/ results from raising the back of the tongue against the velum.

[Slide: 11] Fricative consonants made with the tongue (includes chart with point of articulation and voicing for 4 fricatives: alveolar s, z; palatal sh, zh)
Fricative consonants result when we force air through a narrow channel. We classify fricatives formed using the tongue in the same manner that we classify stop consonants, that is, by point of articulation and by voicing. Using this system we classify /s/ as alveolar and unvoiced, whereas the /z/ is alveolar and voiced. The /sh/ sound (as in sheep) is unvoiced and palatal, and the /zh/ sound (as in measure) is voiced and palatal.

[Slide: 12] Glide Consonants Made with the Tongue (palatal y; lateral l)
We have only two glide consonants formed with the tongue. To say the /y/ sound the middle of the tongue rises toward the roof of the mouth. The /l/ is considered a lateral glide because air flows around the side of the tongue.
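
The two-parameter classification used above for tongue stops and fricatives (point of articulation and voicing) can be written down as a small lookup table. The sketch below is a hypothetical illustration rather than part of the talk: the names `TONGUE_CONSONANTS` and `counterpart` are made up, and `zh` is an assumed label for the voiced palatal fricative heard in "measure".

```python
# Tongue stops and fricatives, keyed by (manner, place, voicing).
# Entries follow the charts described for slides 9 and 11; "zh" is an
# assumed label for the voiced palatal fricative (as in "measure").
TONGUE_CONSONANTS = {
    ("stop", "alveolar", "voiced"): "d",
    ("stop", "alveolar", "unvoiced"): "t",
    ("stop", "palatal", "voiced"): "j",
    ("stop", "palatal", "unvoiced"): "ch",
    ("stop", "velar", "voiced"): "g",
    ("stop", "velar", "unvoiced"): "k",
    ("fricative", "alveolar", "voiced"): "z",
    ("fricative", "alveolar", "unvoiced"): "s",
    ("fricative", "palatal", "voiced"): "zh",
    ("fricative", "palatal", "unvoiced"): "sh",
}


def counterpart(symbol):
    """Return the consonant with the same manner and place but opposite voicing."""
    for (manner, place, voicing), current in TONGUE_CONSONANTS.items():
        if current == symbol:
            other = "unvoiced" if voicing == "voiced" else "voiced"
            return TONGUE_CONSONANTS[(manner, place, other)]
    raise KeyError(symbol)


print(counterpart("t"))   # prints: d
print(counterpart("sh"))  # prints: zh
```

Keying on all three features keeps each voiced and unvoiced pair one lookup apart, which mirrors how the charts group the sounds.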

[Slide: 13] Consonants Employing More Than One Articulator (Tongue & Teeth voiced th, unvoiced th; Lips and teeth voiced v; unvoiced f)
Some consonants employ more than one articulator. The tongue touches the teeth for two fricatives, the voiced /th/ (as in they) and the unvoiced /th/ (as in think). The upper teeth touch the lower lip for two fricatives, the voiced /v/ and the unvoiced /f/. Finally, the /q/ sound (as in quick) involves both the lips and the tongue. For this sound the lips round slightly as the tongue rises to the velum to stop the airflow.

[Slide: 14] Guttural Consonants (Pharyngeal voiced stop gh; unvoiced stop kh; unvoiced fricative h; Glottal voiced stop or ayin; unvoiced fricative h)
In English we have only one sound considered guttural. A guttural sound is one formed by a constriction in the back of the throat, before the air reaches the mouth cavity. In English we form the /h/ by constricting the throat at the top of the vocal cords, the glottis. Other languages have a number of other guttural consonants. Because some of these prove useful in HEP, I need to introduce them at this time. Another guttural glottal consonant of interest is called the ayin. It is a voiced stop consonant formed by stopping the airflow with the glottis before releasing it as you start to vibrate the vocal cords. It sounds like this / ae/. Three guttural consonants of interest are considered pharyngeal because they stop or constrict the airflow with the pharynx, which is located in the back of the throat, slightly above the vocal cords. One of these three is a fricative that I represent with an underlined letter h. It is similar to the English h in sound, but is harsher and usually louder than the English h. It sounds somewhat like an attempt to clear the throat, like this, /h/. The other two pharyngeal sounds are stop consonants. Like other pairs of stop consonants that share a point of articulation, one of these is voiced and the other unvoiced.
The unvoiced one is represented by the letters k and h. This is a single phoneme; however, it does sound somewhat like a combination of the English k and h, like this, /kh/. The voiced pharyngeal stop consonant, represented by g and h, sounds like a combination of the English g and h, like this, /gh/. The air is stopped in both of these sounds by the back of the tongue pushing the airway shut at the pharynx.

[Slide: 15] photos by Brian McPherson
This concludes the discussion on the physiology of phonemes. The next presentation in this series puts together the information from this and the previous talk to arrive at the heart of HEP. It makes the connections between specific speech sounds and the physiological dimensions of emotions. That's it for now. Thanks for listening.
