Phrase-final creak: Articulation, acoustics, and distribution
Marc Garellek, UC San Diego
Patricia Keating, UCLA


Prototypical creaky voice
- Low fundamental frequency (F0)
- Irregular F0
- Vocal folds are mostly closed: glottis is constricted
- Low airflow through the glottis
- More energy in higher-frequency harmonics
- Creaky voice is common in phrase-final position
Catford 1966, Laver 1980, Kreiman 1982, Klatt & Klatt 1990, Gordon & Ladefoged 2001

Phrase-final creak

Goals of this study
1. Which phonological/phonetic factors favor the occurrence of phrase-final creak?
2. On what acoustic measures do phrase-final vowels with creaky voice differ from phrase-final vowels without?
3. On what acoustic measures do phrase-final vowels with creaky voice differ from initial vowels with creaky voice?

Factors favoring occurrence
- Incidence of phrase-final creak varies with the kind of phrase: the larger the phrase type, the more final creak
- We compare 3 levels of phrasing:
  - Utterance (Break Index (BI) 5)
  - Full Intonational Phrase (BI 4)
  - Intermediate Intonational Phrase (BI 3)
- Requires a prosodically rich corpus
Price et al. 1991, Redi & Shattuck-Hufnagel 2001

Study 1: BU Radio News Corpus
- Four English speakers (2 F, 2 M)
- Last vowels in phrase-final words (>100 ms of voicing) were extracted: 2086 tokens
- Break indices (3, 4, 5) were extracted
- Vowels were binary-coded for presence/absence of creaky voice
  - Creaky = percept of creak + presence of F0 irregularity and/or complete damping of pulses
Ostendorf et al. 1995
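
The inclusion criterion and the coding rule on this slide can be written as two small predicates. This is only an illustrative sketch: the `Token` class, its field names, and both helper functions are hypothetical. In the study the percept and the visual criteria were presumably judged by human coders, not computed.

```python
from dataclasses import dataclass

MIN_VOICING_MS = 100  # slide: vowels needed more than 100 ms of voicing


@dataclass
class Token:
    """One phrase-final vowel (hypothetical record layout)."""
    voicing_ms: float      # duration of voicing in the vowel
    break_index: int       # break index after the word: 3, 4, or 5
    heard_as_creaky: bool  # auditory percept of creak
    irregular_f0: bool     # F0 irregularity present
    damped_pulses: bool    # complete damping of glottal pulses present


def is_eligible(tok: Token) -> bool:
    """Keep vowels with enough voicing that precede a BI 3, 4, or 5 break."""
    return tok.voicing_ms > MIN_VOICING_MS and tok.break_index in (3, 4, 5)


def is_creaky(tok: Token) -> bool:
    """Creaky = percept of creak AND (F0 irregularity and/or damped pulses)."""
    return tok.heard_as_creaky and (tok.irregular_f0 or tok.damped_pulses)
```

Note that a percept of creak alone is not enough under this rule: at least one of the two signal-based criteria must also hold.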

News Corpus: Factors tested
- Break index
- Presence of pause (and pause length in ms)
- Distance of target phrase from end of Utterance (in number of syllables, phrases)
- Number of words in target phrase
- Duration of phrase (ms)
- Duration from end of phrase to following pitch accent
- Presence of final coda stop
- Fundamental frequency (F0, in Hz; mean over vowel)

News Corpus: Analysis
- Logistic mixed-effects regression modeling presence of creak as a function of coded factors
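
For readers unfamiliar with this model family, one generic form of a logistic mixed-effects model is sketched below. The slide does not give the model specification, so the by-speaker random intercept $u_j$ and the symbols are assumptions for illustration: $i$ indexes tokens, $j$ indexes speakers, and $x_{k,ij}$ is the value of the $k$-th coded factor (break index, F0, pause, and so on).

```latex
\[
\mathrm{logit}\,\Pr(\mathrm{creak}_{ij} = 1)
  = \beta_0 + \sum_{k} \beta_k \, x_{k,ij} + u_j,
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2)
\]
```

Under this form, a positive $\beta_k$ means that higher values of factor $k$ raise the log-odds of a vowel being coded as creaky, over and above each speaker's baseline rate.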

BU Corpus: Results
- Only 2 factors make creak more likely:
  - Lower F0 (before BI 3, 4)
  - Before a bigger phrase break (an effect beyond that of F0)
- No other significant predictors
- Consistent across all 4 speakers

Break Index effect
- The higher the BI, the more likely phrase-final creak is
- Over half of Utterance-final tokens have phrase-final creak
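
The incidence figures behind this slide are proportions of creaky tokens at each break index. A minimal sketch of that tabulation follows. The function name is hypothetical and `toy_tokens` is made-up data chosen only to exercise the code; it is not corpus data and does not reproduce the study's rates.

```python
from collections import defaultdict


def creak_rate_by_break_index(tokens):
    """tokens: iterable of (break_index, is_creaky) pairs.

    Returns {break_index: proportion of tokens coded creaky}, sorted by index.
    """
    counts = defaultdict(lambda: [0, 0])  # break_index -> [n_creaky, n_total]
    for break_index, creaky in tokens:
        counts[break_index][0] += int(creaky)
        counts[break_index][1] += 1
    return {bi: n_creaky / n_total
            for bi, (n_creaky, n_total) in sorted(counts.items())}


# Made-up tokens for illustration only (NOT data from the study).
toy_tokens = [(3, False), (3, False), (3, True), (3, False),
              (4, True), (4, False), (4, False),
              (5, True), (5, True), (5, False)]
rates = creak_rate_by_break_index(toy_tokens)
```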

Acoustic properties of phrase-final creak
- What acoustic measures distinguish vowels coded as creaky vs. non-creaky?
- News Corpus speakers all creak ~50% of the time Utterance-finally (BI = 5)

Acoustic measures of vowels
- Fundamental frequency (F0)
- Noise in lowest frequencies (HNR05): reflects irregularity of voicing, or added noise
- Subharmonics-to-harmonics ratio (SHR): reflects additional harmonics added by multiple pulsing
- Relative energy in first 2 harmonics (H1*-H2*): a lower value reflects increased constriction of the glottis
- Assessed using linear mixed-effects regression
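
To make the "relative energy in the first 2 harmonics" measure concrete, the sketch below computes an uncorrected H1-H2 in dB from a synthetic two-harmonic signal. It is a simplification under stated assumptions: the asterisks in H1*-H2* indicate correction for formant influence, which is omitted here, F0 is taken as known, and all function names are hypothetical. It is not the measurement procedure used in the study.

```python
import math


def component_amplitude(signal, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-frequency DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def h1_minus_h2_db(signal, sample_rate, f0_hz):
    """Uncorrected H1-H2: level of the 1st harmonic minus the 2nd, in dB."""
    h1 = component_amplitude(signal, sample_rate, f0_hz)
    h2 = component_amplitude(signal, sample_rate, 2 * f0_hz)
    return 20 * math.log10(h1 / h2)


def two_harmonic_signal(f0_hz, a1, a2, sample_rate=8000, n_samples=800):
    """Synthetic signal with amplitude a1 at F0 and a2 at 2*F0."""
    return [a1 * math.sin(2 * math.pi * f0_hz * i / sample_rate)
            + a2 * math.sin(2 * math.pi * 2 * f0_hz * i / sample_rate)
            for i in range(n_samples)]


# Strong first harmonic: positive H1-H2 (about +6 dB here).
strong_h1 = h1_minus_h2_db(two_harmonic_signal(100, 1.0, 0.5), 8000, 100)
# Strong second harmonic: negative H1-H2 (about -6 dB here).
strong_h2 = h1_minus_h2_db(two_harmonic_signal(100, 0.5, 1.0), 8000, 100)
```

The second case, with a relatively strong second harmonic, is the direction the slide associates with increased glottal constriction.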

Acoustic results: Fundamental frequency (F0)
- Lower for creaky Utt-final vowels than for non-creaky Utt-final vowels, for all speakers
- = Lower F0 in creak

Acoustic results: Harmonics-to-noise ratio (HNR)
- Lower for creaky Utt-final vowels than for non-creaky Utt-final vowels, for all speakers
- = More aperiodicity in creak

Acoustic results: Subharmonics-to-harmonics ratio (SHR)
- Higher for creaky Utt-final vowels than for non-creaky Utt-final vowels, for all speakers
- = More subharmonics (multiple pulsing) in creak
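
The link between multiple pulsing and subharmonics can be illustrated with a pulse train whose pulses alternate between strong and weak: the alternation introduces energy at half the fundamental. The sketch below uses a crude ratio of the amplitude at F0/2 to the amplitude at F0 as a stand-in for SHR. This ratio is an assumption made for illustration, not the published SHR algorithm, and all names are hypothetical.

```python
import math


def component_amplitude(signal, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-frequency DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def pulse_train(n_samples, period_samples, strong=1.0, weak=1.0):
    """Impulses every period_samples, alternating strong/weak amplitudes."""
    signal = [0.0] * n_samples
    for k, start in enumerate(range(0, n_samples, period_samples)):
        signal[start] = strong if k % 2 == 0 else weak
    return signal


def subharmonic_ratio(signal, sample_rate, f0_hz):
    """Crude stand-in for SHR: amplitude at F0/2 divided by amplitude at F0."""
    return (component_amplitude(signal, sample_rate, f0_hz / 2)
            / component_amplitude(signal, sample_rate, f0_hz))


SAMPLE_RATE = 8000
F0 = 100  # one pulse every 80 samples
regular = pulse_train(800, 80)                   # all pulses equal
doubled = pulse_train(800, 80, strong=1.0, weak=0.5)  # alternating pulses

regular_ratio = subharmonic_ratio(regular, SAMPLE_RATE, F0)
doubled_ratio = subharmonic_ratio(doubled, SAMPLE_RATE, F0)
```

For equal pulses the ratio is essentially zero; for alternating amplitudes a and b it works out to |a - b| / (a + b), so it grows as the alternation becomes more extreme.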

Acoustic results: H1*-H2* (glottal constriction)
- Not significant for f1
- Lower for creaky than for non-creaky for f2 (more constricted): more-constricted creak
- Higher for creaky than for non-creaky for m1, m2 (less constricted): less-constricted creak

Interim summary
- Utt-final vowels coded as creaky, compared to Utt-final vowels coded as non-creaky, are:
  - Lower-pitched
  - Noisier
  - More multiply pulsed
  - For 1 speaker more constricted, for 2 others less constricted

Interim summary
- Cross-speaker differences in H1*-H2* are not unexpected. Prototypical creaky voice is generally more constricted, but:
  - Slifka (2006) found evidence for less constriction in Utterance-final creak: the glottis opens, lung pressure drops, and voicing begins to fail, so it is irregular but breathy
- How common is less-constricted creaky voice?
- Next corpus is larger: 12 speakers of English, 12 of Spanish
  - Younger speakers, more phrase-final creak

Study 2: English/Spanish sentence corpus
- Audio recordings from Garellek (2014)
- 12 English speakers (6 F, 6 M) and 12 Spanish speakers (7 F, 5 M)
- Sentence-reading task:
  - English sentences end in today, day, slept, trip, week
  - Spanish sentences end in día, encontrarla, ella, fuimos
- These words were coded for presence/absence of creak, just as in the News Corpus study (here, Utterance-finally)

English/Spanish corpus: Incidence of phrase-final creak
- English speakers creak more
- Women creak more
- Spanish men creak less
- Overall incidence is higher than in the News Corpus

Analysis of 9 speakers
- We identified 9 speakers who had good distributions of both creaky and non-creaky phrase-final vowels (>15%):
  - 6 Spanish speakers (1 M)
  - 3 English speakers (2 M)
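
The selection criterion can be sketched as a simple filter. The slide gives only "(> 15%)", so the reading used here, that creaky and non-creaky tokens must each exceed 15% of a speaker's phrase-final vowels, is an assumption. The function names, speaker labels, and counts are hypothetical.

```python
def has_usable_distribution(n_creaky, n_noncreaky, min_share=0.15):
    """True if both categories exceed min_share of the speaker's tokens."""
    total = n_creaky + n_noncreaky
    if total == 0:
        return False
    return n_creaky / total > min_share and n_noncreaky / total > min_share


def select_speakers(counts, min_share=0.15):
    """counts: {speaker: (n_creaky, n_noncreaky)} -> sorted list of speakers kept."""
    return sorted(spk for spk, (c, nc) in counts.items()
                  if has_usable_distribution(c, nc, min_share))


# Made-up counts for illustration only (NOT data from the study).
toy_counts = {
    "spk_a": (30, 70),  # 30% creaky: kept
    "spk_b": (95, 5),   # almost always creaky: dropped
    "spk_c": (10, 90),  # rarely creaky: dropped
    "spk_d": (50, 50),  # balanced: kept
}
kept = select_speakers(toy_counts)
```

A filter of this kind ensures that each retained speaker contributes enough tokens of both types for a within-speaker creaky vs. non-creaky comparison.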

English/Spanish corpus: Acoustic analysis
- Same acoustic measures as in the News Corpus: F0, HNR, SHR, and H1*-H2*
  - Recall the cross-speaker differences in H1*-H2* for creaky vs. non-creaky Utterance-final vowels in the News Corpus
- Statistical analysis: linear mixed-effects regression models comparing creaky vs. non-creaky tokens

English/Spanish corpus: Acoustic results
- As in the News Corpus, Utterance-final creaky voice (compared to non-creaky) is:
  - Lower in F0
  - Noisier/less periodic
  - More period-doubled
- Unlike in the News Corpus, the effect of creaky voice is usually a lowering of H1*-H2* (constriction)
  - Except for 2 speakers (sf4, sf7), for whom no difference is found
  - No speakers had higher H1*-H2* in creaky voice


Study 3: Initial vs. final creaky voice
- In the same corpus, English sentences also had phrase-initial creaky voice
  - Glottalization of prominent word-initial vowels, as in Anna [ˈ(ʔ)ænə]
- How does Utterance-final creak compare with phrase-initial creak?
Garellek 2014

Initial vs. final creaky voice
- They depend on different factors:
  - Phrase-final creak is F0-dependent; initial creaky voice is not
  - Phrase-final creak extends over multiple segments/words; initial creaky voice is only on initial vowels
  - Phrase-final creak is not prominence-sensitive; initial creak is
- They might well have different sources, and therefore differ acoustically
Garellek 2014

Initial vs. final creaky voice
- Anna said she saw him just last week.

English sentence corpus
- In the English/Spanish sentence corpus, only English speakers creak in both positions
- Sentences from 12 English speakers
  - 2079 creaky final vowels
  - 835 creaky initial vowels
- Same acoustic measures as before
- Similar statistical comparisons as before (no language comparison)

English sentence corpus: Acoustic results
- Fundamental frequency (F0)
  - Lower for creaky Utt-final vowels than for creaky phrase-initial vowels, for all speakers
- Harmonics-to-noise ratio (HNR05)
  - Lower for creaky Utt-final vowels than for creaky phrase-initial vowels, for all speakers
- Subharmonics-to-harmonics ratio (SHR)
  - Higher for creaky Utt-final vowels than for creaky phrase-initial vowels, for all but one speaker
- Utterance-final creak is thus generally creakier than phrase-initial creak

English sentence corpus: Acoustic results
- Relative energy in first 2 harmonics (H1*-H2*)
  - Lower H1*-H2* (more constricted) for creaky Utt-final vowels than for creaky phrase-initial vowels, for all but 3 speakers, for whom final creak has higher H1*-H2* (less constricted)
  - These differences are often quite large
- Utterance-final creak is thus generally, though not always, more constricted than phrase-initial creak


Summary
- Study 1: Phrase-final creak is more likely at the ends of higher-level phrases and with lower F0; no other factor tested mattered
- Studies 1 and 2: Utterance-final creak differs from non-creak in its:
  - Lower F0 and periodicity
  - Generally lower H1*-H2* (more constriction)
- Study 3: Utterance-final creak differs from phrase-initial creak in its:
  - Lower F0 and periodicity
  - Generally lower H1*-H2* (more constriction)

Phrase-final creak: Conclusions
- Why do we do it?
  - To reach a low F0 target
  - To signal the end of a phrase
- How do we do it?
  - Usually by increased glottal constriction
  - Always by less periodic voicing

Thank you!
