Speech Processing 15-492/18-492: Speech Synthesis Prosody

Speech Synthesis
- Linguistic analysis
- Pronunciations
- Prosody

Prosody
- How the phonemes will be said
- Four aspects of prosody
  - Phrasing: where the breaks will be
  - Intonation: pitch accents and F0 generation
  - Duration: how long the phonemes will be
  - Power: energy in the signal

Phrase Breaks
- Need to take a breath
- Need to chunk relevant parts together
  - Sub-sentential
  - Supra-word
- First approximation
  - Break at punctuation (comma, semicolon, etc.)
  - Too few breaks
- Second approximation
  - Break at each (or some) of the content/function word boundaries
  - Too many breaks
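
The two approximations above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation. The function-word list and the rule "break after a content word that is followed by a function word" are assumptions made for the sketch.

```python
import re

# Small illustrative function-word list (an assumption, not from the lecture).
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "from", "with", "by",
    "and", "or", "but", "that", "which", "will", "be", "is", "are",
    "their", "some", "up",
}


def tokenize(text):
    """Split text into (word, trailing_punctuation) pairs."""
    return re.findall(r"([A-Za-z']+)([,;:.!?]?)", text)


def breaks_at_punctuation(text):
    """First approximation: break after every word followed by punctuation."""
    return [word for word, punct in tokenize(text) if punct]


def breaks_at_content_function(text):
    """Second approximation: break after a content word that precedes a
    function word, and after the last word of the utterance."""
    words = [word for word, _ in tokenize(text)]
    breaks = []
    for cur, nxt in zip(words, words[1:]):
        if cur.lower() not in FUNCTION_WORDS and nxt.lower() in FUNCTION_WORDS:
            breaks.append(cur)
    if words:
        breaks.append(words[-1])
    return breaks


SENTENCE = ("Next week, some inmates released early from the Hampton County "
            "jail in Springfield, will be wearing a wristband.")

print(breaks_at_punctuation(SENTENCE))
print(breaks_at_content_function(SENTENCE))
```

On this sentence the punctuation rule places three breaks and the content/function rule places six, which mirrors the "too few" versus "too many" contrast.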

Phrasing
- Punctuation
  Next week, some inmates released early from the Hampton County jail in Springfield, will be wearing a wristband that hooks up to a special jack on their home phones.
- Content/function words
  Next week some inmates released early from the Hampton County jail in Springfield will be wearing a wristband that hooks up with a special jack on their home phones.

Phrasing
- Bachenko and Fitzpatrick 90
  - Rule driven, using punctuation, POS and syntax
  - Balanced phrasing
    - (the boy saw) (the girl in the park)
    - (the boy in the park) (saw the girl)
- Hirschberg and Prieto 94
  - CART trees (similar features)
- Ostendorf and Veilleux 94
  - Hierarchical statistical model
  - Multilevel breaks

Phrasing (Black and Taylor 97)
- Balance the length of phrases
- Predict the probability of a break with CART (using POS)
- Use an n-gram of B/NB (break / no-break) labels to keep balance
- Trained on BBC Radio 4 (NPR-like)
  - 31,707 words, 6,346 breaks
- 91% correct with a 6-gram
- Still makes errors especially around I
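
One way to combine per-word break probabilities with an n-gram over B/NB labels is a Viterbi search. The sketch below is a simplification under stated assumptions. It uses a bigram rather than the 6-gram mentioned above. The probabilities are invented for illustration. It divides each predicted probability by the label prior so that the n-gram is not counted twice, which is a modelling choice assumed here rather than confirmed by the slide.

```python
import math

LABELS = ("B", "NB")


def best_break_sequence(p_break, bigram, prior, start="NB"):
    """Viterbi search for the most likely B/NB label sequence.

    p_break[i]        : predicted P(break after word i), e.g. from a CART
    bigram[(prev, c)] : P(c | prev) over B/NB labels
    prior[label]      : overall label frequency, used to rescale p_break
    """
    if not p_break:
        return []
    eps = 1e-6  # keeps log() away from zero
    score = {lab: (0.0 if lab == start else -math.inf) for lab in LABELS}
    backpointers = []
    for p in p_break:
        p = min(max(p, eps), 1.0 - eps)
        emit = {"B": math.log(p / prior["B"]),
                "NB": math.log((1.0 - p) / prior["NB"])}
        new_score, pointers = {}, {}
        for cur in LABELS:
            prev = max(LABELS,
                       key=lambda q: score[q] + math.log(bigram[(q, cur)]))
            new_score[cur] = (score[prev] + math.log(bigram[(prev, cur)])
                              + emit[cur])
            pointers[cur] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: score[lab])]
    for pointers in reversed(backpointers[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]


# Illustrative numbers only (assumptions): breaks are rare and two breaks
# in a row are very unlikely.
PRIOR = {"B": 0.2, "NB": 0.8}
BIGRAM = {("B", "B"): 0.05, ("B", "NB"): 0.95,
          ("NB", "B"): 0.2, ("NB", "NB"): 0.8}
P_BREAK = [0.1, 0.6, 0.55, 0.1, 0.9]

print(best_break_sequence(P_BREAK, BIGRAM, PRIOR))
```

Thresholding each probability at 0.5 independently would put breaks after the second and third words in a row. The sequence model keeps only the first of the two, which is the balancing effect the n-gram is meant to provide.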

Phrasing
- What is correct?
  - Lots of answers are correct
  - But some are definitely bad
- Ostendorf and Veilleux 94
  - Multiple people read the same paragraphs
  - If your method matches any single person's version, it is correct
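
The "matches any single reader" criterion can be written down directly. In this sketch, break placements are represented as sets of word indices, which is an assumed representation. The second function, a softer per-juncture agreement score, is an addition for illustration and is not described on the slide.

```python
def matches_any_reader(predicted_breaks, reader_breaks):
    """True if the predicted break positions equal one reader's version."""
    predicted = set(predicted_breaks)
    return any(predicted == set(reader) for reader in reader_breaks)


def best_reader_agreement(predicted_breaks, reader_breaks, n_words):
    """Highest fraction of word junctures (n_words > 0) on which the
    prediction agrees with a single reader."""
    predicted = set(predicted_breaks)
    best = 0.0
    for reader in reader_breaks:
        reader = set(reader)
        agree = sum((i in predicted) == (i in reader) for i in range(n_words))
        best = max(best, agree / n_words)
    return best


# Hypothetical data: two readers of the same 10-word sentence.
READERS = [[1, 5, 9], [1, 4, 9]]

print(matches_any_reader([1, 4, 9], READERS))
print(matches_any_reader([1, 6, 9], READERS))
print(best_reader_agreement([1, 6, 9], READERS, n_words=10))
```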

Intonation
- The fundamental tune
- Accents (highlighting important parts)
- F0 generation (the tune itself)

Intonation Contour

Intonation Information
- Large pitch range (female)
- Authoritative, since it goes down at the end
  - News reader
- Emphasis for "Finance": H*
- Final has a rise: more information to come
- Female American newsreader from WBUR (Boston University Public Radio)

Intonation Examples
- Fixed durations, flat F0
- Declining F0
- "Hat" accents on stressed syllables
- Accents and end tones
- Statistically trained

Intonational Phonology
- Accents and boundaries
  - Where are the important changes in F0?
- Accents on syllables
  - Identify important words
- It will be RAINY today in Boston
- It will be rainy TODAY in Boston
- It will BE rainy today IN Boston (strange)

Where do the accents go?
- On important words
- First approximation
  - On stressed syllables in content words
  - It WILL be RAINY TODAY in BOSTON
  - About 80% correct on news reader speech
- CART training on more features
  - Content, proper nouns, POS, position in text (not semantic information)
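
The first approximation can be sketched at the word level. The function-word list below is an assumption. It deliberately leaves out "will" so that the output matches the example above. A real system would accent only the stressed syllable of each content word, which needs a pronunciation lexicon that is not modelled here.

```python
# Illustrative function-word list (an assumption, not from the lecture).
FUNCTION_WORDS = {"it", "be", "in", "the", "a", "an", "of", "to", "and", "is"}


def mark_accents(sentence):
    """Upper-case every content word to mark it as accented."""
    marked = []
    for word in sentence.split():
        if word.lower() in FUNCTION_WORDS:
            marked.append(word)          # function word: no accent
        else:
            marked.append(word.upper())  # content word: accented
    return " ".join(marked)


print(mark_accents("It will be rainy today in Boston"))
```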

ToBI: Tones and Break Indices
- A labeling scheme for intonation (English)
- Different accent types
  - H*, !H*, L*, L+H*
- Different boundary types
  - L-L%, L-H%, H-H%, ...

ToBI examples

F0 Generation
- Contour from accents (and durations)
- Piece together shapes of different accents
- Generated
  - By rule
  - Trained from data
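
A rule-based contour of the kind listed above can be sketched as a declining baseline with a peak added on each accented syllable. All numeric values (start and end pitch, accent height) are illustrative assumptions, and the three-target-per-syllable shape is one simple choice among many.

```python
def f0_targets(syllables, start_hz=130.0, end_hz=100.0, accent_hz=30.0):
    """Return (time_s, f0_hz) targets at the start, middle and end of each
    syllable.

    syllables: list of (duration_seconds, accented) tuples.
    The baseline declines linearly from start_hz to end_hz over the
    utterance; an accented syllable gets a peak of accent_hz at its middle.
    """
    total = sum(duration for duration, _ in syllables)
    if total <= 0:
        return []

    def baseline(t):
        return start_hz + (end_hz - start_hz) * (t / total)

    targets = []
    t = 0.0
    for duration, accented in syllables:
        mid = t + duration / 2.0
        peak = accent_hz if accented else 0.0
        targets.append((t, baseline(t)))
        targets.append((mid, baseline(mid) + peak))
        targets.append((t + duration, baseline(t + duration)))
        t += duration
    return targets


# Three 200 ms syllables, the middle one accented.
for time_s, f0 in f0_targets([(0.2, False), (0.2, True), (0.2, False)]):
    print(f"{time_s:.2f} s  {f0:.1f} Hz")
```

A trained model would replace the fixed numbers with values predicted from features such as accent type, position in the phrase, and syllable duration.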

Using real contours
- From a database of different contours
  - Select the most appropriate one
- Record lots of different intonation examples
  - He DID then KNOW what HAD occurred
  - TARZAN and JANE raised THEIR heads
- Label them and select the contours when you want emphasis

Emphasis Synthesis
- This is a short example
- THIS is a short example
- This IS a short example
- This is A short example
- This is a SHORT example
- This is a short EXAMPLE

Duration Prediction
- Each phone needs a duration
  - Simplest approach: make it 80 ms
- Vowels are typically longer than consonants
- Emphasis/accent/stress lengthens them
- Initial and final phones are longer

Prediction Models
- By rule
  - Klatt rules
- By training (using Klatt features)
  - CART / linear regression
- Easy to get reasonable durations
- Hard to get very good durations
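
For the rule-based case, the core of the Klatt duration model is a single formula: each phone has an inherent and a minimum duration, and context rules scale only the stretchable part between them. The sketch below shows that formula. The phone values and rule percentages in the example are illustrative assumptions, not a reproduction of Klatt's tables.

```python
def klatt_duration(inherent_ms, minimum_ms, rule_percents=()):
    """DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100.

    PRCNT starts at 100 and is multiplied by each applicable rule's
    percentage (for example 140 for lengthening, 70 for shortening).
    """
    prcnt = 100.0
    for percent in rule_percents:
        prcnt *= percent / 100.0
    return minimum_ms + (inherent_ms - minimum_ms) * prcnt / 100.0


# Hypothetical vowel: inherent 240 ms, minimum 100 ms.
print(klatt_duration(240, 100))             # no rules apply
print(klatt_duration(240, 100, [70]))       # a shortening rule
print(klatt_duration(240, 100, [70, 140]))  # shortening, then lengthening
```

Because the rules act only on the part above the minimum, no combination of shortening rules can push a phone below its minimum duration. The trained alternative keeps the same contextual features but learns the mapping with CART or linear regression.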

Fast and Slow Speech
- Speaking fast: not uniformly shorter durations
  - Have fewer prosodic breaks
  - Reduce syllables
  - Make consonants shorter
  - Make vowels a little shorter
- Speaking slow: not uniformly longer durations
  - Add more prosodic breaks
  - Small increases in vowel duration (?)
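
The non-uniform nature of fast speech can be sketched as follows. The segment representation, the scale factors, and the break-strength threshold are all assumptions chosen for illustration. Syllable reduction is not modelled.

```python
def speak_faster(segments, consonant_scale=0.7, vowel_scale=0.9,
                 min_break_strength=2):
    """Return a faster version of `segments` with non-uniform shortening.

    segments: list of (kind, value) tuples. For "consonant" and "vowel"
    the value is a duration in ms; for "break" it is a break strength.
    Weak breaks are dropped, consonants shrink more than vowels.
    """
    faster = []
    for kind, value in segments:
        if kind == "break":
            if value >= min_break_strength:
                faster.append((kind, value))
        elif kind == "consonant":
            faster.append((kind, value * consonant_scale))
        else:
            faster.append((kind, value * vowel_scale))
    return faster


# Hypothetical input: two syllables, a minor break (1) and a major break (2).
SEGMENTS = [("consonant", 100), ("vowel", 100), ("break", 1),
            ("consonant", 50), ("vowel", 200), ("break", 2)]

for kind, value in speak_faster(SEGMENTS):
    print(kind, round(value, 1))
```

Slow speech would go the other way under the same assumptions: insert extra breaks first and lengthen vowels only slightly, rather than stretching every phone by the same factor.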

Summary
- Prosody
  - Phrasing
  - Intonation: accents + F0 generation
  - Duration
  - Power