Concept to Speech Generation Systems

Proceedings of a Workshop in conjunction with 35th Annual Meeting of the Association for Computational Linguistics
Edited by Kai Alter, Hannes Pirker, and Wolfgang Finkler
11 July 1997
Universidad Nacional de Educación a Distancia, Madrid, Spain

TABLE OF CONTENTS

Organizing and Program Committee
Program Timetable
Introduction to the Workshop

Probabilistic Model of Acoustic / Prosody / Concept Relationships for Speech Synthesis
    Nanette M. Veilleux
Message-to-Speech: High Quality Speech Generation for Messaging and Dialogue Systems
    Peter Spyns, Filip Deprez, Luc Van Tichelen, and Bert Van Coile
A Compact Representation of Prosodically Relevant Knowledge in a Speech Dialogue System
    Peter Poller and Paul Heisterkamp
Integrating Language Generation with Speech Synthesis in a Concept to Speech System
    Shimei Pan and Kathleen McKeown
Can Pitch Accent Type Convey Information Status in Yes-No Questions?
    Martine Grice and Michelina Savino
Computing Prosodic Properties in a Data-to-Speech System
    M. Theune, E. Klabbers, J. Odijk, and J.R. de Pijper
Semantic and Discourse Information for Text-to-Speech Intonation
    Laurie Hiyakumoto, Scott Prevost, and Justine Cassell
Looking for the Presence of Linguistic Concepts in the Prosody of Spoken Utterances
    Gerit P. Sonntag and Thomas Portele

Program Committee

Robert Bannert, Univ. of Umea, Sweden
John Bateman, GMD Darmstadt, Germany
Mary Beckman, Ohio State Univ., USA
Carlos Gussenhoven, Univ. of Nijmegen, The Netherlands
Bjorn Granstroem, KTH Stockholm, Sweden
Elisabeth Maier, DFKI Saarbrücken, Germany
Scott Prevost, MIT Boston, USA
Mark Steedman, Univ. of Pennsylvania, USA

Organizing Committee

Kai Alter, Austrian Research Institute for AI (OFAI) & Max-Planck-Institute of Cognitive Neuroscience, alter@cns.mpg.de
Wolfgang Finkler, German Research Center for AI (DFKI), finkler@dfki.uni-sb.de
Hannes Pirker, Austrian Research Institute for AI (OFAI), hannes@ai.univie.ac.at

PROGRAM TIMETABLE

9.00 am   Introduction
9.10 am   Veilleux, N.
          Probabilistic Model of Acoustic / Prosody / Concept Relationships for Speech Synthesis
9.45 am   Spyns, P., Deprez, F., Van Tichelen, L., & Van Coile, B.
          Message-to-Speech: High Quality Speech Generation for Messaging and Dialogue Systems
10.20 am  Poller, P. & Heisterkamp, P.
          A Compact Representation of Prosodically Relevant Knowledge in a Speech Dialogue System
10.55 am  Coffee Break
11.15 am  Pan, S. & McKeown, K.
          Integrating Language Generation with Speech Synthesis in a Concept to Speech System
11.50 am  Grice, M. & Savino, M.
          Can Pitch Accent Type Convey Information Status in Yes-No Questions?
12.25 pm  Theune, M., Klabbers, E., Odijk, J., & de Pijper, J.R.
          Computing Prosodic Properties in a Data-to-Speech System
1.00 pm   Lunch Break
3.00 pm   Hiyakumoto, L., Prevost, S., & Cassell, J.
          Semantic and Discourse Information for Text-to-Speech Intonation
3.35 pm   Sonntag, G.P., & Portele, Th.
          Looking for the Presence of Linguistic Concepts in the Prosody of Spoken Utterances
4.10 pm   Final Discussion

Introduction to the Workshop

Traditionally, research on spoken language generation was mainly undertaken within the separate fields of natural language generation and speech synthesis.

On the one hand, current generation systems allow for the production of flexible utterances. They may be used to overcome the limitations of Human-Computer Interfaces with stereotyped language output. However, they typically neglect aspects of intonation and hand over the resulting text in graphemic form to a speech synthesis component.

On the other hand, current speech synthesizers implement the so-called Text-to-Speech approach. They are able to read aloud unrestricted text. One of the main problems posed by that paradigm is the production of adequate prosody, since the written form of a text that is to be articulated is a poor knowledge source. Prosodic features of an utterance are highly dependent on the informational structure, on the linguistic structure, and on the situational context of the utterance.

A tight interaction between generation and synthesis should help to enhance the quality of a system's output. A component for speech synthesis may be provided with relevant parameters to compute adequate intonation. A generation system may exploit the options offered by speech synthesis in its tactical or strategic generation decisions, e.g., by reflecting the information structure either through intonational cues or through morpho-syntactic variations (e.g., changes of word order).

Concept-to-Speech (CTS) generation, i.e., the production of synthetic speech on the basis of pragmatic, semantic, and discourse knowledge, offers a challenging and relatively new field of research in intelligent user interfaces. The questions raised in such an environment range from pragmatics, semantics, and (morpho-)syntax to phonology and phonetics. The modelling of prosody (at the symbolic and acoustic levels) is one of the open questions within this paradigm.

Obviously, the development of a CTS system is very demanding. Successful work within the framework of CTS relies on the ability to integrate efforts from a number of disciplines, such as Computational Linguistics, Artificial Intelligence, Cognitive Science, and Signal Processing.

The workshop will provide a forum to bring together researchers from the fields of natural language generation and speech synthesis. The aim of the workshop is to stimulate the interchange of innovative ideas and results on diverse aspects of CTS generation in order to bridge the gap between these fields.

Among the challenging aspects of a CTS system, we proposed to address the following issues first:

- How can systems for natural language generation be adapted in order to utilize the new realization options that the CTS framework offers to the generation process?
- How can issues in the time-course of the interleaved process for generation and synthesis (when-to-say) be dealt with?
- Which requirements on speech synthesis are to be fulfilled in an incremental approach to spoken language production?
- Due to its inherent integrational property, being influenced by a whole number of representational levels, modelling of prosody will be one of the major topics of the workshop.
- How can approaches in the Text-to-Speech tradition to synthesis show their adaptability to Concept-to-Speech?

We invited contributions that provide solutions to any of the topics indicated above or that present innovative applications addressing the above-mentioned issues.