PRAAT ON THE WEB AN UPGRADE OF PRAAT FOR SEMI-AUTOMATIC SPEECH ANNOTATION


SUMMARY: 1. Motivation; 2. Praat Software & Format; 3. Extended Praat; 4. Prosody Tagger; 5. Demo; 6. Conclusions

What's the story behind? Annotation of Big Data: Prosodic Phrase Segmentation, Prominence, Salient Word. Theoretical Work: Acoustic and Linguistic Information Correlation; Information Structure and Prosody (Domínguez et al., 2014a); Prosody as a Vectorial Representation of Acoustic Parameters (Domínguez et al., 2014b)

1. MOTIVATION (I) Aim: Tools for Annotation of Spontaneous Speech; Semi-Automatic Approaches. Focus: Visualization; Scripting. Object Question: How do we deal with feature annotation? With complex computational processes?

1. MOTIVATION (II) Feature Annotation

1. MOTIVATION (III) Scripting [diagram: .txt, Acoustic Data, TextGrid parsers, .textgrid]. ProsodyPro (Xu, 2013); Praaline (Christodoulides, 2014)

2. PRAAT SOFTWARE AND FORMAT (I) The Problem: Labels are unparseable
TextGrid (file type)
  Tiers: item[n]
    Units of segmentation:
      Interval (class/name/[n]): Start & End Time (xmin/xmax), Label (string)
      Point (class/name/[n]): Point in Time (point), Label (string)
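The structure on this slide can be made concrete with a minimal parsing sketch. It assumes the standard long TextGrid text format and handles interval tiers only; the `Interval` type, the `parse_intervals` helper and the sample file are hypothetical names introduced for illustration, and the regex approach would break on labels that themselves contain text such as `item [1]:`.

```python
import re
from typing import List, NamedTuple


class Interval(NamedTuple):
    tier: str    # tier name (the "name" field of item[n])
    xmin: float  # start time
    xmax: float  # end time
    text: str    # the label: a single, unstructured string


# Illustrative sample in the long TextGrid text format (assumed layout).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.7
            text = "hello"
        intervals [2]:
            xmin = 0.7
            xmax = 1.5
            text = "world"
'''

# One interval entry: xmin, xmax and a quoted label (inner quotes are doubled).
INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.eE+-]+)\s*'
    r'xmax = ([\d.eE+-]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def parse_intervals(textgrid: str) -> List[Interval]:
    """Collect every interval of every tier as (tier, xmin, xmax, text)."""
    result = []
    # Split at each "item [n]:" header; the first chunk is the file header.
    for chunk in re.split(r'item \[\d+\]:', textgrid)[1:]:
        name = re.search(r'name = "(.*?)"', chunk)
        tier = name.group(1) if name else ""
        for m in INTERVAL.finditer(chunk):
            label = m.group(3).replace('""', '"')
            result.append(Interval(tier, float(m.group(1)), float(m.group(2)), label))
    return result


for interval in parse_intervals(SAMPLE):
    print(interval)
```

Each label comes back as one opaque string, which is the limitation the slide points at: any internal structure has to be encoded and decoded by convention.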

2. PRAAT SOFTWARE AND FORMAT (II) Example: Praat (Boersma, 2001) Graphical User Interface

2. PRAAT SOFTWARE AND FORMAT (III) TextGrid in text editor Sublime

3. EXTENDED PRAAT (I) The Solution: Parseable Labels
TextGrid (file type)
  Tiers: item[n]
    Interval (class/name/[n]): Start & End Time (xmin/xmax), Label (string)
    Point (class/name/[n]): Point in Time (point), Label (string)
      Label: Head (string) and Features (string [n]), each feature a Name (string) and a Value (string)
SEMAFOR (Tsatsaronis et al., 2012) http://www.cs.cmu.edu/~ark/semafor/; GATE (Cunningham et al., 2011) https://gate.ac.uk/; Brat (Stenetorp et al., 2012) http://brat.nlplab.org/
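A minimal sketch of the head-plus-features label as an in-memory data model may help. It is an assumption made for illustration, not the serialization used by Extended Praat: the class names `FeatureLabel` and `Segment`, the `select` helper, and the feature names `prominence` and `boundary` are all hypothetical. Following the slide, feature names and values are plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FeatureLabel:
    head: str                                               # the visible label text
    features: Dict[str, str] = field(default_factory=dict)  # name -> value pairs


@dataclass
class Segment:
    xmin: float          # start time
    xmax: float          # end time
    label: FeatureLabel  # structured label instead of one opaque string


def select(segments: List[Segment], name: str, value: str) -> List[Segment]:
    """Return the segments whose feature `name` equals `value`."""
    return [s for s in segments if s.label.features.get(name) == value]


# Hypothetical annotation: two word segments with illustrative features.
segments = [
    Segment(0.0, 0.4, FeatureLabel("the", {"prominence": "0"})),
    Segment(0.4, 1.1, FeatureLabel("story", {"prominence": "1", "boundary": "L%"})),
]

prominent = select(segments, "prominence", "1")
print([s.label.head for s in prominent])  # ['story']
```

Because features are addressed by name, a query such as the one above needs no ad hoc string splitting on the label.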

3. EXTENDED PRAAT (II) Extended Praat for Feature Annotation (Domínguez et al., 2016d). Local Version for Linux: https://github.com/monikaupf. Web Version: http://kristina.taln.upf.edu/praatweb/. Tutorial: https://www.youtube.com/watch?v=sjxu15dskjs

3. EXTENDED PRAAT (III) Advantage 1: Visualization [figures: visualization in standard Praat; visualization in Praat on the Web]

3. EXTENDED PRAAT (III) Advantage 2: Intuitive GUI. Standard Praat: Keyboard Shortcuts (Zooming, Audio playback, Scrolling). Praat on the Web: Dedicated buttons

3. EXTENDED PRAAT (IV) Advantage 3: Scripting Composition. Features allow storing information for: data analysis; retrieval for computation at different stages; segmentation of a complex computational problem into specialized/dedicated steps. All within the same Praat environment.
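The idea of splitting a computation into dedicated steps that communicate through stored features can be sketched as follows. This is an illustrative Python analogy under stated assumptions, not the Praat scripts from the talk: the segment layout, the step names `add_duration` and `mark_long`, the `compose` helper and the 0.5 s threshold are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical segment layout: {"head": str, "xmin": float, "xmax": float, "features": dict}
Segment = Dict[str, Any]
Step = Callable[[List[Segment]], List[Segment]]


def add_duration(segments: List[Segment]) -> List[Segment]:
    """Step 1: store each segment's duration as a string-valued feature."""
    for seg in segments:
        seg["features"]["duration"] = f'{seg["xmax"] - seg["xmin"]:.2f}'
    return segments


def mark_long(segments: List[Segment], threshold: float = 0.5) -> List[Segment]:
    """Step 2: read the stored duration back and add a derived feature."""
    for seg in segments:
        is_long = float(seg["features"]["duration"]) > threshold
        seg["features"]["long"] = "yes" if is_long else "no"
    return segments


def compose(*steps: Step) -> Step:
    """Chain steps so each one sees the features written by earlier ones."""
    def pipeline(segments: List[Segment]) -> List[Segment]:
        for step in steps:
            segments = step(segments)
        return segments
    return pipeline


segments = [
    {"head": "the", "xmin": 0.0, "xmax": 0.2, "features": {}},
    {"head": "story", "xmin": 0.2, "xmax": 0.9, "features": {}},
]

annotate = compose(add_duration, mark_long)
for seg in annotate(segments):
    print(seg["head"], seg["features"])
```

Swapping one step for an alternative implementation leaves the other steps untouched, as long as the feature names they exchange stay the same.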

4. PROSODY TAGGER (I) Related Work, Methodology, Implementation, Evaluation, Future Work. Language and register independent. Raw input: suitable for spontaneous speech analysis. No previous installation. http://kristina-project.eu/en/ (Domínguez et al., 2016c). AuToBI (Rosenberg, 2010) http://eniac.cs.qc.cuny.edu/andrew/autobi/; ANALOR (Avanzi et al., 2008) http://www.lattice.cnrs.fr/analor.html?lang=fr

4. PROSODY TAGGER (II) Multiple configurations

4. PROSODY TAGGER (III) Output: Praat on the Web; TextGrid for Local Use

5. DEMO

6. CONCLUSIONS (I) Integrative and Reproducible Research. Reproducibility as a principle of the scientific method. Reproducibility separates scientists from normal people.

6. CONCLUSIONS (II) Advantages of Specialized Modules: testing different techniques for the same process; different configurations; optimization of modules; task assignment; library of scripts; impact on reproducibility

6. CONCLUSIONS (III) Praat on the Web: Multidimensional Feature Vector within segment labels; VISUALIZATION in a dedicated window; Web-based Implementation; Ready-to-use Interface; Operational Interface for Modular Script Composition; Modular Computational Tasks within Praat Platform

https://portal.upf.edu/web/mdm-dtic/home THANK YOU! Acknowledgements: Iván Latorre, Joan Codina, Mireia Farrús, Leo Wanner, Aurelio Ruiz. Open Demopage & Source Code: http://kristina.taln.upf.edu/praatweb/ https://github.com/monikaupf. Contact: monica.dominguez@upf.edu @MonikaUPF @talnupf @projectkristina

REFERENCES
Y. Xu. 2013. ProsodyPro: a tool for large-scale systematic prosody analysis. In Proceedings of Tools and Resources for the Analysis of Speech Prosody (TRASP), pages 7-10, Aix-en-Provence, France.
G. Christodoulides. 2014. Praaline: Integrating tools for speech corpus research. In Proceedings of the 9th International Conference on Language Resources and Evaluation, Reykjavik, Iceland.
P. Boersma. 2001. Praat, a system for doing phonetics by computer. Glot International, 5(9/10):341-345.
G. Tsatsaronis, I. Varlamis, and K. Nørvåg. 2012. Semafor: Semantic document indexing using semantic forests. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, pages 1692-1696, New York, NY, USA. ACM.
P. Stenetorp, S. Pyysalo, G. Topić, T. Ohta, S. Ananiadou, and J. Tsujii. 2012. Brat: A web-based Tool for NLP-assisted Text Annotation. In Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL '12, pages 102-107, Stroudsburg, PA, USA. Association for Computational Linguistics.
H. Cunningham, D. Maynard, K. Bontcheva, V. Tablan, N. Aswani, I. Roberts, G. Gorrell, A. Funk, A. Roberts, D. Damljanovic, T. Heitz, M. A. Greenwood, H. Saggion, J. Petrak, Y. Li, and W. Peters. 2011. Text Processing with GATE (Version 6).
M. Domínguez, M. Farrús, A. Burga, and L. Wanner. 2016a. Using hierarchical information structure for prosody prediction in content-to-speech applications. In Proceedings of the 8th International Conference on Speech Prosody, pages 1019-1023, Boston, USA.
M. Domínguez, M. Farrús, and L. Wanner. 2016b. Combining acoustic and linguistic features in phrase-oriented prosody prediction. In Proceedings of the 8th International Conference on Speech Prosody, pages 796-800, Boston, USA.
M. Domínguez, M. Farrús, and L. Wanner. 2016c. An automatic prosody tagger for spontaneous speech. Accepted in COLING 2016, Osaka, Japan.
M. Domínguez, M. Farrús, and L. Wanner. 2016d. Praat on the Web: An Upgrade of Praat for Semi-Automatic Speech Annotation. Accepted in COLING 2016, Osaka, Japan.