The Closed Runway Operation Prevention Device: Applying Automatic Speech Recognition Technology for Aviation Safety


MITRE CAASD
Shuo Chen, Hunter Kopald
June 25, 2015
Approved for Public Release; Distribution Unlimited. 15-1756
© 2015 The MITRE Corporation. All rights reserved.

Outline
- Overview of closed runway operations
- Closed Runway Operation Prevention Device (CROPD)
- Role of speech recognition
- Automatic speech recognition, in general and in ATC
- Techniques to tune performance for the ATC domain
  - Language modeling
  - Acoustic model adaptation
  - Semantic interpretation and confidence thresholding
- Evaluating the CROPD speech recognition performance through field demonstration
- Implications and future work

Overview of Closed Runway Operations
- Operations no longer allowed on runways designated as closed
- Local controller transmission: "American one twenty-three, wind three five zero at eight, runway one left, cleared for takeoff"
- Pilot readback: "Runway one left, cleared for takeoff, American one twenty-three"
- Prevention mechanisms
  - Standard Operating Procedures
  - Basic memory aids: flight strip placards
  - Additional automation systems that provide alerts

Closed Runway Operation Prevention Device (CROPD)
- FAA-proposed concept to prevent operations on closed runways by using automatic speech recognition to monitor controller clearances
- CROPD alerts if both:
  1. Clearance detected
     - cleared to land
     - cleared for takeoff
     - line up and wait
     - Other clearances to be added, e.g., cleared touch and go
  2. Runway associated with clearance is designated as closed
- Additional details
  - Only listens to local controller transmissions
  - Not integrated with other tower systems (e.g., ASDE-X)
  - Intended for use at airports both with and without advanced runway safety systems
[Figure: system diagram. Air Traffic Control Tower: controller, graphical user interface, closed runway status, alert. CROPD in equipment room: receives transmission audio and closed runway status, issues alert trigger.]
- Speech recognition and understanding: identify spoken intent
  1. Runway
  2. Clearance
- Alert logic
  - Compare: 1. runway and clearance detected, 2. closed runway status
  - Trigger alert
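The two-part alert condition on this slide can be illustrated with a minimal sketch. The names `Intent`, `MONITORED_CLEARANCES`, and `should_alert` are assumptions made for illustration and are not taken from the CROPD implementation; the sketch also assumes that closing a physical runway closes both of its ends.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Clearances the slide lists as triggering the closed-runway check.
MONITORED_CLEARANCES = {"cleared to land", "cleared for takeoff", "line up and wait"}


@dataclass(frozen=True)
class Intent:
    """Hypothetical container for a detected intent."""
    runway: str      # e.g. "19L"
    clearance: str   # e.g. "cleared to land"


def should_alert(intent: Optional[Intent], closed_runways: Set[str]) -> bool:
    """Alert only if a monitored clearance was detected AND its runway is closed."""
    if intent is None:
        return False
    return (intent.clearance in MONITORED_CLEARANCES
            and intent.runway in closed_runways)


# Assumption: a closure of physical runway 19L/1R is entered as both ends.
closed = {"19L", "1R"}
print(should_alert(Intent("19L", "cleared to land"), closed))      # closed runway
print(should_alert(Intent("1C", "cleared for takeoff"), closed))   # open runway
print(should_alert(None, closed))                                  # no intent detected
```

Keeping the alert logic separate from the recognizer mirrors the split shown in the diagram: the recognizer only reports an intent, and the comparison against closure status happens afterwards.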

The Role of Speech Recognition
- Objective of the speech recognition is to detect intent to use a runway
- Example transmission: "United one twenty-three, wind three three zero at six, runway one center cleared to land, traffic departs runway three zero."

  Recognized numbers    Logical concept
  One twenty-three      ACID
  Three three zero      Wind direction
  Six                   Wind speed
  One (center)          Runway clearance
  Three zero            Runway

- For CROPD, intent can be considered the presence of a clearance and the runway associated with that clearance
  - Simplifies performance measure to specific phrase-level accuracy
  - Requires disambiguation of phrases that could be confused with runways
  - Requires accurate association of the correct runway with the clearance post-recognition
- Not all words in the transmission are equally important to the detection of intent
  - Word Error Rate (WER) is not appropriate
  - Need an application-specific metric
    - Composite measure of both runway and clearance
    - Considers correct association of runway with clearance
  - Use intent as metric, with 5 outcome types (CFT: cleared for takeoff; CTL: cleared to land)

  Truth (actual transmission)   CROPD result:          CROPD result:           CROPD result:
                                Intent: RWY 30, CFT    Intent: RWY 19L, CTL    No Intent
  Intent: RWY 30, CFT           Correct Intent         Incorrect Intent        Missed Intent
  No Intent                     False Intent           False Intent            Correct Rejection

- Speech recognition performance is different from system performance
  - Incorrect speech recognition results do not always translate to incorrect system performance
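The five outcome types in the table above can be written as a small classification function. This is a sketch under the assumption that an intent is represented as a (runway, clearance) pair and that "no intent" is represented as `None`; the function name is hypothetical.

```python
from typing import Optional, Tuple

Intent = Tuple[str, str]  # (runway, clearance), e.g. ("30", "CFT")


def classify_outcome(truth: Optional[Intent], detected: Optional[Intent]) -> str:
    """Map a (truth, CROPD result) pair to one of the five outcome types."""
    if truth is None:
        # Nothing was spoken: any detection at all is a false intent.
        return "Correct Rejection" if detected is None else "False Intent"
    if detected is None:
        return "Missed Intent"
    # Both runway and clearance must match for the intent to count as correct.
    return "Correct Intent" if detected == truth else "Incorrect Intent"


print(classify_outcome(("30", "CFT"), ("30", "CFT")))   # Correct Intent
print(classify_outcome(("30", "CFT"), ("19L", "CTL")))  # Incorrect Intent
print(classify_outcome(("30", "CFT"), None))            # Missed Intent
print(classify_outcome(None, ("30", "CFT")))            # False Intent
print(classify_outcome(None, None))                     # Correct Rejection
```

Because the comparison is on the whole pair, a correct runway with the wrong clearance (or the reverse) counts as an incorrect intent, which is what makes the metric a composite of runway and clearance.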

Automatic Speech Recognition
In General
- Automatic speech recognition performance depends not only on the speech input at the time of application, but also on what is already known about the application prior to use
- In some cases, meaning deduction is more important than word recognition
- Need to tailor system to the application
- Need to define what recognition is needed for success
With Air Traffic Control
- ATC characteristics that make automatic speech recognition challenging
  - Acoustic characteristics of audio equipment
  - Acoustic characteristics of the speaker set
    - Rate of speech
    - Pronunciation
    - Accents
  - Deviations from standard phraseology
- ATC characteristics that facilitate automatic speech recognition
  - Standard phraseology, when followed
  - Acoustic modeling, custom pronunciation adaptation
  - Context information: what the system should expect to hear
[Figure: KIAD airport diagram showing runways 12/30, 1L/19R, 1C/19C, and 1R/19L]

Tuning Recognition Performance: Language Model Creation and Adaptation
- Language models define the universe of words and word sequences that the speech recognition system is designed to recognize
  1. Finite-state grammars
     - Manually defined vocabulary and word sequences
     - Can yield near-perfect recognition in applications where speakers adhere to defined sequences
     - Poor tolerance for phrase deviations
  2. Statistical language models (SLMs)
     - Machine-generated vocabulary and probability model of word sequences based on transcription data from target environment
     - Robust to speech variations and disfluencies
     - Can yield improbable word sequences
- CROPD used both types of models
  - A grammar containing only runway and clearance phrases, with common substitutions, additions, and omissions
  - A statistical language model trained on 10,000+ transcriptions from the local controller position
- Initial benchmark of speech recognition performance on local controller audio using the two CROPD language models yielded
  - 70% true intent detection
  - 50% false intent detection
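The contrast between the two model types can be made concrete with a toy sketch. Everything here is an assumption for illustration: the grammar is a regular expression over a handful of KIAD runway and clearance phrases, the statistical model is a bigram model with add-one smoothing, and the three training transcripts are invented. None of it reflects the actual CROPD models.

```python
import math
import re
from collections import Counter

RUNWAYS = ["one left", "one center", "one right", "one niner left",
           "one niner center", "one niner right", "one two", "three zero"]
CLEARANCES = ["cleared to land", "cleared for takeoff", "line up and wait"]

# Finite-state grammar: only "runway <runway> <clearance>" is accepted.
GRAMMAR = re.compile(r"runway (%s) (%s)" % ("|".join(RUNWAYS), "|".join(CLEARANCES)))


def grammar_parse(text):
    """Return (runway, clearance) if the text follows the grammar, else None."""
    match = GRAMMAR.search(text)
    return match.groups() if match else None


class BigramModel:
    """Toy statistical language model: bigram counts with add-one smoothing."""

    def __init__(self, transcripts):
        self.contexts, self.bigrams = Counter(), Counter()
        for line in transcripts:
            words = ["<s>"] + line.split() + ["</s>"]
            self.contexts.update(words[:-1])
            self.bigrams.update(zip(words, words[1:]))
        self.vocab_size = len(set(self.contexts) | {"</s>"})

    def log_prob(self, text):
        words = ["<s>"] + text.split() + ["</s>"]
        return sum(
            math.log((self.bigrams[(a, b)] + 1) / (self.contexts[a] + self.vocab_size))
            for a, b in zip(words, words[1:])
        )


# Invented training transcripts, standing in for real controller data.
model = BigramModel([
    "runway one left cleared for takeoff",
    "runway one niner left cleared to land",
    "runway three zero line up and wait",
])

standard = "runway one left cleared for takeoff"
deviation = "runway one left you are cleared for takeoff"
print(grammar_parse(standard))    # grammar accepts the standard phrase
print(grammar_parse(deviation))   # grammar rejects the deviation outright
print(model.log_prob(standard), model.log_prob(deviation))  # SLM scores both
```

The grammar gives an all-or-nothing answer, which is why it tolerates deviations poorly, while the smoothed bigram model assigns every word sequence some probability, including sequences that no controller would say.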

Tuning Recognition Performance: Acoustic Model Adaptation
- Acoustic models define the statistical signatures of speech sounds, i.e. phonemes, and specify the sound sequences that form words
  - Default acoustic model usually provided out-of-the-box
- Methods of adaptation:
  1. Pronunciation dictionaries
     - Manually defined word pronunciations
     - Accounts for common pronunciation variation and non-standard words, e.g. airline names, fix names
  2. Automatic acoustic model adaptation
     - Automatic supervised training of existing acoustic model with audio and transcriptions from target speaker set and environment
     - Accounts for channel characteristics and consistent speech patterns, i.e. accents, within a speaker set
- CROPD used both types of acoustic model tuning techniques
  - Custom ATC dictionary
    - Non-standard words, e.g. niner
    - Composite words where coarticulation and assimilation were frequently observed, e.g. cleared-to-land ("clea da lan")
  - Custom acoustic model adapted with audio and transcriptions from local controller position
- Second benchmark of speech recognition performance showed improvement
  - increased true intent detection to 89%
  - reduced false intent detection to 18%
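One way to picture the custom dictionary is as a lexicon keyed by word, where a composite word such as cleared-to-land is treated as a single token with its own pronunciations. The sketch below is illustrative only: the phoneme strings are rough ARPAbet-style guesses, and `COMPOSITES`, `LEXICON`, `merge_composites`, and `pronunciations` are hypothetical names, not part of any recognizer's API.

```python
import re

# Phrases to be treated as one dictionary entry (assumed from the slide's example).
COMPOSITES = {"cleared to land": "cleared-to-land"}

# Illustrative pronunciations only; a real dictionary would use the
# recognizer's own phoneme set and carefully checked entries.
LEXICON = {
    "niner": ["N AY N ER"],
    "cleared-to-land": [
        "K L IH R D T AH L AE N D",  # careful pronunciation
        "K L IH R D AH L AE N",      # reduced form, roughly "clea da lan"
    ],
}


def merge_composites(transcript: str) -> str:
    """Rewrite composite phrases in a transcript as single tokens."""
    for phrase, token in COMPOSITES.items():
        transcript = re.sub(r"\b%s\b" % re.escape(phrase), token, transcript)
    return transcript


def pronunciations(word: str):
    """Look up all listed pronunciations of a word (empty list if unknown)."""
    return LEXICON.get(word, [])


text = merge_composites("runway one niner left cleared to land")
print(text)
print(pronunciations("cleared-to-land"))
```

Merging the phrase into one token lets the reduced pronunciation be listed directly, instead of hoping that three separate word models survive heavy coarticulation.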

Tuning Recognition Performance: Post-Recognition Processing
- Semantic interpretation performs text processing to derive logical concepts from potentially error-filled raw text
  - Apply application-specific reasoning to disambiguate confusable phrases
  - Introduce robustness to recognition error by selectively using only parts of the recognized text
- For CROPD, a semantic interpretation algorithm was developed
  - Rescores hypotheses based on system-generated confidence scores
  - Deduces the most probable combination of clearance and runway
- Third benchmark of speech recognition performance showed improvement
  - increased true intent detection to 95%
  - reduced false intent detection to 17%
- Confidence thresholding uses system-generated scores to bias the final result to a specific balance of missed and false detections
  - Does not affect the speech recognition system
  - Advises the application using the speech recognition whether to accept a recognition result based on its own certainty of accuracy
- In CROPD, confidence thresholding is implemented
  - System users must determine what confidence threshold is operationally acceptable
  - Can vary by facility
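A rough sketch of how rescoring and thresholding could fit together is shown below. It is not the CROPD algorithm, whose details the slides do not give. It assumes the recognizer returns an n-best list of (text, confidence) pairs, pools confidence over hypotheses that agree on the same runway and clearance, and accepts the best pair only if its share of the total confidence meets a threshold. The pattern, function name, and example scores are all invented for illustration.

```python
import re
from collections import defaultdict

# Simplified runway/clearance pattern (assumption; covers KIAD-style designators).
PATTERN = re.compile(
    r"runway ((?:one niner|one two|three zero|one)(?: (?:left|right|center))?) "
    r"(cleared to land|cleared for takeoff|line up and wait)"
)


def interpret(nbest, threshold):
    """Pick the most probable (runway, clearance) pair from an n-best list.

    Returns ((runway, clearance), confidence), or None if no pair is found
    or the pooled confidence falls below the threshold.
    """
    total = sum(conf for _, conf in nbest)
    scores = defaultdict(float)
    for text, conf in nbest:
        match = PATTERN.search(text)  # use only the part of the text that matters
        if match:
            scores[match.groups()] += conf
    if not scores or total <= 0:
        return None
    best = max(scores, key=scores.get)
    confidence = scores[best] / total
    return (best, confidence) if confidence >= threshold else None


# Invented n-best list with made-up confidence scores.
nbest = [
    ("united one twenty three runway one center cleared to land", 0.60),
    ("united one twenty three runway one center clear to land", 0.25),
    ("united one twenty three runway one niner center cleared to land", 0.15),
]
print(interpret(nbest, threshold=0.5))  # accepted
print(interpret(nbest, threshold=0.7))  # rejected: below the threshold
```

Raising the threshold turns low-confidence detections into non-detections, trading false intents for missed intents, which is the balance the slide says each facility must choose for itself.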

Evaluating Performance through Field Demonstration
- In summer of 2014, the CROPD was field tested at KIAD as a part of the NAS
  - Passively monitored local controller channel by connecting to the facility voice switch
  - A subset of field audio was selected for performance evaluation
  - The effect of adjusting the confidence threshold was also studied

  Transmissions with Intent
    True Intent           92.70%
    Incorrect Intent       4.77%
    Missed Intent          2.54%
  Transmissions without Intent
    Correct Non-Intent    90.33%
    False Intent           9.67%

- Analysis of some of the false and incorrect intent errors showed that the semantic interpretation algorithm could be further improved
  - Erroneously discarding correctly recognized runway phrases
  - Some ambiguous phrases still erroneously tagged as runways
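The percentages above are reported separately for transmissions with and without intent, so each group has its own denominator. A minimal sketch of that calculation follows; the outcome counts in the example are synthetic and are not the field data, and the function name is an assumption.

```python
from collections import Counter

WITH_INTENT = ("True Intent", "Incorrect Intent", "Missed Intent")
WITHOUT_INTENT = ("Correct Non-Intent", "False Intent")


def outcome_rates(outcomes):
    """Return each outcome's percentage within its own group of transmissions."""
    counts = Counter(outcomes)
    rates = {}
    for group in (WITH_INTENT, WITHOUT_INTENT):
        denominator = sum(counts[name] for name in group)
        for name in group:
            rates[name] = 100.0 * counts[name] / denominator if denominator else 0.0
    return rates


# Synthetic example: 10 transmissions with intent, 10 without.
outcomes = (["True Intent"] * 8 + ["Incorrect Intent"] + ["Missed Intent"]
            + ["Correct Non-Intent"] * 9 + ["False Intent"])
for name, rate in outcome_rates(outcomes).items():
    print(f"{name}: {rate:.2f}%")
```

Splitting the denominators keeps the false intent rate from being diluted when most transmissions on the channel carry no runway clearance at all.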

Implications and Future Work
- A combination of various tuning techniques can yield significant performance improvements
- CROPD is a simple, isolated application that demonstrates the feasibility of applying automatic speech recognition on live ATC transmissions
  - Other applications could make use of the same detected clearances for other purposes or detect new instructions for other controller positions
  - More complex applications could also be feasible with the addition of context information from tower automation systems
- MITRE is currently investigating the benefits of integrating speech recognition with Tower/Surface surveillance data
  - Leverage speech-derived intent to improve safety alert performance
  - Use aircraft and airport state information to improve speech recognition performance

Questions?

A Closer Look at Recognition vs. System Performance: Missed Intent Detection
(transmission with intent)
- Correct alert behavior: spoken runway not closed, no missed or false alert
  - Spoken intent: 19L, cleared to land. Closure: none. Detected: none. Result: correct non-alert
- Incorrect alert behavior: spoken runway closed, missed alert
  - Spoken intent: 19L, cleared to land. Closure: 19L/1R. Detected: none. Result: missed alert

A Closer Look at Recognition vs. System Performance: False Intent Detection
(transmission without intent)
- Correct alert behavior: detected runway not closed, no false alert
  - Spoken intent: none. Closure: none. Detected: 19L, cleared to land. Result: correct non-alert
- Incorrect alert behavior: detected runway closed, false alert
  - Spoken intent: none. Closure: 19L/1R. Detected: 19L, cleared to land. Result: false alert

A Closer Look at Recognition vs. System Performance: Incorrect Intent Detection
(transmission with intent)
- Correct alert behavior
  - Spoken runway not closed, detected runway not closed: no missed or false alert
    - Spoken intent: 19L, cleared to land. Closure: none. Detected: 1L, cleared to land. Result: correct non-alert
  - Spoken runway closed, detected runway is opposite end of spoken runway: alert correctly generated
    - Spoken intent: 19C, cleared to land. Closure: 19C/1C. Detected: 1C, cleared to land. Result: correct alert
  - Detected runway is not closed: no false alert
    - Spoken intent: 19L, cleared to land. Closure: none. Detected: 1L, cleared to land. Result: correct non-alert
- Incorrect alert behavior
  - Spoken runway closed, detected runway not closed: missed alert
    - Spoken intent: 19L, cleared to land. Closure: 19L/1R. Detected: 1L, cleared to land. Result: missed alert
  - Detected runway is closed: false alert
    - Spoken intent: 19L, cleared to land. Closure: 1L/19R. Detected: 1L, cleared to land. Result: false alert
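The cases on these backup slides reduce to comparing two questions: should an alert have been raised (is the spoken runway closed?) and was one raised (is the detected runway closed?). The sketch below works through the slide examples under simplifying assumptions: every intent carries a monitored clearance, a closure is given as a pair of runway ends, and the function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Closure = Tuple[str, str]  # both ends of one physical runway, e.g. ("19L", "1R")


def is_closed(runway: Optional[str], closures: Iterable[Closure]) -> bool:
    """True if the runway end belongs to any closed physical runway."""
    return runway is not None and any(runway in pair for pair in closures)


def alert_outcome(spoken: Optional[str], detected: Optional[str],
                  closures: Iterable[Closure]) -> str:
    """Compare the alert that should occur with the alert the system raises."""
    closures = list(closures)
    should_alert = is_closed(spoken, closures)
    did_alert = is_closed(detected, closures)
    if should_alert and did_alert:
        return "Correct alert"
    if should_alert:
        return "Missed alert"
    if did_alert:
        return "False alert"
    return "Correct non-alert"


# Examples taken from the slides (runway ends only; clearance omitted).
print(alert_outcome("19L", None, []))                  # missed intent, no closure
print(alert_outcome("19L", None, [("19L", "1R")]))     # missed intent, runway closed
print(alert_outcome(None, "19L", [("19L", "1R")]))     # false intent, runway closed
print(alert_outcome("19C", "1C", [("19C", "1C")]))     # opposite end detected
print(alert_outcome("19L", "1L", [("1L", "19R")]))     # wrong runway, and it is closed
```

The fourth example shows why recognition errors do not always become system errors: the recognizer confused the two ends of the same physical runway, yet the alert was still the right one.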