The Conversational User Interface


Ronald Kaplan
Nuance Sunnyvale NL/AI Lab; Department of Linguistics, Stanford
May 2013
ron.kaplan@nuance.com

GUI: The problem
Extensional

CUI: The solution
Intensional
Bobrow et al. 1975
© 2002-2012 Nuance Communications, Inc. All rights reserved. ENTERPRISE SOLUTIONS

From then to now: Obstacles
Typing is unnatural, speech recognition is hard
Language is efficient: much is unsaid but understood
  Rampant ambiguity without context and expectations: "The chicken seemed ready to eat"
  Precision is tedious: Conversation with a 3-year-old?
Language is complex
  Many overlapping patterns to encode meaning
Conversation is a cooperative social activity
  Speaker/hearer model each other, share conventions, plan and reason
You need something worth talking about
  Detect goals, track environment, determine/execute useful actions

The opportunity
Ubiquitous computing → ubiquitous complexity
  Mass distribution of cost-effective computation → confusion (try controlling a TV, thermostat, irrigation clock, …)
Phone as portal: the illusion of simplicity
  Universal: Applification of other connected devices
  Uniform: Same channel for special interactions
  Personal and situational: preferred and appropriate behavior
Cloud infrastructure for shared information and back-end processing
Advances on key components: speech, NL, dialog, reasoning
Public/defined interfaces to local devices, remote sources and services
(Siri: The NL Summer)

Speech recognition performance
Out-of-the-box performance is becoming good and continues to improve rapidly
[Figure, first panel: % of users with a given error rate, plotted against average word error rate (%). Callout: 60% of users experience an error rate of < 10% during first-time mobile use.]
[Figure, second panel: average word error rate by year, 2009 to 2012. Callout: 18% average word error rate reduction per year.]

Recognition research
WHAT'S CHANGED?

                 5 years ago           Now
Training data    1,000s of hours       100,000s of hours
Algorithms       Maximum-likelihood    Deep Belief Neural Networks
Computation      1 workstation         10,000s of cores (100X more computation)
Run-time         GMM likelihoods       + Matrix multiplies

Conversation and Information
Ordinary language to describe what you need: "When will my package arrive?"
Clarification/repair: "No, tomorrow"
Drill-down discussion: "What are the 15-year rates?"
Immediate sentiment: "You lost my luggage!"

Conversation and Action
E-commerce
  "Flight to San Diego"
  "Mexican restaurants?" "No, Italian"
  "OK, table for 4 at about 7"
TV
  Direct command: "Change to channel 5"
  Standing order: "Turn the volume down during ads"
Thermostat
  "A little cooler in the afternoon"
  "Vacation starting Tuesday"
Customer service
  "Change my address to xxxx."

A simple conversation
A dialog between Bob and a speech-enabled proactive Conversational Assistant (CA)
Bob> Book a table at Zingari's after my last meeting and let Tom and Brian know to meet me there.
CA> Sorry, but there aren't any tables open until 9pm. Would you like me to find you another Italian restaurant in the area at about 6:30pm?
Bob> Can you find a table at a restaurant with a good wine list?
CA> Barbacco has an opening. It's in the Financial District but the travel time is about the same.
Bob> Ok. That sounds good.

A simple conversation: notes on Bob's opening request
Ambiguous: booking done now or after last meeting
Assumption: last meeting today (check calendar) and dinner tonight (tomorrow also meets constraints)
Expectation: Bob usually spends 30 minutes on email before leaving work
Factor in travel time: 6:30 for reservation
Referent for Tom and Brian

A simple conversation: notes on the CA's first reply
Initial search fails
Informative explanation, not just "I can't"
Relax less important constraints, propose an otherwise similar alternative: type of restaurant and table time
Expose most salient of remaining constraints

A simple conversation: notes on Bob's follow-up question
Indirect speech act: Reserve, not Yes or No
Implicit confirmation: No disagreement with exposed constraints
Interpretation: A table at an Italian restaurant with a good wine list
Semantic search: an Italian restaurant with good wine
Sources: Yelp, restaurant websites, maps

A simple conversation: notes on the CA's second reply
Drop one of the constraints ("restaurant in the area") in preference to others ("same travel time", "Italian", "Tonight")

A simple conversation: after Bob's "Ok. That sounds good."
End of Dialog. CA goes to Opentable, makes the reservations, sends emails to Tom and Brian.
Persistence: The duties of a true assistant are not yet complete. It must monitor the plan for unexpected events such as delays.

Many components, many disciplines
[Architecture diagram. Labels, in extraction order: Statistical Training & Symbolic Constraints: Data, grammars; Input; Language; Reasoning; Speech Recognition; Text, Gesture, Biometrics; Context; Language Comprehension; Sentiment Analysis; Dialog Manager; Speech Acts; Task Planner; Web; Output; Speech Synthesis; Text, Graphics; Language Generation; Visual design; Collaboration Model; User Model; Theorem Prover; Apps; Devices; Knowledge Representation, Ontologies, Facts]

Language and reasoning
Levels: Morphology, Syntax, Semantics, Pragmatics, Discourse & Dialog, AI and Reasoning
Major technical challenges:
  Integration of independent best-of-breed components
  Managing end-to-end ambiguity through hard constraints and probabilistic reasoning
  Bridging language and logic
  Inferring intent & learning preferences
  Global resolution of ambiguity while preserving modularity
  Deployment at scale
  Modeling collaboration
  Representing knowledge

Computational challenge: Pervasive ambiguity
Levels shown: Morphology & Syntax, Semantics, Mentions, ...

Example                         Ambiguity
Every nominee got an award.     The same award or each their own?
The chicken is ready to eat.    Cooked or hungry?
walks                           noun or verb?
untieable knot                  (untie)able or un(tieable)?
bank                            river or financial?
General Mills                   person or company?

Ambiguity can be explosive if alternatives multiply within or across modules
[Diagram: module stack labeled Knowledge, Semantics, Syntax, Mentions, Speech]
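The multiplication can be made concrete with a small count. The sketch below assumes a hypothetical utterance with two alternatives at each of four modules; the alternatives themselves are invented for illustration and are not from the talk.

```python
from itertools import product
from math import prod

# Hypothetical alternatives per module for one utterance (illustrative only).
alternatives = {
    "speech": ["book a table", "look a table"],
    "mentions": ["Zingari's (restaurant)", "Zingari (person)"],
    "syntax": ["book later", "book now, table later"],
    "semantics": ["dinner tonight", "dinner tomorrow"],
}


def count_readings(alts):
    """Number of end-to-end readings if every module's choices combine freely."""
    return prod(len(options) for options in alts.values())


def enumerate_readings(alts):
    """Explicitly list every combination (feasible only for tiny examples)."""
    return list(product(*alts.values()))


total = count_readings(alternatives)
print(total)
```

With only two alternatives per module the count already doubles at every stage, which is why enumerating readings across a full pipeline quickly becomes impractical.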

Pruning: Premature Disambiguation
Typical approach: Local heuristics to kill as soon as possible
Oops: Strong constraints may reject the so-far-best (= only) option
[Diagram: Statistics prune alternatives (marked X) at the Speech, Mentions, Syntax and Semantics stages, before Knowledge]
Semantics may know:
  The veal is ready to eat.
  The calf is ready to eat.
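A toy version of the veal/calf case shows the failure mode. This is a minimal sketch, not the talk's system: the scores, the `ANIMATE` and `FOOD` sets, and the function names are all assumptions made up for the example.

```python
def parse(subject):
    """Toy syntax module: two readings of '<subject> is ready to eat' (made-up scores)."""
    return [
        {"role": "eaten", "score": 0.6, "subject": subject},
        {"role": "eater", "score": 0.4, "subject": subject},
    ]


# Toy semantic knowledge, assumed for this example only.
ANIMATE = {"calf", "chicken"}
FOOD = {"veal", "chicken"}


def semantically_ok(reading):
    """Hard constraint: eaters must be animate, things eaten must be food."""
    if reading["role"] == "eater":
        return reading["subject"] in ANIMATE
    return reading["subject"] in FOOD


def prune_early(subject):
    """Keep only the locally best parse, then apply the semantic constraint."""
    best = max(parse(subject), key=lambda r: r["score"])
    return [r for r in [best] if semantically_ok(r)]


def prune_late(subject):
    """Keep all parses, apply the semantic constraint, then rank by score."""
    survivors = [r for r in parse(subject) if semantically_ok(r)]
    return sorted(survivors, key=lambda r: r["score"], reverse=True)


print(prune_early("calf"))  # the only surviving option was rejected downstream
print(prune_late("calf"))
```

For "calf", early pruning commits to the statistically preferred "eaten" reading, which the semantic constraint then rejects, leaving nothing; deferring the decision lets the "eater" reading survive.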

Syntactic ambiguity
Bob> Book a table after my last meeting
(Parses from LFG/XLE-Web, Bergen)
Two readings: "Book Later" vs. "Book Now, Table Later"
Statistics and pragmatic reasoning to choose interpretation

Packing syntactic ambiguity
"Book a table after my last meeting"
[Diagram: structure common to both readings is shared; the alternatives "book now, table later" and "book later" are kept side by side]
Interpretation chosen by later modules (pragmatic reasoning and domain statistics)
Choice doesn't depend on meeting structure, so never unpacked
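One way to picture packing is a structure that stores the shared material once and keeps each ambiguity as a labeled choice point. The sketch below is an assumption-laden illustration, not the XLE representation: `PackedAnalysis`, the fact strings, and the preference scores are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PackedAnalysis:
    """Shared facts plus named choice points, each with alternative fact sets."""
    shared: frozenset
    choices: dict = field(default_factory=dict)  # name -> {label: frozenset of facts}

    def stored_facts(self):
        """Facts held in memory, counting the shared material only once."""
        return len(self.shared) + sum(
            len(facts) for alts in self.choices.values() for facts in alts.values()
        )

    def unpack(self, selection):
        """Materialise a single reading, given one label per choice point."""
        facts = set(self.shared)
        for name, label in selection.items():
            facts |= self.choices[name][label]
        return facts


def choose(packed, preference):
    """Later module: pick the best-scoring label at each choice point."""
    return {
        name: max(alts, key=lambda label: preference[name][label])
        for name, alts in packed.choices.items()
    }


packed = PackedAnalysis(
    shared=frozenset({"book(e)", "object(e,table)", "meeting(m)", "last(m)", "of(m,Bob)"}),
    choices={
        "after-attachment": {
            "book later": frozenset({"after(e,m)"}),
            "book now, table later": frozenset({"after(table,m)"}),
        }
    },
)

# Made-up scores standing in for pragmatic reasoning and domain statistics.
preference = {"after-attachment": {"book later": 0.2, "book now, table later": 0.8}}
selection = choose(packed, preference)
reading = packed.unpack(selection)
```

The facts about the meeting sit in `shared` and are never copied per reading; only the attachment of "after" differs between the alternatives, and that is resolved by the later preference step.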

Technical approaches: data + rules
Data driven: learning by observation
  Classification and correlation, on the head (current fad)
  Automatically (?) populates framework of domain concepts and contexts
  Probabilistic preference and disambiguation
Symbolic: learning by instruction
  Interpretation, on the tail
  Deep, long-span linguistic structures provide statistical locality
  Less domain dependent
  Back-offs for robustness
Appropriate combination: Trade data for knowledge

Semantic analysis
Bob> Can you find a table at a restaurant with a good wine list?
Syntactic structure mapped to logical representation with event tokens, individual objects, properties and relations
Davidsonian representation (event variables) supports incremental addition of new constraints by conjunction
Discourse Representation Structures (DRS) for ease of manipulation, with translation to first order logic for more general reasoning

Discourse structure (DRS):
  e1, e2, x, y
  Surface_request(e1,e2)
  Agent(e1,Bob), Agent(e2,CA)
  Find(e2), Restaurant(x), Object(e2,x)
  Food(x,Italian), Open(x)
  Available(y,x), Wine(y), Good(y)
[Diagram labels: Discourse structure, Logical representation]
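The point about incremental conjunction can be sketched in a few lines. The `DRS` class below is a hypothetical illustration, not the talk's implementation; it assumes the convention that names starting with a lower-case letter are variables, and its `to_fol` output is an informal string rather than input for any particular prover.

```python
from dataclasses import dataclass, field


@dataclass
class DRS:
    """A flat discourse representation: discourse referents plus conditions."""
    referents: list = field(default_factory=list)
    conditions: list = field(default_factory=list)  # tuples such as ("Find", "e2")

    def add(self, pred, *args):
        """Conjoin one more condition; new variables become discourse referents."""
        for arg in args:
            # Convention for this sketch: lower-case initial means variable.
            if arg[0].islower() and arg not in self.referents:
                self.referents.append(arg)
        self.conditions.append((pred, *args))
        return self

    def to_fol(self):
        """Existentially close the referents over the conjunction of conditions."""
        body = " & ".join(f"{pred}({','.join(args)})" for pred, *args in self.conditions)
        return f"exists {','.join(self.referents)}. ({body})"


drs = DRS()
drs.add("Surface_request", "e1", "e2").add("Agent", "e1", "Bob").add("Agent", "e2", "CA")
drs.add("Find", "e2").add("Restaurant", "x").add("Object", "e2", "x")

# Constraints learned later are simply conjoined; nothing already built is revised.
drs.add("Food", "x", "Italian").add("Open", "x")
drs.add("Available", "y", "x").add("Wine", "y").add("Good", "y")

print(drs.to_fol())
```

Because every property hangs off an event or object variable, adding "Italian" from the earlier dialog context is one more conjunct rather than a change to the arity of `Find`.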

Pragmatics
Example: Speech acts
Bob> Can you find a table at a restaurant with a good wine list?
Transform surface speech act (ability to find a table?) into a request to make a reservation

Before:
  e1, e2, x, y
  Surface_request(e1,e2)
  Agent(e1,Bob), Agent(e2,CA)
  Find(e2), Restaurant(x), Object(e2,x)
  Food(x,Italian), Open(x)
  Available(y,x), Wine(y), Good(y)

After:
  e1, e2, x, y
  Request(e1,e2)
  Agent(e1,Bob), Agent(e2,CA)
  Reserve(e2), Restaurant(x), Object(e2,x)
  Food(x,Italian), Open(x)
  Available(y,x), Wine(y), Good(y)
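The before/after structures differ in exactly two conditions, so the transformation can be sketched as a rewrite over a list of conditions. The rule below is an assumed, deliberately narrow illustration (it fires only when the assistant is the agent of a `Find` event); the function name and the tuple encoding are hypothetical.

```python
def reinterpret_speech_act(conditions, assistant="CA"):
    """Rewrite 'can you find ...?' addressed to the assistant as a request to reserve."""
    conds = list(conditions)
    surface = [c for c in conds if c[0] == "Surface_request"]
    for _, speaker_event, asked_event in surface:
        addressed_to_assistant = ("Agent", asked_event, assistant) in conds
        asks_about_finding = ("Find", asked_event) in conds
        if addressed_to_assistant and asks_about_finding:
            conds = [
                ("Request", speaker_event, asked_event)
                if c == ("Surface_request", speaker_event, asked_event)
                else ("Reserve", asked_event)
                if c == ("Find", asked_event)
                else c
                for c in conds
            ]
    return conds


surface_drs = [
    ("Surface_request", "e1", "e2"),
    ("Agent", "e1", "Bob"), ("Agent", "e2", "CA"),
    ("Find", "e2"), ("Restaurant", "x"), ("Object", "e2", "x"),
    ("Food", "x", "Italian"), ("Open", "x"),
    ("Available", "y", "x"), ("Wine", "y"), ("Good", "y"),
]

intended = reinterpret_speech_act(surface_drs)
```

All other conditions pass through untouched, which mirrors the slide: the restaurant, cuisine and wine constraints carry over unchanged to the reservation request.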

Conversational interaction: Plan and replan
"Book a table at Zingari's after my last meeting"

Task recipe library
[Diagram of recipes. Node labels: Book table, Get restaurant, Get time, Reserve, From user, Get Guide, Find, Yelp, Get candidates, Compare, Opentable]

Dynamic Intention Structures

First plan:
  Book_table(e1), Agent(e1,CA), Object(e1,r), Restaurant(r), Date(d), Time(t)
  Get_rest(e2), Agent(e2,CA)
  From_user(e3), r=zingari
  Get_time(e4), d=12112, t=6:30pm
  Reserve(e5), Agent(e5,CA), Object(e5,r), Source(e5,Opentable), Available(r,d,t)
  Opentable: not available

Select new recipe and elaborate:
  Book_table(e1), Agent(e1,CA), Object(e1,r), Restaurant(r), Has(r,w), wine(w), good(w), Date(d), Time(t)
  Get_restaurant(e2), Agent(e2,CA)
  Get_guide(e3), Agent(e3,CA), Object(e3,y)
  use(e3), Agent(e3,CA), Target(e3,y), y=yelp
  Find(e4), Agent(e4,CA), Object(e4,r), Source(e4,y), Type(italian), Driving(20m), r=barbacco
  Get_time(e5), d=12112, t=6:30pm
  Reserve(e6), Agent(e6,CA), Object(e6,r), Source(e6,Opentable), Available(r,d,t)
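The plan-then-replan behavior can be approximated by a recipe library with alternative decompositions and a loop that tries the next expansion when a step fails. This is a minimal sketch under stated assumptions: the recipe table, the action functions, and the `GUIDE` and `AVAILABLE` tables are toy stand-ins, and no real Yelp or Opentable interface is called.

```python
# Each task maps to alternative recipes; a recipe is an ordered list of sub-tasks.
RECIPES = {
    "book_table": [["get_restaurant", "get_time", "reserve"]],
    "get_restaurant": [["from_user"], ["get_guide", "find"]],
}

# Toy data standing in for a restaurant guide and a reservation service.
GUIDE = {"yelp": ["barbacco"]}
AVAILABLE = {("barbacco", "6:30pm")}


def from_user(state):
    state["restaurant"] = state["requested"]
    return True


def get_guide(state):
    state["guide"] = "yelp"
    return True


def find(state):
    state["restaurant"] = GUIDE[state["guide"]][0]
    return True


def get_time(state):
    state["time"] = "6:30pm"
    return True


def reserve(state):
    return (state["restaurant"], state["time"]) in AVAILABLE


ACTIONS = {f.__name__: f for f in (from_user, get_guide, find, get_time, reserve)}


def expand(task):
    """Yield every flat sequence of primitive actions that could achieve `task`."""
    if task not in RECIPES:
        yield [task]
        return
    for recipe in RECIPES[task]:
        yield from expand_sequence(recipe)


def expand_sequence(steps):
    if not steps:
        yield []
        return
    for head in expand(steps[0]):
        for tail in expand_sequence(steps[1:]):
            yield head + tail


def plan_and_replan(goal, state):
    """Try expansions in order; on a failed step, fall through to the next recipe."""
    for plan in expand(goal):
        trial = dict(state)
        if all(ACTIONS[step](trial) for step in plan):
            return plan, trial
    return None, state


plan, result = plan_and_replan("book_table", {"requested": "zingari"})
```

The first expansion takes the restaurant from the user and fails at `reserve`; the second swaps in the guide-based recipe for `get_restaurant`, matching the slide's "select new recipe and elaborate" step. A real planner would presumably repair the failed plan in place rather than restart from a copy of the initial state.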

Proactive monitoring, replan on failure
Anticipate glitches, create standing orders
If CA comes to believe that Bob hasn't left the office by 5:30 pm, it will form the intention to replan the book-table action
CA> Bob, you're running late. Should I change the reservation?
Bob> Yes, I'll be ready to leave in about 30 minutes.
...

Standing orders
Specific constraints on future/hypothetical events: Intensionality
  "Let me know when I get close to a café but not Starbucks"
  "Move $1000 to my savings when my paycheck comes in"
Linguistic pipeline decodes idiosyncratic intent (long tail)
Planner creates future-situation recognizer
Monitor watches and initiates action (location, time, bank, …)
Also: Collaborative help for big-head situations (e.g. Google Now cards)
  Infer from common interests and repeated patterns of daily life
  Little/no linguistic analysis
Templatic but flexible use of general planning and monitoring
User model and context awareness to suppress unwanted intrusions
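The recognizer-plus-monitor split can be sketched as predicates over observed situations. Everything below is a hypothetical illustration: `StandingOrder`, `Monitor`, the situation dictionaries, and the `repeating` flag are assumptions, and the actions only return strings rather than touching any real location or banking service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StandingOrder:
    description: str
    recognizer: Callable[[dict], bool]  # does this situation match the order?
    action: Callable[[dict], str]       # what to do when it does
    repeating: bool = False             # one-shot unless stated otherwise
    fired: bool = False


class Monitor:
    """Watches incoming situations and triggers matching standing orders."""

    def __init__(self):
        self.orders = []

    def add(self, order):
        self.orders.append(order)

    def observe(self, situation):
        triggered = []
        for order in self.orders:
            if order.fired and not order.repeating:
                continue
            if order.recognizer(situation):
                order.fired = True
                triggered.append(order.action(situation))
        return triggered


monitor = Monitor()
monitor.add(StandingOrder(
    description="Let me know when I get close to a cafe but not Starbucks",
    recognizer=lambda s: s.get("nearby_cafe") not in (None, "Starbucks"),
    action=lambda s: f"Notify: {s['nearby_cafe']} is nearby",
))
monitor.add(StandingOrder(
    description="Move $1000 to my savings when my paycheck comes in",
    recognizer=lambda s: s.get("deposit") == "paycheck",
    action=lambda s: "Transfer $1000 to savings",
    repeating=True,
))
```

Feeding the monitor `{"nearby_cafe": "Starbucks"}` triggers nothing, while `{"nearby_cafe": "Corner Cafe"}` fires the first order once. Whether an order is one-shot or recurring is an interpretation choice that the linguistic and planning stages would have to make; the flag here simply makes that choice explicit.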

Extending across domains
Linguistic analysis, conventions of conversation, planning principles remain
General vocabulary and grammatical expressions of meaning are (mostly) domain independent
  "I want…", "Can you…", "Later than that", "No, French", "Maybe Monday"
  Structured representations can be interpreted according to context
Upper ontology and axioms provide stable background
  People, places, objects, action, time, cause-effect, desire, belief, intention
New domain: augment general framework
  Add/specialize vocabulary and ontology
  Define constraints and inferences
  Provide access to domain information sources and execution interfaces
Architecture, algorithms, background are language independent
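Adding a domain by specializing an upper ontology can be illustrated with a parent-pointer table. The concept names and the `extend` and `is_a` helpers below are assumptions chosen for the example, not the ontology used in the talk.

```python
ROOT = "Thing"

# A tiny stand-in for an upper ontology: concept -> parent concept.
UPPER_ONTOLOGY = {
    "Person": ROOT,
    "Place": ROOT,
    "Object": ROOT,
    "Action": ROOT,
    "Time": ROOT,
}


def extend(ontology, additions):
    """Return a new ontology with domain concepts attached under existing ones."""
    merged = dict(ontology)
    merged.update(additions)
    for child, parent in additions.items():
        if parent != ROOT and parent not in merged:
            raise ValueError(f"{child} specializes unknown concept {parent}")
    return merged


def is_a(ontology, concept, ancestor):
    """Walk parent links to test whether `concept` falls under `ancestor`."""
    current = concept
    while current != ROOT:
        if current == ancestor:
            return True
        current = ontology[current]
    return ancestor == ROOT


# Hypothetical restaurant domain layered on the general framework.
RESTAURANT_DOMAIN = {
    "Restaurant": "Place",
    "ItalianRestaurant": "Restaurant",
    "Reserve": "Action",
}

ontology = extend(UPPER_ONTOLOGY, RESTAURANT_DOMAIN)
```

General inferences stated over `Place` or `Action` then apply to `ItalianRestaurant` and `Reserve` without being restated, which is the sense in which a new domain augments rather than replaces the framework.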

Conversation: Natural, efficient, effective
Universal way of interacting with
  Ubiquitous technology: Phone, TV, thermostat
  Information, institutions, and services
(Many) core technologies now exist
  Challenge of integration, ambiguity
Perfection is not required: People misunderstand too
  Must set appropriate expectations
  Must provide for easy repair
Confirmation is often unnatural
  A defensive hangover from the errorful past
  Needed for actions with consequence

Conversation: The killer app for NL and AI
