Natural Language Understanding


Lecture 16: Entity-based Coherence
Mirella Lapata
School of Informatics, University of Edinburgh
mlap@inf.ed.ac.uk
March 28, 2017

Outline:
- Discourse Representation
- Entity Transitions
- Ranking Model
- Text Ordering
- Summarization

Reading: Barzilay and Lapata (2008).

Coherence in Text

Coherence:
- is a property of well-written texts;
- makes them easier to read and understand;
- ensures that sentences are meaningfully related;
- and that the reader can work out what expressions mean;
- the text is thematically organized;
- temporally organized;
- rather than a random concatenation of sentences.

In this lecture, we will discuss Barzilay and Lapata's (2008) entity-based model of coherence.

Coherence in Text

Summary A
Britain said he did not have diplomatic immunity. The Spanish authorities contend that Pinochet may have committed crimes against Spanish citizens in Chile. Baltasar Garzon filed a request on Wednesday. Chile said, President Fidel Castro said Sunday he disagreed with the arrest in London.

Summary B
Former Chilean dictator Augusto Pinochet, was arrested in London on 14 October 1998. Pinochet, 82, was recovering from surgery. The arrest was in response to an extradition warrant served by a Spanish judge. Pinochet was charged with murdering thousands, including many Spaniards. Pinochet is awaiting a hearing, his fate in the balance. American scholars applauded the arrest.

Entity-based Coherence

The way entities are introduced and discussed influences coherence (Grosz et al., 1995).
- Entities in an utterance are ranked according to salience.
  - Is an entity pronominalized or not?
  - Is an entity in a prominent syntactic position?
- Each utterance has one center (topic or focus).
- Coherent discourses have utterances with common centers.
- Entity transitions capture degrees of coherence (e.g., in Centering theory continue > shift).

Notions of salience, utterance, ranking are left unspecified.

Entity-based Local Coherence

John went to his favorite music store to buy a piano.
He had frequented the store for many years.
He was excited that he could finally buy a piano.
He arrived just as the store was closing for the day.

John went to his favorite music store to buy a piano.
It was a store John had frequented for many years.
He was excited that he could finally buy a piano.
It was closing just as John arrived.


Can we compute a discourse representation automatically?
- Does it capture coherence characteristics?
- What linguistic information matters for coherence?
- Is it robust across domains and genres?

What is an appropriate coherence model?
- View coherence rating as a machine learning problem.
- Learn a ranking function without manual involvement.
- Apply to text-to-text generation tasks.

Inspired by entity-based theories, not a direct implementation of any theory in particular.

Discourse Representation

1. Former Chilean dictator Augusto Pinochet, was arrested in London on 14 October 1998.
2. Pinochet, 82, was recovering from surgery.
3. The arrest was in response to an extradition warrant served by a Spanish judge.
4. Pinochet was charged with murdering thousands, including many Spaniards.
5. He is awaiting a hearing, his fate in the balance.
6. American scholars applauded the arrest.

1. Former Chilean dictator Augusto Pinochet, was arrested in London on 14 October 1998.
2. Pinochet, 82, was recovering from surgery.
3. The arrest was in response to an extradition warrant served by a Spanish judge.
4. Pinochet was charged with murdering thousands, including many Spaniards.
5. He is awaiting a hearing, Pinochet's fate in the balance.
6. American scholars applauded the arrest.

Syntactic roles of entity mentions (S = subject, O = object, X = other):

1. Former Chilean dictator Augusto Pinochet[S], was arrested in London[X] on 14 October[X] 1998.
2. Pinochet[S], 82, was recovering from surgery[X].
3. The arrest[S] was in response[X] to an extradition warrant[X] served by a Spanish judge[S].
4. Pinochet[O] was charged with murdering thousands[O], including many Spaniards[O].
5. Pinochet[S] is awaiting a hearing[O], his fate[X] in the balance[X].
6. American scholars[S] applauded the arrest[O].

1  Pinochet[S]  London[X]  October[X]
2  Pinochet[S]  surgery[X]
3  arrest[S]  response[X]  warrant[X]  judge[O]
4  Pinochet[O]  thousands[O]  Spaniards[O]
5  Pinochet[S]  hearing[O]  Pinochet[X]  fate[X]  balance[X]
6  scholars[S]  arrest[O]

Entity grid (rows are sentences, columns are entities, – marks an entity that does not occur in the sentence):

   Pinochet  London  October  Surgery  Arrest  Extradition  Warrant  Judge  Thousands  Spaniards  Hearing  Fate  Balance  Scholars
1  S         X       X        –        –       –            –        –      –          –          –        –     –        –
2  S         –       –        X        –       –            –        –      –          –          –        –     –        –
3  –         –       –        –        S       X            X        O      –          –          –        –     –        –
4  O         –       –        –        –       –            –        –      O          O          –        –     –        –
5  S         –       –        –        –       –            –        –      –          –          O        X     X        –
6  –         –       –        –        O       –            –        –      –          –          –        –     –        S
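The grid construction can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the mention lists are typed in by hand (following the grid's column names), the function name `build_grid` is hypothetical, and the code assumes that when an entity occurs more than once in a sentence the highest-ranked role wins (S over O over X), which is what the row for sentence 5 suggests.

```python
# Role-tagged mentions per sentence, typed in by hand for illustration.
# Role codes: S = subject, O = object, X = other, "-" = absent.
SENTENCES = [
    [("Pinochet", "S"), ("London", "X"), ("October", "X")],
    [("Pinochet", "S"), ("surgery", "X")],
    [("arrest", "S"), ("extradition", "X"), ("warrant", "X"), ("judge", "O")],
    [("Pinochet", "O"), ("thousands", "O"), ("Spaniards", "O")],
    [("Pinochet", "S"), ("hearing", "O"), ("Pinochet", "X"),
     ("fate", "X"), ("balance", "X")],
    [("scholars", "S"), ("arrest", "O")],
]

# Assumed precedence when an entity has several roles in one sentence.
RANK = {"S": 3, "O": 2, "X": 1, "-": 0}


def build_grid(sentences):
    """Return (entities, grid); grid[i][j] is the role of entity j in sentence i."""
    # Columns are ordered by first mention in the text.
    entities = []
    for mentions in sentences:
        for name, _ in mentions:
            if name not in entities:
                entities.append(name)
    grid = []
    for mentions in sentences:
        row = {name: "-" for name in entities}
        for name, role in mentions:
            # Keep only the highest-ranked role per entity and sentence.
            if RANK[role] > RANK[row[name]]:
                row[name] = role
        grid.append([row[name] for name in entities])
    return entities, grid


entities, grid = build_grid(SENTENCES)
for number, row in enumerate(grid, start=1):
    print(number, " ".join(row))
```

In a real system the mentions and their roles would come from a parser and, optionally, a coreference resolver rather than from a hand-written list.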

[Figure: the entity grid above shown as a bare matrix of S, O, X and – symbols, without row or column labels.]

Entity Transitions

Definition
A local entity transition is a sequence $\{S, O, X, –\}^n$ that represents entity occurrences and their syntactic roles in $n$ adjacent sentences.

Feature Vector Notation
Each grid $x_{ij}$ for document $d_i$ is represented by a feature vector:

$\Phi(x_{ij}) = (p_1(x_{ij}), p_2(x_{ij}), \ldots, p_m(x_{ij}))$

- $m$ is the number of predefined entity transitions
- $p_t(x_{ij})$ is the probability of transition $t$ in grid $x_{ij}$
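A minimal sketch of how such a feature vector might be computed, assuming the grid is stored as a list of rows (one per sentence) with one role symbol per entity and "-" standing for an absent entity. The function name and the toy grid are illustrative only. Each probability is taken to be the relative frequency of a transition among all transitions of length n in the grid.

```python
from collections import Counter
from itertools import product

ROLES = ["S", "O", "X", "-"]


def transition_probabilities(grid, n=2):
    """Map every length-n role sequence to its relative frequency in the grid."""
    counts = Counter()
    num_sentences = len(grid)
    num_entities = len(grid[0]) if grid else 0
    for col in range(num_entities):
        # Read one entity's column from top to bottom.
        column = [grid[row][col] for row in range(num_sentences)]
        # Slide a window of n adjacent sentences down the column.
        for start in range(num_sentences - n + 1):
            counts[tuple(column[start:start + n])] += 1
    total = sum(counts.values())
    # Fixed ordering over all 4**n transitions gives comparable vectors.
    return {t: (counts[t] / total if total else 0.0)
            for t in product(ROLES, repeat=n)}


# Toy grid: 3 sentences (rows) by 2 entities (columns).
toy_grid = [
    ["S", "-"],
    ["S", "X"],
    ["-", "O"],
]
features = transition_probabilities(toy_grid, n=2)
print(features[("S", "S")])
```

Iterating over `product(ROLES, repeat=n)` keeps transitions that never occur in the vector with probability 0, so that vectors from different documents line up dimension by dimension.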

Entity Transitions

Example (transitions of length 2)

      SS   SO   SX   S–   OS   OO   OX   O–   XS   XO   XX   X–   –S   –O   –X   ––
d1    0    0    0    .03  0    0    0    .02  .07  0    0    .12  .02  .02  .05  .25
d2    0    0    0    .02  0    .07  0    .02  0    0    .06  .04  0    0    0    .36
d3    .02  0    0    .03  0    0    0    .06  0    0    0    .05  .03  .07  .07  .29


Linguistic Dimensions

Salience: Are some entities more important than others?
- Discriminate between salient (frequent) entities and the rest.
- Collect statistics separately for each group.

Coreference: What is its contribution?
- Entities are coreferent if they have the same surface form.
- Apply a coreference resolution system.

Syntax: Does syntactic knowledge matter?
- Use four categories {S, O, X, –}.
- Reduce categories to {X, –}.
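Two of these dimensions amount to simple operations on the grid, sketched below. The function names are hypothetical and the frequency threshold `min_mentions=2` is an assumption made for the example, not a value given on the slide. The grid is a list of rows with "-" for an absent entity.

```python
def split_by_salience(grid, min_mentions=2):
    """Split grid columns into (salient, rest) sub-grids by mention count."""
    num_entities = len(grid[0]) if grid else 0
    salient_cols, other_cols = [], []
    for col in range(num_entities):
        # An entity is "mentioned" in every sentence where it is not absent.
        mentions = sum(1 for row in grid if row[col] != "-")
        if mentions >= min_mentions:
            salient_cols.append(col)
        else:
            other_cols.append(col)

    def select(cols):
        return [[row[c] for c in cols] for row in grid]

    return select(salient_cols), select(other_cols)


def collapse_syntax(grid):
    """Reduce the role set {S, O, X, -} to {X, -}: present or absent only."""
    return [["-" if role == "-" else "X" for role in row] for row in grid]


toy_grid = [
    ["S", "-", "X"],
    ["O", "X", "-"],
    ["S", "-", "-"],
]
salient, rest = split_by_salience(toy_grid)
print(salient, rest, collapse_syntax(toy_grid))
```

Transition statistics would then be collected separately on `salient` and `rest`, and the two resulting vectors used together as the document's features.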

Learning a Ranking Function

Training Set
Ordered pairs $(x_{ij}, x_{ik})$, where $x_{ij}$ and $x_{ik}$ represent the same document $d_i$, and $x_{ij}$ is more coherent than $x_{ik}$ (assume $j > k$).

Goal
Find a parameter vector $\vec{w}$ such that:

$\vec{w} \cdot (\Phi(x_{ij}) - \Phi(x_{ik})) > 0 \quad \forall j, i, k \text{ such that } j > k$

Support Vector Machines
The constrained optimization problem can be solved using the search technique described in Joachims (2002).
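To make the ranking constraint concrete, here is a small sketch that searches for a weight vector satisfying it on toy data. It deliberately swaps in plain perceptron updates for the SVM optimization cited on the slide, so it illustrates the constraint rather than the method of Joachims (2002). The feature values and the function name are made up for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_pairwise_ranker(pairs, epochs=100, lr=1.0):
    """Find w with w . (phi_better - phi_worse) > 0 for each training pair.

    pairs: list of (phi_better, phi_worse) feature vectors of equal length.
    Uses perceptron updates, not the SVM optimization cited on the slide.
    """
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        mistakes = 0
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            if dot(w, diff) <= 0:
                # Constraint violated: move w towards the difference vector.
                w = [wi + lr * di for wi, di in zip(w, diff)]
                mistakes += 1
        if mistakes == 0:
            break  # every training constraint is satisfied
    return w


# Hypothetical 3-dimensional transition features for two documents,
# each paired as (more coherent rendering, less coherent rendering).
pairs = [
    ([0.30, 0.10, 0.60], [0.05, 0.25, 0.70]),
    ([0.25, 0.05, 0.70], [0.10, 0.30, 0.60]),
]
w = train_pairwise_ranker(pairs)
print(w)
```

Unlike an SVM, the perceptron does not maximize the margin and only terminates with all constraints met when the pairs are linearly separable; it is used here because it fits in a few lines.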

Text Ordering

Motivation
- Determine a sequence in which to present a set of items.
- Essential step in generation applications.

Data
- Source document and permutations of its sentences.
- Original order assumed coherent.
- Given k documents, with n permutations, obtain k · n pairwise rankings for training and testing.
- Two corpora, Earthquakes and Accidents, 100 texts each.

Text Ordering

Original order:
  Sentence 1, Sentence 2, Sentence 3, Sentence 4

Permutations:
  Sentence 2, Sentence 3, Sentence 4, Sentence 1
  Sentence 4, Sentence 3, Sentence 2, Sentence 1
  Sentence 2, Sentence 1, Sentence 4, Sentence 3
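Generating the pairwise data for one document can be sketched as follows. The function name, the fixed random seed and the cap on shuffle attempts are choices made for this example only. Each returned pair is (original order, permuted order), with the original assumed to be the more coherent of the two.

```python
import random


def make_ordering_pairs(sentences, n_permutations, seed=0):
    """Pair the original order with up to n_permutations distinct shuffles."""
    rng = random.Random(seed)
    original = list(sentences)
    seen = {tuple(original)}  # never pair the original with itself
    pairs = []
    attempts = 0
    # Cap the attempts so short texts with few permutations cannot loop forever.
    while len(pairs) < n_permutations and attempts < 100 * n_permutations:
        attempts += 1
        candidate = original[:]
        rng.shuffle(candidate)
        if tuple(candidate) not in seen:
            seen.add(tuple(candidate))
            pairs.append((original, candidate))
    return pairs


doc = ["Sentence 1", "Sentence 2", "Sentence 3", "Sentence 4"]
ordering_pairs = make_ordering_pairs(doc, n_permutations=3)
for original, permuted in ordering_pairs:
    print(permuted)
```

Applying this to k documents with n permutations each yields the k · n pairwise rankings mentioned above.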

Comparison with State of the Art

Vector-based Model (LSA, Foltz et al., 1998):
- Meaning of individual words is represented in vector space.
- Sentence meaning is the mean of the vectors of its words.
- Coherence is measured as the average distance between adjacent sentences.
- Unsupervised, local, lexicalized, domain independent.
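The vector-based baseline can be sketched as below, assuming word vectors are already available as a dictionary and using cosine similarity as the closeness measure between adjacent sentences (an assumption of this example; the slide only says "distance"). The two-dimensional toy vectors are invented, whereas real LSA vectors would come from a singular value decomposition of a term-document matrix.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(u, v):
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def local_coherence(sentences, word_vectors):
    """Average cosine similarity between adjacent sentence vectors."""
    # A sentence vector is the mean of its word vectors.
    sent_vecs = [mean_vector([word_vectors[w] for w in s]) for s in sentences]
    sims = [cosine(a, b) for a, b in zip(sent_vecs, sent_vecs[1:])]
    return sum(sims) / len(sims) if sims else 0.0


# Invented 2-dimensional word vectors for illustration.
toy_vectors = {"quake": [1.0, 0.0], "tremor": [1.0, 0.0], "piano": [0.0, 1.0]}
toy_text = [["quake"], ["tremor"], ["piano"]]
lsa_score = local_coherence(toy_text, toy_vectors)
print(lsa_score)
```

The sketch assumes every word has a vector; a practical version would need to skip or back off for out-of-vocabulary words.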

[Figure: two panels, each showing sentence vectors S1 to S5 in a space with axes x and y.]

Comparison with State of the Art

HMM-based Content Models (Barzilay and Lee, 2004):
- Model topics and their order in texts.
- Model is an HMM: states correspond to topics (sentences).
- Model selects sentence order with highest probability.
- Supervised, global, lexicalized, domain dependent.

[Figure: HMM content model for earthquake texts, with states labelled Casualties, Location, Strength, Rescue and History, and emitted words such as "quake", "magnitude" and "San Jose".]


Results: Ordering

[Figure: % ranks correct (test set), axis from 75 to 90, on the Earthquakes and Accidents corpora for the models -Cref+Syn+Sal, -Cref-Syn-Sal, +Cref+Syn+Sal, +Cref-Syn-Sal, HMM and LSA.]
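The evaluation measure, the percentage of test pairs ranked correctly, might be computed as in the sketch below. The scoring function, the weights and the feature vectors are all hypothetical. A pair counts as correct when the rendering assumed to be more coherent receives the strictly higher score.

```python
def pairwise_accuracy(score, test_pairs):
    """Percentage of (more_coherent, less_coherent) pairs that score ranks correctly."""
    if not test_pairs:
        return 0.0
    correct = sum(1 for better, worse in test_pairs
                  if score(better) > score(worse))
    return 100.0 * correct / len(test_pairs)


# Hypothetical linear scorer over 2-dimensional feature vectors.
weights = [1.0, -1.0]


def linear_score(phi):
    return sum(w * x for w, x in zip(weights, phi))


test_pairs = [
    ([0.6, 0.1], [0.2, 0.5]),  # ranked correctly by linear_score
    ([0.1, 0.4], [0.3, 0.2]),  # ranked incorrectly by linear_score
]
accuracy = pairwise_accuracy(linear_score, test_pairs)
print(accuracy)
```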

Discussion

- Omission of coreference causes performance drop.
- Syntax and Salience have more effect on Accidents corpus.
- Linguistically poor model generally worse.
- Entity model is better than LSA.
- HMM-based content models exhibit high variability.
- Models seem to be complementary.

Summarization

Motivation
- Summaries naturally exhibit coherence violations.
- Compare model against rankings elicited by human judges.
- Useful for automatic evaluation of machine-generated text.

Data
- Outputs of 5 multi-document summarization systems and corresponding human-authored summaries (DUC 2003).
- Participants assign readability score on a seven-point scale.
- 144 summaries, 177 participants (23 per summary).

Results: Summarization

[Figure: % ranks correct (test set), axis from 50 to 90, on the Summaries data for the models -Cref+Syn+Sal, -Cref-Syn-Sal, +Cref+Syn+Sal, +Cref-Syn-Sal and LSA.]

Results

- Coreference decreases accuracy (machine generated texts).
- Salience seems to have more of an impact here.
- Linguistically poor model is generally worse.
- Entity model performs better than LSA.
- LSA is unsupervised and exposed only to human texts.
- Training corpus is unsuitable for HMM-based content models.

Summary

Strengths:
- Novel framework for representing and measuring coherence.
- Entity grid and cross-sentential transitions.
- Suited for learning appropriate ranking function.
- Fully automatic and robust, useful for system development.

Weaknesses:
- Entity grid doesn't contain lexical information.
- Doesn't contain a notion of global coherence.
- Can't model multi-paragraph text.