LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization

Annemarie Friedrich, Marina Valeeva and Alexis Palmer
Computational Linguistics & Phonetics, Saarland University, Germany

Extractive Multi-Document Summarization

Evaluation
- Content?
- Linguistic quality / Readability?

Automatic Evaluation Methods
- Automatic Content Evaluation
- Automatic Linguistic Quality Evaluation?

Violations of Linguistic Quality

Example summary text:
"The suspect apparently called her from a cell phone shortly before the shooting began, saying he was acting out in revenge for something that happened 20 years ago, Miller said. The gunman, a local truck driver Charles Roberts, was apparently acting in revenge for an incident that happened to him 20 years ago. Charles Carl Roberts IV may have planned to"

Violations marked in the example:
- entity mentions: reference unclear
- subsequent mention of entity too specific
- redundant information
- incomplete sentence

Automatic Evaluation of Linguistic Quality for Automatic Summarization

- Supervised learning: lexical, syntactic, semantic features -> classifier
  [Pitler et al., 2010; Conroy et al., 2011; Giannakopoulos and Karkaletsis, 2011; de Oliveira, 2011; Lin et al., 2012]
- Revision-based approach
  [Mani et al., 1999; Jing & McKeown, 2000; Otterbacher et al., 2002]

LQVSumm corpus

- manual identification of violations of linguistic quality (subset of data)
- design of annotation scheme: entity mention level, clause level
- inter-annotator agreement study
- annotation of data sets
- collect corpus statistics and evaluate correlations with human scores
- FUTURE WORK: modeling: detection of violation types, evaluation tool

Annotation Scheme: Entity Mention level

- unclear first mention ("Who is that?"): "Roberts killed himself"
- overly-specific subsequent mention: "Taylor's attorney" [...] "Tony Taylor, 34, of Hampton, Va., has"
- def. NP without reference: "The Adam Air Boeing"
- indef. NP with previous reference: "An Adam Air Boeing"
- pronouns without antecedents
- pronouns with misleading antecedents
- unclear acronyms

Annotation Scheme: Clause level (sentence, phrase, sequence of tokens)

- ungrammaticality
- incomplete sentence
- dateline included: "GEORGETOWN, Pennsylvania 2006-10-05 16:53:53 UTC"
- redundant information: "He was acting out in revenge for something that happened 20 years ago." / "was apparently acting in revenge for an incident that happened to him 20 years ago."
- no semantic relatedness between clauses: "It is popularly known as the pink city. He said there was no justification for such killings."
- inappropriate use of discourse connective
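The two-level scheme (seven entity mention types, six clause types) can be written down as a small data model. The following is a minimal sketch, assuming a stand-off record that points into the summary text by character offsets. The names `Level`, `LQVType`, `Violation`, `summary_id`, `start` and `end` are hypothetical and do not describe the actual file format of the corpus.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    ENTITY_MENTION = "entity mention"
    CLAUSE = "clause"


class LQVType(Enum):
    # Each member carries its label from the slides and its annotation level.
    UNCLEAR_FIRST_MENTION = ("unclear first mention", Level.ENTITY_MENTION)
    OVERLY_SPECIFIC_SUBSEQUENT_MENTION = ("overly-specific subsequent mention", Level.ENTITY_MENTION)
    DEF_NP_WITHOUT_REFERENCE = ("def. NP without reference", Level.ENTITY_MENTION)
    INDEF_NP_WITH_PREVIOUS_REFERENCE = ("indef. NP with previous reference", Level.ENTITY_MENTION)
    PRONOUN_WITHOUT_ANTECEDENT = ("pronoun without antecedent", Level.ENTITY_MENTION)
    PRONOUN_WITH_MISLEADING_ANTECEDENT = ("pronoun with misleading antecedent", Level.ENTITY_MENTION)
    UNCLEAR_ACRONYM = ("unclear acronym", Level.ENTITY_MENTION)
    UNGRAMMATICALITY = ("ungrammaticality", Level.CLAUSE)
    INCOMPLETE_SENTENCE = ("incomplete sentence", Level.CLAUSE)
    DATELINE_INCLUDED = ("dateline included", Level.CLAUSE)
    REDUNDANT_INFORMATION = ("redundant information", Level.CLAUSE)
    NO_SEMANTIC_RELATEDNESS = ("no semantic relatedness between clauses", Level.CLAUSE)
    INAPPROPRIATE_DISCOURSE_CONNECTIVE = ("inappropriate use of discourse connective", Level.CLAUSE)

    def __init__(self, label, level):
        self.label = label
        self.level = level


@dataclass(frozen=True)
class Violation:
    """Hypothetical stand-off record: the summary text itself is not stored."""
    summary_id: str
    lqv_type: LQVType
    start: int  # character offset into the summary, inclusive
    end: int    # character offset into the summary, exclusive


def types_at(level):
    """Return all violation types annotated at the given level."""
    return [t for t in LQVType if t.level is level]


# Illustrative record; the identifier and the offsets are invented.
example = Violation("summary-001", LQVType.DATELINE_INCLUDED, 0, 46)
print(example.lqv_type.label, example.lqv_type.level.value)
```

Keeping the level on the type itself means counts per level can be derived from the annotations without a second lookup table.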

LQVSumm: Annotated Data

- data source: TAC 2011 (initial summaries)
- summaries: 1935 summaries, generated by 44 different extractive summarization systems
- input to systems: sets of 10 news articles
- output: 100-word summaries
- summarization approaches: sentence selection + compression
- manual scores for summaries: Readability (1-5), Pyramid (content), Responsiveness (1-5)

Inter-annotator agreement

- 100 randomly chosen summaries
- two annotators (A) and (B)
- annotations match if same type & overlapping span

level            Precision(B:A)   Recall(B:A)   F1
entity mention   90.4             54.5          67.5
clause           84.1             83.3          83.6

- A creates twice as many annotations; B's annotations are a subset of A's
- agreement is higher on the clause level than on the entity mention level
- degree of subjectivity is manageable
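The matching rule (same type and overlapping span) can be sketched as follows. This is a minimal illustration, assuming that annotator A is treated as the reference, so that Precision(B:A) is the share of B's annotations matched by A and Recall(B:A) is the share of A's annotations matched by B. The `Ann` record and the example annotations are invented, and the slides do not state how the F1 column was aggregated.

```python
from typing import NamedTuple


class Ann(NamedTuple):
    summary_id: str
    lqv_type: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


def matches(x, y):
    """Two annotations match if they share summary and type and their spans overlap."""
    return (x.summary_id == y.summary_id
            and x.lqv_type == y.lqv_type
            and x.start < y.end
            and y.start < x.end)


def agreement(a, b):
    """Precision, recall and F1 of annotator B measured against annotator A."""
    b_matched = sum(any(matches(x, y) for y in a) for x in b)
    a_matched = sum(any(matches(y, x) for x in b) for y in a)
    precision = b_matched / len(b) if b else 0.0
    recall = a_matched / len(a) if a else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented toy data: A marks four violations, B marks two of them.
ann_a = [
    Ann("s1", "incomplete sentence", 90, 100),
    Ann("s1", "pronoun without antecedent", 10, 12),
    Ann("s1", "unclear first mention", 20, 27),
    Ann("s2", "dateline included", 0, 30),
]
ann_b = [
    Ann("s1", "incomplete sentence", 85, 100),
    Ann("s1", "unclear first mention", 22, 27),
]

p, r, f = agreement(ann_a, ann_b)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In the toy data B's annotations are a subset of A's, which gives the same pattern as the entity mention row: high precision and much lower recall.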

Absolute Frequencies of LQVs by type (total: 1935 summaries)

[Figure: bar chart of absolute frequencies per LQV type, axis from 0 to 1200]
- Entity mention level: def. NP without reference, unclear first mention, indef. NP with previous reference, pronoun without antecedent, overly-specific subsequent mention, pronoun with misleading antecedent, unclear acronym
- Clause level: incomplete sentence, ungrammaticality, redundant information, dateline included, no semantic relatedness between clauses, inappropriate discourse connective

Ranking systems: average number of violations per summary

- compare rankings with TAC 2011 rankings
- draw conclusions about strengths/weaknesses of systems

System                                                Entity mention level   Clause level       All LQV types
1 (baseline using first 100 words as summary)         0.34                   1.00               1.34
21                                                    0.84                   0.45               1.30
7                                                     1.14                   4.63               5.77
Best TAC system (differs for each column, TAC 2011)   0.34 (System 1)        0.23 (System 16)   1.30 (System 21)
Average of systems in TAC                             1.42                   1.54               2.96
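The per-system averages in this table can be derived from per-summary violation counts along the following lines. This is a minimal sketch with invented counts. The function names and the tuple layout `(system_id, n_entity, n_clause)` are assumptions for illustration, not part of the corpus release.

```python
from collections import defaultdict


def average_violations(counts):
    """counts: iterable of (system_id, n_entity, n_clause), one tuple per summary.

    Returns {system_id: (avg entity mention, avg clause, avg all LQV types)}.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # entity sum, clause sum, number of summaries
    for system_id, n_entity, n_clause in counts:
        entry = totals[system_id]
        entry[0] += n_entity
        entry[1] += n_clause
        entry[2] += 1
    return {
        system_id: (e / n, c / n, (e + c) / n)
        for system_id, (e, c, n) in totals.items()
    }


def rank_systems(averages, column=2):
    """Order systems from fewest to most violations in the chosen column.

    column 0 = entity mention level, 1 = clause level, 2 = all LQV types.
    """
    return sorted(averages, key=lambda system_id: averages[system_id][column])


# Invented toy counts: two summaries for each of three hypothetical systems.
toy_counts = [
    ("A", 0, 1), ("A", 1, 1),
    ("B", 2, 4), ("B", 0, 6),
    ("C", 1, 0), ("C", 0, 0),
]
toy_averages = average_violations(toy_counts)
print(rank_systems(toy_averages))
```

Ranking each column separately is what lets the best system differ per column, as in the "Best TAC system" row.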

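Summary-level correlation

Pearson's r between the # of manually identified violations of linguistic quality and the manual scores from TAC 2011

[Figure: bar chart of Pearson's r (axis from -0.4 to 0.1) for violation counts at the entity mention level, at the clause level and over all types, against Readability, Pyramid (content) and Responsiveness]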

Summary-level correlation

Pearson's r between the # of manually identified LQ violations and the manual scores from TAC 2011: Readability

[Figure: bar chart of Pearson's r per LQV type (axis from -0.25 to 0.05). Types shown: incomplete sentence, pronoun without antecedent, ungrammaticality, redundant information, no semantic relatedness between clauses, def. NP without referent, dateline included, pronoun with misleading antecedent, indef. NP with previous referent, unclear acronym, inappropriate discourse connective, unclear first mention, overly specific subsequent mention]

Types that are significantly correlated with the intuitively assigned Readability scores play a role in the judgment.

System-level correlations

All summaries created by one system: average # of manually identified LQ violations vs. average of Readability scores

System      Avg. # of LQ violations   Avg. Readability score
System 21   1.30                      3.75
System 2    1.74                      3.34
System 7    5.77                      2.09

DICOMER: features from Penn Discourse TreeBank-style discourse parser
higher absolute correlation = better ranking

Method                      Ranking of       Pearson's r   Spearman's ρ   Kendall's τ
DICOMER [Lin et al. 2012]   all 50 systems   0.867         0.712          0.535
LQVSumm sum(violations)     44 systems       -0.820        -0.858         -0.713

- Pearson's r (actual scores): DICOMER is better (trained on TAC 2009 & TAC 2010)
- Spearman's ρ, Kendall's τ (ranking only): counting the number of violations works better than a supervised system
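The three coefficients compared on this slide can be computed with a few lines of standard-library Python. The sketch below uses only the three systems whose averages appear on the slide (21, 2 and 7), so it cannot reproduce the values reported for all 44 systems. The Kendall variant is the simple tau-a, which makes no correction for ties; which variant the authors used is not stated.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's r on the actual values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)


# Averages for Systems 21, 2 and 7 as given on the slide.
avg_violations = [1.30, 1.74, 5.77]
avg_readability = [3.75, 3.34, 2.09]

print("Pearson r   :", round(pearson(avg_violations, avg_readability), 3))
print("Spearman rho:", round(spearman(avg_violations, avg_readability), 3))
print("Kendall tau :", round(kendall_tau_a(avg_violations, avg_readability), 3))
```

Because these three systems are perfectly inversely ordered, both rank coefficients come out at -1, while Pearson's r also depends on the size of the gaps between the scores. This is the distinction the slide draws between "actual scores" and "ranking only".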

Conclusions

- LQVSumm: about 2000 summaries (1935 in total) marked with LQV types
- good inter-annotator agreement
- most types correlated to human judgments; others are infrequent
  [Figure: LQV types shown: incomplete sentence, pronoun without antecedent, ungrammaticality, redundant information, no semantic relatedness between clauses, def. NP without referent, dateline included, pronoun with misleading antecedent, indef. NP with previous referent, unclear acronym, connective but no discourse relation, unclear first mention, overly specific subsequent mention]
- counts and marked instances of linguistic quality violations allow for:
  - analyzing what a particular system is good/bad at (rather than just obtaining a numeric score)
  - developing automatic methods to detect LQVs (future work)
- Available in stand-off format at: www.coli.uni-saarland.de/~afried

Backup Slides

Annotation Scheme: Overview

entity mention level
- pronouns without antecedents
- indefinite NPs with a previous mention

clause level (sentence, phrase, sequence of tokens)
- ungrammatical sentences
- no semantic relatedness

Performance of the G-Flow summarization system

- G-Flow system: Christensen et al. (NAACL 2013): Towards Coherent Multi-Document Summarization
- system incorporates coherence information into sentence extraction
- marked 50 summaries provided on the web site of the authors

System                                                Entity mention level   Clause level       All LQV types
Best TAC system (differs for each column, TAC 2011)   0.34 (System 1)        0.23 (System 16)   1.30 (System 21)
G-Flow (DUC 2004 data)                                0.30                   0.20               0.50

G-Flow succeeds in producing more coherent / readable summaries

inappropriate use of discourse connective

"Taylor's attorney could not be reached for comment Friday night. And the person who cooperates first gets the biggest reward."