Rodger Kibble and Richard Power

ITRI-99-19
Using centering theory to plan coherent texts
Rodger Kibble and Richard Power
December, 1999

Also published in Proc 12th Amsterdam Colloquium, Amsterdam, The Netherlands, pp. 187-192.

This work was supported by the EPSRC under grants L51126 and M36960.

Information Technology Research Institute Technical Report Series
ITRI, Univ. of Brighton, Lewes Road, Brighton BN2 4GJ, UK
TEL: +44 1273 642900
EMAIL: firstname.lastname@itri.brighton.ac.uk
FAX: +44 1273 642908
NET: http://www.itri.brighton.ac.uk

1 Issues in Text Planning

This paper describes an approach to text planning, one of the distinct tasks identified in Ehud Reiter's consensus architecture for Natural Language Generation (Reiter 1994, Reiter and Dale 1997). This architecture consists of a pipeline of distinct tasks:

Text Planning - deciding the content of a message, and organising the component propositions into a text tree;
Sentence Planning - aggregating propositions into clausal units and choosing lexical items corresponding to concepts in the knowledge base;
Linguistic realisation - surface details such as agreement, orthography etc.

(See also (Cahill et al. 1999), who propose a more elaborate model which allows the pipeline as a concrete instantiation.)

We assume that the component propositions to be realised in a text are organised in a tree structure in which terminal nodes are elementary propositions and nonterminal nodes represent discourse relations as defined by e.g., Rhetorical Structure Theory (RST, Mann and Thompson 1987). This structure only partially determines the linear order in which the propositions will be realised - in other words, any RST structure specifies a range of possible text plans. We propose as an additional constraint that the generator should seek to maximise continuity of reference as determined by the rules and constraints of centering theory, and we argue that this enables us to select the most cohesive variants from a set of text plans.

2 Centering in a nutshell

Centering theory (CT) is a theory of discourse structure which models the interaction of cohesion and salience in the internal organisation of a text. (See Grosz et al 1995; cf Hardt 1998 for a more formal treatment.) The main assumptions of the theory are:

1. For each utterance in a discourse there is precisely one entity which is the centre of attention or center. The center in an utterance Un is the most grammatically salient entity realised in Un-1 which is also realised in Un. This is also referred to as the backward-looking center or Cb.

2. There is a preference for consecutive utterances within a discourse segment to keep the same entity as the center, and for the center to be realised as Subject or preferred center (Cp). We refer to these principles as cohesion and salience respectively. (Hardt's (1998) formalization only covers the first of these.) Pairs of successive utterances ⟨Un, Un+1⟩ are classified into the transition types shown in Fig. 1, in order of preference.

3. The center is the entity which is most likely to be pronominalised.

(Note: the notion of salience for the purposes of centering theory is most commonly defined according to a hierarchy of grammatical roles: SUBJECT > DIRECT OBJECT > INDIRECT OBJECT > OTHERS (see e.g., Brennan et al 1987). For alternative approaches see e.g., (Strube and Hahn 1999), (Walker et al 1994).)

CONTINUE: cohesion and salience both hold; same center (or Cb(Un) undefined), realised as Subject in Un+1;
RETAIN: cohesion only; i.e. center remains the same but is not realised as Subject in Un+1;
SMOOTH SHIFT: salience only; center of Un+1 realised as Subject but not equal to Cb(Un);
ROUGH SHIFT: neither cohesion nor salience holds.

Figure 1: Centering Transitions
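
The classification in Figure 1 can be sketched as a small function. This is an illustrative sketch rather than the authors' implementation: the name `classify_transition` and the use of `None` for an undefined Cb are assumptions. The NO CB label is taken from the scoring table in Section 4, and treating an undefined Cb(Un) as a failure of cohesion when salience also fails is an assumption chosen to agree with the ROUGH SHIFT label in Figure 5.

```python
from typing import Optional


def classify_transition(cb_prev: Optional[str],
                        cb_cur: Optional[str],
                        cp_cur: str) -> str:
    """Classify the transition <Un, Un+1> following Figure 1.

    cb_prev -- Cb(Un), or None if undefined
    cb_cur  -- Cb(Un+1), or None if undefined
    cp_cur  -- Cp(Un+1), the entity realised as Subject in Un+1
    """
    if cb_cur is None:
        # Un+1 shares no entity with Un (label taken from the scoring table).
        return "NO CB"
    salience = cb_cur == cp_cur      # center realised as Subject
    cohesion = cb_cur == cb_prev     # same center as before
    # CONTINUE also covers the case where Cb(Un) is undefined.
    if salience and (cohesion or cb_prev is None):
        return "CONTINUE"
    if cohesion:
        return "RETAIN"
    if salience:
        return "SMOOTH SHIFT"
    return "ROUGH SHIFT"


if __name__ == "__main__":
    print(classify_transition(None, "elixir", "elixir"))
    print(classify_transition("elixir", "alipr", "alipr"))
```

The two calls correspond to the first and last transitions of Text Plan A in Figure 3.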

background
    NUCLEUS: elaboration
        NUCLEUS: treats(elixir, cold-sores)
        SATELLITE: cause
            NUCLEUS: contain(elixir, aliprosan)
            SATELLITE: relieve(aliprosan, vs-disorders)
    SATELLITE: white-cream(elixir)

Figure 2: Rhetorical Structure

3 Centering in NLG

CT has developed primarily in the context of natural language interpretation, focussing on anaphora resolution (see e.g., Brennan et al 1987). NLG researchers have applied CT to the tasks of Text Planning (Cheng MS), Sentence Planning (Mittal et al 1998) and choice of referring expression (e.g., Dale 1992). In this paper we concentrate on Text and Sentence Planning, aiming to determine whether the principles underlying the constraints and rules of the theory can be turned round and used as planning operators for generating coherent text.

It is not immediately obvious how the principles of cohesion and salience described above should be implemented in an NLG system following a Reiter-type consensus architecture. If we consider these principles as planning operations, cohesion naturally comes under Text Planning: ordering a sequence of utterances to maintain the same entity as the center, possibly within a partial ordering determined by discourse relations. According to (1) above, the center is defined by grammatical salience, which is determined by the Sentence Planner - for example, choice of active or passive determines whether an entity is realised as Subject. However, in a pipelined system the Text Planner does not have access to the sentence plan, yet it needs to know the identity of the center in order to plan coherent text.

One way out of this conundrum is to equate the center with the topic or theme of a sentence and require that this is given as part of the semantic input, so that centering rules merely reflect the information structure of a discourse. Kibble (1999) proposes an algorithm along these lines, but assumes that the topic is independently given by the text planner. An alternative strategy, which we will follow, is to treat the task of establishing the Cb as an optimisation problem.
In the process we relax the pipelining constraint a little and assume that certain options for syntactic realisation can be predicted on the basis of the argument structure of predicates, which is available at the stage of Text Planning.

4 Implemented prototype

Figure 2 shows a rhetorical structure (RS) that might serve as input to the text planner. The non-terminal nodes of the RS are labelled with RST relations; the terminal nodes are labelled with propositions, expressed in a simple semantic formalism. The RS allows eight possible orderings of the four propositions, depending on whether NUCLEUS precedes SATELLITE or vice-versa for each of the three rhetorical relations. For each ordering, there are many ways of distributing the propositions among text-categories such as sentences and paragraphs: at one (ridiculous) extreme, each proposition might be allotted a separate paragraph; at the other extreme, all four propositions might be placed in the same sentence. The number of possible text-plans increases further if we take account of the different ways in which rhetorical relations can be marked: for instance, cause can be expressed by the discourse markers since, so, and consequently (among others), or it can be left unmarked, relying on the reader to infer the relationship.

The ICONOCLAST system treats text planning as a constraint satisfaction problem (van Hentenryck 1989). The dimensions of variation among different text plans (order, text-category, discourse marker) are represented by variables ranging over finite domains, and constraints among these variables are applied so that incorrect solutions are ruled out (Power et al 1999). However, the number of correct solutions remains large (several hundred for the example in Figure 2). To reduce them to a manageable set, the user can impose further constraints which eliminate solutions considered stylistically unacceptable; the user can also define criteria for evaluating solutions, so that even though many are still generated, they are ordered from best to worst.

To keep the example simple, let us assume that the following very strict constraints are applied:

For the background relation, satellite precedes nucleus.
For the elaboration relation, nucleus precedes satellite.
The cause relation must be marked by since; the other relations are unmarked.

For any ordering of the propositions, centering transitions are determined by the choices of Cp and Cb. Simplifying again, we will assume that the Cp can only be varied for the predicate relieve, which can be expressed in the active or passive, e.g., Aliprosan [Cp] relieves viral skin disorders vs Viral skin disorders [Cp] are relieved by aliprosan. Three sample text plans with Cp and Cb values specified are shown in Figures 3-5.
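
The count of eight orderings, and the effect of the two ordering constraints above, can be checked with a short enumeration. This is a minimal sketch under stated assumptions: the nested-tuple encoding of Figure 2 and the names `RS`, `relations`, `linearise` and `all_orderings` are hypothetical and are not part of ICONOCLAST, which uses constraint satisfaction rather than brute-force enumeration. The sketch also assumes that each relation name occurs only once in the tree.

```python
from itertools import product

# Figure 2 encoded as (relation, nucleus, satellite); leaves are propositions.
RS = ("background",
      ("elaboration",
       "treats(elixir, cold-sores)",
       ("cause",
        "contain(elixir, aliprosan)",
        "relieve(aliprosan, vs-disorders)")),
      "white-cream(elixir)")


def relations(node):
    """List the relation names in the tree (assumed unique)."""
    if isinstance(node, str):
        return []
    rel, nucleus, satellite = node
    return [rel] + relations(nucleus) + relations(satellite)


def linearise(node, nucleus_first):
    """Flatten the tree, ordering each relation as nucleus_first[rel] says."""
    if isinstance(node, str):
        return [node]
    rel, nucleus, satellite = node
    n = linearise(nucleus, nucleus_first)
    s = linearise(satellite, nucleus_first)
    return n + s if nucleus_first[rel] else s + n


def all_orderings(tree):
    """Return every (choice, ordering) pair, one per nucleus/satellite choice."""
    rels = relations(tree)
    result = []
    for choice in product([True, False], repeat=len(rels)):
        nucleus_first = dict(zip(rels, choice))
        result.append((nucleus_first, linearise(tree, nucleus_first)))
    return result


orderings = all_orderings(RS)

# Constraints from the text: background is satellite-first,
# elaboration is nucleus-first; cause is left free.
allowed = [props for nf, props in orderings
           if not nf["background"] and nf["elaboration"]]

if __name__ == "__main__":
    print(len(orderings), "orderings in total")
    for props in allowed:
        print(" -> ".join(props))
```

Under these constraints only the order of the cause relation remains open, which gives the two proposition sequences used in Text Plans A and B (nucleus first) and Text Plan C (satellite first).
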
For each plan, an evaluation of centering transitions can be computed, for instance by assigning the following scores (see footnote 1):

    No Cb           0
    Rough Shift     1
    Smooth Shift    2
    Retain          3
    Continue        4

Applying this evaluation to the three text plans we obtain the following percentage scores:

    Plan A: 4 + 4 + 2 = 10/12 = 83%
    Plan B: 4 + 4 + 1 = 9/12 = 75%
    Plan C: 4 + 0 + 1 = 5/12 = 42%

These variants illustrate a best-case solution (plan A), the worst case (plan C) and an intermediate result (plan B). Using pronouns for the Cb after CONTINUE, and demonstratives after SMOOTH SHIFT, the final texts for the three plans might be as in Figs 3-5.

Footnote 1: An alternative approach would be to score salience and cohesion independently, obtaining a partial preference ordering CONTINUE > {RETAIN | SMOOTH SHIFT} > ROUGH SHIFT.

5 Conclusion

This paper has highlighted some implications of Centering Theory for planning coherent text. We show that by making some assumptions about which entities are potential Cps, we can determine Cbs, Cps, and hence transitions, in the text planning stage, thus allowing the text planner to select the proposition sequence that yields the best continuity of reference.

Acknowledgements

This work was supported by the UK EPSRC under grant references L51126 (Kibble) and M36960 (Power). We are grateful to Paul Piwek and Kees van Deemter for constructive comments.
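
The percentage scores given in Section 4 can be reproduced with a few lines of arithmetic. This is a sketch only: the names `SCORES`, `PLANS` and `plan_score` are hypothetical, and the transition sequences are read off Figures 3-5, assuming that Plans B and C open with the same first two propositions as Plan A (as their scores of 4 for the first transition suggest).

```python
# Scores per transition type, as listed in Section 4.
SCORES = {
    "NO CB": 0,
    "ROUGH SHIFT": 1,
    "SMOOTH SHIFT": 2,
    "RETAIN": 3,
    "CONTINUE": 4,
}

# Transition sequences for the three sample plans (from Figures 3-5).
PLANS = {
    "A": ["CONTINUE", "CONTINUE", "SMOOTH SHIFT"],
    "B": ["CONTINUE", "CONTINUE", "ROUGH SHIFT"],
    "C": ["CONTINUE", "NO CB", "ROUGH SHIFT"],
}


def plan_score(transitions):
    """Return (total, maximum, percentage) for a list of transitions."""
    total = sum(SCORES[t] for t in transitions)
    maximum = max(SCORES.values()) * len(transitions)
    return total, maximum, round(100 * total / maximum)


if __name__ == "__main__":
    for name, transitions in PLANS.items():
        total, maximum, percent = plan_score(transitions)
        print(f"Plan {name}: {total}/{maximum} = {percent}%")
```

The maximum is four points per transition, so a four-proposition plan with three transitions is scored out of 12.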

paragraph
    white-cream(elixir)           Cp = elixir, No Cb
    treats(elixir, cold-sores)    Cp = elixir, Cb = elixir    CONTINUE
    cont(elixir, alipr)           Cp = elixir, Cb = elixir    CONTINUE
    rel(alipr, vs-disorders)      Cp = alipr, Cb = alipr      SMOOTH SHIFT

Figure 3: Text Plan A: Elixir is a white cream. It is used in the treatment of cold sores. It contains aliprosan, since this relieves viral skin disorders.

    cont(elixir, alipr)           Cp = elixir, Cb = elixir          CONTINUE
    rel(alipr, vs-disorders)      Cp = vs-disorders, Cb = alipr     ROUGH SHIFT

Figure 4: Text Plan B: ... It contains aliprosan, since viral skin disorders are relieved by aliprosan.

    rel(alipr, vs-disorders)      Cp = alipr, No Cb                 NO CB
    cont(elixir, alipr)           Cp = elixir, Cb = aliprosan       ROUGH SHIFT

Figure 5: Text Plan C: ... Since aliprosan relieves viral skin disorders, Elixir contains aliprosan.
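
The Cp and Cb values shown in Figures 3-5 can be re-derived from the definition of Cb in Section 2. The sketch below is an illustration under stated assumptions, not the ICONOCLAST implementation: each utterance is represented as a list of the entities it realises, ordered by grammatical salience (Subject first), so that the Cp is simply the first element. The names `compute_cb`, `centers` and `PLAN_A` to `PLAN_C` are hypothetical, and Plans B and C are assumed to begin with the same two propositions as Plan A.

```python
def compute_cb(prev_entities, cur_entities):
    """Most salient entity of the previous utterance that is also realised
    in the current one, or None if the two share no entity."""
    for entity in prev_entities:      # already ordered by salience
        if entity in cur_entities:
            return entity
    return None


def centers(utterances):
    """Return a (Cp, Cb) pair for each utterance in sequence."""
    result = []
    prev = None
    for entities in utterances:
        cp = entities[0]              # Subject is the preferred center
        cb = compute_cb(prev, entities) if prev is not None else None
        result.append((cp, cb))
        prev = entities
    return result


# Entities per utterance, Subject first.
PLAN_A = [["elixir"],
          ["elixir", "cold-sores"],
          ["elixir", "alipr"],
          ["alipr", "vs-disorders"]]      # active: aliprosan relieves ...

PLAN_B = PLAN_A[:3] + [["vs-disorders", "alipr"]]   # passive realisation

PLAN_C = PLAN_A[:2] + [["alipr", "vs-disorders"],   # cause satellite first
                       ["elixir", "alipr"]]

if __name__ == "__main__":
    for name, plan in (("A", PLAN_A), ("B", PLAN_B), ("C", PLAN_C)):
        print(name, centers(plan))
```

A Cb of `None` corresponds to "No Cb" in the figures; for the first utterance of a text the Cb is always undefined.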

References

S Brennan, M Walker Friedman and C Pollard 1987. A Centering Approach to Pronouns. In Proc. 25th ACL.

L Cahill, C Doran, R Evans, C Mellish, D Paiva, M Reape, D Scott and N Tipper 1999. Towards a Reference Architecture for Natural Language Generation Systems. Technical Report ITRI-99-14, ITRI, University of Brighton.

H Cheng MS, Experimenting with the Interaction between Aggregation and Text Planning, unpublished paper, HCRC, University of Edinburgh.

R Dale 1992, Generating Referring Expressions, MIT Press.

B Grosz, A Joshi and S Weinstein 1995, Centering: a framework for modelling the local coherence of discourse. Computational Linguistics.

D Hardt 1998, Centering in Dynamic Semantics. COLING 96.

van Hentenryck 1989, Constraint Satisfaction in Logic Programming, MIT Press.

R Kibble 1999, Cb or not Cb? Centering theory applied to NLG. ACL workshop on Discourse and Reference Structure.

W Mann & S Thompson 1987, Rhetorical Structure Theory: A Theory of Text Organisation. In L Polanyi (ed.), The Structure of Discourse.

V Mittal, J Moore, G Carenini and S Roth 1998, Describing Complex Charts in Natural Language: A Caption Generation System. Computational Linguistics.

R Power (forthcoming), Planning Texts by Constraint Satisfaction, to appear as an ITRI Technical Report, ITRI, University of Brighton.

R Power, C Doran and D Scott, 1999, Generating embedded discourse markers from rhetorical structure, Proc. of the European Workshop on Natural Language Generation.

E Reiter 1994. Has a consensus NL generation architecture appeared, and is it psycholinguistically plausible? In Proc. INLG7.

E Reiter and R Dale 1997, Building Applied Natural-Language Generation Systems. Journal of Natural-Language Engineering.

M Strube and U Hahn 1999, Functional Centering - Grounding Referential Coherence in Information Structure. Computational Linguistics.