A Lexical Functional Mapping Algorithm

Tamer S. Mahdi (1) and Robert E. Mercer (2)

(1) IBM Canada, Toronto, Ontario, Canada. tamer@ca.ibm.com
(2) Cognitive Engineering Laboratory, Department of Computer Science, The University of Western Ontario, London, Ontario, Canada. mercer@csd.uwo.ca

Abstract. Semantic interpretation is the process of mapping a syntactic structure to a context-independent meaning representation. An algorithm is presented which maps the f-structure produced by a Lexical-Functional Grammar parser to thematic roles. The mapping algorithm integrates Lexical Mapping Theory, Grimshaw's theory of prominence, and other semantic theories. The algorithm uses selectional restrictions from WordNet to enforce semantic compatibility among constituents.

1 Linguistic Background

Lexical Mapping Theory (LMT) [2] is the backbone of our algorithm. To it are added other elements linking syntax and semantics found in other work on semantic representation: [1,4,5,10,11] and others. To combine these theories, we have developed a consistent representation and an algorithm that uses this representation. Because of space limitations, we describe only the basic elements of LMT; for a comprehensive description of the main linguistic theories involved, see [8].

LMT assumes three levels of structure: a structure representing the underlying organization of argument roles, called argument structure or a-structure, in addition to the f-structure and c-structure of Lexical-Functional Grammar (LFG) [6]. The theory consists of the following four components [2]:

(1) Thematic Structure: The theory postulates a universal hierarchy of thematic roles, ordered: agent > beneficiary > recipient/experiencer > instrument > patient/theme > location.
(2) Classification of Syntactic Functions: The theory classifies syntactic functions according to the features [-r] and [-o], summarized by:

                         thematically         thematically
                         unrestricted [-r]    restricted [+r]
    non-objective [-o]   SUBJ                 OBL_θ
    objective [+o]       OBJ                  OBJ_θ

(This research was performed while the first author was at The University of Western Ontario and was funded by NSERC Research Grant 0036853.)

R. Cohen and B. Spencer (Eds.): AI 2002, LNAI 2338, pp. 303-309, 2002. (c) Springer-Verlag Berlin Heidelberg 2002
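The classification table can be read as a lookup from a fully specified feature pair to a syntactic function. The following minimal sketch (ours, not the authors' code; names and the `_th` spelling of the θ subscript are our choices) makes that reading explicit:

```python
# Sketch of LMT's feature classification: the binary features
# [r] (thematically restricted) and [o] (objective) jointly
# determine a syntactic function once fully specified.

def decode(r, o):
    """Map a feature pair to a syntactic function.
    r and o are '-' or '+', standing for [-r]/[+r] and [-o]/[+o]."""
    table = {
        ('-', '-'): 'SUBJ',    # unrestricted, non-objective
        ('-', '+'): 'OBJ',     # unrestricted, objective
        ('+', '-'): 'OBL_th',  # restricted oblique (one per role theta)
        ('+', '+'): 'OBJ_th',  # restricted object (one per role theta)
    }
    return table[(r, o)]

print(decode('-', '-'))  # SUBJ
print(decode('+', '-'))  # OBL_th
```

During mapping, of course, most roles are only partially specified; the full table applies once the defaults and well-formedness conditions below have filled in the missing feature.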

OBL_θ and OBJ_θ abbreviate multiple oblique functions and restricted objects respectively, one for each semantic role θ: OBL_ben, OBJ_recip, etc. The thematically restricted functions have fixed thematic roles. Subjects and objects, on the other hand, may correspond to any thematic role and may even be nonthematic; hence they are thematically unrestricted [-r]. Objects have the primitive property of complementing transitive predicators (verbs and prepositions) and not complementing intransitive predicators (nouns and adjectives); this property is called objective [+o]. Obliques are nonobjectlike [-o].

(3) Lexical Mapping Principles: Thematic roles are associated with partial specifications of syntactic functions by means of lexical mapping principles. A constraint on these principles is the preservation of syntactic information: mapping principles can only add syntactic features, not delete or change them. This monotonicity is made possible by underspecification. There are three types of mapping principles [2]:

(a) Intrinsic Role Classifications: These principles associate syntactic functions with the intrinsic meanings of the roles. They include the Agent Encoding Principle (the agent role is encoded as a nonobjective function [-o] and alternates between subject and oblique), the Theme Encoding Principle (a theme or patient role is an unrestricted function [-r] that can alternate between subject and object), and the Locative Encoding Principle (a locative role is encoded as a nonobjective function [-o]). These principles are extended to account for applicative and dative constructions [3]. It is further observed that in some languages (called asymmetrical object languages, e.g. English) at most one role can be intrinsically classified unrestricted [-r]. This constraint is called the Asymmetrical Object Parameter (AOP).
(b) Morpholexical Operations: Morpholexical operations affect lexical argument structures by adding and suppressing thematic roles. For example, the passive suppresses the highest thematic role in the lexical argument structure. The applicative is a morpholexical operation that adds a thematic role: it adds a new thematic role to the argument structure of a verb below its highest role, as in "Mary cooked the children dinner".

(c) Default Role Classifications: The default classifications apply after the entire argument structure has been morpholexically built up. The defaults capture the generalization that the highest thematic role of a verb will be the subject and lower roles will be nonsubjects. Thus, the general subject default classifies the highest thematic role as unrestricted [-r], and lower roles as restricted [+r].

(4) Well-formedness Conditions: (a) The Subject Condition: every lexical form must have a subject. (b) Function-Argument Biuniqueness: in every lexical form, every expressed lexical role must have a unique syntactic function, and every syntactic function must have a unique lexical role.
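The intrinsic classifications, general subject default, and well-formedness conditions above can be sketched as follows. This is our illustrative reconstruction, not the paper's implementation: role names, the feature encoding, and the assumption that roles arrive pre-sorted by thematic prominence are ours.

```python
def intrinsic(roles):
    """Intrinsic role classifications, with at most one [-r] role (AOP)."""
    spec, r_used = {}, False
    for role in roles:
        if role == 'agent':
            spec[role] = {'o': '-'}            # Agent Encoding: [-o]
        elif role in ('theme', 'patient', 'recipient', 'beneficiary'):
            if not r_used:                     # Theme Encoding (extended
                spec[role] = {'r': '-'}        # to datives): [-r]
                r_used = True
            else:
                spec[role] = {'o': '+'}        # AOP: alternative [+o]
        elif role == 'location':
            spec[role] = {'o': '-'}            # Locative Encoding: [-o]
        else:
            spec[role] = {}
    return spec

def defaults(roles, spec):
    """General subject default: highest role [-r], lower roles [+r].
    Monotonic: features are only added, never changed."""
    for i, role in enumerate(roles):
        if 'r' not in spec[role]:
            spec[role]['r'] = '-' if i == 0 else '+'
    return spec

def resolve(roles, spec):
    """Well-formedness: the subject condition and function-argument
    biuniqueness pick a unique function for each role."""
    out, used = {}, set()
    for role in roles:
        feats = spec[role]
        if feats.get('r') == '+':
            out[role] = ('OBL_' if feats.get('o') == '-' else 'OBJ_') + role
        else:                                  # [-r]: SUBJ if free, else OBJ
            out[role] = 'SUBJ' if 'SUBJ' not in used else 'OBJ'
        used.add(out[role])
    return out

roles = ['agent', 'recipient', 'theme']        # ditransitive "give", active
print(resolve(roles, defaults(roles, intrinsic(roles))))
# {'agent': 'SUBJ', 'recipient': 'OBJ', 'theme': 'OBJ_theme'}
```

Under these assumptions the sketch reproduces the tableaux worked through in Section 3: the agent surfaces as subject, the recipient as object, and the theme as the restricted object OBJ_θ.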

2 Computing the Semantic Representation

We have classified verbs and defined the appropriate thematic roles for each verb class. We assume that an LFG-based parser, such as GWWB [7], has already produced the f-structure for the sentence. Our algorithm performs the mapping between the syntactic functions produced by the parser and the corresponding thematic roles of the sentence's verb. If a match is found, the a-structure for the sentence is produced; otherwise the sentence is rejected. Our system consists of four main parts.

Defining and Interpreting the Lexicon. Defining a lexicon is not an easy task. There are many words to define, most of them with multiple senses, and even a single verb sense has different complement structures with varying thematic roles corresponding to each structure. Defining each of these forms as a separate entry in the lexicon would be tedious. We need to be able to define a single entry in the lexicon for each verb sense. That entry must capture the various thematic role combinations a verb may take. To do this, we assign optionality codes to the verb's arguments. Together, these codes define the relationships between the arguments. We have designed an algorithm that uses this specification to produce the valid role combinations for each verb.

Verb Mapping. The verb mapping algorithm is an operationalization of LMT. LMT considers only the thematic dimension. To integrate Grimshaw's views [5] on the interaction of the aspectual and thematic dimensions with LMT's mapping rules, we define an aspectual prominence indicator for each verb class. The thematic roles defined for a verb class are sorted in descending order of their prominence on the thematic hierarchy.
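The paper does not spell out the optionality codes themselves. Purely as an illustration, here is one way such codes could expand a single lexical entry into its valid role combinations: each argument is marked obligatory or optional, and the optional ones are toggled. The actual scheme is richer (codes can also relate arguments to one another, as the "cook" example in Section 3 shows), so treat this as a hypothetical sketch.

```python
from itertools import combinations

def role_combinations(entry):
    """Enumerate valid role combinations from a coded entry,
    preserving thematic-prominence order.
    entry: list of (role, code) pairs, code in {'oblig', 'opt'}."""
    opts = [r for r, code in entry if code == 'opt']
    combos = []
    for k in range(len(opts) + 1):
        for chosen in combinations(opts, k):
            combos.append([r for r, code in entry
                           if code == 'oblig' or r in chosen])
    return combos

# Hypothetical entry for a ditransitive sense of "give"
give = [('agent', 'oblig'), ('recipient', 'opt'), ('theme', 'oblig')]
print(role_combinations(give))
# [['agent', 'theme'], ['agent', 'recipient', 'theme']]
```

Each combination produced here is one of the candidate frames that the sentence mapping below tries in turn.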
When the prominence of the roles differs between the thematic and the aspectual dimension, the aspectual prominence indicator is set to the ranking of the aspectually most prominent argument. Before the mapping starts, that argument is swapped with the first argument on the argument list, and the list is re-sorted from its second argument onward to ensure that the arguments are listed in the correct order of prominence. Since we know that the most prominent argument overall should be mapped to the subject, we exclude it from LMT's mapping rules by changing its intrinsic classification to [-o], regardless of its current value and regardless of the thematic role of that argument. Once this is done, the mapping algorithm can start.

Sentence Mapping. The mapping described in the previous paragraph is concerned with mapping a verb's thematic roles to the corresponding syntactic functions according to LMT's mapping principles. Mapping a sentence involves more than that. The actual arguments of the sentence being mapped are matched against each of the valid role combinations for the sentence's verb. First, the algorithm checks whether the actual number of arguments in the sentence matches the number of arguments in the particular role combination under consideration. If they don't match, an error is issued; otherwise the verb mapping discussed above is performed. The syntactic functions that result from the mapping are then compared to the syntactic functions from the sentence's f-structure (obtained from the parser). If they all match, the mapping is successful and the sentence's a-structure is produced. If one or more syntactic functions don't match, the next valid combination of roles is mapped, until all valid combinations have been considered. If none of the mapped role combinations matches the sentence's syntactic functions, the sentence is said to be syntactically ill-formed.

Another consideration when mapping sentences is whether the sentence is in the passive or the active voice. If the parser indicates that the sentence is passive, the number of arguments in the sentence is expected to be one less than in the active. If it is, we proceed with the mapping; otherwise there is an error. After sorting the valid role combinations to ensure that the most prominent argument is at the top of the list, that argument is removed from the top. The mapping then proceeds exactly as with active sentences. Finally, the algorithm performs adjunct mapping, but because of space limitations we do not show its treatment in this paper.

Selectional Restrictions. Selectional restrictions give us a means of checking and enforcing compatibility between the verb and the conceptual types of its arguments. To accomplish this, we use WordNet [9]. Specifically, we use the part of WordNet that organizes nouns in a semantic hierarchy, going from many specific terms at the lower levels to a few generic terms at the top. Each noun is linked to a superordinate term/concept and inherits the properties of that concept in addition to having its own distinguishing properties. WordNet is searched for the conceptual types of each of the sentence's arguments.
The conceptual types found are compared with the predefined conceptual types of the verb's arguments; if the former match or are subordinate terms of the latter, the selectional restrictions are met and the sentence is accepted; otherwise the sentence is rejected. By allowing verb arguments to take more than one conceptual type, we overcome the major drawback of selectional restrictions, namely being too general or too restrictive. Selectional restrictions are applied in exactly the same way to adjuncts.
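The subordinate-term check amounts to walking a noun's hypernym chain upward until the required concept is found. The sketch below uses a tiny invented stand-in for WordNet's noun hierarchy (the actual system queries WordNet itself, and these entries are ours for illustration):

```python
# Toy hypernym links: each noun points to its superordinate concept.
HYPERNYM = {
    'guest': 'person', 'waiter': 'person', 'child': 'person',
    'dinner': 'food', 'person': 'entity', 'food': 'entity', 'entity': None,
}

def is_a(noun, concept):
    """True if noun equals concept or is a subordinate term of it."""
    while noun is not None:
        if noun == concept:
            return True
        noun = HYPERNYM.get(noun)  # climb one hypernym link
    return False

def selection_ok(arg_noun, allowed_types):
    """An argument meets the restriction if it fits any allowed type."""
    return any(is_a(arg_noun, t) for t in allowed_types)

# serve (sense: "help someone with food or drink")
print(selection_ok('guest', ['person']))  # True:  object can be recipient
print(selection_ok('guest', ['food']))    # False: object -> theme ruled out
```

Allowing a list of `allowed_types` per argument is what lets a single restriction be neither too general nor too restrictive, as noted above.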

3 Sentence Mapping Examples

(1) The teacher gave the boy a book

    Verb           Arg1         Arg2      Arg3
    give (active)  the teacher  the boy   a book
    f-structure:   subj         obj       obj_θ
    a-structure:   AG           REC       TH

    GIVE      AG         REC        TH
    Intr.     -o         -r         +o
    Default   -r                    +r
    -----------------------------------------
              SUBJ       SUBJ/OBJ   OBJ_θ
    W.F.      SUBJ       OBJ        OBJ_θ

To map the sentence, the algorithm uses the sentence's constituents, each constituent's syntactic function (produced by the parser), and the voice (active in this case). The mapping is performed according to LMT. First, the intrinsic classifications apply, assigning [-o] to the agent. Since the recipient role is intrinsically classified [-r], and by the AOP (see Section 1) there can be only one role intrinsically classified [-r], the theme takes the alternative [+o] classification. The general default applies, assigning [-r] to the agent, making it the subject, and [+r] to the theme, making it a restricted object, while leaving the recipient underspecified. By function-argument biuniqueness, the recipient becomes the object, since the agent is realized as the subject. Since the syntactic functions produced by the mapping match the syntactic functions obtained from the parser, the sentence is accepted and the a-structure is produced as shown above. Depending on how the optionality codes for "give" are defined, other role combinations valid for "give" are also mapped, but they are rejected since the resulting syntactic functions don't match those of the sentence. Hence, only the a-structure shown is produced.

(2) The boy was given a book

    Verb            Arg1      Arg2
    give (passive)  the boy   a book
    f-structure:    subj      obj_θ
    a-structure:    REC       TH

    GIVE      REC        TH
    Intr.     -r         +o
    Default              +r
    ---------------------------
              SUBJ/OBJ   OBJ_θ
    W.F.      SUBJ       OBJ_θ

This is the same sentence as in (1), but in the passive. The agent (the teacher) is suppressed, and the recipient (the boy) is now the subject.
The treatment of the passive follows LMT: the verb's most prominent argument is suppressed. We remove the most prominent argument from the list of the verb's arguments and shift each of the other arguments one position up; then the mapping is performed and the resulting syntactic functions are compared with the syntactic functions obtained from the parser. Since they match, the sentence is accepted. The mapping details are: the recipient is assigned [-r], and the theme [+o] by the AOP. The general default assigns [+r] to the theme, making it a restricted object. The subject condition makes the recipient the subject. In the passive, the number of arguments in the sentence is expected to be one less than in the active. Since this is the case in the example above, the sentence is accepted.

(3) The waiter served the guests

    Verb            Arg1        Arg2
    serve (active)  the waiter  the guests
    f-structure:    subj        obj
    a-structure:    AG          REC
                    PERSON      PERSON

    SERVE     AG        REC
    Intr.     -o        -r
    Default   -r
    -------------------------
              SUBJ      SUBJ/OBJ
    W.F.      SUBJ      OBJ

In the previous examples we did not show the use of selectional restrictions, in order to focus on the mapping. Here we show how applying selectional restrictions enhances the mapping. Without selectional restrictions, there would be two possible mappings for the sentence above: object to recipient, and object to theme. "Serve" has many senses; the sense used here is "to help someone with food or drink". The conceptual types for the arguments of "serve" are: person for agent and recipient, and food and drink for theme. Obviously, "the guests" fits the conceptual type for recipient, but not for theme. Hence, the mapping of object to theme is ruled out, and only the mapping of object to recipient is plausible.

(4) Mary cooked the children

    Verb           Arg1   Arg2
    cook (active)  Mary   the children
    f-structure:   subj   obj

    Error: The theme of "cook" should be a kind of FOOD.
           "The children" is not a kind of FOOD!

The optionality codes defined for "cook" disallow it from having a beneficiary without a theme; hence "the children" cannot be mapped to beneficiary in this example.
Without selectional restrictions, "the children" would be mapped to theme and the sentence would be accepted. Applying selectional restrictions compares the conceptual type of "the children" (i.e. person) with that of the theme (i.e. food). Since they don't match, the sentence is rejected and the error message shown above is issued.
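The passive treatment used in example (2) can be sketched as two small operations (our reconstruction, not the paper's code): suppress the most prominent role, shift the remaining arguments up, and require one sentence argument fewer than in the active frame.

```python
def passivize(roles):
    """Suppress the highest (leftmost, most prominent) thematic role."""
    if not roles:
        raise ValueError('no roles to suppress')
    return roles[1:]

def arity_ok(frame_roles, sentence_args, passive=False):
    """Check that the sentence supplies the expected number of arguments:
    one fewer than the frame when the sentence is passive."""
    expected = len(frame_roles) - (1 if passive else 0)
    return len(sentence_args) == expected

give = ['agent', 'recipient', 'theme']
print(passivize(give))                                      # ['recipient', 'theme']
print(arity_ok(give, ['the boy', 'a book'], passive=True))  # True
```

After suppression, the mapping proceeds exactly as for active sentences, which is why "The boy was given a book" reuses the same machinery as example (1).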

References

1. A. Alsina. The Role of Argument Structure in Grammar: Evidence from Romance. CSLI Publications, Stanford, CA, 1996.
2. J. Bresnan and J. M. Kanerva. Locative inversion in Chichewa: A case study of factorization in grammar. Linguistic Inquiry, 20:1-50, 1989.
3. J. Bresnan and L. Moshi. Object asymmetries in comparative Bantu syntax. Linguistic Inquiry, 21:147-185, 1990.
4. D. Dowty. Word Meaning and Montague Grammar. Reidel, Dordrecht, 1979.
5. J. Grimshaw. Argument Structure. The MIT Press, Cambridge, MA, 1990.
6. R. Kaplan and J. Bresnan. Lexical-Functional Grammar: A formal system for grammatical representation. In The Mental Representation of Grammatical Relations, pages 173-281. The MIT Press, Cambridge, MA, 1982.
7. R. M. Kaplan and J. T. Maxwell. LFG Grammar-Writer's Workbench. Technical Report, Xerox PARC, 1996.
8. T. Mahdi. A Computational Approach to Lexical Functional Mapping. MSc thesis, The University of Western Ontario, May 2001.
9. G. A. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. Miller. Introduction to WordNet: An on-line lexical database. International Journal of Lexicography, 3:235-244, 1990.
10. K. P. Mohanan and T. Mohanan. Semantic representation in the architecture of LFG. Paper presented at the LFG Workshop, Grenoble, France, August 1996.
11. Z. Vendler. Linguistics in Philosophy. Cornell University Press, Ithaca, NY, 1967.