Phrase Structure and Parsing as Search


Informatics 2A: Lecture 17. John Longley, School of Informatics, University of Edinburgh. 24 October 2014.

Outline

1 Phrase Structure
  Heads and Phrases
  Desirable Properties of a Grammar
  A Fragment of English

Heads and Phrases

Noun (N): Noun Phrase (NP)
Adjective (A): Adjective Phrase (AP)
Verb (V): Verb Phrase (VP)
Preposition (P): Prepositional Phrase (PP)

So far we have looked at terminals (words or POS tags). Today, we'll look at non-terminals, which correspond to phrases. The class that a word belongs to is closely linked to the name of the phrase it customarily appears in. In an X-phrase (e.g. NP), the key occurrence of X (e.g. N) is called the head. In English, the head tends to appear in the middle of a phrase.

Heads and Phrases

English NPs are commonly of the form: (Det) Adj* Noun (PP | RelClause)*
NP: the angry duck that tried to bite me, head: duck.

VPs are commonly of the form: (Aux) Adv* Verb Arg* Adjunct*
  Arg → NP | PP
  Adjunct → PP | AdvP | ...
VP: usually eats artichokes for dinner, head: eat.

In Japanese, Korean, Hindi, Urdu, and other head-final languages, the head is at the end of its associated phrase. In Irish, Welsh, Scots Gaelic and other head-initial languages, the head is at the beginning of its associated phrase.

Desirable Properties of a Grammar

Chomsky specified two properties that make a grammar "interesting and satisfying":
  It should be a finite specification of the strings of the language, rather than a list of its sentences.
  It should be revealing, in allowing strings to be associated with meaning (semantics) in a systematic way.

We can add another desirable property:
  It should capture structural and distributional properties of the language (e.g. where heads of phrases are located; how a sentence transforms into a question; which phrases can float around the sentence).

Desirable Properties of a Grammar

Context-free grammars (CFGs) provide a pretty good approximation.
Some features of NLs are more easily captured using mildly context-sensitive grammars, as we'll see later in the course.
There are also more modern grammar formalisms that better capture structural and distributional properties of human languages (e.g. combinatory categorial grammar).
But LL(1) grammars and the like definitely aren't enough for NLs. Even if we could make an NL grammar LL(1), we wouldn't want to: this would artificially suppress ambiguities, and would often mutilate the natural structure of sentences.

A Tiny Fragment of English

Let's say we want to capture in a grammar the structural and distributional properties that give rise to sentences like:

  A duck walked in the park.          NP,V,PP
  The man walked with a duck.         NP,V,PP
  You made a duck.                    Pro,V,NP
  You made her duck.                  ? Pro,V,NP
  A man with a telescope saw you.     NP,PP,V,Pro
  A man saw you with a telescope.     NP,V,Pro,PP
  You saw a man with a telescope.     Pro,V,NP,PP

We want to write grammatical rules that generate these phrase structures, and lexical rules that generate the words appearing in them.

Grammar for the Tiny Fragment of English

Grammar G1 generates the sentences on the previous slide:

Grammatical rules
  S → NP VP
  NP → Det N
  NP → Det N PP
  NP → Pro
  VP → V NP PP
  VP → V NP
  VP → V
  PP → Prep NP

Lexical rules
  Det → a | the | her (determiners)
  N → man | park | duck | telescope (nouns)
  Pro → you (pronoun)
  V → saw | walked | made (verbs)
  Prep → in | with | for (prepositions)

Does G1 produce a finite or an infinite number of sentences?
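To get a feel for what G1 derives, the grammar can be written down as data and expanded mechanically. The sketch below is illustrative only. It assumes G1 has the grammatical rules S → NP VP; NP → Det N | Det N PP | Pro; VP → V NP PP | V NP | V; PP → Prep NP, plus the lexical rules listed on the slide. The names `G1` and `sentences` are invented for this example. The function collects every word string that has a derivation tree of height at most `depth`.

```python
from itertools import product

# Assumed encoding of G1: each non-terminal maps to its list of right-hand sides.
G1 = {
    "S":    [["NP", "VP"]],
    "NP":   [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP":   [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP":   [["Prep", "NP"]],
    "Det":  [["a"], ["the"], ["her"]],
    "N":    [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro":  [["you"]],
    "V":    [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}

def sentences(symbol, depth, grammar=G1):
    """Word tuples derivable from `symbol` with a tree of height <= depth."""
    if symbol not in grammar:      # a word: it derives only itself
        return {(symbol,)}
    if depth == 0:                 # no expansions left for a non-terminal
        return set()
    result = set()
    for rhs in grammar[symbol]:
        # Expand every symbol of the right-hand side with one level less.
        parts = [sentences(s, depth - 1, grammar) for s in rhs]
        for combo in product(*parts):
            result.add(tuple(word for part in combo for word in part))
    return result

print(len(sentences("S", 3)))  # 39
print(len(sentences("S", 4)))  # 546
```

Under these assumptions the set grows quickly as `depth` is raised, which is a useful hint when thinking about the question on the slide.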

Recursion in a grammar makes it possible to generate an infinite number of sentences.

In direct recursion, a non-terminal on the LHS of a rule also appears on its RHS. The following rules add direct recursion to G1:
  VP → VP Conj VP
  Conj → and | or

In indirect recursion, some non-terminal can be expanded (via several steps) to a sequence of symbols containing that non-terminal:
  NP → Det N PP
  PP → Prep NP
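The difference between direct and indirect recursion can be checked mechanically. The following is a rough sketch rather than anything from the lecture: it encodes only the phrase-structure rules assumed for G1, adds a coordination rule VP → VP Conj VP as an assumed example of direct recursion, and uses an invented helper `recursive_nonterminals`. A non-terminal counts as directly recursive if it occurs on the right-hand side of one of its own rules, and as indirectly recursive if it can only reach itself through other rules.

```python
# Phrase-structure rules assumed for G1; lexical categories are left unexpanded.
G1_RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
}
# The same rules plus an assumed coordination rule (direct recursion on VP).
WITH_CONJ = dict(G1_RULES, VP=G1_RULES["VP"] + [["VP", "Conj", "VP"]])

def recursive_nonterminals(grammar):
    """Return (direct, indirect) sets of recursive non-terminals."""
    # For each LHS, the non-terminals that occur on one of its right-hand sides.
    reach = {lhs: {s for rhs in rules for s in rhs if s in grammar}
             for lhs, rules in grammar.items()}
    direct = {lhs for lhs in grammar if lhs in reach[lhs]}
    # Transitive closure: keep adding symbols reachable in further steps.
    changed = True
    while changed:
        changed = False
        for lhs in reach:
            further = set().union(*(reach[s] for s in reach[lhs])) - reach[lhs]
            if further:
                reach[lhs] |= further
                changed = True
    indirect = {lhs for lhs in grammar if lhs in reach[lhs]} - direct
    return direct, indirect

for name, grammar in [("G1", G1_RULES), ("G1 + Conj", WITH_CONJ)]:
    direct, indirect = recursive_nonterminals(grammar)
    print(name, sorted(direct), sorted(indirect))
# G1 [] ['NP', 'PP']
# G1 + Conj ['VP'] ['NP', 'PP']
```

Any recursive non-terminal that is reachable from S and can derive some string of words is enough to make the set of sentences infinite.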

You saw a man with a telescope.

  (S (NP (Pro You)) (VP (V saw) (NP (Det a) (N man) (PP (Prep with) (NP (Det a) (N telescope))))))

  (S (NP (Pro You)) (VP (V saw) (NP (Det a) (N man)) (PP (Prep with) (NP (Det a) (N telescope)))))

This illustrates attachment ambiguity: the PP can be a part of the VP or of the NP. Note that there's no POS ambiguity here.
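The two attachments can be reproduced by enumerating every tree a grammar assigns to the sentence. The sketch below is one simple way of doing that, under the assumption that G1 has the rules shown in the dictionary; `parses`, `expand` and `bracket` are names made up for this example. It tries every way of dividing the words among the symbols on a rule's right-hand side, which is fine for a toy grammar but not efficient.

```python
# Assumed encoding of G1: each non-terminal maps to its list of right-hand sides.
G1 = {
    "S":    [["NP", "VP"]],
    "NP":   [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP":   [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP":   [["Prep", "NP"]],
    "Det":  [["a"], ["the"], ["her"]],
    "N":    [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro":  [["you"]],
    "V":    [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}

def parses(symbol, words, grammar=G1):
    """Yield every tree rooted in `symbol` whose leaves are exactly `words`."""
    if symbol not in grammar:              # a word must match a one-word span
        if words == (symbol,):
            yield symbol
        return
    for rhs in grammar[symbol]:
        for children in expand(rhs, words, grammar):
            yield (symbol, *children)

def expand(rhs, words, grammar):
    """Yield lists of subtrees, one per symbol in `rhs`, jointly covering `words`."""
    if not rhs:
        if not words:
            yield []
        return
    first, rest = rhs[0], rhs[1:]
    # Every symbol derives at least one word, so leave enough words for `rest`.
    for cut in range(1, len(words) - len(rest) + 1):
        for left in parses(first, words[:cut], grammar):
            for right in expand(rest, words[cut:], grammar):
                yield [left] + right

def bracket(tree):
    """Render a tree as a bracketed string."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(bracket(c) for c in tree[1:]) + ")"

for tree in parses("S", tuple("you saw a man with a telescope".split())):
    print(bracket(tree))
```

With this encoding the loop prints two trees: one in which the PP is a sister of the object NP inside the VP, and one in which it is part of the NP.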

Grammar G1 only gives us one analysis of you made her duck.

  (S (NP (Pro You)) (VP (V made) (NP (Det her) (N duck))))

There is another, ditransitive (i.e., two-object) analysis of this sentence, one that underlies the pair: What did you make for her? You made her duck.

For this alternative, G1 also needs rules like:
  NP → N
  VP → V NP NP
  Pro → her

  (S (NP (Pro You)) (VP (V made) (NP (Det her) (N duck))))

  (S (NP (Pro You)) (VP (V made) (NP (Pro her)) (NP (N duck))))

In this case, the structural ambiguity is rooted in POS ambiguity.

There is a third analysis as well, one that underlies the pair: What did you make her do? You made her duck. (duck: move head or body quickly downwards)

Here, the small clause (her duck) is the direct object of a verb. Similar small clauses are possible with verbs like see, hear and notice, but not ask, want, persuade, etc.

G1 needs a rule that requires accusative case-marking on the subject of a small clause and no tense on its verb:
  VP → V S1
  S1 → NP(acc) VP(untensed)
  NP(acc) → her | him | them

Now we have three analyses for you made her duck:

  (S (NP (Pro You)) (VP (V made) (NP (Det her) (N duck))))

  (S (NP (Pro You)) (VP (V made) (NP (Pro her)) (NP (N duck))))

  (S (NP (Pro You)) (VP (V made) (S1 (NP(acc) her) (VP (V duck)))))

How can we compute these analyses automatically?
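Before turning to parsers proper, it is possible to count how many analyses a grammar assigns to a sentence without building the trees. The sketch below does this with memoised counting over spans of the input. It rests on several assumptions: G1 is encoded as in the dictionary below, the extra rules are NP → N, VP → V NP NP, Pro → her, VP → V S1, S1 → NP(acc) VP(untensed) and NP(acc) → her | him | them, and the untensed verb is supplied by a shortcut rule VP(untensed) → duck that is not in the lecture. The names `extend` and `count_analyses` are invented here.

```python
from functools import lru_cache

G1 = {
    "S":    [["NP", "VP"]],
    "NP":   [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP":   [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP":   [["Prep", "NP"]],
    "Det":  [["a"], ["the"], ["her"]],
    "N":    [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro":  [["you"]],
    "V":    [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}
# Assumed additions for the ditransitive and small-clause analyses.
EXTRA = {
    "NP":  [["N"]],
    "VP":  [["V", "NP", "NP"], ["V", "S1"]],
    "Pro": [["her"]],
    "S1":  [["NP(acc)", "VP(untensed)"]],
    "NP(acc)": [["her"], ["him"], ["them"]],
    "VP(untensed)": [["duck"]],   # shortcut, not from the lecture
}

def extend(base, extra):
    """Return a new grammar containing the rules of both arguments."""
    merged = {lhs: list(rules) for lhs, rules in base.items()}
    for lhs, rules in extra.items():
        merged.setdefault(lhs, []).extend(rules)
    return merged

def count_analyses(grammar, sentence, start="S"):
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        """Number of trees rooted in `symbol` that span words[i:j]."""
        if symbol not in grammar:
            return int(j - i == 1 and words[i] == symbol)
        return sum(split(tuple(rhs), i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def split(rhs, i, j):
        """Number of ways the symbols in `rhs` can jointly cover words[i:j]."""
        if not rhs:
            return int(i == j)
        return sum(count(rhs[0], i, k) * split(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return count(start, 0, len(words))

G2 = extend(G1, EXTRA)
print(count_analyses(G1, "you made her duck"))  # 1
print(count_analyses(G2, "you made her duck"))  # 3
```

Counting only says how many analyses there are. The parsing algorithms that follow are about actually constructing them.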

Parsing Algorithms

A parser is an algorithm that computes a structure for an input string given a grammar. All parsers have two fundamental properties:
  Directionality: the sequence in which the structures are constructed (e.g., top-down or bottom-up).
  Search strategy: the order in which the search space of possible analyses is explored (e.g., depth-first, breadth-first).
For instance, LL(1) parsing is top-down and depth-first.

Coming up: A zoo of parsing algorithms

As we've noted, LL(1) isn't good enough for NL. We'll be looking at other parsing algorithms that work for more general CFGs.
  Recursive descent parsers (top-down). Simple and very general, but inefficient, and with other problems.
  Shift-reduce parsers (bottom-up).
  The Cocke-Younger-Kasami algorithm (bottom-up). Works for any CFG with reasonable efficiency.
  The Earley algorithm (top-down). Chart parsing enhanced with prediction.

A recursive descent parser treats a grammar as a specification of how to break down a top-level goal into subgoals. Therefore:
  The parser searches through the trees licensed by the grammar to find the one that has the required sentence as its yield.
  Directionality = top-down: it starts from the start symbol of the grammar, and works its way down to the terminals.
  Search strategy = depth-first: it expands a given non-terminal as far as possible before proceeding to the next one.

Algorithm Sketch:

1 The top-level goal is to derive the start symbol (S).
2 Choose a grammatical rule with S as its LHS (e.g., S → NP VP), and replace S with the RHS of the rule (the subgoals; e.g., NP and VP).
3 Choose a rule with the leftmost subgoal as its LHS (e.g., NP → Det N). Replace the subgoal with the RHS of the rule.
4 Whenever you reach a lexical rule (e.g., Det → the), match its RHS against the current position in the input string.
  If it matches, move on to the next position in the input.
  If it doesn't, try the next lexical rule with the same LHS.
  If there are no rules with the same LHS, backtrack to the most recent choice of grammatical rule and choose another rule with the same LHS.
  If there are no more grammatical rules, back up to the previous subgoal.
5 Iterate until the whole input string is consumed, or you fail to match one of the positions in the input. Backtrack on failure.
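The steps above can be sketched as a small backtracking recogniser. This is a minimal illustration under stated assumptions, not the lecture's implementation: the grammar is the encoding of G1 assumed in the dictionary below, `descend` and `recognise` are invented names, and Python generators stand in for explicit backtracking (when one rule yields no way forward, the loop simply moves on to the next rule with the same LHS).

```python
# Assumed encoding of G1: each non-terminal maps to its list of right-hand sides.
G1 = {
    "S":    [["NP", "VP"]],
    "NP":   [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP":   [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP":   [["Prep", "NP"]],
    "Det":  [["a"], ["the"], ["her"]],
    "N":    [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro":  [["you"]],
    "V":    [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}

def descend(goals, position, words, grammar):
    """Yield each input position reached after deriving all `goals` from `position`."""
    if not goals:
        yield position
        return
    first, rest = goals[0], goals[1:]
    if first not in grammar:
        # A word: it must match the current position in the input.
        if position < len(words) and words[position] == first:
            yield from descend(rest, position + 1, words, grammar)
        return
    for rhs in grammar[first]:
        # Replace the leftmost subgoal by the RHS of a rule (depth-first).
        yield from descend(rhs + rest, position, words, grammar)

def recognise(sentence, grammar, start="S"):
    """True if some derivation from `start` consumes the whole input."""
    words = sentence.split()
    return any(end == len(words)
               for end in descend([start], 0, words, grammar))

print(recognise("the man saw a duck in the park", G1))  # True
print(recognise("saw you", G1))                         # False
```

One limitation is worth knowing about: with a left-recursive rule such as VP → VP Conj VP, this style of parser keeps expanding the same subgoal without consuming input and never terminates.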

Worked example (tree diagrams): recursive descent parsing of "the dog saw a man in the park". The successive snapshots show the parser matching "the", trying the nouns "man" and "park" before "dog" matches, attempting a PP after "the dog" and backtracking when P "in" does not match the input, then continuing with V "saw", "a", "man" and P "in", backtracking once more, and finally reaching an analysis that covers the whole input.

A shift-reduce parser tries to find sequences of words and phrases that correspond to the right-hand side of a grammar production and replace them with the left-hand side:
  Directionality = bottom-up: starts with the words of the input and tries to build trees from the words up.
  Search strategy = breadth-first: starts with the words, then applies rules with matching right-hand sides, and so on until the whole sentence is reduced to an S.

Algorithm Sketch:

Until the words in the sentence are substituted with S:
  Scan through the input until we recognise something that corresponds to the RHS of one of the production rules (shift).
  Apply a production rule in reverse; i.e., replace the RHS of the rule which appears in the sentential form with the LHS of the rule (reduce).

A shift-reduce parser implemented using a stack:
1 start with an empty stack
2 a shift action pushes the current input symbol onto the stack
3 a reduce action replaces n items with a single item
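The stack-based description can be turned into a short program. The following is a sketch under several assumptions: it uses the rules assumed for G1 (written out in `G1_TEXT`), treats lexical rules as ordinary reductions, and explores the possible shift and reduce choices breadth-first over (stack, position) configurations so that a wrong early reduction does not block the parse. The name `shift_reduce` and the action strings are invented for the example.

```python
from collections import deque

G1_TEXT = """
S -> NP VP
NP -> Det N | Det N PP | Pro
VP -> V NP PP | V NP | V
PP -> Prep NP
Det -> a | the | her
N -> man | park | duck | telescope
Pro -> you
V -> saw | walked | made
Prep -> in | with | for
"""
# One (LHS, RHS-tuple) pair per alternative.
RULES = [(lhs.strip(), tuple(alt.split()))
         for line in G1_TEXT.strip().splitlines()
         for lhs, alts in [line.split("->")]
         for alt in alts.split("|")]

def shift_reduce(sentence, rules, start="S"):
    """Return the actions of the first complete parse found, or None."""
    words = tuple(sentence.split())
    initial = ((), 0)                  # (stack, number of words shifted)
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        (stack, pos), actions = queue.popleft()
        if stack == (start,) and pos == len(words):
            return actions
        successors = []
        for lhs, rhs in rules:
            n = len(rhs)
            if stack[-n:] == rhs:      # reduce: top n items match a RHS
                new_stack = stack[:-n] + (lhs,)
                successors.append(((new_stack, pos),
                                   f"reduce {lhs} -> {' '.join(rhs)}"))
        if pos < len(words):           # shift: push the next input word
            successors.append(((stack + (words[pos],), pos + 1),
                               f"shift {words[pos]}"))
        for state, action in successors:
            if state not in seen:
                seen.add(state)
                queue.append((state, actions + [action]))
    return None

for action in shift_reduce("the man saw a duck", RULES):
    print(action)
```

A purely greedy variant, which always reduces when it can and never revisits a choice, is simpler but can fail on sentences the grammar does generate. The search over configurations here trades efficiency for completeness.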

Worked example (stack and remaining text diagrams): shift-reduce parsing of "my dog saw a man in the park with a s...". The stack starts empty with the whole sentence as remaining text. "my" and "dog" are shifted and reduced to a phrase; "saw" is shifted and reduced to V; "a man" is shifted and reduced; "in the park" is shifted and reduced under P, leaving "with a s..." as remaining text. The final snapshots show further reductions on the stack combining "my dog", V "saw", "a man" and "in the park".

Try it out Yourselves!

Recursive Descent Parser
>>> from nltk.app import rdparser
>>> rdparser()

Shift-reduce Parser
>>> from nltk.app import srparser
>>> srparser()

Summary

  We use CFGs to represent NL grammars.
  Grammars need recursion to produce infinitely many sentences.
  Most NL grammars have structural ambiguity.
  A parser computes structure for an input automatically.
  Recursive descent and shift-reduce parsing.
  We'll examine more parsers in Lectures 17-22.

Reading: J&M (2nd edition) Chapter 12 (intro to section 12.3), Chapter 13 (intro to section 13.3).
Next lecture: The CYK algorithm.