CS 545 Lecture XV: Parsing


Benjamin Snyder

Announcements
  Readings sent out:
    Bayesian probability (Wasserman, "All of Statistics")
    Part-of-Speech (Jurafsky and Martin)
    Parsing (Jurafsky and Martin)
  Next two weeks: Parsing and machine translation
  After Spring break: review and midterm
  After that: Project

Parse Trees
  Central to the description of NL syntax
  Parts of speech were a first step
  Today:
    Constituents
    Dependencies
    Context-free grammars for English

Noun Phrases
  Examples:
    the elephant arrived
    it arrived
    elephants arrived
    the big ugly elephant arrived
    the elephant I love to hate arrived
  (They all appear in the same context: before a verb.)

Other Kinds of Phrases
  Prepositional phrases:
    on Tuesday
    in March
    under the leaking roof
  Sentences (clauses):
    John loves Mary
    John loves the woman he thinks is Mary
    sometimes John thinks he is Mary
  Verb phrases, adjective phrases, adverb phrases...

What Makes A Phrase A Phrase?
  You can move it (fronting, passivizing, inversion to form a question):
    she makes delicious cake → delicious cake she makes
  You can conjoin it with a similar thing:
    the cat died → the cat and the mouse died
  You can replace it with a pronoun, "do", "there", or "then":
    the furry kittens lost their mittens → they lost them
    the professor eats snacks... and the student does (too)
  It can be an answer to a Wh question:
    What did he do? Taught computer science.

Production Rules
  Alternative ways to build a particular kind of phrase:
    NP → Determiner Noun
    NP → ProperNoun
    Determiner → an
    Determiner → the
    Noun → elephant
    ProperNoun → Smith
  Note the use of parts of speech!
  Yes, you can write this in BNF if you'd like.

Building Noun Phrases
    NP → Determiner N' | ProperNoun
    N' → Noun | AP N' | N' PP
    AP → Adv AP | Adj
    PP → Preposition NP
  Rules like Determiner → the | an | a are the kinds of part-of-speech rules you'd need for a POS tagger (e.g., HMM emissions).
  These rules, and generalizations of them, are sometimes called the lexicon.
  Can integrate morphology here.

A Complex NP
    the very large man on the broken roof with a headache

Context-Free Grammars
  Vocabulary of terminal symbols, Σ
  Set of nonterminal symbols (AKA variables), N
  Special start symbol S ∈ N
  Production rules of the form X → α, where
    X ∈ N (a nonterminal symbol) and
    α ∈ (N ∪ Σ)* (a sequence of terminals and nonterminals)

Two Views of CFGs
  A system for generating sentences in the grammar's language:
    Start with an S node.
    While there are any nonterminal symbols, nondeterministically rewrite some nonterminal using a production rule.
    At the end, you have a sequence of terminals.
  A set of rules for assigning structure to (parsing) a sentence

Definitions
  Grammatical: said of a sentence in the language
  Ungrammatical: said of a sentence not in the language
  Derivation: sequence of top-down production steps
  Parse tree: graphical representation of the derivation
  A string is grammatical iff there's a derivation for it.
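The generating view of a CFG can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the dictionary layout, the function name `derive`, the rule S → NP VP, and the Verb entries are assumptions chosen so that the toy grammar is non-recursive and every derivation terminates.

```python
import random

# Toy grammar: each nonterminal maps to its alternative right-hand sides.
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Determiner", "Noun"], ["ProperNoun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "Determiner": [["an"], ["the"]],
    "Noun": [["elephant"]],
    "ProperNoun": [["Smith"]],
    "Verb": [["arrived"], ["loves"]],   # hypothetical lexicon entries
}

def derive(grammar, start="S", rng=random):
    """Return the list of sentential forms of one random derivation."""
    form = [start]
    steps = [list(form)]
    while True:
        # Positions that still hold a nonterminal.
        positions = [i for i, sym in enumerate(form) if sym in grammar]
        if not positions:
            return steps          # only terminals are left
        i = rng.choice(positions)                      # pick some nonterminal
        form[i:i + 1] = rng.choice(grammar[form[i]])   # rewrite it with one rule
        steps.append(list(form))

if __name__ == "__main__":
    for step in derive(GRAMMAR):
        print(" ".join(step))
```

Each printed line is one step of a derivation; the last line is a grammatical sentence with respect to this toy grammar, whether or not it is good English.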

Declarative Sentences
    S → NP VP
  VP (verb phrase) is typically what you used to call a predicate: the verb and its right-side arguments, like object, indirect object, etc.

Questions
  Yes/no questions: S → AuxVerb NP VP
  Wh-as-subject: S → WhNP VP
  Wh-as-something else: S → WhNP Aux NP VP

High-Level Points
  The rules I/the book have given you are great in some cases.
  Some failures:
    overgenerating (generate bad English)
    ambiguity
    undergenerating (trees or sentences)
  Remember: there's no spec! Getting the right grammar is a matter of research, not mere implementation.
  There's a difference between "ungrammatical as English" and "ungrammatical with respect to a given grammar".

Agreement
    John loves Mary
    *John love Mary
    These men are very smart
    *This clever little children want some books
  How do we make subjects agree with verbs, or determiners agree with nouns?

Agreement, Using More Detailed Rules
    S → NP VP
    S3sg → NP3sg VP3sg
    SOther → NPOther VPOther
    NP3sg → Det N'3sg | ProperNoun3sg
    N'3sg → N3sg | AP N'3sg | N'3sg PP
    VP3sg → TransitiveVerb3sg NP
    ...

Verb Arguments
  A related problem: some verbs require certain constellations of arguments.
    VP → TransitiveVerb NP
    VP → IntransitiveVerb
    VP → DitransitiveVerb NP PP | DitransitiveVerb NP NP
    VP → STakingVerb that S
    VP → VPTakingVerb to VP
    TransitiveVerb → kill | love
    IntransitiveVerb → eat | sleep
    DitransitiveVerb → show | give
    STakingVerb → know | believe
    VPTakingVerb → want | need
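One way to see the cost of encoding agreement in the category names is to generate the detailed rules mechanically. The sketch below is only an illustration under assumptions: the trailing `*` marker (meaning "this symbol carries the agreement feature"), the function name `specialize`, and the rule list are hypothetical conventions, not the lecture's notation.

```python
# A trailing "*" marks symbols that must share one agreement value.
BASE_RULES = [
    ("S*", ["NP*", "VP*"]),
    ("NP*", ["Det", "N'*"]),
    ("NP*", ["ProperNoun*"]),
    ("N'*", ["N*"]),
    ("N'*", ["AP", "N'*"]),
    ("N'*", ["N'*", "PP"]),
    ("VP*", ["TransitiveVerb*", "NP"]),
    ("PP", ["Preposition", "NP"]),      # no agreement feature involved
]

def specialize(rules, values):
    """Copy every rule that mentions a starred symbol once per agreement value."""
    specialized = []
    for lhs, rhs in rules:
        symbols = [lhs] + rhs
        if not any(s.endswith("*") for s in symbols):
            specialized.append((lhs, list(rhs)))   # unaffected rule, kept once
            continue
        for value in values:
            # Replace the "*" marker by the concrete value, e.g. NP* -> NP3sg.
            new = [s[:-1] + value if s.endswith("*") else s for s in symbols]
            specialized.append((new[0], new[1:]))
    return specialized

if __name__ == "__main__":
    for lhs, rhs in specialize(BASE_RULES, ["3sg", "Other"]):
        print(lhs, "→", " ".join(rhs))
```

With two agreement values, every rule that carries the feature is doubled; adding further features (case, gender) multiplies the rule count again, which is one motivation for feature-based grammars.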

Dependencies
  A somewhat different view of English grammar.
  The words are the vertices in a graph.
  Every word has a parent (except the root), forming a tree.
  The edges may be labeled to denote grammatical relations:
    subject, object, indirect object of a verb
    complement of a preposition or copula
    temporal adverbial

Dependency Tree
    I gave him my address on Tuesday

Context-Free Dependency Grammars
    gave → I (subject) gave
    gave → gave (indirect object) him
    gave → gave (object) address
    address → my (attributive) address
    gave → gave (temporal) on
    on → on (preposition complement) Tuesday
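The dependency tree for the example sentence can be stored as a list of labeled arcs. The sketch below assumes a simple encoding (word indices, one head per word, `None` for the root); the names `ARCS`, `heads_from_arcs`, and `is_tree` are illustrative and not from the lecture.

```python
WORDS = ["I", "gave", "him", "my", "address", "on", "Tuesday"]

# (dependent index, head index, label), read off the rules above.
ARCS = [
    (0, 1, "subject"),
    (2, 1, "indirect object"),
    (4, 1, "object"),
    (3, 4, "attributive"),
    (5, 1, "temporal"),
    (6, 5, "preposition complement"),
]

def heads_from_arcs(n, arcs):
    """Return heads[i] = index of word i's parent, or None for the root."""
    heads = [None] * n
    for dep, head, _label in arcs:
        if heads[dep] is not None:
            raise ValueError(f"word {dep} has more than one parent")
        heads[dep] = head
    return heads

def is_tree(heads):
    """True if there is exactly one root and no cycle among the head links."""
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:
        return False
    for i in range(len(heads)):
        seen = set()
        node = i
        while heads[node] is not None:   # climb towards the root
            if node in seen:
                return False             # came back to a visited word: cycle
            seen.add(node)
            node = heads[node]
    return True

if __name__ == "__main__":
    heads = heads_from_arcs(len(WORDS), ARCS)
    for dep, head, label in ARCS:
        print(f"{WORDS[head]} --{label}--> {WORDS[dep]}")
    print("root:", WORDS[heads.index(None)], "| tree:", is_tree(heads))
```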

Food For Thought
  How are we going to find the structures?
  How are we going to decide among competing parses?
  Where are the rules going to come from?

Parsing
  Given a grammar G and a sentence x = (x1, x2, ..., xn), find the best parse tree.
  We're not going to simply build it step by step; we need to entertain many partial possibilities in parallel.

First View: Parsing as Search
  [Figure: S at the top and the words x1 x2 ... xn at the bottom; search can proceed top-down from S or bottom-up from the words.]
  Trees break into pieces (partial trees), which can be used to define a search space.

Top-Down Parsing (Recursive Descent)
  x = Book that flight (SLP p. 432)
  Search states, expanded one level at a time from (S):
    (S)

    (S (NP) (VP))
    (S Aux (NP) (VP))
    (S (VP))

    (S (NP Pronoun) (VP))
    (S (NP ProperNoun) (VP))
    (S (NP Det Nominal) (VP))
    (S Aux (NP Pronoun) (VP))
    (S Aux (NP ProperNoun) (VP))
    (S Aux (NP Det Nominal) (VP))
    (S (VP (VP) (PP)))
    (S (VP Verb))
    (S (VP Verb (NP)))
    (S (VP Verb (NP) (PP)))
    (S (VP Verb (PP)))
  Never wastes time exploring ungrammatical trees!
  Inefficiency: most search states (partial trees) could never lead to a derivation of our sentence.
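A recursive-descent recognizer for "book that flight" might look like the sketch below. It is a simplified illustration, not the SLP algorithm: it only answers yes/no rather than building trees, the rule Nominal → Noun and the three-word lexicon are assumptions, and the left-recursive rule VP → VP PP is deliberately left out because naive recursive descent would loop forever on it.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Pronoun"], ["ProperNoun"], ["Det", "Nominal"]],
    "Nominal": [["Noun"]],                       # assumed rule
    # VP -> VP PP omitted: left recursion never terminates here.
    "VP": [["Verb"], ["Verb", "NP"], ["Verb", "NP", "PP"], ["Verb", "PP"]],
    "PP": [["Preposition", "NP"]],
}

# Parts of speech each word may have (assumed lexicon).
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}

def parse(symbol, words, start):
    """Yield every position where a `symbol` phrase starting at `start` can end."""
    if symbol not in GRAMMAR:                    # part-of-speech symbol
        if start < len(words) and symbol in LEXICON.get(words[start], ()):
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:                  # try each expansion in turn
        yield from expand(rhs, words, start)

def expand(rhs, words, start):
    """Yield end positions for the symbol sequence `rhs` starting at `start`."""
    if not rhs:
        yield start
        return
    for mid in parse(rhs[0], words, start):
        yield from expand(rhs[1:], words, mid)

def recognize(words):
    """True if some derivation from S covers the whole sentence."""
    return any(end == len(words) for end in parse("S", words, 0))

if __name__ == "__main__":
    print(recognize("book that flight".split()))
    print(recognize("that book".split()))
```

The search tries S → NP VP and S → Aux NP VP first and abandons them only when the first word fails to match, which is the inefficiency noted above: expansions are proposed without looking at the input.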

Bottom-Up Parsing
  Search states, built upward from the words:
    book that flight

    (Noun book) (Det that) (Noun flight)
    (Verb book) (Det that) (Noun flight)

    (Nominal (Noun book)) (Det that) (Nominal (Noun flight))
    (Verb book) (Det that) (Nominal (Noun flight))

    (Nominal (Noun book)) (NP (Det that) (Nominal (Noun flight)))
  Never generates trees that are inconsistent with the sentence.
  Generates partial trees that have no hope of getting to S.
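The bottom-up search can be sketched as a breadth-first search over sequences of symbols, where each move replaces the right-hand side of some rule by its left-hand side. This is an illustration under assumptions: the rule list (including Nominal → Noun) is a cut-down version of the grammar used in the top-down example, states record only category labels rather than full partial trees, and the names `reductions` and `bottom_up` are hypothetical.

```python
from collections import deque

# (left-hand side, right-hand side); lexical rules are included as ordinary rules.
RULES = [
    ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")), ("S", ("VP",)),
    ("NP", ("Pronoun",)), ("NP", ("ProperNoun",)), ("NP", ("Det", "Nominal")),
    ("Nominal", ("Noun",)),
    ("VP", ("Verb",)), ("VP", ("Verb", "NP")),
    ("Verb", ("book",)), ("Noun", ("book",)),
    ("Det", ("that",)), ("Noun", ("flight",)),
]

def reductions(state):
    """Yield every state reachable by replacing one rule's RHS with its LHS."""
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(state) - n + 1):
            if state[i:i + n] == rhs:
                yield state[:i] + (lhs,) + state[i + n:]

def bottom_up(words, goal=("S",)):
    """Return (reached_goal, number_of_states_generated)."""
    start = tuple(words)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True, len(seen)
        for nxt in reductions(state):
            if nxt not in seen:          # avoid revisiting the same sequence
                seen.add(nxt)
                queue.append(nxt)
    return False, len(seen)

if __name__ == "__main__":
    print(bottom_up("book that flight".split()))
    print(bottom_up("that book".split()))
```

Every state is consistent with the input words, but many of them, such as the one that treats "book" as a Nominal, are dead ends that can never be reduced to S.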

Ambiguity Redux
  A sentence may have many parses.
  Even if a sentence has only one parse, finding it may be difficult, because there are many misleading paths you could follow.
    Bottom-up: fragments that can never have a home in any S
    Top-down: fragments that never get you to x
  What to do when there are many parses... how to choose? Return them all?

Classical NLP: Parsing
    Fed raises interest rates 0.5 percent
  Write symbolic or logical rules:
    Grammar (CFG):
      ROOT → S
      S → NP VP
      NP → DT NN
      NP → NN NNS
      NP → NP PP
      VP → VBP NP
      VP → VBP NP PP
      PP → IN NP
    Lexicon:
      NN → interest
      NNS → raises
      VBP → interest
      VBZ → raises
  Use deduction systems to prove parses from words:
    Minimal grammar on "Fed raises" sentence: 36 parses
    Simple 10-rule grammar: 592 parses
    Real-size grammar: many millions of parses
  This scaled very badly, didn't yield broad-coverage tools.

Ambiguities: PP Attachment

Attachments
    I cleaned the dishes from dinner
    I cleaned the dishes with detergent
    I cleaned the dishes in my pajamas
    I cleaned the dishes in the sink

PP Attachment

Syntactic Ambiguities I
  Prepositional phrases:
    They cooked the beans in the pot on the stove with handles.
  Particle vs. preposition:
    The puppy tore up the staircase.
  Complement structures:
    The tourists objected to the guide that they couldn't hear.
    She knows you like the back of her hand.
  Gerund vs. participial adjective:
    Visiting relatives can be boring.
    Changing schedules frequently confused passengers.
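How quickly PP attachment ambiguity grows can be checked by counting parses with a CKY-style chart. The grammar and lexicon below are assumptions made for this sketch (binary rules only, and the pronoun "I" tagged directly as an NP to avoid unary rules); they are not the grammar behind the parse counts quoted in the lecture.

```python
from collections import defaultdict

# Binary rules as (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to a noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

LEXICON = {
    "I": ["NP"], "cleaned": ["V"], "in": ["P"],
    "the": ["Det"], "my": ["Det"],
    "dishes": ["N"], "sink": ["N"], "kitchen": ["N"], "pajamas": ["N"],
}

def count_parses(words):
    """Number of distinct S trees over `words` under BINARY and LEXICON."""
    n = len(words)
    chart = defaultdict(int)          # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[i, i + 1, tag] += 1
    for width in range(2, n + 1):                 # shorter spans first
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):             # split point
                for parent, left, right in BINARY:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, "S"]

if __name__ == "__main__":
    sentence = "I cleaned the dishes"
    print(count_parses(sentence.split()), sentence)
    for pp in ["in the sink", "in the kitchen", "in my pajamas"]:
        sentence += " " + pp
        print(count_parses(sentence.split()), sentence)
```

Under this toy grammar the counts for zero to three trailing PPs are 1, 2, 5, and 14, so each additional prepositional phrase multiplies the number of analyses even though a human reader settles on one.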

Syntactic Ambiguities II
  Modifier scope within NPs:
    impractical design requirements
    plastic cup holder
  Multiple gap constructions:
    The chicken is ready to eat.
    The contractors are rich enough to sue.
  Coordination scope:
    Small rats and mice can squeeze into holes or cracks in the wall.

Dark Ambiguities
  Dark ambiguities: most analyses are shockingly bad (meaning, they don't have an interpretation you can get your mind around)
  This analysis corresponds to the correct parse of "This will panic buyers!"
  Unknown words and new usages
  Solution: We need mechanisms to focus attention on the best ones; probabilistic techniques do this.

Human Processing
  Garden pathing
  Ambiguity maintenance