Natural Language Processing SoSe Parsing. (based on the slides of Dr. Saeedeh Momtazi)


Natural Language Processing SoSe 2015 Parsing Dr. Mariana Neves May 18th, 2014 (based on the slides of Dr. Saeedeh Momtazi)

Parsing
Finding structural relationships between words in a sentence (http://nlp.stanford.edu:8080/parser)

Parsing Applications
Grammar checking
Speech recognition
Machine translation
Relation extraction
Question answering

Parsing Applications: Grammar checking
By failing to parse a sentence (http://nlp.stanford.edu:8080/parser)

Parsing Applications: Speech recognition
By failing to parse a sentence (http://nlp.stanford.edu:8080/parser)

Parsing Applications: Machine translation
By failing to parse a sentence (http://hpsg.fu-berlin.de/~stefan/cgi-bin/babel.cgi)

Parsing Applications: Relation extraction (http://nlp.stanford.edu:8080/corenlp/)

Parsing Applications: Question answering (http://nlp.stanford.edu:8080/corenlp/)

Outline
Phrase Structure
Syntactic Parsing
CKY Algorithm
Statistical Parsing


Constituency
Working based on constituency (phrase structure) (http://nlp.stanford.edu:8080/parser)
  Organizing words into nested constituents
  Showing that groups of words can act as single units
  Forming coherent classes from these units that can behave in similar ways
    With respect to their internal structure
    With respect to other units in the language
  Considering a head word for each constituent

Constituency
The writer talked to the audience about his new book.
The writer talked about his new book to the audience.
About his new book the writer talked to the audience.
*The writer talked about to the audience his new book.
(* marks an ungrammatical sentence: the constituent "about his new book" cannot be split.)

Context Free Grammar (CFG)
Grammar G consists of
  Terminals (T): the set of words in the text
  Non-terminals (N): the constituents in a language (noun phrase, verb phrase, ...)
  Start symbol (S): the main constituent of the language (sentence)
  Rules (R): equations that consist of a single non-terminal on the left and any number of terminals and non-terminals on the right, e.g. S → NP VP
Example parse tree for "I buy a flight to Berlin":
(S (NP (PRP I)) (VP (VBP buy) (NP (DT a) (NN flight)) (PP (TO to) (NNP Berlin))))

Context Free Grammar (CFG)
S → NP VP
S → VP
NP → NN
NP → PRP
NP → DT NN
NP → NP NP
NP → NP PP
VP → VBP NP
VP → VBP NP PP
VP → VP PP
VP → VP NP
PP → TO NNP
PRP → I
NN → book
NN → flight
VBP → buy
DT → a
TO → to
NNP → Berlin

CFG
Tags:  PRP VBP DT NN TO NNP
Words: I buy a flight to Berlin
Rules used:
  NP → PRP
  NP → DT NN
  PP → TO NNP
  VP → VBP NP PP
  S → NP VP
Resulting tree:
(S (NP (PRP I)) (VP (VBP buy) (NP (DT a) (NN flight)) (PP (TO to) (NNP Berlin))))
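One way to make the four components of a CFG concrete is to store the rules as data and check that a tree only uses licensed rules. The sketch below assumes a tuple-based tree encoding; the names RULES, LEXICON, tree_is_licensed and leaves are illustrative choices, not part of the slides.

```python
# Toy grammar from the example, stored as (left-hand side, right-hand side) pairs.
# All names here are illustrative assumptions.
RULES = {
    ("S", ("NP", "VP")),
    ("NP", ("PRP",)),
    ("NP", ("DT", "NN")),
    ("PP", ("TO", "NNP")),
    ("VP", ("VBP", "NP", "PP")),
}
LEXICON = {
    ("PRP", "I"), ("VBP", "buy"), ("DT", "a"),
    ("NN", "flight"), ("TO", "to"), ("NNP", "Berlin"),
}

# A tree is (label, child, child, ...); a word is a plain string.
TREE = ("S",
        ("NP", ("PRP", "I")),
        ("VP", ("VBP", "buy"),
               ("NP", ("DT", "a"), ("NN", "flight")),
               ("PP", ("TO", "to"), ("NNP", "Berlin"))))


def is_preterminal(tree):
    """A preterminal node has exactly one child, and that child is a word."""
    return len(tree) == 2 and isinstance(tree[1], str)


def tree_is_licensed(tree):
    """Return True if every local tree matches a rule or a lexicon entry."""
    label, *children = tree
    if is_preterminal(tree):
        return (label, children[0]) in LEXICON
    rhs = tuple(child[0] for child in children)
    return (label, rhs) in RULES and all(tree_is_licensed(c) for c in children)


def leaves(tree):
    """Collect the words at the leaves, left to right."""
    if is_preterminal(tree):
        return [tree[1]]
    return [word for child in tree[1:] for word in leaves(child)]


print(tree_is_licensed(TREE), " ".join(leaves(TREE)))
```

The check mirrors the definition: terminals appear only at the leaves, and each internal node with its children must correspond to one rule.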


Parsing
Taking a string and a grammar and returning proper parse tree(s) for that string
Covering all and only the elements of the input string
Reaching the start symbol at the top of the string
Example:
  Grammar: NP → PRP; NP → DT NN; PP → TO NNP; VP → VBP NP PP; S → NP VP
  String: I buy a flight to Berlin.
  Parse tree: (S (NP (PRP I)) (VP (VBP buy) (NP (DT a) (NN flight)) (PP (TO to) (NNP Berlin))))

Main Grammar Fragments
Sentence
Noun Phrase
Verb Phrase
Agreement
Sub-categorization

Grammar Fragments: Sentence
Declaratives: A plane left. (S → NP VP)
Imperatives: Leave! (S → VP)
Yes-No Questions: Did the plane leave? (S → Aux NP VP)
Wh Questions: Which airlines fly from Berlin to London? (S → Wh-NP VP)

Grammar Fragments: NP
Each NP has a central critical noun called the head
The head of an NP can be accompanied by
  Pre-nominals: the words that can come before the head
  Post-nominals: the words that can come after the head
(http://en.wikipedia.org/wiki/noun_phrase)

Grammar Fragments: NP, pre-nominals
Simple lexical items (the, this, a, an, ...): a car
Simple possessives: John's car
Complex recursive possessives: John's sister's friend's car
Quantifiers, cardinals, ordinals...: three cars
Adjectives: large cars

Grammar Fragments: NP, post-nominals
Prepositional phrases: I book a flight from Seattle
Non-finite clauses (-ing, -ed, infinitive):
  There is a flight arriving before noon
  I need to have dinner served
  Which is the last flight to arrive in Boston?
Relative clauses: I want a flight that serves breakfast

Agreement
Having constraints that hold among various constituents
Considering these constraints in a rule or set of rules
Example: determiners and the head nouns in NPs have to agree in number
  Grammatical: This flight, Those flights
  Ungrammatical: *This flights, *Those flight

Agreement
Grammars that do not consider constraints will over-generate
  Accepting and assigning correct structures to grammatical examples (this flight)
  But also accepting incorrect examples (*these flight)

Agreement at sentence level
Considering similar constraints at sentence level
Example: subject and verb in sentences have to agree in number and person
  Grammatical: John flies, We fly
  Ungrammatical: *John fly, *We flies

Agreement
How to solve the agreement problem in parsing?

Agreement: possible CFG solution
Ssg → NPsg VPsg
Spl → NPpl VPpl
NPsg → Detsg Nsg
NPpl → Detpl Npl
VPsg → Vsg NPsg
VPpl → Vpl NPpl
...
Shortcoming: introducing many rules in the system
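The growth in rule count can be made visible with a small sketch that copies each base rule once per agreement value. It rests on assumptions of mine: each base rule lists which right-hand positions must share the value of the left-hand side, every other position ranges freely over all values (so an object NP need not agree with the verb), and the name expand_agreement is hypothetical.

```python
from itertools import product

# Each base rule: (lhs, rhs, positions in rhs that must agree with lhs).
# Which positions agree is an assumption made for this illustration.
BASE_RULES = [
    ("S", ("NP", "VP"), {0, 1}),
    ("NP", ("Det", "N"), {0, 1}),
    ("VP", ("V", "NP"), {0}),
]


def expand_agreement(rules, values):
    """Copy every rule once per combination of agreement values."""
    expanded = []
    for lhs, rhs, agreeing in rules:
        free = [i for i in range(len(rhs)) if i not in agreeing]
        for head_value in values:
            for free_values in product(values, repeat=len(free)):
                chosen = dict(zip(free, free_values))
                new_rhs = tuple(
                    symbol + (head_value if i in agreeing else chosen[i])
                    for i, symbol in enumerate(rhs)
                )
                expanded.append((lhs + head_value, new_rhs))
    return expanded


number_only = expand_agreement(BASE_RULES, ["sg", "pl"])
# Crude stand-in for number plus person: six combined values.
number_and_person = expand_agreement(
    BASE_RULES, [n + p for n in ("sg", "pl") for p in ("1", "2", "3")]
)
print(len(BASE_RULES), len(number_only), len(number_and_person))  # 3 8 48
```

Three base rules become 8 with number alone and 48 once person is added in this crude way, which is the shortcoming the slide names.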

Grammar Fragments: VP
VPs consist of a head verb along with zero or more constituents called arguments
  VP → V (disappear)
  VP → V NP (prefer a morning flight)
  VP → V PP (fly on Thursday)
  VP → V NP PP (leave Boston in the morning)
  VP → V NP NP (give me the flight number)
Arguments
  Obligatory: complement
  Optional: adjunct

Grammar Fragments: VP
Even though there are many valid VP rules, not all verbs are allowed to participate in all VP rules
  Ungrammatical: *disappear a morning flight

Grammar Fragments: VP
Solution (sub-categorization):
  Sub-categorizing the verbs according to the sets of VP rules that they can participate in
  Modern grammars have more than 100 subcategories

Sub-categorization
Examples:
  sneeze: John sneezed
  find: Please find [a flight to NY]NP
  give: Give [me]NP [a cheaper fare]NP
  help: Can you help [me]NP [with a flight]PP
  prefer: I prefer [to leave earlier]TO-VP
  tell: I was told [United has a flight]S
Ungrammatical:
  *John sneezed the book
  *I prefer United has a flight
  *Give with a flight

Sub-categorization
The over-generation problem also exists in VP rules
  Permitting the presence of strings containing verbs and arguments that do not go together
  *John sneezed the book (licensed by VP → V NP)
Solution: similar to agreement phenomena, we need a way to formally express the constraints
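One simple way to express such constraints is a lexicon that maps each verb to the argument frames it accepts. The sketch below is a minimal illustration: the frames are read off the example sentences above only (real verbs allow more frames), and the names SUBCAT and is_licensed are hypothetical.

```python
# Hypothetical subcategorization lexicon: verb -> set of allowed argument frames.
# Frames are taken from the example sentences only; real lexicons are far larger.
SUBCAT = {
    "sneeze": {()},
    "find": {("NP",)},
    "give": {("NP", "NP")},
    "help": {("NP", "PP")},
    "prefer": {("TO-VP",)},
    "tell": {("S",)},
}


def is_licensed(verb, arguments):
    """Return True if the verb accepts exactly this sequence of arguments."""
    return tuple(arguments) in SUBCAT.get(verb, set())


print(is_licensed("sneeze", []))        # John sneezed
print(is_licensed("sneeze", ["NP"]))    # *John sneezed the book
print(is_licensed("give", ["PP"]))      # *Give with a flight
```

A parser could consult such a table before applying a VP rule, so that VP → V NP is only used with verbs that list the frame ("NP",).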

Parsing Algorithms
Top-Down
Bottom-Up

Parsing Algorithms: Top-Down
Starting with the rules that give us an S
Working on the way down from S to the words
[Figure: candidate expansions S → NP VP and S → VP above the tagged sentence "I buy a flight to Berlin"]

Parsing Algorithms: Bottom-Up
Starting with trees that link up with the words
Working on the way up from words to larger and larger trees
[Figure: parse tree built upward from the words "I buy a flight to Berlin"]

Top-Down vs. Bottom-Up
Top-Down
  Advantage: only searches for trees that can be answers (i.e. S's)
  Disadvantage: also suggests trees that are not consistent with any of the words
Bottom-Up
  Advantage: only forms trees consistent with the words
  Disadvantage: suggests trees that make no sense globally

Top-Down vs. Bottom-Up
In both cases: keep track of the search space and make choices
Solutions
  Backtracking
    Making a choice; if it works out, fine
    If not, backing up and making a different choice
    Drawback: duplicated work
  Dynamic programming
    Avoiding repeated work
    Solving exponential problems in polynomial time
    Storing ambiguous structures efficiently
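The contrast between trying alternatives and caching sub-results can be sketched with a small top-down recognizer. It is an illustration under assumptions, not the lecture's algorithm: it uses only the non-left-recursive rules of the running example, it explores every alternative expansion (the effect backtracking achieves), and functools.lru_cache stands in for dynamic programming by remembering the result for each (symbol, start) pair. The names GRAMMAR, LEXICON and recognize are mine.

```python
from functools import lru_cache

# Non-left-recursive subset of the toy grammar (an assumption of this sketch).
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("PRP",), ("DT", "NN")],
    "VP": [("VBP", "NP", "PP"), ("VBP", "NP")],
    "PP": [("TO", "NNP")],
}
LEXICON = {"I": "PRP", "buy": "VBP", "a": "DT",
           "flight": "NN", "to": "TO", "Berlin": "NNP"}


def recognize(words):
    """Return True if the words can be derived top-down from S."""
    words = tuple(words)

    @lru_cache(maxsize=None)          # cache: each (symbol, start) is solved once
    def ends(symbol, start):
        """All positions where a `symbol` starting at `start` can end."""
        if symbol not in GRAMMAR:     # part-of-speech tag: consume one word
            if start < len(words) and LEXICON.get(words[start]) == symbol:
                return frozenset({start + 1})
            return frozenset()
        result = set()
        for rhs in GRAMMAR[symbol]:   # try every alternative expansion
            positions = {start}
            for child in rhs:
                positions = {e for p in positions for e in ends(child, p)}
            result |= positions
        return frozenset(result)

    return len(words) in ends("S", 0)


print(recognize("I buy a flight to Berlin".split()))
```

A left-recursive rule such as VP → VP PP would make this naive top-down recursion loop forever, which is one reason chart-based methods are used instead.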

Dynamic Programming Methods
CKY (Cocke-Kasami-Younger): bottom-up
Earley: top-down


Chomsky Normal Form (CNF)
Each grammar can be represented by a set of binary rules
  A → B C
  A → w
A, B, C are non-terminals; w is a terminal

Chomsky Normal Form
Converting to Chomsky Normal Form: a rule with three symbols on the right, A → B C D, is replaced by
  X → B C
  A → X D
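The conversion step shown above can be sketched as a loop that peels two symbols at a time off a long right-hand side. This covers only the binarization step; a full CNF conversion also has to deal with unit rules and mixed terminal/non-terminal rules, which the sketch leaves out. The function name binarize and the generated symbol names X1, X2, ... are assumptions.

```python
def binarize(rules):
    """Replace each rule with more than two right-hand symbols by binary rules."""
    result = []
    counter = 0
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        while len(rhs) > 2:
            counter += 1
            new_symbol = f"X{counter}"            # fresh non-terminal
            result.append((new_symbol, rhs[:2]))  # X -> B C
            rhs = (new_symbol,) + rhs[2:]         # A -> X D ...
        result.append((lhs, rhs))
    return result


print(binarize([("A", ("B", "C", "D"))]))
# [('X1', ('B', 'C')), ('A', ('X1', 'D'))]
```

Applied to VP → VBP NP PP from the running example, the same function yields X1 → VBP NP and VP → X1 PP.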

CKY Parsing
For a rule A → B C:
  If there is an A somewhere in the input, then there must be a B followed by a C in the input
  If the A spans from i to j in the input, then there must be a k such that i < k < j
    B spans from i to k
    C spans from k to j
Word positions: 0 I 1 buy 2 a 3 flight 4 to 5 Berlin 6 (in the figure: i = 0, k = 1, j = 6)

CKY Parsing
Chart for "I buy a flight to Berlin" (positions: 0 I 1 buy 2 a 3 flight 4 to 5 Berlin 6), one cell [i,j] per span:
[0,1] [0,2] [0,3] [0,4] [0,5] [0,6]
[1,2] [1,3] [1,4] [1,5] [1,6]
[2,3] [2,4] [2,5] [2,6]
[3,4] [3,5] [3,6]
[4,5] [4,6]
[5,6]

Filling the chart word by word:
  I: [0,1] = PRP, NP (PRP → I, NP → PRP)
  buy: [1,2] = VBP (VBP → buy)
  a: [2,3] = DT (DT → a)
  flight: [3,4] = NN (NN → flight); [2,4] = NP (NP → DT NN); [1,4] = VP (VP → VBP NP); [0,4] = S (S → NP VP)
  to: [4,5] = TO (TO → to)
  Berlin: [5,6] = NNP (NNP → Berlin); [4,6] = PP (PP → TO NNP); [1,6] = VP (VP → VP PP); [0,6] = S (S → NP VP)
All other cells stay empty.
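The chart filling above can be reproduced with a short CKY recognizer. This is a sketch under assumptions: it uses only the rules shown on the chart slides, it handles the unit rule NP → PRP with a single extra pass over the lexical cells, and it fills spans by increasing width rather than word by word (both orders respect the requirement that smaller spans are complete first). The table and function names are mine.

```python
from collections import defaultdict

# Rules from the chart example, indexed for bottom-up lookup.
LEXICAL = {"I": {"PRP"}, "buy": {"VBP"}, "a": {"DT"},
           "flight": {"NN"}, "to": {"TO"}, "Berlin": {"NNP"}}
UNARY = {"PRP": {"NP"}}                                   # NP -> PRP
BINARY = {("DT", "NN"): {"NP"}, ("VBP", "NP"): {"VP"},
          ("NP", "VP"): {"S"}, ("TO", "NNP"): {"PP"},
          ("VP", "PP"): {"VP"}}


def cky(words):
    """Return a chart mapping (i, j) to the set of labels spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):
        cell = chart[i, i + 1]
        cell |= LEXICAL.get(word, set())
        for tag in list(cell):                            # one unit-rule pass
            cell |= UNARY.get(tag, set())
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                     # split point, i < k < j
                for b in chart[i, k]:
                    for c in chart[k, j]:
                        chart[i, j] |= BINARY.get((b, c), set())
    return chart


chart = cky("I buy a flight to Berlin".split())
print(sorted(chart[0, 1]), sorted(chart[1, 6]), sorted(chart[0, 6]))
```

The sentence is accepted when the start symbol S appears in the cell covering the whole input, here [0,6].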


Probabilistic Context Free Grammar (PCFG)
Terminals (T)
Non-terminals (N)
Start symbol (S)
Rules (R)
Probability function (P)

Context Free Grammar (CFG)
S → NP VP
S → VP
NP → NN
NP → PRP
NP → DT NN
NP → NP NP
NP → NP PP
VP → VBP NP
VP → VP PP
VP → VP NP
PP → TO NNP
PRP → I
NN → book
NN → flight
VBP → buy
DT → a
TO → to
NNP → Berlin

Probabilistic Context Free Grammar
0.9  S → NP VP
0.1  S → VP
0.3  NP → NN
0.4  NP → PRP
0.1  NP → DT NN
0.2  NP → NP NP
0.1  NP → NP PP
0.4  VP → VBP NP
0.3  VP → VP PP
0.5  VP → VP NP
1.0  PP → TO NNP
1.0  PRP → I
0.6  NN → book
0.4  NN → flight
0.7  VBP → buy
0.8  DT → a
1.0  TO → to
1.0  NNP → Berlin

Treebank
A treebank is a corpus in which each sentence has been paired with a parse tree
These are generally created by
  Parsing the collection with an automatic parser
  Correcting each parse by human annotators if required
(http://www.nactem.ac.uk/ant/genia.html)

Penn Treebank
Penn Treebank is a widely used treebank for English
Most well-known section: Wall Street Journal section
  1 M words from 1987-1989
Example:
(S (NP (NNP John)) (VP (VBZ flies) (PP (IN to) (NNP Paris))) (. .))
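A bracketed tree of this kind can be read into a nested structure, and the rules it uses can then be counted. The sketch below is a minimal reader for well-formed input only, not a Penn Treebank tool; parse_brackets and rules_of are hypothetical names. Relative frequencies of such counts over a whole treebank are a common way to estimate PCFG rule probabilities, which the slides do not spell out.

```python
import re
from collections import Counter


def parse_brackets(text):
    """Turn a bracketed string into nested tuples (label, child, ...)."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    position = 0

    def read_node():
        nonlocal position
        position += 1                      # skip "("
        label = tokens[position]
        position += 1
        children = []
        while tokens[position] != ")":
            if tokens[position] == "(":
                children.append(read_node())
            else:                          # a word at a leaf
                children.append(tokens[position])
                position += 1
        position += 1                      # skip ")"
        return (label, *children)

    return read_node()


def rules_of(tree):
    """Yield one (lhs, rhs) pair for every node of the tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules_of(child)


TREE = parse_brackets(
    "(S (NP (NNP John)) (VP (VBZ flies) (PP (IN to) (NNP Paris))) (. .))"
)
COUNTS = Counter(rules_of(TREE))
for (lhs, rhs), count in COUNTS.items():
    print(count, lhs, "->", " ".join(rhs))
```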

Statistical Parsing
Considering the corresponding probabilities while parsing a sentence
Selecting the parse tree which has the highest probability
P(t): the probability of a tree t
  Product of the probabilities of the rules used to generate the tree


Statistical Parsing
Parse tree with the probability of the rule applied at each node:
S (0.9)
  NP (0.4)
    PRP (1.0) I
  VP (0.3)
    VP (0.4)
      VBP (0.7) buy
      NP (0.1)
        DT (0.8) a
        NN (0.4) flight
    PP (1.0)
      TO (1.0) to
      NNP (1.0) Berlin
P(t) = 0.9 × 0.4 × 1.0 × 0.3 × 0.4 × 0.7 × 0.1 × 0.8 × 0.4 × 1.0 × 1.0 × 1.0 = 0.00096768
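The product above can be checked mechanically by walking the tree and multiplying the probability of the rule used at each node. The sketch assumes a tuple encoding of the tree and a dictionary holding only the rules this tree needs; RULE_PROB and tree_probability are illustrative names.

```python
# Probabilities of the rules used in the example tree (taken from the PCFG slide).
RULE_PROB = {
    ("S", ("NP", "VP")): 0.9,
    ("NP", ("PRP",)): 0.4,
    ("NP", ("DT", "NN")): 0.1,
    ("VP", ("VBP", "NP")): 0.4,
    ("VP", ("VP", "PP")): 0.3,
    ("PP", ("TO", "NNP")): 1.0,
    ("PRP", ("I",)): 1.0,
    ("VBP", ("buy",)): 0.7,
    ("DT", ("a",)): 0.8,
    ("NN", ("flight",)): 0.4,
    ("TO", ("to",)): 1.0,
    ("NNP", ("Berlin",)): 1.0,
}

# Tree encoding (label, child, ...) is an assumption of this sketch.
TREE = ("S",
        ("NP", ("PRP", "I")),
        ("VP",
         ("VP", ("VBP", "buy"), ("NP", ("DT", "a"), ("NN", "flight"))),
         ("PP", ("TO", "to"), ("NNP", "Berlin"))))


def tree_probability(tree):
    """Multiply the probabilities of all rules used in the tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    probability = RULE_PROB[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            probability *= tree_probability(child)
    return probability


print(tree_probability(TREE))   # approximately 0.00096768
```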

Probabilistic CKY Parsing
Chart for "I buy a flight to Berlin" (positions: 0 I 1 buy 2 a 3 flight 4 to 5 Berlin 6); each filled cell stores a label and its probability:
  I: [0,1] = PRP 1.0 (PRP → I, 1.0); NP 1.0*0.4 (NP → PRP, 0.4)
  buy: [1,2] = VBP 0.7 (VBP → buy, 0.7)
  a: [2,3] = DT 0.8 (DT → a, 0.8)
  flight: [3,4] = NN 0.4 (NN → flight, 0.4)
          [2,4] = NP 0.8*0.4*0.1 (NP → DT NN, 0.1)
          [1,4] = VP 0.7*0.8*0.4*0.1*0.4 (VP → VBP NP, 0.4)
          [0,4] = S 1.0*0.4*0.7*0.8*0.4*0.1*0.4*0.9 (S → NP VP, 0.9)
  to: [4,5] = TO 1.0 (TO → to, 1.0)
  Berlin: [5,6] = NNP 1.0 (NNP → Berlin, 1.0)
          [4,6] = PP 1.0*1.0*1.0 (PP → TO NNP, 1.0)
          [1,6] = VP 0.7*0.8*0.4*0.1*0.4*1.0*1.0*1.0*0.3 (VP → VP PP, 0.3)
          [0,6] = S 1.0*0.4*0.7*0.8*0.4*0.1*0.4*1.0*1.0*1.0*0.3*0.9 (S → NP VP, 0.9)
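The probabilistic chart can be sketched by storing, for every span and label, the highest probability found so far. The code below is an illustration under assumptions: it uses only the rules from the chart example, treats the unit rule NP → PRP in one extra pass over the lexical cells, and keeps probabilities only (no back-pointers, so it does not rebuild the tree). With this toy grammar each cell has a single derivation, so the maximization is never actually contested. All names are mine.

```python
from collections import defaultdict

# Rules with probabilities, indexed for bottom-up lookup.
LEXICAL = {"I": [("PRP", 1.0)], "buy": [("VBP", 0.7)], "a": [("DT", 0.8)],
           "flight": [("NN", 0.4)], "to": [("TO", 1.0)], "Berlin": [("NNP", 1.0)]}
UNARY = {"PRP": [("NP", 0.4)]}                            # NP -> PRP
BINARY = {("DT", "NN"): [("NP", 0.1)], ("VBP", "NP"): [("VP", 0.4)],
          ("NP", "VP"): [("S", 0.9)], ("TO", "NNP"): [("PP", 1.0)],
          ("VP", "PP"): [("VP", 0.3)]}


def probabilistic_cky(words):
    """Return best[(i, j)][label] = highest probability of label over words[i:j]."""
    n = len(words)
    best = defaultdict(dict)
    for i, word in enumerate(words):
        cell = best[i, i + 1]
        for tag, p in LEXICAL.get(word, []):
            cell[tag] = max(cell.get(tag, 0.0), p)
        for tag, p in list(cell.items()):                 # one unit-rule pass
            for parent, q in UNARY.get(tag, []):
                cell[parent] = max(cell.get(parent, 0.0), p * q)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = best[i, j]
            for k in range(i + 1, j):
                for b, pb in best[i, k].items():
                    for c, pc in best[k, j].items():
                        for parent, q in BINARY.get((b, c), []):
                            candidate = pb * pc * q
                            if candidate > cell.get(parent, 0.0):
                                cell[parent] = candidate  # keep the best only
    return best


best = probabilistic_cky("I buy a flight to Berlin".split())
print(best[0, 6]["S"])   # approximately 0.00096768
```

The value in [0,6] agrees with the product P(t) computed for the full tree, since that tree is the only parse this rule set allows.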

Further Reading
Speech and Language Processing, Chapters 12, 13, 14, 15