
Natural Language Processing, SoSe 2017: Syntactic parsing. Dr. Mariana Neves, May 22nd, 2017

Syntactic parsing: finding structural relationships between words in a sentence. (http://nlp.stanford.edu:8080/parser)

Motivation: grammar checking, e.g., flagging a sentence that fails to parse. (http://nlp.stanford.edu:8080/parser)

Motivation: speech recognition, e.g., detecting a recognition hypothesis that fails to parse. (http://nlp.stanford.edu:8080/parser)

Motivation: machine translation, e.g., detecting a sentence that fails to parse. (http://hpsg.fu-berlin.de/~stefan/cgi-bin/babel.cgi)

Motivation: relation extraction, supporting the extraction of relations, e.g., using dependency trees. (http://nlp.stanford.edu:8080/corenlp/)

Motivation: question answering, supporting the extraction of the question target and its details, e.g., using dependency trees. (http://nlp.stanford.edu:8080/corenlp/)

Constituency: parsing is based on constituency (phrase structure). We organize words into nested constituents: groups of words that can act as single units. (http://nlp.stanford.edu:8080/parser)

Constituency: constituents can be moved around as single units.
The writer talked to the audience about his new book.
The writer talked about his new book to the audience.
About his new book the writer talked to the audience.
*The writer talked about to the audience his new book. (moving a non-constituent is ungrammatical)

Context Free Grammar (CFG): a grammar G consists of terminals (T), non-terminals (N), a start symbol (S), and rules (R). [Figure: parse tree for "I buy a flight to Berlin"]

Context Free Grammar (CFG): terminals are the set of words in the text. [Figure: the parse tree with the leaf words I, buy, a, flight, to, Berlin highlighted]

Context Free Grammar (CFG): non-terminals are the constituents of the language. [Figure: the parse tree with the constituent nodes highlighted]

Context Free Grammar (CFG): the start symbol is the main constituent of the language. [Figure: the parse tree with the root S highlighted]

Context Free Grammar (CFG): rules (the grammar proper) are productions with a single non-terminal on the left and any number of terminals and non-terminals on the right, e.g., S → NP VP. [Figure: parse tree]

Context Free Grammar (CFG): an example grammar.
S → NP VP       VP → VBP NP       PRP → I
S → VP          VP → VBP NP PP    NN → book
NP → NN         VP → VP PP        VBP → buy
NP → PRP        VP → VP NP        DT → a
NP → DT NN      PP → TO NNP       NN → flight
NP → NP NP                        TO → to
NP → NP PP                        NNP → Berlin
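This toy grammar can be written down directly in NLTK, one of the tools listed at the end of this lecture; a minimal sketch:

```python
import nltk

# The toy grammar from the slides in NLTK's CFG notation.
grammar = nltk.CFG.fromstring("""
    S   -> NP VP | VP
    NP  -> NN | PRP | DT NN | NP NP | NP PP
    VP  -> VBP NP | VBP NP PP | VP PP | VP NP
    PP  -> TO NNP
    PRP -> 'I'
    NN  -> 'book' | 'flight'
    VBP -> 'buy'
    DT  -> 'a'
    TO  -> 'to'
    NNP -> 'Berlin'
""")

print(grammar.start())             # S
print(len(grammar.productions()))  # 19 rules in total
```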

CFG: the lexical rules tag the words: I/PRP buy/VBP a/DT flight/NN to/TO Berlin/NNP. [Figure: pre-terminals above the words]

CFG: applying NP → PRP, NP → DT NN, PP → TO NNP, VP → VBP NP PP, and S → NP VP on top of the tags yields the full parse. [Figure: complete parse tree for "I buy a flight to Berlin"]

Dependency grammars: no constituents, but typed dependencies between words. The links are labeled (typed), e.g., object of a preposition, passive auxiliary. (http://nlp.stanford.edu/software/dependencies_manual.pdf)
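For comparison, a quick way to look at typed dependencies is spaCy (see the tools slide at the end); a sketch, assuming the small English model is installed:

```python
import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("I buy a flight to Berlin.")

# Every word is linked to its head by a labeled (typed) relation.
for token in doc:
    print(f"{token.text:<8} --{token.dep_}--> {token.head.text}")
```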

Main grammar fragments: sentence, noun phrase, and verb phrase, plus agreement and sub-categorization.

Grammar Fragments: Sentence.
Declaratives: A plane left. (S → NP VP)
Imperatives: Leave! (S → VP)
Yes-no questions: Did the plane leave? (S → Aux NP VP)
Wh questions: Which airlines fly from Berlin to London? (S → Wh-NP VP)

Grammar Fragments: Noun Phrases (NP). Each NP has a central, critical noun called the head. The material around the head can be expressed using pre-nominals, the words that come before the head, and post-nominals, the words that come after the head. (http://en.wikipedia.org/wiki/noun_phrase)

Grammar Fragments: NP pre-nominals.
Simple lexical items: the, this, a, an, ... (a car)
Simple possessives: John's car
Complex recursive possessives: John's sister's friend's car
Quantifiers, cardinals, ordinals: three cars
Adjectives: large cars

Grammar Fragments: NP post-nominals.
Prepositional phrases: I book a flight from Seattle
Non-finite clauses (-ing, -ed, infinitive): There is a flight arriving before noon; I need to have dinner served; Which is the last flight to arrive in Boston?
Relative clauses: I want a flight that serves breakfast

Agreement: constraints that hold among various constituents, which a rule or set of rules must take into account. Example: determiners and head nouns in NPs have to agree in number: this flight, those flights; *this flights, *those flight.

Agreement: grammars that do not consider such constraints over-generate. They accept and assign correct structures to grammatical examples (this flight), but also accept incorrect examples (*these flight).

Agreement at sentence level: similar constraints hold at sentence level. Example: subject and verb have to agree in number and person: John flies, we fly; *John fly, *we flies.

Agreement, a possible CFG solution: split every category by number.
Ssg → NPsg VPsg
Spl → NPpl VPpl
NPsg → Detsg Nsg
NPpl → Detpl Npl
VPsg → Vsg NPsg
VPpl → Vpl NPpl
...
Shortcoming: this introduces far too many rules into the system.
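A minimal NLTK sketch of this rule-duplication idea; the words and the split categories (DTsg, NNpl, and so on) are invented for the illustration:

```python
import nltk

# Every category is split by number, so mismatched determiners,
# nouns, and verbs can no longer combine.
agr = nltk.CFG.fromstring("""
    S    -> NPsg VPsg | NPpl VPpl
    NPsg -> DTsg NNsg
    NPpl -> DTpl NNpl
    VPsg -> VBZ
    VPpl -> VBP
    DTsg -> 'this'
    DTpl -> 'these'
    NNsg -> 'flight'
    NNpl -> 'flights'
    VBZ  -> 'leaves'
    VBP  -> 'leave'
""")
parser = nltk.ChartParser(agr)
print(len(list(parser.parse("this flight leaves".split()))))  # 1 parse
print(len(list(parser.parse("these flight leave".split()))))  # 0: rejected
```

Every feature that must agree (number, person, case, ...) multiplies the rule set again, which is exactly the shortcoming noted on the slide.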

Grammar Fragments: VP. VPs consist of a head verb along with zero or more constituents called arguments:
VP → V (disappear)
VP → V NP (prefer a morning flight)
VP → V PP (fly on Thursday)
VP → V NP PP (leave Boston in the morning)
VP → V NP NP (give me the flight number)
Arguments: obligatory ones are complements; optional ones are adjuncts.

Grammar Fragments: VP. Solution (sub-categorization): sub-categorize the verbs according to the sets of VP rules they can participate in. Modern grammars distinguish more than 100 subcategories.

Sub-categorization examples:
sneeze: John sneezed
find: Please find [a flight to NY]NP
give: Give [me]NP [a cheaper fare]NP
help: Can you help [me]NP [with a flight]PP
prefer: I prefer [to leave earlier]TO-VP
tell: I was told [United has a flight]S
Violations: *John sneezed the book, *I prefer United has a flight, *Give with a flight.
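One way to picture sub-categorization is as a lexicon mapping each verb to the argument frames it licenses; a hypothetical sketch mirroring the examples above (the names and frame labels are invented for illustration):

```python
# Hypothetical sub-categorization lexicon.
SUBCAT = {
    "sneeze": [[]],                # John sneezed
    "find":   [["NP"]],            # find [a flight to NY]NP
    "give":   [["NP", "NP"]],      # give [me]NP [a cheaper fare]NP
    "help":   [["NP", "PP"]],      # help [me]NP [with a flight]PP
    "prefer": [["TO-VP"]],         # prefer [to leave earlier]TO-VP
    "tell":   [["S"]],             # was told [United has a flight]S
}

def licenses(verb: str, frame: list) -> bool:
    """True if `verb` allows the argument frame `frame`."""
    return frame in SUBCAT.get(verb, [])

print(licenses("sneeze", []))      # True
print(licenses("sneeze", ["NP"]))  # False: *John sneezed the book
```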

Parsing: given a sentence and a grammar, return a proper parse tree. [Figure: the rules NP → PRP, NP → DT NN, PP → TO NNP, VP → VBP NP PP, S → NP VP plus the sentence "I buy a flight to Berlin." produce the parse tree]

Parsing: the tree must cover all and only the elements of the input string. [Figure: parse tree aligned with the words]

Parsing: the tree must reach the start symbol at its top. [Figure: parse tree rooted in S]
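With the toy grammar written in NLTK earlier, a chart parser returns exactly such trees; a minimal sketch (the ambiguous toy grammar yields several parses):

```python
import nltk

# `grammar` is the toy CFG from the earlier sketch.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("I buy a flight to Berlin".split()):
    tree.pretty_print()  # draws the tree in ASCII
```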

Parsing Algorithms: top-down and bottom-up.

Parsing Algorithms, Top-Down: start with the rule that contains S and work down to the words. [Figure: parse tree built from the top]

Parsing Algorithms, Bottom-Up: start with trees that link up with the words and work up to larger and larger trees. [Figure: parse tree built from the words]

Top-Down vs. Bottom-Up:
Top-down only searches for trees that can be answers (i.e., rooted in S), but also suggests trees that are not consistent with any of the words.
Bottom-up only forms trees consistent with the words, but suggests trees that make no sense globally.
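NLTK ships naive implementations of both strategies; a sketch over a small left-recursion-free variant of the grammar, since naive top-down parsing loops on left-recursive rules such as NP → NP PP:

```python
import nltk

toy = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> PRP | DT NN
    VP  -> VBP NP PP
    PP  -> TO NNP
    PRP -> 'I'
    VBP -> 'buy'
    DT  -> 'a'
    NN  -> 'flight'
    TO  -> 'to'
    NNP -> 'Berlin'
""")
sent = "I buy a flight to Berlin".split()

# Top-down: expand from S toward the words.
for tree in nltk.RecursiveDescentParser(toy).parse(sent):
    print(tree)

# Bottom-up: shift words, then reduce them into larger constituents.
for tree in nltk.ShiftReduceParser(toy).parse(sent):
    print(tree)
```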

Top-Down vs. Bottom-Up: in both cases we must keep track of the search space and make choices.
Backtracking: make a choice; if it works out, great! If not, back up and make a different choice (duplicated work).
Dynamic programming: avoids repeated work, solves exponential problems in polynomial time, and stores ambiguous structures efficiently.

Dynamic programming methods: CKY (Cocke-Kasami-Younger), bottom-up; Earley, top-down.

Chomsky Normal Form (CNF): every CFG can be converted to an equivalent grammar whose rules all have the form A → B C or A → w, where A, B, C are non-terminals and w is a terminal.

Chomsky Normal Form: conversion to CNF replaces a long rule such as A → B C D with X → B C and A → X D, where X is a fresh non-terminal.
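A sketch of this binarization step on plain (lhs, rhs) tuples; full CNF conversion also removes unit and empty productions and isolates terminals, which is omitted here:

```python
def binarize(rules):
    """Split right-hand sides longer than 2, as in A -> B C D above."""
    out, fresh = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            fresh += 1
            new = f"X{fresh}"            # fresh non-terminal
            out.append((new, rhs[:2]))   # X -> B C
            rhs = (new,) + rhs[2:]       # A -> X D
        out.append((lhs, rhs))
    return out

print(binarize([("VP", ("VBP", "NP", "PP"))]))
# [('X1', ('VBP', 'NP')), ('VP', ('X1', 'PP'))]
```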

Cocke-Younger-Kasami (CKY) parsing: for a rule A → B C, if an A is somewhere in the input, there must be a B followed by a C. If the A spans positions i to j, there must be a k with i < k < j such that B spans i to k and C spans k to j. [Figure: 0 I 1 buy 2 a 3 flight 4 to 5 Berlin 6, with i, k, j marked]

CKY Parsing: filling the chart for "I buy a flight to Berlin" (word boundaries 0..6, one cell per span [i,j]):

[0,1]: PRP, NP (PRP → I, NP → PRP)
[1,2]: VBP (VBP → buy)
[2,3]: DT (DT → a)
[3,4]: NN (NN → flight), which completes [2,4]: NP (NP → DT NN), [1,4]: VP (VP → VBP NP), and [0,4]: S (S → NP VP)
[4,5]: TO (TO → to)
[5,6]: NNP (NNP → Berlin), which completes [4,6]: PP (PP → TO NNP), [1,6]: VP (VP → VP PP), and [0,6]: S (S → NP VP)

[Figures: the chart after each of these steps]
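The walkthrough above corresponds to the following CKY recognizer sketch; the unit production NP → PRP is folded into the lexical entries, just as the chart's [0,1] cell holds both PRP and NP:

```python
from collections import defaultdict

def cky(words, lexical, binary):
    """CKY recognizer: fills table[i, j] with every non-terminal
    that can derive words[i:j]."""
    n = len(words)
    table = defaultdict(set)
    for i, w in enumerate(words):
        table[i, i + 1] |= lexical[w]
    for span in range(2, n + 1):          # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # split point: i < k < j
                for B in table[i, k]:
                    for C in table[k, j]:
                        table[i, j] |= binary.get((B, C), set())
    return table

lexical = {"I": {"PRP", "NP"}, "buy": {"VBP"}, "a": {"DT"},
           "flight": {"NN"}, "to": {"TO"}, "Berlin": {"NNP"}}
binary = {("DT", "NN"): {"NP"}, ("TO", "NNP"): {"PP"},
          ("VBP", "NP"): {"VP"}, ("VP", "PP"): {"VP"},
          ("NP", "VP"): {"S"}}

table = cky("I buy a flight to Berlin".split(), lexical, binary)
print("S" in table[0, 6])   # True: the whole sentence is an S
```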

Probabilistic Context Free Grammar (PCFG): terminals (T), non-terminals (N), start symbol (S), rules (R), and a probability function (P).

Probabilistic Context Free Grammar:
0.9 S → NP VP      0.4 VP → VBP NP     1.0 PRP → I
0.1 S → VP         0.3 VP → VP PP      0.6 NN → book
0.3 NP → NN        0.5 VP → VP NP      0.4 NN → flight
0.4 NP → PRP       1.0 PP → TO NNP     0.7 VBP → buy
0.1 NP → DT NN                         0.8 DT → a
0.2 NP → NP NP                         1.0 TO → to
0.1 NP → NP PP                         1.0 NNP → Berlin
Use a treebank to calculate the probabilities.
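The same grammar written as an NLTK PCFG; note that NLTK requires the probabilities of each left-hand side to sum to 1, so the numbers below are lightly renormalized relative to the slide (where, e.g., the NP rules sum to 1.1 and VBP → buy has probability 0.7 with no alternative):

```python
import nltk

# The slides' PCFG, renormalized so each LHS sums to 1.0.
pcfg = nltk.PCFG.fromstring("""
    S   -> NP VP [0.9] | VP [0.1]
    NP  -> NN [0.2] | PRP [0.4] | DT NN [0.1] | NP NP [0.2] | NP PP [0.1]
    VP  -> VBP NP [0.4] | VP PP [0.3] | VP NP [0.3]
    PP  -> TO NNP [1.0]
    PRP -> 'I' [1.0]
    NN  -> 'book' [0.6] | 'flight' [0.4]
    VBP -> 'buy' [1.0]
    DT  -> 'a' [1.0]
    TO  -> 'to' [1.0]
    NNP -> 'Berlin' [1.0]
""")
```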

Treebank: a corpus in which each sentence has been paired with a parse tree. Treebanks are generally created by parsing the collection with an automatic parser and then having human annotators correct each parse where required. (http://www.nactem.ac.uk/ant/genia.html)

Statistical Parsing: consider the rule probabilities while parsing a sentence and select the parse tree with the highest probability. P(t), the probability of a tree t, is the product of the probabilities of the rules used to generate it.

Statistical Parsing example: for the tree
(S:0.9 (NP:0.4 (PRP:1.0 I)) (VP:0.3 (VP:0.4 (VBP:0.7 buy) (NP:0.1 (DT:0.8 a) (NN:0.4 flight))) (PP:1.0 (TO:1.0 to) (NNP:1.0 Berlin))))
P(t) = 0.9 × 0.4 × 1.0 × 0.3 × 0.4 × 0.7 × 0.1 × 0.8 × 0.4 × 1.0 × 1.0 × 1.0 ≈ 0.00097
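The arithmetic, spelled out in Python:

```python
from math import prod

# Probabilities of the twelve rules used in the tree above.
rule_probs = [0.9, 0.4, 1.0, 0.3, 0.4, 0.7, 0.1, 0.8, 0.4, 1.0, 1.0, 1.0]
print(prod(rule_probs))   # ~0.00096768, up to floating-point rounding
```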

Probabilistic CKY Parsing: the same chart, but every entry carries a probability, computed as the rule's probability times the probabilities of its two children. [Figure: the probabilistic chart for "I buy a flight to Berlin"; the final S in cell [0,6] has probability 0.9 × 0.4 × 1.0 × 0.3 × 0.4 × 0.7 × 0.1 × 0.8 × 0.4 × 1.0 × 1.0 × 1.0 ≈ 0.00097, the same P(t) as above]
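In NLTK, probabilistic CKY-style search for the single best tree is available as the Viterbi parser; a sketch using the `pcfg` defined above (its renormalized probabilities give a slightly different value than the slide's chart):

```python
import nltk

parser = nltk.ViterbiParser(pcfg)
for tree in parser.parse("I buy a flight to Berlin".split()):
    print(tree.prob())   # probability of the most likely tree
    tree.pretty_print()
```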

Summary: constituency parsing; context-free grammars; noun phrases and verb phrases; sub-categorization; bottom-up and top-down parsing; the CKY algorithm for CFG parsing; probabilistic CFGs.

Tools: spaCy (https://spacy.io/), Stanford CoreNLP (https://stanfordnlp.github.io/corenlp/), NLTK (http://www.nltk.org/), and others.

Further Reading: Speech and Language Processing, chapters 12 (grammar), 13 (syntactic parsing), and 14 (statistical parsing).