Syntactic Parsing. Natural Language Processing: Lecture Kairit Sirts


Syntactic Parsing. Natural Language Processing: Lecture 7, 19.10.2017. Kairit Sirts

Homework I: Languages

Homework I: Results
  Average points: 9.35
  Minimum points: 8
  Maximum points: 10
  10 points: everything is done and it is easy to get an overview of what was done and how
  9 points: there are some minor problems with the results and/or the report
  8 points: there are some minor problems with the results and/or it was somewhat difficult to follow the report

Morphology: the internal structure of words
Syntax: the internal structure of sentences
Figure: http://pixelmonkey.org/pub/nlp-training/img/02_parsetree_white.png

Syntactic ambiguity
  https://www.thoughtco.com/syntactic-ambiguity-grammar-1692179

Example: "The chicken is ready to eat."
  Figure: https://cdn.drawception.com/images/panels/2012/3-28/52rw5tarqe-6.png

Figure: https://www.pinterest.com/pin/495747871456246303/

More ambiguous sentences
  I saw the man with binoculars.
  Look at the dog with one eye.
  I watched her duck.
  The peasants are revolting.
  They are cooking apples.
  Stolen painting found by tree.
  Police help dog bite victim.

Syntactic analysis/parsing
  Shallow parsing
  Phrase structure / constituency parsing
  Dependency parsing

The role of syntax in NLP
  Text generation, summarization, machine translation
  Useful features for various information extraction tasks
  Syntactic structure also reflects the semantic relations between the words

Shallow Parsing

Shallow parsing
  Also called chunking or light parsing
  Splits the sentence into non-overlapping syntactic phrases
  Example: [NP The morning flight] [PP from] [NP Denver] [VP has arrived].
  NP: noun phrase; PP: prepositional phrase; VP: verb phrase

BIO tagging
  A labelling scheme often used in information extraction problems; chunking is then treated as a sequence tagging task
  The/B_NP morning/I_NP flight/I_NP from/B_PP Denver/B_NP has/B_VP arrived/I_VP
  B_NP: beginning of a noun phrase
  I_NP: inside a noun phrase
  B_VP: beginning of a verb phrase
  etc.

BIO tagging with only noun phrases
  The/B_NP morning/I_NP flight/I_NP from/O Denver/B_NP has/O arrived/O
  B_NP: beginning of a noun phrase
  I_NP: inside a noun phrase
  O: outside of a noun phrase
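The conversion between chunks and BIO tags can be sketched in a few lines of Python. This is only an illustration: the function names and the `(start, end, label)` span format with an exclusive end index are assumptions made here, not something defined in the lecture.

```python
def chunks_to_bio(tokens, chunks):
    """Turn (start, end, label) spans (end exclusive) into one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, label in chunks:
        tags[start] = f"B_{label}"
        for i in range(start + 1, end):
            tags[i] = f"I_{label}"
    return tags


def bio_to_chunks(tags):
    """Recover (start, end, label) spans from a BIO tag sequence."""
    chunks = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # the extra "O" closes a final chunk
        continues = tag.startswith("I_") and label == tag[2:]
        if start is not None and not continues:
            chunks.append((start, i, label))
            start, label = None, None
        if tag.startswith("B_"):
            start, label = i, tag[2:]
    return chunks


tokens = ["The", "morning", "flight", "from", "Denver", "has", "arrived"]
chunks = [(0, 3, "NP"), (3, 4, "PP"), (4, 5, "NP"), (5, 7, "VP")]
tags = chunks_to_bio(tokens, chunks)
np_only = chunks_to_bio(tokens, [c for c in chunks if c[2] == "NP"])
```

Here `tags` reproduces the full tagging of the example sentence, and `np_only` the noun-phrase-only variant in which all other tokens are tagged `O`.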

Sequence classifier
  Needs annotated data for training: POS-tagged, phrase-annotated
  Use a sequence classifier of your choice
  Figure 12.8: https://web.stanford.edu/~jurafsky/slp3/12.pdf

Evaluation: precision and recall
  https://en.wikipedia.org/wiki/precision_and_recall
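As a rough sketch of how precision and recall apply to chunking, the snippet below scores predicted chunks against gold chunks. It assumes, as is common for chunking evaluation but not stated on the slide, that a predicted chunk counts as correct only when both its boundaries and its label match exactly. The function name and the example prediction are made up for illustration.

```python
def chunk_prf(gold, predicted):
    """Precision, recall and F1 over exact-match (start, end, label) chunks."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold chunks for "The morning flight from Denver has arrived"
gold = [(0, 3, "NP"), (3, 4, "PP"), (4, 5, "NP"), (5, 7, "VP")]
# Hypothetical system output that wrongly merges "from Denver" into one PP
predicted = [(0, 3, "NP"), (3, 5, "PP"), (5, 7, "VP")]

precision, recall, f1 = chunk_prf(gold, predicted)
```

Two of the three predicted chunks are correct (precision 2/3), and two of the four gold chunks are found (recall 1/2).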

Constituency Parsing

Constituency parsing
  Full constituency parsing helps to resolve structural ambiguities
  Figure 12.2: https://web.stanford.edu/~jurafsky/slp3/12.pdf

Structural ambiguities
  Attachment ambiguity: a constituent/phrase can be attached to different places in the tree (the elephant example)
  Coordination ambiguity, for "old men and women" (tagged JJ NNS CC NNS):
    [old [men and women]]: both men and women are old
    [[old men] and [women]]: only the men are old

Bracketed style
  The trees can be represented linearly with brackets:
  (S (Pr I) (Aux will) (VP (V do) (NP (Det my) (N homework))))
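To make the bracketed notation concrete, here is a small reader that turns such a string into a nested structure. It is a minimal sketch: the `(label, children)` tuple representation and the function names are choices made for this example, and the reader assumes well-formed input.

```python
def parse_brackets(text):
    """Read a bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "a constituent must start with '('"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # skip the closing ")"
        return (label, children)

    return read()


def leaves(tree):
    """Collect the words of a tree from left to right."""
    _, children = tree
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


tree = parse_brackets("(S (Pr I) (Aux will) (VP (V do) (NP (Det my) (N homework))))")
sentence = leaves(tree)
```

Reading the leaves back from left to right recovers the original sentence, which is a quick sanity check that the brackets were balanced as intended.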

Context-free grammars: http://slideplayer.com/slide/4559350/

Probabilistic CFGs: http://slideplayer.com/slide/4559350/

A PCFG: http://slideplayer.com/slide/4559350/

The probability of strings and trees: http://slideplayer.com/slide/4559350/

Exercise
  Compute the probability of a tree for "People fish tanks with rods"
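The probability of a tree under a PCFG is the product of the probabilities of all rules used in it. The sketch below shows the computation for one possible tree of the exercise sentence. The rule probabilities are illustrative assumptions chosen so that the rules for each left-hand side sum to one; they are not necessarily the values of the grammar shown on the slides, so the resulting number is not the answer to the exercise.

```python
# Hypothetical PCFG: (left-hand side, right-hand side) -> probability
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("V", "NP", "PP")): 0.4,
    ("NP", ("N",)): 0.7,
    ("NP", ("NP", "PP")): 0.3,
    ("PP", ("P", "NP")): 1.0,
    ("N", ("people",)): 0.5,
    ("N", ("fish",)): 0.2,
    ("N", ("tanks",)): 0.2,
    ("N", ("rods",)): 0.1,
    ("V", ("people",)): 0.1,
    ("V", ("fish",)): 0.6,
    ("V", ("tanks",)): 0.3,
    ("P", ("with",)): 1.0,
}


def tree_prob(tree):
    """Multiply the probabilities of all rules used in a (label, children) tree."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    prob = RULE_PROB[(label, rhs)]
    for child in children:
        if isinstance(child, tuple):
            prob *= tree_prob(child)
    return prob


# One tree for the sentence: the PP "with rods" attaches to the verb phrase
tree = ("S", [
    ("NP", [("N", ["people"])]),
    ("VP", [
        ("V", ["fish"]),
        ("NP", [("N", ["tanks"])]),
        ("PP", [("P", ["with"]), ("NP", [("N", ["rods"])])]),
    ]),
])
p = tree_prob(tree)
```

With these assumed probabilities the product is 1.0 · 0.7 · 0.5 · 0.4 · 0.6 · 0.7 · 0.2 · 1.0 · 1.0 · 0.7 · 0.1 = 0.0008232. The probability of the string would be the sum of this value over all trees that yield the sentence.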

PCFG for efficient parsing
  For efficient parsing, the rules should be unary or binary
  Chomsky normal form: all rules have the form
    X --> Y Z
    X --> w
  where X, Y, Z are non-terminal symbols and w is a terminal symbol
  No epsilon rules

Before binarization: http://slideplayer.com/slide/4559350/

After binarization: http://slideplayer.com/slide/4559350/

Before and after binarization: http://slideplayer.com/slide/4559350/
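Binarization itself is mechanical, as the following sketch suggests. It splits every rule with more than two right-hand-side symbols into a chain of binary rules by introducing intermediate symbols. The `@`-prefixed names for the new symbols and the choice to give the new rules probability 1.0 (so that tree probabilities are unchanged) are conventions assumed for this example. Removing unary and epsilon rules is not covered here.

```python
def binarize(rules):
    """Split rules with more than two RHS symbols into chains of binary rules.

    rules: list of (lhs, rhs_tuple, probability).
    """
    result = []
    for lhs, rhs, prob in rules:
        while len(rhs) > 2:
            new_symbol = f"@{lhs}_{rhs[0]}"  # hypothetical naming scheme
            result.append((lhs, (rhs[0], new_symbol), prob))
            # The remainder is generated by the new symbol with probability 1
            lhs, rhs, prob = new_symbol, rhs[1:], 1.0
        result.append((lhs, rhs, prob))
    return result


rules = [
    ("VP", ("V", "NP", "PP"), 0.4),
    ("VP", ("V", "NP"), 0.6),
]
binary_rules = binarize(rules)
```

The ternary rule VP --> V NP PP becomes VP --> V @VP_V and @VP_V --> NP PP, while the rule that was already binary is left untouched.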

Finding the most likely tree: CKY parsing
  Dynamic programming algorithm
  Proceeds bottom-up and performs Viterbi on trees
  For a full example, see the slides at http://slideplayer.com/slide/4559350/
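A compact version of probabilistic (Viterbi) CKY might look as follows. This is a sketch under stated assumptions: the grammar must be in strict Chomsky normal form (only X --> Y Z and X --> w rules), and the toy grammar below, including all its probabilities, is invented for this example rather than taken from the slides.

```python
from collections import defaultdict


def cky(words, lexical, binary, start="S"):
    """Return (probability, bracketed tree) of the most likely parse, or (0.0, None)."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][X] = highest probability of X over words[i:j]
    back = {}                 # back[(i, j, X)] = the word, or (split point, Y, Z)
    for i, word in enumerate(words):
        for symbol, prob in lexical.get(word, {}).items():
            best[(i, i + 1)][symbol] = prob
            back[(i, i + 1, symbol)] = word
    for length in range(2, n + 1):            # bottom-up: short spans first
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):         # split point
                for (x, y, z), prob in binary.items():
                    if y in best[(i, k)] and z in best[(k, j)]:
                        candidate = prob * best[(i, k)][y] * best[(k, j)][z]
                        if candidate > best[(i, j)].get(x, 0.0):
                            best[(i, j)][x] = candidate
                            back[(i, j, x)] = (k, y, z)

    def build(i, j, symbol):
        entry = back[(i, j, symbol)]
        if isinstance(entry, str):
            return f"({symbol} {entry})"
        k, y, z = entry
        return f"({symbol} {build(i, k, y)} {build(k, j, z)})"

    if start not in best[(0, n)]:
        return 0.0, None
    return best[(0, n)][start], build(0, n, start)


# Hypothetical CNF grammar
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 0.6, ("VP", "VP", "PP"): 0.4,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0}
lexical = {"people": {"NP": 0.4, "V": 0.1}, "fish": {"NP": 0.1, "V": 0.6},
           "tanks": {"NP": 0.2, "V": 0.3}, "rods": {"NP": 0.1}, "with": {"P": 1.0}}

prob, tree = cky("people fish tanks with rods".split(), lexical, binary)
```

Under this toy grammar the sentence has two parses: attaching "with rods" to the verb phrase has probability 0.001152 and attaching it to "tanks" has probability 0.000576, so the chart keeps the former.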

Evaluating constituency parsing: http://slideplayer.com/slide/4559350/

Dependency Parsing

Dependency parsing
  A dependency parse is a directed graph G = (V, A)
    V: the set of vertices, corresponding to words
    A: the set of arcs, corresponding to dependency relations
  Figure labels: labelled dependency relation, root of the sentence, dependent, head
  Visualization with http://corenlp.run/

Dependency parsing
  More compact grammar formalism than CFG
  Figure 14.1: https://web.stanford.edu/~jurafsky/slp3/14.pdf

Dependency relations
  The arrows connect heads and their dependents
  The main verb is the head or the root of the whole sentence
  The arrows are labelled with grammatical functions/dependency relations

Properties of a dependency graph
  A dependency tree is a directed graph that satisfies the following constraints:
  1. There is a single designated root node that has no incoming arcs
     Typically the main verb of the sentence
  2. With the exception of the root node, each node has exactly one incoming arc
     Each dependent has a single head
  3. There is a unique path from the root node to each vertex in V
     The graph is acyclic and connected
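These three constraints can be checked directly on a set of arcs. The sketch below assumes a particular encoding that is not prescribed by the slides: words are numbered from 1, an artificial ROOT node has index 0 (as in the transition-based example later in the lecture), and each arc is a `(head, dependent)` pair.

```python
def is_dependency_tree(n_words, arcs):
    """Check the tree constraints for arcs given as (head, dependent) pairs.

    Nodes are 0..n_words, where 0 is an artificial ROOT (an assumption of this sketch).
    """
    heads = {}
    for head, dependent in arcs:
        if not 0 <= head <= n_words:
            return False          # arc starts at an unknown node
        if dependent == 0:
            return False          # constraint 1: the root has no incoming arc
        if dependent in heads:
            return False          # constraint 2: at most one head per word
        heads[dependent] = head
    if set(heads) != set(range(1, n_words + 1)):
        return False              # constraint 2: every word needs a head
    for node in range(1, n_words + 1):
        seen = set()
        while node != 0:          # constraint 3: following heads must reach the root
            if node in seen:
                return False      # a cycle never reaches the root
            seen.add(node)
            node = heads[node]
    return True


# "The cat sat on the mat": 1=The, 2=cat, 3=sat, 4=on, 5=the, 6=mat
good = [(2, 1), (3, 2), (0, 3), (6, 4), (6, 5), (3, 6)]
cyclic = [(1, 2), (2, 1)]
two_heads = [(0, 1), (0, 2), (1, 2)]
```

Because every word has exactly one head, it is enough to follow head links upward from each word: if every such walk reaches ROOT, the path from the root to that word exists and is unique.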

Projectivity
  Projective trees: there are no arc crossings in the dependency graph
  Non-projective trees: crossings due to free word order
  https://web.stanford.edu/~jurafsky/slp3/14.pdf, page 5
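Checking for crossing arcs is straightforward once words are identified by their position in the sentence. In the sketch below (an illustration, with the arc encoding as `(head, dependent)` position pairs and ROOT at position 0 assumed for this example), two arcs cross when exactly one endpoint of the second lies strictly inside the span of the first.

```python
def is_projective(arcs):
    """Return True if no two arcs cross when drawn above the sentence."""
    spans = [tuple(sorted(arc)) for arc in arcs]  # direction does not matter here
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:   # (c, d) starts inside (a, b) and ends outside it
                return False
    return True


# "The cat sat on the mat": 1=The, 2=cat, 3=sat, 4=on, 5=the, 6=mat, 0=ROOT
projective = [(2, 1), (3, 2), (0, 3), (6, 4), (6, 5), (3, 6)]
# A made-up three-word tree in which the arc 3 -> 1 crosses the arc ROOT -> 2
non_projective = [(0, 2), (3, 1), (2, 3)]
```

The quadratic loop is fine for sentence-length inputs; it is meant to show the definition, not to be the fastest possible test.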

Dependency relations
  Figure 14.2: https://web.stanford.edu/~jurafsky/slp3/14.pdf

Universal dependencies
  http://universaldependencies.org/
  Annotated treebanks in many languages
  Uniform annotation scheme across all languages:
    Universal POS tags
    Universal dependency relations

Dependency parsing methods
  Transition-based parsing
    Stack-based algorithms (shift-reduce parsing)
    Generate only projective trees
  Graph-based algorithms
    Can also generate non-projective trees

Transition-based parsing
  Three main components:
    Stack
    Buffer
    Set of dependency relations
  A configuration is the current state of the stack, the buffer and the relation set
  Figure 14.5: https://web.stanford.edu/~jurafsky/slp3/14.pdf

Arc-standard parsing system
  Initial configuration:
    The stack contains the ROOT symbol
    The buffer contains all words in the sentence
    The dependency relation set is empty
  At each step perform one of:
    Shift: move the first word from the buffer to the top of the stack
    LeftArc: add an arc from the top word of the stack to the second word (the top word is the head), then pop the second word
    RightArc: add an arc from the second word of the stack to the top word (the second word is the head), then pop the top word
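The three arc-standard transitions can be simulated with a list for the stack and a list for the buffer. The function below is a minimal sketch: its name and the `(head, label, dependent)` arc format are choices made here, and words are represented by their surface strings, which only works because no word form repeats in the example ("The" and "the" differ in case).

```python
def apply_actions(words, actions):
    """Run a sequence of (action, label) pairs from the initial configuration."""
    stack, buffer, arcs = ["ROOT"], list(words), []
    for action, label in actions:
        if action == "Shift":
            stack.append(buffer.pop(0))
        elif action == "LeftArc":
            if len(stack) < 3:
                raise ValueError("LeftArc needs two words on the stack (ROOT cannot be a dependent)")
            dependent = stack.pop(-2)            # the second word is the dependent
            arcs.append((stack[-1], label, dependent))
        elif action == "RightArc":
            if len(stack) < 2:
                raise ValueError("RightArc needs two items on the stack")
            dependent = stack.pop()              # the top word is the dependent
            arcs.append((stack[-1], label, dependent))
        else:
            raise ValueError(f"unknown action: {action}")
    return stack, buffer, arcs


words = ["The", "cat", "sat", "on", "the", "mat"]
actions = [
    ("Shift", None), ("Shift", None), ("LeftArc", "det"),
    ("Shift", None), ("LeftArc", "nsubj"),
    ("Shift", None), ("Shift", None), ("Shift", None),
    ("LeftArc", "det"), ("LeftArc", "case"),
    ("RightArc", "nmod"), ("RightArc", "root"),
]
stack, buffer, arcs = apply_actions(words, actions)
```

After the twelve actions the buffer is empty, only ROOT remains on the stack, and `arcs` holds the six dependency relations of the sentence.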

Oracle
  The annotated data is in the form of a treebank
    Each sentence is annotated with its dependency tree
  The task of the transition-based parser is to predict the correct parsing operation at each step:
    Input is a configuration
    Output is a parsing action: Shift, RightArc or LeftArc
  The role of the oracle is to return the correct parsing operation for each configuration in the training set

Oracle
  Choose LeftArc if it produces a correct head-dependent relation given the reference parse and the current configuration
  Choose RightArc if:
    It produces a correct head-dependent relation given the reference parse and the current configuration
    All of the dependents of the word at the top of the stack have already been assigned
  Otherwise choose Shift
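The oracle rules translate almost line by line into code. The sketch below is one possible reading of them: the reference parse is given as a list of head positions, with position 0 standing for ROOT (an encoding assumed here), and relation labels are left out for brevity. For a non-projective reference tree the arc-standard oracle gets stuck, which the sketch reports as an error.

```python
def oracle_actions(gold_heads):
    """Derive the arc-standard action sequence for a reference parse.

    gold_heads[i] is the head position of word i (words start at 1, 0 is ROOT);
    gold_heads[0] is unused.
    """
    n = len(gold_heads) - 1
    stack, buffer = [0], list(range(1, n + 1))
    attached = set()   # words that have already received their head
    actions = []

    def dependents_done(node):
        return all(d in attached for d in range(1, n + 1) if gold_heads[d] == node)

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, second = stack[-1], stack[-2]
            if second != 0 and gold_heads[second] == top:
                actions.append("LeftArc")
                attached.add(second)
                stack.pop(-2)
                continue
            if gold_heads[top] == second and dependents_done(top):
                actions.append("RightArc")
                attached.add(top)
                stack.pop()
                continue
        if not buffer:
            raise ValueError("no valid action: the reference tree is not projective")
        actions.append("Shift")
        stack.append(buffer.pop(0))
    return actions


# "The cat sat on the mat": heads of The, cat, sat, on, the, mat
gold_heads = [None, 2, 3, 0, 6, 6, 3]
actions = oracle_actions(gold_heads)
```

The extra RightArc condition matters: when `sat` is on top of `ROOT` for the first time, the arc from ROOT would be correct, but `mat` has not yet been attached to `sat`, so the oracle shifts instead.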

Example: arc-standard parse of "The cat sat on the mat"

Stack                      | Buffer                        | Action    | Arc
[ROOT]                     | [The, cat, sat, on, the, mat] | Shift     |
[ROOT, The]                | [cat, sat, on, the, mat]      | Shift     |
[ROOT, The, cat]           | [sat, on, the, mat]           | Left-Arc  | det(The <-- cat)
[ROOT, cat]                | [sat, on, the, mat]           | Shift     |
[ROOT, cat, sat]           | [on, the, mat]                | Left-Arc  | nsubj(cat <-- sat)
[ROOT, sat]                | [on, the, mat]                | Shift     |
[ROOT, sat, on]            | [the, mat]                    | Shift     |
[ROOT, sat, on, the]       | [mat]                         | Shift     |
[ROOT, sat, on, the, mat]  | []                            | Left-Arc  | det(the <-- mat)
[ROOT, sat, on, mat]       | []                            | Left-Arc  | case(on <-- mat)
[ROOT, sat, mat]           | []                            | Right-Arc | nmod(sat --> mat)
[ROOT, sat]                | []                            | Right-Arc | root(ROOT --> sat)
[ROOT]                     | []                            | Done      |

Typical features
  The first word from the stack
  The second word from the stack
  The POS of the first word in the stack
  The POS of the second word in the stack
  The first word in the buffer
  The POS of the first word in the buffer
  The word and the POS of the top word in the stack
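As an illustration of how such templates are filled in, the function below builds the seven features for one configuration. The feature names (`s1.w`, `b1.t`, and so on), the `<none>` placeholder, and the POS tags in the example are assumptions of this sketch; the configuration is the one before the second-to-last Shift in the "The cat sat on the mat" example.

```python
NONE = "<none>"  # placeholder for a missing stack or buffer position


def extract_features(stack, buffer, pos_tags):
    """Fill in the typical feature templates for one configuration."""
    s1 = stack[-1] if len(stack) >= 1 else NONE   # first (top) word of the stack
    s2 = stack[-2] if len(stack) >= 2 else NONE   # second word of the stack
    b1 = buffer[0] if buffer else NONE            # first word of the buffer

    def tag(word):
        return pos_tags.get(word, NONE)

    return {
        "s1.w": s1,
        "s2.w": s2,
        "s1.t": tag(s1),
        "s2.t": tag(s2),
        "b1.w": b1,
        "b1.t": tag(b1),
        "s1.wt": f"{s1}/{tag(s1)}",
    }


# Hypothetical POS tags for the example sentence
pos_tags = {"ROOT": "ROOT", "The": "DT", "cat": "NN", "sat": "VBD",
            "on": "IN", "the": "DT", "mat": "NN"}
features = extract_features(["ROOT", "sat", "on"], ["the", "mat"], pos_tags)
```

Each training instance then pairs such a feature dictionary with the action the oracle chose in that configuration, here Shift.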

Exercise
  The next action from the current configuration is Shift. Construct the features.
  Templates to fill in with a feature value:
    First word from the stack
    Second word from the stack
    POS of the first stack word
    POS of the second stack word
    First word from the buffer
    POS of the first buffer word
    Word and POS of the top stack word

Standard feature templates
  Figure 14.9: https://web.stanford.edu/~jurafsky/slp3/14.pdf

Evaluation
  Unlabelled attachment score (UAS): the proportion of correct head attachments
  Labelled attachment score (LAS): the proportion of correct head attachments labelled with the correct relation
  Label accuracy (LA): the proportion of correct incoming relation labels, ignoring the head
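The three scores differ only in what is compared for each word, which the following sketch makes explicit. The function name, the `(head, label)` encoding per word, and the erroneous system output are invented for this example and are unrelated to the figure used in the lecture.

```python
from fractions import Fraction


def attachment_scores(gold, predicted):
    """Compute (UAS, LAS, LA) from per-word (head, label) pairs."""
    n = len(gold)
    pairs = list(zip(gold, predicted))
    uas = Fraction(sum(g[0] == p[0] for g, p in pairs), n)  # head only
    las = Fraction(sum(g == p for g, p in pairs), n)        # head and label
    la = Fraction(sum(g[1] == p[1] for g, p in pairs), n)   # label only
    return uas, las, la


# "The cat sat on the mat": (head position, relation) for each word, 0 is ROOT
gold = [(2, "det"), (3, "nsubj"), (0, "root"), (6, "case"), (6, "det"), (3, "nmod")]
# Hypothetical output: "on" gets the wrong head, "mat" gets the wrong label
predicted = [(2, "det"), (3, "nsubj"), (0, "root"), (3, "case"), (6, "det"), (3, "obj")]

uas, las, la = attachment_scores(gold, predicted)
```

In this made-up case UAS is 5/6 (one wrong head), LA is 5/6 (one wrong label), and LAS is 4/6 because the two errors affect different words.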

Evaluation: example
  Figure 14.15: https://web.stanford.edu/~jurafsky/slp3/14.pdf
  UAS = 5/6
  LAS = 4/6
  LA = 4/6

SyntaxNet
  https://github.com/tensorflow/models/blob/master/research/syntaxnet/g3doc/universal.md

Language | Tokens | UAS    | LAS
English  | 25096  | 84.89% | 80.38%
Estonian | 23670  | 83.10% | 78.83%
Finnish  | 9140   | 83.65% | 79.60%
German   | 16268  | 79.73% | 74.07%
Kazakh   | 587    | 58.09% | 43.95%
Chinese  | 12012  | 76.71% | 71.24%
Latvian  | 3985   | 58.92% | 51.47%
Average  |        | 81.12% | 75.85%

Neural dependency parsers
  Kiperwasser and Goldberg, 2016. Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations

Neural dependency parsers
  Dyer et al., 2015. Transition-based Dependency Parsing with Stack Long Short-Term Memory

Parsing resources
  Stanford constituency and dependency parser for English: https://nlp.stanford.edu/software/lex-parser.shtml
  Spacy parser for English and German: https://spacy.io/
  MaltParser for morphologically complex languages: http://www.maltparser.org/

Parsing Estonian
  Estnltk has two parsers:
    A trained MaltParser model
    A rule-based parser based on Constraint Grammar
  Nusaeb Nur Alam, 2017. The Comparative Evaluation of Dependency Parsers in Parsing Estonian

Recap
  Parsing is the task of finding the syntactic structure of sentences
  Shallow parsing: find only non-overlapping syntactic phrases
    A simpler task than full syntactic parsing
    Useful for information extraction tasks, e.g. named entities can only occur in noun phrases
  Constituency parsing: full syntactic analysis that breaks the text into phrases and sub-phrases
  Dependency parsing: a simpler grammar formalism that marks the syntactic dependency relations between words
    More suitable for languages with free word order