The Earley Algorithm. Syntactic analysis (5LN455), 2014-11-26. Sara Stymne, Department of Linguistics and Philology. Based on slides by Marco Kuhlmann.


Recap: Treebank grammars, evaluation

Treebanks Treebanks are corpora in which each sentence has been annotated with a syntactic analysis. Producing a high-quality treebank is both time-consuming and expensive. One of the most widely known treebanks is the Penn Treebank (PTB).

The Penn Treebank
( (S
    (NP-SBJ
      (NP (NNP Pierre) (NNP Vinken) )
      (, ,)
      (ADJP
        (NP (CD 61) (NNS years) )
        (JJ old) )
      (, ,) )
    (VP (MD will)
      (VP (VB join)
        (NP (DT the) (NN board) )
        (PP-CLR (IN as)
          (NP (DT a) (JJ nonexecutive) (NN director) ))
        (NP-TMP (NNP Nov.) (CD 29) )))
    (. .) ))

Treebank grammars Given a treebank, we can construct a grammar by reading rules off the phrase structure trees. A treebank grammar will account for all analyses in the treebank. It will also account for sentences that were not observed in the treebank.
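
As a rough illustration (not from the slides), reading rules off a phrase structure tree can be sketched in Python, assuming trees are encoded as nested tuples with the node label first; the function name extract_rules is my own:

def extract_rules(tree):
    """Yield one (lhs, rhs) rule per node of a nested-tuple tree."""
    label, *children = tree
    # The right-hand side is the sequence of child labels (or words).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from extract_rules(child)

tree = ("S",
        ("NP", ("Pro", "I")),
        ("VP", ("Verb", "prefer"),
               ("NP", ("Det", "a"),
                      ("Nom", ("Nom", ("Noun", "morning")),
                              ("Noun", "flight")))))
print(list(extract_rules(tree)))
# [('S', ('NP', 'VP')), ('NP', ('Pro',)), ('Pro', ('I',)), ('VP', ('Verb', 'NP')), ...]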

Treebank grammars The simplest way to obtain rule probabilities is relative frequency estimation. Step 1: Count the number of occurrences of each rule in the treebank. Step 2: Divide this number by the total number of rule occurrences for the same left-hand side.
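
A minimal sketch of the two steps in Python (the function name estimate_rule_probabilities and the (lhs, rhs) encoding of rules are assumptions of the sketch, not from the slides):

from collections import Counter

def estimate_rule_probabilities(rules):
    """Relative frequency estimation for rules given as (lhs, rhs) pairs."""
    rule_counts = Counter(rules)              # step 1: count each rule
    lhs_totals = Counter()                    # total occurrences per left-hand side
    for (lhs, _), count in rule_counts.items():
        lhs_totals[lhs] += count
    return {(lhs, rhs): count / lhs_totals[lhs]   # step 2: normalise per lhs
            for (lhs, rhs), count in rule_counts.items()}

Applied to the rules read off the single tree in the previous sketch, this assigns probability 0.5 to each of the two NP, Nom and Noun rules and 1.0 to all the others.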

Parse evaluation measures Precision: Out of all brackets found by the parser, how many are also present in the gold standard? Recall: Out of all brackets in the gold standard, how many are also found by the parser? F1-score: the harmonic mean of precision and recall: 2 · precision · recall / (precision + recall)
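
As an illustration (a simplified sketch, not a standard evaluation script, which handles duplicate brackets and punctuation more carefully), these measures can be computed from two collections of labelled brackets, each bracket given as a (label, start, end) triple:

def bracket_prf(predicted, gold):
    """Precision, recall and F1 over labelled brackets, treated as sets."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1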

Parser evaluation measures [bar chart comparing F1-scores on a 0-100 scale for a stupid baseline, your parser, a full treebank grammar, and the state of the art]

Parse trees A parse tree has its root at the top (here S) and its leaves at the bottom (the words of the sentence). Example for I prefer a morning flight: (S (NP (Pro I)) (VP (Verb prefer) (NP (Det a) (Nom (Nom (Noun morning)) (Noun flight)))))

Top down and bottom up Top down: only builds trees that have S at the root node, but may lead to trees that do not yield the sentence. Bottom up: only builds trees that yield the sentence, but may lead to trees that do not have S at the root.

CKY versus Earley The CKY algorithm has two disadvantages: It can only handle restricted grammars. It does not use top down information. The Earley algorithm does not have these disadvantages: It can handle arbitrary grammars. It does use top down information. On the downside, it is more complicated.

The algorithm Start with the start symbol S. Take the leftmost nonterminal and predict all possible expansions. If the next symbol in the expansion is a word, match it against the input sentence (scan); otherwise, repeat. If there is nothing more to expand, the subtree is complete; in this case, continue with the next incomplete subtree.

Example run 0 I 1 prefer 2 a 3 morning 4 flight 5
Start with S [0, 0] and predict the rule S → NP VP.
The leftmost nonterminal is now NP [0, 0]; predict the rule NP → Pro.
The leftmost nonterminal is now Pro [0, 0]; predict the rule Pro → I.
The next symbol is the word I [0, 0]; scan this word against the input.
Update the dot: I [0, 1], and the predicted rule Pro → I is complete, giving Pro [0, 1].
Completion propagates upwards: NP [0, 1] and S [0, 1]; the next incomplete subtree is VP [1, 1].
Prediction, scanning and completion continue in the same way over the rest of the sentence, building Verb [1, 2], Det [2, 3], Noun [3, 4], Noun [4, 5], Nom [3, 4], Nom [3, 5], NP [2, 5] and VP [1, 5].
Update the dot one final time: S [0, 5]. The complete parse tree is
(S (NP (Pro I)) (VP (Verb prefer) (NP (Det a) (Nom (Nom (Noun morning)) (Noun flight)))))
with I [0, 1], prefer [1, 2], a [2, 3], morning [3, 4] and flight [4, 5].

The algorithm Start with the start symbol S. Take the leftmost nonterminal and predict all possible expansions. If the next symbol in the expansion is a word, match it against the input sentence (scan); otherwise, repeat. If there is nothing more to expand, the subtree is complete; in this case, continue with the next incomplete subtree.

Dotted rules A dotted rule is a partially processed rule. Example: S → NP • VP The dot can be placed in front of the first symbol, behind the last symbol, or between two symbols on the right-hand side of a rule. The general form of a dotted rule thus is A → α • β, where A → αβ is the original, non-dotted rule.

Chart entries The chart contains entries of the form [min, max, A → α • β], where min and max are positions in the input and A → α • β is a dotted rule. Such an entry says: We have built a parse tree whose first rule is A → αβ and where the part of this rule that corresponds to α covers the words between min and max.
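
One possible way to represent such entries in code (a sketch; the class name Entry and the helper next_symbol are mine, not from the slides):

from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    """Chart entry [min, max, lhs -> rhs with a dot at index dot]."""
    min: int
    max: int
    lhs: str
    rhs: tuple
    dot: int

    def next_symbol(self):
        """The symbol right after the dot, or None if the rule is complete."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None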

Inference rules
Axiom: [0, 0, S → • α] for every rule S → α.
Predict: from [i, j, A → α • B β], derive [j, j, B → • γ] for every rule B → γ.
Scan: from [i, j, A → α • a β], derive [i, j + 1, A → α a • β], provided the word between positions j and j + 1 is a.
Complete: from [i, j, A → α • B β] and [j, k, B → γ •], derive [i, k, A → α B • β].

Pseudo code 1

Pseudo code 2
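
The pseudo code on these two slides is not reproduced in the transcript. As a stand-in, here is a minimal Python sketch of an Earley recogniser built directly on the axiom, predict, scan and complete rules above; the function name, the dictionary encoding of the grammar, and the restriction to grammars without empty right-hand sides are assumptions of the sketch, not part of the slides.

from collections import defaultdict

def earley_recognise(grammar, start, words):
    """Earley recogniser over the inference rules above.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols); a symbol that is not a key of grammar is treated as a word.
    Assumes no rule has an empty right-hand side.
    Returns True iff the chart contains an entry [0, n, start -> alpha .].
    """
    n = len(words)
    chart = [set() for _ in range(n + 1)]   # chart[j]: items ending at position j
    agenda = defaultdict(list)              # unprocessed items per position

    def add(j, item):
        if item not in chart[j]:
            chart[j].add(item)
            agenda[j].append(item)

    for rhs in grammar[start]:                            # axiom
        add(0, (0, start, rhs, 0))

    for j in range(n + 1):
        while agenda[j]:
            i, lhs, rhs, dot = agenda[j].pop()            # item [i, j, lhs -> rhs, dot]
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:                        # predict
                    for gamma in grammar[sym]:
                        add(j, (j, sym, gamma, 0))
                elif j < n and words[j] == sym:           # scan
                    add(j + 1, (i, lhs, rhs, dot + 1))
            else:                                         # complete
                for i2, lhs2, rhs2, dot2 in list(chart[i]):
                    if dot2 < len(rhs2) and rhs2[dot2] == lhs:
                        add(j, (i2, lhs2, rhs2, dot2 + 1))

    return any(i == 0 and lhs == start and dot == len(rhs)
               for i, lhs, rhs, dot in chart[n])

Run on a toy grammar reconstructed from the example run, it accepts the sentence:

grammar = {
    "S":    [("NP", "VP")],
    "NP":   [("Pro",), ("Det", "Nom")],
    "VP":   [("Verb", "NP")],
    "Nom":  [("Nom", "Noun"), ("Noun",)],
    "Pro":  [("I",)],
    "Verb": [("prefer",)],
    "Det":  [("a",)],
    "Noun": [("morning",), ("flight",)],
}
print(earley_recognise(grammar, "S", "I prefer a morning flight".split()))  # True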

Recogniser/parser Recognizer: when parsing is complete, is there a chart entry [0, n, S → α •]? If we want a parser, we have to add back pointers and retrieve a tree. Earley's algorithm can be used for PCFGs, but it is more complicated than for CKY.

Summary The Earley algorithm is a parsing algorithm for arbitrary context-free grammars. In contrast to the CKY algorithm, it also uses top down information. Also in contrast to the CKY algorithm, its probabilistic extension is not straightforward. Reading: J&M 13.4.2

Course overview Seminar next Wednesday: Group A 10.15-11.15 (first names A-Nic), Group B 11.30-12.30 (first names Nil-V). Own work: read the seminar article and prepare; work on assignments 1 and 2. Contact me if you need help!