Natural Language Processing. Syntax

What is syntax? Syntax addresses the question of how sentences are constructed in particular languages. A grammar is a set of rules that govern the composition of sentences. Parsing refers to the process of analyzing an utterance in terms of its syntactic structure.

Why should you care? Syntactic information is important for many tasks: question answering (What books did he like?), grammar checking (He is friend of mine.), information extraction (Oracle acquired Sun.).

Theoretical frameworks Phrase structure grammar: Noam Chomsky (1928–), immediate constituent analysis. Dependency grammar: Lucien Tesnière (1893–1954), functional dependency relations.

Constituency A basic observation about syntactic structure is that groups of words can act as single units: Los Angeles, a high-class spot such as Mindy's, three parties from Brooklyn, they. Such groups of words are called constituents. Constituents tend to have similar internal structure, and behave similarly with respect to other units.

Constituency Examples of constituents: noun phrases (NP): she, the house, Robin Hood and his merry men, a high-class spot such as Mindy's; verb phrases (VP): blushed, loves Mary, was told to sit down and be quiet, lived happily ever after; prepositional phrases (PP): on it, with the telescope, through the foggy dew, apart from everything I have said so far.

Context-free grammar A simple yet powerful formalism to describe the syntactic structure of natural languages, developed in the mid-1950s by Noam Chomsky. It allows one to specify rules that state how a constituent can be segmented into smaller and smaller constituents, down to the level of individual words.

Context-free grammar A context-free grammar (CFG) consists of: a finite set of nonterminal symbols; a finite set of terminal symbols; a distinguished nonterminal symbol S (the start symbol); and a finite set of rules of the form A → α, where A is a nonterminal and α is a possibly empty sequence of nonterminal and terminal symbols.

Context-free grammar A sample context-free grammar (each rule with an example expansion):
S → NP VP (I + want a morning flight)
NP → Pronoun (I)
NP → Proper-Noun (Los Angeles)
NP → Det Nominal (a flight)
Nominal → Nominal Noun (morning flight)
Nominal → Noun (flights)
VP → Verb (do)
VP → Verb NP (want + a flight)
VP → Verb NP PP (leave + Boston + in the morning)
VP → Verb PP (leaving + on Thursday)
PP → Preposition NP (from + Los Angeles)

Derivations A derivation is a sequence of rule applications that derives a terminal string w = w1 … wn from S. For example: S ⇒ NP VP ⇒ Pro VP ⇒ I VP ⇒ I Verb NP ⇒ I prefer NP ⇒ I prefer Det Nom ⇒ I prefer a Nom ⇒ I prefer a Nom Noun ⇒ I prefer a Noun Noun ⇒ I prefer a morning Noun ⇒ I prefer a morning flight
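The leftmost derivation above can be replayed mechanically. The following is a minimal sketch: the phrasal rules of the sample grammar are stored as a dict for reference, and STEPS lists the rule applications (including lexical insertions such as Pro → I) in leftmost order; the helper names are illustrative, not from the slides.

```python
# Phrasal rules of the sample grammar, kept as data for reference.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pro"], ["Proper-Noun"], ["Det", "Nom"]],
    "Nom": [["Nom", "Noun"], ["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"], ["Verb", "NP", "PP"], ["Verb", "PP"]],
    "PP": [["Preposition", "NP"]],
}

# Each step: (nonterminal to expand, replacement), applied leftmost-first,
# in the same order as the derivation on the slide.
STEPS = [
    ("S", ["NP", "VP"]),
    ("NP", ["Pro"]),
    ("Pro", ["I"]),
    ("VP", ["Verb", "NP"]),
    ("Verb", ["prefer"]),
    ("NP", ["Det", "Nom"]),
    ("Det", ["a"]),
    ("Nom", ["Nom", "Noun"]),
    ("Nom", ["Noun"]),
    ("Noun", ["morning"]),   # rewrites the leftmost Noun
    ("Noun", ["flight"]),
]

def derive(steps):
    """Apply each rule to the leftmost occurrence of its left-hand side."""
    form = ["S"]
    history = [" ".join(form)]
    for lhs, rhs in steps:
        i = form.index(lhs)       # leftmost occurrence of lhs
        form[i:i + 1] = rhs       # replace it with the right-hand side
        history.append(" ".join(form))
    return history

for line in derive(STEPS):
    print(line)
```

Running this prints each sentential form in turn, ending with the terminal string I prefer a morning flight.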

Context-free grammar A sample phrase structure tree, in bracket notation: [S [NP [Pro I]] [VP [Verb prefer] [NP [Det a] [Nom [Nom [Noun morning]] [Noun flight]]]]]

Context-free grammar In a phrase structure tree, the root (S) is at the top and the leaves (the words of the sentence) are at the bottom: [S [NP [Pro I]] [VP [Verb prefer] [NP [Det a] [Nom [Nom [Noun morning]] [Noun flight]]]]]

Treebanks Treebanks are corpora where each sentence is annotated with a parse tree. Treebanks are generally created by parsing texts with an existing parser and having human annotators correct the result. This requires detailed annotation guidelines for annotating different grammatical constructions.

The Penn Treebank The Penn Treebank is a popular treebank for English. Its Wall Street Journal section contains 1 million words from the WSJ, 1987–1989. A sample annotation:
( (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken) )
             (, ,)
             (ADJP (NP (CD 61) (NNS years) ) (JJ old) )
             (, ,) )
     (VP (MD will)
         (VP (VB join)
             (NP (DT the) (NN board) )
             (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director) ))
             (NP-TMP (NNP Nov.) (CD 29) )))
     (. .) ))

Treebank grammars A treebank implicitly defines a grammar for the language covered in the treebank: simply take the set of rules needed to generate all the trees in the treebank. Coverage of the language depends on the size of the treebank (but is never complete).
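Reading a grammar off a treebank can be sketched in a few lines: parse a bracketed tree and collect one rule per internal node. The tree string below is a shortened, hypothetical fragment in Penn-Treebank style, and `parse_tree`/`extract_rules` are illustrative names, not part of any treebank toolkit.

```python
import re

def parse_tree(s):
    """Parse '(NP (DT the) (NN board))' into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0
    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]; pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())     # subtree
            else:
                children.append(tokens[pos]); pos += 1   # word
        pos += 1  # consume ")"
        return (label, children)
    return node()

def extract_rules(tree, rules=None):
    """Collect 'LHS -> RHS' strings for every internal node of the tree."""
    if rules is None:
        rules = set()
    label, children = tree
    rhs = [c[0] if isinstance(c, tuple) else c for c in children]
    rules.add(f"{label} -> {' '.join(rhs)}")
    for c in children:
        if isinstance(c, tuple):
            extract_rules(c, rules)
    return rules

tree = parse_tree("(S (NP (DT the) (NN board)) (VP (VBD met)))")
for rule in sorted(extract_rules(tree)):
    print(rule)
```

Note that this also yields lexical rules such as DT -> the; a treebank grammar includes those alongside the phrasal rules.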

Treebank grammars Treebank grammars tend to be very flat, because they avoid recursive rules (and hard distinctions). The Penn Treebank has 4500 different rules for verb phrases, for example: VP → VBD PP, VP → VBD PP PP, VP → VBD PP PP PP, VP → VBD PP PP PP PP, …

Natural Language Processing Parsing

Parsing Parsing is the automatic analysis of a sentence with respect to its syntactic structure. Given a CFG, this means deriving a phrase structure tree assigned to the sentence by the grammar. With ambiguous grammars, each sentence may have many valid parse trees. Should we retrieve all of them or just one? If the latter, how do we know which one?

Ambiguity I booked a flight from LA. This sentence is ambiguous. In what way? What should happen if we parse the sentence?

Ambiguity Reading 1: the PP from LA attaches inside the noun phrase (a flight from LA): [S [NP [Pro I]] [VP [Verb booked] [NP [Det a] [Nom [Nom [Noun flight]] [PP from LA]]]]]

Ambiguity Reading 2: the PP from LA attaches to the verb phrase (booked … from LA): [S [NP [Pro I]] [VP [Verb booked] [NP [Det a] [Nom [Noun flight]]] [PP from LA]]]

Ambiguity Combinatorial explosion [Figure: number of parse trees as a function of sentence length, compared with linear, cubic, and exponential growth curves]

Phrase structure trees Recall the phrase structure tree for I prefer a morning flight, with the root (S) at the top and the leaves (the words) at the bottom: [S [NP [Pro I]] [VP [Verb prefer] [NP [Det a] [Nom [Nom [Noun morning]] [Noun flight]]]]]

Basic concepts of parsing Two problems for a grammar G and a string w: Recognition: determine whether G accepts w. Parsing: retrieve (all or some of) the parse trees assigned to w by G. Two basic search strategies: Top-down: start at the root of the tree. Bottom-up: start at the leaves.

Top-down parsing Basic idea: start at the root node and expand the tree by matching the left-hand side of rules, deriving a tree whose leaves match the input. Potential problems: uses rules that could never match the input; may loop on recursive rules such as VP → VP PP.
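The top-down idea can be sketched as a tiny recursive-descent recognizer: expand from S, trying each rule in turn, and match leaves against the input. The grammar below is a small non-left-recursive fragment (an assumption: plain recursive descent would loop on a left-recursive rule such as VP → VP PP, exactly the problem noted above); the function names are illustrative.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pro"], ["Det", "Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
    "Pro": [["I"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["booked"]],
}

def match(symbol, words, pos):
    """Yield every input position reachable after deriving `symbol` from pos."""
    if symbol not in GRAMMAR:            # terminal: must match the next word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:          # nonterminal: try each rule in turn
        positions = [pos]
        for sym in rhs:                  # thread positions through the RHS
            positions = [q for p in positions for q in match(sym, words, p)]
        yield from positions

def recognize(words):
    """The sentence is accepted if S can derive exactly the whole input."""
    return len(words) in match("S", words, 0)

print(recognize("I booked a flight".split()))   # True
```

The generator explores all rule choices, so it also illustrates the cost of naive top-down search: failed expansions (such as trying NP → Det Noun at position 0) are attempted even though they can never match the input.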

Bottom-up parsing Basic idea: start with the leaves and build the tree by matching the right-hand side of rules, aiming for a tree with S at the root. Potential problems: builds structures that could never be in a tree; may loop on epsilon productions such as NP → ε.

Dealing with ambiguity The number of possible parse trees grows exponentially with sentence length, so a naive backtracking approach is too inefficient. Key observation: alternative parse trees share substructures. We can use dynamic programming (again).
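One classic way to exploit shared substructures is the CKY algorithm, a minimal sketch of which follows: the chart cell (i, j) stores every nonterminal that can derive words[i:j], so each substructure is computed once no matter how many trees it appears in. CKY requires the grammar in Chomsky normal form, so the toy rules below are a simplified, hand-converted variant of the slides' grammar, not the original one.

```python
CNF_RULES = {               # binary rules: (B, C) -> set of A with A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "Noun"): {"NP"},
    ("Verb", "NP"): {"VP"},
}
LEXICON = {                 # lexical rules: word -> set of A with A -> word
    "I": {"NP"},
    "booked": {"Verb"},
    "a": {"Det"},
    "flight": {"Noun"},
}

def cky_recognize(words):
    n = len(words)
    chart = {}
    # Width-1 spans: look the words up in the lexicon.
    for i, w in enumerate(words):
        chart[(i, i + 1)] = set(LEXICON.get(w, ()))
    # Wider spans: combine two adjacent smaller spans at every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):
                for b in chart[(i, k)]:
                    for c in chart[(k, j)]:
                        cell |= CNF_RULES.get((b, c), set())
            chart[(i, j)] = cell
    return "S" in chart[(0, n)]

print(cky_recognize("I booked a flight".split()))   # True
```

The three nested loops over span, start, and split point give the O(n³) runtime that makes chart parsing practical where backtracking is not.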

Probabilistic context-free grammar The number of possible parse trees grows rapidly with the length of the input, but not all parse trees are equally useful. Example: I booked a flight from Los Angeles. In many applications, we want the best parse tree, or the first few best trees. Special case: best = most probable.

Probabilistic context-free grammar A probabilistic context-free grammar (PCFG) is a context-free grammar where each rule r has been assigned a probability p(r) between 0 and 1, and the probabilities of rules with the same left-hand side sum to 1.

Probabilistic context-free grammar A sample PCFG (rule, probability):
S → NP VP, 1
NP → Pronoun, 1/3
NP → Proper-Noun, 1/3
NP → Det Nominal, 1/3
Nominal → Nominal PP, 1/3
Nominal → Noun, 2/3
VP → Verb NP, 8/9
VP → Verb NP PP, 1/9
PP → Preposition NP, 1
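As a quick sanity check, the rule table can be re-entered as Python data and the per-left-hand-side totals verified; a minimal sketch using exact fractions, so no floating-point rounding can hide an error (the variable names are illustrative):

```python
from fractions import Fraction as F
from collections import defaultdict

# The sample PCFG as (lhs, rhs, probability) triples.
PCFG = [
    ("S",       ["NP", "VP"],          F(1)),
    ("NP",      ["Pronoun"],           F(1, 3)),
    ("NP",      ["Proper-Noun"],       F(1, 3)),
    ("NP",      ["Det", "Nominal"],    F(1, 3)),
    ("Nominal", ["Nominal", "PP"],     F(1, 3)),
    ("Nominal", ["Noun"],              F(2, 3)),
    ("VP",      ["Verb", "NP"],        F(8, 9)),
    ("VP",      ["Verb", "NP", "PP"],  F(1, 9)),
    ("PP",      ["Preposition", "NP"], F(1)),
]

# Sum the probabilities of all rules sharing a left-hand side.
totals = defaultdict(F)
for lhs, rhs, p in PCFG:
    totals[lhs] += p

for lhs, total in sorted(totals.items()):
    print(lhs, total, total == 1)
```

Every left-hand side sums to exactly 1, so the grammar defines a proper probability distribution over expansions of each nonterminal.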

Probabilistic context-free grammar The probability of a parse tree is defined as the product of the probabilities of the rules that have been used to build the parse tree.

Probabilistic context-free grammar The probability of the first parse tree (PP attached inside the noun phrase): [S [NP [Pro I]] [VP [Verb booked] [NP [Det a] [Nom [Nom [Noun flight]] [PP from LA]]]]] The rules used have probabilities 1 (S → NP VP), 1/3 (NP → Pronoun), 8/9 (VP → Verb NP), 1/3 (NP → Det Nominal), 1/3 (Nominal → Nominal PP), 2/3 (Nominal → Noun). Probability: 1 · 1/3 · 8/9 · 1/3 · 1/3 · 2/3 = 16/729

Probabilistic context-free grammar The probability of the second parse tree (PP attached to the verb phrase): [S [NP [Pro I]] [VP [Verb booked] [NP [Det a] [Nom [Noun flight]]] [PP from LA]]] The rules used have probabilities 1 (S → NP VP), 1/3 (NP → Pronoun), 1/9 (VP → Verb NP PP), 1/3 (NP → Det Nominal), 2/3 (Nominal → Noun). Probability: 1 · 1/3 · 1/9 · 1/3 · 2/3 = 6/729
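The two tree probabilities can be reproduced exactly with Python's fractions; a sketch, where the factor lists below copy the rule probabilities from the slides (which leave out the probability-1 rules inside the PP, since they do not change the product):

```python
from fractions import Fraction as F
from functools import reduce
from operator import mul

# Rule probabilities used by each tree, in slide order.
tree1_rules = [F(1), F(1, 3), F(8, 9), F(1, 3), F(1, 3), F(2, 3)]  # NP attachment
tree2_rules = [F(1), F(1, 3), F(1, 9), F(1, 3), F(2, 3)]           # VP attachment

# A tree's probability is the product of its rule probabilities.
p1 = reduce(mul, tree1_rules)
p2 = reduce(mul, tree2_rules)
print(p1, p2)   # 16/729 and 2/243 (= 6/729)
```

Under this grammar the noun-phrase attachment is the more probable reading, so a parser returning the single best tree would prefer it.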

Independence assumption How can we make sense of this in terms of probability theory? The probability of a rule expansion depends only on the left-hand-side symbol. Is this a reasonable independence assumption?