Context Free Grammar


Context Free Grammar. CS 585, Fall 2017: Introduction to Natural Language Processing. http://people.cs.umass.edu/~brenocon/inlp2017. Brendan O'Connor, College of Information and Computer Sciences, University of Massachusetts Amherst.

Syntax: how do words structurally combine to form sentences and meaning?

Representations:
Constituents: [the big dogs] chase cats; [colorless green clouds] chase cats
Dependencies: The dog chased the cat. My dog, a big old one, chased the cat.

Idea of a grammar (G): a global template for how sentences / utterances / phrases w are formed, via latent syntactic structure y.
Linguistics: what do G and P(w, y | G) look like?
Generation: score with, or sample from, P(w, y | G)
Parsing: predict P(y | w, G)

Is language context-free?

Regular language: repetition of repeated structures, e.g. Justeson and Katz (1995)'s noun phrase pattern: (Noun|Adj)* Noun (Prep Det? (Noun|Adj)* Noun)*

Context-free: hierarchical recursion. Center-embedding is the classic theoretical argument for CFGs over regular languages:

(10.1) The cat is fat.
(10.2) The cat that the dog chased is fat.
(10.3) *The cat that the dog is fat.
(10.4) The cat that the dog that the monkey kissed chased is fat.
(10.5) *The cat that the dog that the monkey chased is fat.

Competence vs. performance?

[Examples from Eisenstein (2017)]
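As a concrete illustration (not from the slides), the Justeson and Katz pattern can be run as an ordinary regular expression over a sequence of coarse POS tags. The single-letter tag alphabet and helper below are invented for this sketch:

```python
import re

# A sketch of the Justeson & Katz (1995) noun-phrase pattern as a
# regular expression over a POS-tag sequence. Tags (A=Adj, N=Noun,
# P=Prep, D=Det) are joined into a string so regex repetition applies.
# Pattern: (Noun|Adj)* Noun (Prep Det? (Noun|Adj)* Noun)*
PATTERN = re.compile(r"((A|N)\s)*N(\sP(\sD)?(\s(A|N))*\sN)*")

def match_np(tags):
    """tags: list of coarse POS tags, e.g. ['A', 'N', 'N']."""
    return PATTERN.fullmatch(" ".join(tags)) is not None

print(match_np(["A", "N", "N"]))      # "yummy foods store"    -> True
print(match_np(["N", "P", "D", "N"])) # "rate of the dollar"   -> True
print(match_np(["P", "N"]))           # no head noun first     -> False
```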

Hierarchical view of syntax: a Sentence is made of a Noun Phrase followed by a Verb Phrase (S → NP VP).

[Figure from Phillips (2003): example NPs ("John", "the man", "the elderly janitor") combine with example VPs ("arrived", "ate an apple", "looked at his watch") under S → NP VP.]

Is language context-free? Practical examples where nesting seems like a useful explanation:

"The processor has 10 million times fewer transistors on it than today's typical microprocessors, runs much more slowly, and operates at five times the voltage..."

S → NN VP
VP → VP3S | VPN3S | ...
VP3S → VP3S, VP3S, and VP3S | VBZ | VBZ ...

[Examples from Eisenstein (2017)]

Regular language <=> regex <=> paths in a finite-state machine
Context-free language <=> CFG <=> derivations in a pushdown automaton

A context-free grammar is a 4-tuple:
N: a set of non-terminals
Σ: a set of terminals (distinct from N)
R: a set of productions, each of the form A → β, where A ∈ N and β ∈ (Σ ∪ N)*
S: a designated start symbol

Derivation: a sequence of rewrite steps from S to a string (a sequence of terminals, i.e. words).
Yield: the final string.

A CFG is a boolean language model.
A probabilistic CFG is a probabilistic language model: every production rule has a probability, which defines a probability distribution over strings.
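To make the last point concrete, here is a minimal sketch of a PCFG as a probabilistic language model, sampling a string by recursively rewriting non-terminals. The grammar and its probabilities are invented for illustration, not taken from the slides:

```python
import random

# A toy PCFG: each non-terminal maps to a list of (rhs, probability)
# pairs; the probabilities for each non-terminal sum to 1.
PCFG = {
    "S":   [(["NP", "VP"], 1.0)],
    "NP":  [(["she"], 0.5), (["sushi"], 0.3), (["chopsticks"], 0.2)],
    "VP":  [(["VBZ", "NP"], 0.7), (["VBZ"], 0.3)],
    "VBZ": [(["eats"], 1.0)],
}

def sample(symbol="S"):
    if symbol not in PCFG:            # terminal: emit the word itself
        return [symbol]
    rules, probs = zip(*PCFG[symbol]) # pick a production by its probability
    rhs = random.choices(rules, weights=probs)[0]
    out = []
    for sym in rhs:
        out.extend(sample(sym))
    return out

print(" ".join(sample()))  # e.g. "she eats sushi"
```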

Example: "She eats sushi with chopsticks" has two parses, with the PP attached to the VP or to the NP:

(S (NP (PRP She)) (VP (VBZ eats) (NP (NN sushi)) (PP (IN with) (NP (NNS chopsticks)))))

(S (NP (PRP She)) (VP (VBZ eats) (NP (NP (NN sushi)) (PP (IN with) (NP (NNS chopsticks))))))

All useful grammars are ambiguous: multiple derivations with the same yield. [Parse tree representations: nested parentheses or non-terminal spans]

[Examples from Eisenstein (2017)]
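The ambiguity can be checked mechanically. Here is a sketch using NLTK (assuming the nltk package is installed); the toy grammar below is our own, written to yield exactly these two parses:

```python
import nltk

# Tiny hand-written grammar reproducing the PP-attachment ambiguity.
grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    VP  -> VBZ NP | VBZ NP PP
    NP  -> PRP | NN | NNS | NP PP
    PP  -> IN NP
    PRP -> 'She'
    VBZ -> 'eats'
    NN  -> 'sushi'
    IN  -> 'with'
    NNS -> 'chopsticks'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("She eats sushi with chopsticks".split()):
    print(tree)  # prints both the VP-attachment and NP-attachment parses
```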

Constituents

A constituent tree/parse is one representation of a sentence's syntax. What should be considered a constituent, or constituents of the same category?
Substitution tests (e.g. pronoun substitution)
Coordination tests

A simple grammar of English must balance overgeneration versus undergeneration:
Noun phrases: modification by adjectives, PPs
Verb phrases
Coordination...


Parsing with a CFG

Task: given text and a CFG, answer:
Does there exist at least one parse?
Enumerate the parses (via backpointers)

Cocke-Kasami-Younger (CKY) algorithm
Bottom-up dynamic programming: find the possible nonterminals for short spans of the sentence, then possible combinations for longer spans
Requires converting the CFG to Chomsky Normal Form (a.k.a. binarization), as sketched below
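Since the slides mention CNF conversion only in passing, here is a minimal sketch of the binarization step: any rule with more than two right-hand-side symbols is split using fresh intermediate symbols. The function name and the fresh-symbol naming scheme are our own, not the slides':

```python
# Binarization for CNF: A -> B C D becomes A -> B X and X -> C D,
# where X is a fresh intermediate non-terminal.
def binarize(rules):
    out, fresh = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new_sym = f"{lhs}_BIN{fresh}"  # hypothetical naming scheme
            fresh += 1
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]
        out.append((lhs, tuple(rhs)))
    return out

# VP -> VBZ NP PP becomes VP -> VBZ VP_BIN0 and VP_BIN0 -> NP PP
print(binarize([("VP", ("VBZ", "NP", "PP"))]))
```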

CKY

Grammar:
Adj → yummy
NP → foods
NP → store
NP → NP NP
NP → Adj NP

Chart for "yummy foods store" (word positions 0 1 2 3):
[0:1] Adj   [1:2] NP   [2:3] NP
[0:2] NP    [1:3] NP
[0:3] NP

For each cell [i,j] (looping through them bottom-up):
  For each possible splitpoint k = (i+1)..(j-1):
    For every B in [i,k] and C in [k,j]:
      If there exists a rule A → B C:
        add A to cell [i,j] (recognizer)
        ... or ... add (A, B, C, k) to cell [i,j] (parser)

Recognizer: per span, record the list of possible nonterminals.
Parser: per span, record the possible ways each nonterminal was constructed.
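Putting the pseudocode together, here is a sketch of the CKY recognizer on the toy grammar above. The data-structure choices and names are ours, not the slides':

```python
from collections import defaultdict

# Lexical (unary) and binary rules of the toy grammar.
UNARY = {"yummy": "Adj", "foods": "NP", "store": "NP"}
BINARY = {("NP", "NP"): "NP", ("Adj", "NP"): "NP"}

def cky_recognize(words, start="NP"):
    n = len(words)
    chart = defaultdict(set)              # (i, j) -> set of nonterminals
    for i, w in enumerate(words):         # width-1 spans: lexical rules
        if w in UNARY:
            chart[(i, i + 1)].add(UNARY[w])
    for width in range(2, n + 1):         # wider spans, bottom-up
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):     # all splitpoints
                for B in chart[(i, k)]:
                    for C in chart[(k, j)]:
                        if (B, C) in BINARY:
                            chart[(i, j)].add(BINARY[(B, C)])
    return start in chart[(0, n)]

print(cky_recognize("yummy foods store".split()))  # True
```

A parser variant would store backpointers (A, B, C, k) per cell instead of bare nonterminals, exactly as the pseudocode's second branch says, so each parse tree can be reconstructed from cell [0:n].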