GRAMMAR: context-free grammars and parsing. David Kauchak, CS159 Spring 2019 (2/20/19).


Admin
Assignment 3 out: due next Monday. Quiz #1.

GRAMMAR
David Kauchak, CS159 Spring 2019. Some slides adapted from Ray Mooney.

Context free grammar
Formally, a context free grammar is G = (NT, T, P, S):
NT: finite set of nonterminal symbols
T: finite set of terminal symbols; NT and T are disjoint
P: finite set of productions of the form A -> α, with A ∈ NT (the left hand side, a single symbol) and α ∈ (T ∪ NT)* (the right hand side, one or more symbols)
S ∈ NT: the start symbol
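
A minimal Python sketch (not from the slides) of the 4-tuple definition above; the CFG class and the tiny a^n b^n grammar used to exercise it are illustrative assumptions, not part of the lecture.

from dataclasses import dataclass

@dataclass
class CFG:
    nonterminals: set   # NT: finite set of nonterminal symbols
    terminals: set      # T: finite set of terminals, disjoint from NT
    productions: list   # P: pairs (A, alpha) with A in NT and alpha in (T | NT)*
    start: str          # S in NT: start symbol

# Tiny illustrative grammar: S -> a S b | a b (generates a^n b^n, n >= 1).
G = CFG(
    nonterminals={"S"},
    terminals={"a", "b"},
    productions=[("S", ("a", "S", "b")), ("S", ("a", "b"))],
    start="S",
)

# Sanity checks that mirror the definition: NT and T are disjoint,
# every left hand side is a single nonterminal, and S is a nonterminal.
assert G.nonterminals.isdisjoint(G.terminals)
assert all(lhs in G.nonterminals for lhs, _ in G.productions)
assert G.start in G.nonterminals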

CFG: Example
Many possible CFGs for English; here is an example (fragment). The fragment itself was shown as a figure; the derivation on the following pages uses rules such as S -> NP VP, NP -> DetP N, VP -> V NP, DetP -> the | a, N -> boy | girl, V -> likes.
What can we do?

Derivations of CFGs
A CFG is a string rewriting system: we derive a string by repeatedly replacing a nonterminal with the right hand side of one of its productions. The derivation history shows the constituent tree.

Example, deriving "the boy likes a girl" (shown stepwise as a growing tree in the slides):
S => NP VP => DetP N VP => the boy VP => the boy V NP => the boy likes NP => the boy likes DetP N => the boy likes a girl

Order of derivation is irrelevant.
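
A small Python sketch (not from the slides) of the rewriting idea: starting from S, repeatedly replace the leftmost nonterminal using a chosen production. The RULES table, the derive helper, and the hard-coded rule choices are illustrative assumptions; the grammar fragment simply mirrors the derivation above.

# Grammar fragment implied by the derivation of "the boy likes a girl".
RULES = {
    "S":    [["NP", "VP"]],
    "NP":   [["DetP", "N"]],
    "VP":   [["V", "NP"]],
    "DetP": [["the"], ["a"]],
    "N":    [["boy"], ["girl"]],
    "V":    [["likes"]],
}
NONTERMINALS = set(RULES)

def derive(choices):
    """Leftmost derivation: each entry of `choices` picks which production
    to apply to the leftmost nonterminal at that step."""
    sentential = ["S"]
    steps = [" ".join(sentential)]
    for choice in choices:
        i = next(k for k, sym in enumerate(sentential) if sym in NONTERMINALS)
        sentential[i:i + 1] = RULES[sentential[i]][choice]
        steps.append(" ".join(sentential))
    return steps

# These choices reproduce: S => NP VP => DetP N VP => ... => the boy likes a girl
for step in derive([0, 0, 0, 0, 0, 0, 0, 1, 1]):
    print(step)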

Parsing
Parsing is the field of NLP interested in automatically determining the syntactic structure of a sentence. Parsing can be thought of as determining what sentences are valid English sentences; as a byproduct, we often can get the structure.

Parsing
Given a CFG and a sentence, determine the possible parse tree(s).
S -> NP VP
NP -> PRP
NP -> N PP
NP -> N
VP -> V NP
VP -> V NP PP
PP -> IN N
PRP -> I
V -> eat
N -> sushi
N -> tuna
IN -> with
What parse trees are possible for this sentence ("I eat sushi with tuna")? How did you do it? What if the grammar is much larger?

Parsing ambiguity
With the same grammar, the sentence has two parse trees over the leaves PRP V N IN N: one where the PP "with tuna" attaches to the noun "sushi" (via NP -> N PP) and one where it attaches to the verb phrase (via VP -> V NP PP). What is the difference between these parses? How can we decide between these?
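
A short sketch (not from the slides) that checks the ambiguity mechanically using NLTK's chart parser, assuming NLTK is installed; the grammar string is just the grammar above in NLTK's notation.

import nltk

# The PP-attachment grammar above, in NLTK's CFG notation (terminals quoted).
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> PRP | N PP | N
VP -> V NP | V NP PP
PP -> IN N
PRP -> 'I'
V -> 'eat'
N -> 'sushi' | 'tuna'
IN -> 'with'
""")

parser = nltk.ChartParser(grammar)

# Prints both trees: the PP inside the object NP ("sushi with tuna")
# and the PP attached at the VP level ("eat ... with tuna").
for tree in parser.parse("I eat sushi with tuna".split()):
    print(tree)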

A Simple PCFG
Probabilities!
S -> NP VP          1.0
VP -> V NP          0.7
VP -> VP PP         0.3
PP -> P NP          1.0
P -> with           1.0
V -> saw            1.0
NP -> NP PP         0.4
NP -> astronomers   0.1
NP -> ears          0.18
NP -> saw           0.04
NP -> stars         0.18
NP -> telescope     0.1

Just like n-gram language modeling, PCFGs break the sentence generation process into smaller steps/probabilities. The probability of a parse is the product of the probabilities of the PCFG rules it uses.

What are the different interpretations here (for "astronomers saw stars with ears")? Which do you think is more likely?
P(parse with the PP attached to the NP "stars") = 1.0 * 0.1 * 0.7 * 1.0 * 0.4 * 0.18 * 1.0 * 1.0 * 0.18 = 0.0009072
P(parse with the PP attached to the VP) = 1.0 * 0.1 * 0.3 * 0.7 * 1.0 * 0.18 * 1.0 * 1.0 * 0.18 = 0.0006804
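
A minimal sketch (not from the slides) of "the probability of a parse is the product of the PCFG rules": the RULE_PROB table encodes the grammar above, and the two hand-listed rule lists are the two attachments; names like parse_prob are illustrative assumptions.

# PCFG rules as (lhs, rhs) -> probability.
RULE_PROB = {
    ("S",  ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("PP", ("P", "NP")): 1.0,
    ("P",  ("with",)): 1.0,
    ("V",  ("saw",)): 1.0,
    ("NP", ("NP", "PP")): 0.4,
    ("NP", ("astronomers",)): 0.1,
    ("NP", ("ears",)): 0.18,
    ("NP", ("saw",)): 0.04,
    ("NP", ("stars",)): 0.18,
    ("NP", ("telescope",)): 0.1,
}

def parse_prob(rules_used):
    """Probability of a parse = product of the probabilities of its rules."""
    p = 1.0
    for rule in rules_used:
        p *= RULE_PROB[rule]
    return p

# Rules used when the PP attaches to the NP "stars":
t1 = [("S", ("NP", "VP")), ("NP", ("astronomers",)), ("VP", ("V", "NP")),
      ("V", ("saw",)), ("NP", ("NP", "PP")), ("NP", ("stars",)),
      ("PP", ("P", "NP")), ("P", ("with",)), ("NP", ("ears",))]
# Rules used when the PP attaches to the VP:
t2 = [("S", ("NP", "VP")), ("NP", ("astronomers",)), ("VP", ("VP", "PP")),
      ("VP", ("V", "NP")), ("V", ("saw",)), ("NP", ("stars",)),
      ("PP", ("P", "NP")), ("P", ("with",)), ("NP", ("ears",))]

print(parse_prob(t1))   # ~0.0009072
print(parse_prob(t2))   # ~0.0006804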

Parsing problems
Pick a model: e.g. CFG, PCFG, ...
Train (or learn) a model: What CFG/PCFG rules should I use? Parameters (e.g. PCFG probabilities)? What kind of data do we have?
Parsing: Determine the parse tree(s) given a sentence.

PCFG: Training
If we have example parsed sentences, how can we learn a set of PCFG rules? Start from a treebank of parsed sentences, e.g. trees for "John put the dog in the pen".

Supervised PCFG Training
Treebank -> training -> an English grammar with rule probabilities, e.g.:
S -> NP VP      0.9
S -> VP         0.1
NP -> Det A N   0.5
NP -> NP PP     0.3
NP -> PropN     0.2
A -> ε          0.6
A -> Adj A      0.4
PP -> Prep NP   1.0
VP -> V NP      0.7
VP -> VP PP     0.3

Extracting the rules
What CFG rules occur in this tree (the parse of "I eat sushi with tuna" with the PP attached to "sushi")? We can extract the rules from the trees:
S -> NP VP
NP -> PRP
PRP -> I
VP -> V NP
V -> eat
NP -> N PP
N -> sushi
PP -> IN N
IN -> with
N -> tuna

Estimating PCFG Probabilities
How do we go from the extracted CFG rules to PCFG rules (rules with probabilities, like the simple PCFG above)?
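
A small sketch (not from the slides) of "we can extract the rules from the trees": a parse tree represented as nested tuples, with one CFG rule read off per internal node. The tuple encoding and the extract_rules helper are illustrative assumptions.

def extract_rules(tree):
    """tree = (label, child1, child2, ...); leaves are plain strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules = [(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            rules.extend(extract_rules(c))
    return rules

# Parse of "I eat sushi with tuna" with the PP attached to "sushi".
t = ("S",
     ("NP", ("PRP", "I")),
     ("VP", ("V", "eat"),
            ("NP", ("N", "sushi"),
                   ("PP", ("IN", "with"), ("N", "tuna")))))

for lhs, rhs in extract_rules(t):
    print(lhs, "->", " ".join(rhs))
# S -> NP VP, NP -> PRP, PRP -> I, VP -> V NP, V -> eat, ..., N -> tuna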

Estimating PCFG Probabilities
Extract the rules from the trees, then calculate the probabilities using MLE:
P(α -> β | α) = count(α -> β) / Σ_γ count(α -> γ) = count(α -> β) / count(α)

Rule            Occurrences
S -> NP VP      10
S -> V NP       3
S -> VP PP      2
NP -> N         7
NP -> N PP      3
NP -> DT N      6

P(S -> V NP) = count(S -> V NP) / count(S) = 3 / 15

Grammar Equivalence
What does it mean for two grammars to be equal?
Weak equivalence: grammars generate the same set of strings.
Grammar 1: NP -> DetP N and DetP -> a | the
Grammar 2: NP -> a N | the N
Strong equivalence: grammars have the same set of derivation trees. With CFGs, this is possible only with useless rules.
Grammar 2: NP -> a N | the N
Grammar 3: NP -> a N | the N, DetP -> many
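
A minimal sketch (not from the slides) of the MLE step above, P(α -> β | α) = count(α -> β) / count(α); the rule counts mirror the table, and the variable names are illustrative assumptions.

from collections import Counter

rule_counts = Counter({
    ("S",  ("NP", "VP")): 10,
    ("S",  ("V", "NP")): 3,
    ("S",  ("VP", "PP")): 2,
    ("NP", ("N",)): 7,
    ("NP", ("N", "PP")): 3,
    ("NP", ("DT", "N")): 6,
})

# count(alpha): total occurrences of each left hand side.
lhs_counts = Counter()
for (lhs, _), c in rule_counts.items():
    lhs_counts[lhs] += c

# MLE: divide each rule count by its left hand side count.
pcfg = {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

print(pcfg[("S", ("V", "NP"))])   # 3 / 15 = 0.2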

Normal Forms
There are weakly equivalent normal forms (Chomsky Normal Form, Greibach Normal Form). A CFG is in Chomsky Normal Form (CNF) if all productions are of one of two forms:
A -> B C, with A, B, C nonterminals
A -> a, with A a nonterminal and a a terminal
Every CFG has a weakly equivalent CFG in CNF.

CNF Grammar
Original grammar (fragment):
S -> NP VP
VP -> VB NP
VP -> VB NP PP
NP -> DT NN
NP -> NN
NP -> NP PP
PP -> IN NP
DT -> the
IN -> with
VB -> film | trust
NN -> man | film | trust

Converted grammar (the ternary rule is binarized with a new nonterminal VP2):
S -> NP VP
VP -> VB NP
VP -> VP2 PP
VP2 -> VB NP
NP -> DT NN
NP -> NN
NP -> NP PP
PP -> IN NP
DT -> the
IN -> with
VB -> film | trust
NN -> man | film | trust

Probabilistic Grammar Conversion
Original grammar:
S -> NP VP              0.8
S -> Aux NP VP          0.1
S -> VP                 0.1
NP -> Pronoun           0.2
NP -> Proper-Noun       0.2
NP -> Det Nominal       0.6
Nominal -> Noun         0.3
Nominal -> Nominal Noun 0.2
Nominal -> Nominal PP   0.5
VP -> Verb              0.2
VP -> Verb NP           0.5
VP -> VP PP             0.3
PP -> Prep NP           1.0

Chomsky Normal Form:
S -> NP VP              0.8
S -> X1 VP              0.1
X1 -> Aux NP            1.0
S -> book | include | prefer    0.01 | 0.004 | 0.006
S -> Verb NP            0.05
S -> VP PP              0.03
NP -> I | he | she | me         0.1 | 0.02 | 0.02 | 0.06
NP -> Houston | NWA             0.16 | 0.04
NP -> Det Nominal       0.6
Nominal -> book | flight | meal | money   0.03 | 0.15 | 0.06 | 0.06
Nominal -> Nominal Noun 0.2
Nominal -> Nominal PP   0.5
VP -> book | include | prefer   0.1 | 0.04 | 0.06
VP -> Verb NP           0.5
VP -> VP PP             0.3
PP -> Prep NP           1.0

(Aside) What is the capital of this state? Helena (Montana).
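
A small sketch (not from the slides) of the binarization used in the conversions above: rules with more than two right hand side symbols get fresh nonterminals (like X1 or VP2). The binarize helper and the Xn naming are illustrative assumptions, and rule probabilities are not handled here.

def binarize(rules):
    """rules: list of (lhs, rhs_tuple). Returns an equivalent rule list in which
    no right hand side is longer than two symbols."""
    out, fresh = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            fresh += 1
            new_nt = f"X{fresh}"               # fresh nonterminal
            out.append((new_nt, rhs[:2]))      # X_i -> first two symbols
            rhs = (new_nt,) + rhs[2:]          # replace them by X_i and repeat
        out.append((lhs, rhs))
    return out

for lhs, rhs in binarize([("S", ("Aux", "NP", "VP")), ("VP", ("VB", "NP", "PP"))]):
    print(lhs, "->", " ".join(rhs))
# X1 -> Aux NP, S -> X1 VP, X2 -> VB NP, VP -> X2 PP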

Grammar questions
Can we determine if a sentence is grammatical?
Given a sentence, can we determine the syntactic structure?
Can we determine how likely a sentence is to be grammatical? To be an English sentence?
Can we generate candidate, grammatical sentences?
Next time: parsing.
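
For the last question, a small sketch (not from the slides) that samples candidate sentences top-down from the simple PCFG earlier in the lecture; the PCFG table and the generate helper are illustrative assumptions.

import random

PCFG = {
    "S":  [(("NP", "VP"), 1.0)],
    "VP": [(("V", "NP"), 0.7), (("VP", "PP"), 0.3)],
    "PP": [(("P", "NP"), 1.0)],
    "P":  [(("with",), 1.0)],
    "V":  [(("saw",), 1.0)],
    "NP": [(("NP", "PP"), 0.4), (("astronomers",), 0.1), (("ears",), 0.18),
           (("saw",), 0.04), (("stars",), 0.18), (("telescope",), 0.1)],
}

def generate(symbol="S"):
    """Expand a symbol top-down, choosing productions by their probabilities."""
    if symbol not in PCFG:                      # terminal: emit the word
        return [symbol]
    rhss, probs = zip(*PCFG[symbol])
    rhs = random.choices(rhss, weights=probs)[0]
    return [word for sym in rhs for word in generate(sym)]

print(" ".join(generate()))   # e.g. "astronomers saw stars with ears"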