Ling 566 Oct 4, Context-Free Grammar CSLI Publications


Ling 566 Oct 4, 2016 Context-Free Grammar

Overview Formal definition of CFG Constituency, ambiguity, constituency tests Central claims of CFG Weaknesses of CFG Reading questions

What does a theory do? Monolingual: model grammaticality/acceptability; model relationships between sentences (internal structure). Multilingual: model relationships between languages; capture generalizations about possible languages.

Summary: Grammars as lists of sentences run afoul of the creativity of language. Grammars as finite-state machines offer no representation of structural ambiguity, miss generalizations about structure, and are not formally powerful enough.

Chomsky Hierarchy: Type 0 Languages ⊃ Context-Sensitive Languages ⊃ Context-Free Languages ⊃ Regular Languages

Context-Free Grammar
A quadruple <C, Σ, P, S>:
C: set of categories
Σ: set of terminals (vocabulary)
P: set of rewrite rules α → β1 β2 ... βn
S ∈ C: start symbol
For each rule α → β1 β2 ... βn ∈ P: α ∈ C; βi ∈ C ∪ Σ; 1 ≤ i ≤ n
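As an illustration (my own addition, not part of the slides), the quadruple and its side conditions can be sketched directly in Python, instantiated with the trivial grammar used later in the lecture:

```python
# An illustration (not from the slides) of the CFG quadruple <C, Σ, P, S>,
# instantiated with the trivial grammar from later in the lecture.
C = {"S", "NP", "VP", "D", "N", "V"}      # categories
SIGMA = {"the", "chased", "dog", "cat"}   # terminals (vocabulary)
P = [                                     # rewrite rules alpha -> (beta_1, ..., beta_n)
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V", "NP")),
    ("D", ("the",)), ("V", ("chased",)),
    ("N", ("dog",)), ("N", ("cat",)),
]
START = "S"

def well_formed(categories, terminals, rules, start):
    """Side conditions: start in C; each alpha in C; each beta_i in C ∪ Σ."""
    if start not in categories:
        return False
    for lhs, rhs in rules:
        if lhs not in categories or any(b not in categories | terminals for b in rhs):
            return False
    return True

print(well_formed(C, SIGMA, P, START))  # True
```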

A Toy Grammar
RULES:
S → NP VP
NP → (D) A* N PP*
VP → V (NP) (PP)
PP → P NP
LEXICON:
D: the, some
A: big, brown, old
N: birds, fleas, dog, hunter, I
V: attack, ate, watched
P: for, beside, with

Structural Ambiguity I saw the astronomer with the telescope.

Structure 1: PP under VP
[S [NP I] [VP [V saw] [NP [D the] [N astronomer]] [PP [P with] [NP [D the] [N telescope]]]]]

Structure 2: PP under NP
[S [NP I] [VP [V saw] [NP [NP [D the] [N astronomer]] [PP [P with] [NP [D the] [N telescope]]]]]]
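As a sketch (my own addition, not from the slides), a small CKY-style chart can confirm that a grammar of this kind assigns exactly two structures to the sentence. The binary rules and lexicon below are assumptions chosen to mirror the two trees above:

```python
from collections import defaultdict

# Assumed lexicon and binarized rules mirroring the two trees above.
LEX = {"I": "NP", "saw": "V", "the": "D", "astronomer": "N",
       "with": "P", "telescope": "N"}
BINARY = [("S", "NP", "VP"), ("NP", "D", "N"), ("NP", "NP", "PP"),
          ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP")]

def count_parses(words):
    """Count distinct parse trees for `words` rooted in S (CKY-style chart)."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, category) -> number of subtrees
    for i, w in enumerate(words):
        chart[(i, i + 1, LEX[w])] += 1
    for span in range(2, n + 1):          # build longer spans from shorter ones
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # split point
                for lhs, left, right in BINARY:
                    chart[(i, j, lhs)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, "S")]

print(count_parses("I saw the astronomer with the telescope".split()))  # 2
```

The two parses correspond exactly to PP-attachment under VP versus under NP.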

Constituents How do constituents help us? (What's the point?) What aspect of the grammar determines which words will be modeled as a constituent? How do we tell which words to group together into a constituent? What does the model claim or predict by grouping words together into a constituent?

Constituency Tests
Recurrent patterns: The quick brown fox with the bushy tail jumped over the lazy brown dog with one ear.
Coordination: The quick brown fox with the bushy tail and the lazy brown dog with one ear are friends.
Sentence-initial position: The election of 2000, everyone will remember for a long time.
Cleft sentences: It was a book about syntax they were reading.

General Types of Constituency Tests: distributional, intonational, semantic, psycholinguistic... but they don't always agree.

Central claims implicit in CFG formalism: 1. Parts of sentences (larger than single words) are linguistically significant units, i.e. phrases play a role in determining meaning, pronunciation, and/or the acceptability of sentences. 2. Phrases are contiguous portions of a sentence (no discontinuous constituents). 3. Two phrases are either disjoint or one fully contains the other (no partially overlapping constituents). 4. What a phrase can consist of depends only on what kind of a phrase it is (that is, the label on its top node), not on what appears around it.

Claims 1-3 characterize what is called phrase structure grammar. Claim 4 (that the internal structure of a phrase depends only on what type of phrase it is, not on where it appears) is what makes it context-free. There is another kind of phrase structure grammar called context-sensitive grammar (CSG) that gives up 4. That is, it allows the applicability of a grammar rule to depend on what is in the neighboring environment, so rules can have the form A → X in the context Y_Z.

Possible Counterexamples To Claim 2 (no discontinuous constituents): A technician arrived who could solve the problem. To Claim 3 (no overlapping constituents): I read what was written about me. To Claim 4 (context independence): - He arrives this morning. - *He arrive this morning. - *They arrives this morning.

A Trivial CFG
S → NP VP
NP → D N
VP → V NP
D: the
V: chased
N: dog, cat

Trees and Rules A tree whose root C0 has daughters C1, ..., Cn is a well-formed nonlexical tree if (and only if) C1, ..., Cn are well-formed trees, and C0 → C1 ... Cn is a grammar rule.
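This definition translates directly into a recursive check (an illustration, not from the text; the encoding of trees as (category, children) pairs is my own):

```python
# A sketch of the well-formedness definition: a tree (cat, children) is well
# formed iff its children are well formed and cat -> child-categories is a rule.
# Bare strings are terminal leaves.
RULES = {("S", ("NP", "VP")), ("NP", ("D", "N")), ("VP", ("V", "NP")),
         ("D", ("the",)), ("N", ("dog",)), ("N", ("cat",)), ("V", ("chased",))}

def label(t):
    """Category of a subtree, or the terminal string itself."""
    return t if isinstance(t, str) else t[0]

def well_formed_tree(t):
    if isinstance(t, str):                 # terminals are trivially well formed
        return True
    cat, children = t
    daughters = tuple(label(c) for c in children)
    return (cat, daughters) in RULES and all(well_formed_tree(c) for c in children)

tree = ("S", [("NP", [("D", ["the"]), ("N", ["dog"])]),
              ("VP", [("V", ["chased"]),
                      ("NP", [("D", ["the"]), ("N", ["cat"])])])])
print(well_formed_tree(tree))  # True
```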

Bottom-up Tree Construction: begin with the lexical trees licensed by the lexicon: [D the], [V chased], [N dog], [N cat].

Combine them with the rules NP → D N and VP → V NP: [NP [D the] [N dog]], [NP [D the] [N cat]], then [VP [V chased] [NP [D the] [N cat]]].

Finally, apply S → NP VP: [S [NP [D the] [N dog]] [VP [V chased] [NP [D the] [N cat]]]].

Top-down Tree Construction: start from S and expand with the rules S → NP VP, NP → D N (twice), and VP → V NP.

This yields the skeleton [S [NP D N] [VP V [NP D N]]].

Then insert the lexical items: [D the], [V chased], [N dog], [N cat].

The completed tree: [S [NP [D the] [N dog]] [VP [V chased] [NP [D the] [N cat]]]].

Weaknesses of CFG (atomic node labels): It doesn't tell us what constitutes a linguistically natural rule (nothing blocks a rule like VP → P VP S). Rules get very cumbersome once we try to deal with things like agreement and transitivity. It has been argued that certain languages (notably Swiss German and Bambara) contain constructions that are provably beyond the descriptive capacity of CFG.

Agreement & Transitivity
S → NP-SG VP-SG
S → NP-PL VP-PL
NP-SG → (D) NOM-SG
NP-PL → (D) NOM-PL
NOM-SG → NOM-SG PP
NOM-PL → NOM-PL PP
NOM-SG → N-SG
NOM-PL → N-PL
VP-SG → IV-SG
VP-PL → IV-PL
VP-SG → TV-SG NP
VP-PL → TV-PL NP
VP-SG → DTV-SG NP NP
VP-PL → DTV-PL NP NP
VP-SG → CCV-SG S
VP-PL → CCV-PL S
VP-SG → VP-SG PP
VP-PL → VP-PL PP
...
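A rough sketch (my own illustration, not from the text) of why this blows up: with atomic labels, each agreement feature multiplies the number of rules that must be listed. The count is a rough upper bound, since not every rule mentions every feature:

```python
# With atomic node labels, every rule mentioning an agreeing category must be
# restated once per combination of feature values.
BASE_RULE_COUNT = 9                  # the S/NP/NOM/VP rule schemas above
FEATURES = {"NUM": ["SG", "PL"]}     # adding PER, CASE, ... multiplies further

def expanded_rule_count(base, features):
    """Each feature multiplies the rule count by its number of values."""
    n = base
    for values in features.values():
        n *= len(values)
    return n

print(expanded_rule_count(BASE_RULE_COUNT, FEATURES))                            # 18
print(expanded_rule_count(BASE_RULE_COUNT, {**FEATURES, "PER": ["1", "2", "3"]}))  # 54
```

Feature structures, introduced next time, let a single rule schema state the agreement constraint once instead.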

Shieber 1985 Swiss German example:
... mer d chind em Hans es huus lönd hälfe aastriiche
... we the children-acc Hans-dat the house-acc let help paint
... we let the children help Hans paint the house
Cross-serial dependency: let governs case on children; help governs case on Hans; paint governs case on house.

Shieber 1985 Define a new language f(SG):
f(d chind) = a; f(Jan säit das mer) = w
f(em Hans) = b; f(es huus) = x
f(lönd) = c; f(aastriiche) = y
f(hälfe) = d; f([other]) = z
Let r be the regular language wa*b*xc*d*y.
f(SG) ∩ r = wa^m b^n xc^m d^n y, which is not context-free. But context-free languages are closed under intersection with regular languages, so f(SG), and by extension Swiss German, must not be context-free.
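The count matching that puts wa^m b^n xc^m d^n y beyond CFG can be sketched as a membership test (an illustration; the helper name in_intersection is my own):

```python
import re

# Membership in wa^m b^n xc^m d^n y requires two crossing count matches
# (a-count == c-count and b-count == d-count), which is exactly the kind of
# dependency a context-free grammar cannot enforce.
PATTERN = re.compile(r"^w(a*)(b*)x(c*)(d*)y$")

def in_intersection(s):
    m = PATTERN.match(s)          # the regular-language part is easy
    if not m:
        return False
    a, b, c, d = (len(g) for g in m.groups())
    return a == c and b == d      # the cross-serial count checks

print(in_intersection("waabxccdy"))  # True  (m=2, n=1)
print(in_intersection("waabxcdy"))   # False (a/c counts mismatch)
```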

Strongly/weakly CF A language is weakly context-free if the set of strings in the language can be generated by a CFG. A language is strongly context-free if the CFG furthermore assigns the correct structures to the strings. Shieber's argument is that Swiss German is not weakly context-free and a fortiori not strongly context-free. Bresnan et al. (1983) had already argued that Dutch is strongly not context-free, but the string set of Dutch can still be generated by a CFG, so that argument bears only on strong context-freeness.

On the other hand... It's a simple formalism that can generate infinite languages and assign linguistically plausible structures to them. Linguistic constructions that are beyond the descriptive power of CFG are rare. It's computationally tractable and techniques for processing CFGs are well understood.

So... CFG has been the starting point for most types of generative grammar. The theory we develop in this course is an extension of CFG.

Reading Questions Can signed languages be described with CFG? Can CFG be used to describe agglutinating languages? What is the context that CFGs are free of?

Reading Questions What does the superscript + mean in X → X+ CONJ X? What's the difference between Kleene star and parentheses? NP → (D) A* N PP*; VP → V (NP) (PP). If we can use CFG to write rules like PP → P NP, what does it mean to say that we can't capture headedness?

Reading Questions Does HPSG use X-bar theory? What is NOM? How is it different from N? If N + PP is a NOM (distinct from NP), is there something similar for V + PP distinct from VP?

Reading Questions Wouldn't it be more efficient to use features, rather than NP-PL etc.? How much info do we need to encode about words? What about idioms?

Reading Questions What does it mean for a grammar to be able to adequately describe a language (e.g., on page 36)? How would you go about demonstrating that a type of language belonged to a particular level of the Chomsky hierarchy? What does it mean to be Turing Complete? How do HPSG and Transformational Grammar compare in terms of the languages they can describe? Why model structure and grammaticality with the same system?

Reading Questions Does HPSG try to model what's in the wetware? Humans seem to need very little computational power to store and utilize vast amounts of information. How do we use a human "data structure" in our computer programs? Is it reasonable to assume that NLs have a finite lexicon?

Reading Questions How does HPSG differ from other extensions to CFG (e.g. transformations)? What makes it better for computational applications? Are there other theories that can be modeled computationally? How well does HPSG work for non-English languages?

Reading Questions How do you create a CFG for a language? Manually? Automatically? How many rules do you end up with? How do you evaluate this?

Overview Formal definition of CFG Constituency, ambiguity, constituency tests Central claims of CFG Weaknesses of CFG Next time: Feature structures