English Syntax and Context Free Grammars. COMP-599 Oct 8, 2015

Outline
    What is Syntax
    English Syntax
    Context Free Grammars

Syntax
How words can be arranged together to form a grammatical sentence.
    This is a valid sentence.
    *A sentence this valid is.
An asterisk is used to indicate ungrammaticality.
One view of syntax: generate all and exactly those sentences of a language which are grammatical.

The First Grammarian
Panini (Pāṇini), from the 4th century B.C., developed a grammar for Sanskrit.
Source: https://archive.org/details/ashtadhyayitrans06paniuoft

What We Don't Mean by Grammar
Rules or guides for how to write properly, e.g., style guides.
These style guides are prescriptive. We are concerned with descriptive grammars of naturally occurring language.

Basic Definitions
Terms:
    grammaticality
    prescriptivism vs. descriptivism
    constituency
    grammatical relations
    subcategorization

Constituency
A group of words that behave as a unit.
Noun phrases:
    computational linguistics
    it
    Stephen Harper
    three people on the bus
    Jean-Claude Van Damme, the Muscles from Brussels
Adjective phrases:
    blue
    purple
    very good
    ridiculously annoying and tame

Tests for Constituency
1. They can appear in similar syntactic environments.
    I saw ...
        it
        Jean-Claude Van Damme, the Muscles from Brussels
        three people on the bus
        *Van
        *on the

Tests for Constituency
2. They can be placed in different positions or replaced in a sentence as a unit.
    [Jean-Claude Van Damme, the Muscles from Brussels], beat me up.
    It was [Jean-Claude Van Damme, the Muscles from Brussels], who beat me up.
    I was beaten up by [Jean-Claude Van Damme, the Muscles from Brussels].
    He beat me up. (i.e., J-C V D, the M from B)

Tests for Constituency
3. It can be used to answer a question.
    Who beat you up?
        [Jean-Claude Van Damme, the Muscles from Brussels]
        *[the Muscles from]

Grammatical Relations
Relationships between different constituents.
Subject:
    [Jean-Claude Van Damme] relaxed.
    [The wallet] was stolen by a thief.
(Direct) object:
    The boy kicked [the ball].
Indirect object:
    She gave [him] a good beating.
There are many other grammatical relations.

Subcategorization
Notice that different verbs seem to require a different number of arguments:
    relax    1    subj
    steal*   2    subj, dobj
    kick     2    subj, dobj
    give     3    subj, iobj, dobj
*The passive changes the subcategorization of the verb.
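
As a rough illustration, the frames listed on this slide can be stored as a small lexicon and checked mechanically. This is only a sketch: the names SUBCAT and satisfies_frame are made up for this example and are not part of the lecture.

```python
# Subcategorization frames from the slide: verb -> required grammatical relations.
# SUBCAT and satisfies_frame are hypothetical names used only for illustration.
SUBCAT = {
    "relax": ("subj",),
    "steal": ("subj", "dobj"),
    "kick": ("subj", "dobj"),
    "give": ("subj", "iobj", "dobj"),
}


def satisfies_frame(verb, relations):
    """Return True if the supplied relations are exactly the ones the verb requires."""
    return sorted(SUBCAT[verb]) == sorted(relations)


print(satisfies_frame("give", ["subj", "iobj", "dobj"]))  # all three arguments present
print(satisfies_frame("relax", ["subj", "dobj"]))         # one argument too many
```

A flat lookup like this ignores the passive, which, as the slide notes, changes the frame of the verb.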

More Subcategorization
Some other possibilities:
    want        2    subj, inf. clause
        I want to learn about computational linguistics.
    apprise     3    subj, obj, pobj with "of"
        The minister apprised him of the new developments.
    different   2    subj, pobj with "from/than/to"
        This course is different [from/than/to] what I expected.

Short Exercise
Identify the prepositional phrase in the following sentence. Give arguments for why it is a constituent.
    The next assignment is due on Tuesday, October 20th.

Formal Grammars
Since we are computational linguists, we will use a formal computational model of grammar to account for these and other syntactic concerns.
Formal grammar: rules that generate a set of strings that make up a language. (In this context, "language" simply refers to a set of strings.)
Why?
    Formal understanding lets us develop appropriate algorithms for dealing with syntax.
    Implications for cognitive science/language learning

FSAs and Regular Grammars
We've already seen examples of languages defined by formal grammars before this class!
    FSAs to describe aspects of English morphology
    An FSA generates a regular language
    FSAs correspond to a class of formal grammars called regular grammars
To describe the syntax of natural languages (with multiple constituents, subcategorization, etc.), we need a more powerful class of formal grammars: context free grammars (CFGs).

Context Free Grammars (CFGs)
Rules that describe what possible sentences are:
    S -> NP VP
    NP -> this
    VP -> V
    V -> is | kicks | jumps | rocks
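
Because this toy grammar has no recursion, the language it generates is finite and can be listed exhaustively. The sketch below does so; the dictionary encoding and the function name expand are choices made for this example, not something prescribed by the slides.

```python
from itertools import product

# The toy grammar from the slide: each non-terminal maps to a list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["this"]],
    "VP": [["V"]],
    "V": [["is"], ["kicks"], ["jumps"], ["rocks"]],
}


def expand(symbol):
    """Return every terminal string derivable from symbol (non-recursive grammars only)."""
    if symbol not in GRAMMAR:  # a terminal derives only itself
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each right-hand-side symbol, then combine the alternatives in order.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(sentences)  # ['this is', 'this kicks', 'this jumps', 'this rocks']
```

This exhaustive listing would not terminate on a recursive grammar, which is why it is restricted to the four-rule example here.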

Constituent Tree
Trees (and sentences) generated by the previous rules:
    S -> NP VP
    NP -> this
    VP -> V
    V -> is | rules | jumps | rocks
Trees, in bracket notation:
    [S [NP this] [VP [V rules]]]
    [S [NP this] [VP [V rocks]]]
Non-terminals: S, NP, VP, V
Terminals: this, rules, rocks

Formal Definition of a CFG
A 4-tuple (N, Σ, R, S):
    N    set of non-terminal symbols
    Σ    set of terminal symbols
    R    set of rules or productions of the form A -> (Σ ∪ N)*, where A ∈ N
    S    a designated start symbol, S ∈ N
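
The 4-tuple maps naturally onto a small data structure. The following is a minimal sketch, assuming a class named CFG and a method is_well_formed (both invented for this example). It also checks that N and Σ are disjoint, which is the usual convention but is not stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    rules: tuple             # R: pairs (A, rhs), where rhs is a tuple of symbols
    start: str               # S

    def is_well_formed(self):
        """Check S ∈ N, A ∈ N and rhs ∈ (Σ ∪ N)* for every rule, and N ∩ Σ = ∅ (assumed)."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.rules
            )
        )


# The "this is / kicks / jumps / rocks" grammar written as a 4-tuple.
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "V"}),
    terminals=frozenset({"this", "is", "kicks", "jumps", "rocks"}),
    rules=(
        ("S", ("NP", "VP")),
        ("NP", ("this",)),
        ("VP", ("V",)),
        ("V", ("is",)),
        ("V", ("kicks",)),
        ("V", ("jumps",)),
        ("V", ("rocks",)),
    ),
    start="S",
)
print(toy.is_well_formed())
```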

Extended Example
Let's develop a CFG that can account for verbs with different subcategorization frames:
    intransitive verbs    relax          1    subj
    transitive verbs      steal, kick    2    subj, dobj
    ditransitive verbs    give           3    subj, iobj, dobj

Undergeneration and Overgeneration
Problems with above grammar:
Undergeneration: misses valid English sentences
    The boy kicked the ball softly.
    The thief stole the wallet with ease.
Overgeneration: generates ungrammatical sentences
    *The boy kick the ball.
    *The thieves steals the wallets.
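
The grammar developed in class is not preserved in this transcript, so the sketch below uses a hypothetical stand-in with one VP rule per subcategorization frame. The rule set, the lexicon, and the names parses and grammatical are all assumptions made for illustration. The recognizer is a naive top-down search, which is adequate only because this grammar has no left recursion.

```python
# Hypothetical grammar with one VP rule per frame (NOT the grammar from the lecture).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Pro"]],
    "VP": [["Vintr"], ["Vtr", "NP"], ["Vditr", "NP", "NP"]],
    "Det": [["the"]],
    "N": [["boy"], ["ball"], ["thief"], ["thieves"], ["wallet"], ["wallets"]],
    "Pro": [["she"], ["him"]],
    "Vintr": [["relaxed"]],
    "Vtr": [["kicked"], ["kick"], ["stole"], ["steals"]],
    "Vditr": [["gave"]],
}


def parses(symbols, words):
    """True if the symbol sequence can derive exactly the word sequence."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first not in RULES:  # terminal: it must match the next word
        return bool(words) and words[0] == first and parses(rest, words[1:])
    return any(parses(rhs + rest, words) for rhs in RULES[first])


def grammatical(sentence):
    """Apply the recognizer to a lower-cased sentence with the final period removed."""
    return parses(["S"], sentence.lower().rstrip(".").split())


print(grammatical("The boy kicked the ball."))         # accepted
print(grammatical("The boy kicked the ball softly."))  # rejected: undergeneration
print(grammatical("The boy kick the ball."))           # accepted: overgeneration
```

The second call fails because the stand-in grammar has no adverbs, and the third succeeds because nothing ties the form of the verb to its subject.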

Extension 1
Let's add adverbs and prepositional phrases to our grammar.

Recursion
Consider the following sentences:
    The dog barked.
    I know that the dog barked.
    You know that I know that the dog barked.
    He knows that you know that I know that the dog barked.
In general:
    S -> NP VP
    VP -> Vthat Sthat
    VP -> Vintr
    Vthat -> know
    Vintr -> barked
    Sthat -> that S

Recursion
This recursion in the syntax of English means that sentences can be infinitely long (theoretically).
For a given sentence S, you can always make it longer by adding [I/you/he know(s) that S].
In practice, the length is limited because we have limited attention span/memory/processing power.
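
The effect of the recursive rule Sthat -> that S can be sketched with a function that applies it a chosen number of times. The subjects "I" and "you" are assumed NP expansions (the slide gives no NP rules), and the name embed is invented for this example.

```python
SUBJECTS = ["I", "you"]  # assumed NP expansions; both take the base form "know"


def embed(depth):
    """Derive a sentence using the rule Sthat -> that S exactly `depth` times."""
    if depth == 0:
        return "the dog barked"  # S -> NP VP, VP -> Vintr
    subject = SUBJECTS[(depth - 1) % len(SUBJECTS)]
    # S -> NP VP, VP -> Vthat Sthat, Sthat -> that S
    return subject + " know that " + embed(depth - 1)


for depth in range(3):
    print(embed(depth))
# the dog barked
# I know that the dog barked
# you know that I know that the dog barked
```

Each level of embedding adds three words, so no finite bound on sentence length follows from the grammar itself.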

Exercise
Let's try to fix the subject-verb agreement issue.
Present tense:
    Singular third-person subject -> verb has affix -s or -es
    Otherwise -> base form of verb
("to be" is an exception, along with other irregular verbs.)
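
One possible approach to the exercise, offered as a sketch and not as the intended solution, is to split the subject NP and the VP into singular and plural variants so that S can only combine matching pairs. The category names (NP_sg, VP_pl, and so on), the lexicon, and the recognizer are assumptions made for this example, and the grammar covers third-person subjects only.

```python
# Hypothetical agreement grammar: number is encoded in the category names.
RULES = {
    "S": [["NP_sg", "VP_sg"], ["NP_pl", "VP_pl"]],
    "NP_sg": [["Det", "N_sg"]],
    "NP_pl": [["Det", "N_pl"]],
    "NP": [["NP_sg"], ["NP_pl"]],    # object NPs do not constrain the verb
    "VP_sg": [["V_sg", "NP"]],
    "VP_pl": [["V_pl", "NP"]],
    "Det": [["the"]],
    "N_sg": [["boy"], ["thief"], ["ball"], ["wallet"]],
    "N_pl": [["boys"], ["thieves"], ["balls"], ["wallets"]],
    "V_sg": [["kicks"], ["steals"]],  # singular third person: affix -s
    "V_pl": [["kick"], ["steal"]],    # otherwise: base form
}


def parses(symbols, words):
    """True if the symbol sequence can derive exactly the word sequence."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first not in RULES:  # terminal: it must match the next word
        return bool(words) and words[0] == first and parses(rest, words[1:])
    return any(parses(rhs + rest, words) for rhs in RULES[first])


def grammatical(sentence):
    """Apply the recognizer to a lower-cased sentence with the final period removed."""
    return parses(["S"], sentence.lower().rstrip(".").split())


print(grammatical("The boy kicks the ball."))          # accepted
print(grammatical("The boy kick the ball."))           # now rejected
print(grammatical("The thieves steals the wallets."))  # now rejected
```

The cost of this design is that every rule mentioning a subject or a verb is duplicated, which grows quickly once person and tense are added.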

Dependency Grammar
Grammatical relations induce a dependency relation between the words that are involved.
    The student studied for the exam.
Each phrase has a head word:
    the student studied for the exam    (head: studied)
    the student                         (head: student)
    for the exam                        (head: for)
    the exam                            (head: exam)

Dependency Grammar
We can represent the grammatical relations between phrases as directed edges between their heads (head -> dependent):
    The student studied for the exam.
    det:          student -> The
    subject:      studied -> student
    pp arg:       studied -> for
    prep. obj:    for -> exam
    det:          exam -> the
This lets us get at the relationships between words and phrases in the sentence more easily.
Who/what are involved in the studying event? student, for the exam
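
The question on this slide can be answered mechanically once the edges are stored as data. The sketch below assumes a representation as (head, relation, dependent) triples over word positions, with the heads and edge directions read off the example sentence. The names WORDS, EDGES, dependents, subtree, and phrase are invented for this example.

```python
# "The student studied for the exam." Words are identified by 1-based position
# so that the two occurrences of "the" stay distinct.
WORDS = ["The", "student", "studied", "for", "the", "exam"]
EDGES = [  # (head position, relation, dependent position)
    (2, "det", 1),
    (3, "subject", 2),
    (3, "pp arg", 4),
    (4, "prep. obj", 6),
    (6, "det", 5),
]


def dependents(head):
    """Direct dependents of the word at position `head`, as (relation, position) pairs."""
    return [(rel, dep) for h, rel, dep in EDGES if h == head]


def subtree(head):
    """Sorted positions of all words in the phrase headed by position `head`."""
    positions = [head]
    for _, dep in dependents(head):
        positions.extend(subtree(dep))
    return sorted(positions)


def phrase(head):
    """The words of the phrase headed by position `head`, in sentence order."""
    return " ".join(WORDS[i - 1] for i in subtree(head))


# Who/what is involved in the studying event? Look at the dependents of "studied".
for rel, dep in dependents(3):
    print(rel, "->", phrase(dep))
# subject -> The student
# pp arg -> for the exam
```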

Converting between Formalisms
Dependency trees can be converted into a standard constituent tree deterministically (if the dependency edges don't cross each other).
Constituent trees can be converted into a dependency tree, if you know the head of each constituent.
Let's convert some of our previous examples.
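
The constituent-to-dependency direction can be sketched with a table saying which child is the head of each phrase type. The head table HEAD_CHILD, the tree encoding as nested tuples, and the function name to_dependencies are all assumptions made for this example. For brevity the edges use word strings, whereas a real implementation would use word positions.

```python
# Assumed head rules: for each phrase label, the label of its head child.
HEAD_CHILD = {"S": "VP", "VP": "V", "NP": "N", "PP": "P"}

# Constituent tree for "The student studied for the exam" as nested tuples.
# A pre-terminal is (POS, word); any other node is (label, child, child, ...).
TREE = ("S",
        ("NP", ("Det", "The"), ("N", "student")),
        ("VP", ("V", "studied"),
               ("PP", ("P", "for"),
                      ("NP", ("Det", "the"), ("N", "exam")))))


def to_dependencies(tree, edges):
    """Return the head word of `tree`, appending (head, dependent) pairs to `edges`."""
    label, *children = tree
    if isinstance(children[0], str):  # pre-terminal: its head is the word itself
        return children[0]
    heads = [to_dependencies(child, edges) for child in children]
    head_index = next(
        i for i, child in enumerate(children) if child[0] == HEAD_CHILD[label]
    )
    # Every non-head child's head word depends on the head child's head word.
    for i, word in enumerate(heads):
        if i != head_index:
            edges.append((heads[head_index], word))
    return heads[head_index]


edges = []
root = to_dependencies(TREE, edges)
print(root)   # studied
print(edges)
```

The resulting edges are unlabeled; recovering relation names such as "subject" would need additional rules beyond the head table.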

Crossing Dependencies
Yes, there can be crossing dependencies, especially if the language has freer word order:
    Er hat mich versucht zu erreichen.
    Er hat versucht mich zu erreichen.
    "He tried to reach me."
These have the same literal meaning.

Crossing Dependencies Example
What would the dependency edges be in these cases?
    Er hat versucht, mich zu erreichen.
    HE HAS TRIED ME TO REACH
    Er hat mich versucht zu erreichen.
    HE HAS ME TRIED TO REACH
Notice the discontinuous constituent that results in the second case.
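
Whether a set of dependency edges crosses can be checked by comparing the spans of the arcs pairwise. The edge lists below are one plausible analysis, assumed for this sketch: "hat" is the root, "versucht" depends on "hat", "erreichen" depends on "versucht", and "mich" and "zu" depend on "erreichen". The slides leave the analysis as a question, and the function name crossing_pairs is invented for this example.

```python
def crossing_pairs(edges):
    """Return the pairs of arcs (as position spans) that cross each other."""
    spans = [tuple(sorted(edge)) for edge in edges]
    crossings = []
    for i, (a, b) in enumerate(spans):
        for c, d in spans[i + 1:]:
            # Two arcs cross when exactly one endpoint of one lies strictly inside the other.
            if a < c < b < d or c < a < d < b:
                crossings.append(((a, b), (c, d)))
    return crossings


# Edges are (head position, dependent position), 1-based; the analysis is assumed.
# Er(1) hat(2) versucht(3) mich(4) zu(5) erreichen(6)
first = [(2, 1), (2, 3), (3, 6), (6, 4), (6, 5)]
# Er(1) hat(2) mich(3) versucht(4) zu(5) erreichen(6)
second = [(2, 1), (2, 4), (4, 6), (6, 3), (6, 5)]

print(crossing_pairs(first))   # []
print(crossing_pairs(second))  # [((2, 4), (3, 6))]
```

Under this analysis the arc from "erreichen" to "mich" crosses the arc from "hat" to "versucht" in the second word order, which corresponds to the discontinuous constituent "mich ... zu erreichen".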