L445 / L545. Dept. of Linguistics, Indiana University Spring 2017

Grammars

Parsing: Assigning Structure to Sentences
Parsing: take in an input sentence & assign a structure
  Input: The man left the room.
  Output: (S (NP (DT The) (NN man)) (VP (VBD left) (NP (DT the) (NN room))))
- Why this sort of representation?
- Why do we group words as we do?
- What are these categories & what do they mean?
Today: linguistic motivation for grammars
Later: formal properties
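To make the output representation concrete, here is a minimal sketch that reads the bracketed string from the slide back into a nested structure. The function names parse_brackets and leaves are hypothetical helpers written for this illustration, not part of any course toolkit, and the reader assumes well-formed input.

```python
def parse_brackets(text):
    """Read a bracketed string into nested (label, children) tuples.

    Assumes well-formed input such as "(S (NP ...) (VP ...))".
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]          # category label, e.g. S, NP, DT
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1                     # consume the closing ")"
        return (label, children)

    return read()


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


TREE = parse_brackets(
    "(S (NP (DT The) (NN man)) (VP (VBD left) (NP (DT the) (NN room))))"
)
```

Reading the leaves of TREE recovers the input words in order, while the nesting records which words were grouped together and under which category.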

Syntax
Syntax = the study of the way that sentences are constructed from smaller units.
No dictionary for sentences: there is an infinite number of possible sentences.
  The house is large.
  John believes that the house is large.
  Mary says that John believes that the house is large.
Some basic principles of sentence organization:
- Linear order
- Hierarchical structure (constituency)
- Subcategorization & grammatical relations

Linear order
Linear order = the order of words in a sentence.
A sentence has different meanings based on its linear order.
  John loves Mary.
  Mary loves John.
Linear order is a guiding principle for organizing words into meaningful sentences.
Languages vary as to what extent this is true.

Constituency
We can't use linear order alone to determine sentence organization.
  I eat at really fancy restaurants.
  Many executives eat at really fancy restaurants.
What are the meaningful units of the sentence "Many executives eat at really fancy restaurants"?
- Many executives
- really fancy
- really fancy restaurants
- at really fancy restaurants
- eat at really fancy restaurants
We refer to these meaningful groupings as constituents.

Constituency tests
There are many tests to determine what a constituent is (though they are prone to error).
Preposed/postposed constructions, i.e., can you move the grouping around?
(1) a. On September seventeenth, I'd like to fly from Atlanta to Denver.
    b. I'd like to fly on September seventeenth from Atlanta to Denver.
    c. I'd like to fly from Atlanta to Denver on September seventeenth.
Pro-form substitution
(2) John has some very heavy books, but he didn't want them.
(3) I want to go home, and John wants to do so, too.

Hierarchical structure
Note that constituents appear within other constituents.
We can represent this in a bracket form or in a syntactic tree.
Bracket form: [[Many executives] [eat [at [[really fancy] restaurants]]]]
The syntactic tree is on the next slide.
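The bracket form maps directly onto nested lists. The sketch below encodes the slide's bracketing by hand and lists every bracketed group; the names SENTENCE, words, and constituents are illustrative choices, not from the course materials.

```python
# The bracket form from the slide, written as nested Python lists.
SENTENCE = [
    ["Many", "executives"],
    ["eat", ["at", [["really", "fancy"], "restaurants"]]],
]


def words(node):
    """Flatten a node into its words, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node for w in words(child)]


def constituents(node):
    """List every bracketed group as a string, outermost first."""
    if isinstance(node, str):
        return []                      # a single word opens no bracket
    found = [" ".join(words(node))]
    for child in node:
        found.extend(constituents(child))
    return found
```

Calling constituents(SENTENCE) yields the whole sentence followed by the five groupings identified earlier as meaningful units, which is what "constituents appear within other constituents" amounts to.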

Syntactic tree (first pass)
Tree in bracket form, with placeholder node labels a-f:
[a [b many executives] [c eat [d at [e [f really fancy] restaurants]]]]

Categories
Goal: be able to say that:
- "Many executives" and "really fancy restaurants" are the same type of grouping, or constituent
- "at really fancy restaurants" is something else
For this, we will talk about different categories:
- Lexical (which we've seen before)
- Phrasal

Lexical categories
Lexical categories are simply word classes, or parts of speech. The main ones are:
- verbs: eat, drink, sleep, ...
- nouns: gas, food, lodging, ...
- adjectives: quick, happy, brown, ...
- adverbs: quickly, happily, well, westward
- prepositions: on, in, at, to, into, of, ...
- determiners/articles: a, an, the, this, these, some, much, ...
- conjunctions: and, but, or, since, while, ...

Determining lexical categories
How do we determine which category a word belongs to?
- Distribution: where these words can appear in a sentence
  e.g., nouns like "elephant" can appear after articles (determiners) like "the", while a verb like "linger" cannot.
- Morphology: what kinds of prefixes/suffixes a word can take
  e.g., verbs like "linger" can take an -ed ending to mark them as past tense. A noun like "elephant" cannot.

Closed & open classes
Open classes: new words can be easily added (tend to carry meaning):
- verbs
- nouns
- adjectives
- adverbs
Closed classes: new words cannot be easily added (tend to be function words):
- prepositions
- determiners
- conjunctions

Phrasal categories
Examining the distribution of phrases, we see that some behave in the same way.
  The joggers ran through the park.
Other phrases which can be put in place of "The joggers":
- Susan
- you
- some children
- my friends from Brazil
- students
- most dogs
- a huge, lovable bear
- the people that we interviewed
Since all of these contain nouns, we consider these to be noun phrases (NPs).

Syntactic tree
Tree in bracket form:
[S [NP many executives] [VP eat [PP at [NP [AP really fancy] restaurants]]]]

Noun phrases
Noun phrases, like other kinds of phrases, are headed: there is a designated item (the noun) which determines the properties of the whole phrase.
- Before the noun, you can have determiners (and pre-determiners) and adjective phrases.
- After the noun, you can have prepositional phrases, gerunds (and other verbal clauses), and relative clauses.
- You can also have noun-noun compounds.
General rule: the category of the head word percolates up to the phrase level.

Determiner phrases?
It's not entirely clear that these phrases should be NPs; maybe they should be DPs.
- There generally must be a noun in an NP, but often there must also be a determiner; in fact, determiners can sometimes appear alone.
  (4) {*Student/The student} laughed.
  (5) {These/These students} think a lot.
- The determiner actually scopes over the noun semantically.
  (6) All/Some/No students are happy.
- For some theories, a DP is more uniform with other parts of the syntax.

Verb phrases: Subcategorization
Verbs tend to drive the analysis of a sentence because they subcategorize for elements.
We can say that verbs have subcategorization frames:
- sleep: subject
- find: subject, object
- show: subject, object, second object
- want: subject, object, infinitive verb phrase
- think: subject, sentential complement
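One way to picture subcategorization frames is as a lookup table from verbs to the arguments they require. The sketch below is only an illustration under that assumption: the table copies the five frames listed above, while the role strings and the helper names missing_arguments and is_saturated are hypothetical.

```python
# Frames copied from the slide; role names are illustrative labels only.
FRAMES = {
    "sleep": ("subject",),
    "find": ("subject", "object"),
    "show": ("subject", "object", "second object"),
    "want": ("subject", "object", "infinitive VP"),
    "think": ("subject", "sentential complement"),
}


def missing_arguments(verb, supplied):
    """Return the slots in the verb's frame that `supplied` does not fill."""
    return [role for role in FRAMES[verb] if role not in supplied]


def is_saturated(verb, supplied):
    """True when the supplied roles match the verb's frame exactly."""
    return sorted(supplied) == sorted(FRAMES[verb])
```

For example, is_saturated("sleep", ["subject", "object"]) is False, mirroring the oddness of giving "sleep" an object, and missing_arguments("show", ["subject", "object"]) reports that the second object is still needed.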

Grammatical relations
Grammatical relations are the basic relations between words in a sentence.
(7) She eats a mammoth breakfast.
In this sentence, "She" is the subject, while "a mammoth breakfast" is the object.
In English, the subject must agree in person and number with the verb.

Phrase structure rules (PSRs)
Rules for building these phrases: phrase structure rules (PSRs) build larger constituents from smaller ones.
e.g., S → NP VP
- A sentence (S) constituent is composed of a noun phrase (NP) constituent and a verb phrase (VP) constituent. (hierarchy)
- The NP must precede the VP. (linear order)
Put PSRs together, and you have a context-free grammar (CFG).

Important properties of phrase structure rules
- recursive = a rule can be reapplied (within its hierarchical structure).
    NP → NP PP
    PP → P NP
  The property of recursion means that the set of potential sentences in a language is infinite.
- potentially (structurally) ambiguous = have more than one analysis
  (8) I [VP saw [NP [NP the man] [PP with the telescope]]]
  (9) I [VP saw [NP the man] [PP with the telescope]]
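Structural ambiguity can be checked mechanically by counting analyses. The following is a minimal sketch, not the course's parser: the toy RULES and LEXICON are assumptions chosen so that both bracketings in (8) and (9) are licensed, and count_parses is a hypothetical helper that counts the ways a symbol can span a stretch of words.

```python
from functools import lru_cache

# Toy grammar (an assumption for illustration). It includes the recursive
# pair NP -> NP PP and PP -> P NP from the slide.
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "NP": [("NP", "PP"), ("Det", "N"), ("Pro",)],
    "PP": [("P", "NP")],
}
LEXICON = {"I": "Pro", "saw": "V", "the": "Det",
           "man": "N", "with": "P", "telescope": "N"}


def count_parses(words, start="S"):
    """Count the distinct analyses of `words` as a `start` constituent."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of ways `symbol` derives words[i:j].
        if symbol not in RULES:  # lexical category: must match one word
            return int(j - i == 1 and LEXICON.get(words[i]) == symbol)
        return sum(seq(rhs, i, j) for rhs in RULES[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives words[i:j].
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        total = 0
        # The first symbol covers words[i:k]; every remaining symbol
        # keeps at least one word, so left recursion always shrinks.
        for k in range(i + 1, j - len(rhs) + 2):
            left = count(rhs[0], i, k)
            if left:
                total += left * seq(rhs[1:], k, j)
        return total

    return count(start, 0, len(words))
```

Under this toy grammar, "I saw the man with the telescope" receives two analyses, corresponding to (8) and (9), while "I saw the man" receives one.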

Formal definition of a CFG
1. N: a set of non-terminal (phrasal) symbols, e.g., NP, VP, etc.
2. Σ: a set of terminal (lexical) symbols
   N and Σ are disjoint
3. P: a set of productions (rules) of the form A → α, where A is a non-terminal and α is a sequence of terminals and non-terminals
4. S: a designated start symbol
Question (for later): Are CFGs capable of covering language?
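The four-part definition can be written down as a small data structure with a well-formedness check. This is a sketch under stated assumptions: the class name CFG, its field names, the problems method, and the two-word TOY grammar are all hypothetical and serve only to mirror items 1 to 4 above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # N
    terminals: frozenset      # Sigma
    productions: tuple        # P: pairs (A, alpha), alpha a tuple of symbols
    start: str                # S

    def problems(self):
        """Return a list of violations of the formal definition."""
        issues = []
        if self.nonterminals & self.terminals:
            issues.append("N and Sigma overlap")
        if self.start not in self.nonterminals:
            issues.append("start symbol is not a non-terminal")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                issues.append(f"left-hand side {lhs!r} is not a non-terminal")
            for sym in rhs:
                if sym not in symbols:
                    issues.append(f"unknown symbol {sym!r} in rule for {lhs!r}")
        return issues


# A toy grammar licensing only "she sleeps" (illustrative, not from the slides).
TOY = CFG(
    nonterminals=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"she", "sleeps"}),
    productions=(
        ("S", ("NP", "VP")),
        ("NP", ("she",)),
        ("VP", ("sleeps",)),
    ),
    start="S",
)
```

TOY.problems() returns an empty list; a grammar whose N and Σ share a symbol, or whose rules mention an undeclared symbol, is reported as ill-formed.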

Phenomena to capture
- Coordination
- Active & passive constructions
- Raising & control constructions
- Unbounded dependency constructions (UDCs)

Coordination
One type of phrase we have not mentioned yet is the coordinate phrase, for example "John and Mary".
Coordination can generally apply to any kinds of (identical) phrases.
This makes it ambiguous and causes problems for parsing.
(10) I saw John and Mary left early.
At some point, a parser has to decide between "and" joining NPs and "and" joining Ss.

Difficulties with coordination
Coordination turns out to have particularly difficult properties for linguistic analysis.
- The conjunction of two elements does not obey the same properties as each element.
  (11) a. *Me went to the store.
       b. Me and John went to the store.
- Coordination can be with unlike constituents.
  (12) Robin is [NP a Republican] and [ADJP proud of it]
- Coordination can be with non-constituents.
  (13) John gave me the bread and Mary the sugar.

Active & passive constructions
It is well-established that sentences occur in both active and passive forms:
(14) a. Sandy saw Kim.
     b. Kim was seen by Sandy.
CFGs can clearly handle such sentences, along the lines of:
  VP → V_fin NP
  VP → V_be VP_pass
  VP_pass → V_pass (PP_by)

Relating active and passive constructions
Even if a CFG can license such constructions, questions remain:
- How many rules will it take to capture every relevant grammatical distinction?
- How are the active and passive forms related?
  - Through movement?
  - Through lexical rules?
  - Through nothing at all?

Raising & control constructions
Some verbs look similar in some syntactic contexts, but behave quite differently in others.
(15) a. John seems to be happy.
     b. It seems to be raining.
     c. John tries to be happy.
     d. *It tries to be raining.
Generalization:
- Raising verbs (e.g., seem): the subject of the higher clause is the same as the subject of the lower clause.
- Control (or equi) verbs (e.g., try): the subject of the higher clause controls the subject of the lower clause, but has certain restrictions on it.

Capturing the raising/control generalizations
How do we distinguish raising and control verbs in a CFG?
In both cases, it seems like we have the pattern NP V VP_inf.
Solutions seem to require one or more of the following:
- An empty subject in the lower clause
- Sharing of subjects (or subject properties) between upper and lower verbs, perhaps involving new features
  We'll discuss features more with unification-based grammars (needed also for agreement, etc.)
- A closer connection to sentence semantics

Unbounded dependency constructions (UDCs)
An unbounded dependency construction has an element realized non-locally and:
- involves constituents with different functions
- involves constituents of different categories
- is in principle unbounded

Example: Wh-elements
Wh-elements can have different functions:
(16) a. Who did Hobbs see? (object of verb)
     b. Who do you think saw the man? (subject of verb)
     c. Who did Hobbs give the book to? (object of prep)
     d. Who did Hobbs consider to be a fool? (object of obj-control verb)
Wh-elements can also occur in subordinate clauses:
(17) a. I asked who the man saw.
     b. I asked who the man considered to be a fool.
     c. I asked who Hobbs gave the book to.
     d. I asked who you thought saw Hobbs.

Wh-elements (cont.)
Different categories can be extracted:
(18) a. Which man did you talk to? (NP)
     b. [To [which man]] did you talk? (PP)
     c. [How ill] has the man been? (AdjP)
     d. [How frequently] did you see the man? (AdvP)
This sometimes provides multiple options for a constituent:
(19) a. Who does he rely [on __]?
     b. [On whom] does he rely?
Unboundedness:
(20) a. Who do you think Hobbs saw?
     b. Who do you think Hobbs said he saw?
     c. Who do you think Hobbs said he imagined that he saw?

Accounting for UDCs
How does one account for UDCs?
- Invoke a notion of movement during an analysis
- Include features which pass information about the non-local element
- Use some formalism more powerful than a CFG (e.g., Tree-Adjoining Grammar)