Hindi-Urdu Phrase Structure Annotation

Rajesh Bhatt and Owen Rambow
January 12, 2009

1 Design Principle: Minimal Commitments

- Binary branching representations.
- Mostly lexical projections (NP, VP, AP, AdvP) and one functional projection, CP.
- Full clauses are treated as VPs; there is no node labelled S or IP.
- Noun phrases are treated as NPs; there are no DPs or PPs.
- Auxiliaries are handled as verbs that combine with VPs.
- CPs are postulated only when there is evidence for a complementizer.
- No structural representation of agreement or case.
- No primitive notion of subject or object.
- Attempt to keep analyses uniform: there is always an X and an XP, even if there is only one word in the XP.
- Two arguments of the verb have a canonical word order with respect to each other and the verb. Deviations from this are indicated via traces and coindexation.
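The binary-branching and uniformity principles can be sketched as a small data structure. This is only an illustration, not part of the annotation scheme: the names `Node`, `project`, and `is_binary` are hypothetical, and the head labels ("N", "V", "Adv") are assumed to be the phrasal labels minus the final "P".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """One node of a phrase structure tree (hypothetical representation)."""
    label: str
    children: Tuple["Node", ...] = ()
    word: Optional[str] = None  # set only on leaf nodes


def project(word: str, head: str) -> Node:
    """Uniformity: a single word still gets both an X and an XP node."""
    return Node(head + "P", (Node(head, word=word),))


def is_binary(node: Node) -> bool:
    """True if no node anywhere in the tree has more than two children."""
    return len(node.children) <= 2 and all(is_binary(c) for c in node.children)
```

For instance, `project("kala", "Adv")` yields an AdvP node dominating an Adv leaf, and `is_binary` rejects any tree containing a ternary node.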

2 Central Assumption

(1) The syntax of Hindi-Urdu clauses structurally distinguishes at most two positions. These positions are the [Specifier, VP] position and the [Complement, V] position.

Two distinct arguments: Simple Transitive

(2) ne kiwaba parhi
    Erg book.f read.pfv.f
    read the book.

[Tree diagram for (2); terminals: ne, kiwaba, parhi]

(3) Diagnostics for the high argument:
    a. When unmarked, it controls agreement.
       akbara parhegi
       .f newspaper.m read.fut.fsg
       will read the newspaper.
    b. In non-finite embedded clauses, it becomes null.
       ne [ dilli jana] caha
       Erg Delhi go.inf want.pfv
       wanted to go to Delhi.
    c. In the obligational construction, it bears dative case.
       ko akbara parhna he
       Dat newspaper read.inf be.prs.sg
       has to read the newspaper.

(4) Diagnostics for the low argument:
    a. When the high argument is marked, the low argument controls agreement:
       rama ne kiwaba parhi
       Ram.m Erg book.f read.pfv.f
       Ram read the book.
    b. In non-finite embedded clauses, it remains overt and can control agreement:
       ne [ akbara parhna] caha
       Erg newspaper read.inf want.pfv
       wanted to read the newspaper.
    c. Depending upon its specificity and animacy, it can be marked with ko:
       ne akbara ko parha
       Erg newspaper Acc read.pfv
       read the newspaper.

Using these diagnostics, we can distinguish true transitives from pseudotransitives.

(5) a. rama dilli jaega
       Ram.m Delhi.f go.fut.msg
       Ram will go to Delhi.
    b. Obligational construction: dilli does not control agreement
       rama ko dillii jana/*jani he
       Ram Dat Delhi go.inf/*go.inf.f be.prs
       Ram has to go to Delhi.
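The agreement diagnostics in (3a) and (4a) amount to a simple decision rule, sketched here. The names `Arg` and `agreement_controller` are hypothetical. The sketch assumes that only the two structurally distinguished arguments are passed in, so a pseudotransitive such as (5) has no low argument. What the verb does when neither argument qualifies is not stated above, so the function just returns `None` in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arg:
    form: str
    gender: str                    # "m" or "f"
    marker: Optional[str] = None   # e.g. "ne" or "ko"; None if unmarked


def agreement_controller(high: Optional[Arg], low: Optional[Arg]) -> Optional[Arg]:
    """Pick the argument that controls agreement, following (3a) and (4a)."""
    if high is not None and high.marker is None:
        return high   # (3a): an unmarked high argument controls agreement
    if low is not None and low.marker is None:
        return low    # (4a): otherwise an unmarked low argument does
    return None       # no controller; the outcome is left open in this sketch
```

On this sketch, (4a) picks kiwaba as controller because rama is marked with ne, while (5b) yields no controller because rama is marked with ko and dillii is not passed in as a low argument.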

2.1 Intransitives

When there is only one distinct argument, as in unaccusatives (e.g. TUTnaa 'break') and in unergatives (e.g. hãs-naa 'laugh'), there is a question whether we still have two structurally distinguished positions or just one.

One possible treatment:

[Tree diagrams: Unergatives; Unaccusative, with an argument coindexed with a trace t_i]

In principle, there are a number of diagnostics that can help distinguish between unaccusatives and unergatives, but in practice the distinction can be tricky. One possibility is to make the distinction at the level of the lexicon/propbank and assign both unaccusatives and unergatives the following simplex structure.

[Tree diagram: simplex structure]

2.2 Passives

The single argument in the passive of a transitive occupies both the high and the low position.

(6) keka kala banaya jaega
    cake tomorrow make.pfv GO.Fut.MSg
    The cake will be made tomorrow.

[Tree diagram for (6): keka_i is coindexed with a trace t_i; kala is an AdvP; terminals: keka, kala, banaya, jaega]

2.3 Ditransitives and Others

The assumption that there are only two structurally distinguished positions has the effect that we do not make a distinction between clear cases of adjuncts (temporal, locative, manner, reason, and other adverbials) and cases of putative arguments. These include:

(7) a. Dative arguments in ditransitives
    b. Dative subjects in dative subject verbs
    c. Verbs with quirky objects (instrumentals, locatives)

(8) rama ne mina ko kala eka kiwaba di
    Ram Erg Mina Dat yesterday a book give.pfv.f
    Ram gave a book to Mina yesterday.

[Tree diagram for (8); terminals: rama ne, mina ko, kala, eka kiwaba, di]

- Note that the dative argument Mina ko and the temporal adverb kala are both represented as adjuncts on .
- The argumenthood of Mina ko could be identified at the lexical level and indicated as a diacritic on the phrase structure. It could also be inherited from the PDG annotation.
- While the distinction between temporal adverbials and dative arguments is a clear one, other cases are harder to adjudicate, for example locative arguments of motion verbs and benefactive arguments of creation verbs.

3 Scrambling and Traces

Traces are used to indicate scrambling.

(9) The basic structure is as discussed earlier.
    a. Negation is adjoined to.
    b. Adverbs are adjoined to and.
    c. If arguments appear in an order other than the order in the basic structure, the reordering is indicated via traces.
    d. The reordering of adjuncts does not necessarily involve traces.
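The trace convention in (9c) suggests a simple well-formedness check on annotated bracketings: every trace should share its index with some overt element. A minimal sketch follows. The function name `unbound_traces` is hypothetical, and so is the plain-text notation it assumes, in which an index is written as a suffix (`akbara_i`, `]_i`) and a trace as `t_i`.

```python
import re
from typing import List


def unbound_traces(bracketing: str) -> List[str]:
    """Return the indices of traces that have no coindexed antecedent."""
    # Split into brackets and other whitespace-delimited tokens.
    tokens = re.findall(r"[^\s\[\]]+|\[|\]", bracketing)
    trace_indices = set()
    antecedent_indices = set()
    for tok in tokens:
        m = re.fullmatch(r"(.*)_(\w+)", tok)
        if m is None:
            continue  # token carries no index
        if m.group(1) == "t":
            trace_indices.add(m.group(2))       # a trace such as t_i
        else:
            antecedent_indices.add(m.group(2))  # e.g. akbara_i, or _i after "]"
    return sorted(trace_indices - antecedent_indices)
```

A bracketing with a scrambled argument and its trace returns an empty list, while a stray `t_j` with no coindexed element is reported. The check ignores node labels, so it applies equally to labelled and unlabelled brackets.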

(10) a. [ P akbara_i [ P [ t_i parhegi]]]
        newspaper.f read.fut.fsg
        The newspaper, will read.
     b. [ P [ akbara_i [ kala [ t_i parhegi]]]]
        .f newspaper tomorrow read.fut.f
        will read the newspaper tomorrow.
     c. Putatively non-canonical word order, but no traces:
        [ P Sita ko [ P Ram ne [ kitaab dii]]]
        Sita Dat Ram Erg book.f give.pfv.f
        Ram gave Sita a book.
     d. Extraction of possessor, traces:
        [ P [Sita kii]_i [ P [ P Ram ne [ [ P t_i kitaab] parhii]] hai]]
        Sita Gen.f Ram Erg book.f read.pfv.f be.prs
        Ram read Sita's book. (literally: Sita's, Ram read book.)

4 Representation of Null Elements

If there is a low argument and the high argument is missing, the high argument is realized as a pro, a silent pronoun. If we have evidence that there is a low argument but it is not overt, then the missing argument is also realized as a pro. Other elements that are missing (adjuncts, datives, etc.) are not explicitly represented.
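The two rules for null elements can be sketched as a small function. The name `realize_arguments` and the flag `low_evidenced` are hypothetical. The sketch also makes one interpretive assumption: "there is a low argument" is taken to cover both an overt low argument and one for which there is only evidence.

```python
from typing import Optional, Tuple

PRO = "pro"  # the silent pronoun


def realize_arguments(
    high: Optional[str],
    low: Optional[str],
    low_evidenced: bool = False,
) -> Tuple[Optional[str], Optional[str]]:
    """Insert pro for missing arguments; None means 'not represented'."""
    has_low = low is not None or low_evidenced
    if has_low and low is None:
        low = PRO    # evidence for a low argument that is not overt
    if has_low and high is None:
        high = PRO   # low argument present, high argument missing
    return high, low
```

Missing adjuncts and datives are deliberately outside the function's scope, since they are not explicitly represented.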