Representing Syntax by Means of Properties: a Formal Framework for Descriptive Approaches


Philippe Blache

To cite this version: Philippe Blache. Representing Syntax by Means of Properties: a Formal Framework for Descriptive Approaches. Journal of Language Modelling, Institute of Computer Science, Polish Academy of Sciences, Poland, 2016, 4 (2), pp < /jm.v4i2.129>. <ha > HAL Id: ha

Submitted on 7 Feb 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Philippe Blache
CNRS & Aix-Marseille Université, Laboratoire Parole et Langage
blache@blri.fr

Abstract

Linguistic description and language modelling need to be formally sound and complete while still being supported by data. We present a linguistic framework that bridges such formal and descriptive requirements, based on the representation of syntactic information by means of local properties. This approach, called Property Grammars, provides a formal basis for the description of specific characteristics as well as entire constructions. In contrast with other formalisms, all information is represented at the same level (no property playing a more important role than another) and independently (any property being evaluable separately). As a consequence, a syntactic description, instead of a complete hierarchical structure (typically a tree), is a set of multiple relations between words. This characteristic is crucial when describing unrestricted data, including spoken language. We show in this paper how local properties can implement any kind of syntactic information and constitute a formal framework for the representation of constructions (seen as a set of interacting properties). The Property Grammars approach thus offers the possibility to integrate the description of local phenomena into a general formal framework.

Keywords: syntax, constraints, linguistic theory, usage-based theories, constructions, Property Grammars

Journal of Language Modelling Vol i2, No eiπ (1970), pp. 1–41

1 Introduction

The description and modelling of local language phenomena contributes to a better understanding of language processing. However, this data-driven perspective needs to provide a method of unifying models into a unique and homogeneous framework that would form an effective theory of language. Reciprocally, from the formal perspective, linguistic theories provide general architectures for language processing, but still have difficulty in integrating the variability of language productions. The challenge at hand is to test formal frameworks using a large range of unrestricted and heterogeneous data (including spoken language). The feasibility of this task mainly depends on the ability to describe all possible forms, regardless of whether they are well-formed (i.e. grammatical) or not. Such is the goal of the linguistic trend known as usage-based (Langacker 1987; Bybee 2010), which aims to describe how language works based on its concrete use. Our goal is to propose a new formal framework built upon this approach.

Moving away from the generative framework. Addressing the question of the syntactic description independently of grammaticality represents an epistemological departure from the generative approach in many respects. In particular, it consists in moving away from the representation of competence towards that of performance. Several recent approaches in line with this project consider grammar not as a device for generating language, but rather as a set of statements, making it possible to describe any kind of input, addressing at the same time the question of gradience in grammars (Aarts 2004; Blache and Prost 2005; Fanselow et al. 2005). To use a computational metaphor, this means replacing a procedural approach where grammar is a set of operations (rules), with a declarative approach where grammar is a set of descriptions.

This evolution is fundamental: it relies on a clear distinction between linguistic knowledge (the grammar) and parsing mechanisms that are used for building a syntactic structure. In most current formalisms, this is not the case. For example, the representation of syntactic information with trees relies on the use of phrase-structure rules which encode both a syntactic relation (government) and operational information (the local tree to be used in the final structure). Such merging of operational information within the grammar can also be found in other formalisms. It is an important feature in Tree-Adjoining Grammars (Joshi et al. 1975) in which the grammar is made of sub-parts of the final syntactic tree. It is also the case in Dependency Grammars (Tesnière 1959) with the projectivity principle (intended to control tree well-formedness) as well as in HPSG (Pollard and Sag 1994; Sag and Wasow 1999) and its feature percolation principles.

We propose disentangling these different aspects by excluding information solely motivated by the kind of structure to be built. In other words, linguistic information should be encoded independently of the form of the final representation. Grammar is limited then to a set of descriptions that are linguistic facts. As explained by Pullum and Scholz (2001), doing this enables a move away from Generative-Enumerative Syntax (GES) towards a Model-Theoretic Syntax (MTS) (Cornell and Rogers 2000; Blackburn and Meyer-Viol 1997; Blache 2007).

Several works are considered by Pullum and Scholz (2001) to exhibit the seeds of MTS, in particular around HPSG and Construction Grammars (Fillmore 1988; Kay and Fillmore 1999). These two approaches have recently converged, leading to a new framework called Sign-Based Construction Grammars (Sag 2012; Sag et al. 2012). SBCG is motivated by providing a formal basis for Construction Grammars, paving the way towards modelling language usage. It starts to fulfil the MTS requirements in that it proposes a monotonic system of declarative constraints, representing different sources of linguistic information and their interaction. However, there still remains a limitation that is inherent to HPSG: the central role played by heads. All information is controlled by this element, as the theory is head-driven. All principles are stipulated on the basis of the existence of a context-free skeleton, implemented by dominance schemas. As a consequence, the organization of the information is syntacto-centric: the interaction of the linguistic domains is organized around a head/dependent hierarchical structure, corresponding to a tree.
In these approaches, representing the information of a domain, and more to the point the interactions among the domains, requires one to first build the schema of mothers/daughters. Constraints are then applied as filters, so as to identify well-formed structures. As a side effect, no description can be given when no such structures can be built. This is a severe restriction both for theoretical and cognitive reasons: one of the requirements of MTS is to represent all linguistic domains independently of each other (in what Pullum and Scholz (2001) call a non-holistic manner). Their interaction is to be implemented directly, without giving any priority to any of them with respect to the others. Ignoring this requirement necessarily entails a modular and serial conception of language processing, which is challenged now both in linguistics and psycholinguistics (Jackendoff 2007; Ferreira and Patson 2007; Swets et al. 2008). Evidence supporting this challenge includes: language processing is very often underspecified; linguistic information comes from different and heterogeneous sources that may vary depending on usage; the understanding mechanisms are often non-compositional; etc. One goal of this paper is to propose an approach that accommodates such different uses of language so as to be able to process canonical or non-canonical, mono- or multimodal inputs.

Describing any kind of input. Linguistic information needs to be represented separately when trying to account for unrestricted material, including non-canonical productions (for example in spoken language). The main motivation is that, whatever the sentence or the utterance to be parsed, it becomes then possible to identify its syntactic characteristics independently of the structure to be built. If we adopt this approach, we still can provide syntactic information partly describing the input even when no structure can be built (e.g. ill-formed realizations). In other words, it becomes possible to provide a description (in some cases a partial description) of an input regardless of its form. This type of approach allows one to describe any type of sentence or utterance: it is no longer a question of establishing whether the sentence under question is grammatical or not, but rather of describing the sentence itself. This task amounts to deciding which descriptions present in the grammar are relevant to the object to be described and then to assessing them.

Grammar as set of constructions. One important advance for linguistic theories has been the introduction of the notion of construction (Fillmore 1988; Kay and Fillmore 1999). A construction is the description of a specific linguistic phenomenon, leading to a specific form-function pairing that is conventionalized or even not strictly predictable from its component parts (Goldberg 2003, 2009). These pairings result from the convergence of several properties or characteristics, as illustrated in the following examples:

1. Covariational conditional construction The Xer the Yer: The more you watch the less you know
2. Ditransitive construction Subj V Obj1 Obj2: She gave him a kiss
3. Idiomatic construction: kick the bucket

Several studies and new methodologies have been applied to syntactic description in the perspective of modelling such phenomena (Bresnan 2007). The new challenge is to integrate these constructions, which are the basic elements of usage-based descriptions, into a homogeneous framework of a grammar. The problem is twofold: first, how to represent the different properties characterizing a construction; and second, how to represent the interaction between these properties in order to form a construction.

Our proposal. We seek an approach where grammars consist of usage-based descriptions. A direct consequence is to move the question away from building a syntactic structure to describing the characteristics of an input. In concrete terms, grammatical information should be designed in terms of statements that are not conceived of with the aim of building a structure. We propose a presentation of a theoretical framework that integrates the main requirements of a usage-based perspective. Namely, it first integrates constructions into a grammar and secondly describes non-grammatical exemplars. This approach relies on a clear distinction of operational and declarative aspects of syntactic information. A first step in this direction has been achieved with Property Grammars (Blache 2000; Blache and Prost 2014), in which a grammar is only made of properties, all represented independently of each other. Property Grammars offer an adequate framework for the description of linguistic phenomena in terms of interacting properties instead of structures. We propose going one step further by integrating the notion of construction into this framework.
One of the contributions of this paper, in comparison to previous works, is a formal specification of the notion of construction, based on constraints only, instead of structures as in SBCG. It proposes moreover a computational method for recognizing them.

In the first section, we present a formal definition of the syntactic properties; these are used for describing any type of input. We then address more theoretical questions that constitute obstacles when trying to represent basic syntactic information independently of the rest of the grammar [note 1]. We explore in particular the consequences of representing relations between words directly, without the mediating influence of any higher-level structures or elements (i.e. without involving the notion of phrases or heads). Last, we describe how this framework can incorporate the notion of construction and detail its role in the parsing process.

Note 1: Pullum and Scholz (2001) emphasize this characteristic as a requirement for moving away from the holistic nature of generative grammars.

2 New properties for grammars

We seek to abstract the different types of properties that encode syntactic information. As explained above, we clearly separate the representation of such information from any pre-defined syntactic structure. In other words, we encode this information by itself, and not in respect to any structure: a basic syntactic property should not be involved in the building of a syntactic structure. It is thus necessary to provide a framework that excludes any notion of hierarchical information, such as heads or phrases: a property is a relation between two words, nothing more. Disconnecting structures and relations is the key towards the description of any kind of input as well as any type of construction.

Unlike most syntactic formalisms, we limit grammar to those aspects that are purely descriptive, excluding operational information. Here, the grammatical information as well as the structures proposed for representing syntactic knowledge are not determined by how they may be used during analysis. We want to avoid defining (e.g. as in constituency-based grammars) a phrase-structure rule as a step in the derivational process (corresponding to a sub-tree). In this case, the notions of projection and sisterhood eclipse all other information (linear order, co-occurrence, etc.), which becomes implicit. Likewise, in dependency grammars, a dependency relation corresponds to a branch on the dependency tree. In this context, sub-categorization or modification information becomes dominant and supersedes other information which, in this case too, generally becomes implicit. This issue also affects modern formalisms, such as HPSG (Pollard and Sag 1994; Sag and Wasow 1999; Sag 2012) which strictly speaking does not use phrase-structure rules but organizes syntactic information by means of principles in such a way that it has to percolate through the heads, building as a side-effect a tree-like structure.

Our approach, in the context of Property Grammars (hereafter PG), consists in identifying the different types of syntactic information in order to represent them separately. At this stage, we will organize grammatical statements around the following types of syntactic information:

- the linear order that exists among several categories in a construction
- the mandatory co-occurrence between two categories
- the exclusion of co-occurrence between two categories
- the impossibility of repeating a given category
- syntactic-semantic dependency between two categories (generally a category and the one that governs it)

This list of information is neither fixed nor exhaustive and could be completed according to the needs of the description of specific languages, for example with adjacency properties, completing linearity, or morphological dependencies.

Following previous formal presentations of Property Grammars (Duchier et al. 2010; Blache and Prost 2014) we propose the following notations: x, y (lower case) represent individual variables; X, Y (upper case) are set variables. We note C(x) the set of individual variables in the domain assigned to the category C (cf. Backofen et al. (1995) for more precise definitions). We use the set of binary predicates for linear precedence ($\prec$) and equality ($\approx$).

2.1 Linearity

In PG, word order is governed by a set of linearity constraints, which are based on the clause established in the ID/LP formalism (Gazdar et al. 1985). Unlike phrase-structure or dependency grammars, this information is, therefore, explicit. The linearity relationship between two categories is expressed as follows:

$$\mathit{Prec}(A, B) : (\forall x, y)[(A(x) \wedge B(y)) \rightarrow \neg(y \prec x)] \tag{1}$$

This is the same kind of linear precedence relation as proposed in GPSG (Gazdar et al. 1985). If the nodes x and y, respectively of category A and B, are realized [note 2], then y cannot precede x. For example, in a nominal construction in English, we can specify the following linearity properties:

$$\mathit{Det} \prec \mathit{Adj};\quad \mathit{Det} \prec N;\quad \mathit{Adj} \prec N;\quad N \prec \mathit{WhP};\quad N \prec \mathit{Prep} \tag{2}$$

Note 2: A word or a category is said to be realized when it occurs in the sentence to be parsed.

Note that, in this set of properties, relations are expressed directly between the lexical categories (the notion of phrase-structure category is no longer used). As such, the $N \prec \mathit{Prep}$ property indicates precedence between these two categories regardless of their dependencies. This aspect is very important and constitutes one of the major characteristics of PG: all properties can be applied to any two items, including when no dependency or subcategorization links them.

The following example illustrates all the linearity relationships in the nominal construction The very old reporter who the senator attacked (the relative clause is not described here):

[Diagram (3): linearity relations over the category sequence Det Adv Adj N WhP]

In this example, the linearity properties between two categories are independent of the rection (government) relations that these categories are likely to have. The linearity between Det and Adj holds even if these two categories have other dependencies (for example between the Adj and a modifier such as Adv). In theory, it could even be possible that a word dependent from the second category of the relation is realized before the first one: as such, there is no projectivity in these relations [note 3]. The same situation can be found for non-arguments: a linearity can be directly stipulated for example between a negative adverb and a verb. This is an argument in favour of stipulating properties directly between lexical categories rather than using phrase structures.

Note 3: Such a phenomenon does not exist in languages with fixed word order such as English or French.

In addition to the representation of syntactic relations, properties may be used to instantiate attribute values. For example, we can distinguish the linearity properties between the noun and the verb, depending on whether N is subject or object by specifying this value in the property itself:

$$N[\mathit{subj}] \prec V;\quad V \prec N[\mathit{obj}] \tag{4}$$

As we shall see, all properties can be used to instantiate certain attribute values. As is the case in unification grammars, attributes can be used to reduce the scope of a property by limiting the categories to which it can be applied. Generally speaking, a property (playing the role of a constraint) has a dual function: control (limiting a definition domain) and instantiation (assigning values to variables, by unification).

2.2 Co-occurrence

In many cases, some words or categories must co-occur in a domain, which is typically represented by sub-categorization properties. For example, the transitive schema for verbs implies that a nominal object (complement) must be included in the structure. Such a co-occurrence constraint between two categories x and y specifies that if x is realized in a certain domain, then y must also be included. This is formally represented as follows:

$$\mathit{Req}(A, B) : (\forall x, y)[A(x) \rightarrow B(y)] \tag{5}$$

If a node x of category A is realized, so too is a node y of category B. The co-occurrence relation is not symmetric. As for verbal constructions, a classical example of co-occurrence concerns nominal and prepositional complements of ditransitive verbs, which are represented through the following properties:

$$V \Rightarrow N;\quad V[\mathit{dit}] \Rightarrow \mathit{Prep} \tag{6}$$

As described in the previous section, a property is stipulated over lexical categories, independently of their dependents and their order. It should be noted that co-occurrence not only represents complement-type relations, it can also include co-occurrence properties directly between two categories independently from the head (thus regardless of rection relations). For example, the indefinite determiner is not generally used with a comparative superlative [note 4]:

(1) a. The most interesting book of the library
    b. *A most interesting book of the library

Note 4: This constraint is limited to comparative superlatives. In some cases the use of an indefinite determiner entails a loss of this characteristic. In the sentence In the crowd, you had a former fastest man in the world, the superlative becomes absolute, identifying a set of elements instead of a unique one.

In this case, there is a co-occurrence relation between the determiner and the superlative, which is represented by the property:

$$\mathit{Sup} \Rightarrow \mathit{Det}[\mathit{def}] \tag{7}$$

Furthermore, this example shows that we can also specify variable granularity properties by applying general or more specific categories by means of attribute values.

A key point must be emphasized when using co-occurrence properties: the notion of head does not play a preponderant role in our approach. Moreover, we do not use sets of constituents within which, in constituency-based grammar, the head is distinct and indicates the type of projection. Classically in syntax, the head is considered to be the governing category, which is also the minimum mandatory component required to create a phrase. This means that the governed components must be realized together with the head. As such, this information is represented by properties establishing co-occurrence between the head and its complements. Defining a specific property that identifies the head is, therefore, not necessary. In the case of the nominal construction, the fact that N is a mandatory category is stipulated by a set of co-occurrence properties between the complements and the adjuncts to the nominal head:

$$\mathit{Det} \Rightarrow N[\mathit{common}];\quad \mathit{Adj} \Rightarrow N;\quad \mathit{WhP} \Rightarrow N;\quad \mathit{Prep} \Rightarrow N \tag{8}$$

The set of co-occurrence properties for the nominal construction described so far can be represented by the following graph:

[Graph (9): co-occurrence arcs (labelled c) over The most interesting book of the library]

We shall see later how the conjunction between co-occurrence and dependency properties is used to describe the syntactic characteristics of a head, without the need for other types of information. As such (unlike previous versions of PG), using specific properties for describing the head is not required.

At this stage, we can note that different solutions exist for representing non-headed constructions, for example when no noun is realized in a nominal construction. As we will see later, all constraints are violable. This means that a nominal construction without a noun such as in The very rich are different from you and me can be described with a violation of the co-occurrence properties stipulated above. This comes to identify a kind of implicit relation, not to say an empty category. Another solution consists in considering the adjective as a possible head of the nominal construction. In such a case, the grammar should contain another set of co-occurrence and dependency properties that are directly stipulated towards the adjective instead of the noun.

2.3 Exclusion (co-occurrence restriction)

In some cases, restrictions on the possibilities of co-occurrence between categories must be expressed. These include, for example, cases of lexical selection, concordance, etc. An exclusion property is defined as follows:

$$\mathit{Excl}(A, B) : (\forall x)(\nexists y)[A(x) \wedge B(y)] \tag{10}$$

When a node x of category A exists, a sibling y of category B cannot exist. This is the exclusion relation between two constituents, which corresponds to the co-occurrence restriction in GPSG. The following properties show a few co-occurrence restrictions between categories that are likely to be included in nominal constructions:

$$\mathit{Pro} \otimes N;\quad N[\mathit{prop}] \otimes N[\mathit{com}];\quad N[\mathit{prop}] \otimes \mathit{Prep}[\mathit{inf}] \tag{11}$$

These properties stipulate that, in a nominal construction, the following cannot exist simultaneously: a pronoun and a noun; a proper noun and a common noun; nor a proper noun and an infinitive construction introduced by a preposition.

Likewise, relative constructions can be managed based on the syntactic role of the pronoun. A relative construction introduced by a subject relative pronoun, as indicated in the following property, cannot contain a noun with this same function. This restriction is compulsory in French, where relative pronouns are case marked:

$$\mathit{WhP}[\mathit{subj}] \otimes N[\mathit{subj}] \tag{12}$$

It is worth noting that a particularity of this type of property is that it can only be verified when the entire rection domain is known. We will discuss later the different cases of constraint satisfiability, which depend on their scope.

2.4 Uniqueness

Certain categories cannot be repeated inside a rection domain. More specifically, categories of this kind cannot be instantiated more than once in a given domain. This property is defined as follows:

$$\mathit{Uniq}(A) : (\forall x, y)[(A(x) \wedge A(y)) \rightarrow x \approx y] \tag{13}$$

If one node x of category A is realized, other nodes y of the same category A cannot exist. Uniqueness stipulates that constituents cannot be replicated in a given construction. Uniqueness properties are common in domain descriptions, although their importance depends upon the constructions to which they belong. The following example describes the uniqueness properties for nominal constructions:

$$\mathit{Uniq} = \{\mathit{Det}, \mathit{Rel}, \mathit{Prep}[\mathit{inf}], \mathit{Adv}\} \tag{14}$$

These properties are archetypal for the determiner and the relative pronoun. They also specify here that it is impossible to replicate a prepositional construction that introduces an infinitive (the will to stop) or a determinative adverbial phrase (always more evaluation). Uniqueness properties are encoded by a loop:

[Diagram (15): uniqueness loops (labelled u) over The book that I read]

2.5 Dependency

The dependency relation in PG is in line with the notion of syntactic-semantic dependency defined in Dependency Grammars. It describes different types of relations between two categories (complement, modifier, specifier, etc.). In terms of representation, this relation is arbitrarily oriented from the dependent to the head. It indicates the fact that a given object complements the syntactic organization of the target (usually the governor) and contributes to its semantic structure. In this section, we leave aside semantics and focus on the syntactic aspect of the dependency relation.

Dependency relations are type-based and follow a type hierarchy (Figure 1); note that this hierarchy can be completed according to requirements of specific constructions or languages.

[Figure 1: The hierarchy of the dependency relation. The root dep has the sub-types mod, spec, comp, aux and conj; comp has the sub-types subj, obj, iobj and xcomp.]

Since the dependency relation is organized as a type hierarchy, it is possible to describe a dependency relation at the most general level (the root of the hierarchy) or at any sub-level, depending on the required precision. Each of these types and/or sub-types corresponds to a classic syntactic relation (Figure 2).

Dependency relations (noted $\leadsto$) possibly bear the dependency sub-type as an index. The following properties indicate the dependency properties applied to nominal constructions:

$$\mathit{Det} \leadsto_{spec} N[\mathit{com}];\quad \mathit{Adj} \leadsto_{mod} N;\quad \mathit{WhP} \leadsto_{mod} N \tag{16}$$
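To illustrate how a type hierarchy lets a dependency be stated at any level of precision, here is a minimal Python sketch. It is not the author's implementation: the `PARENT` table and the functions `ancestors` and `subsumes` are hypothetical names introduced for illustration, and the tree shape (comp dominating subj, obj, iobj and xcomp) is assumed from Figures 1 and 2.

```python
# Hypothetical child -> parent encoding of the dependency type hierarchy.
# The shape of the tree is an assumption read off Figures 1 and 2.
PARENT = {
    "mod": "dep", "spec": "dep", "comp": "dep", "aux": "dep", "conj": "dep",
    "subj": "comp", "obj": "comp", "iobj": "comp", "xcomp": "comp",
}


def ancestors(dep_type):
    """Return the chain from dep_type up to the root of the hierarchy."""
    chain = [dep_type]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its ancestors."""
    return general in ancestors(specific)


# A property stated with the generic type comp covers a subj dependency,
# whereas a property stated with mod does not.
print(subsumes("comp", "subj"))  # True
print(subsumes("mod", "subj"))   # False
```

Under this encoding, a grammar statement typed comp would match any dependency later specified as subj, obj, iobj or xcomp, which is one possible reading of describing a relation "at any sub-level".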

Figure 2: The sub-types of the dependency relation

dep     generic relation, indicating dependency between a constructed component and its governing component
mod     modification relation (typically an adjunct)
spec    specification relation (typically Det-N)
comp    the most general relation between a head and an object (including the subject)
subj    dependency relation describing the subject
obj     dependency relation describing the direct object
iobj    dependency relation describing the indirect object
xcomp   other types of complementation (for example between N and Prep)
aux     relation between the auxiliary and the verb
conj    conjunction relation

The following example illustrates some dependencies in a nominal construction:

[Diagram (17): dependency arcs (labelled spec, mod and comp) over The most interesting book of the library]

In this schema, we can see the specification relations between the determiners and the corresponding nouns, and the modification relations between the adjectival and prepositional constructions as well as between the adverb and the adjective inside the adjectival construction.

Feature control: The types used in the dependency relations, while specifying the relation itself, also provide information for the dependent element. In PG, the dependency relation also assigns a value to the function attribute of the dependent. For example, a subject dependency between a noun and a verb is expressed by the following property:

$$N[\mathit{subj}] \leadsto_{subj} V \tag{18}$$

16 Representing syntax by means of properties This property instantiates the function vaue in the exica structure [function subject]. Simiary, dependency reations, as with other properties, make it possibe to contro attribute vaues thanks to unification. This is usefu, for exampe, for agreement attributes that are often inked to a dependency. For instance, in French, a gender and number agreement reation exists between the determiner, the adjective and the noun. This is expressed in the foowing dependencies: Det[agr i ] spec N[agr i ]; Ad j[ag r i ] mod N[ag r i ] (19) Forma aspects: Unike dependency grammars, this dependency reation is not strict. First of a, as the dependencies are ony a part of the syntactic information, a compete dependency graph connecting a the categories/words in the sentence is not required. Moreover, dependency graphs may contain cyces: certain categories may have dependency reations with more than one component. This is the case, for exampe, in reative constructions: the reative pronoun depends on the main verb of the construction (a compementation reation with the verb of the reative, regardess whether it is the subject, direct object, or indirect object). But it is aso a dependent of the noun that it modifies. In PG, a cyce may aso exist between two categories. Again, this is the case in the reative construction, between the verb and the reative pronoun. The reative pronoun is a compement of the main verb of the reative. It is aso the target of the dependency reation originating from the verb. This reation indicates that the verb (and its dependencies) wi pay a roe in estabishing the sense of the reative construction. In this case, the dependency reation remains generic (at the higher eve of the type hierarchy). The dependency properties of the reative construction stipuate: W hp [comp] comp V ; W hp mod N; V dep W hp (20) It shoud be noted that the dependency reation between WhP and V bears the comp type. 
This generic type wi be specified in the grammar by one of its sub-types sub j, ob j or iob j, each generating [ 15 ]

17 Phiippe Bache Figure 3: Characteristics of the dependency reation Antisymmetric: Antirefexive: Antitransitive: if A x B, then B x A if A B, then A B if A x B and if B x C then A x C different properties (in particuar excusion) for the reative. The foowing schema iustrates an exampe of a reative construction, with the particuarities of having a doube dependency for the W hp, and the cyce W hp-v : spec mod spec subj The repor ter who the senator at tacked (21) As we can see, the dependency graph in PG (as with the other properties) is not necessariy connected or cyce-free. Figure 3 summarizes the main characteristics of the dependency reation. Note that these reations are stipuated taking into account the precise type of the dependency reations: they are true ony for a given type, but not as a genera rue. For exampe, a symmetric compementation reation can not exist (if A is a compement of B, then B can not be a compement of A). However, a cyce can appear when the dependency types are different (as seen above for V W hp dependencies). Apart from the type-based restrictions, properties are identica to those found in dependency grammars. One important features in PG is that the dependency graph is not necessariy connected and does not necessariy have a unique root. Furthermore, we can see that when two reaized categories (i.e. each corresponding to a word in the sentence) are inked by a property, they are usuay in a dependency reation, directy or otherwise. Formay speaking, this characteristic can be expressed as foows: Let a reation expressing a PG property, et x, y and z categories: obj dep If x y, then x y y x [ z such that x z y z] (22) Finay, dependency reations comprise two key constraints, ruing out some types of dua dependencies: [ 16 ]

A given category cannot have the same type of dependency with several categories:

If $x \rightsquigarrow_{dep_i} y$, then $\nexists z$ such that $y \neq z \land x \rightsquigarrow_{dep_i} z$ (23)

Example: $Pro_i \rightsquigarrow_{subj} V_j$; $Pro_i \rightsquigarrow_{subj} V_k$
The same pronoun cannot be the subject of two different verbs.

A given category cannot have two different types of dependencies with the same category:

If $x \rightsquigarrow_{dep_i} y$, then $\nexists\, dep_j \neq dep_i$ such that $x \rightsquigarrow_{dep_j} y$ (24)

Example: $Pro_i \rightsquigarrow_{obj} V_j$; $Pro_i \rightsquigarrow_{subj} V_j$
A given pronoun cannot simultaneously be the subject and the object of a given verb.

Note that such restrictions apply to dependencies at the same level in the dependency type hierarchy. In the above example, this is the case for subj and obj: such a dual dependency cannot exist. Also note that these constraints do not rule out licit double dependencies such as those encountered in control phenomena (the same subject is shared by two verbs) or in the case of the relative pronoun, which is both the modifier of a noun and the complement of the verb of the relative:

$WhP \rightsquigarrow_{comp} V$; $WhP \rightsquigarrow_{mod} N$ (25)

In this case, the relation types represent dependencies from both inside and outside the relative clause.

2.6 A comprehensive example

Each property as defined above corresponds to a certain type of syntactic information. In PG, describing the syntactic units or linguistic phenomena (chunks, constructions) in the grammar consists in gathering all the relevant properties into a set. Table 1 summarizes the properties describing the nominal construction. In this approach, a syntactic description, instead of being organized around a specific structure (for example a tree), consists in a set of independent (but interacting) properties together with their status (satisfied or violated). The graph in the figure below illustrates the PG description of the nominal construction: The most interesting book of the library.

Table 1: Properties of the nominal construction
$Det \prec \{Det, Adj, WhP, Prep, N\}$
$N \prec \{Prep, WhP\}$
$Det \Rightarrow N[com]$
$\{Adj, WhP, Prep\} \Rightarrow N$
$Det \rightsquigarrow_{spec} N$
$Adj \rightsquigarrow_{mod} N$
$WhP \rightsquigarrow_{mod} N$
$Prep \rightsquigarrow_{mod} N$
$Uniq = \{Pro, Det, N, WhP, Prep\}$
$Pro \otimes \{Det, Adj, WhP, Prep, N\}$
$N[prop] \otimes Det$

[Property graph of "The most interesting book of the library": the words are the nodes; the edges carry the property labels, including the dependency types spec, mod and comp.] (26)

In PG, a syntactic description is therefore the graph containing all the properties of the grammar that can be evaluated for the sentence to be parsed. As illustrated in the example, this property graph represents explicitly all the syntactic characteristics associated with the input; each is represented independently of the others.

3 Bringing constructions into Property Grammars

A construction is defined as the convergence of several properties. For example, the ditransitive construction is, among other features, characterized by the fact that the argument roles are filled by two nominal objects in a specific order. The first step towards the recognition of a construction consists in identifying such basic properties. At this stage, no other process but the spotting of the properties needs to be used. This means that all properties should be identified directly and independently of the rest of the grammar. For example, in the case of the ditransitive construction, this consists in identifying the linear order between the nominal objects. The issue, then, is to describe such local and basic properties, without relating them to any higher-level information. As a consequence, we propose a representation in which all properties are self-contained (as presented in the previous section) in the sense that their evaluation should not depend on the recognition of other elements or structure. However, the two classical means of representing syntactic information (constituency or dependency) consist either in structuring higher-level groups (phrases in the case of constituency-based grammars) or in assigning a specific role to the head in the definition of a branching structure (in the case of dependency grammars). In this section, we explore in greater detail these aspects and their consequences when trying to represent basic properties directly. Our analysis is built around three questions: the notion of syntactic group, the status of the head, and the kind of information to be encoded in the lexicon for the representation of basic properties.

3.1 Constructions as sets of properties

Constituency-based approaches rely on the definition of syntactic properties in terms of belonging: a syntactic object is first characterized by its set of constituents. This approach offers several advantages in describing the distributional properties of syntactic groups, for example. Moreover, it constitutes a direct framework for controlling the scope of local properties (such as linearity or co-occurrence restriction): they are valid within a domain (a phrase). Using this notion of domain proves interesting for constraint-based frameworks in which a phrase is described by a set of categories to which several constraints apply (offering a direct control of the scope of constraints). However, such an approach requires the organization of syntactic information into two separate types, forming two different levels: on the one hand, the definition of the domain (the set of categories, the phrase) and, on the other hand, their linguistic properties. In terms of representation (in the grammar), this means giving priority to the definition of the domain (the identification of the set of constituents, for example by means of rules or schemas).
The constraints come on top of this first level, adding more information. In terms of parsing, the strategy also follows this dual-level organization: first recognizing the set of categories (for example Det, N, Rel, ... for the NP), then evaluating constraint satisfaction. The problem with this organization is that it gives priority to a certain type of information, namely constituency, that is motivated by operational matters (the representation and the construction of the syntactic structure) more than by linguistic considerations: sisterhood in itself does not provide much syntactic knowledge or, more precisely, is too vague in comparison with the syntactic properties binding two categories (such as co-occurrence, restriction, dependency, etc.). Moreover, this organization constitutes a severe drawback: a linguistic description is only possible when the first level (identification of the set of categories) is completed. In other words, it is necessary to build a phrase before being able to evaluate its properties. This approach does not fit with the notion of construction for several reasons. First, a construction is not necessarily composed of adjacent constituents. A constituency-based grammar cannot handle such objects directly. Moreover, constructions can be formed with a variable structure (elements of varying types, non-mandatory elements, etc.), due to the fact that they encode a convergence of different sources of information (phonology, morphology, semantics, syntax, etc.). An organization in terms of constituents relies on a representation driven by syntax, which renders impossible a description in terms of interaction of properties and domains as is the case with construction-based approaches. Our goal is to integrate a multi-domain perspective, based on a description in terms of constructions, that is capable of dealing with any kind of input (including ill-formed or non-canonical realizations). We propose a representation of the linguistic information in terms of properties that are all at the same level. In other words, all information needs to be represented in the same manner, without any priority given to one type of information over another. No domain, set of categories or phrase should be built before being able to describe the linguistic characteristics of an input: a linguistic property should be identified directly, independently of any other structure. As a consequence, properties need to be represented as such in the grammar (i.e. independently of any notion of constituency) and used directly during parsing (i.e. without needing to build a set of categories first). This goal becomes possible provided that the scope of the property is controlled. One way to do this consists in specifying precisely the categories in relation. Two types of information can be used with this perspective: the specification of certain features (limiting the kinds of objects to which the property can be applied), and the use of an HPSG-like category index (making it possible to specify when two categories from two properties refer to the same object).

As such, integrating the notion of construction should not make use of the notion of constituency but rather favour a description based on direct relations between words (or lexical categories). Thus, we fall in line with a perspective that is akin to dependency grammars, except for the fact that we intend to use a larger variety of properties to describe the syntax and not focus exclusively on dependency. In the remainder of this section we will present a means of representing constructions using only such basic properties.

3.2 The question of heads: to have them or not (to have them)?

The notion of head plays a decisive role in most linguistic theories: syntax is usually described in terms of government or dependency between a head and its dependents. In constituency-based grammars, the head bears a special relation to its projection (the root of the local tree it belongs to). In dependency grammars, a head is the target of the relations from the depending categories. The role of the head can be even more important in lexicalized theories such as LFG or HPSG. In this case, the head is also an operational element in the construction of the syntactic structure: it represents the site through which all information (encoded by features) percolates. All exocentric syntactic relations (between a phrase constituent and another component outside this phrase) are expressed as feature values which, as a result of a number of principles, move from the source constituent to the target, passing through the head. A direct consequence is that when heads play a central role, syntactic information needs to be represented in a strictly hierarchical manner: as the head serves as a gateway, it is also a reduction point from which all information relating to the head's dependents may be accessed.
Such a strict hierarchical conception of syntax has a formal consequence: the syntactic structure must be represented as a hierarchical (or tree-like) structure in which every component (word, category, phrase, etc.) is dependent on a higher-level element. Such a syntactic organization is not suited to the description of many phenomena that we come across in natural language. For example, many constructions have no overt head:

(2) a. John sets the red cube down and takes the black.
    b. First trip, New York.
    c. Monday, washing, Tuesday, ironing, Wednesday, rest.

Example (2a) presents a classical elision as part of a conjunction: the second NP has no head. This is also the case in the nominal sentences in examples (2b) and (2c), which correspond to binary structures where each nominal component holds an argument position (from the semantic point of view) without a head being realized. We already gave some elements for the analysis of non-headed constructions in the second section. In the case of the last two examples, little information can be given at the syntactic level; it mainly comes from the interaction of morphology, prosody and discourse. The solution in PG (not developed in this paper) consists in implementing interaction constraints for controlling the alignment of properties coming from the different domains (Blache and Prévot 2010). This raises the issue of structures that can be adapted to the representation of linguistic relations outside the head/dependent relation. The example of collective nouns in French illustrates such a situation:

(3) a. un ensemble de catégories (a set of categories)
    b. *un ensemble des catégories (a set of-plu categories)
    c. l'ensemble de catégories (the set of categories)
    d. l'ensemble des catégories (the set of-plu categories)

If a collective noun is specified by an indefinite determiner, then the complex category preposition-determiner de ("of"), which in this case is a partitive, can only be used in its singular form. This construction is controlled by the exclusion property:

$Det[ind] \otimes \{Prep{+}Det[plu]\}$ (27)

Inside a nominal construction with a collective noun, we have a direct constraint between the type of determiner (definite or indefinite) and the preposition agreement feature without any mediation of the head. In order to be complete, this property has to be restricted to those determiners specifying a collective noun. This is implemented by a co-indexation mechanism between categories that will be described later on in the paper.
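As an illustration, property (27) can be checked mechanically on the examples in (3). The following is only a minimal sketch in Python, not part of the PG formalism: the `Category` class, the helper `cat`, the function `exclusion_holds` and the feature names (`ind`, `def`, `sing`, `plu`, `coll`) are assumptions introduced for the example, and the restriction to determiners specifying a collective noun (the co-indexation mentioned above) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """A realized category: a label plus a set of feature values (hypothetical encoding)."""
    label: str
    features: frozenset = frozenset()

    def matches(self, label, required):
        # A pattern matches when the labels are equal and every required
        # feature value is present on the realized category.
        return self.label == label and set(required) <= self.features


def cat(label, *features):
    """Small helper to build a realized category."""
    return Category(label, frozenset(features))


def exclusion_holds(categories, left, right):
    """Exclusion is satisfied unless both patterns are realized together."""
    has_left = any(c.matches(*left) for c in categories)
    has_right = any(c.matches(*right) for c in categories)
    return not (has_left and has_right)


# Property (27): Det[ind] excludes Prep+Det[plu].
DET_IND = ("Det", {"ind"})
PREP_DET_PLU = ("Prep+Det", {"plu"})

# (3a) un ensemble de categories
phrase_3a = [cat("Det", "ind"), cat("N", "coll"),
             cat("Prep+Det", "sing"), cat("N", "plu")]
# (3b) *un ensemble des categories
phrase_3b = [cat("Det", "ind"), cat("N", "coll"),
             cat("Prep+Det", "plu"), cat("N", "plu")]
# (3d) l'ensemble des categories
phrase_3d = [cat("Det", "def"), cat("N", "coll"),
             cat("Prep+Det", "plu"), cat("N", "plu")]

print(exclusion_holds(phrase_3a, DET_IND, PREP_DET_PLU))  # True
print(exclusion_holds(phrase_3b, DET_IND, PREP_DET_PLU))  # False
print(exclusion_holds(phrase_3d, DET_IND, PREP_DET_PLU))  # True
```

The check never consults the head noun: the constraint is stated directly between the determiner and the preposition-determiner category, which is the point made by the example.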
Generally speaking, the head plays a fundamental role in specifying the sub-categorization or the argument structure. It is not, however, necessary to give it an operational role when constructing the syntactic structure. We shall see that the head, even with no specific role, can be identified simply as being the category to which all dependency relations converge.

3.3 The structure of lexical entries

As in unification grammars, the lexical information is highly important. Nonetheless, the lexicalization of syntactic information (emphasized in theories such as LFG or HPSG) is more limited in PG. In particular, the lexicon does not play a direct role in the construction of the syntactic structure; rather, all information is borne by the properties. Lexical information, although rich, is only used on the one hand to control the scope of the properties (as described above) and on the other hand to instantiate the subcategorization or the specific dependencies that one category can have with others. In general, a lexical entry is associated with an attribute-value matrix which basically contains the category, agreement, morphosyntactic features, sub-categorization list and grammatical function (when relevant). This structure can be enriched with other features, for example those describing semantics, phonology, etc. It can also be completed depending on the category, with more specific information such as mood, tense, person, or the valence feature that gives the list of arguments required. Figure 4 summarizes the main features of nominal and verbal categories. It represents a type hierarchy in which the subtypes inherit appropriate features from the higher-level types.

Figure 4: Inheritance in nominal and verbal categories

The most general type, cat, comprises features appropriate to the description of all categories: the category label as well as the description of its dependency with other categories. This relation is described by the type of the dependency and the target value of the relation. In the above example, the lower-level subtypes describe the features appropriate to N and V: both categories take agreement. Moreover, the verb has an argument structure which specifies its valence as well as its form attributes. As for the noun, it is associated with case features.

3.4 The role of features

Properties are relations between two lexical categories (that may potentially have other dependencies). For example, a linear property such as $V \prec N[obj]$ indicates that the verb precedes the direct object. This relation holds regardless of the other dependency relations of V and N. However, in this example, specifying the function value is mandatory: without it, the property would not be valid ($V \prec N$ is not licit as such in English). The instantiation of feature values of a category involved in a property reduces its definition domain and, as a side effect, the scope of the property. Moreover, with all properties being independent of each other, it is necessary to provide as much information as possible to identify precisely the categories to be linked. Representing a property in this way renders it absolute, in the manner of Optimality Theory (Prince and Smolensky 1993), in which all constraints are universal. In this approach, a property can be evaluated directly, without needing any knowledge of the context or the rest of the syntactic structure. This condition is imperative when trying to consider a grammar as a set of properties. We present two series of examples illustrating how feature instantiation helps in controlling the application of a property.

Control by feature values. The specification of feature values in properties can be used in order to describe certain phenomena directly.
For example, the argument structure can be described by means of linearity and dependency properties, assigning subcategorization and case feature values:

$V \Rightarrow N[subj]$; $V[trans] \Rightarrow N[obj]$; $V[intrans] \otimes N[obj]$; $V[ditrans] \Rightarrow N[iobj]$ (28)

Likewise, the different possible constructions of the relative in French can be described by specifying the case of the relative pronoun:

$WhP[nom] \otimes N[subj]$; $WhP[acc] \otimes N[obj]$; $WhP[nom] \rightsquigarrow_{subj} V$; $WhP[acc] \rightsquigarrow_{obj} V$ (29)

These properties stipulate that the nominative relative pronoun qui ("who") excludes the possibility of realizing a subject within the relative construction and specifies a subject-type dependency relation between the relative pronoun and the verb. The same type of restriction is specified for the accusative pronoun que ("which") and could also be extended to the dative pronoun dont ("of which" / "of whom"). These properties implement the long-distance dependency between WhP and the gap in the argument structure of the main verb.

Control by co-indexation. We illustrate here the possibility of controlling the application of properties thanks to the co-indexation of the categories involved in different properties. The following example describes the relative order between Prep and N, which is governed by the type of construction in which they are involved: the preposition precedes the noun in a prepositional construction whereas it follows it in a nominal one. Table 2 presents a first description of these different cases, illustrated with an example.

Table 2: Relative ordering depending on the construction
Construction     Properties                                            Example
Prepositional    $Prep \prec N$; $N \rightsquigarrow_{xcomp} Prep$     on the table
Nominal          $N \prec Prep$; $Prep \rightsquigarrow_{mod} N$       the book on...

As such, it is necessary to specify the linearity and dependency properties between Prep and N according to the construction they belong to. In order to distinguish between these two cases, we specify the syntactic functions. The following feature structures specify the dependency features of N, illustrating here the cases of the N subject of a V or complement of a Prep: (30)

Using this representation, the distinction between the two cases of dependency between N and Prep relies on the specification of the function and target features of the categories (Figure 5). Moreover, co-indexation makes it possible to link the properties. These properties stipulate an order and a dependency relation; these are determined by the syntactic roles. In a nominal construction, the noun precedes the prepositional construction that modifies it, whereas the preposition precedes the noun in the other construction. Two classical mechanisms, based on unification, are used in these properties: first, the specification of the dependency attribute controls the application of the properties (the N following Prep is its complement, the Prep that follows N modifies it). Moreover, index unification (marked by the use of the same index i in the previous examples) ensures that the category is identical across all relations: the co-indexation of the categories in the different properties imposes a reference to the same object.

Figure 5: Features specification in properties

4 Representing and processing constructions

Syntactic information is usually defined with respect to a specific domain (a set of categories). For example, the precedence property between Det and N only makes sense within a nominal construction. The following example illustrates this situation, showing the possible relations corresponding to the linearity property $Det \prec N$. These relations are represented regardless of any specific domain (i.e. between all the determiners and nouns of the sentence). Same-category words are distinguished by different indexes:

$Det_1$ The   $N_1$ man   $V$ reads   $Det_2$ the   $N_2$ book (31)

In this example, the relation $Det_1 \prec N_2$ connects two categories that clearly do not belong to the same domain. More generally, the subsets of categories $\{Det_1, N_1\}$ and $\{Det_2, N_2\}$ form possible units, unlike $\{Det_1, N_2\}$. The problem is that, as explained in the previous section, properties need to be assessed and evaluated independently of any a priori knowledge of a specific domain: a property in the grammar is not specifically attached to a set of categories (a phrase or a dependent). However, linguistic description relies mainly on the identification of local phenomena that correspond to the notion of construction as specified in Construction Grammars (Fillmore 1988). It is, therefore, necessary to propose an approach fulfilling both requirements: the representation of properties independently and the description of local phenomena as sets of properties. We propose to examine two perspectives: one concerning the grammatical representation and the other the question of parsing. The first perspective leads to a definition of constructions in terms of an interaction of properties. The latter presents the mechanisms for recognizing a construction on the basis of topological characteristics of the property graph (representing the set of evaluated properties).

4.1 In grammar: construction = set of properties

Grammars organize syntactic information on the basis of structures to which different relations can be applied. In phrase-structure grammars, the notion of phrase implicitly comprises the definition of a domain (the set of constituents) in which the relations are valid. This notion of domain also exists in theories like HPSG, using generic tree schemas that are completed with the subcategorization information borne by lexical entries (both pieces of information together effectively correspond to the notion of constituency).
Dependency grammars, in contrast, integrate syntactic information in the dependency relation between a head and its dependents. In both cases, the question of the scope of syntactic relations relies on the topology of the structures: a relation is valid inside a local tree. Therefore, a domain typically corresponds to a set of categories that share common properties. Our approach relies on a decentralized representation of syntactic information by means of relations that can be evaluated independently of the entire structure. In other words, any property can be assessed alone, without needing to evaluate any other. For example, the assessment of linearity between two categories is done without taking into account any other information such as subcategorization. In this case, we can evaluate the properties of a construction without having to create a syntactic tree: PG is based on a dynamic definition of the notion of construction. This means that all properties are assessed separately, a construction being the set of independently evaluated properties.⁵ In Construction Grammars, a construction is defined by the interaction of relations originating from different sources (lexical, syntactic, semantic, prosodic, etc.). This approach makes it possible to describe a wide variety of facts, from lexical selection to syntactico-semantic interactions (Goldberg 2003; Kay and Fillmore 1999; Lambrecht 1995). A construction is then understood as a linguistic phenomenon that comprises syntactic units as well as other types of structures such as multi-word expressions, specific turns, etc. The notion of construction is, therefore, more general than that of syntactic unit and not necessarily based on a structured representation of information (e.g. a tree). PG provides an adequate framework for the representation of constructions. First, a syntactic description is the interaction of several sources of information and properties. Moreover, PG is a constraint-based theory in which each piece of information corresponds to a constraint (or property). The description of a construction in a PG grammar is a set of properties connecting several categories.
This definition gives priority to the relations instead of their arguments, which means that prior definition of the set of constituents involved in the construction is not necessary.⁶ As a consequence, the notion of constraint scope is not directly encoded: each property is specified independently and the grammar is a set of constructions, each described by a set of properties. The following example illustrates the encoding of the ditransitive construction, focusing on the relation between the type of categories (N or Prep), their linear order and their function:

(1) $V[ditrans] \Rightarrow N[obj]$
(2) $V[ditrans] \Rightarrow X[iobj]$
(3) $N[iobj] \prec N[obj]$
(4) $N[obj] \prec Prep[iobj]$
(5) $N[obj] \rightsquigarrow_{obj} V[ditrans]$
(6) $N[iobj] \rightsquigarrow_{iobj} V[ditrans]$
(7) $Prep[iobj] \rightsquigarrow_{iobj} V[ditrans]$

The first two properties, which are co-occurrence properties, stipulate that the ditransitive verb governs a nominal object plus an indirect object of unspecified category encoded by X (that could be, according to the rest of the properties, either a nominal or a prepositional construction). Linearity properties stipulate that in the case of a double nominal construction, the nominal indirect object should precede the direct object. Otherwise, the direct object precedes the indirect prepositional construction. Finally, the dependency relations instantiate, according to their function, the type of the dependency with the verb.

⁵ A direct implementation of this mechanism consists in assessing all the possible properties, for all the combinations of words/categories, which is exponential. Different possibilities of controlling this complexity exist, such as delayed evaluation or probabilistic selection.
⁶ In previous versions of PG, all categories belonging to a construction were indicated in a list of constituents.

4.2 In analysis: construction = government domain

The theoretical and naïve parsing principle in PG consists in evaluating all properties that may exist between all categories corresponding to the words in a sentence. This set of properties contains considerable noise: most of the properties evaluated in this way link categories which do not belong to the same domain. The issue is to elicit the constructions existing in this set. Concretely, the set of properties forms a graph in which the connected categories may correspond to a construction. In the following, we put forward a formal characterisation of the notion of construction in terms of graph topology.
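The noise produced by the naïve strategy can be made concrete on example (31). The sketch below, given only as an illustration, evaluates the single linearity property $Det \prec N$ between every determiner and every noun of the sentence, with no notion of domain. The function name `evaluate_linearity` and the encoding of the sentence as (word, category) pairs are assumptions made for the example.

```python
from itertools import product

# Sentence (31) as (word, category) pairs, in surface order.
SENTENCE = [("The", "Det"), ("man", "N"), ("reads", "V"),
            ("the", "Det"), ("book", "N")]


def evaluate_linearity(sentence, first, second):
    """Evaluate `first` precedes `second` for every pair of realized categories.

    Returns (satisfied, violated): two lists of (i, j) position pairs, where
    position i holds a `first` category and position j a `second` category.
    """
    firsts = [i for i, (_, c) in enumerate(sentence) if c == first]
    seconds = [j for j, (_, c) in enumerate(sentence) if c == second]
    satisfied, violated = [], []
    for i, j in product(firsts, seconds):
        # The property holds for the pair when the first category comes earlier.
        (satisfied if i < j else violated).append((i, j))
    return satisfied, violated


satisfied, violated = evaluate_linearity(SENTENCE, "Det", "N")
print(satisfied)  # [(0, 1), (0, 4), (3, 4)]
print(violated)   # [(3, 1)]
```

Of the four evaluated pairs, only (0, 1) and (3, 4) link categories of the same nominal construction; (0, 4) is the relation $Det_1 \prec N_2$ discussed for (31), and (3, 1) is a violation that says nothing about the well-formedness of the sentence. Both are noise that the identification of constructions has to filter out.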
Generally speaking, two types of properties can be distinguished, based on the number of categories they involve:

Binary properties, where two categories are connected: linearity, dependency, co-occurrence
Unary properties: uniqueness, exclusion

Unary relations, because of their specificity, do not have any features that may be used to identify the construction. On the contrary, the three types of binary properties are the basis of the domain identification mechanism. The following graph illustrates the characterization of the sentence "A very old book is on the table.":

A/Det   very/Adv   old/Adj   book/N   is/V   on/Prep   the/Det   table/N (32)
[Property graph: edges between these categories are labelled with the binary properties that hold between them.]

It is noteworthy that in this graph, it is possible to identify several subgraphs in which all the categories are interconnected. Formally, they are referred to as being complete: a complete graph is a graph in which every pair of nodes is connected. In this example, the nodes labelled Adv and Adj form a complete subgraph: both categories are connected. On the other hand, the set of categories {Det, Adv, Adj} does not form a complete subgraph, the Det and Adv categories being disconnected. Furthermore, when eliciting a construction, it is necessary to take into account all the categories of a same constraint network. For example, the Adj and N nodes could form a complete subgraph, but it would be a subset of the larger complete subgraph {Det, Adj, N}. As a consequence, we only take into consideration maximal complete subgraphs. The maximal complete subgraphs in the previous example correspond to the following subsets of nodes (Figure 6), with each of which we have associated a construction type.

Figure 6: Constructions corresponding to maximal complete subgraphs
{Adv, Adj}: Adjectival construction
{Det, Adj, N}: Nominal construction
{N, V}: Subject/verb construction
{V, Prep}: Verb/indirect object construction
{Prep, N}: Prepositional construction
{Det, N}: Nominal construction

As such, based on graph topology, we can identify constructions, for which the following definition can be given:

Definition: A construction is a maximal complete subgraph of the property graph.
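Maximal complete subgraphs are what graph theory calls maximal cliques, so the definition can be tried out with a standard clique enumeration. The sketch below uses the basic Bron-Kerbosch algorithm (without pivoting), which is one classical choice and not necessarily the procedure used in PG parsers. The edge list is an assumption: it is the smallest undirected graph consistent with the subsets of Figure 6, with the two determiners and the two nouns of the sentence distinguished by indexes.

```python
def build_adjacency(edges):
    """Turn a list of undirected edges into a node -> set-of-neighbours map."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def maximal_cliques(adjacency):
    """Enumerate all maximal complete subgraphs (Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(clique, candidates, excluded):
        # No node can extend the clique and none was already explored:
        # the current clique is maximal.
        if not candidates and not excluded:
            cliques.append(frozenset(clique))
            return
        for node in list(candidates):
            expand(clique | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


# Assumed edges for "A very old book is on the table." (one edge per pair of
# categories linked by at least one binary property).
EDGES = [("Adv", "Adj"), ("Det1", "Adj"), ("Det1", "N1"), ("Adj", "N1"),
         ("N1", "V"), ("V", "Prep"), ("Prep", "N2"), ("Det2", "N2")]

constructions = maximal_cliques(build_adjacency(EDGES))
for nodes in sorted(constructions, key=lambda c: sorted(c)):
    print(sorted(nodes))
```

On this graph the enumeration returns six subsets, matching the six lines of Figure 6; {Adj, N1} is not reported because it is contained in {Det1, Adj, N1}.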

Concretely, these subsets correspond to syntactic units. Yet, where classical approaches rely on the definition of constructions a priori in the grammar, this definition proposes a dynamic and a posteriori description. This is fundamental: it makes it possible to describe any type of sentence, regardless of its grammaticality. Analyzing a sentence consists in interpreting the property graph. This structure may contain constructions that lead directly to a semantic interpretation. But it can also be the case that the property graph contains subparts that are not necessarily connected with the rest of the sentence. This situation occurs with ungrammatical sentences. At this stage, exhibiting the set of relevant constructions for the description of a sentence consists in identifying, among the set of maximal complete subgraphs, those that cover the set of words: in the optimal case, the set of nodes of the exhibited constructions corresponds to the set of words in the sentence. Note that in theory, constructions can overlap, which means that the same node could belong to different constructions. This characteristic is useful when combining different domains of linguistic description, including prosody, discourse, etc. However, when studying a single domain, for example syntax, it is useful to reduce overlapping: a category belonging to a construction can contribute to another construction provided it is its head. The task is therefore to exhibit the optimal set of constructions, covering the entire input.

5 Parsing by satisfying constraints

Parsing a sentence S consists in firstly determining and evaluating the set of properties relevant for the input and secondly in exhibiting the constructions. In the second stage, it is necessary to establish all the partitions⁷ of the sequence of categories that correspond to S. The issue is to know which parts correspond to a construction and whether an optimal partition exists.
In the first stage, an operational semantics describing conditions of satisfiability must be assigned to the properties. In this perspective, we introduce some preliminary notions:

⁷ A partition of S is a set of non-empty parts of S, pairwise disjoint, that together cover S.

Set of property categories: Let $p$ be a property. We define a function $Cat(p)$ building the set of categories contained in $p$. For example, $Cat(Det \prec N) = \{Det, N\}$.

Applicable properties: Given a grammar $G$ and a set of categories $C$, the set of $C$-applicable properties is the set of all the properties of $G$ in which the categories of $C$ appear. More specifically, a property $p$ is applicable when its evaluation becomes possible. Two types of properties can be distinguished: those requiring the realization of all the categories they involve (uniqueness, linearity and dependency) and those needing only one of their categories to be realized in order to be evaluated (co-occurrence and exclusion). As such, we have:

Definition: Let $p \in G$. If $p \in \{uniq, lin, dep\}$, $p$ is an applicable property for $C$ iff $\forall c \in Cat(p)$, $c \in C$. If $p \in \{cooc, excl\}$, $p$ is an applicable property for $C$ iff $\exists c \in Cat(p)$ such that $c \in C$.

Position in the string: We define a function $pos(c, C)$, returning the rank of $c$ in the category sequence $C$.

An operational semantic definition may be assigned to each property as in Figure 7 ($C$ being a set of categories). These definitions provide the conditions of satisfiability of the different properties.

Figure 7: Operational semantics of properties
Uniqueness: $Uniq_x$ holds in $C$ iff $\forall y \in C \setminus \{x\}$, $x \neq y$
Exclusion: $x \otimes y$ holds in $C$ iff $\forall z \in C$, $z \neq y$
Co-occurrence: $x \Rightarrow y$ holds in $C$ iff $C \cap \{y\} \neq \emptyset$
Linearity: $x \prec y$ holds in $C$ iff $pos(x, C) < pos(y, C)$

It now becomes possible to illustrate how the description of the syntactic structure can be built. The construction of the syntactic description (called the characterization) of a construction consists in evaluating the set of its applicable properties. In more general terms, parsing a sentence consists in evaluating all the relevant properties and then determining the corresponding constructions. Formally, let $S$ be the set of categories of a sentence to be parsed, let $Part_S$ be a partition of $S$, let $p$ be one subpart of $Part_S$, and let $Prop_p$ be the set of applicable properties of $p$. The categories belonging to the part $p$ are instantiated: their feature values, as determined by the corresponding lexical entries, are known insofar as they correspond to the words of the sentence to be parsed. The properties in $Prop_p$ stipulate constraints in which the categories are fully instantiated (by the unification of the categories of the properties in the grammar and those realized in the sentence). We define $Sat(Prop_p)$ as the constraint system formed by both the applicable properties and the state of their satisfaction after evaluation (true or false).

[Figure 8: Property graphs and their characterizations. First graph: "The very old book" (Det Adv Adj N); $P^+$ contains ten satisfied properties over the pairs (Det, Adj), (Det, N), (Adv, Adj) and (Adj, N), and $P^- = \emptyset$. Second graph: "very old the book" (Adv Adj Det N); $P^+$ contains the nine remaining properties and $P^- = \{Det \prec Adj\}$.]

Figure 8 presents two examples of nominal constructions along with their characterizations; the second example contains a linear constraint violation between Det and Adj. This example illustrates a key aspect of Property Grammars: their ability to describe an ill-formed sentence. Furthermore, in this description, in spite of the property violation, the nominal construction is characterized by a large number of satisfied constraints. This characteristic allows one to introduce a crucial element for usage-based grammars: compensation phenomena between positive and negative information. We know that constraint violation can be a source of difficulty for human or automatic processing. The idea is that the violation of some constraints can be compensated by the satisfaction of others. For example, the violation of a precedence constraint can be compensated by the satisfaction of co-occurrence and dependency constraints. PG offers the possibility to quantify these compensation effects, on the basis of complexity evaluation (Blache et al. 2006; Blache 2011).

One important question when addressing parsing is that of ambiguity. The problem is twofold: how to represent ambiguity and how to deal with it. With syntactic information being represented in terms of graphs, it is theoretically possible to represent different types of attachment at the same time. The property graph may contain two dependency relations of the same type, which are then mutually exclusive. The control of ambiguity resolution can be done classically, thanks to preference options implemented by property weights.

6 An application to treebanking

The use of treebanks offers a direct framework for the experimentation and the comparison of syntactic formalisms. Most treebanks have been developed using classical constituency or dependency-based representations. They then have to be adapted when studying more specific proposals. We present in this section an approach making it possible to extract properties from existing treebanks. Most of the properties presented in this paper can be extracted automatically under some conditions, following a method presented in Blache et al. (2016). This is in particular the case with linearity, uniqueness, co-occurrence and exclusion, on which we focus in this section. The first three properties can be inferred fully automatically; the last one has to be filtered manually after its automatic extraction. The mechanism consists of two steps:
1. Extraction of the implicit context-free grammar
2. Generation of the properties from the CFG
In order to validate the approach, we have tested the method on several treebanks that offer different representations. We first used a set of four large constituency-based treebanks: the Penn Treebank (Marcus et al. 1994) itself, the Chinese Treebank (Xue et al. 2010), the Arabic Treebank (Maamouri et al. 2003), and the French Treebank (Abeillé et al. 2003).
In a second stage, we have applied property extraction to the Universal Dependencies Treebank (Nivre et al. 2015). We offer a brief overview of this ongoing work below.
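The two steps listed above can be sketched on a toy example. The following Python fragment is only an illustration under stated assumptions, not the software used for the experiments: the bracketed input format, the function names (`parse_brackets`, `extract_rules`, `generate_properties`) and the small NP rule set are all hypothetical. Linearity is read as "a precedes b in some right-hand side and b never precedes a", requirement as "every right-hand side containing a also contains b" (a stricter reading would require both categories in every right-hand side), and exclusion as "a and b never appear in the same right-hand side".

```python
from collections import defaultdict
from itertools import combinations


def parse_brackets(text):
    """Parse a bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                          # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                          # skip ")"
        return (label, children)

    return node()


def extract_rules(tree, rules=None):
    """Step 1: turn every internal node into a rule LHS -> RHS."""
    if rules is None:
        rules = defaultdict(set)
    label, children = tree
    subtrees = [c for c in children if isinstance(c, tuple)]
    if subtrees:                          # skip purely lexical nodes
        rules[label].add(tuple(c[0] for c in subtrees))
    for sub in subtrees:
        extract_rules(sub, rules)
    return rules


def generate_properties(rhs_set):
    """Step 2: infer properties from all the RHS of one category."""
    cats = sorted({c for rhs in rhs_set for c in rhs})
    # (a, b) is in `before` when a occurs to the left of b in some RHS
    before = {(a, b) for rhs in rhs_set
              for i, a in enumerate(rhs) for b in rhs[i + 1:] if a != b}
    return {
        "lin": {(a, b) for (a, b) in before if (b, a) not in before},
        "uniq": {c for c in cats if all(rhs.count(c) <= 1 for rhs in rhs_set)},
        "req": {(a, b) for a in cats for b in cats if a != b
                and all(b in rhs for rhs in rhs_set if a in rhs)},
        "excl": {(a, b) for a, b in combinations(cats, 2)
                 if not any(a in rhs and b in rhs for rhs in rhs_set)},
    }


# Constituent tree of "Elle a dix-sept ans", written in an assumed bracketed format.
TREE = ("(SENT (NP:SUJ (Clit Elle)) (VP (VN (Verb a)) "
        "(NP:OBJ (Det dix-sept) (Noun ans))) (Pct .))")
rules = extract_rules(parse_brackets(TREE))

# Hypothetical RHS set for NP, chosen only to exercise all four properties.
np_props = generate_properties({("Det", "N"), ("Det", "Adj", "N"), ("Pro",)})
```

On a real treebank the rule sets would be accumulated over all trees before the second step is run, and the exclusion candidates would still need the manual filtering mentioned above, since "never observed together" overgenerates.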

6.1 Extracting the implicit CFG from a treebank

The extraction of a context-free grammar (CFG) from a constituency treebank is based on a simple method described in Charniak (1996). Each internal node of a tree is converted into a rule in which the left-hand side (LHS) is the root and the right-hand side (RHS) is the sequence of constituents. The implicit grammar is composed of the complete set of rules. Figure 9 shows the syntactic tree associated with the French sentence Elle a dix-sept ans ("She is seventeen"), together with the corresponding CFG rules.

Figure 9: Constituent tree and inferred CFG rules
Tree: [SENT [NP:SUJ [Clit Elle]] [VP [VN [Verb a]] [NP:OBJ [Det dix-sept] [Noun ans]]] [Pct .]]
Rules:
SENT → NP:SUJ VP Pct
NP:SUJ → Clit
VP → VN NP:OBJ
VN → Verb
NP:OBJ → Det Noun

We applied a similar approach to dependency treebanks. In this case, a root node (LHS of a rule) is a head, while the constituents (RHS) form its list of dependents, given in projection order, with the position of the head encoded by the symbol *. Figure 10 illustrates the dependency tree of the same sentence as in Figure 9 with the extracted CFG rules.

Figure 10: Dependency tree and inferred CFG rules
Tokens: Elle/Clit (SUJ), a/Verb (ROOT), dix-sept/Det (DET), ans/Noun (OBJ), ./Pct (PUNCT)
Rules:
Verb:ROOT → Clit:SUJ * Noun:OBJ Pct:PUNCT
Noun:OBJ → Det:DET *

6.2 Generating the properties

Using these grammars, it is straightforward to extract the properties that we consider in this experiment, which are described in Figure 11.

Figure 11: Implementation of the properties

Linearity: the precedence table is built by verifying, for each category preceding another category in a construction (or a right-hand side), whether this relation is valid throughout the set of constructions.
$\forall rhs_m \in RHS(XP)$: if $(\exists (c_i, c_j) \in rhs_m \mid c_i \prec c_j)$ and $(\nexists rhs_n \in RHS(XP) \mid (c_i, c_j) \in rhs_n \wedge c_i \succ c_j)$ then add $prec(c_i, c_j)$

Uniqueness: the set of categories that cannot be repeated in a right-hand side.
$\forall rhs_m \in RHS(XP)$, $\forall (c_i, c_j) \in rhs_m$: if $c_i \neq c_j$ then add $uniq(c_i)$

Requirement: identification of two categories that co-occur systematically in all constructions of an XP.
$\forall rhs_m \in RHS(XP)$: $bool \leftarrow ((c_i \in rhs_m) \wedge (c_j \in rhs_m))$; if $bool$ then add $req(c_i, c_j)$

Exclusion: when two categories never co-occur in the entire set of constructions, they are supposed to be mutually exclusive; this is a strong interpretation, and it causes us to overgenerate such constraints, but it is the only way to identify this phenomenon automatically.
$\forall rhs_m \in RHS(XP)$: $bool \leftarrow \neg((c_i \in rhs_m) \wedge (c_j \in rhs_m))$; if $bool$ then add $excl(c_i, c_j)$

6.3 First results

The treebanks and the generated resources are serialized as XML; this facilitates editing and visualization. We have developed software to view the different types of information: treebanks, tagset, extracted grammar, rules, and properties. Each type of information is associated with a link to a corresponding example in the treebank. Figure 12 illustrates some properties of an NP extracted from the Chinese Treebank. In our interface, the left part of the window lists the set of categories of the grammar, together with frequency information. Nonterminals are hyperlinked to their syntactic description (corresponding PS-rules and properties). This information is displayed in the top right of the window. Each property (in this example Obligation and Uniqueness) comes with the set of rules from which it has been generated. Links to the different occurrences of the corresponding trees in the treebank are also listed. The lower right side of the window contains a graphical representation of the tree structure.

[Figure 12: Properties from the Chinese Treebank]

7 Conclusion

Describing linguistic phenomena by means of atomic, low-level, and independent properties makes it possible to join formal and descriptive linguistics. We are now in a position to propose a general account of language processing, capable of integrating the description of local phenomena into a global architecture and making it possible to benefit from the best of the descriptive and formal approaches.
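As a closing illustration of the characterization mechanism of Section 5 (select the applicable properties, then evaluate each one against the operational semantics of Figure 7), here is a minimal Python sketch. It rests on several assumptions that are not part of the paper: the names `Prop`, `applicable`, `holds` and `characterize` are hypothetical, the toy grammar is invented for the example and is not the grammar behind Figure 8, exclusion is read as "x and y are not both realized", and linearity compares the first occurrence of each category. Dependency is left out because Figure 7 gives no satisfaction condition for it.

```python
from typing import NamedTuple


class Prop(NamedTuple):
    kind: str     # "uniq", "lin", "cooc" or "excl"
    cats: tuple   # categories involved, e.g. ("Det", "N")


def applicable(p, C):
    """uniq/lin need all their categories realized, cooc/excl at least one."""
    present = [c in C for c in p.cats]
    return all(present) if p.kind in ("uniq", "lin") else any(present)


def holds(p, C):
    """Satisfaction conditions, following Figure 7 (see assumptions above)."""
    if p.kind == "uniq":
        return C.count(p.cats[0]) == 1
    x, y = p.cats
    if p.kind == "lin":
        return C.index(x) < C.index(y)
    if p.kind == "cooc":
        return y in C
    if p.kind == "excl":
        return not (x in C and y in C)
    raise ValueError(f"unknown property type: {p.kind}")


def characterize(grammar, C):
    """Return (P+, P-): satisfied and violated applicable properties."""
    C = list(C)
    p_plus, p_minus = [], []
    for p in grammar:
        if applicable(p, C):
            (p_plus if holds(p, C) else p_minus).append(p)
    return p_plus, p_minus


# Toy nominal grammar, invented for this example.
GRAMMAR = [
    Prop("lin", ("Det", "Adj")), Prop("lin", ("Det", "N")),
    Prop("lin", ("Adv", "Adj")), Prop("lin", ("Adj", "N")),
    Prop("uniq", ("Det",)), Prop("uniq", ("N",)),
    Prop("cooc", ("Det", "N")), Prop("cooc", ("Adj", "N")),
    Prop("cooc", ("Adv", "Adj")),
    Prop("excl", ("Pro", "N")),
]

ok_plus, ok_minus = characterize(GRAMMAR, ["Det", "Adv", "Adj", "N"])
bad_plus, bad_minus = characterize(GRAMMAR, ["Adv", "Adj", "Det", "N"])
```

With this toy grammar, the well-ordered sequence yields an empty set of violated properties, while the sequence in which Det follows Adj yields a single violated linearity property next to nine satisfied ones, which is the kind of positive and negative information a compensation measure would weigh.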

Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling

Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling Unsupervised Large-Vocabuary Word Sense Disambiguation with Graph-based Agorithms for Sequence Data Labeing Rada Mihacea Department of Computer Science University of North Texas rada@cs.unt.edu Abstract

More information

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions.

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions. to as a linguistic theory to to a member of the family of linguistic frameworks that are called generative grammars a grammar which is formalized to a high degree and thus makes exact predictions about

More information

Using Voluntary work to get ahead in the job market

Using Voluntary work to get ahead in the job market Vo_1 Vounteering Using Vountary work to get ahead in the job market Job Detais data: {documents}httpwwwopeneduopenearnocw_cmid4715_2014-08-21_14-34-17_ht2.xm user: ht2 tempate: ve_pdf job name: httpwwwopeneduopenearnocw_cmid4715_2014-08-

More information

Making and marking progress on the DCSF Languages Ladder

Making and marking progress on the DCSF Languages Ladder Making and marking progress on the DCSF Languages Ladder Primary anguages taster pack Year 3 summer term Asset Languages and CILT have been asked by the DCSF to prepare support materias to hep teachers

More information

Precision Decisions for the Timings Chart

Precision Decisions for the Timings Chart PPENDIX 1 Precision Decisions for the Timings hart Data-Driven Decisions for Performance-Based Measures within ssions Deb Brown, MS, BB Stanisaus ounty Office of Education Morningside Teachers cademy Performance-based

More information

Basic Syntax. Doug Arnold We review some basic grammatical ideas and terminology, and look at some common constructions in English.

Basic Syntax. Doug Arnold We review some basic grammatical ideas and terminology, and look at some common constructions in English. Basic Syntax Doug Arnold doug@essex.ac.uk We review some basic grammatical ideas and terminology, and look at some common constructions in English. 1 Categories 1.1 Word level (lexical and functional)

More information

Towards a MWE-driven A* parsing with LTAGs [WG2,WG3]

Towards a MWE-driven A* parsing with LTAGs [WG2,WG3] Towards a MWE-driven A* parsing with LTAGs [WG2,WG3] Jakub Waszczuk, Agata Savary To cite this version: Jakub Waszczuk, Agata Savary. Towards a MWE-driven A* parsing with LTAGs [WG2,WG3]. PARSEME 6th general

More information

Chapter 4: Valence & Agreement CSLI Publications

Chapter 4: Valence & Agreement CSLI Publications Chapter 4: Valence & Agreement Reminder: Where We Are Simple CFG doesn t allow us to cross-classify categories, e.g., verbs can be grouped by transitivity (deny vs. disappear) or by number (deny vs. denies).

More information

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.

More information

Proof Theory for Syntacticians

Proof Theory for Syntacticians Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax

More information

Control and Boundedness

Control and Boundedness Control and Boundedness Having eliminated rules, we would expect constructions to follow from the lexical categories (of heads and specifiers of syntactic constructions) alone. Combinatory syntax simply

More information

CS 598 Natural Language Processing

CS 598 Natural Language Processing CS 598 Natural Language Processing Natural language is everywhere Natural language is everywhere Natural language is everywhere Natural language is everywhere!"#$%&'&()*+,-./012 34*5665756638/9:;< =>?@ABCDEFGHIJ5KL@

More information

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Dr. Kakia Chatsiou, University of Essex achats at essex.ac.uk Explorations in Syntactic Government and Subcategorisation,

More information

Teachers response to unexplained answers

Teachers response to unexplained answers Teachers response to unexplained answers Ove Gunnar Drageset To cite this version: Ove Gunnar Drageset. Teachers response to unexplained answers. Konrad Krainer; Naďa Vondrová. CERME 9 - Ninth Congress

More information

The building blocks of HPSG grammars. Head-Driven Phrase Structure Grammar (HPSG) HPSG grammars from a linguistic perspective

The building blocks of HPSG grammars. Head-Driven Phrase Structure Grammar (HPSG) HPSG grammars from a linguistic perspective Te building blocks of HPSG grammars Head-Driven Prase Structure Grammar (HPSG) In HPSG, sentences, s, prases, and multisentence discourses are all represented as signs = complexes of ponological, syntactic/semantic,

More information

Construction Grammar. University of Jena.

Construction Grammar. University of Jena. Construction Grammar Holger Diessel University of Jena holger.diessel@uni-jena.de http://www.holger-diessel.de/ Words seem to have a prototype structure; but language does not only consist of words. What

More information

BULATS A2 WORDLIST 2

BULATS A2 WORDLIST 2 BULATS A2 WORDLIST 2 INTRODUCTION TO THE BULATS A2 WORDLIST 2 The BULATS A2 WORDLIST 21 is a list of approximately 750 words to help candidates aiming at an A2 pass in the Cambridge BULATS exam. It is

More information

Feature-Based Grammar

Feature-Based Grammar 8 Feature-Based Grammar James P. Blevins 8.1 Introduction This chapter considers some of the basic ideas about language and linguistic analysis that define the family of feature-based grammars. Underlying

More information

The Interface between Phrasal and Functional Constraints

The Interface between Phrasal and Functional Constraints The Interface between Phrasal and Functional Constraints John T. Maxwell III* Xerox Palo Alto Research Center Ronald M. Kaplan t Xerox Palo Alto Research Center Many modern grammatical formalisms divide

More information

Words come in categories

Words come in categories Nouns Words come in categories D: A grammatical category is a class of expressions which share a common set of grammatical properties (a.k.a. word class or part of speech). Words come in categories Open

More information

The presence of interpretable but ungrammatical sentences corresponds to mismatches between interpretive and productive parsing.

The presence of interpretable but ungrammatical sentences corresponds to mismatches between interpretive and productive parsing. Lecture 4: OT Syntax Sources: Kager 1999, Section 8; Legendre et al. 1998; Grimshaw 1997; Barbosa et al. 1998, Introduction; Bresnan 1998; Fanselow et al. 1999; Gibson & Broihier 1998. OT is not a theory

More information

Specification of a multilevel model for an individualized didactic planning: case of learning to read

Specification of a multilevel model for an individualized didactic planning: case of learning to read Specification of a multilevel model for an individualized didactic planning: case of learning to read Sofiane Aouag To cite this version: Sofiane Aouag. Specification of a multilevel model for an individualized

More information

"f TOPIC =T COMP COMP... OBJ

f TOPIC =T COMP COMP... OBJ TREATMENT OF LONG DISTANCE DEPENDENCIES IN LFG AND TAG: FUNCTIONAL UNCERTAINTY IN LFG IS A COROLLARY IN TAG" Aravind K. Joshi Dept. of Computer & Information Science University of Pennsylvania Philadelphia,

More information

ENGBG1 ENGBL1 Campus Linguistics. Meeting 2. Chapter 7 (Morphology) and chapter 9 (Syntax) Pia Sundqvist

ENGBG1 ENGBL1 Campus Linguistics. Meeting 2. Chapter 7 (Morphology) and chapter 9 (Syntax) Pia Sundqvist Meeting 2 Chapter 7 (Morphology) and chapter 9 (Syntax) Today s agenda Repetition of meeting 1 Mini-lecture on morphology Seminar on chapter 7, worksheet Mini-lecture on syntax Seminar on chapter 9, worksheet

More information

An Introduction to the Minimalist Program

An Introduction to the Minimalist Program An Introduction to the Minimalist Program Luke Smith University of Arizona Summer 2016 Some findings of traditional syntax Human languages vary greatly, but digging deeper, they all have distinct commonalities:

More information

An Interactive Intelligent Language Tutor Over The Internet

An Interactive Intelligent Language Tutor Over The Internet An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This

More information

Underlying and Surface Grammatical Relations in Greek consider

Underlying and Surface Grammatical Relations in Greek consider 0 Underlying and Surface Grammatical Relations in Greek consider Sentences Brian D. Joseph The Ohio State University Abbreviated Title Grammatical Relations in Greek consider Sentences Brian D. Joseph

More information

Writing a composition

Writing a composition A good composition has three elements: Writing a composition an introduction: A topic sentence which contains the main idea of the paragraph. a body : Supporting sentences that develop the main idea. a

More information

Designing Autonomous Robot Systems - Evaluation of the R3-COP Decision Support System Approach

Designing Autonomous Robot Systems - Evaluation of the R3-COP Decision Support System Approach Designing Autonomous Robot Systems - Evaluation of the R3-COP Decision Support System Approach Tapio Heikkilä, Lars Dalgaard, Jukka Koskinen To cite this version: Tapio Heikkilä, Lars Dalgaard, Jukka Koskinen.

More information

Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm

Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm syntax: from the Greek syntaxis, meaning setting out together

More information

Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque

Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque Approaches to control phenomena handout 6 5.4 Obligatory control and morphological case: Icelandic and Basque Icelandinc quirky case (displaying properties of both structural and inherent case: lexically

More information

Compositional Semantics

Compositional Semantics Compositional Semantics CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu Words, bag of words Sequences Trees Meaning Representing Meaning An important goal of NLP/AI: convert natural language

More information

On the Notion Determiner

On the Notion Determiner On the Notion Determiner Frank Van Eynde University of Leuven Proceedings of the 10th International Conference on Head-Driven Phrase Structure Grammar Michigan State University Stefan Müller (Editor) 2003

More information

Derivational and Inflectional Morphemes in Pak-Pak Language

Derivational and Inflectional Morphemes in Pak-Pak Language Derivational and Inflectional Morphemes in Pak-Pak Language Agustina Situmorang and Tima Mariany Arifin ABSTRACT The objectives of this study are to find out the derivational and inflectional morphemes

More information

THE VERB ARGUMENT BROWSER

THE VERB ARGUMENT BROWSER THE VERB ARGUMENT BROWSER Bálint Sass sass.balint@itk.ppke.hu Péter Pázmány Catholic University, Budapest, Hungary 11 th International Conference on Text, Speech and Dialog 8-12 September 2008, Brno PREVIEW

More information

Minimalism is the name of the predominant approach in generative linguistics today. It was first

Minimalism is the name of the predominant approach in generative linguistics today. It was first Minimalism Minimalism is the name of the predominant approach in generative linguistics today. It was first introduced by Chomsky in his work The Minimalist Program (1995) and has seen several developments

More information

Type Theory and Universal Grammar

Type Theory and Universal Grammar Type Theory and Universal Grammar Aarne Ranta Department of Computer Science and Engineering Chalmers University of Technology and Göteborg University Abstract. The paper takes a look at the history of

More information

Argument structure and theta roles

Argument structure and theta roles Argument structure and theta roles Introduction to Syntax, EGG Summer School 2017 András Bárány ab155@soas.ac.uk 26 July 2017 Overview Where we left off Arguments and theta roles Some consequences of theta

More information

Derivational: Inflectional: In a fit of rage the soldiers attacked them both that week, but lost the fight.

Derivational: Inflectional: In a fit of rage the soldiers attacked them both that week, but lost the fight. Final Exam (120 points) Click on the yellow balloons below to see the answers I. Short Answer (32pts) 1. (6) The sentence The kinder teachers made sure that the students comprehended the testable material

More information

Developing a TT-MCTAG for German with an RCG-based Parser

Developing a TT-MCTAG for German with an RCG-based Parser Developing a TT-MCTAG for German with an RCG-based Parser Laura Kallmeyer, Timm Lichte, Wolfgang Maier, Yannick Parmentier, Johannes Dellert University of Tübingen, Germany CNRS-LORIA, France LREC 2008,

More information

Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1

Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Basic Parsing with Context-Free Grammars Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Announcements HW 2 to go out today. Next Tuesday most important for background to assignment Sign up

More information

ELD CELDT 5 EDGE Level C Curriculum Guide LANGUAGE DEVELOPMENT VOCABULARY COMMON WRITING PROJECT. ToolKit

ELD CELDT 5 EDGE Level C Curriculum Guide LANGUAGE DEVELOPMENT VOCABULARY COMMON WRITING PROJECT. ToolKit Unit 1 Language Development Express Ideas and Opinions Ask for and Give Information Engage in Discussion ELD CELDT 5 EDGE Level C Curriculum Guide 20132014 Sentences Reflective Essay August 12 th September

More information

THE INTERNATIONAL JOURNAL OF HUMANITIES & SOCIAL STUDIES

THE INTERNATIONAL JOURNAL OF HUMANITIES & SOCIAL STUDIES THE INTERNATIONAL JOURNAL OF HUMANITIES & SOCIAL STUDIES PRO and Control in Lexical Functional Grammar: Lexical or Theory Motivated? Evidence from Kikuyu Njuguna Githitu Bernard Ph.D. Student, University

More information

Parsing of part-of-speech tagged Assamese Texts

Parsing of part-of-speech tagged Assamese Texts IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal

More information

Korean ECM Constructions and Cyclic Linearization

Korean ECM Constructions and Cyclic Linearization Korean ECM Constructions and Cyclic Linearization DONGWOO PARK University of Maryland, College Park 1 Introduction One of the peculiar properties of the Korean Exceptional Case Marking (ECM) constructions

More information

Constraining X-Bar: Theta Theory

Constraining X-Bar: Theta Theory Constraining X-Bar: Theta Theory Carnie, 2013, chapter 8 Kofi K. Saah 1 Learning objectives Distinguish between thematic relation and theta role. Identify the thematic relations agent, theme, goal, source,

More information

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and

More information

Ch VI- SENTENCE PATTERNS.

Ch VI- SENTENCE PATTERNS. Ch VI- SENTENCE PATTERNS faizrisd@gmail.com www.pakfaizal.com It is a common fact that in the making of well-formed sentences we badly need several syntactic devices used to link together words by means

More information

Grammars & Parsing, Part 1:

Grammars & Parsing, Part 1: Grammars & Parsing, Part 1: Rules, representations, and transformations- oh my! Sentence VP The teacher Verb gave the lecture 2015-02-12 CS 562/662: Natural Language Processing Game plan for today: Review

More information

Inleiding Taalkunde. Docent: Paola Monachesi. Blok 4, 2001/ Syntax 2. 2 Phrases and constituent structure 2. 3 A minigrammar of Italian 3

Inleiding Taalkunde. Docent: Paola Monachesi. Blok 4, 2001/ Syntax 2. 2 Phrases and constituent structure 2. 3 A minigrammar of Italian 3 Inleiding Taalkunde Docent: Paola Monachesi Blok 4, 2001/2002 Contents 1 Syntax 2 2 Phrases and constituent structure 2 3 A minigrammar of Italian 3 4 Trees 3 5 Developing an Italian lexicon 4 6 S(emantic)-selection

More information

Chapter 3: Semi-lexical categories. nor truly functional. As Corver and van Riemsdijk rightly point out, There is more

Chapter 3: Semi-lexical categories. nor truly functional. As Corver and van Riemsdijk rightly point out, There is more Chapter 3: Semi-lexical categories 0 Introduction While lexical and functional categories are central to current approaches to syntax, it has been noticed that not all categories fit perfectly into this

More information

Pseudo-Passives as Adjectival Passives

Pseudo-Passives as Adjectival Passives Pseudo-Passives as Adjectival Passives Kwang-sup Kim Hankuk University of Foreign Studies English Department 81 Oedae-lo Cheoin-Gu Yongin-City 449-791 Republic of Korea kwangsup@hufs.ac.kr Abstract The

More information

Greeley-Evans School District 6 French 1, French 1A Curriculum Guide

Greeley-Evans School District 6 French 1, French 1A Curriculum Guide Theme: Salut, les copains! - Greetings, friends! Inquiry Questions: How has the French language and culture influenced our lives, our language and the world? Vocabulary: Greetings, introductions, leave-taking,

More information

GERM 3040 GERMAN GRAMMAR AND COMPOSITION SPRING 2017

GERM 3040 GERMAN GRAMMAR AND COMPOSITION SPRING 2017 GERM 3040 GERMAN GRAMMAR AND COMPOSITION SPRING 2017 Instructor: Dr. Claudia Schwabe Class hours: TR 9:00-10:15 p.m. claudia.schwabe@usu.edu Class room: Old Main 301 Office: Old Main 002D Office hours:

More information

Switched Control and other 'uncontrolled' cases of obligatory control

Switched Control and other 'uncontrolled' cases of obligatory control Switched Control and other 'uncontrolled' cases of obligatory control Dorothee Beermann and Lars Hellan Norwegian University of Science and Technology, Trondheim, Norway dorothee.beermann@ntnu.no, lars.hellan@ntnu.no

More information

The College Board Redesigned SAT Grade 12

The College Board Redesigned SAT Grade 12 A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.

More information

SAMPLE. Chapter 1: Background. A. Basic Introduction. B. Why It s Important to Teach/Learn Grammar in the First Place

SAMPLE. Chapter 1: Background. A. Basic Introduction. B. Why It s Important to Teach/Learn Grammar in the First Place Contents Chapter One: Background Page 1 Chapter Two: Implementation Page 7 Chapter Three: Materials Page 13 A. Reproducible Help Pages Page 13 B. Reproducible Marking Guide Page 22 C. Reproducible Sentence

More information

cmp-lg/ Jul 1995

cmp-lg/ Jul 1995 A CONSTRAINT-BASED CASE FRAME LEXICON ARCHITECTURE 1 Introduction Kemal Oazer and Okan Ylmaz Department of Computer Engineering and Information Science Bilkent University Bilkent, Ankara 0, Turkey fko,okang@cs.bilkent.edu.tr

More information

Proposed syllabi of Foundation Course in French New Session FIRST SEMESTER FFR 100 (Grammar,Comprehension &Paragraph writing)

Proposed syllabi of Foundation Course in French New Session FIRST SEMESTER FFR 100 (Grammar,Comprehension &Paragraph writing) INTERNATIONAL COLLEGE FOR GIRLS SSFFSS,, GGUURRUUKKUULL MAARRGG,, MAANNSSAARROOVVAARR,, JJAAI IPPUURR DEPARTMENT OF FRENCH SYLLABUS OF FOUNDATIION COURSE FOR THE SESSIION 2009--10 1 Proposed syllabi of

More information

Constructions with Lexical Integrity *

Constructions with Lexical Integrity * Constructions with Lexical Integrity * Ash Asudeh, Mary Dalrymple, and Ida Toivonen Carleton University & Oxford University abstract Construction Grammar holds that unpredictable form-meaning combinations

More information

In Udmurt (Uralic, Russia) possessors bear genitive case except in accusative DPs where they receive ablative case.

In Udmurt (Uralic, Russia) possessors bear genitive case except in accusative DPs where they receive ablative case. Sören E. Worbs The University of Leipzig Modul 04-046-2015 soeren.e.worbs@gmail.de November 22, 2016 Case stacking below the surface: On the possessor case alternation in Udmurt (Assmann et al. 2014) 1

More information
