The Prague Dependency Treebank (and WS02)
1 The Prague Dependency Treebank (and WS02). Jan Hajič, Institute of Formal and Applied Linguistics, School of Computer Science, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
2 This Talk: an Overview. The Prague Dependency Treebank: the project; the 3 annotation layers: morphology, surface syntax (also: the lab), deep syntax. Use of the deep representation: machine translation; challenges for NL generation. 5/7/2002 PreWS02 Summer School 2
3 The Prague Dependency Treebank Project (Czech Language Treebank). PDT v. 0.5 released (JHU workshop): 400k words annotated, unchecked. 2001: PDT 1.0 released (LDC): 1.3MW annotated, morphology & surface syntax. 2004: PDT 2.0 release planned: 1.0MW annotated, underlying (deep) syntax (the tectogrammatical layer).
4 Annotation Layers. Morphology: tag (full morphology, 13 categories), lemma. Analytical layer (surface syntax): dependency, analytical function. Tectogrammatical layer (underlying syntax): dependency, functor (detailed), grammatemes, ellipsis solution, coreference, topic/focus (deep word order).
5 Morphological Annotation. 13 categories:
Category      # of values   Example(s)
POS           10            N (noun), Z (punctuation)
SUBPOS        75            P (personal pron.), U (possessive adj.)
GENDER        8             I (masc. inanimate), X (any), - (N.A.)
NUMBER        4             P (plural), D (dual)
CASE          9             1 (nominative), 6 (locative)
POSSGENDER    4             M (masc. animate), F (feminine)
POSSNUMBER    3             S (singular), P (plural)
PERSON        5             1 (first),...
TENSE         4             P (present), M (past)
GRADE         5             3 (superlative)
NEGATION      3             A (affirmative), N (negative)
VOICE         3             A (active), P (passive)
VAR           11            1 (1st variant), 6 (colloq. style), 8 (abbrev.)
6 Layer 1: Morphology. Tag: 13 categories. Example: AAFP3----3N---- ("(to) the most uninteresting"), read position by position: Adjective, Regular, Feminine, Plural, Dative, no poss. gender, no poss. number, no person, no tense, superlative, negated, no voice, reserve1, reserve2, base var. Lemma: unique identifier. Ex.: Books/verb -> book-1, went -> go, to/prep. -> To-1
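The positional reading of the example tag can be sketched in a few lines of Python. This is a minimal illustration, not the PDT tooling: the position names are taken from the category table and the slide's annotation of the example, and the function name `decode_tag` is made up here.

```python
# Position names as listed on the slides: 13 categories plus two reserved
# slots, with VAR last. The names are labels chosen for this sketch.
POSITIONS = [
    "POS", "SUBPOS", "GENDER", "NUMBER", "CASE",
    "POSSGENDER", "POSSNUMBER", "PERSON", "TENSE", "GRADE",
    "NEGATION", "VOICE", "RESERVE1", "RESERVE2", "VAR",
]


def decode_tag(tag):
    """Map a 15-character positional tag to {category: value}.

    Positions holding '-' (not applicable) are left out of the result.
    """
    if len(tag) != len(POSITIONS):
        raise ValueError(f"expected {len(POSITIONS)} positions, got {len(tag)}")
    return {name: value for name, value in zip(POSITIONS, tag) if value != "-"}


decoded = decode_tag("AAFP3----3N----")
print(decoded)
```

For the slide's example this keeps seven positions: adjective (POS and SUBPOS both `A`), feminine, plural, case 3 (dative), grade 3 (superlative), and negated.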
7 Layer 2: Analytical syntax. Surface, dependency-based representation. Every word gets a node, plus one (root). Interested in: dependency structure; analytical function: Pred, Sb, Obj, Adv, Atr, Atv, Pnom; AuxV, AuxP, AuxC,...; Coord, Apos, parenthesis; ExD
8 Layer 2: Analytical syntax. Dependency + Analytical Function (figure labels: dependent, governor). Example sentence: The influence of the Mexican crisis on Central and Eastern Europe has apparently been underestimated.
9 Comparison: parse trees vs. dependency. Compare: Lexicalized parse tree: [S(walks) [NP(John) [NNP John]] [VP(walks) [VBZ walks]]]. Dependency tree: walks (VBZ) -> John (NNP).
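The relation between the two trees on this slide can be made concrete with a short Python sketch: in a head-lexicalized parse tree, every child whose head differs from its parent's head contributes one dependency arc. The tuple encoding of the tree and the function name below are assumptions made for illustration.

```python
def extract_dependencies(node):
    """Return (governor, dependent) arcs from a head-lexicalized tree.

    A node is (label, head, children), where head is an (index, word) pair
    so that repeated words in a sentence stay distinct.
    """
    _label, head, children = node
    arcs = []
    for child in children:
        child_head = child[1]
        if child_head != head:
            # The child is headed by a different word: that word depends
            # on the parent's head word.
            arcs.append((head, child_head))
        arcs.extend(extract_dependencies(child))
    return arcs


john = (1, "John")
walks = (2, "walks")
tree = ("S", walks, [
    ("NP", john, [("NNP", john, [])]),
    ("VP", walks, [("VBZ", walks, [])]),
])

arcs = extract_dependencies(tree)
print(arcs)
```

On the slide's example this yields the single arc from walks to John.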
10 Layer 3: Tectogrammatical. Underlying (deep) syntax. 4 sublayers: (1) dependency structure, (detailed) functors; (2) topic/focus and deep word order; (3) coreference (mostly grammatical only); (4) all the rest (grammatemes): detailed functors, underlying gender, number,...
11 Analytical vs. Tectogrammatical annotation (TR: sublayer 1 only shown). Figure callouts: Underlying verb + tense; Deep function; Elided Actor in; Another ellipsis...; Prepositions out.
12 Layer 3: Tectogrammatical. Underlying (deep) syntax. 4 sublayers: (1) dependency structure, (detailed) functors; (2) topic/focus and deep word order; (3) coreference (mostly grammatical only); (4) all the rest (grammatemes): detailed functors, underlying gender, number,...
13 Dependency structure. Similar to the surface (Analytical) layer... ...but: certain nodes deleted (auxiliaries, non-autosemantic words, punctuation); some nodes added (based on word (mostly verb, noun) valency; some ellipsis resolution); detailed dependency relation labels (functors).
14 Tectogrammatical Functors. Actants: ACT, PAT, EFF, ADDR, ORIG; cannot repeat in a clause, usually compulsory. Free modifications (~ 50): can repeat; optional, sometimes compulsory. Ex.: LOC, DIR1,...; TWHEN, TTILL,...; RESTR, DESC; BEN, ATT, ACMP, INTT, MANN; MAT, APP; ID, DPHR. Special: Coordination, Rhematizers, Foreign phrases,...
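The constraint that actants cannot repeat within a clause, while free modifications can, lends itself to a simple consistency check. The sketch below assumes the functors of one verb's dependents are available as a plain list; the function name is hypothetical.

```python
from collections import Counter

# The five actants named on the slide.
ACTANTS = {"ACT", "PAT", "EFF", "ADDR", "ORIG"}


def repeated_actants(child_functors):
    """Return the actant functors that occur more than once among the
    dependents of a single governor. Free modifications are ignored,
    since they are allowed to repeat."""
    counts = Counter(f for f in child_functors if f in ACTANTS)
    return sorted(f for f, n in counts.items() if n > 1)


ok_clause = repeated_actants(["ACT", "PAT", "LOC", "LOC"])
bad_clause = repeated_actants(["ACT", "ACT", "TWHEN"])
print(ok_clause, bad_clause)
```

A repeated LOC passes the check, while a second ACT under the same governor is reported.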
15 Tectogrammatical Example. Analytical verb form. Gloss: (he) allowed would-be to-be enrolled. Czech: směl by být zapsán. Figure callouts: Collapsed; Additional attributes (grammatemes): conditional + allow.
16 Tectogrammatical Example. Predicate with copula (state). Gloss: (the) pool has-been already filled. Czech: bazén byl již napuštěný.
17 Tectogrammatical Example. Passive construction (action). Gloss: (The) book has-been translated [by Mr. X]. Czech: Kniha byla přeložena. Figure callouts: Disappeared; Added.
18 Tectogrammatical Example. Object. Gloss: (he) gave him a-book. Czech: dal mu knihu. Obj goes into ACT, PAT, ADDR, EFF or ORIG based on the governor's valency frame.
19 Tectogrammatical Example. Relative clause (embedded). Gloss: (a) house, which is expensive, (we) (to-ourselves) will-not-buy. Czech: dům, který je drahý, si nekoupíme.
20 Tectogrammatical Example. Incomplete phrases. Gloss: Peter works well, but Paul badly. Czech: Petr pracuje dobře, ale Pavel špatně. Figure callout: Added.
21 Layer 3: Tectogrammatical. Underlying (deep) syntax. 4 sublayers: (1) dependency structure, (detailed) functors; (2) topic/focus and deep word order; (3) coreference (mostly grammatical only); (4) all the rest (grammatemes): detailed functors, underlying gender, number,...
22 Deep word order, topic/focus. Deep word order: from old information to the new one (left-to-right), at every level (head included); projectivity by definition, i.e., partial level-based order -> total d.w.o. Topic/focus/contrastive topic: attribute of every node; restricted by d.w.o. and other constraints.
23 Deep word order, topic/focus. Example: Analytical dep. tree: Baker bakes rolls. vs. Baker IC bakes rolls.
24 Layer 3: Tectogrammatical. Underlying (deep) syntax. 4 sublayers: (1) dependency structure, (detailed) functors; (2) topic/focus and deep word order; (3) coreference (mostly grammatical only); (4) all the rest (grammatemes): detailed functors, underlying gender, number,...
25 Coreference. Grammatical (vs. textual). Ex.: Peter moved to Iowa after he finished his PhD. Tree: move (PRED) -> Peter (ACT), Iowa (DIR1), finish (TWHEN); finish -> he (ACT), PhD (PAT); PhD -> he (APP). NB: poster about Control, this morning.
26 Layer 3: Tectogrammatical. Underlying (deep) syntax. 4 sublayers: (1) dependency structure, (detailed) functors; (2) topic/focus and deep word order; (3) coreference (mostly grammatical only); (4) all the rest (grammatemes): detailed functors, underlying gender, number,...
27 Grammatemes. Syntactic (= detailed functors), only for some functors: WHEN: before/after; LOC: next-to, behind, in-front-of. Lexical, underlying: number (SG/PL), tense, modality, degree of comparison; strictly only where necessary (agreement!).
28 The Valency Lexicon. Valency frames: each verb (+ some nouns, adjectives) has slots for functor/form pairs, e.g. give: ACT(Nom) PAT(Acc) ADDR(to+Dat). Basic set prepared in advance; annotators add entries on-the-go; checking and approval process follows (consistency). Compare: Levin's Classes, Proposition Bank.
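A valency frame of functor/form pairs maps naturally onto a dictionary. The sketch below stores the slide's frame for "give" and compares it with the slots observed in a sentence; the lexicon layout and the function `compare_with_frame` are assumptions for illustration, not the format of the actual PDT valency lexicon.

```python
# One entry, copied from the slide: give: ACT(Nom) PAT(Acc) ADDR(to+Dat).
VALENCY = {
    "give": {"ACT": "Nom", "PAT": "Acc", "ADDR": "to+Dat"},
}


def compare_with_frame(lemma, observed):
    """Compare observed {functor: form} pairs with the lexicon frame.

    Returns (unfilled, mismatched): frame slots with no observed filler,
    and frame slots whose observed surface form differs from the frame.
    """
    frame = VALENCY[lemma]
    unfilled = [f for f in frame if f not in observed]
    mismatched = [f for f in frame if f in observed and observed[f] != frame[f]]
    return unfilled, mismatched


result = compare_with_frame("give", {"ACT": "Nom", "PAT": "Acc"})
print(result)
```

An unfilled slot is not necessarily an annotation error: it may instead point to an elided participant that the tectogrammatical layer would add as a node.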
29 Tectogrammatical Annotation. Manual annotation: 4 groups of annotators ~ 4 sublayers. Special graphical tool (TrEd): customizable graphical tree editor. Preprocessing: data from Analytical Layer, preprocessed; online dependency function preassignment.
30 The [Manual] Annotation Tool. Perl/PerlTk based, platform-independent: Linux, Windows 95/98/2000, Solaris,... Perl as the macro language: unlimited online processing capability. Flexibility for interactive checking: split screen, graphical diff function. Customization, printing, plugins.
31 The TrEd Tree Editor. Graphical tool TrEd, main screen. Original sentence: [This year's flu season is still quiet in Europe.] Figure callouts: Editing window customization; Run a macro; Multiwindow editing/compare.
32 Valency Lexicon in TrEd. Example entry: to write sth (about sth).
33 What Is It Good For? Machine Translation: TL representation is closer to an interlingua than surface (analytical) syntax => less work in the transfer phase, more work in parsing and generation... but advantage in multilingual MT application. Question answering: same representation for questions and answers.
34 Machine Translation Architecture. Typical (structural) MT system: source sentence -> Analysis (parsing) -> a parse -> Transfer -> Generation (synthesis) -> target sentence.
35 Machine Translation Architecture. Tectogrammatical layer-based system: source sentence -> morphology (tagging) -> morphological layer -> parsing -> analytical layer -> (tectogrammatical) parsing -> tectogrammatical layer -> Transfer -> tectogrammatical layer -> generation -> analytical layer -> linearization -> morphological layer -> morph. synthesis -> target sentence.
36 Comparison: analytical layer
37 Comparison: tectogrammatical l. The [Homestead's] only remaining baker bakes the most famous rolls to the north of Long River. al-xabaaz al- axiir al-baaqii [fii Homestead] yaśmacu ashhar al-kruasaanaat ilaa shimaal min Long River.
38 The Three Crucial Steps. Analytical (surface) -> Tectogrammatical: additional parsing required. Transfer: minimal effort, only true transformations needed (like swimming ~ schwimmen gern). Generation: back from Tectogrammatical representation to Analytical (surface syntax).
39 The Devil's In... The additional three steps, marked on the layer-based pipeline: source sentence -> morphology (tagging) -> morphological layer -> parsing -> analytical layer -> (tectogrammatical) parsing -> tectogrammatical layer -> Transfer -> tectogrammatical layer -> Generation -> analytical layer -> linearization (trivial) -> morphological layer -> morph. synthesis (easy) -> target sentence.
40 The Devil's In... The additional three steps: Tectogrammatical parsing (source analytical layer -> tectogrammatical layer); (Simple) transfer; Generation (tectogrammatical layer -> target analytical layer): deletions; insertions (prepositions, conjunctions,...); word order; morphology.
41 Components:...the Generation. Deletions of nodes [rare if going into English]. Insertions of nodes: prepositions, conjunctions, punctuation; splitting phrases/idioms/named entities. Tree reorganization (numeric expressions). Surface word order (analytical tree: defined w.o.). Morphology (agreement, cases based on subcat).
42 Generation: Insertion of Prepositions. Tectogrammatical layer: Czech střed -> přitažlivost (APP.sg); English center -> gravity (APP.sg). Analytical layer: Czech středu -> přitažlivosti.nfs2 (Atr); English center -> of (AuxP) -> gravity.nn (Atr).
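The English side of this slide (center -> of -> gravity) can be sketched as a small tree rewrite: a dependent with functor APP becomes an Atr node placed under an inserted AuxP preposition. Both lookup tables below are hypothetical one-entry stand-ins; the talk proposes learning such decisions statistically rather than listing them by hand.

```python
# Hypothetical tables for this sketch only: which preposition realizes a
# functor in English, and which analytical function the dependent receives.
PREPOSITION_FOR = {"APP": "of"}
AFUN_FOR = {"APP": "Atr"}


def to_analytical(node):
    """Convert a tectogrammatical node (lemma, functor, children) into an
    analytical node (form, afun, children), inserting an AuxP node above
    any dependent whose functor calls for a preposition."""
    lemma, functor, children = node
    converted = (lemma, AFUN_FOR.get(functor), [to_analytical(c) for c in children])
    preposition = PREPOSITION_FOR.get(functor)
    if preposition is not None:
        # The preposition governs the noun, as in the slide's analytical tree.
        return (preposition, "AuxP", [converted])
    return converted


tecto = ("center", None, [("gravity", "APP", [])])
analytical = to_analytical(tecto)
print(analytical)
```

The Czech side needs no inserted node, since the same relation is expressed there by case marking on the dependent noun.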
43 Generation: Surface word order. Tectogrammatical layer: Czech přijít.past -> včera (TWHEN), Petr (ACT); English come.past -> yesterday (TWHEN), Peter (ACT). Analytical layer: Czech přijít.vb3sp -> včera (Adv), Petr (Sb); English come.vbd -> Peter (Sb), yesterday (Adv).
44 Generation: Complex Input. English translation.
45 Generation: How-To (1). Statistical, (perhaps) in two steps. Analytical tree reconstruction: everything except word order, i.e., includes morphology (tag assignment). Word order: projective trees assumed here [1]; thus, it is sufficient to determine level-by-level word order. [1] Additional step required for non-projective constructions [can be avoided for English].
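Why level-by-level ordering is enough for projective trees can be shown with a short recursive linearization: once each node knows the order of its own level (its children plus itself), the full sentence order follows. The tree encoding, the `HEAD` marker, and the surface form "came" are assumptions of this sketch.

```python
HEAD = object()  # marks the position of the node's own word within its level


def linearize(node):
    """Flatten a tree whose nodes are (word, level_order).

    level_order lists the node's children and the HEAD marker in surface
    order. Each subtree is emitted as one contiguous block, so the result
    is projective by construction.
    """
    word, level_order = node
    words = []
    for item in level_order:
        if item is HEAD:
            words.append(word)
        else:
            words.extend(linearize(item))
    return words


peter = ("Peter", [HEAD])
yesterday = ("yesterday", [HEAD])
came = ("came", [peter, HEAD, yesterday])

sentence = linearize(came)
print(" ".join(sentence))
```

Non-projective constructions cannot be produced this way, which is why the slide notes that they require an additional step.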
46 Generation: How-To (2). Reconstruction: two possible ways: transformation-based learning ([fn]TBL); probabilistic, by a dependency tree model based on triplets <word,tag,afun> and dependency relation (governor, dependent) ~ Collins bilexical model, Charniak parser model, Bangalore & Rambow; afun instead of nonterminals.
47 Generation: How-To (3). Word order: language model for a single level in the tree: <word,tag,afun> triples; includes head (no afun). Example level: come.vbd (head), Peter.NNP (Sb), yesterday.adv (Adv). Non-projective constructions (and some more) by classic n-gram LM.
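A rough sketch of the single-level ordering step: enumerate the permutations of one level (head included) and keep the one a language model scores highest. The slide's model uses <word,tag,afun> triples; the sketch below backs off to a bigram over afun labels only, and its log-probabilities are invented toy numbers, not trained values.

```python
from itertools import permutations

# Toy bigram log-probabilities over afun labels (HEAD stands for the head,
# which has no afun). These numbers are made up for illustration.
BIGRAM_LOGPROB = {
    ("<s>", "Sb"): -0.2,
    ("Sb", "HEAD"): -0.1,
    ("HEAD", "Adv"): -0.3,
    ("Adv", "</s>"): -0.2,
}
UNSEEN = -5.0  # flat penalty for label bigrams missing from the table


def score(order):
    """Sum bigram log-probabilities over the label sequence of one level."""
    labels = ["<s>"] + [label for _word, label in order] + ["</s>"]
    return sum(BIGRAM_LOGPROB.get(pair, UNSEEN) for pair in zip(labels, labels[1:]))


def best_order(level):
    """Return the highest-scoring permutation of a level's (word, label) items."""
    return max(permutations(level), key=score)


level = [("come", "HEAD"), ("Peter", "Sb"), ("yesterday", "Adv")]
ordered = [word for word, _label in best_order(level)]
print(ordered)
```

Exhaustive enumeration is only practical because a single level rarely has many nodes; a realistic system would also need smoothing and the full triple context.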
48 Generation: How-To (4). Data: trained on WSJ, converted to analytical dependency trees: adapted Jason Eisner's head assignment rules; added rules for heads of base NPs; added rules for analytical functions. Rule-based parsing to tectogrammatical layer (for now; manual annotation will follow), i.e., TR-AR data available (English).
49 Some pointers. Current version of PDT: v1.0: morphology + analytical level, 1.3M words (train/dev test/eval test). Projects -> Treebank. LDC2001T10 (PDT v1.0). Workshop 2002: Using TL for MT Generation.
More informationMultiple case assignment and the English pseudo-passive *
Multiple case assignment and the English pseudo-passive * Norvin Richards Massachusetts Institute of Technology Previous literature on pseudo-passives (see van Riemsdijk 1978, Chomsky 1981, Hornstein &
More informationInleiding Taalkunde. Docent: Paola Monachesi. Blok 4, 2001/ Syntax 2. 2 Phrases and constituent structure 2. 3 A minigrammar of Italian 3
Inleiding Taalkunde Docent: Paola Monachesi Blok 4, 2001/2002 Contents 1 Syntax 2 2 Phrases and constituent structure 2 3 A minigrammar of Italian 3 4 Trees 3 5 Developing an Italian lexicon 4 6 S(emantic)-selection
More informationLanguage Acquisition by Identical vs. Fraternal SLI Twins * Karin Stromswold & Jay I. Rifkin
Stromswold & Rifkin, Language Acquisition by MZ & DZ SLI Twins (SRCLD, 1996) 1 Language Acquisition by Identical vs. Fraternal SLI Twins * Karin Stromswold & Jay I. Rifkin Dept. of Psychology & Ctr. for
More informationMemory-based grammatical error correction
Memory-based grammatical error correction Antal van den Bosch Peter Berck Radboud University Nijmegen Tilburg University P.O. Box 9103 P.O. Box 90153 NL-6500 HD Nijmegen, The Netherlands NL-5000 LE Tilburg,
More informationa) analyse sentences, so you know what s going on and how to use that information to help you find the answer.
Tip Sheet I m going to show you how to deal with ten of the most typical aspects of English grammar that are tested on the CAE Use of English paper, part 4. Of course, there are many other grammar points
More informationSenior Stenographer / Senior Typist Series (including equivalent Secretary titles)
New York State Department of Civil Service Committed to Innovation, Quality, and Excellence A Guide to the Written Test for the Senior Stenographer / Senior Typist Series (including equivalent Secretary
More informationOn the Notion Determiner
On the Notion Determiner Frank Van Eynde University of Leuven Proceedings of the 10th International Conference on Head-Driven Phrase Structure Grammar Michigan State University Stefan Müller (Editor) 2003
More informationBasic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1
Basic Parsing with Context-Free Grammars Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Announcements HW 2 to go out today. Next Tuesday most important for background to assignment Sign up
More informationExperiments with a Higher-Order Projective Dependency Parser
Experiments with a Higher-Order Projective Dependency Parser Xavier Carreras Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) 32 Vassar St., Cambridge,
More informationAN EXPERIMENTAL APPROACH TO NEW AND OLD INFORMATION IN TURKISH LOCATIVES AND EXISTENTIALS
AN EXPERIMENTAL APPROACH TO NEW AND OLD INFORMATION IN TURKISH LOCATIVES AND EXISTENTIALS Engin ARIK 1, Pınar ÖZTOP 2, and Esen BÜYÜKSÖKMEN 1 Doguş University, 2 Plymouth University enginarik@enginarik.com
More informationWriting a composition
A good composition has three elements: Writing a composition an introduction: A topic sentence which contains the main idea of the paragraph. a body : Supporting sentences that develop the main idea. a
More informationA First-Pass Approach for Evaluating Machine Translation Systems
[Proceedings of the Evaluators Forum, April 21st 24th, 1991, Les Rasses, Vaud, Switzerland; ed. Kirsten Falkedal (Geneva: ISSCO).] A First-Pass Approach for Evaluating Machine Translation Systems Pamela
More informationGrammars & Parsing, Part 1:
Grammars & Parsing, Part 1: Rules, representations, and transformations- oh my! Sentence VP The teacher Verb gave the lecture 2015-02-12 CS 562/662: Natural Language Processing Game plan for today: Review
More informationCharacter Stream Parsing of Mixed-lingual Text
Character Stream Parsing of Mixed-lingual Text Harald Romsdorfer and Beat Pfister Speech Processing Group Computer Engineering and Networks Laboratory ETH Zurich {romsdorfer,pfister}@tik.ee.ethz.ch Abstract
More informationAdapting Stochastic Output for Rule-Based Semantics
Adapting Stochastic Output for Rule-Based Semantics Wissenschaftliche Arbeit zur Erlangung des Grades eines Diplom-Handelslehrers im Fachbereich Wirtschaftswissenschaften der Universität Konstanz Februar
More informationcmp-lg/ Jul 1995
A CONSTRAINT-BASED CASE FRAME LEXICON ARCHITECTURE 1 Introduction Kemal Oazer and Okan Ylmaz Department of Computer Engineering and Information Science Bilkent University Bilkent, Ankara 0, Turkey fko,okang@cs.bilkent.edu.tr
More informationAdjectives tell you more about a noun (for example: the red dress ).
Curriculum Jargon busters Grammar glossary Key: Words in bold are examples. Words underlined are terms you can look up in this glossary. Words in italics are important to the definition. Term Adjective
More informationHoughton Mifflin Reading Correlation to the Common Core Standards for English Language Arts (Grade1)
Houghton Mifflin Reading Correlation to the Standards for English Language Arts (Grade1) 8.3 JOHNNY APPLESEED Biography TARGET SKILLS: 8.3 Johnny Appleseed Phonemic Awareness Phonics Comprehension Vocabulary
More informationCORPUS ANALYSIS CORPUS ANALYSIS QUANTITATIVE ANALYSIS
CORPUS ANALYSIS Antonella Serra CORPUS ANALYSIS ITINEARIES ON LINE: SARDINIA, CAPRI AND CORSICA TOTAL NUMBER OF WORD TOKENS 13.260 TOTAL NUMBER OF WORD TYPES 3188 QUANTITATIVE ANALYSIS THE MOST SIGNIFICATIVE
More informationIssues of Projectivity in the Prague Dependency Treebank
Issues of Projectivity in the Prague Dependency Treebank Eva Hajičová, Jiří Havelka, Petr Sgall, Kateřina Veselá, Daniel Zeman Center for Computational linguistics Faculty of Mathematics and Physics, Charles
More informationA Graph Based Authorship Identification Approach
A Graph Based Authorship Identification Approach Notebook for PAN at CLEF 2015 Helena Gómez-Adorno 1, Grigori Sidorov 1, David Pinto 2, and Ilia Markov 1 1 Center for Computing Research, Instituto Politécnico
More informationHindi-Urdu Phrase Structure Annotation
Hindi-Urdu Phrase Structure Annotation Rajesh Bhatt and Owen Rambow January 12, 2009 1 Design Principle: Minimal Commitments Binary Branching Representations. Mostly lexical projections (P,, AP, AdvP)
More informationSEMAFOR: Frame Argument Resolution with Log-Linear Models
SEMAFOR: Frame Argument Resolution with Log-Linear Models Desai Chen or, The Case of the Missing Arguments Nathan Schneider SemEval July 16, 2010 Dipanjan Das School of Computer Science Carnegie Mellon
More informationHeuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger
Page 1 of 35 Heuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger Kaihong Liu, MD, MS, Wendy Chapman, PhD, Rebecca Hwa, PhD, and Rebecca S. Crowley, MD, MS
More informationSAMPLE PAPER SYLLABUS
SOF INTERNATIONAL ENGLISH OLYMPIAD SAMPLE PAPER SYLLABUS 2017-18 Total Questions : 35 Section (1) Word and Structure Knowledge PATTERN & MARKING SCHEME (2) Reading (3) Spoken and Written Expression (4)
More informationBeyond the Pipeline: Discrete Optimization in NLP
Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We
More informationWhat the National Curriculum requires in reading at Y5 and Y6
What the National Curriculum requires in reading at Y5 and Y6 Word reading apply their growing knowledge of root words, prefixes and suffixes (morphology and etymology), as listed in Appendix 1 of the
More informationFeature-Based Grammar
8 Feature-Based Grammar James P. Blevins 8.1 Introduction This chapter considers some of the basic ideas about language and linguistic analysis that define the family of feature-based grammars. Underlying
More information