ANNOTATING DISCOURSE IN PRAGUE DEPENDENCY TREEBANK

Size: px
Start display at page:

Download "ANNOTATING DISCOURSE IN PRAGUE DEPENDENCY TREEBANK"

Transcription

1 ANNOTATING DISCOURSE IN PRAGUE DEPENDENCY TREEBANK PDTB WORKSHOP, PHILADEPHIA APRIL 30, 2012 Lucie Poláková (Mladová), Charles University in Prague

2 PRAGUE DEPENDENCY TREEBANK & DISCOURSE 2006: Prague Dependency Treebank (PDT) 2.0 Hajič et al. : multilayer annotations of 50,000 sentences of Czech journalistic texts: - morphology - surface syntax - underlying syntax + sentence semantics (tectogrammatical tree structures) + other phenomena included: information structure, grammatically bound and textual pronominal coreference : annotation of discourse-related phenomena - Prof. Eva Hajičová

3 DISCOURSE-LEVEL PHENOMENA IN PDT Manual annotations of the whole treebank: 49,431 sentences in 3165 documents of sentences, with an average length of 15.6 sentences 1. Explicit discourse connectives + their arguments (scopes), sense tags (= PDTB II. annotation) 2. a) Extended textual coreference annotation b) Bridging relations ALL DISCOURSE PHENOMENA ANNOTATED DIRECTLY ON THE SYNTACTIC (tectogrammatical) TREES! Both projects completed 2011, currently under checking procedures

4 EXAMPLE OF ANNOTATION ON TREES 1. Searching for the possible discourse connectives in a plain text (unlike in Penn) V polovině července radní předložené rozhodnutí akceptovali. Dodnes však žádná smlouva podepsána nebyla. In middle July, the councilors accepted the proposed decision. However, no contract was signed so far. 2. Marking the discourse relation on the tectogrammatical trees:

5 In middle July, the councilors accepted the proposed decision. However, no contract was signed so far.

6 In middle July, the councilors accepted the proposed decision. However, no contract was signed so far.

7 In middle July, the councilors accepted the proposed decision. However, no contract was signed so far.

8 In middle July, the councilors accepted the proposed decision. However, no contract was signed so far. - linking of the arguments of the relation by an arrow (automatically connected with choice of semantic intepretation)

9 In middle July, the councilors accepted the proposed decision. However, no contract was signed so far. - linking of the arguments of the relation by an arrow (automatically connected with choice of semantic intepretation) - adding of the connective

10 In middle July, the councilors accepted the proposed decision. However, no contract was signed so far. - linking of the arguments of the relation by an arrow (automatically connected with choice of semantic intepretation) - adding of the connective - marking of the extent of the arguments

11 DISCOURSE REPRESENTATION COMPACT VIEW

12 SEMANTIC LABELS IN PRAGUE TEMPORAL CONTINGENCY COMPARISON (CONTRAST) EXPANSION asynchronous reason - result confrontation conjunction synchronous pragmatic reason result opposition instantiation purpose pragmatic contrast specification explication restrictive opposition equivalence condition concession generalization pragmatic condition correction (replacement) gradation conjunctive alternative disjunctive alternative

13 SEMANTIC LABELS IN PRAGUE TEMPORAL CONTINGENCY COMPARISON (CONTRAST) EXPANSION asynchronous reason - result confrontation conjunction synchronous pragmatic reason result opposition instantiation purpose pragmatic contrast specification explication restrictive opposition equivalence condition concession generalization pragmatic condition correction (replacement) gradation conjunctive alternative disjunctive alternative

14 ADDITIONAL INFORMATION ANNOTATED List structures quite a typical way of text composition Headings Alternative lexical expressions of the connectives, for example: Z toho vyplývá = tedy, takže Důvodem je = protože (This implies = so, therefore ) (The reason is that = because) Double-sense relations The so-called collections Not annotated so far: Implicit connectives, their arguments and senses (in comparison to PDTB 2.0) Attribution (partly present in some features of the syntactic trees)

15 TREEBANK STATISTICS Measured on train data = 9/10 of the treebank = 43,955 sentences Relation Intra-sentential Intersentential Intersentential Total conj opp reason cond conc preced confr spec purp corr grad Relation Intrasentential Total restr synchr explicat disjalt exempl equiv gener conjalt f_opp f_reason f_cond total

16 INTERANNOTATOR AGREEMENT MEASUREMENT treebank: 10 sections, in each of them a sample annotated by all annotators for the IAA measurement (2,084 sentences) following table: only the inter-sentential relations (assumed improvement with intra-sententials) connective-based measurement one pair of annotators average agreement on semantic types 0.77 similar measure in PDTB 0.8 (Prasad et al., LREC 2008) agreement on higher level: 4 basic semantic classes 0.89 (Penn 0.94)

17 IAA IN THE SUBSEQUENT MEASUREMENTS Measurement Connectivebased F1 measure Agreement on semantic types Kappa on sem. types train train train train train train train dtest etest train

18 WHAT HAVE WE LEARNED? Linguistically: the absolute number of explicit inter-sentential relations is low (it increases with the incorporation of "syntactic" discourse edges) coreference annotation (entity-based, EntRel) on the same data HELPS a lot things different nature of the relations (condition syntax-bound; specification text-bound) some very ambiguous connectives in Czech (ale = but) some Czech connectives have no exact English counterpart (totiž additional semantic category?) some findings correspond with Penn numbers discourse structure to some extent language independent we need genre distinction

19 WHAT HAVE WE LEARNED? Technically: trees with resolved syntactic structure help a lot (ellipsis, easy extraction of intra-sentential relations, coreference accessible ) sometimes problems to deal with larger arguments, complexity of the trees the representation offers lot of data with all the various types of linguistic information AT ONCE

20 RECENT & NEXT Completed: the annotation of explicits annotation manual in Czech and in English webpage with a browser in a sample of data intergration of the two projects to ONE layer with coreference and bridging annotations In progress: extracting relevant tectogrammatical information (intra-sentential discourse relations) release of the discourse and coreference data as PDT 2.5 Next: altlexes genre distinction of the texts interplays (information structure, Play Coref, the phenomenon of contrast...) automatic experiments

21 Thank you! This work was supported by the Grant Agency of the Czech Republic (P406/12/0658, P406/2010/0875 ) and by the Czech Ministry of Education (ME 10018).

Some Principles of Automated Natural Language Information Extraction

Some Principles of Automated Natural Language Information Extraction Some Principles of Automated Natural Language Information Extraction Gregers Koch Department of Computer Science, Copenhagen University DIKU, Universitetsparken 1, DK-2100 Copenhagen, Denmark Abstract

More information

Proof Theory for Syntacticians

Proof Theory for Syntacticians Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax

More information

The Discourse Anaphoric Properties of Connectives

The Discourse Anaphoric Properties of Connectives The Discourse Anaphoric Properties of Connectives Cassandre Creswell, Kate Forbes, Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi Λ, Bonnie Webber y Λ University of Pennsylvania 3401 Walnut Street Philadelphia,

More information

ENGBG1 ENGBL1 Campus Linguistics. Meeting 2. Chapter 7 (Morphology) and chapter 9 (Syntax) Pia Sundqvist

ENGBG1 ENGBL1 Campus Linguistics. Meeting 2. Chapter 7 (Morphology) and chapter 9 (Syntax) Pia Sundqvist Meeting 2 Chapter 7 (Morphology) and chapter 9 (Syntax) Today s agenda Repetition of meeting 1 Mini-lecture on morphology Seminar on chapter 7, worksheet Mini-lecture on syntax Seminar on chapter 9, worksheet

More information

New perspectives on cohesion and coherence

New perspectives on cohesion and coherence New perspectives on cohesion and coherence Implications for translation Edited by Katrin Menzel Ekaterina Lapshinova-Koltunski Kerstin Kunz Translation and Multilingual Natural Language Processing 6 language

More information

Compositional Semantics

Compositional Semantics Compositional Semantics CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu Words, bag of words Sequences Trees Meaning Representing Meaning An important goal of NLP/AI: convert natural language

More information

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions.

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions. to as a linguistic theory to to a member of the family of linguistic frameworks that are called generative grammars a grammar which is formalized to a high degree and thus makes exact predictions about

More information

Adding syntactic structure to bilingual terminology for improved domain adaptation

Adding syntactic structure to bilingual terminology for improved domain adaptation Adding syntactic structure to bilingual terminology for improved domain adaptation Mikel Artetxe 1, Gorka Labaka 1, Chakaveh Saedi 2, João Rodrigues 2, João Silva 2, António Branco 2, Eneko Agirre 1 1

More information

CS 598 Natural Language Processing

CS 598 Natural Language Processing CS 598 Natural Language Processing Natural language is everywhere Natural language is everywhere Natural language is everywhere Natural language is everywhere!"#$%&'&()*+,-./012 34*5665756638/9:;< =>?@ABCDEFGHIJ5KL@

More information

University of Edinburgh. University of Pennsylvania

University of Edinburgh. University of Pennsylvania Behrens & Fabricius-Hansen (eds.) Structuring information in discourse: the explicit/implicit dimension, Oslo Studies in Language 1(1), 2009. 171-190. (ISSN 1890-9639) http://www.journals.uio.no/osla :

More information

Segmented Discourse Representation Theory. Dynamic Semantics with Discourse Structure

Segmented Discourse Representation Theory. Dynamic Semantics with Discourse Structure Introduction Outline : Dynamic Semantics with Discourse Structure pierrel@coli.uni-sb.de Seminar on Computational Models of Discourse, WS 2007-2008 Department of Computational Linguistics & Phonetics Universität

More information

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.

More information

Accurate Unlexicalized Parsing for Modern Hebrew

Accurate Unlexicalized Parsing for Modern Hebrew Accurate Unlexicalized Parsing for Modern Hebrew Reut Tsarfaty and Khalil Sima an Institute for Logic, Language and Computation, University of Amsterdam Plantage Muidergracht 24, 1018TV Amsterdam, The

More information

Program Matrix - Reading English 6-12 (DOE Code 398) University of Florida. Reading

Program Matrix - Reading English 6-12 (DOE Code 398) University of Florida. Reading Program Requirements Competency 1: Foundations of Instruction 60 In-service Hours Teachers will develop substantive understanding of six components of reading as a process: comprehension, oral language,

More information

Control and Boundedness

Control and Boundedness Control and Boundedness Having eliminated rules, we would expect constructions to follow from the lexical categories (of heads and specifiers of syntactic constructions) alone. Combinatory syntax simply

More information

English Language and Applied Linguistics. Module Descriptions 2017/18

English Language and Applied Linguistics. Module Descriptions 2017/18 English Language and Applied Linguistics Module Descriptions 2017/18 Level I (i.e. 2 nd Yr.) Modules Please be aware that all modules are subject to availability. If you have any questions about the modules,

More information

A High-Quality Web Corpus of Czech

A High-Quality Web Corpus of Czech A High-Quality Web Corpus of Czech Johanka Spoustová, Miroslav Spousta Institute of Formal and Applied Linguistics Faculty of Mathematics and Physics Charles University Prague, Czech Republic {johanka,spousta}@ufal.mff.cuni.cz

More information

LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization

LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization Annemarie Friedrich, Marina Valeeva and Alexis Palmer COMPUTATIONAL LINGUISTICS & PHONETICS SAARLAND UNIVERSITY, GERMANY

More information

Developing a TT-MCTAG for German with an RCG-based Parser

Developing a TT-MCTAG for German with an RCG-based Parser Developing a TT-MCTAG for German with an RCG-based Parser Laura Kallmeyer, Timm Lichte, Wolfgang Maier, Yannick Parmentier, Johannes Dellert University of Tübingen, Germany CNRS-LORIA, France LREC 2008,

More information

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and

More information

Applications of memory-based natural language processing

Applications of memory-based natural language processing Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal

More information

LING 329 : MORPHOLOGY

LING 329 : MORPHOLOGY LING 329 : MORPHOLOGY TTh 10:30 11:50 AM, Physics 121 Course Syllabus Spring 2013 Matt Pearson Office: Vollum 313 Email: pearsonm@reed.edu Phone: 7618 (off campus: 503-517-7618) Office hrs: Mon 1:30 2:30,

More information

Underlying and Surface Grammatical Relations in Greek consider

Underlying and Surface Grammatical Relations in Greek consider 0 Underlying and Surface Grammatical Relations in Greek consider Sentences Brian D. Joseph The Ohio State University Abbreviated Title Grammatical Relations in Greek consider Sentences Brian D. Joseph

More information

Annotation Projection for Discourse Connectives

Annotation Projection for Discourse Connectives SFB 833 / Univ. Tübingen Penn Discourse Treebank Workshop Annotation projection Basic idea: Given a bitext E/F and annotation for F, how would the annotation look for E? Examples: Word Sense Disambiguation

More information

Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque

Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque Approaches to control phenomena handout 6 5.4 Obligatory control and morphological case: Icelandic and Basque Icelandinc quirky case (displaying properties of both structural and inherent case: lexically

More information

A Framework for Customizable Generation of Hypertext Presentations

A Framework for Customizable Generation of Hypertext Presentations A Framework for Customizable Generation of Hypertext Presentations Benoit Lavoie and Owen Rambow CoGenTex, Inc. 840 Hanshaw Road, Ithaca, NY 14850, USA benoit, owen~cogentex, com Abstract In this paper,

More information

Semi-supervised Training for the Averaged Perceptron POS Tagger

Semi-supervised Training for the Averaged Perceptron POS Tagger Semi-supervised Training for the Averaged Perceptron POS Tagger Drahomíra johanka Spoustová Jan Hajič Jan Raab Miroslav Spousta Institute of Formal and Applied Linguistics Faculty of Mathematics and Physics,

More information

UNIVERSITY OF OSLO Department of Informatics. Dialog Act Recognition using Dependency Features. Master s thesis. Sindre Wetjen

UNIVERSITY OF OSLO Department of Informatics. Dialog Act Recognition using Dependency Features. Master s thesis. Sindre Wetjen UNIVERSITY OF OSLO Department of Informatics Dialog Act Recognition using Dependency Features Master s thesis Sindre Wetjen November 15, 2013 Acknowledgments First I want to thank my supervisors Lilja

More information

AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC PP. VI, 282)

AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC PP. VI, 282) B. PALTRIDGE, DISCOURSE ANALYSIS: AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC. 2012. PP. VI, 282) Review by Glenda Shopen _ This book is a revised edition of the author s 2006 introductory

More information

SEMAFOR: Frame Argument Resolution with Log-Linear Models

SEMAFOR: Frame Argument Resolution with Log-Linear Models SEMAFOR: Frame Argument Resolution with Log-Linear Models Desai Chen or, The Case of the Missing Arguments Nathan Schneider SemEval July 16, 2010 Dipanjan Das School of Computer Science Carnegie Mellon

More information

An Interactive Intelligent Language Tutor Over The Internet

An Interactive Intelligent Language Tutor Over The Internet An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This

More information

An Introduction to the Minimalist Program

An Introduction to the Minimalist Program An Introduction to the Minimalist Program Luke Smith University of Arizona Summer 2016 Some findings of traditional syntax Human languages vary greatly, but digging deeper, they all have distinct commonalities:

More information

Florida Reading Endorsement Alignment Matrix Competency 1

Florida Reading Endorsement Alignment Matrix Competency 1 Florida Reading Endorsement Alignment Matrix Competency 1 Reading Endorsement Guiding Principle: Teachers will understand and teach reading as an ongoing strategic process resulting in students comprehending

More information

Annotating (Anaphoric) Ambiguity 1 INTRODUCTION. Paper presentend at Corpus Linguistics 2005, University of Birmingham, England

Annotating (Anaphoric) Ambiguity 1 INTRODUCTION. Paper presentend at Corpus Linguistics 2005, University of Birmingham, England Paper presentend at Corpus Linguistics 2005, University of Birmingham, England Annotating (Anaphoric) Ambiguity Massimo Poesio and Ron Artstein University of Essex Language and Computation Group / Department

More information

Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. Grzegorz Chrupa la

Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. Grzegorz Chrupa la Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing Grzegorz Chrupa la A dissertation submitted in fulfilment of the requirements for the award of Doctor of Philosophy (Ph.D.)

More information

Parsing of part-of-speech tagged Assamese Texts

Parsing of part-of-speech tagged Assamese Texts IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal

More information

Grammar Extraction from Treebanks for Hindi and Telugu

Grammar Extraction from Treebanks for Hindi and Telugu Grammar Extraction from Treebanks for Hindi and Telugu Prasanth Kolachina, Sudheer Kolachina, Anil Kumar Singh, Samar Husain, Viswanatha Naidu,Rajeev Sangal and Akshar Bharati Language Technologies Research

More information

Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities

Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities Yoav Goldberg Reut Tsarfaty Meni Adler Michael Elhadad Ben Gurion

More information

Phonological and Phonetic Representations: The Case of Neutralization

Phonological and Phonetic Representations: The Case of Neutralization Phonological and Phonetic Representations: The Case of Neutralization Allard Jongman University of Kansas 1. Introduction The present paper focuses on the phenomenon of phonological neutralization to consider

More information

Ensemble Technique Utilization for Indonesian Dependency Parser

Ensemble Technique Utilization for Indonesian Dependency Parser Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id

More information

The interpretation of Latin predicative participles

The interpretation of Latin predicative participles The interpretation of Latin predicative participles EALC.15.06.2012 Øyvind Strand Predicative participles Dionysius Syracusis expulsus Dionysius:NOM Syracuse expell:prtc.perf.pass.nom Corinthi pueros docebat

More information

Linguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis

Linguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis International Journal of Arts Humanities and Social Sciences (IJAHSS) Volume 1 Issue 1 ǁ August 216. www.ijahss.com Linguistic Variation across Sports Category of Press Reportage from British Newspapers:

More information

Linking Task: Identifying authors and book titles in verbose queries

Linking Task: Identifying authors and book titles in verbose queries Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,

More information

THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING

THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING SISOM & ACOUSTICS 2015, Bucharest 21-22 May THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING MarilenaăLAZ R 1, Diana MILITARU 2 1 Military Equipment and Technologies Research Agency, Bucharest,

More information

Language Evolution, Metasyntactically. First International Workshop on Bidirectional Transformations (BX 2012)

Language Evolution, Metasyntactically. First International Workshop on Bidirectional Transformations (BX 2012) Language Evolution, Metasyntactically First International Workshop on Bidirectional Transformations (BX 2012) Vadim Zaytsev, SWAT, CWI 2012 Introduction Every language document employs its own We focus

More information

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence. NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and

More information

Derivational and Inflectional Morphemes in Pak-Pak Language

Derivational and Inflectional Morphemes in Pak-Pak Language Derivational and Inflectional Morphemes in Pak-Pak Language Agustina Situmorang and Tima Mariany Arifin ABSTRACT The objectives of this study are to find out the derivational and inflectional morphemes

More information

Constraining X-Bar: Theta Theory

Constraining X-Bar: Theta Theory Constraining X-Bar: Theta Theory Carnie, 2013, chapter 8 Kofi K. Saah 1 Learning objectives Distinguish between thematic relation and theta role. Identify the thematic relations agent, theme, goal, source,

More information

Realization of Textual Cohesion and Coherence in Business Letters through Presupposition 1

Realization of Textual Cohesion and Coherence in Business Letters through Presupposition 1 Realization of Textual Cohesion and Coherence in Business Letters through Presupposition 1 Yu Chunmei English teacher in Foreign Language Department of Sichuan University of Science& Engineering 180# Xueyuan

More information

The stages of event extraction

The stages of event extraction The stages of event extraction David Ahn Intelligent Systems Lab Amsterdam University of Amsterdam ahn@science.uva.nl Abstract Event detection and recognition is a complex task consisting of multiple sub-tasks

More information

Issues of Projectivity in the Prague Dependency Treebank

Issues of Projectivity in the Prague Dependency Treebank Issues of Projectivity in the Prague Dependency Treebank Eva Hajičová, Jiří Havelka, Petr Sgall, Kateřina Veselá, Daniel Zeman Center for Computational linguistics Faculty of Mathematics and Physics, Charles

More information

Using Semantic Relations to Refine Coreference Decisions

Using Semantic Relations to Refine Coreference Decisions Using Semantic Relations to Refine Coreference Decisions Heng Ji David Westbrook Ralph Grishman Department of Computer Science New York University New York, NY, 10003, USA hengji@cs.nyu.edu westbroo@cs.nyu.edu

More information

Prediction of Maximal Projection for Semantic Role Labeling

Prediction of Maximal Projection for Semantic Role Labeling Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba

More information

LTAG-spinal and the Treebank

LTAG-spinal and the Treebank LTAG-spinal and the Treebank a new resource for incremental, dependency and semantic parsing Libin Shen (lshen@bbn.com) BBN Technologies, 10 Moulton Street, Cambridge, MA 02138, USA Lucas Champollion (champoll@ling.upenn.edu)

More information

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se

More information

Dissertation Summaries. The Acquisition of Aspect and Motion Verbs in the Native Language (Aristotle University of Thessaloniki, 2014)

Dissertation Summaries. The Acquisition of Aspect and Motion Verbs in the Native Language (Aristotle University of Thessaloniki, 2014) brill.com/jgl Dissertation Summaries The Acquisition of Aspect and Motion Verbs in the Native Language (Aristotle University of Thessaloniki, 2014) Maria Kotroni Aristotle University of Thessaloniki mkotroni@hotmail.com

More information

Construction Grammar. University of Jena.

Construction Grammar. University of Jena. Construction Grammar Holger Diessel University of Jena holger.diessel@uni-jena.de http://www.holger-diessel.de/ Words seem to have a prototype structure; but language does not only consist of words. What

More information

arxiv: v1 [cs.cl] 2 Apr 2017

arxiv: v1 [cs.cl] 2 Apr 2017 Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,

More information

Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data

Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data Maja Popović and Hermann Ney Lehrstuhl für Informatik VI, Computer

More information

Syntactic Dependencies for Multilingual and Multilevel Corpus Annotation

Syntactic Dependencies for Multilingual and Multilevel Corpus Annotation Syntactic Dependencies for Multilingual and Multilevel Corpus Annotation Simon Mille¹, Leo Wanner¹, ² ¹DTIC, Universitat Pompeu Fabra, ²ICREA C/ Roc Boronat, 138, 08018 Barcelona, Spain simon.mille@upf.edu,

More information

Project in the framework of the AIM-WEST project Annotation of MWEs for translation

Project in the framework of the AIM-WEST project Annotation of MWEs for translation Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment

More information

Annotation Guidelines for Rhetorical Structure

Annotation Guidelines for Rhetorical Structure Annotation Guidelines for Rhetorical Structure Manfred Stede University of Potsdam stede@uni-potsdam.de Debopam Das University of Potsdam debdas@uni-potsdam.de Version 1.0 (March 2017) Maite Taboada Simon

More information

Derivational: Inflectional: In a fit of rage the soldiers attacked them both that week, but lost the fight.

Derivational: Inflectional: In a fit of rage the soldiers attacked them both that week, but lost the fight. Final Exam (120 points) Click on the yellow balloons below to see the answers I. Short Answer (32pts) 1. (6) The sentence The kinder teachers made sure that the students comprehended the testable material

More information

Intension, Attitude, and Tense Annotation in a High-Fidelity Semantic Representation

Intension, Attitude, and Tense Annotation in a High-Fidelity Semantic Representation Intension, Attitude, and Tense Annotation in a High-Fidelity Semantic Representation Gene Kim and Lenhart Schubert Presented by: Gene Kim April 2017 Project Overview Project: Annotate a large, topically

More information

How to analyze visual narratives: A tutorial in Visual Narrative Grammar

How to analyze visual narratives: A tutorial in Visual Narrative Grammar How to analyze visual narratives: A tutorial in Visual Narrative Grammar Neil Cohn 2015 neilcohn@visuallanguagelab.com www.visuallanguagelab.com Abstract Recent work has argued that narrative sequential

More information

Annotation and Taxonomy of Gestures in Lecture Videos

Annotation and Taxonomy of Gestures in Lecture Videos Annotation and Taxonomy of Gestures in Lecture Videos John R. Zhang Kuangye Guo Cipta Herwana John R. Kender Columbia University New York, NY 10027, USA {jrzhang@cs., kg2372@, cjh2148@, jrk@cs.}columbia.edu

More information

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each

More information

LANGUAGE IN INDIA Strength for Today and Bright Hope for Tomorrow Volume 12: 9 September 2012 ISSN

LANGUAGE IN INDIA Strength for Today and Bright Hope for Tomorrow Volume 12: 9 September 2012 ISSN LANGUAGE IN INDIA Strength for Today and Bright Hope for Tomorrow Volume 12: 9 September 2012 ISSN 1930-2940 Managing Editor: M. S. Thirumalai, Ph.D. Editors: B. Mallikarjun, Ph.D. Sam Mohanlal, Ph.D.

More information

Chapter 4: Valence & Agreement CSLI Publications

Chapter 4: Valence & Agreement CSLI Publications Chapter 4: Valence & Agreement Reminder: Where We Are Simple CFG doesn t allow us to cross-classify categories, e.g., verbs can be grouped by transitivity (deny vs. disappear) or by number (deny vs. denies).

More information

A discursive grid approach to model local coherence in multi-document summaries

A discursive grid approach to model local coherence in multi-document summaries Universidade de São Paulo Biblioteca Digital da Produção Intelectual - BDPI Departamento de Ciências de Computação - ICMC/SCC Comunicações em Eventos - ICMC/SCC 2015-09 A discursive grid approach to model

More information

Heuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger

Heuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger Page 1 of 35 Heuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger Kaihong Liu, MD, MS, Wendy Chapman, PhD, Rebecca Hwa, PhD, and Rebecca S. Crowley, MD, MS

More information

A First-Pass Approach for Evaluating Machine Translation Systems

A First-Pass Approach for Evaluating Machine Translation Systems [Proceedings of the Evaluators Forum, April 21st 24th, 1991, Les Rasses, Vaud, Switzerland; ed. Kirsten Falkedal (Geneva: ISSCO).] A First-Pass Approach for Evaluating Machine Translation Systems Pamela

More information

On-Line Data Analytics

On-Line Data Analytics International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob

More information

Parallel Syntactic Annotation of Multiple Languages

Parallel Syntactic Annotation of Multiple Languages Parallel Syntactic Annotation of Multiple Languages Owen Rambow, Bonnie Dorr, David Farwell, Rebecca Green, Nizar Habash Stephen Helmreich, Eduard Hovy, Lori Levin, Keith J. Miller Teruko Mitamura, Florence

More information

Experiments with a Higher-Order Projective Dependency Parser

Experiments with a Higher-Order Projective Dependency Parser Experiments with a Higher-Order Projective Dependency Parser Xavier Carreras Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) 32 Vassar St., Cambridge,

More information

Beyond the Pipeline: Discrete Optimization in NLP

Beyond the Pipeline: Discrete Optimization in NLP Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We

More information

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17. Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link

More information

Providing student writers with pre-text feedback

Providing student writers with pre-text feedback Providing student writers with pre-text feedback Ana Frankenberg-Garcia This paper argues that the best moment for responding to student writing is before any draft is completed. It analyses ways in which

More information

Cross-linguistic aspects in child L2 acquisition

Cross-linguistic aspects in child L2 acquisition 609238IJB0010.1177/1367006915609238International Journal of Bi-lingualismChondrogianni and Vasić research-article2015 Editorial Note Cross-linguistic aspects in child L2 acquisition International Journal

More information

Controlled vocabulary

Controlled vocabulary Indexing languages 6.2.2. Controlled vocabulary Overview Anyone who has struggled to find the exact search term to retrieve information about a certain subject can benefit from controlled vocabulary. Controlled

More information

Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm

Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm syntax: from the Greek syntaxis, meaning setting out together

More information

Private International Law In Czech Republic. By Monika Pauknerová

Private International Law In Czech Republic. By Monika Pauknerová Private International Law In Czech Republic By Monika Pauknerová Applicable Law in the International Arbitration - Applicable Law in the International Conference "Applicable Law in the International Switzerland),

More information

POWLA: Modeling linguistic corpora in OWL/DL

POWLA: Modeling linguistic corpora in OWL/DL POWLA: Modeling linguistic corpora in OWL/DL Christian Chiarcos Information Sciences Institute, University of Southern California, 4676 Admiralty Way # 1001, Marina del Rey, CA 90292 chiarcos@daad-alumni.de

More information

New Features & Functionality in Q Release Version 3.2 June 2016

New Features & Functionality in Q Release Version 3.2 June 2016 in Q Release Version 3.2 June 2016 Contents New Features & Functionality 3 Multiple Applications 3 Class, Student and Staff Banner Applications 3 Attendance 4 Class Attendance 4 Mass Attendance 4 Truancy

More information

Beyond constructions:

Beyond constructions: 2 nd NTU Workshop on Discourse and Grammar in Formosan Languages National Taiwan University, 1 June 2013 Beyond constructions: Takivatan Bunun predicate-argument structure, grammatical coherence, and the

More information

Intensive Writing Class

Intensive Writing Class Intensive Writing Class Student Profile: This class is for students who are committed to improving their writing. It is for students whose writing has been identified as their weakest skill and whose CASAS

More information

Universal Grammar 2. Universal Grammar 1. Forms and functions 1. Universal Grammar 3. Conceptual and surface structure of complex clauses

Universal Grammar 2. Universal Grammar 1. Forms and functions 1. Universal Grammar 3. Conceptual and surface structure of complex clauses Universal Grammar 1 evidence : 1. crosslinguistic investigation of properties of languages 2. evidence from language acquisition 3. general cognitive abilities 1. Properties can be reflected in a.) structural

More information

Developing a large semantically annotated corpus

Developing a large semantically annotated corpus Developing a large semantically annotated corpus Valerio Basile, Johan Bos, Kilian Evang, Noortje Venhuizen Center for Language and Cognition Groningen (CLCG) University of Groningen The Netherlands {v.basile,

More information

LEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE

LEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE LEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE Submitted in partial fulfillment of the requirements for the degree of Sarjana Sastra (S.S.)

More information

Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition

Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition Roy Bar-Haim,Ido Dagan, Iddo Greental, Idan Szpektor and Moshe Friedman Computer Science Department, Bar-Ilan University,

More information

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important

More information

Linguistics. Undergraduate. Departmental Honors. Graduate. Faculty. Linguistics 1

Linguistics. Undergraduate. Departmental Honors. Graduate. Faculty. Linguistics 1 Linguistics 1 Linguistics Matthew Gordon, Chair Interdepartmental Program in the College of Arts and Science 223 Tate Hall (573) 882-6421 gordonmj@missouri.edu Kibby Smith, Advisor Office of Multidisciplinary

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

Mental Models and the Meaning of Connectives: A Study on Children, Adolescents and Adults

Mental Models and the Meaning of Connectives: A Study on Children, Adolescents and Adults Mental Models and the Meaning of Connectives: A Study on Children, Adolescents and Adults Katiuscia Sacco (sacco@psych.unito.it) Monica Bucciarelli (monica@psych.unito.it) Mauro Adenzato (adenzato@psych.unito.it)

More information

The Smart/Empire TIPSTER IR System

The Smart/Empire TIPSTER IR System The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of

More information

Highlighting and Annotation Tips Foundation Lesson

Highlighting and Annotation Tips Foundation Lesson English Highlighting and Annotation Tips Foundation Lesson About this Lesson Annotating a text can be a permanent record of the reader s intellectual conversation with a text. Annotation can help a reader

More information

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,

More information

DYNAMIC ADAPTIVE HYPERMEDIA SYSTEMS FOR E-LEARNING

DYNAMIC ADAPTIVE HYPERMEDIA SYSTEMS FOR E-LEARNING University of Craiova, Romania Université de Technologie de Compiègne, France Ph.D. Thesis - Abstract - DYNAMIC ADAPTIVE HYPERMEDIA SYSTEMS FOR E-LEARNING Elvira POPESCU Advisors: Prof. Vladimir RĂSVAN

More information

Specifying Logic Programs in Controlled Natural Language

Specifying Logic Programs in Controlled Natural Language TECHNICAL REPORT 94.17, DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY OF ZURICH, NOVEMBER 1994 Specifying Logic Programs in Controlled Natural Language Norbert E. Fuchs, Hubert F. Hofmann, Rolf Schwitter

More information

LFG Semantics via Constraints

LFG Semantics via Constraints LFG Semantics via Constraints Mary Dalrymple John Lamping Vijay Saraswat fdalrymple, lamping, saraswatg@parc.xerox.com Xerox PARC 3333 Coyote Hill Road Palo Alto, CA 94304 USA Abstract Semantic theories

More information