Annotation Projection for Discourse Connectives
|
|
- Ethelbert Wilkins
- 6 years ago
- Views:
Transcription
1 SFB 833 / Univ. Tübingen Penn Discourse Treebank Workshop
2 Annotation projection Basic idea: Given a bitext E/F and annotation for F, how would the annotation look for E? Examples: Word Sense Disambiguation (Bentivogli and Pianta, 2005) FrameNet frames (Pado and Lapata, 2005) Coreference chains (Postolache et al., 2006) Ways of realization: Experiments in vitro Adding manual annotation to a parallel corpus Translating an annotated corpus Automatic annotation of a parallel corpus
3 Annotation Experiment Training on Penn Discourse Treebank: Identification, desambiguation of discourse connectives (Pitler and Nenkova, 2009) Finding connective arguments (Wellner, 2009) Automatic annotation EuroParl-EN Training on projection from EuroParl-DE Testing on manual annotated material from TüBa-D/Z
4 Automatic Annotation: Example
5 Automatic Annotation Potential Connectives e.g. and, either... or, as soon as Discourse Connective yes/no Classification into discourse relation using syntactic features Connective string node, parent, left/right neighbour VP in right neighbour? traces, function labels (Pitler and Nenkova, 2009) here: tense/mood of connected sentences
6 Automatic annotation (2) Anaphoricity Finding Connective Arguments Path between connective and arguments Constituents (IN SBAR S) Dependencies+labels ( ccomp advcl) fork production SBAR 1 IN C S 2
7 Automatic Annotation (3) In-Domain (PTB) Identification: 92% P, 98% R Top-Level Relations: 94-95% Medium Taxonomy Level: 84% Arguments: Anaphoricity F=0.975 Arg1 87% Head, 76% PS node Arg2 93% Head, 89% PS node
8 Automatic Annotation (3) In-Domain (PTB) Identification: 92% P, 98% R Top-Level Relations: 94-95% Medium Taxonomy Level: 84% Arguments: Anaphoricity F=0.975 Arg1 87% Head, 76% PS node Arg2 93% Head, 89% PS node Out of Domain (EuroParl) Automatic Parses (Berkeley Parser) Identification: 83% P, 97% R
9 Error analysis OK: I regret this, but the vote has already been taken and [ arg1 the decision is made] [ conn so] [ arg2 let us leave the matter there]. Ich bedauere das, aber die Abstimmung ist durchgeführt worden, [ arg1 die Entscheidung ist gefallen], [ conn also] [ arg2 lassen wir die Dinge]. [ Contingency.Cause.Result ] Misparse in EN: [ arg1 Madam President, [ conn if] [ arg2 the vote records correctly how my Group voted I shall not, and cannot], object to that]. Frau Präsidentin, [ conn wenn] [ arg1/arg2aus der Abstimmung einwandfrei hervorgeht, wie meine Fraktion abgestimmt hat, dann werde und kann ich nichts dagegen sagen]. [ Contingency.Condition.Hypothetical ]
10 Projection Properties of Discourse Connectives Syntactically heterogeneous (Subordinating/coordinating) conjunctions and, after Adverbial PPs (for example) Discontinuous units (if... then, either... or; as soon as) No content words unreliable word alignment
11 CRF-Baseline We will be in a position to judge just as soon as the Commission pres Über Prüfsteine werden wir alsbald verfügen, wenn die Kommission ihr R B-DC B-DC
12 CRF Baseline Each word that is aligned with a connective Direct sequence classifier Word, POS, starting/ending clauses With intersect alignment: P=0.839 R=0.382 F=0.525 With grow-diag-final alignment: P=0.740 R=0.419 F=0.535
13 Lexicon-based List of connectives from the Handbuch der deutschen Konnektoren (Pasch et al., 2003) Baseline: treat all entries as discourse connectives P=0.270 R=0.940 F=0.421
14 Alignment+lexicon We will be in a position to judge just as soon as the Commission pres Über Prüfsteine werden wir alsbald verfügen, wenn die Kommission ihr R
15 Lexicon-based (2) Positive examples: aligned with a discourse connective on English side Negative examples: Unaligned or aligned with a non-discourse occurrence Syntactic features (German): Label of node, parent, left/right neighbour node Discriminative parser (Versley and Rehbein, 2009) P=0.623 R=0.765 F1=0.687
16 Lexicon-based (3) Why is the lexicon-based approach better? Cleaner training data? Better features? Additional experiments: CRF with cleaner training data Replace HdK lexicon with induced lexicon
17 Lexicon induction Candidates: Any sequence occurring as counterparts of English DCs Average Overlap metric: Dice coefficient between candidate string and mapped DC, averaged over all aligned occurrences
18 HdK vs. induced abermals abgesehen davon allenfalls abschließend allerdings als als Beispiel als Ergebnis als Folge also als ob am Ende anders ausgedrückt
19 Results overview Prec Recl F β=1 all HdK simple CRF, giza-refined simple CRF, giza-intersect HdK+classifier induced+classifier HdK+CRF induced+crf
20 Summary Projected training data + lexicon Results in usable classifier for discourse connectives Need evaluation data for relation labels, argument finding Combine with hand-annotated data in target language
21 Thanks for listening! THE END
22 Alignment example: f e, e f Vielen Dank, Herr Segni, das will ich gerne tun. Thank you, Mr Segni, I shall. do sogladly Vielen Dank, Herr Segni, das will ich gerne tun. Thank you, Mr Segni, I shall. do sogladly
23 Alignment example: Union/Intersection Vielen Dank, Herr Segni, das will ich gerne tun. Thank you, Mr Segni, I shall. do sogladly
24 Alignment example: Heuristic refinement Vielen Dank, Herr Segni, das will ich gerne tun. Thank you, Mr Segni, I shall. do sogladly
25 Bentivogli, L. and Pianta, E. (2005). Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor corpus. Natural Language Engineering, 11(3): Carlson, L., Marcu, D., and Okurowski, M. E. (2003). Building a discourse-tagged corpus in the framework of rhetorical structure theory. In Current Directions in Discourse and Dialogue. Kluwer. Pado, S. and Lapata, M. (2005). Cross-lingual projection of role-semantic information. In Proceedings of HLT/EMNLP Pasch, R., Brauße, U., Breindl, E., and Waßner, U. H. (2003). Handbuch der deutschen Konnektoren. Walter de Gruyter, Berlin / New York. Pitler, E. and Nenkova, A. (2009). Using syntax to disambiguate explicit discourse connectives in text. In ACL 2009 short papers. Postolache, O., Cristea, D., and Orasan, C. (2006). Transferring coreference chains through word alignment. In Proceedings of
26 the 5th Language Resources and Evaluation Conference (LREC2006). Prasad, R., Dinesh, N., Lee, A., Miltsakaki, E., Robaldo, L., Joshi, A., and Webber, B. (2008). The Penn Discourse Treebank 2.0. In Proceedings of LREC Versley, Y. and Rehbein, I. (2009). Scalable discriminative parsing for German. In Proc. IWPT Wellner, B. (2009). Sequence Models and Ranking Methods for Discourse Parsing. PhD thesis, Brandeis University. Wolf, F. and Gibson, E. (2005). Representing discourse coherence: A corpus-based study. Computational Linguistics, 31(2):
University of Edinburgh. University of Pennsylvania
Behrens & Fabricius-Hansen (eds.) Structuring information in discourse: the explicit/implicit dimension, Oslo Studies in Language 1(1), 2009. 171-190. (ISSN 1890-9639) http://www.journals.uio.no/osla :
More informationTarget Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data
Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se
More informationThe Discourse Anaphoric Properties of Connectives
The Discourse Anaphoric Properties of Connectives Cassandre Creswell, Kate Forbes, Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi Λ, Bonnie Webber y Λ University of Pennsylvania 3401 Walnut Street Philadelphia,
More informationLTAG-spinal and the Treebank
LTAG-spinal and the Treebank a new resource for incremental, dependency and semantic parsing Libin Shen (lshen@bbn.com) BBN Technologies, 10 Moulton Street, Cambridge, MA 02138, USA Lucas Champollion (champoll@ling.upenn.edu)
More informationDEVELOPMENT OF A MULTILINGUAL PARALLEL CORPUS AND A PART-OF-SPEECH TAGGER FOR AFRIKAANS
DEVELOPMENT OF A MULTILINGUAL PARALLEL CORPUS AND A PART-OF-SPEECH TAGGER FOR AFRIKAANS Julia Tmshkina Centre for Text Techitology, North-West University, 253 Potchefstroom, South Africa 2025770@puk.ac.za
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationSEMAFOR: Frame Argument Resolution with Log-Linear Models
SEMAFOR: Frame Argument Resolution with Log-Linear Models Desai Chen or, The Case of the Missing Arguments Nathan Schneider SemEval July 16, 2010 Dipanjan Das School of Computer Science Carnegie Mellon
More informationCross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels
Cross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels Jörg Tiedemann Uppsala University Department of Linguistics and Philology firstname.lastname@lingfil.uu.se Abstract
More informationPrediction of Maximal Projection for Semantic Role Labeling
Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba
More informationSpecification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments
Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,
More informationDeveloping a TT-MCTAG for German with an RCG-based Parser
Developing a TT-MCTAG for German with an RCG-based Parser Laura Kallmeyer, Timm Lichte, Wolfgang Maier, Yannick Parmentier, Johannes Dellert University of Tübingen, Germany CNRS-LORIA, France LREC 2008,
More informationThe stages of event extraction
The stages of event extraction David Ahn Intelligent Systems Lab Amsterdam University of Amsterdam ahn@science.uva.nl Abstract Event detection and recognition is a complex task consisting of multiple sub-tasks
More informationBasic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1
Basic Parsing with Context-Free Grammars Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Announcements HW 2 to go out today. Next Tuesday most important for background to assignment Sign up
More informationMETHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS
METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS Ruslan Mitkov (R.Mitkov@wlv.ac.uk) University of Wolverhampton ViktorPekar (v.pekar@wlv.ac.uk) University of Wolverhampton Dimitar
More informationGrammars & Parsing, Part 1:
Grammars & Parsing, Part 1: Rules, representations, and transformations- oh my! Sentence VP The teacher Verb gave the lecture 2015-02-12 CS 562/662: Natural Language Processing Game plan for today: Review
More informationApplications of memory-based natural language processing
Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal
More informationProject in the framework of the AIM-WEST project Annotation of MWEs for translation
Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment
More informationThe Role of the Head in the Interpretation of English Deverbal Compounds
The Role of the Head in the Interpretation of English Deverbal Compounds Gianina Iordăchioaia i, Lonneke van der Plas ii, Glorianna Jagfeld i (Universität Stuttgart i, University of Malta ii ) Wen wurmt
More informationSyntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm
Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm syntax: from the Greek syntaxis, meaning setting out together
More informationEnsemble Technique Utilization for Indonesian Dependency Parser
Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id
More informationThe Verbmobil Semantic Database. Humboldt{Univ. zu Berlin. Computerlinguistik. Abstract
The Verbmobil Semantic Database Karsten L. Worm Univ. des Saarlandes Computerlinguistik Postfach 15 11 50 D{66041 Saarbrucken Germany worm@coli.uni-sb.de Johannes Heinecke Humboldt{Univ. zu Berlin Computerlinguistik
More information11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation
tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each
More informationLQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization
LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization Annemarie Friedrich, Marina Valeeva and Alexis Palmer COMPUTATIONAL LINGUISTICS & PHONETICS SAARLAND UNIVERSITY, GERMANY
More informationDeveloping a large semantically annotated corpus
Developing a large semantically annotated corpus Valerio Basile, Johan Bos, Kilian Evang, Noortje Venhuizen Center for Language and Cognition Groningen (CLCG) University of Groningen The Netherlands {v.basile,
More informationApplying Speaking Criteria. For use from November 2010 GERMAN BREAKTHROUGH PAGRB01
Applying Speaking Criteria For use from November 2010 GERMAN BREAKTHROUGH PAGRB01 Contents Introduction 2 1: Breakthrough Stage The Languages Ladder 3 Languages Ladder can do statements for Breakthrough
More informationParsing Morphologically Rich Languages:
1 / 39 Rich Languages: Sandra Kübler Indiana University 2 / 39 Rich Languages joint work with Daniel Dakota, Wolfgang Maier, Joakim Nivre, Djamé Seddah, Reut Tsarfaty, Daniel Whyatt, and many more def.
More informationUsing dialogue context to improve parsing performance in dialogue systems
Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,
More informationSusanne J. Jekat
IUED: Institute for Translation and Interpreting Respeaking: Loss, Addition and Change of Information during the Transfer Process Susanne J. Jekat susanne.jekat@zhaw.ch This work was funded by Swiss TxT
More informationRole of Pausing in Text-to-Speech Synthesis for Simultaneous Interpretation
Role of Pausing in Text-to-Speech Synthesis for Simultaneous Interpretation Vivek Kumar Rangarajan Sridhar, John Chen, Srinivas Bangalore, Alistair Conkie AT&T abs - Research 180 Park Avenue, Florham Park,
More informationTreebank mining with GrETEL. Liesbeth Augustinus Frank Van Eynde
Treebank mining with GrETEL Liesbeth Augustinus Frank Van Eynde GrETEL tutorial - 27 March, 2015 GrETEL Greedy Extraction of Trees for Empirical Linguistics Search engine for treebanks GrETEL Greedy Extraction
More informationAccurate Unlexicalized Parsing for Modern Hebrew
Accurate Unlexicalized Parsing for Modern Hebrew Reut Tsarfaty and Khalil Sima an Institute for Logic, Language and Computation, University of Amsterdam Plantage Muidergracht 24, 1018TV Amsterdam, The
More informationLearning Computational Grammars
Learning Computational Grammars John Nerbonne, Anja Belz, Nicola Cancedda, Hervé Déjean, James Hammerton, Rob Koeling, Stasinos Konstantopoulos, Miles Osborne, Franck Thollard and Erik Tjong Kim Sang Abstract
More informationThe Discourse Effects of the Indefinite Demonstrative dieser in German
The Discourse Effects of the Indefinite Demonstrative dieser in German Annika Deichsel annika.deichsel@ling.uni-stuttgart.de Institut für Linguistik/Germanistik Universität Stuttgart Abstract. This work
More informationMultilingual Sentiment and Subjectivity Analysis
Multilingual Sentiment and Subjectivity Analysis Carmen Banea and Rada Mihalcea Department of Computer Science University of North Texas rada@cs.unt.edu, carmen.banea@gmail.com Janyce Wiebe Department
More informationTowards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. Grzegorz Chrupa la
Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing Grzegorz Chrupa la A dissertation submitted in fulfilment of the requirements for the award of Doctor of Philosophy (Ph.D.)
More informationAnnotating (Anaphoric) Ambiguity 1 INTRODUCTION. Paper presentend at Corpus Linguistics 2005, University of Birmingham, England
Paper presentend at Corpus Linguistics 2005, University of Birmingham, England Annotating (Anaphoric) Ambiguity Massimo Poesio and Ron Artstein University of Essex Language and Computation Group / Department
More informationAn Interactive Intelligent Language Tutor Over The Internet
An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This
More informationModeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures
Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Ulrike Baldewein (ulrike@coli.uni-sb.de) Computational Psycholinguistics, Saarland University D-66041 Saarbrücken,
More informationBYLINE [Heng Ji, Computer Science Department, New York University,
INFORMATION EXTRACTION BYLINE [Heng Ji, Computer Science Department, New York University, hengji@cs.nyu.edu] SYNONYMS NONE DEFINITION Information Extraction (IE) is a task of extracting pre-specified types
More informationA Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books
A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books Yoav Goldberg Bar Ilan University yoav.goldberg@gmail.com Jon Orwant Google Inc. orwant@google.com Abstract We created
More informationInformation Status in Generation Ranking
Aoife Cahill nformation Status in Generation Ranking 1 / 57 nformation Status in Generation Ranking Aoife Cahill joint work with Arndt Riester Heidelberg Computational Linguistics Colloquium December 9,
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationThe Karlsruhe Institute of Technology Translation Systems for the WMT 2011
The Karlsruhe Institute of Technology Translation Systems for the WMT 2011 Teresa Herrmann, Mohammed Mediani, Jan Niehues and Alex Waibel Karlsruhe Institute of Technology Karlsruhe, Germany firstname.lastname@kit.edu
More informationUniversity of Alberta. Large-Scale Semi-Supervised Learning for Natural Language Processing. Shane Bergsma
University of Alberta Large-Scale Semi-Supervised Learning for Natural Language Processing by Shane Bergsma A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of
More informationSpecifying a shallow grammatical for parsing purposes
Specifying a shallow grammatical for parsing purposes representation Atro Voutilainen and Timo J~irvinen Research Unit for Multilingual Language Technology P.O. Box 4 FIN-0004 University of Helsinki Finland
More informationParsing of part-of-speech tagged Assamese Texts
IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal
More informationEnhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities
Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities Yoav Goldberg Reut Tsarfaty Meni Adler Michael Elhadad Ben Gurion
More informationExploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data
Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data Maja Popović and Hermann Ney Lehrstuhl für Informatik VI, Computer
More informationhave to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,
A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994
More informationControl and Boundedness
Control and Boundedness Having eliminated rules, we would expect constructions to follow from the lexical categories (of heads and specifiers of syntactic constructions) alone. Combinatory syntax simply
More informationResolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Jeju Island, South Korea, July 2012, pp. 777--789.
More informationExtracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models
Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Richard Johansson and Alessandro Moschitti DISI, University of Trento Via Sommarive 14, 38123 Trento (TN),
More informationCan Human Verb Associations help identify Salient Features for Semantic Verb Classification?
Can Human Verb Associations help identify Salient Features for Semantic Verb Classification? Sabine Schulte im Walde Institut für Maschinelle Sprachverarbeitung Universität Stuttgart Seminar für Sprachwissenschaft,
More informationPre-Processing MRSes
Pre-Processing MRSes Tore Bruland Norwegian University of Science and Technology Department of Computer and Information Science torebrul@idi.ntnu.no Abstract We are in the process of creating a pipeline
More information2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases
POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz
More informationGraph Alignment for Semi-Supervised Semantic Role Labeling
Graph Alignment for Semi-Supervised Semantic Role Labeling Hagen Fürstenau Dept. of Computational Linguistics Saarland University Saarbrücken, Germany hagenf@coli.uni-saarland.de Mirella Lapata School
More informationCS 598 Natural Language Processing
CS 598 Natural Language Processing Natural language is everywhere Natural language is everywhere Natural language is everywhere Natural language is everywhere!"#$%&'&()*+,-./012 34*5665756638/9:;< =>?@ABCDEFGHIJ5KL@
More informationSome Principles of Automated Natural Language Information Extraction
Some Principles of Automated Natural Language Information Extraction Gregers Koch Department of Computer Science, Copenhagen University DIKU, Universitetsparken 1, DK-2100 Copenhagen, Denmark Abstract
More informationThe Smart/Empire TIPSTER IR System
The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of
More informationBeyond the Pipeline: Discrete Optimization in NLP
Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We
More informationEAGLE: an Error-Annotated Corpus of Beginning Learner German
EAGLE: an Error-Annotated Corpus of Beginning Learner German Adriane Boyd Department of Linguistics The Ohio State University adriane@ling.osu.edu Abstract This paper describes the Error-Annotated German
More informationChinese Language Parsing with Maximum-Entropy-Inspired Parser
Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art
More informationIndian Institute of Technology, Kanpur
Indian Institute of Technology, Kanpur Course Project - CS671A POS Tagging of Code Mixed Text Ayushman Sisodiya (12188) {ayushmn@iitk.ac.in} Donthu Vamsi Krishna (15111016) {vamsi@iitk.ac.in} Sandeep Kumar
More informationThe Internet as a Normative Corpus: Grammar Checking with a Search Engine
The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a
More informationLEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES. Judith Gaspers and Philipp Cimiano
LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES Judith Gaspers and Philipp Cimiano Semantic Computing Group, CITEC, Bielefeld University {jgaspers cimiano}@cit-ec.uni-bielefeld.de ABSTRACT Semantic parsers
More informationHandling Sparsity for Verb Noun MWE Token Classification
Handling Sparsity for Verb Noun MWE Token Classification Mona T. Diab Center for Computational Learning Systems Columbia University mdiab@ccls.columbia.edu Madhav Krishna Computer Science Department Columbia
More informationA Computational Evaluation of Case-Assignment Algorithms
A Computational Evaluation of Case-Assignment Algorithms Miles Calabresi Advisors: Bob Frank and Jim Wood Submitted to the faculty of the Department of Linguistics in partial fulfillment of the requirements
More informationIntroduction. Beáta B. Megyesi. Uppsala University Department of Linguistics and Philology Introduction 1(48)
Introduction Beáta B. Megyesi Uppsala University Department of Linguistics and Philology beata.megyesi@lingfil.uu.se Introduction 1(48) Course content Credits: 7.5 ECTS Subject: Computational linguistics
More informationLanguage Model and Grammar Extraction Variation in Machine Translation
Language Model and Grammar Extraction Variation in Machine Translation Vladimir Eidelman, Chris Dyer, and Philip Resnik UMIACS Laboratory for Computational Linguistics and Information Processing Department
More informationThe MEANING Multilingual Central Repository
The MEANING Multilingual Central Repository J. Atserias, L. Villarejo, G. Rigau, E. Agirre, J. Carroll, B. Magnini, P. Vossen January 27, 2004 http://www.lsi.upc.es/ nlp/meaning Jordi Atserias TALP Index
More informationHeuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger
Page 1 of 35 Heuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger Kaihong Liu, MD, MS, Wendy Chapman, PhD, Rebecca Hwa, PhD, and Rebecca S. Crowley, MD, MS
More informationExploiting Wikipedia as External Knowledge for Named Entity Recognition
Exploiting Wikipedia as External Knowledge for Named Entity Recognition Jun ichi Kazama and Kentaro Torisawa Japan Advanced Institute of Science and Technology (JAIST) Asahidai 1-1, Nomi, Ishikawa, 923-1292
More informationWeb as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics
(L615) Markus Dickinson Department of Linguistics, Indiana University Spring 2013 The web provides new opportunities for gathering data Viable source of disposable corpora, built ad hoc for specific purposes
More informationHyperedge Replacement and Nonprojective Dependency Structures
Hyperedge Replacement and Nonprojective Dependency Structures Daniel Bauer and Owen Rambow Columbia University New York, NY 10027, USA {bauer,rambow}@cs.columbia.edu Abstract Synchronous Hyperedge Replacement
More informationParsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank
Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank Dan Klein and Christopher D. Manning Computer Science Department Stanford University Stanford,
More informationA Comparison of Two Text Representations for Sentiment Analysis
010 International Conference on Computer Application and System Modeling (ICCASM 010) A Comparison of Two Text Representations for Sentiment Analysis Jianxiong Wang School of Computer Science & Educational
More informationA relational approach to translation
A relational approach to translation Rémi Zajac Project POLYGLOSS* University of Stuttgart IMS-CL /IfI-AIS, KeplerstraBe 17 7000 Stuttgart 1, West-Germany zajac@is.informatik.uni-stuttgart.dbp.de Abstract.
More informationA Framework for Customizable Generation of Hypertext Presentations
A Framework for Customizable Generation of Hypertext Presentations Benoit Lavoie and Owen Rambow CoGenTex, Inc. 840 Hanshaw Road, Ithaca, NY 14850, USA benoit, owen~cogentex, com Abstract In this paper,
More informationSurvey on parsing three dependency representations for English
Survey on parsing three dependency representations for English Angelina Ivanova Stephan Oepen Lilja Øvrelid University of Oslo, Department of Informatics { angelii oe liljao }@ifi.uio.no Abstract In this
More informationWhat Can Neural Networks Teach us about Language? Graham Neubig a2-dlearn 11/18/2017
What Can Neural Networks Teach us about Language? Graham Neubig a2-dlearn 11/18/2017 Supervised Training of Neural Networks for Language Training Data Training Model this is an example the cat went to
More informationThe presence of interpretable but ungrammatical sentences corresponds to mismatches between interpretive and productive parsing.
Lecture 4: OT Syntax Sources: Kager 1999, Section 8; Legendre et al. 1998; Grimshaw 1997; Barbosa et al. 1998, Introduction; Bresnan 1998; Fanselow et al. 1999; Gibson & Broihier 1998. OT is not a theory
More informationUnsupervised Learning of Narrative Schemas and their Participants
Unsupervised Learning of Narrative Schemas and their Participants Nathanael Chambers and Dan Jurafsky Stanford University, Stanford, CA 94305 {natec,jurafsky}@stanford.edu Abstract We describe an unsupervised
More informationGrammar Extraction from Treebanks for Hindi and Telugu
Grammar Extraction from Treebanks for Hindi and Telugu Prasanth Kolachina, Sudheer Kolachina, Anil Kumar Singh, Samar Husain, Viswanatha Naidu,Rajeev Sangal and Akshar Bharati Language Technologies Research
More informationThe Choice of Features for Classification of Verbs in Biomedical Texts
The Choice of Features for Classification of Verbs in Biomedical Texts Anna Korhonen University of Cambridge Computer Laboratory 15 JJ Thomson Avenue Cambridge CB3 0FD, UK alk23@cl.cam.ac.uk Yuval Krymolowski
More informationTHE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING
SISOM & ACOUSTICS 2015, Bucharest 21-22 May THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING MarilenaăLAZ R 1, Diana MILITARU 2 1 Military Equipment and Technologies Research Agency, Bucharest,
More informationThe MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation
The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation AUTHORS AND AFFILIATIONS MSR: Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova,
More informationNetpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models
Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models 1 Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models James B.
More informationSemantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition
Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition Roy Bar-Haim,Ido Dagan, Iddo Greental, Idan Szpektor and Moshe Friedman Computer Science Department, Bar-Ilan University,
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationTowards a MWE-driven A* parsing with LTAGs [WG2,WG3]
Towards a MWE-driven A* parsing with LTAGs [WG2,WG3] Jakub Waszczuk, Agata Savary To cite this version: Jakub Waszczuk, Agata Savary. Towards a MWE-driven A* parsing with LTAGs [WG2,WG3]. PARSEME 6th general
More informationEvaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment
Evaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment Akiko Sakamoto, Kazuhiko Abe, Kazuo Sumita and Satoshi Kamatani Knowledge Media Laboratory,
More information2.1 The Theory of Semantic Fields
2 Semantic Domains In this chapter we define the concept of Semantic Domain, recently introduced in Computational Linguistics [56] and successfully exploited in NLP [29]. This notion is inspired by the
More informationUNIVERSITY OF OSLO Department of Informatics. Dialog Act Recognition using Dependency Features. Master s thesis. Sindre Wetjen
UNIVERSITY OF OSLO Department of Informatics Dialog Act Recognition using Dependency Features Master s thesis Sindre Wetjen November 15, 2013 Acknowledgments First I want to thank my supervisors Lilja
More informationFinding Translations in Scanned Book Collections
Finding Translations in Scanned Book Collections Ismet Zeki Yalniz Dept. of Computer Science University of Massachusetts Amherst, MA, 01003 zeki@cs.umass.edu R. Manmatha Dept. of Computer Science University
More informationChunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.
NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and
More informationBuilding a Semantic Role Labelling System for Vietnamese
Building a emantic Role Labelling ystem for Vietnamese Thai-Hoang Pham FPT University hoangpt@fpt.edu.vn Xuan-Khoai Pham FPT University khoaipxse02933@fpt.edu.vn Phuong Le-Hong Hanoi University of cience
More informationTheoretical Syntax Winter Answers to practice problems
Linguistics 325 Sturman Theoretical Syntax Winter 2017 Answers to practice problems 1. Draw trees for the following English sentences. a. I have not been running in the mornings. 1 b. Joel frequently sings
More informationObjectives. Chapter 2: The Representation of Knowledge. Expert Systems: Principles and Programming, Fourth Edition
Chapter 2: The Representation of Knowledge Expert Systems: Principles and Programming, Fourth Edition Objectives Introduce the study of logic Learn the difference between formal logic and informal logic
More informationUsing Semantic Relations to Refine Coreference Decisions
Using Semantic Relations to Refine Coreference Decisions Heng Ji David Westbrook Ralph Grishman Department of Computer Science New York University New York, NY, 10003, USA hengji@cs.nyu.edu westbroo@cs.nyu.edu
More information