METEOR-Hindi : Automatic MT Evaluation Metric for Hindi as a Target Language
|
|
- Jesse Neal
- 6 years ago
- Views:
Transcription
1 METEOR-Hindi : Automatic MT Evaluation Metric for Hindi as a Target Language Ankush Gupta, Sriram Venkatapathy and Rajeev Sangal Language Technologies Research Centre IIIT-Hyderabad
2 NEED FOR MT EVALUATION MT Systems becoming widespread How well do they work in practice? Are they reliable enough? Absolute vs. Relative quality MT is a technology still in research stages How can we tell if we are making progress? Metrics that can drive experimental development. 12/10/10 2
3 Human vs. Automatic MT Evaluation Human MT Evaluations Subjective Expensive Time-consuming Cannot be reused Most reliable Automatic MT Evaluations Objective Cheap Fast Reusable Highly correlated with subjective evaluation? 12/10/10 3
4 Problems with BLEU Metric (Papineni et al., 2002) Sentence-level Scores not based on meaning (Liu et al. 2005; Liu and Gildea., 2006) Only Exact Matches Lack of Recall Equal weightage Admits too much variation by using Higher Order n-grams for Fluency and Grammaticality Geometric averaging of n-grams 12/10/10 4
5 Important Aspects in English-Hindi MT Word order not so important : Does not strictly convey grammatical roles of words. Correct Case marking Morphological richness The boys are playing cricket लड़क कक ट ख ल रह ह The boys brought the book from the market लड़क न ब ज़ र स कत ब खर द Synonym Matching 12/10/10 5
6 METEOR (Banerjee and Lavie., 2005) Creates a word alignment between reference(s) and test sentence 3 series of word mapping modules : Exact Match Stem Match Synonym Match Each module maps words not mapped in any of earlier stage(s) Score is computed as the harmonic mean of unigram precision and recall. Additional penalties computed to capture order of words 12/10/10 6
7 Advantages of METEOR for Eng-Hnd MT Evaluation Flexible word matching (Exact, Stem and Synonym) Unigram matching : Not rely totally on word-order, thus suitable for Hindi which is relatively free word-order language Uses and Emphasizes Recall in addition to Precision Parametrized Features : Can be tuned for diff. types of human judgements and for diff. Languages (Lavie and Denkowski., 2009) 12/10/10 7
8 METEOR-Hindi Aligner Extended implementation of METEOR to support evaluation of translation into Hindi Word alignment algorithm same New stemming module for Hindi : Hindi Morph Analyzer used New synonym module for Hindi : Hindi Wordnet 1.2 used 12/10/10 8
9 METEOR-Hindi Parameters Apart from using only word-based features, also use other linguistic features like Local word Group, Part-of-Speech and Clause Match. 12/10/10 9
10 METEOR-Hindi Parameters Local Word Group Match : Consist of a content word and associated function word(s). Reference : Test1 : Test2 : बल न क त क म र क त क बल न म र क त न बल क म र Reference Test1 Test2 बल न बल न बल क क त क क त क क त न म र म र म र 12/10/10 10
11 METEOR-Hindi Parameters Part-of-Speech Matching : Compute number of matching words with same POS tag. Used CRF POS tagger. Reference : Test : र म ख ल रह ह ख ल र म रह ह Reference POS Test POS र म NN ख ल NNP ख ल VM र म NNP रह VAUX रह VM ह VAUX ह VAUX 12/10/10 11
12 METEOR-Hindi Parameters Clause Match : Phrase containing exactly one verb (finite or non-finite). Used Hindi clause boundary identifier. Reference : Test : र हत सक ल ज कर क दन लग र हत सक ल ज कर ख लन लग Reference र हत सक ल ज कर क दन लग Test र हत सक ल ज कर ख लन लग 12/10/10 12
13 METEOR-Hindi Scoring Function Stage Features Exact Match Precision, Recall Stem Match Precision, Recall Synonym Match Precision, Recall Local Word Group Precision, Recall Part-of-Speech Precision, Recall Clause Match Precision, Recall Parameters Used for Scoring in METEOR-Hindi Score, s = ( W i * f i ) / ( W i ) [W i : Weight of feature i ] 12/10/10 13
14 METEOR-Hindi Scoring Function Penalty not used in METEOR-Hindi General equation for scoring facilitates use of standard Machine Learning techniques Due to unavailability of high-quality training data, currently all the weights are set to 1 12/10/10 14
15 Example Reference Test : र हत न ब ज़ र स प सतक खर द : ब ज़ र स र हत कत ब खर दन Exact Matches : ब ज़ र, स, र हत Stem Matches : ब ज़ र, स, र हत, खर दन Synonym Matches : ब ज़ र, स, र हत, कत ब, खर दन LWG Matches : ब ज़ र स, कत ब, खर दन POS Matches : ब ज़ र, स, कत ब, खर दन Clause Matches : - 12/10/10 15
16 Example Exact Match Precision : f1 = 3/5 Exact Match Recall : f2 = 3/6 Stem Match Precision : f3 = 4/5 Stem Match Recall : f4 = 4/6 Synonym Match Precision : f5 = 5/5 Synonym Match Recall : f6 = 5/6 LWG Precision : f7 = 3/4 LWG Recall : f8 = 3/4 POS Precision : f9 = 4/5 POS Recall : f10 = 4/6 Clause Precision : f11 = 0/5 Clause Recall : f12 = 0/6 12/10/10 16
17 Example Score = ( W i * f i ) / ( W i ) = (1*3/5 + 1*3/ *0/6) / (12) = /10/10 17
18 Tools Used Morph Analyzer : Hindi Morph Hindi Wordnet : Hindi Wordnet 1.2 (Jha et al., 2001) CRF Part of Speech (POS) Tagger (PVS and Karthik G, 2007) Hindi Local Word Grouper (Bharati et al., 1998) Hindi Clause Boundary Identifier (Developed at IIIT-H as part of ILMT Project) 12/10/10 18
19 Experiments and Results Dataset of 100 sentences ; 60 test translations from System1 and 40 from System2 Number of Sentences 100 Avg. Test Sentence length Avg. Ref. Sentence length Exact Matches 433 Stem Matches 574 Synonym Matches 622 LWG Matches 426 POS Matches 576 Clause Matches 9 12/10/10 19
20 Experiments and Results Metric Features Correlation BLEU METEOR-Hindi Exact METEOR-Hindi Exact + Stem METEOR-Hindi Exact + Stem + Synonym METEOR-Hindi Exact + Stem + Synonym LWG METEOR-Hindi Exact + Stem + Synonym POS METEOR-Hindi Exact + Stem + Synonym Clause METEOR-Hindi Exact + Stem + Synonym + LWG + POS + Clause /10/10 20
21 Experiments and Results Reason of low correlation with BLEU is it gives score zero in most of the sentences 12/10/10 21
22 Experiments and Results METEOR-Hindi Human 12/10/10 22
23 Experiments and Results Highest correlation (0.703) obtained using Exact, Stem, Synonym and POS features. Using linguistic features ( stemming, synonym ) resulted in better correlation Surprisingly, LWG and Clause features did not show increase in correlation Errors in reference sentences like ह र क, उनम स and ग ड़य म while it should be ह र क, उनम स and ग ड़य म 12/10/10 23
24 Experiments and Results Adding Clause feature also decreased correlation Only 9 clauses matched in 100 sentences Most of sentences have only one verb; so entire sentence is a clause Clause being a much higher level concept than words, scores should be less penalized if clauses do not match 12/10/10 24
25 Experiments and Results Scatterplot of BLEU and Human Score 12/10/10 25
26 Experiments and Results Scatterplot of METEOR-Hindi and Human Score 12/10/10 26
27 Experiments and Results Metric Average-Score BLEU METEOR-Hindi Human Average Scores Compared MT System1 and System2 METEOR-Hindi correlated better (as compared to BLEU) Human annotators as well as METOR-Hindi gave greater score to System2 while BLEU ranked System1 higher 12/10/10 27
28 Experiments and Results Metric System Average Pearson correlation Score BLEU System METEOR-Hindi System Human System Metric System Average Pearson correlation Score BLEU System METEOR-Hindi System Human System /10/10 28
29 Future Work Train METEOR-Hindi on large amount of high-quality data and find optimum values of weightages for various parameters Use additional features like paraphrase match to achieve better correlation Putting up tool online for others to use 12/10/10 29
30 REFERENCES Some Issues in Automatic Evaluation of English-Hindi MT : More Blues for Bleu Ananthakrishnan et al., 2007 Bleu: a method for automatic evaluation of machine translation. Papineni et al., 2002 Meteor: An automatic metric for mt evaluation with improved correlation with human judgments. Satanjeev Banerjee and Alon Lavie Meteor: An automatic metric for mt evaluation with high levels of correlation with human judgments. Alon Lavie and Abhay Agarwal., 2007 The meteor metric for automatic evaluation of machine translation. Alon Lavie and Michael Denkowski, 2009 Local Word Grouping and its relevance to Indian Languages. Bharati et al., 1991 Part-of-speech tagging and chunking using conditional random fields and transformation based learning. Avinesh PVS and Karthik G., /10/10 30
31 REFERENCES Re-evaluating the role of BLEU in machine translation research. Chris Callison-Burch et al The significance of recall in automatic metrics for mt evaluation. Alon Lavie et al., 2004 Syntactic features for evaluation of machine translation. Liu et al., 2004 Stochastic iterative alignment for machine translation evaluation. Liu et al., 2005 A wordnet for Hindi. Jha et al., /10/10 31
32 THANK YOU 12/10/10 32
क त क ई-व द य लय पत र क 2016 KENDRIYA VIDYALAYA ADILABAD
क त क ई-व द य लय पत र क 2016 KENDRIYA VIDYALAYA ADILABAD FROM PRINCIPAL S KALAM Dear all, Only when one is equipped with both, worldly education for living and spiritual education, he/she deserves respect
More informationarxiv: v1 [cs.cl] 2 Apr 2017
Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,
More informationHinMA: Distributed Morphology based Hindi Morphological Analyzer
HinMA: Distributed Morphology based Hindi Morphological Analyzer Ankit Bahuguna TU Munich ankitbahuguna@outlook.com Lavita Talukdar IIT Bombay lavita.talukdar@gmail.com Pushpak Bhattacharyya IIT Bombay
More informationDCA प रय जन क य म ग नद शक द र श नद श लय मह म ग ध अ तरर य ह द व व व लय प ट ह द व व व लय, ग ध ह स, वध (मह र ) DCA-09 Project Work Handbook
मह म ग ध अ तरर य ह द व व व लय (स सद र प रत अ ध नयम 1997, म क 3 क अ तगत थ पत क य व व व लय) Mahatma Gandhi Antarrashtriya Hindi Vishwavidyalaya (A Central University Established by Parliament by Act No.
More informationTINE: A Metric to Assess MT Adequacy
TINE: A Metric to Assess MT Adequacy Miguel Rios, Wilker Aziz and Lucia Specia Research Group in Computational Linguistics University of Wolverhampton Stafford Street, Wolverhampton, WV1 1SB, UK {m.rios,
More informationRe-evaluating the Role of Bleu in Machine Translation Research
Re-evaluating the Role of Bleu in Machine Translation Research Chris Callison-Burch Miles Osborne Philipp Koehn School on Informatics University of Edinburgh 2 Buccleuch Place Edinburgh, EH8 9LW callison-burch@ed.ac.uk
More informationS. RAZA GIRLS HIGH SCHOOL
S. RAZA GIRLS HIGH SCHOOL SYLLABUS SESSION 2017-2018 STD. III PRESCRIBED BOOKS ENGLISH 1) NEW WORLD READER 2) THE ENGLISH CHANNEL 3) EASY ENGLISH GRAMMAR SYLLABUS TO BE COVERED MONTH NEW WORLD READER THE
More informationCROSS LANGUAGE INFORMATION RETRIEVAL: IN INDIAN LANGUAGE PERSPECTIVE
CROSS LANGUAGE INFORMATION RETRIEVAL: IN INDIAN LANGUAGE PERSPECTIVE Pratibha Bajpai 1, Dr. Parul Verma 2 1 Research Scholar, Department of Information Technology, Amity University, Lucknow 2 Assistant
More informationTwo methods to incorporate local morphosyntactic features in Hindi dependency
Two methods to incorporate local morphosyntactic features in Hindi dependency parsing Bharat Ram Ambati, Samar Husain, Sambhav Jain, Dipti Misra Sharma and Rajeev Sangal Language Technologies Research
More informationDetection of Multiword Expressions for Hindi Language using Word Embeddings and WordNet-based Features
Detection of Multiword Expressions for Hindi Language using Word Embeddings and WordNet-based Features Dhirendra Singh Sudha Bhingardive Kevin Patel Pushpak Bhattacharyya Department of Computer Science
More informationRegression for Sentence-Level MT Evaluation with Pseudo References
Regression for Sentence-Level MT Evaluation with Pseudo References Joshua S. Albrecht and Rebecca Hwa Department of Computer Science University of Pittsburgh {jsa8,hwa}@cs.pitt.edu Abstract Many automatic
More informationवण म गळ ग र प ज http://www.mantraaonline.com/ वण म गळ ग र प ज Check List 1. Altar, Deity (statue/photo), 2. Two big brass lamps (with wicks, oil/ghee) 3. Matchbox, Agarbatti 4. Karpoor, Gandha Powder,
More informationENGLISH Month August
ENGLISH 2016-17 April May Topic Literature Reader (a) How I taught my Grand Mother to read (Prose) (b) The Brook (poem) Main Course Book :People Work Book :Verb Forms Objective Enable students to realise
More informationMachine Translation on the Medical Domain: The Role of BLEU/NIST and METEOR in a Controlled Vocabulary Setting
Machine Translation on the Medical Domain: The Role of BLEU/NIST and METEOR in a Controlled Vocabulary Setting Andre CASTILLA castilla@terra.com.br Alice BACIC Informatics Service, Instituto do Coracao
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 95 APRIL
The Prague Bulletin of Mathematical Linguistics NUMBER 95 APRIL 2011 33 50 Machine Learning Approach for the Classification of Demonstrative Pronouns for Indirect Anaphora in Hindi News Items Kamlesh Dutta
More informationQuestion (1) Question (2) RAT : SEW : : NOW :? (A) OPY (B) SOW (C) OSZ (D) SUY. Correct Option : C Explanation : Question (3)
Question (1) Correct Option : D (D) The tadpole is a young one's of frog and frogs are amphibians. The lamb is a young one's of sheep and sheep are mammals. Question (2) RAT : SEW : : NOW :? (A) OPY (B)
More informationChunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.
NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and
More informationह द स ख! Hindi Sikho!
ह द स ख! Hindi Sikho! by Shashank Rao Section 1: Introduction to Hindi In order to learn Hindi, you first have to understand its history and structure. Hindi is descended from an Indo-Aryan language known
More informationWeb as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics
(L615) Markus Dickinson Department of Linguistics, Indiana University Spring 2013 The web provides new opportunities for gathering data Viable source of disposable corpora, built ad hoc for specific purposes
More informationCross Language Information Retrieval
Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................
More informationGrammar Extraction from Treebanks for Hindi and Telugu
Grammar Extraction from Treebanks for Hindi and Telugu Prasanth Kolachina, Sudheer Kolachina, Anil Kumar Singh, Samar Husain, Viswanatha Naidu,Rajeev Sangal and Akshar Bharati Language Technologies Research
More informationSpecification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments
Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationF.No.29-3/2016-NVS(Acad.) Dated: Sub:- Organisation of Cluster/Regional/National Sports & Games Meet and Exhibition reg.
नव दय ववद य लय सम त (म नव स स धन ववक स म त र लय क एक स व यत स स न, ववद य लय श क ष एव स क षरत ववभ ग, भ रत सरक र) ब -15, इन स लयट य यन नल एयरय, स क लर 62, न यड, उत तर रद 201 309 NAVODAYA VIDYALAYA SAMITI
More informationDomain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling
Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling Pratyush Banerjee, Sudip Kumar Naskar, Johann Roturier 1, Andy Way 2, Josef van Genabith
More informationThe Role of String Similarity Metrics in Ontology Alignment
The Role of String Similarity Metrics in Ontology Alignment Michelle Cheatham and Pascal Hitzler August 9, 2013 1 Introduction Tim Berners-Lee originally envisioned a much different world wide web than
More informationDistant Supervised Relation Extraction with Wikipedia and Freebase
Distant Supervised Relation Extraction with Wikipedia and Freebase Marcel Ackermann TU Darmstadt ackermann@tk.informatik.tu-darmstadt.de Abstract In this paper we discuss a new approach to extract relational
More informationScienceDirect. Malayalam question answering system
Available online at www.sciencedirect.com ScienceDirect Procedia Technology 24 (2016 ) 1388 1392 International Conference on Emerging Trends in Engineering, Science and Technology (ICETEST - 2015) Malayalam
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationTimeline. Recommendations
Introduction Advanced Placement Course Credit Alignment Recommendations In 2007, the State of Ohio Legislature passed legislation mandating the Board of Regents to recommend and the Chancellor to adopt
More informationIntroduction to Simulation
Introduction to Simulation Spring 2010 Dr. Louis Luangkesorn University of Pittsburgh January 19, 2010 Dr. Louis Luangkesorn ( University of Pittsburgh ) Introduction to Simulation January 19, 2010 1 /
More informationIndian Institute of Technology, Kanpur
Indian Institute of Technology, Kanpur Course Project - CS671A POS Tagging of Code Mixed Text Ayushman Sisodiya (12188) {ayushmn@iitk.ac.in} Donthu Vamsi Krishna (15111016) {vamsi@iitk.ac.in} Sandeep Kumar
More informationOn document relevance and lexical cohesion between query terms
Information Processing and Management 42 (2006) 1230 1247 www.elsevier.com/locate/infoproman On document relevance and lexical cohesion between query terms Olga Vechtomova a, *, Murat Karamuftuoglu b,
More informationhave to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,
A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994
More informationSoftware Maintenance
1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationPredicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks
Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com
More informationIndividual Differences & Item Effects: How to test them, & how to test them well
Individual Differences & Item Effects: How to test them, & how to test them well Individual Differences & Item Effects Properties of subjects Cognitive abilities (WM task scores, inhibition) Gender Age
More informationLeveraging Sentiment to Compute Word Similarity
Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global
More informationDetecting English-French Cognates Using Orthographic Edit Distance
Detecting English-French Cognates Using Orthographic Edit Distance Qiongkai Xu 1,2, Albert Chen 1, Chang i 1 1 The Australian National University, College of Engineering and Computer Science 2 National
More informationExperiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling
Experiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling Notebook for PAN at CLEF 2013 Andrés Alfonso Caurcel Díaz 1 and José María Gómez Hidalgo 2 1 Universidad
More informationThe Karlsruhe Institute of Technology Translation Systems for the WMT 2011
The Karlsruhe Institute of Technology Translation Systems for the WMT 2011 Teresa Herrmann, Mohammed Mediani, Jan Niehues and Alex Waibel Karlsruhe Institute of Technology Karlsruhe, Germany firstname.lastname@kit.edu
More informationCorpus Linguistics (L615)
(L615) Basics of Markus Dickinson Department of, Indiana University Spring 2013 1 / 23 : the extent to which a sample includes the full range of variability in a population distinguishes corpora from archives
More informationAGS THE GREAT REVIEW GAME FOR PRE-ALGEBRA (CD) CORRELATED TO CALIFORNIA CONTENT STANDARDS
AGS THE GREAT REVIEW GAME FOR PRE-ALGEBRA (CD) CORRELATED TO CALIFORNIA CONTENT STANDARDS 1 CALIFORNIA CONTENT STANDARDS: Chapter 1 ALGEBRA AND WHOLE NUMBERS Algebra and Functions 1.4 Students use algebraic
More informationTwitter Sentiment Classification on Sanders Data using Hybrid Approach
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders
More informationEdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar
EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar Chung-Chi Huang Mei-Hua Chen Shih-Ting Huang Jason S. Chang Institute of Information Systems and Applications, National Tsing Hua University,
More informationAustralian Journal of Basic and Applied Sciences
AENSI Journals Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Journal home page: www.ajbasweb.com Feature Selection Technique Using Principal Component Analysis For Improving Fuzzy C-Mean
More informationCX 101/201/301 Latin Language and Literature 2015/16
The University of Warwick Department of Classics and Ancient History CX 101/201/301 Latin Language and Literature 2015/16 Module tutor: Clive Letchford Humanities Building 2.21 c.a.letchford@warwick.ac.uk
More informationLearning From the Past with Experiment Databases
Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University
More informationA Simple Surface Realization Engine for Telugu
A Simple Surface Realization Engine for Telugu Sasi Raja Sekhar Dokkara, Suresh Verma Penumathsa Dept. of Computer Science Adikavi Nannayya University, India dsairajasekhar@gmail.com,vermaps@yahoo.com
More informationThe stages of event extraction
The stages of event extraction David Ahn Intelligent Systems Lab Amsterdam University of Amsterdam ahn@science.uva.nl Abstract Event detection and recognition is a complex task consisting of multiple sub-tasks
More informationCross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels
Cross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels Jörg Tiedemann Uppsala University Department of Linguistics and Philology firstname.lastname@lingfil.uu.se Abstract
More informationEnsemble Technique Utilization for Indonesian Dependency Parser
Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id
More informationValue Creation Through! Integration Workshop! Value Stream Analysis and Mapping for PD! January 31, 2002!
Presented by:! Hugh McManus for Rich Millard! MIT! Value Creation Through! Integration Workshop! Value Stream Analysis and Mapping for PD!!!! January 31, 2002! Steps in Lean Thinking (Womack and Jones)!
More informationBANGLA TO ENGLISH TEXT CONVERSION USING OPENNLP TOOLS
Daffodil International University Institutional Repository DIU Journal of Science and Technology Volume 8, Issue 1, January 2013 2013-01 BANGLA TO ENGLISH TEXT CONVERSION USING OPENNLP TOOLS Uddin, Sk.
More informationUnderstanding and Interpreting the NRC s Data-Based Assessment of Research-Doctorate Programs in the United States (2010)
Understanding and Interpreting the NRC s Data-Based Assessment of Research-Doctorate Programs in the United States (2010) Jaxk Reeves, SCC Director Kim Love-Myers, SCC Associate Director Presented at UGA
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationMulti-Lingual Text Leveling
Multi-Lingual Text Leveling Salim Roukos, Jerome Quin, and Todd Ward IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 {roukos,jlquinn,tward}@us.ibm.com Abstract. Determining the language proficiency
More informationEvaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment
Evaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment Akiko Sakamoto, Kazuhiko Abe, Kazuo Sumita and Satoshi Kamatani Knowledge Media Laboratory,
More informationAccreditation of Prior Experiential and Certificated Learning (APECL) Guidance for Applicants/Students
Accreditation of Prior Experiential and Certificated Learning (APECL) Guidance for Applicants/Students The following guidance notes set provide an overview for applicants and students in relation to making
More informationDerivational and Inflectional Morphemes in Pak-Pak Language
Derivational and Inflectional Morphemes in Pak-Pak Language Agustina Situmorang and Tima Mariany Arifin ABSTRACT The objectives of this study are to find out the derivational and inflectional morphemes
More informationThe Internet as a Normative Corpus: Grammar Checking with a Search Engine
The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a
More informationA hybrid approach to translate Moroccan Arabic dialect
A hybrid approach to translate Moroccan Arabic dialect Ridouane Tachicart Mohammadia school of Engineers Mohamed Vth Agdal University, Rabat, Morocco tachicart@gmail.com Karim Bouzoubaa Mohammadia school
More informationWhat is PDE? Research Report. Paul Nichols
What is PDE? Research Report Paul Nichols December 2013 WHAT IS PDE? 1 About Pearson Everything we do at Pearson grows out of a clear mission: to help people make progress in their lives through personalized
More informationNamed Entity Recognition: A Survey for the Indian Languages
Named Entity Recognition: A Survey for the Indian Languages Padmaja Sharma Dept. of CSE Tezpur University Assam, India 784028 psharma@tezu.ernet.in Utpal Sharma Dept.of CSE Tezpur University Assam, India
More informationPhonological Processing for Urdu Text to Speech System
Phonological Processing for Urdu Text to Speech System Sarmad Hussain Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences, B Block, Faisal Town, Lahore,
More informationIterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages
Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages Nuanwan Soonthornphisaj 1 and Boonserm Kijsirikul 2 Machine Intelligence and Knowledge Discovery Laboratory Department of Computer
More informationTarget Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data
Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se
More informationRadius STEM Readiness TM
Curriculum Guide Radius STEM Readiness TM While today s teens are surrounded by technology, we face a stark and imminent shortage of graduates pursuing careers in Science, Technology, Engineering, and
More informationLanguage Model and Grammar Extraction Variation in Machine Translation
Language Model and Grammar Extraction Variation in Machine Translation Vladimir Eidelman, Chris Dyer, and Philip Resnik UMIACS Laboratory for Computational Linguistics and Information Processing Department
More informationApplications of memory-based natural language processing
Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal
More informationLearning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models
Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za
More information11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation
tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each
More informationThe College Board Redesigned SAT Grade 12
A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.
More informationSouth Carolina English Language Arts
South Carolina English Language Arts A S O F J U N E 2 0, 2 0 1 0, T H I S S TAT E H A D A D O P T E D T H E CO M M O N CO R E S TAT E S TA N DA R D S. DOCUMENTS REVIEWED South Carolina Academic Content
More informationStatewide Framework Document for:
Statewide Framework Document for: 270301 Standards may be added to this document prior to submission, but may not be removed from the framework to meet state credit equivalency requirements. Performance
More informationContext Free Grammars. Many slides from Michael Collins
Context Free Grammars Many slides from Michael Collins Overview I An introduction to the parsing problem I Context free grammars I A brief(!) sketch of the syntax of English I Examples of ambiguous structures
More informationFirst Grade Standards
These are the standards for what is taught throughout the year in First Grade. It is the expectation that these skills will be reinforced after they have been taught. Mathematical Practice Standards Taught
More informationThe Ups and Downs of Preposition Error Detection in ESL Writing
The Ups and Downs of Preposition Error Detection in ESL Writing Joel R. Tetreault Educational Testing Service 660 Rosedale Road Princeton, NJ, USA JTetreault@ets.org Martin Chodorow Hunter College of CUNY
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationMeasuring the relative compositionality of verb-noun (V-N) collocations by integrating features
Measuring the relative compositionality of verb-noun (V-N) collocations by integrating features Sriram Venkatapathy Language Technologies Research Centre, International Institute of Information Technology
More information2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases
POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz
More informationExtracting Verb Expressions Implying Negative Opinions
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence Extracting Verb Expressions Implying Negative Opinions Huayi Li, Arjun Mukherjee, Jianfeng Si, Bing Liu Department of Computer
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Ch 2 Test Remediation Work Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) High temperatures in a certain
More informationThe Good Judgment Project: A large scale test of different methods of combining expert predictions
The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania
More informationTABE 9&10. Revised 8/2013- with reference to College and Career Readiness Standards
TABE 9&10 Revised 8/2013- with reference to College and Career Readiness Standards LEVEL E Test 1: Reading Name Class E01- INTERPRET GRAPHIC INFORMATION Signs Maps Graphs Consumer Materials Forms Dictionary
More informationCalifornia Department of Education English Language Development Standards for Grade 8
Section 1: Goal, Critical Principles, and Overview Goal: English learners read, analyze, interpret, and create a variety of literary and informational text types. They develop an understanding of how language
More informationNoisy SMS Machine Translation in Low-Density Languages
Noisy SMS Machine Translation in Low-Density Languages Vladimir Eidelman, Kristy Hollingshead, and Philip Resnik UMIACS Laboratory for Computational Linguistics and Information Processing Department of
More informationPrediction of Maximal Projection for Semantic Role Labeling
Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba
More informationCopyright Corwin 2015
2 Defining Essential Learnings How do I find clarity in a sea of standards? For students truly to be able to take responsibility for their learning, both teacher and students need to be very clear about
More informationYoshida Honmachi, Sakyo-ku, Kyoto, Japan 1 Although the label set contains verb phrases, they
FlowGraph2Text: Automatic Sentence Skeleton Compilation for Procedural Text Generation 1 Shinsuke Mori 2 Hirokuni Maeta 1 Tetsuro Sasada 2 Koichiro Yoshino 3 Atsushi Hashimoto 1 Takuya Funatomi 2 Yoko
More information1.11 I Know What Do You Know?
50 SECONDARY MATH 1 // MODULE 1 1.11 I Know What Do You Know? A Practice Understanding Task CC BY Jim Larrison https://flic.kr/p/9mp2c9 In each of the problems below I share some of the information that
More informationTest Effort Estimation Using Neural Network
J. Software Engineering & Applications, 2010, 3: 331-340 doi:10.4236/jsea.2010.34038 Published Online April 2010 (http://www.scirp.org/journal/jsea) 331 Chintala Abhishek*, Veginati Pavan Kumar, Harish
More informationRule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationLanguage Independent Passage Retrieval for Question Answering
Language Independent Passage Retrieval for Question Answering José Manuel Gómez-Soriano 1, Manuel Montes-y-Gómez 2, Emilio Sanchis-Arnal 1, Luis Villaseñor-Pineda 2, Paolo Rosso 1 1 Polytechnic University
More informationSTT 231 Test 1. Fill in the Letter of Your Choice to Each Question in the Scantron. Each question is worth 2 point.
STT 231 Test 1 Fill in the Letter of Your Choice to Each Question in the Scantron. Each question is worth 2 point. 1. A professor has kept records on grades that students have earned in his class. If he
More informationA heuristic framework for pivot-based bilingual dictionary induction
2013 International Conference on Culture and Computing A heuristic framework for pivot-based bilingual dictionary induction Mairidan Wushouer, Toru Ishida, Donghui Lin Department of Social Informatics,
More informationESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly
ESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly Inflected Languages Classical Approaches to Tagging The slides are posted on the web. The url is http://chss.montclair.edu/~feldmana/esslli10/.
More information