Improved Word Alignments for Statistical Machine Translation

1 Improved Word Alignments for Statistical Machine Translation Institut für Maschinelle Sprachverarbeitung Universität Stuttgart Universität Heidelberg

2 Outline
- Intro to statistical machine translation (SMT)
  - How to build an SMT system
  - SMT terminology
- What are word alignments?
- Improving word alignments for SMT
  - Evaluating quality
  - New model
  - New training algorithm

3 How to Build an SMT System
- Start with a large parallel corpus
  - Consists of document pairs (a document and its translation)
- Sentence alignment: in each document pair, automatically find those sentences which are translations of one another
  - Results in sentence pairs (a sentence and its translation)
- Word alignment: in each sentence pair, automatically annotate those words which are translations of one another
  - Results in word-aligned sentence pairs

4 How to Build an SMT System
- Construct a function g which, given a sentence in the source language and a hypothesized translation into the target language, assigns a goodness score
  - g(die Waschmaschine läuft, the washing machine is running) = high number
  - g(die Waschmaschine läuft, the car drove) = low number

5 How to Build an SMT System
- Implement a search algorithm which, given a source language sentence, finds the target language sentence which maximizes g
- Problem: exhaustively searching this space is intractable
- Need an auxiliary function h that returns an approximate goodness score for only a part of the target sentence
- Using h, gradually build the target sentence from left to right

6 Using the SMT System
- To use our SMT system to translate a new, unseen sentence, call the search algorithm
  - It returns its determination of the best target language sentence
- To see if your SMT system works well, do this for a large number of unseen sentences and evaluate the results

7 SMT Models
- We wish to build a machine translation system which, given a Foreign sentence f, produces its English translation e
- We build a model of P(e | f), the probability of the sentence e given the sentence f
- To translate a Foreign text f, choose the English text e which maximizes P(e | f)

8 Noisy Channel: Decomposing P(e | f)
$$\arg\max_{e} P(e \mid f) = \arg\max_{e} P(f \mid e)\, P(e)$$
- P(e) is referred to as the language model
  - P(e) can be modeled using standard models (N-grams, etc.)
  - Parameters of P(e) can be estimated using large amounts of monolingual text (English)
- P(f | e) is referred to as the translation model
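As a minimal sketch of the noisy-channel decision rule, the snippet below picks, from a fixed list of candidate translations, the one that maximizes P(f | e) P(e). The function name `noisy_channel_best`, the candidate list, and all probability values are invented for illustration; a real system searches a far larger space instead of enumerating candidates.

```python
import math


def noisy_channel_best(f, candidates, translation_prob, language_prob):
    """Return the candidate e that maximizes P(f | e) * P(e), scored in log space."""
    def log_score(e):
        return math.log(translation_prob(f, e)) + math.log(language_prob(e))
    return max(candidates, key=log_score)


# Toy probabilities, invented for illustration only.
SOURCE = "die Waschmaschine läuft"
TRANSLATION = {  # stands in for P(f | e)
    (SOURCE, "the washing machine is running"): 0.2,
    (SOURCE, "the washing machine runs is"): 0.2,
    (SOURCE, "the car drove"): 0.001,
}
LANGUAGE = {  # stands in for P(e)
    "the washing machine is running": 0.01,
    "the washing machine runs is": 0.0001,
    "the car drove": 0.02,
}

best = noisy_channel_best(
    SOURCE,
    list(LANGUAGE),
    lambda f, e: TRANSLATION[(f, e)],
    lambda e: LANGUAGE[e],
)
print(best)
```

In this toy setup the translation model cannot distinguish the first two candidates, so the language model decides between them, while the fluent but unrelated third candidate is ruled out by the translation model.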

9 SMT Terminology
- Parameterized model: the form of the function g which is used to determine the goodness of a translation
g(die Waschmaschine läuft, the washing machine is running) = P(e | f)
P(the washing machine is running | die Waschmaschine läuft) =
  n(1 | die) t(the | die)
  n(2 | Waschmaschine) t(washing | Waschmaschine) t(machine | Waschmaschine)
  n(2 | läuft) t(is | läuft) t(running | läuft)
  l(the | START) l(washing | the) l(machine | washing) l(is | machine) l(running | is)

10 SMT Terminology
- Parameters: lookup tables used in the function g
P(the washing machine is running | die Waschmaschine läuft) =
  n(1 | die) t(the | die)
  n(2 | Waschmaschine) t(washing | Waschmaschine) t(machine | Waschmaschine)
  n(2 | läuft) t(is | läuft) t(running | läuft)
  l(the | START) l(washing | the) l(machine | washing) l(is | machine) l(running | is)
x 0.1 x 0.5 x 0.8 x 0.7 x 0.1 x 0.1 x 0.1 x

11 SMT Terminology
- Parameters: lookup tables used in the function g
P(the washing machine is running | die Waschmaschine läuft) =
  n(1 | die) t(the | die)
  n(2 | Waschmaschine) t(washing | Waschmaschine) t(machine | Waschmaschine)
  n(2 | läuft) t(is | läuft) t(running | läuft)
  l(the | START) l(washing | the) l(machine | washing) l(is | machine) l(running | is)
x 0.1 x 0.5 x 0.8 x 0.7 x 0.1 x 0.1 x 0.1 x
- Change "washing machine" to "car":
0.1 x 0.1 x 0.1 x n(1 | Waschmaschine) t(car | Waschmaschine) x 0.1 x 0.1 x 0.1 x also different

12 SMT Terminology
- Training: automatically building the lookup tables used in g, using parallel sentences
- One way to determine t(the | die):
  - Generate a word alignment for each sentence pair
  - Look through the word-aligned sentence pairs
  - Count the number of times "die" is translated as "the"
  - Divide by the number of times "die" is translated
  - If this is 10% of the time, we set t(the | die) = 0.1
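The counting recipe above can be sketched in a few lines. This is only an illustration under assumptions: the function name `estimate_t`, the data layout (word lists plus a set of index links), and the toy corpus are all hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def estimate_t(aligned_pairs):
    """Estimate t(target | source) by relative frequency over alignment links.

    aligned_pairs: iterable of (source_words, target_words, links), where
    links is a set of (source_index, target_index) pairs.
    """
    link_counts = defaultdict(Counter)
    for source_words, target_words, links in aligned_pairs:
        for i, j in links:
            link_counts[source_words[i]][target_words[j]] += 1
    table = {}
    for source_word, counts in link_counts.items():
        total = sum(counts.values())  # number of times the source word is translated
        table[source_word] = {tgt: c / total for tgt, c in counts.items()}
    return table


# Toy corpus, invented for illustration: "die" is linked to "the" in one of
# ten sentence pairs and to "who" in the other nine.
toy_corpus = [(["die"], ["the"], {(0, 0)})] + [(["die"], ["who"], {(0, 0)})] * 9
t = estimate_t(toy_corpus)
print(t["die"]["the"])
```

With this toy corpus "die" is translated as "the" in 10% of its links, which matches the t(the | die) = 0.1 example.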

13 Evaluation
- Evaluation metric: method for assigning a numeric score to a set of hypothesized translations
- Automatic evaluation metrics often rely on comparison with previously completed human translations
- BLEU compares the 1-, 2-, 3- and 4-gram overlap with one to four human translations
  - BLEU penalizes overly short output explicitly through its brevity penalty; overly long output is penalized through n-gram precision
  - BLEU works well for comparing two similar MT systems

14 SMT Last Words
- Translating is usually referred to as decoding (Warren Weaver)
- SMT was invented by ASR (Automatic Speech Recognition) researchers. In ASR:
  - P(e) = language model
  - P(f | e) = acoustic model
- However, SMT must deal with word reordering!

15 Word Alignments
- Recall that we build translation models from word-aligned parallel sentences
  - The statistics involved in state-of-the-art SMT translation models are simple
  - Just count translations in the word-aligned parallel sentences
- But what is a word alignment, and how do we obtain it?

16 Word alignment is annotation of minimal translational correspondences
- Annotated in the context in which they occur
- Not idealized translations!
(Figure: solid blue lines annotated by a bilingual expert)

17 Word Alignments
- Mathematically, $P(f \mid e) = \sum_{a} P(f, a \mid e)$
- An alignment a represents one way f could be generated from e
- But for the models discussed today, we approximate using only the best alignment: $P(f \mid e) \approx \max_{a} P(f, a \mid e)$

18 Automatic word alignments are typically generated using a model called IBM Model 4
- No linguistic knowledge
- No correct answers are supplied to the system: unsupervised learning
(Figure: red dashed line = automatically generated hypothesis)

19 Overview: Improving Word Alignment
- Solving problems with:
  - Measuring word alignment quality
  - Modeling word alignments
  - Knowledge-free training process

20 How to measure alignment quality?
- If we want to compare word alignment algorithms, we can generate a word alignment with each algorithm
  - Then build an SMT system from each alignment
  - Compare performance of the SMT systems using BLEU
- But this is slow: building SMT systems can take days of computation
- Question: can we have an automatic metric like BLEU, but for alignment?
- Answer: several metrics are already defined; they involve comparison with gold standard alignments

21 Problem: Existing Metrics Do Not Track Translation Quality
- Dozens of papers at ACL, NAACL, HLT, COLING, WPT03, WPT05, etc., report word alignment quality increases using various metrics
- Contradiction: few of these report translation results
- Those that do report inconclusive gains
- This is because the two commonly used metrics, Alignment Error Rate (AER) and balanced F-Measure, do not correlate with MT performance!
- We will show that these metrics have low correlation with BLEU

22 Measuring Precision and Recall
- Start by fully linking hypothesized alignments

23 Measuring Precision and Recall
- Precision is the percentage of links in the hypothesis that are correct
  - If we hypothesize there are no links, we have 100% precision
- Recall is the percentage of correct links that we hypothesized
  - If we hypothesize all possible links, we have 100% recall
- We will test metrics which formally define and combine these in different ways

24 Evaluating Alignment Error Rate
- Does the widely used Alignment Error Rate (AER) metric correlate with BLEU?
- Use our baseline unsupervised alignment system in combination with three symmetrization heuristics (union, refined, intersection)
  - One of these is usually used to build MT systems
  - The effect is that of having three very different alignment systems
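To make the two simplest symmetrization heuristics concrete, here is a small sketch. It assumes the alignment system has been run once in each translation direction and that both outputs are expressed as sets of (source_index, target_index) links; the function name `symmetrize` and the toy link sets are hypothetical. The refined heuristic, which lies between the two, is not shown.

```python
def symmetrize(src_to_tgt, tgt_to_src, heuristic):
    """Combine two directional alignments given as sets of (source_index, target_index) links."""
    if heuristic == "union":
        return src_to_tgt | tgt_to_src  # keep a link if either direction proposes it
    if heuristic == "intersection":
        return src_to_tgt & tgt_to_src  # keep a link only if both directions agree
    raise ValueError(f"unsupported heuristic: {heuristic}")


# Toy directional alignments, invented for illustration.
forward = {(0, 0), (1, 1), (1, 2)}
backward = {(0, 0), (1, 1), (2, 3)}

union = symmetrize(forward, backward, "union")
intersection = symmetrize(forward, backward, "intersection")
print(sorted(union))
print(sorted(intersection))
```

Union yields the densest alignment and intersection the sparsest, which is why combining one aligner with different heuristics behaves like having very different alignment systems.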

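25 Alignment Error Rate (AER)
(Figure: gold and hypothesis alignment grids over f1 ... f5 and e1 ... e4; in the hypothesis, (e3,f4) is wrong and the gold link (e2,f3) is missing)
A is the set of hypothesized links, S the sure gold links, and P the possible gold links.
$$\mathrm{Precision}(A, P) = \frac{|P \cap A|}{|A|} = \frac{3}{4}$$
$$\mathrm{Recall}(A, S) = \frac{|S \cap A|}{|S|} = \frac{2}{3}$$
$$\mathrm{AER}(A, P, S) = 1 - \frac{|P \cap A| + |S \cap A|}{|S| + |A|} = \frac{2}{7}$$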
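The slide's numbers can be reproduced with plain set operations. In the sketch below, A is the hypothesized link set, S the sure gold links, and P the possible gold links (a superset of S). The concrete link sets are invented: only their sizes, the wrong link (e3,f4), and the missing link (e2,f3) are taken from the slide.

```python
from fractions import Fraction


def precision(A, P):
    """Fraction of hypothesized links that are at least possible gold links."""
    return Fraction(len(A & P), len(A))


def recall(A, S):
    """Fraction of sure gold links that were hypothesized."""
    return Fraction(len(A & S), len(S))


def aer(A, P, S):
    """Alignment Error Rate: 1 - (|P ∩ A| + |S ∩ A|) / (|S| + |A|)."""
    return 1 - Fraction(len(A & P) + len(A & S), len(A) + len(S))


# Hypothetical link sets chosen to match the counts on the slide.
S = {("e1", "f1"), ("e2", "f3"), ("e4", "f5")}  # sure gold links
P = S | {("e3", "f2")}                          # possible gold links
A = {("e1", "f1"), ("e4", "f5"), ("e3", "f2"), ("e3", "f4")}  # hypothesis

print(precision(A, P), recall(A, S), aer(A, P, S))
```

Exact fractions are used so that the results can be compared directly with the 3/4, 2/3, and 2/7 shown on the slide.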

26 Experiment
- Desideratum:
  - Keep everything constant in a set of SMT systems except the word-level alignments
  - Alignments should be realistic
- Experiment:
  - Take a parallel corpus of 8M words of Foreign-English. Word-align it. Build an SMT system. Report AER and BLEU.
  - For better alignments: train on 16M, 32M, 64M words (but use only the 8M words for MT building)
  - For worse alignments: train on 1/2, 1/4, 1/8 of the 8M word training corpus
- If AER is a good indicator of MT performance, 1 - AER and BLEU should correlate no matter how the alignments are built (union, intersection, refined)
  - Low 1 - AER scores should correspond to low BLEU scores
  - High 1 - AER scores should correspond to high BLEU scores

27 AER is not a good indicator of MT performance

28 AER is wrongly derived from F-Measure (can be shown analytically)
- For details see the squib in Comp. Ling. (Sept 2007)
- Important: AER incorrectly favors sparse alignments (many unlinked words)

29 Fα-score
- We will try a different evaluation metric called the Fα-score
- The alpha refers to a parameter tuned to favor either precision or recall

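30 Fα-score
(Figure: gold and hypothesis alignment grids over f1 ... f5 and e1 ... e4; in the hypothesis, (e3,f4) is wrong and the gold links (e2,f3) and (e3,f5) are missing)
$$\mathrm{Precision}(A, S) = \frac{|S \cap A|}{|A|} = \frac{3}{4}$$
$$\mathrm{Recall}(A, S) = \frac{|S \cap A|}{|S|} = \frac{3}{5}$$
$$F(A, S, \alpha) = \frac{1}{\dfrac{\alpha}{\mathrm{Precision}(A, S)} + \dfrac{1 - \alpha}{\mathrm{Recall}(A, S)}}$$
- Called Fα-score to differentiate from the ambiguous term F-Measure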
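A short sketch of the Fα-score, written as the weighted harmonic mean of precision and recall against a single gold link set S. The link sets below are invented; only the counts (four hypothesized links of which one is wrong, and two gold links missing from the hypothesis) follow the slide. Returning 0 when there is no overlap is an assumption made here to avoid division by zero.

```python
from fractions import Fraction


def f_alpha(A, S, alpha):
    """F-alpha score: 1 / (alpha / precision + (1 - alpha) / recall)."""
    correct = len(A & S)
    if correct == 0:
        return Fraction(0)  # assumption: score 0 when no hypothesized link is correct
    p = Fraction(correct, len(A))
    r = Fraction(correct, len(S))
    return 1 / (alpha / p + (1 - alpha) / r)


# Hypothetical link sets chosen to match the counts on the slide.
S = {("e1", "f1"), ("e2", "f2"), ("e2", "f3"), ("e3", "f5"), ("e4", "f4")}
A = {("e1", "f1"), ("e2", "f2"), ("e4", "f4"), ("e3", "f4")}

balanced = f_alpha(A, S, Fraction(1, 2))      # ordinary balanced F-Measure
recall_heavy = f_alpha(A, S, Fraction(2, 5))  # alpha = 0.4
print(balanced, recall_heavy)
```

Lowering α below 0.5 puts more weight on recall, so for this hypothesis (precision 3/4, recall 3/5) the score drops slightly compared with the balanced setting.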

31 Fα-score is a good indicator of MT performance (α = 0.4)

32 We have a way to rapidly measure alignment quality for SMT
- We will now look at alignment modeling

33 Problem: Existing Models Have the Wrong Structure
- Existing generative models make false assumptions about alignment structure
- Proposed discriminative models either:
  - Depend on generative models for their best results
  - Or make false assumptions about structure themselves

34 1-to-N Assumption
- Multi-word cepts (words in one language translated as a unit) are only allowed on the target side; the source side is limited to single-word cepts
- Forced to create M-to-N alignments using heuristics, e.g. union

35 LEAF Generative Story (all)

36 LEAF Generative Story (0)

37 LEAF Generative Story (1)
- Explicitly model three word types:
  - Head word: provides most of the conditioning for translation
    - Robust representation of multi-word cepts (for this task)
    - This is to semantics as the "syntactic head word" is to syntax
  - Non-head word: attached to a head word
  - Deleted source words and spurious target words

38 LEAF Generative Story (2)
- Stochastically attach the non-head words to a head word (using distance and the non-head word class)

39 LEAF Generative Story (3)
- Generate exactly one target head word from each source word

40 LEAF Generative Story (4)
- Decide how big the target cepts will be (using the source head and whether the source cept is only one word)

41 LEAF Generative Story (5)
- Decide the number of spurious words (using the number of non-spurious words)

42 LEAF Generative Story (6)
- Generate the spurious words

43 LEAF Generative Story (7)
- Generate the target non-head words in each cept, conditioned on the source head word and the target head word class

44 LEAF Generative Story (8)
- For each cept, place the target head word and then the non-head words (relative distortion model)

45 LEAF Generative Story (9)
- Place the spurious words

46 LEAF
- Can score the same structure in both directions
- Math in one direction (please do not try to read):

47 Comparing LEAF with Model 4
- Model 4 does not allow source cepts to be more than one word
  - This requires us to use heuristics to account for multiple-word constructions
- LEAF allows multiple-word source cepts
- LEAF is able to use the head-word relationship to better model both the source cept and the target cept

48 Unsupervised Training with EM
- Expectation Maximization (EM)
  - Unsupervised learning
  - Maximize the likelihood of the training data
  - Likelihood is (informally) the probability the model assigns to the training data
- E-Step: predict according to current parameters
- M-Step: reestimate parameters from predictions
- Amazing but true: if we iterate E and M steps, we increase likelihood!

49 The EM Algorithm
(Figure: flowchart with the nodes Bootstrap, Initial parameters, E-Step, Viterbi alignments, M-Step, Refined parameters, Translation Model)

50 Want to learn more about EM?
- See K. Knight 1999 word alignment tutorial
- Available from

51 M-Step
- M-Step: reestimate parameters
  - Count events in the Viterbi alignments
  - Simple smoothing: add a small fractional constant
  - Normalize to sum to 1
- Bootstrap (initial M-step)
- See EMNLP 2007 paper for details
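The count, smooth, and normalize recipe can be sketched for a single distribution as follows. This is an illustration only: the function name `m_step`, the value 0.01 for the small fractional constant, and the toy events are assumptions, and a real model re-estimates many conditional distributions this way rather than one.

```python
from collections import Counter


def m_step(observed_events, vocabulary, constant=0.01):
    """Re-estimate one distribution from events counted in the Viterbi alignments.

    Every event in the vocabulary gets its count plus a small fractional
    constant, and the result is normalized so the probabilities sum to 1.
    Observed events outside the vocabulary are ignored.
    """
    counts = Counter(observed_events)
    smoothed = {event: counts[event] + constant for event in vocabulary}
    total = sum(smoothed.values())
    return {event: value / total for event, value in smoothed.items()}


# Toy events, invented for illustration: translations of one source word as
# read off the current Viterbi alignments.
probs = m_step(["the", "the", "the", "who"], ["the", "who", "this"])
print(probs)
```

The smoothing constant keeps unseen events such as "this" from receiving probability zero, so they remain reachable in later iterations.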

52 E-Step
- E-Step: search for Viterbi alignments
- Solved using local hillclimbing search
  - Given a starting alignment, we can permute the alignment by making small changes such as swapping the incoming links for two words
- Algorithm:
    Begin:
      Given starting alignment A, make a list of possible small changes (e.g. list every possible swap of the incoming links for two words)
      best = A
      for each possible small change
        Create new alignment A2 by copying A and applying the small change
        If score(A2) > score(best) then best = A2
      end for
      If best is A, stop; otherwise choose the best alignment as the new starting point and goto Begin
- See ACL 2006 paper for improved local hillclimbing search
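The hillclimbing loop on this slide can be sketched as below. The representation is an assumption made for the example: an alignment is a tuple whose j-th entry is the source position linked to target word j, the only move is swapping the incoming links of two target words, and `toy_score` is a made-up stand-in for the model score. Stopping when no move improves the score is implied by hillclimbing rather than stated on the slide.

```python
from itertools import combinations


def swap_neighbors(alignment):
    """Yield every alignment obtained by swapping the incoming links of two target words."""
    for j, k in combinations(range(len(alignment)), 2):
        if alignment[j] != alignment[k]:
            changed = list(alignment)
            changed[j], changed[k] = changed[k], changed[j]
            yield tuple(changed)


def hillclimb(start, score, neighbors):
    """Move to the best-scoring neighbor until no small change improves the score."""
    current, current_score = start, score(start)
    while True:
        best, best_score = current, current_score
        for candidate in neighbors(current):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
        if best_score <= current_score:  # no neighbor is better: local maximum
            return current
        current, current_score = best, best_score


def toy_score(alignment):
    """Hypothetical score that simply prefers monotone (diagonal) links."""
    return sum(1 for j, i in enumerate(alignment) if i == j)


result = hillclimb((2, 0, 1), toy_score, swap_neighbors)
print(result)
```

The search only guarantees a local maximum of the score, which is why the E-step is described as approximate.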

53 Discussion
- LEAF has powerful features
  - But requires approximate search
- Correct structure: M-to-N discontiguous
  - First general purpose statistical word alignment model of this structure!
- Head word assumption allows use of multi-word cepts
  - Gives power of phrase-based models, but decisions robustly decompose over words

54 The story so far
- We know that better alignments (as measured using the Fα-score) lead to better MT
- We have defined LEAF, a generative model which models M-to-N discontiguous alignments
- LEAF can be trained using approximate EM
- What about integrating new knowledge?
  - Light supervision (the correct alignments for a few sentence pairs)
  - Linguistic knowledge?

55 Existing Approaches Cannot Utilize New Knowledge
- Existing unsupervised alignment techniques cannot use manually annotated data
  - Could be useful for light supervision
- It is difficult to add new knowledge sources to generative models
  - Requires completely reengineering the generative story for each new knowledge source

56 Semi-Supervised Training Overview
- First decompose the steps of the LEAF generative story into sub-models of a (log-)linear model
  - Allows us to tune a vector λ which has a scalar for each sub-model controlling its contribution
  - The idea here is that we might trust, for instance, the translation distribution (one sub-model) more than the number-of-words-in-a-cept distribution (another sub-model)
  - This allows us to integrate new sub-models unrelated to LEAF and adjust their weights with respect to other sub-models
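A minimal sketch of the log-linear combination: each sub-model contributes the log of its probability, scaled by its entry in λ. The function name `loglinear_score` and the two toy probabilities are assumptions for illustration; the actual sub-models are the steps of the generative story plus any added knowledge sources.

```python
import math


def loglinear_score(submodel_probs, lambdas):
    """Weighted sum of sub-model log-probabilities: sum_m lambda_m * log p_m."""
    if len(submodel_probs) != len(lambdas):
        raise ValueError("need exactly one weight per sub-model")
    return sum(lam * math.log(p) for lam, p in zip(lambdas, submodel_probs))


# Toy sub-model probabilities for one alignment, invented for illustration:
# a translation sub-model and a cept-size sub-model.
probs = [0.5, 0.2]

uniform = loglinear_score(probs, [1.0, 1.0])  # equals the log of the plain product
tuned = loglinear_score(probs, [2.0, 0.5])    # trusts the first sub-model more
print(uniform, tuned)
```

With every weight set to 1 the score is just the log of the product of the sub-model probabilities, so the generative model is recovered as a special case.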

57 Semi-Supervised Training Overview
- Define a semi-supervised algorithm which alternates increasing likelihood with decreasing error
  - Increasing likelihood is similar to EM
  - Discriminatively bias EM to converge to a local maximum of likelihood which corresponds to better alignments
  - Better = higher Fα-score on a small gold standard corpus

58 The EMD Algorithm
- Initialize:
  - Perform initial M-step: estimate sub-model parameters from the HMM Viterbi alignments (bootstrap)
  - Perform initial D-step: find λ values which maximize Fα-score on the small gold standard word-aligned development corpus
- Repeat:
  - E-step: find Viterbi alignments using sub-models weighted by λ
  - M-step: re-estimate sub-model parameters from the new Viterbi alignments
  - D-step: find λ values that maximize Fα-score on the small gold standard word-aligned development corpus
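The control flow of the algorithm above can be sketched as a loop over three caller-supplied steps. Everything concrete here is hypothetical: the function name `emd`, the signatures of `e_step`, `m_step`, and `d_step`, and the fixed iteration count (the slide only says "Repeat"). The stub steps exist solely to show the order in which the steps run.

```python
def emd(bootstrap_alignments, e_step, m_step, d_step, iterations):
    """Alternate E, M, and D steps after an initial M-step and D-step."""
    params = m_step(bootstrap_alignments)  # initial M-step (bootstrap)
    lambdas = d_step(params)               # initial D-step
    for _ in range(iterations):
        alignments = e_step(params, lambdas)  # Viterbi alignments under weighted sub-models
        params = m_step(alignments)           # re-estimate sub-model parameters
        lambdas = d_step(params)              # re-tune weights on the gold development corpus
    return params, lambdas


# Stub steps that only record the order in which they are called.
trace = []


def stub_e(params, lambdas):
    trace.append("E")
    return "alignments"


def stub_m(alignments):
    trace.append("M")
    return "params"


def stub_d(params):
    trace.append("D")
    return "lambdas"


emd("HMM Viterbi alignments", stub_e, stub_m, stub_d, iterations=2)
print(trace)
```

In a real implementation `d_step` would close over the small gold standard corpus and search for the λ values with the highest Fα-score there.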

59 The EMD Algorithm
(Figure: flowchart with the nodes Bootstrap, Initial sub-model parameters, Tuned lambda vector, E-Step, Viterbi alignments, M-Step, Sub-model parameters, D-Step, Translation Model)

60 Previous Work: Semi-Supervised
- Usual formulation of semi-supervised learning: using unlabeled data to help supervised learning
  - Build a supervised system using labeled data
  - Predict on unlabeled data
  - Iterate (estimating from both labeled data and predictions on unlabeled data)
- We do not have enough gold standard word alignments to estimate parameters directly!
- EMD allows us to train a small number of important parameters discriminatively, the rest using likelihood maximization, and allows interaction

61 Story so far
- We've now presented a new metric, a new model, and a new semi-supervised training algorithm
- We've reformulated LEAF as a log-linear model and added additional sub-models
- We will train this model using the semi-supervised EMD training algorithm to maximize Fα-score
- How well does this work?

62 Experiments
- French/English
  - LDC Hansard (67 M English words)
  - 110 gold standard aligned sentences
  - MT: Alignment Templates, phrase-based
- Arabic/English
  - NIST 2006 task (168 M English words)
  - 1000 gold standard aligned sentences
  - MT: Hiero, hierarchical phrases

63 Results
- Columns: F-Measure (α = 0.4) and BLEU (1 ref) for French/English; F-Measure (α = 0.1) and BLEU (4 refs) for Arabic/English
- Systems: IBM Model (GIZA++) and heuristics; EMD (ACL model) and heuristics; LEAF+EMD

64 Contributions
- Found a metric for measuring alignment quality which correlates with MT quality
- Designed LEAF, the first generative model of M-to-N discontiguous alignments
- Developed a semi-supervised training algorithm, the EMD algorithm
- Obtained large gains of 1.2 BLEU and 2.8 BLEU points for French/English and Arabic/English tasks

65 Much of the presented work was joint work with Daniel Marcu, ISI (Univ. Southern California)
Thank You! Dankeschön!


Bootstrapping and Evaluating Named Entity Recognition in the Biomedical Domain Bootstrapping and Evaluating Named Entity Recognition in the Biomedical Domain Andreas Vlachos Computer Laboratory University of Cambridge Cambridge, CB3 0FD, UK av308@cl.cam.ac.uk Caroline Gasperin Computer

More information

Annotation Projection for Discourse Connectives

Annotation Projection for Discourse Connectives SFB 833 / Univ. Tübingen Penn Discourse Treebank Workshop Annotation projection Basic idea: Given a bitext E/F and annotation for F, how would the annotation look for E? Examples: Word Sense Disambiguation

More information

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se

More information

Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models

Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models Jung-Tae Lee and Sang-Bum Kim and Young-In Song and Hae-Chang Rim Dept. of Computer &

More information

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each

More information
