The NTT Statistical Machine Translation System for IWSLT2005


Hajime Tsukada, Taro Watanabe, Jun Suzuki, Hideto Kazawa, and Hideki Isozaki
NTT Communication Science Labs.

Purpose
A large number of reportedly effective features are evaluated in our system.
Additional monolingual and bilingual resources are also evaluated:
  Monolingual resources for modeling the generated (output) language
  Bilingual resources for translation modeling

SMT based on Log-linear Models [Och, 2002][Och, 2003]
$$\hat{e} = \arg\max_{e} \sum_{m} \lambda_m h_m(e, f)$$
$h_m(e, f)$: feature functions
$\lambda_m$: scaling factors, estimated under the minimum error rate criterion in our system
It is easy to combine various features for translation modeling, language modeling, and lexical reordering modeling.
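As a rough illustration of the log-linear decision rule, the sketch below scores a handful of candidate translations as a weighted sum of feature values and returns the best one. It is not the NTT decoder: the feature names, weights, and candidate list are hypothetical, and a real system searches a far larger hypothesis space.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values: sum over m of lambda_m * h_m(e, f)."""
    return sum(weights[name] * value for name, value in features.items())


def best_translation(candidates, weights):
    """Return the candidate translation with the highest log-linear score.

    candidates: list of (translation, feature_dict) pairs.
    """
    return max(candidates, key=lambda c: loglinear_score(c[1], weights))[0]


# Hypothetical scaling factors and feature values (log-probabilities, word count).
weights = {"lm": 1.0, "tm": 0.5, "word_count": 0.1}
candidates = [
    ("i would like to reserve", {"lm": -4.0, "tm": -3.0, "word_count": 5}),
    ("i like reserve", {"lm": -7.0, "tm": -2.0, "word_count": 3}),
]

print(best_translation(candidates, weights))
```

Adding a new feature only requires a new key in each feature dictionary and a matching weight, which is the property the slide highlights.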

Language Model Features
Features:
  6-gram
  Class-based 9-gram
  Prefix-4 9-gram
  Suffix-4 9-gram
Training conditions:
  Mixed casing
Prefix-4 (suffix-4) keeps only the 4-letter prefix (suffix) of each word [Och, 2005].
Examples of prefix-4:
  I'd like to reserve -> I'd like to rese+
  I'd like to make a reservation -> I'd like to make a rese+
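The prefix-4 transformation can be sketched as below. The rule that words of four letters or fewer are left unchanged is inferred from the slide's examples; the position of the `+` marker in the suffix-4 form is an assumption, since the slides show prefix-4 examples only.

```python
def prefix4(sentence, n=4, marker="+"):
    """Truncate every word longer than n letters to its first n letters plus a marker."""
    return " ".join(w[:n] + marker if len(w) > n else w for w in sentence.split())


def suffix4(sentence, n=4, marker="+"):
    """Keep only the last n letters of long words (marker placement is an assumption)."""
    return " ".join(marker + w[-n:] if len(w) > n else w for w in sentence.split())


print(prefix4("I'd like to reserve"))
print(prefix4("I'd like to make a reservation"))
```

Both example sentences end in the same token after the transformation, which is how the coarser vocabulary lets a long n-gram model share statistics across inflected forms.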

Phrase-based Features
Phrase translation probabilities in both translation directions
Dice coefficient, Dice(f, e), of the phrase pair (f, e)
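A minimal sketch of how such phrase-pair features are commonly estimated from extracted phrase pairs follows. It assumes the standard relative-frequency estimates and the standard Dice coefficient, 2·count(f, e) / (count(f) + count(e)); the exact estimators used in the NTT system are not recoverable from the slide, and the toy phrase pairs are hypothetical.

```python
from collections import Counter


def phrase_features(phrase_pairs):
    """Relative-frequency translation probabilities and Dice score per phrase pair.

    phrase_pairs: list of (f, e) tuples, one entry per extracted occurrence.
    """
    pair_count = Counter(phrase_pairs)
    f_count = Counter(f for f, _ in phrase_pairs)
    e_count = Counter(e for _, e in phrase_pairs)
    feats = {}
    for (f, e), c in pair_count.items():
        feats[(f, e)] = {
            "p_f_given_e": c / e_count[e],
            "p_e_given_f": c / f_count[f],
            "dice": 2 * c / (f_count[f] + e_count[e]),
        }
    return feats


# Hypothetical extracted phrase pairs (romanized Japanese, English).
pairs = [
    ("yoyaku", "reserve"),
    ("yoyaku", "reserve"),
    ("yoyaku", "reservation"),
    ("heya", "room"),
]
feats = phrase_features(pairs)
print(feats[("yoyaku", "reserve")])
```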

Phrase-based Features (cont'd)
Phrase extraction probability of source/target
Phrase pair extraction probability

Phrase-based Features (cont'd)
Adjusted Dice coefficient

Word-level Features
Lexical weights in both translation directions
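For reference, one widely used definition of the lexical weight averages the word translation probabilities over the alignment links of each source word and multiplies the averages together. The sketch below follows that definition as an assumption; the slide's own formula is not recoverable, and the table `w` and the example words are hypothetical.

```python
def lexical_weight(f_words, e_words, alignment, w):
    """Lexical weight of f_words given e_words under a word alignment.

    alignment: set of (j, i) pairs linking f_words[j] to e_words[i].
    w: dict mapping (f_word, e_word) to a lexical translation probability;
       the key (f_word, None) holds the probability of aligning to NULL.
    """
    score = 1.0
    for j, f in enumerate(f_words):
        linked = [i for (jj, i) in alignment if jj == j]
        if linked:
            # Average over all target words linked to this source word.
            score *= sum(w.get((f, e_words[i]), 0.0) for i in linked) / len(linked)
        else:
            # Unaligned source words are scored against NULL.
            score *= w.get((f, None), 0.0)
    return score


w = {("das", "the"): 0.5, ("haus", "house"): 0.8}
print(lexical_weight(["das", "haus"], ["the", "house"], {(0, 0), (1, 1)}, w))
```

The reverse direction is obtained by swapping the roles of the two phrases and using the reverse lexical table.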

Word-level Features (cont'd)
IBM Model 1 scores
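A sketch of an IBM Model 1 score for a phrase pair, assuming the textbook form in which each source word sums its translation probability over all target words plus NULL. The table `t`, the floor value for unseen pairs, and `epsilon` are hypothetical illustration choices, not values from the NTT system.

```python
import math


def ibm1_log_score(f_words, e_words, t, epsilon=1.0, floor=1e-10):
    """Log of the IBM Model 1 probability of f_words given e_words.

    t: dict mapping (f_word, e_word) to a translation probability,
       with e_word None standing for the NULL word.
    """
    e_with_null = [None] + list(e_words)
    # Uniform alignment prior: epsilon / (l + 1)^m.
    log_p = math.log(epsilon) - len(f_words) * math.log(len(e_with_null))
    for f in f_words:
        log_p += math.log(sum(t.get((f, e), floor) for e in e_with_null))
    return log_p


t = {("a", "x"): 0.5, ("a", None): 0.5, ("b", "x"): 0.25, ("b", None): 0.25}
print(math.exp(ibm1_log_score(["a", "b"], ["x"], t)))
```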

Word-level Features (cont'd)
Viterbi IBM Model 1 scores in both translation directions
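The Viterbi variant is usually obtained by replacing the sum over target words with a maximum, so that each source word is scored by its single best alignment. The sketch below assumes that formulation; the probability table is hypothetical.

```python
def viterbi_ibm1_score(f_words, e_words, t, epsilon=1.0):
    """IBM Model 1 score under the single best alignment for each source word.

    t: dict mapping (f_word, e_word) to a translation probability,
       with e_word None standing for the NULL word.
    """
    e_with_null = [None] + list(e_words)
    p = epsilon / len(e_with_null) ** len(f_words)
    for f in f_words:
        # Best single link instead of the sum over all links.
        p *= max(t.get((f, e), 0.0) for e in e_with_null)
    return p


t_viterbi = {("a", "x"): 0.6, ("a", "y"): 0.1, ("b", "y"): 0.5, ("b", None): 0.2}
print(viterbi_ibm1_score(["a", "b"], ["x", "y"], t_viterbi))
```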

Word-level Features (cont'd)
Noisy OR gates in both translation directions
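One common noisy-OR formulation treats a source word as explained if at least one target word generates it, giving 1 minus the product of (1 - t(f | e)) over the target words. The sketch below assumes that form, which may differ in detail from the slide's lost formula; the probabilities are hypothetical.

```python
def noisy_or_score(f_words, e_words, t):
    """Product over source words of the noisy-OR probability of being generated.

    t: dict mapping (f_word, e_word) to a translation probability.
    """
    score = 1.0
    for f in f_words:
        p_not_generated = 1.0
        for e in e_words:
            p_not_generated *= 1.0 - t.get((f, e), 0.0)
        score *= 1.0 - p_not_generated
    return score


t_or = {("a", "x"): 0.5, ("a", "y"): 0.5}
print(noisy_or_score(["a"], ["x", "y"], t_or))
```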

Word-level Features (cont'd)
Deletion penalty

Lexical Reordering Features
Distortion model $d(a_i - b_{i-1})$, where $a_i$ denotes the starting position of the foreign phrase translated into the i-th English phrase and $b_{i-1}$ denotes the end position of the foreign phrase translated into the (i-1)-th English phrase.
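A distance-based distortion cost of this kind can be sketched as follows. The sketch assumes the common choice of summing |start_i - end_(i-1) - 1| over the phrases in target order, so that a monotone translation costs zero; the slide does not state the exact functional form, and the spans in the example are hypothetical.

```python
def distortion_cost(spans):
    """Sum of |start_i - end_(i-1) - 1| over source spans taken in target order.

    spans: list of (start, end) source positions, 1-based and inclusive,
           ordered by the target phrase each span is translated into.
    """
    cost = 0
    prev_end = 0  # position before the first source word
    for start, end in spans:
        cost += abs(start - prev_end - 1)
        prev_end = end
    return cost


print(distortion_cost([(1, 2), (3, 3), (4, 6)]))  # monotone order
print(distortion_cost([(3, 4), (1, 2), (5, 5)]))  # first two phrases swapped
```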

Lexical Reordering Features (cont'd)
Right and left monotone models, which use the number of right (left) connected phrases that are monotone.

Other Features
Number of words that constitute a translation
Number of phrases that constitute a translation

Decoder
Beam search + A* search
Constraints for reordering:
  Window size constraint, restricting the number of source words that may be skipped
  ITG constraint

Experimental Purpose
To validate the use of the reportedly effective features
  All features introduced previously are used.
To evaluate additional language resources
  Comparable experiments are conducted on both the supplied and unrestricted data tracks.
Target language is English:
  Japanese-to-English
  Chinese-to-English
  Korean-to-English
  Arabic-to-English

Experimental Conditions
Mixed casing and prefix-4 form for word alignment
Mixed casing for language models
Language models are trained with the SRI toolkit

Monolingual Corpora for Unrestricted Data Track
ATR: ATR spoken language database
WEB: Web pages on traveling

Bilingual Corpora for Unrestricted Data Track
ATR: ATR spoken language database
LDC: LDC2004T08 and LDC2005T10

Other Setups
NIST score is used for estimating the feature function scaling factors
ITG constraints for J-to-E and K-to-E
Window size constraints up to 7 for A-to-E and C-to-E
On-the-fly estimation of language models:
  1. When counting n-grams, the vocabulary is limited to words observed in the supplied corpus and the ATR database.
  2. N-gram models for decoding are derived from the vocabulary generated from the extracted phrase pairs and the test set.

Evaluation of Additional Monolingual Corpora
-- Output Language Perplexity of N-grams for Decoding --
The perplexities of n-grams trained on additional resources are small enough.
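For readers unfamiliar with the measure, perplexity is the exponential of the negative average log-probability that a language model assigns to each token of a test text. The sketch below shows only that arithmetic; it assumes natural-log token probabilities are already available and does not reproduce any toolkit's conventions for sentence boundaries or unknown words.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))


# A model that assigns probability 0.25 to every token has perplexity 4.
print(perplexity([math.log(0.25)] * 4))
```

Lower perplexity on in-domain test text indicates that a model trained on additional data still fits the domain well.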

Evaluation of Additional Bilingual Corpora
-- Input Language Perplexity of Supplied-data Trigram --
Corpora compared: ATR, LDC, IWSLT

Results
Supplied < Unrestricted
Additional monolingual resources are helpful.

Conclusions
Competitive accuracy is obtained.
The log-linear model effectively utilized n-grams trained on out-of-domain corpora and improved the translation accuracy on the supplied data.
Future work:
  Feature extraction
  Why is our system extremely inferior in terms of BLEU scores?