IIT Bombay's English-Indonesian submission at WAT: Integrating neural language models with SMT


Slide 1
12th Dec., 2016. The 3rd Workshop on Asian Language Translation (WAT2016), Japan, collocated with COLING 2016

IIT Bombay's English-Indonesian submission at WAT: Integrating neural language models with SMT
Sandhya Singh, Anoop Kunchukuttan, Pushpak Bhattacharyya
{sandhya, anoopk, pb}@cse.iitb.ac.in
Center for Indian Language Technology, IIT Bombay

Slide 2: Motivation
- At CFILT, the English-Indonesian language pair is being explored as part of a project.
- It is a relatively new language pair among Asian-language translation tasks.

Slide 3: About the English-Indonesian language pair
- Both English and Indonesian use the Latin script.
- Both follow SVO (Subject Verb Object) sentence structure.
- There is little structural divergence between English and Indonesian.
- Indonesian is highly agglutinative and morphologically rich compared with English.
- Indonesian is considered a resource-poor language.

Slide 4: Experiment Description (1/4)
Four different systems were trained for both directions of the language pair.

1. Phrase-based SMT system (Moses baseline)
- MGIZA++ for word alignment
- grow-diag-final-and heuristic
- Lexicalized reordering
- Batch MIRA tuning
- 5-gram LM with Kneser-Ney smoothing, built with SRILM

Data statistics (number of sentences)
Language     Training set   Tuning set   Test set   For LM
English      44939          400          400        50000
Indonesian   44939          400          400        50000

Slide 5: Experiment Description (2/4)
2. System using a neural language model as a feature for translation (nplm)
- Neural language model with default NPLM settings (Vaswani et al., 2013)
- Word embedding sizes of 700, 750 and 800, each trained for 5 epochs
- One hidden layer
- Integrated as a feature in the PBSMT system

Data statistics (number of sentences)
Language     Training set   Tuning set   Test set   For LM
English      44939          400          400        50000 + 2M (Europarl)
Indonesian   44939          400          400        50000 + 2M (CommonCrawl)
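To make the idea of a feed-forward neural language model used as a translation feature concrete, here is a minimal pure-Python sketch. It is not the NPLM toolkit and does not use its API: the function names, the tiny dimensions, the random weights and the ReLU activation are all illustrative assumptions. It only shows how a 5-gram model with one hidden layer turns four context words into a log-probability that a decoder could consume as a feature score.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Random weight matrix (illustrative initialisation, not trained)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def init_model(vocab_size=20, emb_dim=8, hidden_dim=16, order=5, seed=0):
    """Hypothetical toy model: input embeddings, one hidden layer, output layer."""
    rng = random.Random(seed)
    return {
        "emb": make_matrix(vocab_size, emb_dim, rng),
        "hidden": make_matrix(hidden_dim, (order - 1) * emb_dim, rng),
        "out": make_matrix(vocab_size, hidden_dim, rng),
    }

def log_probs(model, context_ids):
    """Log P(w | context) for every word w in the vocabulary."""
    # Concatenate the embeddings of the (order - 1) context words.
    x = [value for i in context_ids for value in model["emb"][i]]
    # Single hidden layer; ReLU is an assumption made for this sketch.
    h = [max(0.0, a) for a in matvec(model["hidden"], x)]
    logits = matvec(model["out"], h)
    # Numerically stable log-softmax.
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return [l - log_z for l in logits]

model = init_model()
scores = log_probs(model, [1, 4, 7, 2])   # four context word ids (5-gram)
feature_value = scores[5]                 # log-probability of word id 5
print(feature_value)
```

In a phrase-based decoder, a value like `feature_value` would be summed over the words of a hypothesis and weighted alongside the other features during tuning.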

Slide 6: Experiment Description (3/4)
3. System using a bilingual neural language model as a feature for translation (nnjm)
- Neural network joint LM trained on parallel data (Devlin et al., 2014)
- 5-gram LM with 9 source context words
- One hidden layer
- Integrated as a feature in the PBSMT system

Data statistics (number of sentences)
Language     Training set   Tuning set   Test set   For LM
English      44939          400          400        50000
Indonesian   44939          400          400        50000
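The joint model's input can be sketched as follows, assuming that "5-gram LM with 9 source context words" means four target history words plus a 9-word source window centred on the source word aligned to the word being predicted. The function name, the padding symbols and the `affiliation` list (one aligned source index per target word) are hypothetical. The heuristics Devlin et al. use for unaligned target words are left out.

```python
def nnjm_context(source, target, affiliation, i, src_window=9, tgt_order=5):
    """Build the NNJM input for predicting target[i].

    affiliation[i] is the index of the source word aligned to target[i]
    (assumed to be given; alignment heuristics are not modelled here).
    """
    half = src_window // 2
    center = affiliation[i]
    src_ctx = []
    for j in range(center - half, center + half + 1):
        if j < 0:
            src_ctx.append("<s>")        # pad before the sentence start
        elif j >= len(source):
            src_ctx.append("</s>")       # pad after the sentence end
        else:
            src_ctx.append(source[j])
    history = tgt_order - 1
    padded = ["<s>"] * history + list(target)
    tgt_ctx = padded[i:i + history]      # the previous (tgt_order - 1) target words
    return src_ctx, tgt_ctx, target[i]

# Placeholder tokens stand in for real words.
source = ["s0", "s1", "s2", "s3", "s4", "s5"]
target = ["t0", "t1", "t2"]
affiliation = [0, 2, 5]

src_ctx, tgt_ctx, word = nnjm_context(source, target, affiliation, 1)
print(src_ctx, tgt_ctx, word)
```

The network then scores `word` given the 13 context tokens, which is what lets it see source-side information that a monolingual LM cannot.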

Slide 7: Experiment Description (4/4)
4. System using the Operation Sequence Model for translation (osm)
- Integrates 5-gram-based reordering and translation in a single generative process (Durrani et al., 2013)
- Models words together with their source and target context

Data statistics (number of sentences)
Language     Training set   Tuning set   Test set   For LM
English      44939          400          400        50000
Indonesian   44939          400          400        50000

Slide 8: Evaluation Process
1. Automatic evaluation metrics
- BLEU points
- RIBES scores
- AMFM scores
2. Pairwise crowdsourcing evaluation
- Against the shared task baseline
3. JPO adequacy evaluation
- For content transmission

Slide 9: English-Indonesian MT system

Slide 10: Automatic Evaluation of the English-Indonesian MT system

Approach used               BLEU    RIBES      AMFM
Phrase-based SMT            21.74   0.804986   0.55095
Operation Sequence Model    21.70   0.806182   0.552480
Neural LM with OE = 700     22.12   0.804933   0.5528
Neural LM with OE = 750     21.64   0.806033   0.555
Neural LM with OE = 800     22.08   0.806697   0.55188
Joint neural LM*            22.35   0.808943   0.55597

NNJM increases the BLEU score by 0.61 points over the PBSMT system.
* WAT submission. OE: output embedding.

Slide 11: Pairwise Crowdsourcing Analysis of the EI system (1/2)
Crowdsourcing evaluation method
Five evaluators scored each sentence translation against the shared task baseline translation as:
- Better than baseline: 1
- Tie with baseline: 0
- Worse than baseline: -1
All 5 scores were added and converted to:
- 1 if >= 2
- -1 if <= -2
- 0 if between 2 and -2
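The aggregation rule above can be sketched in a few lines of Python. The per-sentence conversion follows the slide. The system-level formula, 100 x (wins - losses) / (wins + losses + ties), is not stated on the slide and is assumed here from the WAT overview's description of the pairwise score. The function names are illustrative.

```python
def sentence_judgement(votes):
    """Collapse five votes in {1, 0, -1} into one sentence-level judgement."""
    total = sum(votes)
    if total >= 2:
        return 1     # better than baseline
    if total <= -2:
        return -1    # worse than baseline
    return 0         # tie

def pairwise_score(judgements):
    """System-level score; formula assumed from the WAT overview, not the slide."""
    wins = judgements.count(1)
    losses = judgements.count(-1)
    return 100.0 * (wins - losses) / len(judgements)

print(sentence_judgement([1, 1, 1, 0, -1]))   # votes sum to 2
print(sentence_judgement([1, 1, 0, 0, -1]))   # votes sum to 1

# Sanity check with the Indonesian-English figures reported on slide 17:
# 20% better, 34% comparable, 46% worse.
judgements = [1] * 20 + [0] * 34 + [-1] * 46
print(pairwise_score(judgements))
```

With the slide-17 split, the sketch gives the same -26.00 that the deck reports for the Indonesian-English system.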

Slide 12: Pairwise Crowdsourcing Analysis of the EI system (2/2)
Scores received from pairwise evaluations

Experiment           Approach   Better than baseline   Comparable to baseline   Worse than baseline   Score
English-Indonesian   NNJM       23%                    44.75%                   32.25%                -9.0250

Observations
- For the worse sentences, sentence length is found to be >= 25 words.
- Words not getting translated is the most visible error.

Slide 13: JPO Adequacy Scores of the EI system
Adequacy evaluation method
- Two annotators evaluated 200 translations, assigning adequacy scores from 1 to 5.
- The frequency of each score is used for comparison.

Scores
Experiment           Approach   5        4        3        2       1        Adequacy score
English-Indonesian   NNJM       17.75%   25.25%   23.25%   16.5%   17.25%   3.10
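The reported adequacy score is consistent with a frequency-weighted mean of the 1 to 5 scale. The short check below assumes that this is how the score is derived, and the function name is illustrative.

```python
def mean_adequacy(distribution):
    """Frequency-weighted mean of adequacy scores (shares given in percent)."""
    total = sum(distribution.values())
    return sum(score * share for score, share in distribution.items()) / total

# English-Indonesian NNJM distribution from the table above.
en_id = {5: 17.75, 4: 25.25, 3: 23.25, 2: 16.5, 1: 17.25}
print(round(mean_adequacy(en_id), 2))
```

The weighted mean comes to 3.0975, which rounds to the 3.10 shown in the table.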

Slide 14: Summary of all evaluations for the EI system (NNJM)
Our system's adequacy scores suggest that the translations convey the meaning well.

Slide 15: Indonesian-English MT system

Slide 16: Results for the Indonesian-English MT system

Approach used               BLEU    RIBES      AMFM
Phrase-based SMT            22.03   0.78032    0.564580
Operation Sequence Model*   22.24   0.781430   0.566950
Neural LM with OE = 700     22.58   0.781983   0.569330
Neural LM with OE = 750     21.99   0.780901   0.56340
Neural LM with OE = 800     22.15   0.782302   0.566470
Joint neural LM             22.05   0.781268   0.565860

NPLM increases the BLEU score by 0.55 points over the PBSMT system.
* WAT submission. OE: output embedding.

Slide 17: Pairwise Crowdsourcing Analysis of the IE system
Scores of the crowdsourcing evaluation (refer to slide 11 for the evaluation method)

Experiment           Approach       Better than baseline   Comparable to baseline   Worse than baseline   Score
Indonesian-English   OSM approach   20%                    34%                      46%                   -26.00

Observations
- For the worse sentences, sentence length is found to be >= 25 words.

Slide 18: JPO Adequacy Scores of the IE system
Scores (refer to slide 13 for the evaluation method)

Experiment           Approach       5     4        3        2       1    Adequacy score
Indonesian-English   OSM approach   12%   18.75%   31.75%   30.5%   7%   2.98

Observation
- From the adequacy distribution, more than 50% of the translations are adequate enough to convey the meaning.
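The "> 50%" observation can be checked directly from the distribution. The slide does not say which scores count as "adequate enough"; the sketch below assumes a threshold of 3 or higher, and the function name is illustrative.

```python
def share_at_least(distribution, threshold):
    """Percentage of translations scored at or above the threshold."""
    return sum(share for score, share in distribution.items() if score >= threshold)

# Indonesian-English OSM distribution from the table above (percent).
id_en = {5: 12.0, 4: 18.75, 3: 31.75, 2: 30.5, 1: 7.0}

# Assumption: a score of 3 or higher counts as "adequate enough".
print(share_at_least(id_en, 3))
```

Under that assumption 62.5% of the translations qualify. With a stricter threshold of 4 the share drops to 30.75%, so the observation depends on where the cut-off is placed.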

Slide 19: Summary of all evaluations for the Indonesian-English system (osm)
Our system's scores with the OSM approach are not very promising compared with the baseline system.

Slide 20: Output Analysis of the Indonesian-English System

Example 1
Reference sentence: Moreover, syariah banking has yet to become a national agenda, Riawan said.
Translated sentence: In addition, the banking industry had not so national agenda, said Riawan who also director of the main BMI.
Error analysis: Phrase insertion

Example 2
Reference sentence: Of course, we will adhere to the rules, Bimo said.
Translated sentence: We will certainly patuhi regulations, Bimo said.
Error analysis: All words not translated

Example 3
Reference sentence: The Indonesian government last year canceled 11 foreign-funded projects across the country for various reasons, the Finance Ministry said.
Translated sentence: The government has cancel foreign loans from various creditors to 11 projects in 2006 because various reasons.
Error analysis: Phrase dropped

Example 4
Reference sentence: As the second largest Islamic bank with a 29% market share of the Islamic banking industry's total assets at end-2007 albeit only 0.5% of overall banking industry's total assets, net financing margin NFM on Muamalat's financing operations increased to 7.9% in 2007 from 6.4% in 2004 due to better funding structure.
Translated sentence: As the second largest bank of the market by 29 percent of the total assets syariah banking loans at the end of December 2007 although the market only 0.5 percent of the total assets banking industry as a whole, financing profit margin Muamalat rose to 7.9 percent in 2007 from 6.4 percent in 2004 thanks to funding structure.
Error analysis: Phrase dropped

* Text in blue represents error

Slide 21: Observations by Language Experts
Output analysis of the Indonesian-English system
- The sentences were adequate and fluent to some extent.
- The major errors were dropped and inserted phrases.
- Some Indonesian words could not be translated into English because the learnt vocabulary did not cover them.
  - However, OOV words were found to make up only 5% of the total words in the test set.
- Errors in the choice of English function words.
  - Understanding how function words are used in the source language requires some linguistic insight into Indonesian.

Slide 22: Conclusion
- Because the two languages are structurally similar, the translation outputs are adequate to understand.
- Integrating a Neural Probabilistic LM (NPLM) trained with additional data as a feature in the PBSMT system improves translation quality.
- Integrating a Neural Network Joint Model (bilingual LM) trained on parallel data as a feature in the PBSMT system improves translation quality.

Slide 23: Future Work
- Investigate the hyperparameters of the neural language model.
- Experiment with a pure neural MT system for the English-Indonesian language pair.

Slide 24: References
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2015. "Neural machine translation by jointly learning to align and translate." In ICLR.
Devlin, Jacob, Rabih Zbib, Zhongqiang Huang, Thomas Lamar, Richard M. Schwartz, and John Makhoul. 2014. "Fast and Robust Neural Network Joint Models for Statistical Machine Translation." In Conference of the Association of Computational Linguistics.
Durrani, Nadir, Helmut Schmid, and Alexander Fraser. 2011. "A joint sequence translation model with integrated reordering." In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1. Association for Computational Linguistics.
Durrani, Nadir, Alexander M. Fraser, and Helmut Schmid. 2013. "Model With Minimal Translation Units, But Decode With Phrases." In HLT-NAACL.
Koehn, Philipp, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan. 2007. "Moses: Open source toolkit for statistical machine translation." In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. Association for Computational Linguistics.
Nakazawa, Toshiaki, Hideya Mino, Chenchen Ding, Isao Goto, Graham Neubig, Sadao Kurohashi, and Eiichiro Sumita. 2016. "Overview of the 3rd Workshop on Asian Translation." In Proceedings of the 3rd Workshop on Asian Translation (WAT2016), October.
Niehues, Jan, Teresa Herrmann, Stephan Vogel, and Alex Waibel. 2011. "Wider context by using bilingual language models in machine translation." In Proceedings of the Sixth Workshop on Statistical Machine Translation. Association for Computational Linguistics.
Vaswani, Ashish, Yinggong Zhao, Victoria Fossum, and David Chiang. 2013. "Decoding with Large-Scale Neural Language Models Improves Translation." In EMNLP.

Slide 25: Thank You!