Unsupervised Machine Translation
1 Unsupervised Machine Translation
Alexis Conneau, 3rd year PhD student, Facebook AI Research and Université Le Mans
Joint work with Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
3 Motivation
- Neural machine translation works well for language pairs with a lot of parallel data (English-French, English-German, etc.)
- Performance drops when parallel data is scarce (Vietnamese, Norwegian, Basque, Ukrainian, Serbian)
- Creating parallel data is difficult and costly
- Most language pairs use English as a pivot
- However, monolingual data is much easier to find
5 Questions
- Can we use monolingual data to improve an MT system?
- Can we reduce the amount of supervision?
- Can we even learn WITHOUT ANY supervision?
10 Prior work: semi-supervised
Back-translation (Sennrich et al., 2015)
- Data: a small parallel dataset (English // French) and a huge monolingual corpus in the target language (French, mono)
- Train a (target → source) model M_t2s
- Use M_t2s to translate the target monolingual corpus, which gives English (noisy) // French (mono) pairs
- Use the two parallel datasets to train M_s2t
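The back-translation recipe on this slide can be sketched as a short data-preparation routine. This is only an illustration of the three steps, not the authors' code: `train_model` is a hypothetical stand-in for training a real NMT system (here it just memorises a word-level lexicon), and the toy sentences are made up.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]              # (source sentence, target sentence)
Translator = Callable[[str], str]


def train_model(pairs: List[Pair]) -> Translator:
    """Hypothetical stand-in for NMT training: memorise a word-level lexicon."""
    lexicon: Dict[str, str] = {}
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            lexicon.setdefault(s, t)
    return lambda sentence: " ".join(lexicon.get(w, w) for w in sentence.split())


def back_translate(parallel: List[Pair], target_mono: List[str]) -> Translator:
    # 1. Train a target -> source model M_t2s on the small parallel set.
    m_t2s = train_model([(tgt, src) for src, tgt in parallel])
    # 2. Translate the target monolingual corpus into (noisy) source text.
    synthetic = [(m_t2s(tgt), tgt) for tgt in target_mono]
    # 3. Train M_s2t on the real pairs plus the synthetic pairs.
    return train_model(parallel + synthetic)


parallel = [("the cat", "le chat"), ("a dog", "un chien")]
target_mono = ["le chien", "un chat"]
m_s2t = back_translate(parallel, target_mono)
print(m_s2t("the dog"))
```

The point of the sketch is the data flow: the target side of every synthetic pair is clean human text, and only the source side is noisy.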
11 Prior work: semi-supervised
- Back-translation (Sennrich et al., 2015)
- Dual learning (He et al., 2016)
  - (source → target → source): M_t2s(M_s2t(x_s)) = x_s
  - (target → source → target): M_s2t(M_t2s(x_t)) = x_t
13 Prior work
- Semi-supervised
  - Back-translation (Sennrich et al., 2015)
  - Dual learning (He et al., 2016)
- Pivot-based
  - Related language pairs (Firat et al., 2016; Johnson et al., 2016)
  - Images (Nakayama & Nishida, 2017; Lee et al., 2017)
- Fully unsupervised
  - Ravi & Knight
14 Our approach
- Start with unsupervised word translation
  - An easier task to start with
  - There are already insights into why it could work
  - It can be used as a first step towards unsupervised sentence translation
18 Weakly-supervised word translation
Exploiting similarities among languages for machine translation (Mikolov et al., 2013)
- Start from two pre-trained monolingual embedding spaces (word2vec)
  - Totally unsupervised
  - Widely used
  - Strong systems for monolingual embeddings
  - Semantically and syntactically relevant
  - Not task-specific, useful across domains
- Project the source space onto the target space using a small dictionary
- A feed-forward network does not improve over a linear mapping (Mikolov et al., 2013)
- Orthogonal projection works best (Xing et al., 2015; Smith et al., 2017)
21 Weakly-supervised word translation
- Linear projection (Mikolov et al., 2013)
- Orthogonal projection (Xing et al., 2015; Smith et al., 2017): Procrustes
- Given a source word s, define its translation as the nearest neighbor of its mapped embedding among the target embeddings, according to the cosine distance
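The orthogonal (Procrustes) mapping has a well-known closed form: with word vectors stored as columns of X and Y, the orthogonal W minimising ‖WX − Y‖_F is UVᵀ, where UΣVᵀ is the SVD of YXᵀ. The sketch below applies it to toy data and then translates by cosine nearest neighbor. The random embeddings, the hidden rotation `Q` and the function names are assumptions made for illustration, not the experimental setup of the talk.

```python
import numpy as np


def procrustes(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Orthogonal W minimising ||W X - Y||_F, with word vectors as columns."""
    U, _, Vt = np.linalg.svd(Y @ X.T)
    return U @ Vt


def translate(W: np.ndarray, x: np.ndarray, targets: np.ndarray) -> int:
    """Index of the target column closest to W x in cosine similarity."""
    mapped = W @ x
    sims = (targets.T @ mapped) / (
        np.linalg.norm(targets, axis=0) * np.linalg.norm(mapped)
    )
    return int(np.argmax(sims))


rng = np.random.default_rng(0)
d, n = 5, 50
X = rng.normal(size=(d, n))                     # toy "source" embeddings
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))    # hidden orthogonal map
Y = Q @ X                                       # toy "target" embeddings

W = procrustes(X[:, :20], Y[:, :20])            # fit on a 20-pair dictionary
hits = sum(translate(W, X[:, i], Y) == i for i in range(n))
print(f"recovered W: {np.allclose(W, Q)}, retrieval accuracy: {hits / n:.2f}")
```

Real embedding spaces are only approximately related by a rotation, so retrieval accuracy on real data is well below what this idealised example gives.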
22 Unsupervised word translation
- Can we find the mapping W in an unsupervised way?
25 Adversarial training
- If WX and Y are perfectly aligned, these spaces should be indistinguishable
- Discriminator training: train a discriminator D to discriminate elements from WX and Y
- Mapping training: train W to prevent the discriminator from making accurate predictions
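One way to read the two training steps is as a pair of cross-entropy losses with opposite labels, in the style of a standard GAN objective. The sketch below assumes the discriminator outputs the probability that a vector comes from the mapped source space WX. The function names and the example probabilities are hypothetical, and the slides do not give the exact losses.

```python
import math
from typing import List


def discriminator_loss(p_src_on_wx: List[float], p_src_on_y: List[float]) -> float:
    """Cross-entropy for D: label mapped source vectors 1 and target vectors 0.

    Each list holds D's predicted probability that a vector comes from WX.
    """
    loss_wx = -sum(math.log(p) for p in p_src_on_wx) / len(p_src_on_wx)
    loss_y = -sum(math.log(1.0 - p) for p in p_src_on_y) / len(p_src_on_y)
    return loss_wx + loss_y


def mapping_loss(p_src_on_wx: List[float], p_src_on_y: List[float]) -> float:
    """Loss for W: the same cross-entropy with the labels flipped, so W is
    rewarded when D mistakes mapped source vectors for target vectors."""
    loss_wx = -sum(math.log(1.0 - p) for p in p_src_on_wx) / len(p_src_on_wx)
    loss_y = -sum(math.log(p) for p in p_src_on_y) / len(p_src_on_y)
    return loss_wx + loss_y


# A confident, accurate discriminator: low loss for D, high loss for W.
sharp = ([0.9, 0.95], [0.1, 0.05])
# A confused discriminator (aligned spaces): both losses equal 2 * log(2).
confused = ([0.5, 0.5], [0.5, 0.5])

print(discriminator_loss(*sharp), mapping_loss(*sharp))
print(discriminator_loss(*confused), mapping_loss(*confused))
```

In training, the two losses would be minimised alternately: one step on the discriminator parameters, then one step on W.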
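26 Orthogonality constraint
- Isometric mapping: preserves dot products
- Preserves the quality of the monolingual embeddings
- Makes training more robust (no mapping collapse)
- After each training update, project the mapping towards the orthogonal manifold:
  $W \leftarrow (1 + \beta) W - \beta (W W^\top) W$
  This is a gradient descent step on the penalty $\|W^\top W - \mathrm{Id}\|_F^2$, with a step size proportional to $\beta$ (Cisse et al., ICML 2017)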
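The effect of the update W ← (1 + β)W − β(WWᵀ)W can be checked numerically: applied repeatedly to a matrix that is close to orthogonal, it shrinks ‖WᵀW − Id‖_F. The sketch below is a toy check under assumed values (a 4×4 matrix, β = 0.01, a random perturbation standing in for a gradient update). In actual training the step would be interleaved with the adversarial updates rather than iterated on its own.

```python
import numpy as np


def orthogonalize_step(W: np.ndarray, beta: float = 0.01) -> np.ndarray:
    """One update W <- (1 + beta) W - beta (W W^T) W."""
    return (1.0 + beta) * W - beta * (W @ W.T) @ W


def orthogonality_gap(W: np.ndarray) -> float:
    """Frobenius norm of W^T W - Id (zero for an orthogonal matrix)."""
    return float(np.linalg.norm(W.T @ W - np.eye(W.shape[1])))


rng = np.random.default_rng(0)
# Start from an orthogonal matrix perturbed by a (simulated) gradient update.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
W = Q + 0.05 * rng.normal(size=(4, 4))

before = orthogonality_gap(W)
for _ in range(200):
    W = orthogonalize_step(W, beta=0.01)
after = orthogonality_gap(W)
print(f"gap before: {before:.4f}, after: {after:.6f}")
```

Each singular value σ of W is sent to (1 + β)σ − βσ³, which has σ = 1 as an attracting fixed point for small β. This is why the matrix drifts back towards the orthogonal manifold.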
28 Results on word translation
[Figure: word translation retrieval, Procrustes vs. Adversarial, for en-es, es-en, en-fr, fr-en, en-ru, ru-en, en-zh, zh-en]
- Word translation retrieval: 1.5k source queries, 200k target keys (vocabulary of 200k words for all languages)
29 Unsupervised word translation: summary
Given independent monolingual datasets in a source and a target language:
- We can create high-quality cross-lingual dictionaries
- We can create high-quality cross-lingual embeddings
33 Unsupervised sentence translation
- Could we apply the same unsupervised training procedure to sentences?
- The number of points grows exponentially with sentence length
- There are no similar embedding structures across languages
- Direct application does not work (even in a supervised setting)
37 Proposed architecture: Denoising Auto-Encoding
[Diagram: C noise → Source encoder → Source decoder]
- Train a source → source denoising autoencoder (DAE)
- Critical to add noise to avoid trivial reconstructions
- Two sources of noise
  - Word dropout: each word is removed with a probability p (usually 0.1)
    Ref: Arizona was the first to introduce such a requirement.
    Arizona was the first to such a requirement.
    Arizona was first to introduce such a requirement.
  - Word shuffle: word order is (slightly) shuffled inside sentences
    Ref: Arizona was the first to introduce such a requirement.
    Arizona the first was to introduce a requirement such.
    Arizona was the to introduce first such requirement a.
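The two noise sources can be sketched as a small corruption function. Word dropout follows the slide directly. For the "slight" shuffle, the slide gives no formula, so the sketch assumes one common construction: add a random offset in [0, k + 1) to each position and sort, which keeps every word within k positions of where it started. The value k = 3, the function names and the non-empty fallback are assumptions.

```python
import random
from typing import List, Optional


def word_dropout(words: List[str], p: float, rng: random.Random) -> List[str]:
    """Remove each word independently with probability p."""
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]   # never return an empty sentence


def word_shuffle(words: List[str], k: int, rng: random.Random) -> List[str]:
    """Slightly shuffle: sort positions after adding noise drawn from [0, k + 1)."""
    keys = [i + rng.uniform(0, k + 1) for i in range(len(words))]
    order = sorted(range(len(words)), key=keys.__getitem__)
    return [words[i] for i in order]


def add_noise(sentence: str, p: float = 0.1, k: int = 3,
              seed: Optional[int] = None) -> str:
    """Corrupt a sentence with word dropout followed by a local shuffle."""
    rng = random.Random(seed)
    words = sentence.split()
    return " ".join(word_shuffle(word_dropout(words, p, rng), k, rng))


ref = "Arizona was the first to introduce such a requirement ."
for seed in range(3):
    print(add_noise(ref, seed=seed))
```

The autoencoder is then trained to reconstruct the clean sentence from its corrupted version, so it cannot succeed by copying its input word for word.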
40 Proposed architecture: Denoising Auto-Encoding
[Diagram: C noise → Source encoder → Source decoder; C noise → Target encoder → Target decoder; Discriminator on the latent states]
- Train a source → source denoising autoencoder (DAE)
- Train a target → target denoising autoencoder (DAE)
- Make source and target latent states indistinguishable using adversarial training
- We want decoders to operate in the same space → share parameters between encoders
43 Proposed architecture: Denoising Auto-Encoding
[Diagram: C noise → Source encoder → Source decoder; C noise → Target encoder → Target decoder; Discriminator on the latent states]
- Works on simple / small datasets, with short sentences or small vocabulary
- Problem: at test time we want (source → target) or (target → source)
- Cross-domain training: train the model to perform actual translations
- We do not have parallel data → generate artificial translations for training
46 Proposed architecture: Cross-Domain Training
[Diagram: M (model at previous iteration) → Source encoder → Target decoder; M (model at previous iteration) → Target encoder → Source decoder]
- Train on pairs generated using a stale version of the model
- Start with word-by-word translation
  une photo d'une rue bondée en ville. (sentence from monolingual corpus)
  a photo of a street crowded in a city. (word-by-word translation)
  a view of a crowded city street. (gold translation)
- Symmetric training
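The control flow of cross-domain training can be sketched as a loop in which each iteration translates the monolingual corpora with the previous (stale) models and trains new models on the resulting pairs, in both directions. This is a structural sketch only: `train` is a hypothetical placeholder that memorises whole sentences, where the real system would update a shared encoder-decoder, and the toy dictionaries stand in for the word-by-word translation model used at initialisation.

```python
from typing import Callable, Dict, List, Tuple

Translator = Callable[[str], str]
Pair = Tuple[str, str]


def word_by_word(dictionary: Dict[str, str]) -> Translator:
    """Initial model: translate each word independently with a dictionary."""
    return lambda s: " ".join(dictionary.get(w, w) for w in s.split())


def train(pairs: List[Pair]) -> Translator:
    """Hypothetical placeholder for training on (input, output) pairs.
    It memorises whole sentences and copies any unseen input."""
    table = dict(pairs)
    return lambda s: table.get(s, s)


def cross_domain_training(src_mono: List[str], tgt_mono: List[str],
                          s2t: Translator, t2s: Translator,
                          iterations: int = 3) -> Tuple[Translator, Translator]:
    for _ in range(iterations):
        # Translate with the stale models (those from the previous iteration).
        noisy_tgt = [s2t(x) for x in src_mono]
        noisy_src = [t2s(y) for y in tgt_mono]
        # Symmetric training: each direction learns to map a noisy
        # translation back to the clean monolingual sentence.
        new_s2t = train(list(zip(noisy_src, tgt_mono)))
        new_t2s = train(list(zip(noisy_tgt, src_mono)))
        s2t, t2s = new_s2t, new_t2s
    return s2t, t2s


fr_en = {"une": "a", "rue": "street"}
en_fr = {"a": "une", "street": "rue"}
s2t, t2s = cross_domain_training(
    ["une rue"], ["a street"], word_by_word(fr_en), word_by_word(en_fr)
)
print(s2t("une rue"), "|", t2s("a street"))
```

Because the target side of each training pair is always a genuine monolingual sentence, the decoders only ever learn to produce fluent text. Only the inputs are noisy.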
50 Recap
- Denoising autoencoding to learn good sentence representations
- Match distributions of latent features across the two domains
  - Adversarial training
  - Parameter sharing
- Cross-lingual training to learn to translate
  - Trick: use a stale version of the model to produce a noisy source
  - Use a word-by-word translation model to initialize the algorithm
- Pretrain word embeddings with aligned embeddings
55 Examples of unsupervised translations
Source: une femme aux cheveux roses habillée en noir parle à un homme.
Iteration 0: a woman at hair roses dressed in black speaks to a man.
Iteration 1: a woman at glasses dressed in black talking to a man.
Iteration 2: a woman at pink hair dressed in black speaks to a man.
Iteration 3: a woman with pink hair dressed in black is talking to a man.
Reference: a woman with pink hair dressed in black talks to a man.

$$\log(\mathrm{BLEU}) = \min\left(1 - \frac{r}{c},\, 0\right) + \sum_{n=1}^{N} \frac{1}{N} \log p_n$$
- c: length of the candidate translation
- r: average length of a reference over the corpus
- p_n: number_shared_ngrams(candidate, reference) / length(candidates)
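The BLEU formula on this slide can be turned into a few lines of code. The sketch below is a simplified, single-sentence version under stated assumptions: one reference per candidate (so r is just that reference's length), whitespace tokenisation, clipped n-gram matches, N = 4, and a score of 0 when some n-gram order has no match. It is not a replacement for a standard BLEU implementation.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_bleu = min(1.0 - len(ref) / len(cand), 0.0)        # brevity penalty
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        shared = sum((cand_ngrams & ref_ngrams).values())  # clipped matches
        total = sum(cand_ngrams.values())
        if shared == 0 or total == 0:
            return 0.0                                     # log p_n is undefined
        log_bleu += math.log(shared / total) / max_n
    return math.exp(log_bleu)


reference = "a woman with pink hair dressed in black talks to a man."
iteration_0 = "a woman at hair roses dressed in black speaks to a man."
iteration_3 = "a woman with pink hair dressed in black is talking to a man."

for name, hyp in [("iteration 0", iteration_0), ("iteration 3", iteration_3)]:
    print(name, round(bleu(hyp, reference), 3))
```

On a single sentence the score is very sensitive to missing 4-grams, which is one reason BLEU is normally accumulated over a whole test corpus.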
56 Thank you
- Word translation without parallel data. Alexis Conneau*, Guillaume Lample*, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou (ICLR 2018). Code:
- Unsupervised Machine Translation Using Monolingual Corpora Only. Guillaume Lample, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato (ICLR 2018)
More informationCross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels
Cross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels Jörg Tiedemann Uppsala University Department of Linguistics and Philology firstname.lastname@lingfil.uu.se Abstract
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationNCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches
NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches Yu-Chun Wang Chun-Kai Wu Richard Tzong-Han Tsai Department of Computer Science
More informationWhen Student Confidence Clicks
When Student Confidence Clicks Academic Self-Efficacy and Learning in HE Fabio R. Aricò 1 ACKNOWLEDGEMENTS UEA-HEFCE Widening Participation Teaching Fellowship HEA Teaching Development Grant Scheme 2 ETHICAL
More informationA DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF GRAPH DATA
International Journal of Semantic Computing Vol. 5, No. 4 (2011) 433 462 c World Scientific Publishing Company DOI: 10.1142/S1793351X1100133X A DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF
More informationWHEN THERE IS A mismatch between the acoustic
808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,
More informationLanguage Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus
Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,
More informationModel Ensemble for Click Prediction in Bing Search Ads
Model Ensemble for Click Prediction in Bing Search Ads Xiaoliang Ling Microsoft Bing xiaoling@microsoft.com Hucheng Zhou Microsoft Research huzho@microsoft.com Weiwei Deng Microsoft Bing dedeng@microsoft.com
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description
Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:
More informationProject in the framework of the AIM-WEST project Annotation of MWEs for translation
Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment
More informationDomain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling
Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling Pratyush Banerjee, Sudip Kumar Naskar, Johann Roturier 1, Andy Way 2, Josef van Genabith
More informationOn-the-Fly Customization of Automated Essay Scoring
Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,
More informationThe Good Judgment Project: A large scale test of different methods of combining expert predictions
The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania
More informationCourse Outline. Course Grading. Where to go for help. Academic Integrity. EE-589 Introduction to Neural Networks NN 1 EE
EE-589 Introduction to Neural Assistant Prof. Dr. Turgay IBRIKCI Room # 305 (322) 338 6868 / 139 Wensdays 9:00-12:00 Course Outline The course is divided in two parts: theory and practice. 1. Theory covers
More informationhave to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,
A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994
More informationADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION
ADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION Mitchell McLaren 1, Yun Lei 1, Luciana Ferrer 2 1 Speech Technology and Research Laboratory, SRI International, California, USA 2 Departamento
More informationCombining Bidirectional Translation and Synonymy for Cross-Language Information Retrieval
Combining Bidirectional Translation and Synonymy for Cross-Language Information Retrieval Jianqiang Wang and Douglas W. Oard College of Information Studies and UMIACS University of Maryland, College Park,
More informationData Integration through Clustering and Finding Statistical Relations - Validation of Approach
Data Integration through Clustering and Finding Statistical Relations - Validation of Approach Marek Jaszuk, Teresa Mroczek, and Barbara Fryc University of Information Technology and Management, ul. Sucharskiego
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationModeling full form lexica for Arabic
Modeling full form lexica for Arabic Susanne Alt Amine Akrout Atilf-CNRS Laurent Romary Loria-CNRS Objectives Presentation of the current standardization activity in the domain of lexical data modeling
More informationWord Sense Disambiguation
Word Sense Disambiguation D. De Cao R. Basili Corso di Web Mining e Retrieval a.a. 2008-9 May 21, 2009 Excerpt of the R. Mihalcea and T. Pedersen AAAI 2005 Tutorial, at: http://www.d.umn.edu/ tpederse/tutorials/advances-in-wsd-aaai-2005.ppt
More informationSummarizing Answers in Non-Factoid Community Question-Answering
Summarizing Answers in Non-Factoid Community Question-Answering Hongya Song Zhaochun Ren Shangsong Liang hongya.song.sdu@gmail.com zhaochun.ren@ucl.ac.uk shangsong.liang@ucl.ac.uk Piji Li Jun Ma Maarten
More informationSyntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews
Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews Kang Liu, Liheng Xu and Jun Zhao National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy
More informationOCR for Arabic using SIFT Descriptors With Online Failure Prediction
OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationOntological spine, localization and multilingual access
Start Ontological spine, localization and multilingual access Some reflections and a proposal New Perspectives on Subject Indexing and Classification in an International Context International Symposium
More informationA survey of multi-view machine learning
Noname manuscript No. (will be inserted by the editor) A survey of multi-view machine learning Shiliang Sun Received: date / Accepted: date Abstract Multi-view learning or learning with multiple distinct
More informationCombining a Chinese Thesaurus with a Chinese Dictionary
Combining a Chinese Thesaurus with a Chinese Dictionary Ji Donghong Kent Ridge Digital Labs 21 Heng Mui Keng Terrace Singapore, 119613 dhji @krdl.org.sg Gong Junping Department of Computer Science Ohio
More informationTransfer Learning Action Models by Measuring the Similarity of Different Domains
Transfer Learning Action Models by Measuring the Similarity of Different Domains Hankui Zhuo 1, Qiang Yang 2, and Lei Li 1 1 Software Research Institute, Sun Yat-sen University, Guangzhou, China. zhuohank@gmail.com,lnslilei@mail.sysu.edu.cn
More informationA Reinforcement Learning Variant for Control Scheduling
A Reinforcement Learning Variant for Control Scheduling Aloke Guha Honeywell Sensor and System Development Center 3660 Technology Drive Minneapolis MN 55417 Abstract We present an algorithm based on reinforcement
More informationFUZZY EXPERT. Dr. Kasim M. Al-Aubidy. Philadelphia University. Computer Eng. Dept February 2002 University of Damascus-Syria
FUZZY EXPERT SYSTEMS 16-18 18 February 2002 University of Damascus-Syria Dr. Kasim M. Al-Aubidy Computer Eng. Dept. Philadelphia University What is Expert Systems? ES are computer programs that emulate
More information