A Comparative Study on Applying Hierarchical Phrase-based and Phrase-based on Thai-Chinese Translation
2012 Seventh International Conference on Knowledge, Information and Creativity Support Systems

Prasert Luekhong 1,2, Rattasit Sukhauta 2, Peerachet Porkaew 3, Taneth Ruangrajitpakorn 3 and Thepchai Supnithi 3
1 College of Integrated Science and Technology, Rajamangala University of Technology Lanna, Chiang Mai, Thailand, prasert@rmutl.ac.th
2 Computer Science Department, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand, rattasit.s@cmu.ac.th
3 Language and Semantic Technology Laboratory, National Electronics and Computer Technology Center, Thailand, {peerachet.porkaew, taneth.rua, thepchai}@nectec.or.th

Abstract- To set an appropriate goal for SMT research on Thai-based translation, a comparative study of the potential and suitability of phrase-based translation (PBT) and hierarchical phrase-based translation (HPBT) becomes the initial question. The Thai-Chinese language pair is chosen as the experimental subject since the two languages share most common syntactic patterns and Chinese resources are plentiful. Under a standard setting, we find that 3-gram HPBT gains a significantly better BLEU score than 3-gram PBT, while 3-gram HPBT is approximately equal to 5-gram PBT. Moreover, from the results, Chinese-to-Thai translation obtains better accuracy than Thai-to-Chinese translation with every approach.

Keywords- hierarchical phrase-based translation; SMT; Thai-Chinese translation

I. INTRODUCTION

In the past decades, much research on statistical machine translation (SMT) has been conducted, resulting in several methods and approaches. The major approaches of SMT can be categorized as the word-based approach, the phrase-based approach and the tree-based approach [1]. With the high demand for SMT development, various software toolkits have been developed to help implement SMT, such as Moses [2], Phrasal [3], Cdec [4], Joshua [5] and Jane [6].
Moses and Phrasal gain our focus since both are open-source and can effectively implement all three above-mentioned approaches, while the others cannot. However, Moses receives more public attention than Phrasal in terms of popularity, since it has been applied as a baseline in several venues such as ACL since 2007, Coling, EMNLP, and so on. With a tool such as Moses, an SMT developer requires only a parallel corpus of a language pair to conveniently implement a baseline statistical translation system. Various language pairs have been tried with SMT in the past, such as English-French, English-Spanish and English-German, and they eventually gained impressive accuracy [7], since they have sufficient, well-made data for training, for instance the Linguistic Data Consortium (LDC) [8], the parallel corpus for statistical machine translation (Europarl) [9], the JRC-Acquis [10] and the English-Persian Parallel Corpus [11]. Unfortunately, for a low-resource language such as Thai, researchers suffer from insufficient data to conduct a full-scale SMT experiment, and the translation accuracy with other languages is therefore low; for example, simple phrase-based SMT on English-to-Thai gained a BLEU score of around 13.11 [12]. Furthermore, Thai currently lacks a sufficient syntactic treebank to support the tree-based approach; hence SMT research on Thai is limited to the word-based and phrase-based approaches. Since phrase-based SMT has been shown to outperform the word-based approach [1], the development of word-based SMT for Thai is dismissed. With the limited resources for a complete Thai tree-based SMT experiment with Moses, hierarchical phrase-based translation (HPBT) becomes more interesting, since its accuracy on other language pairs is repeatedly reported to be higher than that of the simple phrase-based translation approach (PBT) [13].
Although the high potential of HPBT is renowned, no experiment on HPBT for Thai has yet been reported. On the other hand, some documents claim negative results for HPBT as well; for example, the BLEU score of Arabic-English translation using HPBT is reported to be 0.6 BLEU points lower than with PBT [14]. This raises the question of which approach is more suitable for the Thai language. From the linguistic point of view, it is clear that SMT works better with language pairs of the same typology, since the impressive BLEU scores are noticeably obtained from European language pairs [7]. Therefore, to test the suitability of the different approaches for Thai, Chinese is selected as the paired language because of its rich resources and its structural resemblance to Thai. In this work, a comparative study of Chinese-Thai translation based on the HPBT and PBT approaches is conducted to serve as a flagship for further research on Thai SMT. Moreover, the effect of different language-model orders (3-gram and 5-gram) on the translation result will also be studied.
The rest of this paper is organized as follows. Section II gives background on past work relevant to HPBT and PBT translation results. Section III explains the methods and set-ups of the HPBT and PBT implementations for the Thai-Chinese pair. Section IV details the experiment setting and shows the experimental results with discussion. Lastly, Section V gives a conclusion and a list of further plans for improvement.

II. BACKGROUND

Statistical machine translation (SMT) is a machine translation paradigm in which translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The idea behind statistical machine translation comes from information theory. A document is translated according to a probability distribution, defined by Brown [15], that a string in the target language (for example, English) is the translation of a string in the source language (for example, French). SMT can be divided into three categories: word-based, phrase-based and tree-based models.

A. Word-Based Model

The word-based model is based on lexical translation. Translating words in isolation requires a bilingual dictionary that maps words from one language to another. The main issue with this approach is caused by lexical complexity. Generally, in natural language, words with the same surface form refer not to a single concept but to multiple entities. Even though they can all be defined in the dictionary, the entries are still not sufficient to cover the actual meaning in different contexts. For example, the Thai word (koh) can be translated into either island (noun) or to stick (verb) in English. An example of a word-based translation system is the freely available GIZA++ [16] package (GPLed), which includes a training program for IBM Models 1-5 following the description by Brown [15] and a hidden Markov model [17].
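The lexical-ambiguity limitation above can be illustrated with a minimal sketch of word-by-word translation. The probability table and the romanized word "koh" are illustrative assumptions, not data from this paper:

```python
# A toy bilingual lexicon: each source word maps to candidate
# translations with probabilities (illustrative numbers only).
LEXICON = {
    "koh": [("island", 0.7), ("stick", 0.3)],
    "yai": [("big", 1.0)],
}

def translate_word_based(sentence):
    """Translate word by word, always picking the most probable
    candidate. Context is ignored, which is the core weakness of
    the word-based model."""
    out = []
    for word in sentence.split():
        candidates = LEXICON.get(word, [(word, 1.0)])
        best, _ = max(candidates, key=lambda c: c[1])
        out.append(best)
    return " ".join(out)

# "koh" is always rendered as "island", even in contexts where
# the verb sense "to stick" is the intended meaning.
print(translate_word_based("koh yai"))  # island big
```

Because the choice is made per word from a static table, no surrounding context can flip the decision, which is exactly the shortcoming the phrase-based model addresses.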
Recently, attention to the word-based approach has been fading, since its results have proved unreliably low and several methods overcome its capability.

B. Phrase-Based Model

As its name suggests, the phrase-based model performs translation based on phrasal units. It gains an advantage over the simple word-based model in terms of appropriateness in selecting a translation according to the surrounding context. Koehn states that the currently best performing statistical machine translation systems are based on phrase-based models [1]. The capability to translate small word sequences at a time is arguably the advantage of phrase-based translation. Though many SMT systems [18][19][20][21][22] have been developed based on this approach and show adequate outcomes, the remaining limitation is that it is a purely statistical method without linguistic knowledge and can return unexpected errors caused by sparseness and an insufficient amount of training data. Nevertheless, this model remains favored since it is simple to implement with plain parallel corpora.

C. Tree-Based Model

The tree-based model can be defined as the usage of syntactic trees for assisting in mapping different linguistic structures and contextual word translation by using a synchronous grammar [23][24][25]. Nevertheless, it requires a treebank [26] as a resource for the full translation process. Therefore, less informative models, such as tree-to-string [27][28] and string-to-tree [29], or models without linguistic information, such as the hierarchical phrase-based model, were proposed. For rich-resource languages with an adequate treebank, implementing a tree-based model with full linguistic information can be planned. Otherwise, less informative models or models without linguistic information are the only options for a low-resource language.

III. DEVELOPMENT OF THAI-CHINESE SMT

This work aims to study the compatibility with the Thai language of two well-known approaches to SMT, i.e.
phrase-based translation (PBT) and hierarchical phrase-based translation (HPBT). We design the system architecture for the experiment as shown in Figure 1. The machine translation process starts with the training process. From a parallel corpus, rules for HPBT and phrases for PBT are separately extracted into tables, while the data in the parallel corpus are also used to train a language model. In summary, the training process returns three mandatory outputs for the testing process: a rule table for HPBT, a phrase table for PBT, and a language model for both.

Figure 1. System architecture

For the testing process, an input sentence for translation is needed. As the system manages input one sentence at a time, the input is formatted as one sentence per line. To translate with HPBT and PBT, each decoder is executed separately and returns a translation result. Each process is described in more detail in the following sections.

A. Phrase-based Translation (PBT)

Statistical phrase-based MT is an improvement over statistical word-based MT. The word-based approach uses word-to-word translation probabilities to translate a source sentence. The phrase-based approach allows the system to divide the source sentence into segments before translating those segments. Because segmented translation pairs (so-called phrase translation pairs) can capture local reordering and can reduce translation alternatives, the quality of output from the phrase-based approach is generally higher than that of the word-based approach. It should be noted that phrase pairs are automatically extracted from the corpus and are not necessarily traditional linguistic phrases. As a baseline for comparison with HPBT, PBT is developed with both 3-gram and 5-gram language models. In this work, the phrase-based translation model proposed in [1] is implemented as follows.

1) Phrase Extraction Algorithm

The process of producing the phrase translation model starts with the phrase extraction algorithm. Below is an overview of the algorithm.

1) Collect word translation probabilities from source-to-target (forward model) and target-to-source (backward model) using IBM Model 4.
2) Use the forward and backward models from step 1) to align words in the source-to-target and target-to-source directions respectively. Only the highest-probability alignment is chosen for each word.
3) Intersect the forward and backward word alignment points to get highly accurate alignment points.
4) Fill in additional alignment points using a heuristic growing procedure.
5) Collect consistent phrase pairs from step 4).

2) Phrase-based Model

Given E, the set of possible translation results, and F, the source sentence, finding the best translation can be done by maximizing p(E|F) using Bayes' rule:

  E* = argmax_E p(E|F) = argmax_E p(F|E) p(E)    (1)

where p(F|E) is the translation model and p(E) is the target language model. The target language model can be trained from a monolingual corpus of the target language. (1) can be written in the form of a log-linear model to add customized features. For each phrase pair, five features are introduced, i.e. the forward and backward phrase translation probability distributions, the forward and backward lexical weights, and a phrase penalty. With these five features, (1) can be summarized as follows:

  E* = argmax_E p_LM(E)^{λ_LM} ∏_{i=1}^{I} φ(f_i|e_i)^{λ_1} φ(e_i|f_i)^{λ_2} lex(f_i|e_i)^{λ_3} lex(e_i|f_i)^{λ_4} PP^{λ_5}    (2)

In (2), f_1 ... f_I is a phrase segmentation of F. The terms φ(f_i|e_i) and φ(e_i|f_i) are the phrase-level conditional probabilities for the forward and backward distributions, with feature weights λ_1 and λ_2 respectively. lex(f_i|e_i) and lex(e_i|f_i) are lexical weight scores for the phrase pair, with weights λ_3 and λ_4; these lexical weights are calculated from the forward and backward word alignment probabilities. The term PP is the phrase penalty with feature weight λ_5; the phrase penalty supports the selection of fewer, longer phrase pairs. p_LM is the language model with weight λ_LM. The phrase-level conditional probabilities, or phrase translation probabilities, can be obtained from the phrase extraction process by relative frequency:

  φ(f|e) = count(f, e) / Σ_{f'} count(f', e)    (3)

The lexical weight is applied to check the quality of an extracted phrase pair. For a given phrase pair (f, e) with word alignment a, the lexical weight is the joint probability over all word alignments; for a source word that aligns to more than one target word, the average probability is used:

  lex(f|e, a) = ∏_{u=1}^{n} (1/|{v : (u, v) ∈ a}|) Σ_{(u,v)∈a} w(f_u|e_v)    (4)

where w(f_u|e_v) is the lexical translation probability of the word pair and n is the number of words in phrase f.

3) Decoding

The decoder is used to search for the most likely translation according to the source sentence, the phrase translation model and the target language model. The search can be performed by beam search [30].
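The phrase extraction algorithm above (steps 1-5) can be sketched as follows; this function implements only the consistency criterion of the final step, taking an already symmetrized word alignment as input. The alignment points and sentence lengths are illustrative:

```python
def extract_phrase_pairs(alignment, src_len, tgt_len, max_len=4):
    """Collect phrase pairs (fi, fj, ei, ej) consistent with a word
    alignment: every alignment point inside the source span must land
    inside the target span, and vice versa."""
    pairs = []
    for fi in range(src_len):
        for fj in range(fi + 1, min(fi + max_len, src_len) + 1):
            # Target positions aligned to any word in the source span.
            tgt_points = [t for (s, t) in alignment if fi <= s < fj]
            if not tgt_points:
                continue  # fully unaligned span: no anchored target phrase
            ei, ej = min(tgt_points), max(tgt_points) + 1
            # Consistency: no word inside the target span may align
            # to a source word outside [fi, fj).
            if all(fi <= s < fj for (s, t) in alignment if ei <= t < ej):
                pairs.append((fi, fj, ei, ej))
    return pairs

# Alignment points (source index, target index) for a 3-word sentence
# pair with one local reordering.
align = [(0, 0), (1, 2), (2, 1)]
print(extract_phrase_pairs(align, 3, 3))
# [(0, 1, 0, 1), (0, 3, 0, 3), (1, 2, 2, 3), (1, 3, 1, 3), (2, 3, 1, 2)]
```

Note how the reordered pair of words can only be extracted together or as the whole sentence, which is exactly how consistent extraction captures local reordering.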
The main algorithm of beam search starts from an initial hypothesis. A new hypothesis is expanded from an existing one by translating some uncovered source phrase, which need not be the next segment of the source sentence. Source words covered along a hypothesis path are marked. The system produces a translation alternative when a path covers all source words. The score of each alternative is calculated, and the sentence with the highest score is selected. Techniques such as hypothesis recombination and heuristic pruning can be applied to cope with the exponential size of the search space.

B. Hierarchical Phrase-based Translation (HPBT)

Chiang [13] proposed hierarchical phrase-based translation (HPBT), a statistical machine translation model that uses hierarchical phrases. Hierarchical phrases are defined as phrases consisting of two or more sub-phrases that hierarchically link to each other. To create the hierarchical phrase model, a synchronous context-free grammar (a.k.a. a syntax-directed transduction grammar [31]) is learned from a parallel text without any syntactic annotations. A synchronous CFG derivation begins with a pair of linked start symbols. At each step, two linked non-terminals are rewritten using the two components of a single rule. When links are denoted with boxed indices, the newly introduced symbols are re-indexed apart from the symbols already present. In this work, we follow the implementation described by Chiang [13]. The methodology can be summarized as follows. Since a grammar in a synchronous CFG consists of elementary structures, i.e. rewrite rules with aligned pairs of right-hand sides, a rule can be defined as:

  X → ⟨γ, α, ∼⟩    (5)

where X is a non-terminal, γ and α are both strings of terminals and non-terminals, and ∼ is a one-to-one correspondence between non-terminal occurrences in γ and non-terminal occurrences in α.

1) Rule Extraction Algorithm

The extraction process begins with a word-aligned corpus: a set of triples ⟨f, e, ∼⟩, where f is a source sentence, e is a target sentence, and ∼ is a (many-to-many) binary relation between positions of f and positions of e. The word alignments are obtained by running GIZA++ [16] on the corpus in both directions and forming the union of the two sets of word alignments. From each word-aligned sentence pair, a set of rules consistent with the word alignments is extracted. This can be listed in two main steps.

1) Identify initial phrase pairs using the same criterion as most phrase-based systems [22]: there must be at least one word inside one phrase aligned to a word inside the other, but no word inside one phrase may be aligned to a word outside the other phrase.
2) To obtain rules from the phrases, look for phrases that contain other phrases and replace the sub-phrases with non-terminal symbols.
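The two extraction steps above can be sketched as follows: every initial phrase pair nested inside another pair is subtracted and replaced by a non-terminal [X]. The sentences and spans here are placeholder tokens, not data from the paper, and only a single gap per rule is handled (Chiang's model allows up to two):

```python
def extract_hier_rules(src, tgt, phrase_spans):
    """Turn nested consistent phrase pairs into hierarchical rules.
    phrase_spans: list of (fi, fj, ei, ej) source/target span tuples,
    assumed to already satisfy the consistency criterion of step 1."""
    rules = set()
    for (fi, fj, ei, ej) in phrase_spans:
        # The initial phrase pair itself is a rule with no gaps.
        rules.add((" ".join(src[fi:fj]), " ".join(tgt[ei:ej])))
        for (si, sj, ti, tj) in phrase_spans:
            nested = (fi <= si and sj <= fj and ei <= ti and tj <= ej
                      and (si, sj, ti, tj) != (fi, fj, ei, ej))
            if nested:
                # Subtract the inner pair, leaving a linked gap [X].
                f_side = src[fi:si] + ["[X]"] + src[sj:fj]
                e_side = tgt[ei:ti] + ["[X]"] + tgt[tj:ej]
                rules.add((" ".join(f_side), " ".join(e_side)))
    return rules

src = ["f1", "f2", "f3"]
tgt = ["e1", "e2", "e3"]
spans = [(0, 3, 0, 3), (1, 2, 1, 2)]   # an outer pair and a nested pair
rules = extract_hier_rules(src, tgt, spans)
# rules contains the gapped rule ('f1 [X] f3', 'e1 [X] e3')
```

The gapped rule is what lets the decoder reuse the outer context around any phrase that can fill the [X] slot, which plain phrase pairs cannot express.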
2) Hierarchical Phrase-based Model

Chiang [13] explains that, given a source sentence f, a synchronous CFG will have many derivations that yield f on the source side, and therefore many possible target translations. A model over derivations D is thus defined to predict which translations are more likely than others. Following the log-linear model [32] over derivations, the probability is:

  P(D) ∝ ∏_i φ_i(D)^{λ_i}    (6)

where the φ_i are features defined on derivations and the λ_i are feature weights. One of the features is an n-gram language model P_LM(e); the remaining features are defined as products of functions on the rules used in a derivation:

  φ_i(D) = ∏_{(X→⟨γ,α⟩)∈D} φ_i(X→⟨γ,α⟩)    (7)

Thus we can rewrite P(D) as

  P(D) ∝ P_LM(e)^{λ_LM} × ∏_i ∏_{(X→⟨γ,α⟩)∈D} φ_i(X→⟨γ,α⟩)^{λ_i}    (8)

The factors other than the language-model factor can be put into a particularly convenient form. A weighted synchronous CFG is a synchronous CFG together with a function w that assigns weights to rules. If we define

  w(X→⟨γ,α⟩) = ∏_i φ_i(X→⟨γ,α⟩)^{λ_i}    (9)

then this function induces a weight function over derivations,

  w(D) = ∏_{(X→⟨γ,α⟩)∈D} w(X→⟨γ,α⟩)    (10)

and the probability model becomes

  P(D) ∝ P_LM(e)^{λ_LM} × w(D)    (11)

3) Training

To estimate the parameters of the phrase translation and lexical-weighting features, frequencies of the extracted rules are necessary. For each sentence pair in the training data, more than one derivation may use the rules extracted from it. Following Och and others, heuristics are used to hypothesize a distribution over possible rules as though they had been observed in the training data, a distribution that does not necessarily maximize the likelihood of the training data. Och's method [22] gives a count of one to each extracted phrase pair occurrence. Here, a count of one is given to each initial phrase pair occurrence, and its weight is then distributed equally among the rules obtained by subtracting sub-phrases from it. Treating this as the observed distribution, relative-frequency estimation is used to obtain φ(γ|α) and φ(α|γ).
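The weighted-derivation scoring of equations (6)-(11) can be sketched in log space: the score of a derivation is the weighted language-model log-probability plus, for every rule used, each feature's log value times its weight. The feature names and numbers below are illustrative assumptions:

```python
import math

def derivation_log_score(rules, lm_logprob, weights):
    """log P(D) up to a constant: lambda_LM * log P_LM(e)
    plus, over all rules r and features i, lambda_i * log phi_i(r).
    rules: list of dicts mapping feature name -> feature value."""
    score = weights["lm"] * lm_logprob
    for rule in rules:
        for name, value in rule.items():
            score += weights[name] * math.log(value)
    return score

# Illustrative feature weights and two rules with forward/backward
# translation probabilities (values are made up for the example).
weights = {"lm": 1.0, "p_fwd": 1.0, "p_bwd": 0.5}
rules = [
    {"p_fwd": 0.5, "p_bwd": 0.25},
    {"p_fwd": 0.8, "p_bwd": 0.5},
]
score = derivation_log_score(rules, math.log(0.1), weights)
```

Working in log space is the standard trick here: the products in (10) and (11) become sums, which avoids floating-point underflow on long derivations.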
Finally, the parameters of the log-linear model (6) are learned by minimum-error-rate training [33], which tries to set the parameters so as to maximize the BLEU score [34] of a development set. This gives a weighted synchronous CFG that is ready to be used by the decoder.

4) Decoding

We applied a CKY parser as the decoder and exploited beam search in the post-process for mapping source and target derivations. Given a source sentence f, the decoder finds the target yield of the single best derivation whose source yield is f:

  ê = e( argmax_{D : f(D) = f} w(D) )    (12)

Not only the best derivation for a source sentence is found, but also a list of the k-best derivations. These k-best derivations are utilized for minimum-error-rate training and for rescoring with the language model, and the search space is reduced by cube pruning [35].
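The CKY-style decoding above can be sketched in a drastically simplified form: only lexical rules and a monotone glue rule X → ⟨X1 X2, X1 X2⟩ are supported, with no language model, no k-best lists and no pruning. The rules and probabilities are illustrative:

```python
import math

def cky_decode(src, lex_rules):
    """Fill a chart over source spans; each cell keeps the best
    (log-probability, translation) pair. A span is either matched
    whole by a lexical rule or built by concatenating the two best
    adjacent sub-spans (a monotone glue rule)."""
    n = len(src)
    chart = {}
    for span in range(1, n + 1):
        for i in range(n - span + 1):
            j = i + span
            best = None
            # Lexical rules covering the whole span.
            for tgt, p in lex_rules.get(" ".join(src[i:j]), []):
                cand = (math.log(p), tgt)
                if best is None or cand[0] > best[0]:
                    best = cand
            # Glue rule: combine two adjacent sub-derivations.
            for k in range(i + 1, j):
                if (i, k) in chart and (k, j) in chart:
                    ls, lt = chart[(i, k)]
                    rs, rt = chart[(k, j)]
                    cand = (ls + rs, lt + " " + rt)
                    if best is None or cand[0] > best[0]:
                        best = cand
            if best is not None:
                chart[(i, j)] = best
    return chart.get((0, n))

lex = {"f1": [("e1", 0.6)], "f2": [("e2", 0.5)], "f1 f2": [("e12", 0.1)]}
print(cky_decode(["f1", "f2"], lex))  # (log(0.3), 'e1 e2')
```

Here the glue derivation (0.6 × 0.5 = 0.3) beats the whole-span lexical rule (0.1), so the chart cell for the full sentence keeps the composed translation; a real decoder additionally scores each combination with the language model and prunes cells with cube pruning.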
5) Example of the Hierarchical Translation Process

To explain the process of hierarchical translation, the translation steps are demonstrated using Thai and Chinese as an example. Figure 2 exemplifies a pair of Chinese and Thai sentences with the word alignment, for the reader's understanding.

Figure 2. A sentence example of the Chinese-Thai language pair with the alignment of words

From Figure 2, a synchronous CFG extracted from the parallel corpus is selected according to the given words. With the highest probability for each word, a list of rules is obtained as shown in Figure 3.

Figure 3. The hierarchical rules, numbered (1)-(10), extracted from our example sentence

In Figure 3, X on the left-hand side is a non-terminal and S is the start symbol. The right-hand side contains two sets of CFG symbols separated by a comma: the left set contains the terminals and non-terminals of the source side, which is Chinese in the example, while the other side contains the Thai set. These hierarchical rules are utilized in the derivation within the decoding process as demonstrated in Figure 4. Following the example, the derivation of the synchronous CFG shown in Figure 4 is processed in a top-down manner by expanding from the top non-terminal node through the immediate child non-terminal nodes until a terminal node is found at the leftmost node. A number above an arrow in Figure 4 refers to a rule number in Figure 3.

Figure 4. A derivational process of translating the sentence of Figure 2 with the rules of Figure 3

From the actual data, examples of the rule table of HPBT and the phrase table of PBT obtained by training on the Thai-Chinese parallel corpus are shown in Figure 5 and Figure 6, respectively. The difference between Figure 5 and Figure 6 is the [X] notation in Figure 5, which indicates a slot for another word or phrase to derive as a tree.

Figure 5. An example of the rule table from HPBT for Thai-to-Chinese

Figure 6. An example of the phrase table from PBT for Thai-to-Chinese

IV. EXPERIMENT

A. Data Preparation

To experiment with Thai-Chinese translation, a parallel corpus was gathered from two sources: BTEC (Basic Travel Expression Corpus) [36] and the HIT London Olympic Corpus [37]. The former consists of 26,544 and the latter of 62,733 English-Chinese sentence pairs. All English sentences in both corpora were carefully translated into Thai by professional linguists and translators. In total, we obtained a Thai-Chinese parallel corpus of 89,277 sentence pairs. In preprocessing, Chinese sentences were word-segmented using the Stanford Chinese Word Segmentation Tool [38], while Thai sentences were segmented with SWATH [39], since neither language has a reliable explicit word boundary. We manually selected 877 sentence pairs as a development set and randomly chose 1,000 sentence pairs as a test set. The remaining sentence pairs were used as the training data set.

B. Experiment Setting

This work aims to compare the quality of Thai-Chinese SMT between the phrase-based translation (PBT) approach and the hierarchical phrase-based translation (HPBT) approach. The language modeling tool SRILM [40] was exploited to generate 3-gram and 5-gram language models of Chinese and Thai. Moses [2] was chosen for phrase extraction, rule-table generation and decoding, and its minimum-error-rate training (MERT) function was applied for tuning the weights of both models. The outputs of Moses for HPBT and PBT are a rule table and a phrase table, respectively; examples of both tables are given in Figure 5 and Figure 6. The difference between the tables is that the HPBT rule table includes translations of terminal and non-terminal nodes to clarify the hierarchy, while the PBT phrase table lists translation pairs of phrases with word order.

C. Results and Discussion

We evaluate the systems in both directions, Chinese-to-Thai and Thai-to-Chinese. Table I shows the experimental results on translation accuracy in terms of BLEU score [34]. The evaluation involves 3-gram PBT, 5-gram PBT and 3-gram HPBT. From the BLEU results shown in Table I, the best result is obtained with 3-gram HPBT in every test.

TABLE I. THAI-CHINESE TRANSLATION EXPERIMENT RESULT (BLEU SCORE)
Source-to-target language | PBT 3-gram | PBT 5-gram | HPBT 3-gram
Chinese-to-Thai | | |
Thai-to-Chinese | | |

For Chinese-to-Thai translation, 3-gram HPBT overcomes 3-gram PBT by about 3.9 BLEU points, and 3-gram HPBT and 5-gram PBT are approximately equal.
In the case of Thai-to-Chinese, 3-gram PBT returns the lowest result, and 3-gram HPBT gains the best BLEU score, beating 3-gram PBT and 5-gram PBT by 2.2 and 1.3 BLEU points, respectively. From the viewpoint of the language base, Chinese-to-Thai translation is clearly better than Thai-to-Chinese translation. Since the Chinese-to-Thai results of the 3-gram HPBT model and the 5-gram PBT model differ only slightly, it is better to focus on the 3-gram HPBT model: with a lower n-gram order, the generated rules are much smaller and the required corpus does not have to cover the sparseness of the surrounding words.

V. CONCLUSION AND FUTURE WORK

In this work, we studied the application of a 3-gram PBT model, a 5-gram PBT model and a 3-gram HPBT model to translating Thai-to-Chinese and Chinese-to-Thai. By comparing the results, we found that 3-gram HPBT shows potential for translating in both directions, since its BLEU scores are the best on Thai-to-Chinese translation. In the case of Chinese-to-Thai, the 3-gram HPBT model and the 5-gram PBT model return approximately equal BLEU results, which are greater than the 3-gram PBT model by about 4 BLEU points. From the experiment, results on Chinese-to-Thai are clearly better than Thai-to-Chinese results. To improve this work, we plan to add some linguistic information to the training data to reduce the currently large number of synchronous CFG rules. Moreover, we plan to test the 3-gram HPBT model on different Thai sentence lengths to study the accuracy ratio by sentence length, since Thai sentences are naturally long. To cover all available SMT approaches, a tree-to-string model will be tested on Chinese-to-Thai. Lastly, the English-Thai language pair will be tested with HPBT.

ACKNOWLEDGEMENT

The authors would like to thank the Office of the Higher Education Commission, Thailand, for funding support under the program Strategic Scholarships for Frontier Research Network for the Ph.D. Program. Prasert Luekhong also thanks the Graduate School, Chiang Mai University, Thailand, and Rajamangala University of Technology Lanna, Thailand, for their funding. Prasert Luekhong is grateful to Dr.
Liu Qun for the opportunity to be a visiting researcher at the ICT Natural Language Processing Research Group, Chinese Academy of Sciences, Beijing, China.

REFERENCES

[1] P. Koehn, Statistical Machine Translation. Cambridge University Press, 2010.
[2] P. Koehn et al., "Moses: Open source toolkit for statistical machine translation," in Proceedings of the 45th Annual Meeting of the ACL, Interactive Poster and Demonstration Sessions, June 2007.
[3] D. Cer, M. Galley, and D. Jurafsky, "Phrasal: A toolkit for statistical machine translation with facilities for extraction and incorporation of arbitrary model features," in Proceedings of the NAACL Demonstration Session, June 2010, pp. 9-12.
[4] C. Dyer, J. Weese, H. Setiawan, and A. Lopez, "cdec: A decoder, alignment, and learning framework for finite-state and context-free translation models," in Proceedings of the ACL System Demonstrations, July 2010.
[5] L. Schwartz, W. Thornton, and J. Weese, "Joshua: An open source toolkit for parsing-based machine translation," Machine Translation, March.
[6] D. Vilar, D. Stein, and M. Huck, "Jane: Open source hierarchical translation, extended with reordering and lexicon models," in Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR (WMT 2010), July 2010.
[7] EuroMatrix. [Online]. [Accessed: 29-May-2012].
[8] M. Liberman and C. Cieri, "The creation, distribution and use of linguistic data: The case of the Linguistic Data Consortium," in Proceedings of the 1st International Conference on Language Resources and Evaluation (LREC).
[9] P. Koehn, "Europarl: A parallel corpus for statistical machine translation," in MT Summit, 2005, vol. 11.
[10] R. Steinberger, B. Pouliquen, and A. Widiger, "The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages," arXiv preprint, vol. 4, no. 1.
[11] M. V. Yazdchi and H. Faili, "Generating English-Persian parallel corpus using an automatic anchor finding sentence aligner," in Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on, 2010.
[12] P. Porkaew and T. Ruangrajitpakorn, "Translation of noun phrase from English to Thai using phrase-based SMT with CCG reordering rules," in Design.
[13] D. Chiang, "Hierarchical phrase-based translation," Computational Linguistics, vol. 33, no. 2, June 2007.
[14] M. Huck, M. Ratajczak, P. Lehnen, and H. Ney, "A comparison of various types of extended lexicon models for statistical machine translation," in Conf. of the Assoc. for Machine Translation in the Americas (AMTA), Denver, CO.
[15] P. F. Brown, V. J. Della Pietra, S. A. Della Pietra, and R. L. Mercer, "The mathematics of statistical machine translation: Parameter estimation," Computational Linguistics, vol. 19, no. 2.
[16] F. J. Och and H. Ney, "GIZA++: Training of statistical translation models," internal report, RWTH Aachen University.
[17] S. Vogel, H. Ney, and C. Tillmann, "HMM-based word alignment in statistical translation," in Proceedings of the 16th Conference on Computational Linguistics, vol. 2, p. 836.
[18] F. J. Och and H. Weber, "Improving statistical natural language translation with categories and rules," in Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics.
[19] F. J. Och, C. Tillmann, and H. Ney, "Improved alignment models for statistical machine translation," in Proc. of the Joint SIGDAT Conf. on Empirical Methods in Natural Language Processing and Very Large Corpora, 1999.
[20] F. J. Och, "Statistical machine translation: From single-word models to alignment templates," Ph.D. thesis.
[21] P. Koehn, F. J. Och, and D. Marcu, "Statistical phrase-based translation," in Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Volume 1, 2003.
[22] F. J. Och and H. Ney, "The alignment template approach to statistical machine translation," Computational Linguistics, vol. 30, no. 4, Dec. 2004.
[23] S. M. Shieber and Y. Schabes, "Generation and synchronous tree-adjoining grammars," Computational Intelligence, vol. 7, no. 4, Nov.
[24] D. Chiang and K. Knight, "An introduction to synchronous grammars," tutorial at ACL 2006, June, pp. 1-16.
[25] P. Blunsom, T. Cohn, and M. Osborne, "Bayesian synchronous grammar induction," in Advances in Neural Information Processing Systems, vol. 21, 2009.
[26] M. P. Marcus, M. A. Marcinkiewicz, and B. Santorini, "Building a large annotated corpus of English: The Penn Treebank," Computational Linguistics, vol. 19, no. 2.
[27] Y. Liu, Q. Liu, and S. Lin, "Tree-to-string alignment template for statistical machine translation," in Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, 2006.
[28] L. Huang and H. Mi, "Efficient incremental decoding for tree-to-string translation," in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010.
[29] R. S. Used, "The UOT System: Improve string-to-tree translation using Head-Driven Phrase Structure Grammar and predicate-argument structures," in mt-archive.info, 2009.
[30] P. Koehn, "Pharaoh: A beam search decoder for phrase-based statistical machine translation models," in Machine Translation: From Real Users to Research.
[31] P. M. Lewis II and R. E. Stearns, "Syntax-directed transduction," Journal of the ACM (JACM), vol. 15, no. 3.
[32] F. J. Och and H. Ney, "Discriminative training and maximum entropy models for statistical machine translation," in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, July 2002.
[33] F. J. Och, "Minimum error rate training in statistical machine translation," in Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Volume 1, 2003.
[34] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "BLEU: A method for automatic evaluation of machine translation," in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, July 2002.
[35] Y. Feng, H. Mi, Y. Liu, and Q. Liu, "An efficient shift-reduce decoding algorithm for phrase-based machine translation," in Proceedings of the 23rd International Conference on Computational Linguistics: Posters, 2010.
[36] BTEC Task, International Workshop on Spoken Language Translation. [Online]. [Accessed: 27-May-2012].
[37] M. Yang, H. Jiang, and T. Zhao, "Construct trilingual parallel corpus on demand," in Chinese Spoken Language Processing, 2006.
[38] P. Chang, M. Galley, and C. D. Manning, "Optimizing Chinese word segmentation for machine translation performance," in Proceedings of the Third Workshop on Statistical Machine Translation, 2008.
[39] P. Charoenpornsawat, SWATH: Smart Word Analysis for Thai.
[40] A. Stolcke, "SRILM - an extensible language modeling toolkit," in Seventh International Conference on Spoken Language Processing, 2002.