Explorations in vector space - the continuous-bag-of-words model from word2vec

Jesper Segeblad

January 2016

Contents

1 Introduction
    1.1 Purpose
2 The continuous bag of words model
    2.1 Measuring similarity
    2.2 Evaluating the models
    2.3 Extracting training samples
    2.4 Hyperparameters
3 Experiments
4 Results
5 Conclusions

1 Introduction

Representing words as vectors in a relatively low-dimensional space has in the last few years become increasingly popular in the natural language processing community. In this representation, we can think of words as points in a vector space where words that are semantically similar lie close to each other. This approach has several advantages over representing words as atomic symbols. Words do have relations to one another, and having a representation that can capture these relations can be very beneficial to many natural language applications. Traditionally, these models have used word co-occurrence matrices, counting how many times words co-occur in some text corpus. Other ways of creating these vectors have also been explored, such as trying to predict a word from the words that surround it. Although these models differ in many ways, they are all trained on raw, unannotated text, and they rest on the distributional hypothesis: the hypothesis that words which have similar meaning occur in similar contexts (Sahlgren, 2008). This is the assumption that drives these models and also makes them successful.

Word2vec is a highly popular software package that provides two algorithms for creating such word vectors, the skip-gram and continuous-bag-of-words models (Mikolov et al., 2013a). The main difference between the two algorithms is the objective: while the skip-gram model tries to predict the surrounding words given a target word, the continuous-bag-of-words model tries to predict the target word given the surrounding words. In a follow-up paper, the authors recommended using the skip-gram model with negative sampling (Mikolov et al., 2013b), and this is the model from word2vec that has gained the most attention. Less attention has been given to the continuous-bag-of-words model. This does not make it less interesting, however, and it is the model that will be studied in greater detail here.

1.1 Purpose

The purpose here is to get a deeper understanding of the continuous-bag-of-words model from word2vec. This covers both the theory and math behind the model and why it actually works, as well as how it might be implemented in practice.

2 The continuous bag of words model

The continuous bag of words model (CBOW) is inspired by neural network architectures, with an input layer, a hidden layer, and an output layer. Like a neural network, it is trained using gradient descent and backpropagation, a common technique for training neural networks (Russell and Norvig, 2009). It does however lack the non-linear activation function at the hidden layer traditionally used in neural networks, and instead passes the linear activation of the hidden layer (from now on called the projection layer, as in the original article by Mikolov et al. (2013a)) on to the output.

First of all, we need a vocabulary, which contains the words that we want to create word vectors for. The input layer has the dimension of the size of the vocabulary (denote this size by V). Between the input layer and the projection layer we have a matrix of dimensions V x N, where N is the desired dimensionality of the word vectors. This matrix will be called W_in. Each row in this matrix corresponds to some word in our vocabulary. Between the projection and output layer we have a matrix of size N x V, which we call W_out. With these matrices we have two vector representations of each word in the vocabulary, one coming from the rows of W_in and one coming from the columns of W_out. These will be referred to as the input vectors and output vectors respectively, following Rong (2014). When the model is initialized, all these word vectors have random values, and the objective is to find better values for them so that words that are semantically similar have similar vectors.

We can think of the input as a number of one-hot encoded column vectors, meaning that one entry in each vector is 1 while the rest are 0. How many of these vectors are taken as input is decided by the context size. The goal of the model is to predict the word at the output layer given the context words fed into the model at the input layer. This is done by modeling the conditional probability of the output word given the input words, p(word_out | words_in). The model takes two inputs at this stage, the context words and the target word. The context words are projected onto the projection layer by combining their respective input vectors, taken from the rows of the input weight matrix, or more formally, by multiplying them with the input weight matrix. But since they are one-hot encoded, this is equivalent to just copying the corresponding rows. They are combined by taking the average vector (denote this transformation by h):

h = \frac{1}{C} (v_{w_1} + v_{w_2} + \dots + v_{w_C})    (1)

Here, C is the number of words in the context and v_{w_i} is the input vector of the i:th word in the context. We thus end up with an average vector h of dimensionality N. With this vector we can compute a score for each output word (Rong, 2014) (denote this score u_j):

u_j = W_{out_j}^T h    (2)

W_{out_j} is the j:th column of the output matrix, i.e. the output vector of word j. It is transposed from a column vector to a row vector so that we can multiply it by the averaged vector h. With this score we can compute the conditional probability of word j being the actual output word using the softmax function:

P(word_{out} \mid words_{in}) = y_j = \frac{\exp(u_j)}{\sum_{k=1}^{V} \exp(W_{out_k}^T h)}    (3)

The score u_j is exponentiated and divided by the sum of the exponentiated scores of all words in the vocabulary, giving us a probability of this being the actual output word. The next step is to update the weights of the model based on the error of the prediction.
Since we expect only one output word, the objective is to maximize equation (3) for the actual output word. From this we get a corresponding loss function E = -\log p(word_{out} \mid words_{in}) (Rong, 2014). The error at the output layer can be computed as e_j = y_j - t_j, where y_j is the prediction of the model and t_j is the actual probability of this word being the output word: t_j is 1 if word j is the actual output word and 0 otherwise.
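To make the forward pass concrete, the following is a minimal NumPy sketch of equations (1) to (3) and the loss E. It is an illustration only; names such as W_in, W_out and context_ids are assumptions of this sketch and are not taken from the original word2vec code.

import numpy as np

def forward(W_in, W_out, context_ids, target_id):
    """Forward pass of CBOW for one training sample.

    W_in  : V x N matrix, rows are the input vectors.
    W_out : N x V matrix, columns are the output vectors.
    """
    h = W_in[context_ids].mean(axis=0)   # eq. (1): average the context input vectors
    u = W_out.T @ h                      # eq. (2): one score per vocabulary word
    u = u - u.max()                      # shift for numerical stability (does not change the softmax)
    y = np.exp(u) / np.exp(u).sum()      # eq. (3): softmax over the vocabulary
    loss = -np.log(y[target_id])         # E = -log p(word_out | words_in)
    return h, y, loss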

To update the weights between the projection and output layer, this error is multiplied with h, the averaged vector of the input words, and the learning rate (denoted \eta). This results in another vector, which is then subtracted from the output vector of the word:

W_{out_j}^T = W_{out_j}^T - \eta \, e_j \, h    (4)

When the output vectors have been updated, the error is propagated backwards to update the input vectors of the context words. In order to do this, the sum of all output vectors in the vocabulary, weighted by the prediction error, is calculated:

EH = \sum_{j=1}^{V} e_j W_{out_j}    (5)

This gives an N-dimensional vector EH, which is used to update the input vectors in the context:

W_{in_i} = W_{in_i} - \eta \, \frac{1}{C} EH    for i in the context    (6)

Here, each vector (i.e. row of the input matrix W_in) of the words in the context is updated by subtracting the averaged error multiplied by the learning rate (\eta).

The computations for updating the vectors are rather costly, since for each possible output word we have to compute its probability and compare it to the actual probability (0 or 1). The number of output probabilities is the same as the number of words in the vocabulary, meaning that the execution time becomes quite substantial if we have a large vocabulary. With a vocabulary of one million words, a total of one million checks would have to be made for a single training example. There are however some optimization tricks that can be applied, such as hierarchical softmax and negative sampling, described in Mikolov et al. (2013b) (though in the context of the skip-gram model). Hierarchical softmax makes use of a binary tree that represents the output layer. Negative sampling recasts the task from a multiclass classification problem into a binary one: given a training sample, the task is to predict whether the word and its context come from the real training data or were sampled from a noise distribution.

While the math behind this can seem rather complex and involved, the intuitive understanding of the model is somewhat simpler and builds on the distributional hypothesis mentioned previously. The objective of the model is to maximize the probability of the output target word given the context words, and the weights of the model (the word vectors) are moved in order to maximize this probability. If the probability of a word is overestimated, the input vectors will move further away from its output vector, and if the probability is underestimated, the input vectors will move closer to the output vector (Rong, 2014).

2.1 Measuring similarity

With the resulting vectors, the distance between the words in the vocabulary can be measured. If we think of the words as points in a two-dimensional space, words that are close to each other should have a similar meaning (given that our model actually works and that our assumption that word similarity is reflected in distributional properties holds). Mikolov et al. (2013a) use cosine similarity to measure the similarity between word vectors. Geometrically, cosine similarity is the cosine of the angle between two vectors, and it is defined as the dot product divided by the product of their respective lengths (Jurafsky and Martin, 2009):

cos\_sim(v, u) = \frac{v \cdot u}{\|v\| \, \|u\|}    (7)
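Continuing the sketch from above, the update equations (4) to (6) and the cosine similarity of equation (7) might look as follows in NumPy. This is again an illustration under the same assumptions about W_in (V x N) and W_out (N x V), not the actual word2vec code.

import numpy as np

def update(W_in, W_out, h, y, context_ids, target_id, lr=0.025):
    """One gradient step for a single training sample (equations (4)-(6))."""
    e = y.copy()
    e[target_id] -= 1.0                                # e_j = y_j - t_j
    EH = W_out @ e                                     # eq. (5): error-weighted sum of output vectors
    W_out -= lr * np.outer(h, e)                       # eq. (4): update every output vector
    W_in[context_ids] -= lr * EH / len(context_ids)    # eq. (6): update the context input vectors
                                                       # (duplicate context words are only updated once here)

def cos_sim(v, u):
    """Equation (7): cosine similarity between two word vectors."""
    return np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u))

Repeatedly applying the forward pass and this update over all word-context pairs is what constitutes training in this formulation.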

2.2 Evaluating the models

These types of vector representations are often evaluated on various word similarity tasks. In Mikolov et al. (2013a), the models are evaluated on two types of similarity tasks, semantic and syntactic, and on one sentence completion task. The similarity tasks are questions of the type "a is to b as c is to d", where the task is to fill in d. This is accomplished with vector addition and subtraction: X = vec(b) - vec(a) + vec(c). With this resulting vector X, a search over all the vectors in the vocabulary can be made to find the vector that is closest to it. The sentence completion task consists of trying to complete a sentence that has one missing word. A list is given with five words that can all be said to be reasonable choices (Mikolov et al., 2013a).

2.3 Extracting training samples

Since these models are trained on unannotated text, obtaining large amounts of training data is no problem. From this text, training samples can be extracted. These training samples are words together with the context they appear in (from here on called word-context pairs). The context is defined as a symmetric window around the focus word. One can view this as sliding a window over the text collection and looking at a word and the n words around it (a small sketch of this windowing is given after section 2.4). If, for example, a window size of 2 is chosen, the two preceding and the two following words are used as the context. Exactly how many words should be used as context is not well established, and different window sizes capture different types of semantic similarity (Goldberg, 2015). This might be something to choose depending on the task that the resulting vectors are to be used for. Goldberg (2015) also describes some types of preprocessing that can be used before extracting the word-context pairs, which can include removing words that appear too frequently or too infrequently, and removing sentences that are either too long or too short.

2.4 Hyperparameters

The continuous-bag-of-words model relies on a few hyperparameters. They won't be covered in much detail, but the most important ones, which are used in the implementation further on, are described here. First, we have to decide on the dimensionality of the resulting word vectors (N from the previous description). The results presented in Mikolov et al. (2013a) indicate that a higher dimensionality might give better word representations, since vectors of a higher dimensionality performed better on the tasks they were evaluated on. Another parameter is how the vectors are initialized. Since they are initialized to random values, one has to decide how these random values are chosen. The approach used in word2vec is to use uniformly sampled values between -1/(2N) and 1/(2N) (Goldberg, 2015). Yet another hyperparameter is the learning rate. Mikolov et al. (2013a) used an initial learning rate of 0.025 and decreased it during training so that it approached zero towards the end of training.
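The word-context pair extraction of section 2.3 can be sketched roughly as follows. The function name extract_pairs and the truncation of the window at sentence boundaries are assumptions of this sketch (the boundary handling matches what is described for the implementation in section 3).

def extract_pairs(sentence, window=2):
    """Turn a tokenized sentence into (context words, target word) training pairs.

    A symmetric window is used; at the beginning and end of the sentence the
    window is simply truncated, so only the available neighbours are used.
    """
    pairs = []
    for i, target in enumerate(sentence):
        left = sentence[max(0, i - window):i]
        right = sentence[i + 1:i + 1 + window]
        context = left + right
        if context:                      # skip one-word sentences
            pairs.append((context, target))
    return pairs

# Example with a window of 2: the two preceding and the two following words.
print(extract_pairs("drink apple juice there".split(), window=2))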

3 Experiments

To get a clearer understanding of the inner workings of the model, a small implementation was made in Python together with the NumPy library (Van Der Walt et al., 2011). This implementation is rather simple and not very efficient, because of the inefficient weight updating, and none of the optimization tricks were implemented.

The model as implemented is initialized together with a vocabulary. This vocabulary is a Python dictionary with the words that we want to create word vectors for. Each word also has a corresponding index number, which is how we know which vector is assigned to which word. The weight matrices are initialized randomly with values between -1/(2N) and 1/(2N) drawn from a uniform distribution, as in the original implementation. The learning rate was set to a default value of 0.1, and it is not adjusted during training of the model. The default size of the projection layer is set to 100.

As the original intention was to train the model on the first tokens from Swedish Wikipedia, some methods to preprocess that corpus were implemented. This preprocessing removes all words appearing fewer than 10 times in the corpus, so that they appear neither as target nor as context words. This was done before extracting the word-context pairs used as training samples; in practice this means that the context window is expanded. The context window size was set to the two previous and the two following words. For words at the beginning of a sentence that have no preceding words, only the following words are used as context, and the same holds for words at the end of a sentence. Following Levy et al. (2015), sentences shorter than 5 words were also removed. This resulted in around 9000 unique words and sentences for training. Even with that relatively small amount of words, training was too slow to complete over the acquired words from Swedish Wikipedia. These limitations make it hard to evaluate how well the implemented model actually performs in the way originally intended, although the code necessary for doing so was implemented.

Instead, a small testing corpus was built to see how well this implementation performed and whether the resulting vector similarities make sense. This small testing corpus can be found below. These experiments were done with a model having the default parameters described above. The window size was not set to 2+2, however; instead a window size of 1+1 was used, i.e. one word to the left and one word to the right.

drink apple juice there.
drink orange juice there.
drink tomato juice there.
drink beer there.
drink milk there.
they eat beans here.
they eat meat here.
they eat pasta here.
they eat pork here.
they eat rice here.
we play football outside.
we play ice hockey outside.
we play soccer outside.
we play golf outside.

As can be seen, this testing corpus is quite small. The small size is made up for by doing multiple passes over the training data.
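For reference, the vocabulary dictionary and the weight initialization described above might be sketched like this. The minimum frequency of 10 and the uniform range between -1/(2N) and 1/(2N) follow the text; the function and variable names are illustrative assumptions, not the report's actual code.

import numpy as np

def build_vocab(sentences, min_count=10):
    """Map each sufficiently frequent word to an index number."""
    counts = {}
    for sentence in sentences:
        for word in sentence.split():
            counts[word] = counts.get(word, 0) + 1
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    return {word: index for index, word in enumerate(kept)}

def init_weights(vocab_size, dim=100, seed=1):
    """Uniform initialization in [-1/(2N), 1/(2N)] for both weight matrices."""
    rng = np.random.default_rng(seed)
    bound = 1.0 / (2 * dim)
    W_in = rng.uniform(-bound, bound, size=(vocab_size, dim))    # V x N, rows = input vectors
    W_out = rng.uniform(-bound, bound, size=(dim, vocab_size))   # N x V, columns = output vectors
    return W_in, W_out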

4 Results

Some results can be found in the table below. These results were retrieved using a vector dimensionality of 100 and doing 200 passes over the training data. This high number of passes was necessary because of the small size of the training data. Although these results might not be the most interesting, they do reflect some regularities in the training data. If we for example look at the first three entries in the table, we see that the closest words are words that are separated by another one. In other words, they are words that appear in the same contexts.

Word     Closest word     Cosine similarity
play     outside
drink    juice
eat      here
juice    drink
apple    there
beans    they

To see the effect of a smaller vector dimensionality, another test was done with this parameter set to 50 instead of the default 100. These results can be seen below. The only thing that has changed is the cosine similarities (and not even by that much); the closest words are still the same.

Word     Closest word     Cosine similarity
play     outside
drink    juice
eat      here
juice    drink
apple    there
beans    they

200 passes over the training data are quite many. To see the effects of a smaller number of passes, this number was reduced to 50. The results are presented below.

Word     Closest word     Cosine similarity
play     outside
drink    juice
eat      here
juice    drink
apple    pork
beans    they

The cosine similarities are significantly lower. The most interesting thing, however, is that the nearest word to apple is now pork, and they are not that close. This is quite surprising, since they do not share any context in the training data. It might be attributed to the small number of actual training instances and the smaller number of passes over the data.
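The closest-word columns in the tables above can be produced with a simple nearest-neighbour search over the input vectors using cosine similarity. The following is a possible sketch, not the exact code used for these experiments; the name nearest_word and the choice of searching only the input vectors are assumptions.

import numpy as np

def nearest_word(word, vocab, W_in):
    """Return the vocabulary word whose input vector is closest to that of `word`."""
    idx = vocab[word]
    v = W_in[idx]
    norms = np.linalg.norm(W_in, axis=1) * np.linalg.norm(v)
    sims = (W_in @ v) / (norms + 1e-12)   # cosine similarity to every word
    sims[idx] = -np.inf                   # exclude the query word itself
    best = int(np.argmax(sims))
    index_to_word = {i: w for w, i in vocab.items()}
    return index_to_word[best], float(sims[best])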

5 Conclusions

Vector representations of words are very popular among natural language processing researchers and practitioners today. They perform very well on a number of tasks and can capture many types of word similarities. Here, the continuous-bag-of-words model from the popular word2vec package has been implemented. Even though the model could not be tested as originally intended, the tests that were made did show that the resulting word vectors can reflect regularities in the training data. While the provided implementation does not scale to large vocabularies, it does provide a starting point for exploring the model further. One such exploration might be to look at optimizations of the weight updating.

References

Yoav Goldberg. A primer on neural network models for natural language processing. arXiv preprint, 2015.

Daniel Jurafsky and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Pearson Education, 2nd edition, 2009.

Omer Levy, Yoav Goldberg, and Ido Dagan. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3, 2015.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In Proceedings of Workshop at ICLR, 2013a.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, 2013b.

Xin Rong. word2vec parameter learning explained. arXiv preprint, 2014.

Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson Education, 3rd edition, 2009.

Magnus Sahlgren. The distributional hypothesis. Italian Journal of Linguistics, 20(1):33-53, 2008.

Stefan Van Der Walt, S. Chris Colbert, and Gael Varoquaux. The NumPy array: a structure for efficient numerical computation. Computing in Science & Engineering, 13(2):22-30, 2011.
