Learning Ranking vs. Modeling Relevance
Dmitri Roussinov
Department of Information Systems, W. P. Carey School of Business, Arizona State University

Weiguo Fan
Accounting and Information Systems Department, Virginia Tech
wfan@vt.edu

Abstract

The classical (ad hoc) document retrieval problem has traditionally been approached through ranking according to heuristically developed functions (such as tf.idf or BM25) or through generative language modeling, which requires explicit assumptions about term distributions. The now-popular discriminative approaches (classification, machine learning, statistical forecasting, etc.) have been largely abandoned for this task, in spite of their success in the different task of text categorization. In this paper, we study whether a classifier can be trained solely on labeled examples to successfully generalize to new (unseen by the system) queries and provide performance comparable with popular heuristic or language models. Our SVM-based classifier learns from the relevance judgments available with the standard test collections and generalizes to new, previously unseen queries its ability to compare and rank documents with respect to a given query. To accomplish this, we designed a representation scheme based on the discretized form of high-level statistics of query term occurrences (such as tf, df, and document length) rather than on individual terms. Using the standard metric of average precision and the standard large and small test collections, we confirmed that our machine learning approach can achieve performance comparable with, and better than, that of the current state-of-the-art models.

1. Introduction And Prior Work

In spite of several decades of research, modern document retrieval technology still has to overcome the burden of information overload.
Although many improvements have been successfully tried to improve document ranking with respect to a user's request, they all require manual parameter tuning in order to be helpful rather than detrimental within a particular application. The variety of techniques only disorients the practitioners and designers of document management systems, suggesting no clear winner and no methodology for choosing the applicable improvement techniques and their parameters. On the other hand, many practitioners are well familiar with the machine learning (classification) paradigm, where training sets are (typically manually) developed, and the appropriate technological solutions are selected and their parameters tuned on those sets. Although used with striking success for text categorization, classification-based approaches (e.g. those based on support vector machines [9]) have been largely abandoned when trying to improve ad hoc retrieval in favor of empirical (e.g. vector space [15]) or generative (e.g. language [19]) models, which produce a ranking function that gives each document a score, rather than trying to learn a classifier that would help to discriminate between relevant and irrelevant documents and order them accordingly. A generative model needs to assume that the query and document words are sampled from the same underlying distributions and that the distributions have certain forms, which entail specific smoothing techniques (e.g. the popular Dirichlet prior). A discriminative (classifier-based) model, on the other hand, does not need to make any assumptions about the forms of the underlying distributions or the criteria for relevance; instead, it learns to predict to which class a certain pattern (document) belongs, based on the labeled training examples.
Another important advantage of a discriminative approach for the information retrieval task is its ability to explicitly utilize the relevance judgments that exist for standard test collections in order to train the IR algorithms and possibly enhance retrieval accuracy for new (unseen) queries. Our work is motivated by the objective of bringing the numerous achievements in the domains of machine learning and classification closer to the classical task of ad hoc information retrieval (IR), which is ordering documents by the estimated degree of relevance to a given query. Our classifier learns how to compare every pair of documents with respect to a given query, based on the relevance-indicating features that the documents may have. As is commonly done in information retrieval, the features are derived from the word overlap between the query and documents. The earliest formulation of the classic IR problem as a classification (discrimination) problem was suggested by Robertson and Sparck Jones [13]; however, it performed well only when relevance judgments were available for the same query and did not generalize well to new queries. Fuhr and Buckley [5] used polynomial regression to estimate the coefficients in a linear ranking function combining such well-known features as weighted term frequency, document length and query length. They tested their description-oriented approach on the standard small-scale collections (Cranfield, NPL, INSPEC, CISI, CACM), achieving a relative change in average precision ranging from -17% to +33% depending on the collection tested and the implementation parameters. Gey [6] applied logistic regression in a similar setting with the following results: Cranfield +12%, CACM
+7.9%, CISI -4.4%; however, he did not test them on new (unseen by the algorithm) queries, hypothesizing that splitting documents into training and testing collections would not be possible, since a large number of queries is necessary to train a decent logistic regression approach to document retrieval. Instead, he applied a regression trained on Cranfield to the CISI collection, but with a negative effect. Recently, approaches based on learning have reported several important breakthroughs. Fan et al. [4] applied genetic programming to learn how to combine various terms into an optimal ranking function that outperformed the popular Okapi formula on the robust retrieval test collection. Nallapati [12] made a strong argument in favor of discriminative models and trained an SVM-based classifier to combine 6 different components (terms) from the popular ranking functions (such as tf.idf and language models), achieving better-than-language-model performance in 2 out of 16 test cases (figure in [12]), performance not statistically distinguishable in 8 cases, and only 80% of the best performance in 6 cases. Greiff [7] derived the optimal shape of global weighting on a set of TREC collections, resulting in 8-86% improvement over the INQUERY ranking formula. There have also been studies using past relevance judgments to optimize retrieval. For example, Joachims [10] applied Support Vector Machines to learn a linear ranking function from user click-throughs while interfacing with a search engine. We would like to emphasize that the task considered here is fundamentally different from routing, filtering, text categorization or any framework based on user relevance feedback: we are optimizing retrieval for new, previously unseen queries for which no relevance judgments are assumed to be available.
In this study, we present a representation scheme based on the discretization of the global (corpus-statistics) and local (document-statistics) weighting of term overlaps between queries and documents. The major difference of our work from Fan et al. [4], Nallapati [12] or works on fusion (e.g. [18]) is that we do not try to combine several known ranking functions (or their separate terms) into one, but rather learn the weighting functions directly through discretization. Shorter versions of this paper with a slightly different focus were presented earlier. Discretization allows representing a continuous function by a set of values at certain points. These values are learned by a machine learning technique to optimize certain criteria, e.g. average precision. Thus, we believe our approach offers a significant advantage, since it does not limit the shapes of the learned ranking functions to a certain class of functions suggested after heuristic explorations of language modeling done in prior research. We have also empirically established that our combination of representation scheme, learning mechanism and sampling allows learning from past relevance judgments in order to successfully generalize to new (unseen) queries. When the representation was created without any knowledge of the top ranking functions and their parameters, our approach reached the known top performance solely through the learning process. When our representation took advantage of functions that are known to perform well and their parameters, the resulting combination was able to slightly exceed the top performance on large test collections and considerably exceed it on small-scale standard test collections. The next section formalizes our approach, followed by empirical results and conclusions.
2. Formalization Of Our Approach

Our approach to ad hoc document retrieval learns how important each type of occurrence of a query term in a document is. For example, in a very primitive way (for illustration only), we can define two document features: feature S ("strong"), indicating multiple occurrences of a rare query term (e.g. "discretization") in a document, and feature W ("weak"), indicating a single occurrence of a frequent term (e.g. "information"). The particular terms ("discretization" and "information") are not used directly in the representation, so all multiple occurrences of rare terms and single occurrences of frequent terms are treated the same way. Then, a machine learning technique should discover that feature S is a much stronger indicator of relevance than feature W. In the implementation presented in this paper, each occurrence of a query term t in a document d is assigned to a bin (specified by an integer number within a limited range) based on the term's document frequency in the collection (df) and the number of the term's occurrences within the document (tf). By learning the discrimination properties of each feature (bin), rather than of separate terms, our method allows generalization to new queries. Thus, the ranking functions studied in this paper are limited to the so-called lw.gw class:

R(q, d) = Σ_{t ∈ q} L(tf(t, d), d) · G(t)

Here L(tf, d), the local weighting, is a function of the number of occurrences of the term in the document, tf(t, d), possibly combined with other statistics of document d, e.g. document length in words. G(t), the global weighting, can be any collection-level statistic of the term (e.g. df, the document frequency). It can be easily verified that this class of ranking functions is very general and includes all the well-known successful ranking functions, such as variations of tf.idf and
BM25 (Okapi). For example, in the classical tf.idf formula, L(tf, d) = tf / |d|, where tf is the number of occurrences of the term t in the document and |d| is the length of the document vector, and G(t) = idf(df(t)) = log(N / df(t)), where df(t) is the total number of documents in the collection that contain term t and N is the total number of documents. The lw.gw representation of BM25 is discussed below in detail. It can also be shown that many of the recently introduced language models fall into this category as well; specifically, the approaches best performing in TREC ad hoc tests (Dirichlet smoothing, Jelinek-Mercer smoothing, and Absolute Discounting) can be represented this way (see equation 6 and table I in [19]). It has been known for a long time that the shapes of the global and local weighting functions can dramatically affect retrieval accuracy on standard test collections. However, we are not aware of any attempts to learn those shapes directly from labeled examples, which is what we do in this study. Thus, our central research question was the following: can the optimal (best performing) shapes of the global and local functions be learned purely from labeled examples, without heuristic experimentation or elaborate analytical modeling and assumptions about term distributions? Each occurrence of a query word in a document is assigned to a bin. Each bin is specified by two numbers: g (for global) in the range [1, B] and l (for local) in the range [1, L], as follows:

g(t) = ⌊ B · (1 - log(df(t)) / log(N)) ⌋   (1)

l(tf(t, d), d) = min(tf(t, d), L)   (1a)

where N is the total number of documents and ⌊·⌋ stands for rounding down to the nearest integer. The logarithmic scale allows a more even term distribution among bins than a simple linear assignment, which is desirable for more efficient learning. It is motivated by a typical histogram of the df(t) distribution, which looks much more uniform on a logarithmic scale.
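Formulas (1) and (1a) are straightforward to implement. The sketch below is a minimal Python rendering; the clamping of g into a valid index range and the zero-based bin numbering are our assumptions, since the paper states the range as [1, B] but numbers bins from 0 in its worked example (Table 1).

```python
import math

def g_bin(df_t, N, B):
    """Global bin, formula (1): floor(B * (1 - log(df(t)) / log(N))).
    Frequent terms (df close to N) fall into low bins; rare terms into
    high bins. Clamping to [0, B-1] is an assumption for safe indexing."""
    g = math.floor(B * (1.0 - math.log(df_t) / math.log(N)))
    return min(max(g, 0), B - 1)

def l_bin(tf, L):
    """Local bin, formula (1a): the within-document term frequency,
    capped at L so that very high tf is treated the same as tf = L."""
    return min(tf, L)
```

For instance, with B = 16 and a collection of about 741,000 documents, a stop-word-like term with df near N lands in bin 0, while a term occurring in a single document lands in the top bin.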
It is important to note that this logarithmic scale has nothing to do with the log function in the classical idf weighting. Formula (1) does not produce any weights but only assigns each term occurrence to a specific bin based on the term's document frequency. The weights are trained later and can effectively define any shape of global weighting, including those tried in prior heuristic explorations: log, square root, reciprocals and other functions. Note that in our case the l(tf, d) formula does not really need rounding to an integer, since tf is already a positive integer. However, in a more general case, tf can be normalized by document length (as is done in BM25 and language models), and local weighting would thus become a continuous function. It is important to note that our discrete representation does not ignore occurrences above L but simply treats them the same way as tf = L. The intuition behind this capping is that increasing tf above a certain value would not typically indicate higher relevance of the document. Each occurrence of a query term in a document corresponds to a bin (g, l). Each (g, l) combination determines a feature in a vector representing a document-query pair f(d, q) and is denoted below as f(d, q)[g, l]. The dimensionality of the feature space is L x B; e.g. for 8 local weighting bins and 10 global weighting bins we would deal with a vector size of 80. A feature vector f(d, q) represents each document d with respect to query q. Since the query term occurrences assigned to the same bins are treated the same way, the value of each feature in the vector is just the number of term occurrences assigned to each bin (g, l):

f(d, q)[g, l] = Σ_{t ∈ q: g(t) = g, l(tf(t, d), d) = l} 1   (2)

Now, for the document ranking function, we can simply use the dot product between the feature vector and the vector of learned optimal weights w:

R(q, d) = w · f(d, q)

Ideally, the learning mechanism should assign higher weights to the more important bins (e.g.
multiple occurrences of a rare term) and low weights to the less important bins (e.g. a single occurrence of a common term). The exact learned values determine the optimal shapes of global and local weighting. Table 1 shows an example of the bin assignments and the resulting feature vector for a specific document and the query "the anti missile defense system of star wars". The bins with the lowest numbers (0-7) correspond to the terms with large document frequencies ("the", "of"). Since these happen to occur in almost all documents, their weights will be learned to be very small (non-discriminative). They could alternatively be removed by a stop word list. The other, less frequent words occupy larger bins (g > 0), with l mostly equal to 7 (corresponding to tf = 8, since tf is capped at 8), except "star" and "wars", which have l = 5 (corresponding to tf = 6). The feature vector representing the query/document pair has only the following non-zero coordinates (bins): 7, 15, 22, 23 and 30, with occurrences within the same bin (e.g. bins 7, 15 and 23) aggregated.
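Formula (2) and the dot-product ranking can be sketched as follows. The flattening of the (g, l) grid into a single coordinate as g * L + (l - 1) is our assumption, since the paper does not spell out the exact indexing:

```python
import math
from collections import Counter

def feature_vector(query_terms, doc_tf, df, N, B=16, L=8):
    """Formula (2): the feature for bin (g, l) is the number of query
    terms whose (df, tf) statistics fall into that bin.
    Returns a sparse {bin: count} map."""
    f = Counter()
    for t in set(query_terms):
        tf = doc_tf.get(t, 0)
        if tf == 0:
            continue  # terms absent from the document contribute nothing
        g = min(max(math.floor(B * (1.0 - math.log(df[t]) / math.log(N))), 0), B - 1)
        l = min(tf, L)                 # formula (1a)
        f[g * L + (l - 1)] += 1        # each matching term occurrence type adds 1
    return dict(f)

def rank_score(f, w):
    """The ranking function R(q, d) = w * f(d, q): a dot product between
    the sparse feature vector and the learned bin weights."""
    return sum(w.get(b, 0.0) * v for b, v in f.items())
```

Two query terms with identical (df, tf) statistics land in the same bin and are aggregated into a single feature count, which is exactly what lets the learned weights generalize to queries containing entirely different words.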
Table 1. An example of term frequencies, the resulting bin assignments and the document/query feature vector. Qid = 0, docid = , doclen = , avgdoclen = .

  term     TF  DF    g  l  bin
  the      42        0  7   7
  of       11        0  7   7
  system   20        1  7  15
  defense   8        1  7  15
  anti     10  5126  2  7  23
  star      6  4536  2  5  22
  missile  15  1237  2  7  23
  wars      6   657  3  5  30

  document/query feature vector: 7:2, 15:2, 22:1, 23:2, 30:1

We can make the representation still more powerful by considering the learned weights w[g, l] not as replacements but rather as adjustments to some other, heuristically chosen, global G(t) and local L(t, d) weighting functions (e.g. BM25):

f(d, q)[g, l] = Σ_{t ∈ q: g(t) = g, l(tf(t, d), d) = l} L(t, d) · G(t)   (2a)

We define the specific choice of global G() and local L() weighting functions as the starting ranking function (SRF). When all the bin weights w[g, l] are set to 1, our ranking function is the same as its SRF. The learning process finds the optimal values of w[g, l] for the collection of training queries and their relevance judgments, thus adjusting the shapes of the global and local weighting to achieve better accuracy. The SRF can be chosen from among the ranking functions known to perform well (e.g. tf.idf, BM25 or one based on language models), taking advantage of the fact that those formulas and their optimal parameters on the standard test collections are known to researchers. Alternatively, we can set the SRF to a constant value (e.g. 1, as in formula 2), thus not taking advantage of any of the prior empirical investigations, to see if our framework is able to learn reasonable (or even top-notch) performance purely from labeled examples. Below, we describe our experiments with each approach.
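Formula (2a) changes only the value accumulated per bin: instead of a count of 1, each occurrence contributes its starting-ranking-function weight L(t, d) * G(t), so the learned w[g, l] become multiplicative adjustments. A sketch, using an idf-style G and raw tf as L; these are placeholder SRF components for illustration, not the paper's tuned BM25 factors:

```python
import math

def srf_feature_vector(query_terms, doc_tf, df, N, B=16, L=8):
    """Formula (2a): each bin accumulates L(t, d) * G(t) rather than a
    raw count. With all bin weights set to 1, the dot-product score then
    reproduces the SRF exactly, as stated in the text."""
    G = lambda t: math.log(N / df[t])   # placeholder global weighting (idf-style)
    Lw = lambda tf: float(tf)           # placeholder local weighting (raw tf)
    f = {}
    for t in set(query_terms):
        tf = doc_tf.get(t, 0)
        if tf == 0:
            continue
        g = min(max(math.floor(B * (1.0 - math.log(df[t]) / math.log(N))), 0), B - 1)
        l = min(tf, L)
        key = g * L + (l - 1)           # same bin-flattening assumption as before
        f[key] = f.get(key, 0.0) + Lw(tf) * G(t)
    return f
```

Summing the feature values with unit weights recovers the SRF score, which confirms the "all weights set to 1" property.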
Since the score is linear with respect to the feature values, we can train the weights w as a linear classifier that predicts the preference relation between pairs of documents with respect to a given query: document d1 is more likely to be relevant (has a higher score) than document d2 iff f(d1, q) · w > f(d2, q) · w, and vice versa. An important advantage of using a linear classifier is that rank ordering of documents according to the learned pairwise preferences can be performed simply by ordering according to their linear score f(d, q) · w. Please refer to [2] for the ordering algorithms in the more general non-linear case. We chose support vector machines (SVM) for training the classifier weights w[g, l], since they are known to work well with large numbers of features, ranging in our experiments from 8 to 512, depending on the number of bins. For our empirical tests, we used the SVMLight package, freely available for academic research from Joachims [9]. We preserved the default parameters coming with version V.

3. Comparison Tests

3.1 Experiments with large collections

We used the TREC Disks 1 and 2 collections to test our framework, with one set of topics for training and another for testing, and vice versa. For indexing, we used the Lemur package [11] with the default set of parameters, and no stop word removal or stemming. Although those procedures are generally beneficial for accuracy, it is also known that they do not significantly interfere with testing various ranking functions, and thus they are omitted in many studies to allow easier replication. We used only topic titles for queries, to simulate the short queries typically run by online surfers or company employees trying to locate a document. We used the most popular metric, average (non-interpolated) precision, as our performance measure. The characteristics of the collection after indexing are shown in Table 2.
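The pairwise preference training described above, learning w from pairs of documents ordered by relevance, can be sketched without the SVMLight dependency. A perceptron-style update on difference vectors is a minimal stand-in for the SVM optimization (the margin and regularization of a real SVM are omitted):

```python
def train_pairwise(preference_pairs, dim, epochs=100, lr=0.1):
    """Learn a linear weight vector w from pairwise preferences.
    Each pair (f_rel, f_nonrel) encodes that the relevant document's
    feature vector should receive a higher score than the non-relevant
    one's. A perceptron-style stand-in for the SVM training the paper
    delegates to SVMLight."""
    w = [0.0] * dim
    for _ in range(epochs):
        for f_rel, f_nonrel in preference_pairs:
            diff = [a - b for a, b in zip(f_rel, f_nonrel)]
            # update only when the preference is violated (or tied)
            if sum(wi * di for wi, di in zip(w, diff)) <= 0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

def linear_score(f, w):
    """Documents are then rank-ordered by this linear score f(d, q) * w."""
    return sum(fi * wi for fi, wi in zip(f, w))
```

Because the model is linear, the learned pairwise preferences collapse into a single per-document score, so ranking a result list is a plain sort rather than a tournament over all pairs.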
We also reproduced results similar to those reported below on the Disk 3 collection and the corresponding topics, but did not include them in this paper due to size limitations.

Table 2. The characteristics of the test collection: TREC Disks 1 and 2.

  Collection                TREC Disks 1 and 2
  Number of documents       741,…
  Number of terms           …
  Number of unique terms    …
  Average document length   …
  Topics                    …
Table 3. Learning without any knowledge of ranking functions. 16 x 8 bin design.

  Testing:   Original   Learned   Baseline
  Training:  Original   Learned   Baseline

The choice of the baseline is very important for the validity of the findings. We used the results reported in [12] as guidance. According to [12], the best performing language model on this test collection was the one based on Dirichlet smoothing, which we informally verified by varying the parameters available in Lemur. We found the optimal parameter μ = 1900 to be the same as the one reported in [12], but the average precision (0.205) lower than the value reported there. The difference may be attributed to different indexing parameters: we did not use stemming or a stop word list. By experimenting with the other ranking functions and their parameters, we noticed that the implementation of BM25 available in Lemur provided almost identical performance (0.204). Its ranking function is

BM25(tf, df) = tf / (tf + K · (1 - b + b · |d| / |d|_a)) · log(N / (df + 0.5))

where |d| is the document length in words and |d|_a is its average across all documents. The optimal parameter values were close to the defaults K = 1.0 and b = 0.5. We noticed that the query term frequency component could be ignored without any noticeable loss of precision. This may be because the TREC topic titles are short and words are very rarely repeated in the queries. Since the difference between this ranking function and the best of the available language models was negligible, we selected the former as both our baseline and the starting ranking function (SRF) in our experiments. For simplicity, we call it simply BM25 throughout the paper. First, we were curious to see if our framework could learn reasonable performance without taking advantage of our knowledge of the top ranking functions and their parameters.
For this, we set our starting ranking function (SRF) to a constant value (1.0), thus using only the minimum of the empirical knowledge and theoretical models developed by information retrieval researchers over several decades: specifically, only the fact that relevance can be predicted from tf and df. Table 3 shows performance for the 16 x 8 combination of bins. It can be seen that our approach came close to the top performance solely through the learning process. The "original" performance is the one obtained by setting all the classifier weights to 1. When the same set was used for training and testing, the result obviously overestimates the learning capability of the framework. However, it also gives an upper bound on the performance of a discretized gw.lw combination. Since we already informally demonstrated that our discrete representation is almost identical in performance to the smooth one, we have an estimate of the upper bound of performance for the entire family of gw.lw ranking functions, which includes all the popular ones such as tf.idf, BM25 and some of the language models. Attaining this upper bound in practice may require a much larger number of training examples or further improvement of the weighting functions through analytical modeling. In order to evaluate whether more training data can help, we also ran tests using 90 topics for training and the remaining 10 for testing. We ran 10 tests, each time using a different set of 10 sequential topics for testing, and averaged our results. In this case, the averaged performance was completely restored to the baseline level, with a mean difference in precision across test queries of +0.5% and a 1% standard deviation of the mean. We believe this is a remarkable result considering the difficulties that prior learning-based approaches had with the classical information retrieval task! We attribute our success to the higher flexibility and generalizability of our discrete representation.
We also varied the number of bins to evaluate the effect of the granularity of representation. Figures 1 and 2 demonstrate that 8 bins suffice for both global and local weighting. Higher numbers did not result

Table 4. Surpassing the baseline performance. 8 x 8 bin design.

  Testing:   Learned   Baseline   % change (+/- 0.9), (+/- 1.0)
  Training:  Learned   Baseline   % change (+/- 1.0), (+/- 1.3)
in noticeable improvements.

Figure 1. Learning local weighting for various numbers of bins (average precision vs. number of bins; baseline and learned curves).

Figure 2. Learning global weighting for various numbers of bins (average precision vs. number of bins; baseline and learned curves).

In order to test whether our approach can exceed the baseline performance, we set BM25 to be our starting ranking function (SRF). Thus, in this case:

G(t) = log(N / (df + 0.5))   (6)

L(tf, d) = tf / (tf + K · (1 - b + b · |d| / |d|_a))

Table 4 shows performance for the 8 by 8 bin design. Although the improvement is relatively small (2-3%), it is still statistically significant at the level of alpha < 0.1 under a paired t-test. The value in the "% change" column shows the mean % improvement across all queries and its standard deviation. It may differ from the % change of the mean performance, since there is wide variability in performance across queries but smaller variability in the improvement. We believe even such a small improvement is remarkable considering the amount of attention researchers have paid to optimizing ranking functions for this specific data set, which has been available for more than seven years. A number of recent studies reported comparable improvements on the same test collection by using more elaborate modeling or richer representations. Of course, improvements due to techniques such as those based on n-grams, document structure, natural language processing or query expansion can possibly achieve even better results. However, in this study we deliberately limited our focus to bags of words.

Table 5. Small test collections and their baseline performance.

  Collection   Number of Queries   Number of Documents   Baseline Average Precision
  Cranfield
  NPL
  Med
  CISI
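Equation (6) and the local weighting above are simply the two lw.gw factors of the Lemur-style BM25 formula quoted in Section 3.1. A direct rendering of that per-term score follows; the parenthesization is our reconstruction of the garbled printed formula, checked against the standard BM25 form, with the near-optimal defaults K = 1.0 and b = 0.5 reported above:

```python
import math

def bm25_term(tf, df, doc_len, avg_doc_len, N, K=1.0, b=0.5):
    """Per-term BM25 score:
        tf / (tf + K * (1 - b + b * |d| / |d|_a)) * log(N / (df + 0.5))
    The first factor is the local weighting L(tf, d) (tf saturation with
    document-length normalization); the second is the global weighting
    G(t) of equation (6)."""
    saturation = tf / (tf + K * (1.0 - b + b * doc_len / avg_doc_len))
    idf = math.log(N / (df + 0.5))
    return saturation * idf
```

The factorization makes the lw.gw membership of BM25 explicit: the saturation factor depends only on document-level statistics, the idf factor only on collection-level statistics, so the discretized bins can adjust each independently.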
3.2 Experiments with small collections

Information retrieval from small and mid-size collections is still important within organizations for their day-to-day activities, for example locating important messages, policy manuals or customer complaint tickets. However, most of the recent studies have been performed on TREC-size collections. Small test collections have also been extensively studied in the past, including in the works mentioned in our Introduction. In order to address the possible practical value of searching small collections and to compare our results with past efforts, we also performed our experiments on the following classical collections: Cranfield, NPL, CISI, and Med. Table 5 lists the properties of the test collections and the baseline performance on them. Table 6 shows the effects of learning across queries. We only explored learning global weighting in those tests. The number of global weighting bins was set to 10. As can be seen, the effect was overall positive, with relative improvement ranging from -4% to +28%. The results also show that generally there was no danger of overfitting: the effects on the training and testing sets were similar. The only negative effect listed in Table 6 corresponds to a zero learning effect on the training set. Since in practice this can be detected during the training phase, the degradation could easily be avoided by not applying our technique and using a traditional scoring method instead. Table 7 shows the learning effect across different collections, 12 combinations in total, each tested once. Positive relative improvements ranged from 0 to +28%. The only noticeable negative effects were again the results of learning on the CISI collection. CISI does not demonstrate improvement on its training set either, so it can be excluded
from the application of our method in practice.

Table 6. Learning across queries. The effects on all 4 small test collections.

  Collection   Cross-validation   Precision on the training set   Precision on the testing set   Relative Improvement
  Cranfield
  NPL
  Med
  CISI

It is remarkable that the NPL collection improves as much when trained on Cranfield as when trained on itself. We can observe that NPL and Cranfield are more amenable to the technique. This is not surprising, since they have much larger numbers of queries than the other two. Our across-collection training results are very encouraging and stronger than those reported in the prior related studies mentioned in our introduction. We believe the effects could be further increased by normalization of the features within each collection, a standard procedure in machine learning, which we did not try in this study.

4. Other Conclusions, Limitations And Future Work

We explored learning how to rank documents with respect to a given query using linear Support Vector Machines and a discretization-based representation. Our approach to information retrieval represents a family of discriminative approaches, currently not well studied by researchers. Our experiments indicate that learning from the relevance judgments available with the standard test collections and generalizing to new queries is not only feasible but can be a very powerful source of improvement. When tested on a popular standard collection, our approach achieved the performance of the best well-known techniques (BM25 and language models), which were developed as a result of extensive past experiments or elaborate theoretical modeling. When combined with the best performing ranking functions, our approach added a small (2-3%), but statistically significant, improvement.
Although the practical significance of this study may be limited at the moment, since it does not demonstrate a dramatic increase in retrieval performance on large test collections, we believe our findings make important theoretical contributions, since they indicate that the power of the discriminative approach is comparable to that of the best known analytical or heuristic approaches. This work also lays the foundation for extending the discriminative approach to richer representations, such as those using word n-grams, grammatical relations between words, and the structure of documents. We deliberately limited our investigation to the bag-of-words approach and did not use bigrams, LSI, query expansion, or pseudo relevance feedback. Under those conditions, the small improvements reported here are comparable with those reported in other recent works where evaluations were made relative to a known strong baseline (e.g. Okapi/BM25 or language models) on ad hoc TREC collections. On the classical small test collections, our learning approach demonstrates significant improvement, ranging from 0 to 30%, and even works when trained on a different collection, which prior approaches failed to accomplish. We believe that our approach performs well because it learns the important function shapes of global and local weighting. The major advantages of our approach are the following:

Simplicity: It does not require any analytical modeling or assumptions about the statistical distributions of query and document terms.

Extensibility: The approach can easily incorporate other learning techniques and other relevance features, such as those based on n-grams, part of speech, structural elements of a document (title, headings) or general properties of a document (popularity, style, trustworthiness, etc.). It can also incorporate other classifiers.

Explicitness: Through analysis of the learned weights, it allows interpreting the importance of specific classes of terms (e.g. frequent vs.
rare) and of occurrences of terms in documents (e.g. single occurrence vs. multiple). Of course, using only a few test cases (topic sets and collections) is a limitation of the current study, which we are going to address in our future research. We view our approach as complementary, rather than competitive, to analytical approaches such as language models. Our approach can also be used as an exploratory tool to identify important relevance-indicating features, which can later be modeled analytically. We believe that our work and the works referred to in this paper may bring many of the achievements made in the more general area of classification and machine learning closer to the task of rank-ordered information retrieval, thus making retrieval engines more helpful in reducing information overload and meeting people's needs.

5. Acknowledgement

Weiguo Fan's work is supported by NSF under grant number ITR. Roussinov's work was supported by the Dean's Award of Excellence, W. P. Carey School of Business, summer.
Table 7. Learning across small collections.

Training Collection  Testing Collection  Average absolute effect on the testing collection (%)  Relative effect on the testing collection (%)
Cranfield  NPL
Cranfield  Med
Cranfield  CISI
NPL  Cranfield
NPL  Med
NPL  CISI
Med  Cranfield
Med  NPL
Med  CISI
CISI  Cranfield
CISI  NPL
CISI  Med

References

[1] Bartell, B., Cottrell, G., and Belew, R. (1994). Optimizing Parameters in a Ranked Retrieval System Using Multi-Query Relevance Feedback. Symposium on Document Analysis and Information Retrieval (SDAIR).
[2] Cohen, W., Schapire, R., and Singer, Y. (1999). Learning to order things. Journal of Artificial Intelligence Research, 10.
[3] Dougherty, J., Kohavi, R., and Sahami, M. (1995). Supervised and unsupervised discretization of continuous features. Proceedings of the Twelfth International Conference on Machine Learning. Tahoe City, CA: Morgan Kaufmann.
[4] Fan, W., Luo, M., Wang, L., Xi, W., and Fox, E. A. (2004). Tuning Before Feedback: Combining Ranking Discovery and Blind Feedback for Robust Retrieval. Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR).
[5] Fuhr, N. and Buckley, C. (1991). A probabilistic learning approach for document indexing. ACM Transactions on Information Systems, 9.
[6] Gey, F. C. (1994). Inferring probability of relevance using the method of logistic regression. Proceedings of the 17th ACM Conference on Research and Development in Information Retrieval (SIGIR '94).
[7] Greiff, W. A Theory of Term Weighting Based on Exploratory Data Analysis. ACM SIGIR.
[8] Hearst, M. (1998). Support Vector Machines. IEEE Intelligent Systems Magazine, Trends and Controversies, 13(4), July/August 1998.
[9] Joachims, T. A Statistical Learning Model of Text Classification with Support Vector Machines. Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR).
[10] Joachims, T. Optimizing Search Engines Using Clickthrough Data. Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM.
[11] Kraaij, W., Westerveld, T., and Hiemstra, D. The Lemur Toolkit for Language Modeling and Information Retrieval, 2.cs.cmu.edu/~lemur/
[12] Nallapati, R. (2004). Discriminative models for information retrieval. Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR).
[13] Robertson, S. E. and Sparck Jones, K. Relevance weighting of search terms. Journal of the American Society for Information Science, 27(3).
[14] Robertson, S. E., Walker, S., Jones, S., Hancock-Beaulieu, M. M., and Gatford, M. Okapi at TREC-4. In D. K. Harman, editor, Proceedings of the Fourth Text REtrieval Conference. NIST Special Publication.
[15] Salton, G. and McGill, M. J. (1983). Introduction to Modern Information Retrieval. New York: McGraw-Hill.
[16] Song, F. and Croft, W. B. A general language model for information retrieval. Proceedings of the Eighth International Conference on Information and Knowledge Management (CIKM '99).
[17] Vapnik, V. N. Statistical Learning Theory. John Wiley and Sons, New York.
[18] Vogt, C. and Cottrell, G. (1999). Fusion Via a Linear Combination of Scores. Information Retrieval, 1(3).
[19] Zhai, C. and Lafferty, J. (2001). A study of smoothing methods for language models applied to ad hoc information retrieval. Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR).