Dimensionality Reduction for Active Learning with Nearest Neighbour Classifier in Text Categorisation Problems

Michael Davy, Artificial Intelligence Group, Department of Computer Science, Trinity College Dublin
Saturnino Luz, Artificial Intelligence Group, Department of Computer Science, Trinity College Dublin

Abstract

Dimensionality reduction techniques are commonly used in text categorisation problems to improve training and classification efficiency as well as to avoid overfitting. The best performing dimensionality reduction techniques for text categorisation are supervised, hence utilise the label information of the training data. Active learning is used to reduce the number of labelled training examples for problems where obtaining label information is expensive. Since the vast majority of the data supplied to active learning are unlabelled, supervised dimensionality reduction techniques cannot be readily employed. For this reason, active learning in text categorisation problems typically does not perform dimensionality reduction, thereby restricting the choice of classifier. In this paper we investigate unsupervised dimensionality reduction techniques in active learning for text categorisation problems. Two unsupervised techniques are investigated, namely Document Frequency and Principal Components Analysis. We empirically show increased performance of active learning, using a k-nearest Neighbour classifier, when dimensionality reduction is applied using the unsupervised techniques.

1 Introduction

Text categorisation is defined as the task of assigning documents to a set of predefined categories [10]. Automated solutions to text categorisation have been developed using supervised learning, where a classifier is induced from a large number of labelled examples. Supervised learning assumes there is an abundance of labelled examples; however, this assumption does not hold for many domains. While labelled examples can be scarce, unlabelled examples are naturally abundant.

Active learning is a technique for constructing accurate classifiers from very small amounts of training data. Reductions in the number of labelled examples required are achieved by the active learner controlling the training data and populating it only with very informative examples. Conversely, supervised learning has no control over the training data, hence requires far more data to ensure there are sufficient numbers of informative training examples. Orders-of-magnitude reductions in labelling requirements are achieved when performing active learning on text categorisation problems [5].

In this paper we explore the difficulties arising from performing dimensionality reduction in active learning for text categorisation problems. The most successful dimensionality reduction techniques for text categorisation are supervised feature selection methods [13]. However, performing supervised feature selection is a significant problem for active learning tasks since the majority of the supplied training data are unlabelled. As text data is naturally high-dimensional, the choice of classifier used in active learning is therefore limited to those which do not suffer from the curse of dimensionality [7]. We investigate the application of unsupervised dimensionality reduction to active learning on text categorisation problems. Reducing the dimensionality while retaining the discriminative features will allow for greater flexibility in the choice of classifier used in active learning.
To the best of our knowledge this is the first analysis of the use of unsupervised dimensionality reduction in the context of active learning for text categorisation problems. Empirical evaluations were conducted on the effect of dimensionality reduction on the performance of active learning using the k-nearest Neighbour (kNN) algorithm. Two well-established unsupervised dimensionality reduction techniques were considered for use in active learning problems: feature selection is performed using Document Frequency with a global policy (DF_G), while feature extraction is performed using Principal Components Analysis (PCA).

Both techniques offer significant reductions in the size of the input data, with DF_G and PCA reducing dimensionality by up to 90% and 98% respectively. We demonstrate that preprocessing the data using the unsupervised dimensionality reduction techniques can significantly increase the performance of active learning using the kNN, making it more competitive with state-of-the-art classifiers such as Support Vector Machines.

A brief description of active learning, in particular pool-based active learning, is given in Section 2. The unsupervised dimensionality reduction techniques are reviewed in Section 3. Empirical evaluation on real-world text corpora is presented and discussed in Section 4. Finally, conclusions and future work are given in Section 5.

2 Active Learning

The goal of active learning is to produce an accurate classifier (Φ) from as few training examples as possible. This is advantageous for domains where labelled training examples are scarce and the task of labelling is expensive. Typically, training data for supervised learning are chosen randomly prior to induction. This is referred to as passive learning, since the learner has no control over which examples constitute the training data. Conversely, active learning allows the learner to construct its own training data. Starting from a small number of labelled seed examples, an active learner will iteratively select unlabelled examples, acquire correct labels and update the training data. Certain examples will contain more information about the problem than others. Passive learning can potentially label a large number of uninformative examples, whereas active learning attempts to select (and label) only those examples which contain the most information. Therefore, active learning can significantly reduce the number of labelled examples required when compared to passive learning.

2.1 Pool-Based Active Learning

In this paper we use pool-based active learning [5, 6], where the learner is supplied with a pool of unlabelled examples from which it selects queries. Algorithm 1 gives the outline of a pool-based active learner.

Algorithm 1: Pool-Based Active Learner
    Input: tr -- training data
    Input: ul -- unlabelled examples
    for i = 1, 2, ... until stopping criteria met do
        Φ_i = Induce(tr)            // Induce
        q = QuerySelect(ul, Φ_i)    // Select
        ul = ul \ {q}               // Remove
        l = Oracle(q)               // Label
        tr = tr ∪ {(q, l)}          // Update
    Output: Φ_F = Induce(tr)

The active learner is given a pool of unlabelled examples (ul) and training data (tr) which is seeded with a small number of labelled examples. In each iteration a classifier (Φ_i) is constructed from all the known labelled training data using an induction (classification) algorithm. The classifier can then be used by the query selection function to help select informative examples by providing predictions on unlabelled data. A query example (q) is selected using the query selection function and removed from the unlabelled pool. The true label (l) of the selected example is obtained from the oracle, an external entity assumed to be human and considered infallible. Once the true label is known, the labelled example is added to the training data, where classifiers induced in subsequent iterations will incorporate the information. Common stopping criteria used in active learning are a limit on the number of examples the oracle is willing to label, or stopping once all unlabelled examples have been selected.
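The loop in Algorithm 1 maps directly onto code. The following is a minimal Python sketch, where induce, query_select and oracle are placeholder callables for the induction algorithm, the selection strategy and the human labeller, and the label-budget stopping criterion is one of the two criteria mentioned above (none of these identifiers come from the paper):

def pool_based_active_learning(tr, ul, induce, query_select, oracle, budget):
    """tr: list of (example, label) pairs; ul: list of unlabelled examples."""
    for _ in range(budget):          # stopping criterion: labelling budget
        if not ul:                   # alternative criterion: pool exhausted
            break
        clf = induce(tr)             # Induce classifier from labelled data
        q = query_select(ul, clf)    # Select most informative example
        ul.remove(q)                 # Remove it from the unlabelled pool
        l = oracle(q)                # Obtain the true label
        tr.append((q, l))            # Update the training data
    return induce(tr)                # Final classifier on all labelled data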
Once stopped, the output of active learning is a classifier (Φ_F) trained on all the known labelled data.

2.1.1 Query Selection

The query selection function is a crucial component of active learning and is responsible for selecting informative examples from the pool. A number of selection strategies have emerged in the literature [6, 1]. In this paper we use Uncertainty Sampling (US) [5] as the query selection function. US selects the examples which the current classifier (Φ_i) is most uncertain about, where uncertainty is defined in terms of the confidence the classifier has in a prediction. For a probabilistic binary classifier, a prediction close to 0.0 or 1.0 indicates a confident prediction, while a prediction close to 0.5 indicates an uncertain one. Unlabelled examples in the pool are sorted according to their prediction uncertainty and the most uncertain example is selected as the query, as shown in Equation 1:

    s = argmin_{x ∈ ul} |Φ_i(x) − 0.5|    (1)
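For a binary probabilistic classifier, Equation 1 amounts to picking the pool example whose predicted positive-class probability lies closest to 0.5. A minimal sketch, assuming a scikit-learn-style predict_proba interface (that interface is our assumption, not something the paper specifies):

import numpy as np

def uncertainty_sample(pool_X, clf):
    """Equation 1: index of the pool example whose predicted positive-class
    probability is closest to 0.5, i.e. the most uncertain prediction."""
    p = clf.predict_proba(pool_X)[:, 1]     # P(y = 1 | x) for each pool example
    return int(np.argmin(np.abs(p - 0.5)))  # most uncertain example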

3 Dimensionality Reduction for Active Learning

While high-performance supervised feature selection techniques [13] can be applied in supervised text categorisation problems, the same techniques cannot be readily employed in active learning since the majority of the training data supplied are unlabelled. The use of benchmark corpora can allow the use of supervised feature selection [4]. However, in real-world applications the label information is not available, which limits the applicability of this kind of approach. In general, dimensionality reduction is not performed for active learning in text categorisation problems. To compensate, classifiers capable of handling high-dimensional data are preferred, restricting the choice of classifier used in active learning experiments. In this paper we explore an alternative approach which is suitable for realistic active learning in text categorisation problems. Two well-established unsupervised dimensionality reduction techniques are considered for use in conjunction with active learning.

3.1 Document Frequency Global (DF_G)

Document frequency [10] is a feature selection technique where features are chosen based on the number of documents in which they occur. Rare features which occur in only a small number of documents are removed, and only the features which occur in a large number of documents are retained. Despite its simplicity, the performance of document frequency is comparable to that of the best performing feature selection methods [13], such as Information Gain. It is worth noting that stopwords are removed before dimensionality reduction is performed. Document frequency can be performed using either a local or a global policy. Local dimensionality reduction selects a set of terms for each category (context-sensitive), which obviously requires knowledge of the label information. Conversely, a global policy for document frequency selects a set of the most frequent terms regardless of category, hence does not require label information (context-free). We use document frequency performed globally as an unsupervised feature selection technique.
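Because DF_G uses only the documents themselves, it can be computed on a fully unlabelled pool. A small sketch of global document-frequency selection over a bag-of-words matrix; the function name and interface are ours, and the keep=0.10 default mirrors the 10% retention used in the experiments of Section 4:

import numpy as np

def df_global(X, keep=0.10):
    """Global document-frequency selection: keep the `keep` fraction of terms
    occurring in the most documents. X is an (n_docs, n_terms) bag-of-words
    count matrix; no label information is needed."""
    df = (X > 0).sum(axis=0)                 # documents containing each term
    k = max(1, int(keep * X.shape[1]))
    selected = np.argsort(df)[::-1][:k]      # indices of most frequent terms
    return X[:, selected], selected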
3.2 Principal Components Analysis (PCA)

Principal Components Analysis is a method for projecting high-dimensional data into a new low-dimensional space with minimum loss of information. It is an unsupervised feature extraction technique which discovers the directions of maximal variance in the data. The coordinate system of the original data is orthogonally transformed, and the new coordinates are called the principal components (sometimes called principal axes). Principal components can be found by performing eigenvalue decomposition of the covariance matrix constructed from the training data. The solution is a set of eigenvectors with associated eigenvalues: the eigenvectors are the principal components of the data, while the eigenvalues define the amount of variance accounted for by each principal component. Principal components are sorted by their eigenvalues, so that the first principal component accounts for the largest amount of variance, the second principal component for the second largest amount, and so on.

3.2.1 PCA for Text Categorisation

Given a set of l examples, principal component analysis first centres the data by constructing the mean µ (Equation 2) and subtracting it from each example. Centring the data is not essential but can remove irrelevant variance, as it reduces the overall sum of the eigenvalues.

    µ = (1/l) Σ_{i=1}^{l} x_i    (2)

The covariance matrix C is constructed from the centred examples as given in Equation 3 (here centring is incorporated into the construction of the covariance matrix):

    C = (1/l) Σ_{i=1}^{l} (x_i − µ)(x_i − µ)^T    (3)

The eigenvalue problem (Equation 4) is solved by performing eigenvalue decomposition on C. The solution is a set of eigenvectors (v) and their associated eigenvalues (λ):

    Cv = λv    (4)

The d largest eigenvalues are sorted in descending order (λ_1 ≥ λ_2 ≥ λ_3 ≥ ... ≥ λ_d) and their associated eigenvectors stacked to form the transformation matrix W = [v_1, v_2, v_3, ..., v_d]. A given example x is transformed into the reduced space by Equation 5:

    y = W^T x    (5)

The value of d is an important factor in the success of PCA. Since the eigenvalues correspond to the amount of variance accounted for by their associated eigenvectors, the proportion of variance accounted for by the first d eigenvectors can be calculated as:

    (λ_1 + λ_2 + ... + λ_d) / (λ_1 + λ_2 + ... + λ_N)

In this paper we choose the leading d components which account for 90% of the variance in the data.
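Equations 2-5 translate directly into a few lines of linear algebra. A sketch using NumPy's symmetric eigensolver, with d chosen to cover 90% of the variance as described above (function and variable names are ours):

import numpy as np

def pca_fit(X, var_fraction=0.90):
    """Equations 2-4: centre the data, eigendecompose the covariance matrix
    and keep the leading components covering `var_fraction` of the variance."""
    mu = X.mean(axis=0)                         # Equation 2: data mean
    Xc = X - mu                                 # centre the data
    C = (Xc.T @ Xc) / X.shape[0]                # Equation 3: covariance matrix
    lam, V = np.linalg.eigh(C)                  # Equation 4: eigendecomposition
    order = np.argsort(lam)[::-1]               # sort by descending eigenvalue
    lam, V = lam[order], V[:, order]
    ratio = np.cumsum(lam) / lam.sum()          # cumulative variance fraction
    d = int(np.searchsorted(ratio, var_fraction)) + 1
    return mu, V[:, :d]                         # mean and transformation W

def pca_transform(X, mu, W):
    return (X - mu) @ W                         # Equation 5: y = W^T x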

4 Empirical Evaluation

4.1 Experimental Setup

Experiments were conducted to examine the effect of the proposed unsupervised dimensionality reduction techniques on the performance of active learning. Two standard benchmark corpora previously used in active learning research [11, 8] were used, namely the Reuters corpus and a subset of the 20 Newsgroups corpus. The original feature set was obtained by preprocessing the corpora to remove stopwords and punctuation. Stemming was performed using the Porter stemming algorithm. Reduced feature sets were constructed using the two unsupervised dimensionality reduction techniques performed on the unlabelled and seed data. DF_G retained only 10% of the most frequent features, while PCA transformed the original data onto a d-dimensional space, where d was chosen as the number of principal components which accounted for 90% of the variance in the data. Both the training and test sets were re-expressed in the reduced feature representation.

The kNN is a high-performance classifier [12] for text categorisation; however, it is sensitive to high-dimensional data. While it is not commonly used for active learning text categorisation tasks, we chose the kNN since it benefits greatly from dimensionality reduction. The output of the kNN was transformed into a class membership probability estimate, where the distribution is based on the distance of the query example to the k nearest neighbours; a sketch of one such estimate is given at the end of this section. The estimate was then used as a measure of uncertainty (as discussed in Section 2.1.1). The k value was fixed at 3 in our experiments. The optimal value for k is typically found using validation data, which is not available in active learning. A low value for k is also important for the early iterations of active learning, since the number of training examples can be very low.

Comparisons are made between a baseline kNN using the full feature set (Full), kNN using the dimensionality-reduced data (DF_G and PCA), and a top-line Support Vector Machine trained on the full feature set (SVM). The Spider toolbox for Matlab was used to perform the experiments, with the 'andre' optimisation selected for the SVM. Active learning was seeded with 4 positive and 4 negative examples. Just one query example was selected per iteration. Once started, active learning was stopped only when all the unlabelled examples had been selected from the pool. The performance of active learning was measured using the classifier induced in each iteration (Φ_i) evaluated on a test set. Each experiment was run ten times and the results averaged. Within each trial the same seed examples for active learning were supplied to each of the techniques.
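The paper does not spell out the exact distance-based weighting used to turn the kNN output into a class-membership probability estimate, so the following should be read as one plausible inverse-distance scheme under our own assumptions, not as the authors' method:

import numpy as np

def knn_positive_probability(x, X_train, y_train, k=3, eps=1e-9):
    """One plausible distance-weighted estimate of P(y = 1 | x) from a kNN:
    inverse-distance weights over the k nearest neighbours (hypothetical
    weighting; the paper's exact scheme is unspecified)."""
    d = np.linalg.norm(X_train - x, axis=1)   # distances to all training docs
    nn = np.argsort(d)[:k]                    # indices of k nearest neighbours
    w = 1.0 / (d[nn] + eps)                   # closer neighbours weigh more
    return float(np.sum(w * (y_train[nn] == 1)) / np.sum(w))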
4.2 Reuters (R10)

We used the R10 subset [2], comprising the ten most frequent categories of the ModApte split. One-versus-rest experiments were constructed for each individual category. To reduce the computational overhead of performing active learning, a pool of 1,000 documents was randomly selected from the 9,603 training documents, as used in previous active learning research [11]. PCA selected, on average, the leading 306 principal components, which is a 98.5% reduction in dimensionality. DF_G retained only the top 1,987 (10%) features. Due to the unbalanced class distribution, the F1 combination of precision (π) and recall (ρ) was chosen as the performance metric, where F1 = 2πρ / (π + ρ). F1 was calculated using both macro-averaged and micro-averaged variants of precision and recall.
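Macro-averaging computes F1 per category and averages the results, so every category counts equally; micro-averaging pools the contingency counts first, so frequent categories dominate. A small sketch over per-category (TP, FP, FN) counts (interface is ours):

def f1(tp, fp, fn):
    pi = tp / (tp + fp) if tp + fp else 0.0     # precision
    rho = tp / (tp + fn) if tp + fn else 0.0    # recall
    return 2 * pi * rho / (pi + rho) if pi + rho else 0.0

def macro_micro_f1(counts):
    """counts: list of (tp, fp, fn) tuples, one per category.
    Macro-F1 averages per-category F1; micro-F1 pools the counts."""
    macro = sum(f1(*c) for c in counts) / len(counts)
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return macro, micro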

[Figure 1. Performance of active learning for R10: (a) Macro F1, (b) Micro F1. The number of iterations of active learning is given on the X axis and F1 on the Y axis.]

Performance of active learning on the R10 data is given in Figure 1. DF_G and PCA can be seen to lift the performance of active learning closer to that achieved by the top-line SVM classifier. Of the two unsupervised dimensionality reduction techniques, PCA achieves both a greater reduction in dimensionality and a higher performance increase.

The number of iterations of active learning required to produce a classifier (Φ_i) with performance equal to that of a classifier constructed by supervised learning on all training data using the full feature set is given in Table 1. Increasing the performance of active learning subsequently reduces the labelling effort. Both DF_G and PCA increase performance, resulting in reductions in the number of required labelled examples; again PCA is seen to outperform DF_G.

Table 1. Iterations of active learning required to achieve supervised learning performance for R10 (percentage of pool labelled).

              Full         DF_G         PCA
    Macro F1  460 (46%)    324 (33%)    243 (25%)
    Micro F1  930 (93%)    444 (45%)    385 (39%)

Given the high cost of labelling, it is useful to consider halting active learning after a limited number of labels are acquired. Stopping at 250 iterations, the increase in F1 using PCA compared to Full is 0.0496 (Macro) and 0.0561 (Micro), while the increase in F1 of DF_G compared to Full is 0.0219 (Macro) and 0.0082 (Micro). Bold text indicates statistical significance (α = 0.05).

4.2.1 Random Feature Selection

It could be the case that the observed improvements in performance are due simply to the positive effect that reducing the number of features has on the classifier, irrespective of the quality of the reduced set. In order to test that possibility we compared the performance of the baseline to random feature selection [3].

[Figure 2. Performance of Rand compared to Full: (a) Macro F1, (b) Micro F1. Iterations of active learning are given on the X axis and F1 on the Y axis.]

Figure 2 plots the performance of random feature selection (Rand) with respect to the original feature set (Full) on the R10 dataset (Rand was not run on the 20 Newsgroups dataset for the sake of brevity). The performance of Rand is significantly worse, which shows that the features selected by the unsupervised techniques are discriminative.
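A sketch of the Rand control condition: a fraction of features is kept uniformly at random, irrespective of document frequency (function name and seed handling are ours):

import numpy as np

def random_feature_selection(X, keep=0.10, seed=0):
    """Control condition: keep a random `keep` fraction of the features,
    ignoring document frequency entirely."""
    rng = np.random.default_rng(seed)
    k = max(1, int(keep * X.shape[1]))
    selected = rng.choice(X.shape[1], size=k, replace=False)
    return X[:, selected], selected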

4.3 20 Newsgroups Subset (20NG)

Four 1v1 problems were constructed from the 20 Newsgroups corpus [8]. The problems range in difficulty from easy to hard. Ten 50%/50% training/testing splits of the data were constructed and the results obtained were averaged. The average numbers of principal components chosen by PCA were: (A-R) 194, (G-X) 230, (W-H) 232 and (B-C) 210, reducing dimensionality by approximately 81% on average. DF_G reduced dimensionality by 90%. Due to the balanced class distribution of the 1v1 problems, error was used as the performance metric.

[Figure 3. Performance of active learning for 20NG: (a) Atheism-Religion (A-R), (b) Graphics-X (G-X), (c) Windows-Hardware (W-H), (d) Baseball-Cryptography (B-C). Iterations of active learning are given on the X axis and error on the Y axis.]

Figure 3 plots the rate of active learning for the four sub-problems. Both of the unsupervised dimensionality reduction techniques (DF_G and PCA) increase the performance of active learning. PCA again offers greater reductions in dimensionality and also outperforms DF_G on all four problems.

We compared the number of iterations required to produce a classifier (Φ_i) with error equal to that produced by supervised learning on all training data using the full feature set. Table 2 shows a significant reduction in the labelling effort when the dimensionality reduction techniques are employed.

Table 2. Iterations of active learning required to achieve the performance of supervised learning for 20NG (percentage of pool labelled).

           Full         DF_G         PCA
    A-R    672 (95%)    597 (85%)    416 (59%)
    G-X    773 (80%)    661 (68%)    580 (60%)
    W-H    616 (64%)    553 (57%)    342 (35%)
    B-C    408 (42%)    428 (44%)    210 (21%)

Stopping after just 250 iterations, the reduction in error of PCA compared to Full is: (A-R) 0.074, (G-X) 0.0883, (W-H) 0.0643, (B-C) 0.0517, while the reduction in error of DF_G compared to Full is: (A-R) 0.0255, (G-X) 0.028, (W-H) 0.0316, (B-C) 0.0386. Bold text indicates statistical significance (α = 0.05).

4.4 Discussion

Empirical evaluation shows that employing unsupervised dimensionality reduction increases the performance of active learning using a kNN. Performing DF_G offered some performance increase compared with the baseline (Full). The performance increase was shown to be a result of the selection of discriminative features, since random feature selection failed to achieve any increase in performance.

PCA outperformed DF_G in all of the experiments conducted. There are some noticeable differences between the two techniques which may account for the increased performance. While DF_G statically reduced the dimensionality of the data, PCA dynamically reduced the dimensionality until the majority of the variance in the data was accounted for. In the 20NG experiments, for instance, dimensionality was reduced to just 230 features in the G-X sub-problem. Subsequently, classification in the reduced feature set was considerably easier, leading to higher performance of active learning and a large reduction in the labelling effort (580 iterations compared to the baseline of 773). While PCA was shown to be the best performing technique, the computational expense associated with PCA is far greater, which limits its applicability to very large datasets. DF_G offers some increased performance at much lower computational expense.

5 Conclusions and Future Work

Supervised dimensionality reduction techniques cannot be readily employed in active learning scenarios, since the majority of the training data is unlabelled. The choice of classifier used in active learning is therefore limited to those which do not suffer from the curse of dimensionality. This paper investigated the use of well-established unsupervised dimensionality reduction techniques in active learning on text categorisation problems, to increase performance and allow for greater flexibility in the choice of classification algorithm. Empirical evaluations on two benchmark corpora show that both Document Frequency performed globally (DF_G) and Principal Components Analysis (PCA) significantly increased the performance of active learning when using a kNN. In both sets of experiments PCA was found to outperform DF_G; however, the increased performance comes with the higher computational overhead associated with conducting PCA. We plan to continue this research by looking at Kernel Principal Components Analysis (KPCA) [9], which will allow for non-linear principal components to be found.

Acknowledgements

This research is funded by the Irish Research Council for Science, Engineering and Technology (IRCSET).
References

[1] M. Davy and S. Luz. Active learning with history-based query selection for text categorisation. In Proceedings of the 29th European Conference on Information Retrieval Research (ECIR 2007), LNCS 4425, page 695, 2007.
[2] F. Debole and F. Sebastiani. An analysis of the relative hardness of Reuters-21578 subsets. Journal of the American Society for Information Science and Technology, 56(6):584-596, 2005.
[3] G. Forman. An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research, 3:1289-1305, 2003.
[4] S. Hoi, R. Jin, and M. Lyu. Large-scale text categorization by batch mode active learning. In Proceedings of the 15th International Conference on World Wide Web, 2006.
[5] D. Lewis and W. Gale. A sequential algorithm for training text classifiers. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 3-12, 1994.
[6] A. McCallum and K. Nigam. Employing EM in pool-based active learning for text classification. In Proceedings of the 15th International Conference on Machine Learning, pages 350-358, 1998.
[7] T. Mitchell. Machine Learning. McGraw-Hill, 1997.
[8] G. Schohn and D. Cohn. Less is more: Active learning with support vector machines. In Proceedings of the 17th International Conference on Machine Learning, pages 839-846, 2000.
[9] B. Schölkopf, A. Smola, and K. Müller. Kernel principal component analysis. In Advances in Kernel Methods: Support Vector Learning, pages 327-352, 1999.
[10] F. Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1-47, 2002.
[11] S. Tong and D. Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2:45-66, 2001.
[12] Y. Yang and X. Liu. A re-examination of text categorization methods. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 42-49, 1999.
[13] Y. Yang and J. Pedersen. A comparative study on feature selection in text categorization. In Proceedings of the 14th International Conference on Machine Learning, pages 412-420, 1997.
