Machine Learning for NLP
1 Natural Language Processing SoSe 2014: Machine Learning for NLP. Dr. Mariana Neves, April 30th, 2014 (based on the slides of Dr. Saeedeh Momtazi)
2 Introduction. "Field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959). Learning methods: supervised learning, active learning, semi-supervised learning, unsupervised learning, reinforcement learning.
3 Outline: Supervised learning, Semi-supervised learning, Unsupervised learning
5 Supervised Learning. Example: mortgage credit decision, with age and income as features.
6 Supervised Learning [figure: age vs. income plot with an unlabeled item marked "?"]
7 Classification. Training: items T1..Tn with feature vectors F1..Fn and known classes C1..Cn are used to learn Model(F, C). Testing: a new item Tn+1 is mapped to features Fn+1 and the model predicts its class Cn+1.
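The following is a minimal sketch of this training/testing pipeline in Python. The use of scikit-learn, the toy texts, and the spam/ham labels are assumptions for illustration only; the slides do not prescribe any particular library or data.

```python
# A minimal sketch of the training/testing pipeline (scikit-learn assumed;
# the toy texts and spam/ham labels are invented for illustration).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Training items T1..Tn with their known classes C1..Cn.
train_texts = ["cheap pills buy now", "meeting agenda attached",
               "win money fast", "project report draft"]
train_labels = ["spam", "ham", "spam", "ham"]

# F1..Fn: map each item to a feature vector (here: bag of words).
vectorizer = CountVectorizer()
F_train = vectorizer.fit_transform(train_texts)

# Model(F, C): learn the mapping from features to classes.
model = MultinomialNB()
model.fit(F_train, train_labels)

# Testing: a new item Tn+1 is mapped to features Fn+1 and classified.
F_new = vectorizer.transform(["win cheap pills now"])
print(model.predict(F_new))  # predicted class Cn+1
```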
8 Applications:
Problem                   | Item     | Categories
POS tagging               | Word     | POS
Named entity recognition  | Word     | Named entity
Word sense disambiguation | Word     | Word's sense
Spam mail detection       | Document | Spam / Not spam
Language identification   | Document | Language
Text categorization       | Document | Topic
Information retrieval     | Document | Relevant / Not relevant
9 Part-of-speech tagging
10 Named entity recognition
11 Word sense disambiguation
12 Spam mail detection
13 Language identification
14 Text categorization
15 Classification: the training/testing diagram from slide 7, repeated.
16 Classification algorithms: K Nearest Neighbor, Support Vector Machines, Naïve Bayes, Maximum Entropy, Linear Regression, Logistic Regression, Neural Networks, Decision Trees, Boosting
18-19 K Nearest Neighbor [figures: an unlabeled item "?" plotted among labeled items]
20 K Nearest Neighbor: 1-nearest neighbor (the item is assigned the class of its single closest labeled neighbor).
21-22 K Nearest Neighbor: 3-nearest neighbors (the item is assigned the majority class among its three closest labeled neighbors).
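Below is a small sketch of k-nearest-neighbour classification on 2-D points, contrasting k=1 and k=3. The numpy implementation and the toy coordinates are assumptions for illustration; they are not taken from the slides.

```python
# A small sketch of k-nearest-neighbour classification (toy 2-D data invented).
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k):
    # Euclidean distance from the new item to every labelled item.
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Labels of the k closest items; the majority label wins.
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]])
y_train = np.array(["A", "A", "B", "B", "B"])
x_new = np.array([2.0, 2.0])

print(knn_predict(X_train, y_train, x_new, k=1))  # 1-nearest neighbour
print(knn_predict(X_train, y_train, x_new, k=3))  # 3-nearest neighbours
```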
23 Classification algorithms (list repeated; next: Support Vector Machines)
24 Support vector machines [figure: items of two categories in a vector space]
25 Support vector machines: find a hyperplane in the vector space that separates the items of the two categories.
26 Support vector machines: there might be more than one possible separating hyperplane.
27 Support vector machines: find the hyperplane with the maximum margin; the vectors at the margins are called support vectors.
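A minimal sketch of fitting a linear maximum-margin classifier follows. scikit-learn's SVC and the toy 2-D points are assumptions of this sketch, not part of the lecture.

```python
# A minimal sketch of a linear SVM finding the maximum-margin hyperplane
# (scikit-learn assumed; toy 2-D data invented for illustration).
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [1.5, 0.5], [1.0, 2.0],
              [4.0, 4.0], [4.5, 3.5], [5.0, 4.5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print(clf.coef_, clf.intercept_)   # parameters of the separating hyperplane
print(clf.support_vectors_)        # the vectors lying on the margins
print(clf.predict([[2.0, 1.0], [4.0, 3.0]]))
```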
28 Classification algorithms (list repeated; next: Naïve Bayes)
29 Naïve Bayes: select the class with the highest probability, minimizing the number of items with wrong labels: $c = \arg\max_{c_i} P(c_i)$. The probability should depend on the data $d$ to be classified: $P(c_i \mid d)$.
30 Naïve Bayes: $c = \arg\max_{c_i} P(c_i \mid d) = \arg\max_{c_i} \frac{P(d \mid c_i)\, P(c_i)}{P(d)} = \arg\max_{c_i} P(d \mid c_i)\, P(c_i)$
31 Naïve Bayes: $c = \arg\max_{c_i} P(d \mid c_i)\, P(c_i)$, where $P(c_i)$ is the prior probability and $P(d \mid c_i)$ is the likelihood.
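The following sketch applies this decision rule to word-based documents, using the usual Naïve Bayes independence assumption $P(d \mid c_i) = \prod_{w \in d} P(w \mid c_i)$ plus add-one smoothing. The toy training sentences and the smoothing choice are assumptions for illustration.

```python
# A sketch of the Naive Bayes decision rule c = argmax_ci P(d|ci) P(ci),
# with the word-independence assumption P(d|ci) = prod_w P(w|ci) and
# add-one smoothing (toy training data invented for illustration).
import math
from collections import Counter, defaultdict

train = [("buy cheap pills now", "spam"),
         ("cheap money offer", "spam"),
         ("meeting schedule attached", "ham"),
         ("project meeting report", "ham")]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for counts in word_counts.values() for w in counts}

def classify(text):
    best_class, best_logp = None, float("-inf")
    for c in class_counts:
        logp = math.log(class_counts[c] / len(train))        # log prior P(ci)
        total = sum(word_counts[c].values())
        for w in text.split():
            # log likelihood P(w | ci), smoothed so unseen words get > 0
            logp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if logp > best_logp:
            best_class, best_logp = c, logp
    return best_class

print(classify("cheap meeting pills"))
```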
32 Classification: the training/testing diagram from slide 7, repeated.
33 Spam mail detection. Features: words, the sender's address, whether the mail contains links, attachments, or money amounts.
34 Feature selection. Bag-of-words: each document can be represented by the set of words that appear in it. The result is a high-dimensional feature space, which makes the process computationally expensive. Solution: use a feature selection method to select informative words.
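A minimal sketch of the bag-of-words representation is shown below; it makes the dimensionality issue visible because the feature space grows with the vocabulary. scikit-learn's CountVectorizer and the toy documents are assumptions for illustration.

```python
# A minimal sketch of the bag-of-words representation: every distinct word
# becomes one dimension of the feature space (scikit-learn assumed; toy docs).
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat",
        "the dog chased the cat",
        "stock prices rose sharply today"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

print(X.shape)                           # (n_documents, vocabulary_size)
print(vectorizer.get_feature_names_out())
print(X.toarray())                       # word counts per document
```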
35 Feature selection methods: information gain, mutual information, χ-square.
36 Information gain: measures the number of bits required for category prediction with respect to the presence or absence of a term in the document. Words whose information gain is less than a predefined threshold are removed.
$IG(w) = -\sum_{i=1}^{K} P(c_i) \log P(c_i) + P(w) \sum_{i=1}^{K} P(c_i \mid w) \log P(c_i \mid w) + P(\bar{w}) \sum_{i=1}^{K} P(c_i \mid \bar{w}) \log P(c_i \mid \bar{w})$
37 Information gain, estimated from document counts: $N$ = number of documents, $N_i$ = documents in category $c_i$, $N_w$ = documents containing $w$, $N_{\bar{w}}$ = documents not containing $w$, $N_{iw}$ = documents in category $c_i$ containing $w$, $N_{i\bar{w}}$ = documents in category $c_i$ not containing $w$. Then
$P(c_i) = \frac{N_i}{N}$, $P(w) = \frac{N_w}{N}$, $P(\bar{w}) = \frac{N_{\bar{w}}}{N}$, $P(c_i \mid w) = \frac{N_{iw}}{N_w}$, $P(c_i \mid \bar{w}) = \frac{N_{i\bar{w}}}{N_{\bar{w}}}$
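The sketch below computes the information-gain score directly from these document counts. The function signature and the example numbers are assumptions for illustration; log base 2 is used to match the "number of bits" interpretation.

```python
# A sketch of the information-gain score computed from document counts
# (toy counts invented; K = 2 categories in the example call).
import math

def information_gain(N, N_i, N_w, N_iw):
    """N: total docs; N_i[i]: docs in category i; N_w: docs containing w;
    N_iw[i]: docs in category i that contain w."""
    N_notw = N - N_w
    ig = 0.0
    for i in range(len(N_i)):
        p_ci = N_i[i] / N
        ig -= p_ci * math.log2(p_ci)                       # -sum P(ci) log P(ci)
        p_ci_w = N_iw[i] / N_w if N_w else 0.0             # P(ci | w)
        if p_ci_w > 0:
            ig += (N_w / N) * p_ci_w * math.log2(p_ci_w)   # + P(w) sum ...
        p_ci_notw = (N_i[i] - N_iw[i]) / N_notw if N_notw else 0.0
        if p_ci_notw > 0:
            ig += (N_notw / N) * p_ci_notw * math.log2(p_ci_notw)
    return ig

# Example: 100 docs, 2 categories of 50 each; the word appears in 40 docs,
# 35 of them in category 0 and 5 in category 1.
print(information_gain(N=100, N_i=[50, 50], N_w=40, N_iw=[35, 5]))
```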
38 Mutual information: measures the effect of each word on predicting the category, i.e. how much its presence or absence in a document contributes to category prediction: $MI(w, c_i) = \log \frac{P(w, c_i)}{P(w)\, P(c_i)}$. Words whose mutual information is less than a predefined threshold are removed, using either $MI(w) = \max_i MI(w, c_i)$ or $MI(w) = \sum_i P(c_i)\, MI(w, c_i)$.
39 χ-square: measures the dependence between words and categories:
$\chi^2(w, c_i) = \frac{N\, (N_{iw} N_{\bar{i}\bar{w}} - N_{i\bar{w}} N_{\bar{i}w})^2}{(N_{iw} + N_{i\bar{w}})(N_{\bar{i}w} + N_{\bar{i}\bar{w}})(N_{iw} + N_{\bar{i}w})(N_{i\bar{w}} + N_{\bar{i}\bar{w}})}$
where $\bar{i}$ denotes documents not in category $c_i$ and $\bar{w}$ documents not containing $w$. Words are ranked by $\chi^2(w) = \sum_{i=1}^{K} P(c_i)\, \chi^2(w, c_i)$ and the top words are selected as features.
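Below is a small sketch of the mutual-information and χ-square scores computed from a 2x2 contingency table of document counts. The function names and the example counts are assumptions for illustration and follow the notation above.

```python
# A sketch of the mutual-information and chi-square scores for a word w and a
# category ci, from a 2x2 contingency table of document counts (toy numbers).
import math

def mi(N, N_i, N_w, N_iw):
    # MI(w, ci) = log P(w, ci) / (P(w) P(ci))
    return math.log((N_iw / N) / ((N_w / N) * (N_i / N)))

def chi_square(N, N_iw, N_i_notw, N_noti_w, N_noti_notw):
    # chi^2(w, ci) = N (N_iw N_~i~w - N_i~w N_~iw)^2 /
    #   ((N_iw + N_i~w)(N_~iw + N_~i~w)(N_iw + N_~iw)(N_i~w + N_~i~w))
    num = N * (N_iw * N_noti_notw - N_i_notw * N_noti_w) ** 2
    den = ((N_iw + N_i_notw) * (N_noti_w + N_noti_notw)
           * (N_iw + N_noti_w) * (N_i_notw + N_noti_notw))
    return num / den

# Example: 100 docs; 50 in category ci; w occurs in 40 docs, 35 of them in ci.
print(mi(N=100, N_i=50, N_w=40, N_iw=35))
print(chi_square(N=100, N_iw=35, N_i_notw=15, N_noti_w=5, N_noti_notw=45))
```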
40 Feature selection: these methods perform well for document-level classification (spam mail detection, language identification, text categorization). Word-level classification (part-of-speech tagging, named entity recognition) might need other types of features.
41 Supervised learning. Shortcoming: it relies heavily on annotated data, and annotation is a time-consuming and expensive task. Solution: active learning, which uses a minimum amount of annotated data and has a human annotate further data only if it is very informative.
42 Active learning
43 Active learning: annotate a small amount of data.
44 Active learning: calculate the confidence score of the classifier on the unlabeled data [figure: items marked with high (H), medium (M), and low (L) confidence].
45 Active learning: find the informative unlabeled data (the data with the lowest confidence) and manually annotate it.
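The loop below sketches this uncertainty-sampling idea: train on the small labeled set, score the unlabeled pool, and send the least-confident item to the annotator. scikit-learn, the toy data, and the simulated oracle standing in for the human are all assumptions of this sketch.

```python
# A sketch of the active-learning loop: train on the small labelled set,
# score the unlabelled pool, and ask the "human" for the label of the
# least-confident item (toy data and oracle invented for illustration).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.2, 2.8]])
y_labeled = np.array([0, 0, 1, 1])
X_pool = rng.uniform(0.0, 3.2, size=(20, 2))      # unlabelled pool
oracle = (X_pool.sum(axis=1) > 3.2).astype(int)   # stands in for the human

for it in range(3):
    clf = LogisticRegression().fit(X_labeled, y_labeled)
    confidence = clf.predict_proba(X_pool).max(axis=1)
    query = confidence.argmin()                   # most informative item
    # "Manually" annotate the selected item and move it to the labelled set.
    X_labeled = np.vstack([X_labeled, X_pool[query]])
    y_labeled = np.append(y_labeled, oracle[query])
    X_pool = np.delete(X_pool, query, axis=0)
    oracle = np.delete(oracle, query)
    print(f"iteration {it}: {len(y_labeled)} labelled examples")
```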
46 Outline: Supervised learning, Semi-supervised learning, Unsupervised learning
47 Semi-supervised learning. Annotating data is a time-consuming and expensive task. Solution: use a minimum amount of annotated data and annotate further data automatically.
48 Semi-supervised learning: a small amount of labeled data.
49 Semi-supervised learning: a large amount of unlabeled data.
50 Semi-supervised learning: find the similarity between the labeled and unlabeled data and predict the labels of the unlabeled data.
51 Semi-supervised learning: train the classifier using the labeled data together with the predicted labels of the unlabeled data.
52 Semi-supervised learning: this can introduce a lot of noisy data into the system, so unlabeled data is added to the training set only if its predicted label has a high confidence.
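A minimal self-training sketch along these lines follows. scikit-learn, the toy data, and the 0.9 confidence threshold are assumptions chosen for illustration.

```python
# A sketch of self-training: predict labels for the unlabelled data and add
# only high-confidence predictions to the training set (toy data invented;
# the 0.9 threshold is an arbitrary choice).
import numpy as np
from sklearn.linear_model import LogisticRegression

X_labeled = np.array([[0.0, 0.0], [0.3, 0.2], [3.0, 3.0], [2.8, 3.1]])
y_labeled = np.array([0, 0, 1, 1])
X_unlabeled = np.random.default_rng(1).uniform(0.0, 3.2, size=(30, 2))

clf = LogisticRegression().fit(X_labeled, y_labeled)
proba = clf.predict_proba(X_unlabeled)
confidence, predicted = proba.max(axis=1), proba.argmax(axis=1)

# Keep only predictions the classifier is confident about, to limit noise.
keep = confidence >= 0.9
X_train = np.vstack([X_labeled, X_unlabeled[keep]])
y_train = np.append(y_labeled, predicted[keep])

final_clf = LogisticRegression().fit(X_train, y_train)
print(f"added {keep.sum()} pseudo-labelled items to {len(y_labeled)} labelled ones")
```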
53 Outline: Supervised learning, Semi-supervised learning, Unsupervised learning
54 Supervised Learning [figure: labeled age vs. income data with an unlabeled item marked "?"]
55-56 Unsupervised Learning [figures: the same age vs. income data without class labels]
57 Clustering: calculate similarities between the data items and assign similar data items to the same cluster.
58 Applications. Word clustering: speech recognition, machine translation, named entity recognition, information retrieval, ... Document clustering: text classification, information retrieval.
59 Speech recognition: "Computers can recognize a speech." vs. "Computers can wreck a nice peach." Word clusters such as {recognition, speech, named-entity, hand-writing} and {wreck, ball, ship} help choose the right transcription.
60 Machine translation: "The cat eats..." can be rendered as "Die Katze frisst..." or "Die Katze isst...". Word clusters such as {Katze, fressen, Hund, laufen} and {essen, Jung, Mann} help choose the correct verb.
61 Language modelling: sentences such as "I have a meeting on Monday evening.", "You should work on Wednesday afternoon." and "The next session is on Thursday morning." support choosing "The talk is on Monday morning." over "The talk is on Monday molding.". Word clusters such as {Monday, Tuesday, Thursday, Friday, Saturday, Sunday} and {morning, afternoon, evening, night} make this generalization possible.
62 Clustering algorithms. Flat: K-means. Hierarchical: top-down (divisive) and bottom-up (agglomerative), the latter with single-link, complete-link, and average-link variants.
63 K-means: the best-known clustering algorithm; works well for many cases and is used as the default/baseline for clustering documents. Each cluster center is defined as the mean or centroid of the items in the cluster, $\vec{\mu} = \frac{1}{|c|} \sum_{\vec{x} \in c} \vec{x}$, and the algorithm minimizes the average squared Euclidean distance of the items from their cluster centers.
64 K-means algorithm:
Initialization: randomly choose k items as initial centroids
while the stopping criterion has not been met do
  for each item do
    find the nearest centroid and assign the item to its cluster
  end for
  for each cluster do
    update the centroid as the average of all items in the cluster
  end for
end while
The algorithm iterates two steps: re-assignment (assigning each vector to its closest centroid) and re-computation (computing each centroid as the average of the vectors assigned to it).
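A direct, minimal implementation of this pseudocode is sketched below. numpy, the random seed, the fixed iteration count (standing in for the unspecified stopping criterion), and the toy data are assumptions of the sketch; empty clusters are not handled.

```python
# A direct, minimal implementation of the k-means pseudocode above
# (numpy assumed; a fixed iteration count replaces the stopping criterion,
# and empty clusters are not handled).
import numpy as np

def kmeans(X, k, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    # Initialization: randomly choose k items as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Re-assignment: assign each item to its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assignment = dists.argmin(axis=1)
        # Re-computation: each centroid becomes the mean of its items.
        centroids = np.array([X[assignment == j].mean(axis=0) for j in range(k)])
    return assignment, centroids

# Toy data: two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
labels, centers = kmeans(X, k=2)
print(centers)
```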
65 K-means [figure]
66-67 Hierarchical Agglomerative Clustering (HAC): creates a hierarchy in the form of a binary tree.
68 Hierarchical Agglomerative Clustering (HAC) algorithm:
Initialization: put each single item in its own cluster
while the predefined number of clusters has not been reached do
  for each pair of clusters do
    measure the similarity of the two clusters
  end for
  merge the two clusters that are most similar
end while
The similarity of two clusters can be measured in three ways: single-link, complete-link, average-link.
69 Hierarchical Agglomerative Clustering (HAC). Single-link / single-linkage clustering: based on the similarity of the two most similar members.
70 Hierarchical Agglomerative Clustering (HAC). Complete-link / complete-linkage clustering: based on the similarity of the two most dissimilar members.
71 Hierarchical Agglomerative Clustering (HAC). Average-link / average-linkage clustering: based on the average of all pairwise similarities between the members.
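The sketch below builds the merge tree with each of the three linkage criteria and cuts it into two clusters. scipy's hierarchy module and the toy 2-D points are assumptions for illustration.

```python
# A sketch of hierarchical agglomerative clustering with the three linkage
# criteria above (scipy assumed; toy 2-D points invented for illustration).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [3.0, 3.0], [3.1, 2.9], [2.9, 3.2]])

for method in ("single", "complete", "average"):
    Z = linkage(X, method=method)                    # binary merge tree
    labels = fcluster(Z, t=2, criterion="maxclust")  # cut into 2 clusters
    print(method, labels)
```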
72 Hierarchical Agglomerative Clustering (HAC) [figure]
73 This is not clustering, just word frequencies.
74-76 Further reading