COMP 527: Data Mining and Visualization. Danushka Bollegala


Introductions
Lecturer: Danushka Bollegala
Office: 2.24 Ashton Building (Second Floor)
Email: danushka@liverpool.ac.uk
Personal web: http://danushka.net/
Research interests: Natural Language Processing (NLP)

Course web site
http://danushka.net/lect/dm
Course notes, the lecture schedule, assignments and references are uploaded to the course web site.
A discussion board (Q&A) is available on VITAL.
Do not email me your questions. Instead, post them on the discussion board so that others can also benefit from your Q&A.

Evaluation
75% End of Year Exam (2.5 hrs)
  Short answers and/or essay type questions
  Select 4 out of 5 questions
  Past papers are available on the lecture web site
  Some of the review questions might appear in the exam as well!
25% Continuous Assessment
  Assignment 1: 12%
  Assignment 2: 13%
  Both assignments are programming oriented (in Python)
  Attend lab sessions for Python + Data Mining (once a week)

References
Data Mining, Witten
Pattern Recognition and Machine Learning (PRML), Bishop
Foundations of Statistical Natural Language Processing (FSNLP), Manning

Course summary
Data preprocessing (missing values, noisy data, scaling)
Classification algorithms
  Decision trees, Naive Bayes, k-NN, logistic regression, SVM
Clustering algorithms
  k-means, k-medoids, hierarchical clustering
Text Mining, Graph Mining, Information Retrieval
Neural networks and Deep Learning
Dimensionality reduction
Visualization
  Visualization theory, t-SNE, embeddings
Word embedding learning

Data Mining Intro Danushka Bollegala

What is data mining?
Various definitions:
  "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data" (Piatetsky-Shapiro)
  "the automated or convenient extraction of patterns representing knowledge implicitly stored or captured in large databases, data warehouses, the Web, or data streams" (Han, page xxi)
  "the process of discovering patterns in data. The process must be automatic or (more usually) semiautomatic. The patterns discovered must be meaningful" (Witten, page 5)

Applications of Text Mining
Computer program wins Jeopardy contest in 2011!

Applications of Deep Learning

Deep Learning
An unsupervised neural network learns to recognize cats when trained using millions of YouTube videos! (2012)
[Figure: cat image. Image credit: Jeff Dean @ Google]

Deep Learning
Google acquires London-based AI (gaming) startup for USD 400M!

Industrial Interests
Data Mining (DM) / Machine Learning (ML) / Natural Language Processing (NLP) experts are sought after by the CS industry
  Google research (Geoff Hinton / NN)
  Baidu (Andrew Ng)
  Facebook AI research (Yann LeCun / Deep ML)
The ability to apply the algorithms we learn in this lecture (and their complex combinations) will greatly improve your employability in CS industries

Academic Interests
DM is an active research field.
Top conferences
  Knowledge Discovery and Data Mining (KDD) [http://www.kdd.org/kdd2018/]
  Annual Meeting of the Association for Computational Linguistics (ACL) [http://acl2018.org/]
  International World Wide Web Conference (WWW) [www2018.thewebconf.org]
  International Conference on Machine Learning (ICML)
  Neural Information Processing Systems (NIPS)
  International Conference on Learning Representations (ICLR)

Piatetsky-Shapiro View (as tweaked by Dunham)
Initial Data -> [Selection] -> Target Data -> [Preprocessing] -> Preprocessed Data -> [Transformation] -> Transformed Data -> [Data Mining] -> Model -> [Interpretation] -> Knowledge

CRISP-DM View

Two main goals in DM
Prediction
  Build models that can predict future/unknown values of variables/patterns based on known data
  Machine learning, pattern recognition
Description
  Analyse given datasets to identify novel/interesting/useful patterns/rules/trends that can describe the dataset
  Clustering, pattern mining, association rule mining

Broad classification of Algorithms
Data Mining
  Predictive
    Classification algorithms (k-NN, Naive Bayes, logistic regression, SVM, Neural Networks, Decision Trees)
  Descriptive
    Clustering algorithms (k-means, hierarchical clustering)
    Visualization algorithms (t-SNE, PCA)
    Dimensionality reduction (SVD, PCA)
    Pattern/sequence mining

Classification
Given a data point x, assign it to one of a set of discrete classes
Examples
  Sentiment classification
    "The movie was great" -> +1
    "The food was cold and tasted bad" -> -1
  Spam vs. non-spam email classification
We want to learn a classifier f(x) that predicts either -1 or +1.
We must learn a function f that optimises some objective (e.g. minimises the number of misclassifications).
A training dataset $\{(x, y)\}$, where $y \in \{-1, +1\}$, is provided to learn the function f (supervised learning).
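To make the idea of learning f from labelled pairs concrete, here is a minimal sketch in Python. It uses a perceptron over bag-of-words counts, which is a stand-in chosen for brevity and is not one of the classifiers named on the slides. The helper names (`features`, `score`, `predict`, `train_perceptron`) are hypothetical, and the training set is just the two example sentences from the slide.

```python
# Toy training set: the two example sentences from the slide, labelled +1 / -1.
train = [
    ("the movie was great", +1),
    ("the food was cold and tasted bad", -1),
]

def features(text):
    """Bag-of-words: map each lowercase token to its count."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts

def score(weights, bias, x):
    """Linear score: bias plus the weighted sum of token counts."""
    return bias + sum(weights.get(tok, 0.0) * cnt for tok, cnt in x.items())

def predict(weights, bias, text):
    """f(x): return +1 or -1 from the sign of the linear score."""
    return 1 if score(weights, bias, features(text)) > 0 else -1

def train_perceptron(data, epochs=10):
    """Perceptron updates: adjust weights only on misclassified examples."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        mistakes = 0
        for text, y in data:
            x = features(text)
            if y * score(weights, bias, x) <= 0:  # wrong side or on the boundary
                for tok, cnt in x.items():
                    weights[tok] = weights.get(tok, 0.0) + y * cnt
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias

weights, bias = train_perceptron(train)
errors = sum(predict(weights, bias, text) != y for text, y in train)
print("training misclassifications:", errors)
```

With only two training sentences this merely memorises a handful of words. A realistic classifier would need far more labelled data and a held-out test set.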

Clustering
Given a dataset $\{x_1, x_2, \ldots, x_n\}$, group the data points into k groups such that data points within the same group have some common attributes/similarities.
Why do we need clusters (groups)?
  If the dataset is large, we can select some representative samples from each cluster
  Summarise the data, visualise the data
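The grouping step and the idea of picking a representative per cluster can be sketched as follows. This is a bare-bones k-means (Lloyd's algorithm) in plain Python, under two assumptions made here for reproducibility: the first k points are used as initial centroids, and no cluster ends up empty. The data points and the function names are hypothetical.

```python
import math

def kmeans(points, k, iters=100):
    """Plain k-means with a deterministic start (first k points as centroids)."""
    centroids = [tuple(p) for p in points[:k]]
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assign == assign:  # assignments stopped changing: converged
            break
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign, centroids

def representatives(points, assign, centroids):
    """Pick, for each cluster, the member closest to the centroid.
    Assumes every cluster has at least one member."""
    reps = []
    for j, c in enumerate(centroids):
        members = [p for p, a in zip(points, assign) if a == j]
        reps.append(min(members, key=lambda p: math.dist(p, c)))
    return reps

# Hypothetical 2-D data with two visually obvious groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.1), (5.2, 4.9), (4.8, 5.1)]
assign, centroids = kmeans(points, k=2)
reps = representatives(points, assign, centroids)
print(assign, reps)
```

The result of k-means depends on the initial centroids, so practical implementations usually run several random restarts rather than relying on the first k points.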

Cluster visualization

Word clusters
Words that express similar sentiments are grouped into the same cluster (Yogatama+14)

COMP527 Data Mining and Visualisation
Problem Set 0
Danushka Bollegala

Question 1
Consider two vectors $x, y \in \mathbb{R}^3$ defined as $x = (1, 2, 1)$ and $y = (-1, 0, 1)$. Answer the following questions about these two vectors.
A. Compute the length ($\ell_2$ norm) of x and y. (4 marks)
B. Compute the inner product between x and y. (2 marks)
C. Compute the cosine of the angle between the two vectors x and y. (4 marks)
D. Compute the Euclidean distance between the end points corresponding to the two vectors x and y. (4 marks)
E. For any two vectors $x, y \in \mathbb{R}^d$ such that $\|x\|_2 = \|y\|_2 = 1$, show that the following relationship holds between their cosine similarity $\cos(x, y)$ and their Euclidean distance $\mathrm{Euc}(x, y)$. (6 marks)
$$\mathrm{Euc}(x, y)^2 = 2\,(1 - \cos(x, y))$$
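The quantities in Question 1 can be computed with a few lines of Python. The sketch below also checks the part E identity numerically on two hypothetical unit vectors, which are not the vectors from the question. The identity itself follows from expanding $\|x - y\|_2^2 = \|x\|_2^2 + \|y\|_2^2 - 2\,x^\top y$ with both norms equal to 1. A numerical check is not a proof, and the function names are illustrative only.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def l2_norm(u):
    """Length (l2 norm) of a vector."""
    return math.sqrt(dot(u, u))

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    return dot(u, v) / (l2_norm(u) * l2_norm(v))

def euclidean(u, v):
    """Euclidean distance between the end points of u and v."""
    return l2_norm([a - b for a, b in zip(u, v)])

def normalise(u):
    """Scale a non-zero vector to unit l2 norm."""
    n = l2_norm(u)
    return [a / n for a in u]

# Hypothetical vectors, chosen only for illustration.
u = normalise([3.0, 4.0, 0.0])
v = normalise([1.0, 2.0, 2.0])

lhs = euclidean(u, v) ** 2
rhs = 2 * (1 - cosine(u, v))
print(lhs, rhs)  # the two sides agree up to floating-point rounding
```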

Question 2
Consider a matrix $A \in \mathbb{R}^{2 \times 2}$ defined as follows:
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$
Answer the following questions related to A.
A. Compute the transpose $A^\top$. (2 marks)
B. Compute the determinant $\det(A)$. (2 marks)
C. Compute the inverse $A^{-1}$. (4 marks)
D. Compute the eigenvalues and eigenvectors of A. (6 marks)
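For a 2 x 2 matrix, all four operations in Question 2 have closed forms, sketched below in plain Python. The matrix entries are taken as they read in the extracted text, so any signs lost in extraction would change the numbers. The function names are hypothetical, and `eigenvalues2` handles only the real-eigenvalue case.

```python
import math

def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]

def det2(m):
    """Determinant of a 2 x 2 matrix: ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c

def inverse2(m):
    """Inverse of a 2 x 2 matrix via the adjugate divided by the determinant."""
    (a, b), (c, d) = m
    det = det2(m)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def eigenvalues2(m):
    """Roots of lambda^2 - trace * lambda + det = 0 (real roots only)."""
    (a, b), (c, d) = m
    trace, det = a + d, det2(m)
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

A = [[2, 1], [1, 2]]  # entries as they appear in the extracted text

print(transpose(A), det2(A), inverse2(A), eigenvalues2(A))
# An eigenvector v for eigenvalue lam satisfies A v = lam v; for this A,
# (1, 1) and (1, -1) can be checked directly with matvec.
print(matvec(A, [1, 1]), matvec(A, [1, -1]))
```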

Question 3
A. Given $\sigma(x) = \frac{1}{1 + \exp(ax + b)}$, compute $\sigma'(x)$, the derivative of $\sigma(x)$ with respect to x.
B. Given $H(p) = -p \log(p) - (1 - p) \log(1 - p)$, find the value of p that maximises $H(p)$.
C. Find the maximum value of $g(x, y) = x^2 + y^2$ such that y [relation symbol missing] x + 1.
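Parts A and B can be sanity-checked numerically. The sketch below takes $\sigma$ exactly as it reads in the extracted text, with $\exp(ax + b)$ in the denominator. If a minus sign was lost in extraction, the sign of the derivative would change accordingly. Under that assumption the chain rule gives $\sigma'(x) = -a\,\sigma(x)\,(1 - \sigma(x))$, and setting $H'(p) = \log\frac{1 - p}{p}$ to zero gives $p = 1/2$. The parameter values and names are hypothetical.

```python
import math

def sigma(x, a, b):
    """sigma(x) = 1 / (1 + exp(a*x + b)), as the formula reads in the text."""
    return 1.0 / (1.0 + math.exp(a * x + b))

def sigma_prime(x, a, b):
    """Closed form from the chain rule: -a * sigma * (1 - sigma)."""
    s = sigma(x, a, b)
    return -a * s * (1.0 - s)

def entropy(p):
    """H(p) = -p log p - (1 - p) log(1 - p), for 0 < p < 1."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

# Hypothetical parameter values for the check.
a, b, x0, h = 1.5, -0.5, 0.3, 1e-6

# Central finite difference as an independent estimate of the derivative.
numeric = (sigma(x0 + h, a, b) - sigma(x0 - h, a, b)) / (2 * h)
analytic = sigma_prime(x0, a, b)

# Grid search over p in (0, 1) for the maximiser of H.
grid = [i / 1000 for i in range(1, 1000)]
best_p = max(grid, key=entropy)

print(numeric, analytic, best_p, entropy(best_p))
```

Part C is left out because the constraint's relation symbol is missing from the text.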