Natural Language Processing SoSe Sentiment Analysis. (based on the slides of Dr. Saeedeh Momtazi)

Natural Language Processing, SoSe 2015: Sentiment Analysis. Dr. Mariana Neves, June 8th, 2015 (based on the slides of Dr. Saeedeh Momtazi)

Outline: Applications, Task, Machine Learning Approach, Rule-based Approach

Product reviews

Social Media (http://www.streamcrab.com/)

Event Analysis and Prediction (http://www.thestocksonar.com/sentiment-analysis)

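Sentiment Analysis Levels: A text is either fact or opinion. An opinion is positive (+) or negative (-) and, at a finer level, expresses an emotion: angry, afraid, ... happy, surprised, ...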

Advanced Sentiment Analysis: Opinion holder and opinion target/aspect. Examples: "Students [OP HOLDER] like Wikipedia [TARGET] because it is easy to use and it sounds authoritative." "I had a nice stay in this hotel and the rooms [ASPECT] were very clean."

Advanced Sentiment Analysis: Mixed opinions. Example: "The restaurant has an amazing view but it is very dirty."

Other names: opinion mining, opinion extraction, sentiment mining, subjectivity detection, subjectivity analysis

Sentiment Analysis Approaches: machine learning methods (classification) and rule-based methods (dictionary oriented)

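Machine Learning Approach. Training: texts T1, T2, ..., Tn with known classes C1, C2, ..., Cn are converted to features F1, F2, ..., Fn, from which Model(F,C) is learned. Testing: a new text Tn+1 is converted to features Fn+1, and the model predicts its unknown class Cn+1.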
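The training and testing flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a small Naïve Bayes classifier with add-one smoothing as the model, whitespace tokens as features, and an invented toy training set. The function names (extract_features, train, predict) are hypothetical and not part of the lecture.

```python
import math
from collections import Counter, defaultdict

def extract_features(text):
    """Turn a text T into features F (assumption: lowercase whitespace tokens)."""
    return text.lower().split()

def train(texts, classes):
    """Learn Model(F, C): per-class word counts plus class counts."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(classes)
    for text, label in zip(texts, classes):
        word_counts[label].update(extract_features(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary

def predict(model, text):
    """Predict the class of an unseen text with Naive Bayes (add-one smoothing)."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class plus log likelihood of each known word.
        score = math.log(class_counts[label] / total_docs)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in extract_features(text):
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for illustration only.
train_texts = ["good food and nice staff", "bad food and ugly room",
               "nice room good view", "poor service bad staff"]
train_classes = ["pos", "neg", "pos", "neg"]

model = train(train_texts, train_classes)
prediction = predict(model, "nice view and good staff")
```

Any other supervised classifier could take the place of the Naïve Bayes scoring in predict without changing the overall train/test structure.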

Sentiment Classification: Any kind of supervised classifier can be used, e.g. K Nearest Neighbor, Support Vector Machines, Naïve Bayes, Maximum Entropy, Logistic Regression, ...

Features: All words or adjectives? Using all words works better than using adjectives only.

Features: Word occurrence or frequency? Word occurrence is more useful than frequency. Use a binary value for each word: replace all word counts higher than 0 in each text by 1.
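The difference between frequency and occurrence features can be shown with a short sketch. The whitespace tokenization and the function names are assumptions made for this example.

```python
from collections import Counter

def frequency_features(text):
    """Count how often each word appears in the text."""
    return Counter(text.lower().split())

def binary_features(text):
    """Replace every count higher than 0 by 1 (word occurrence only)."""
    return {word: 1 for word in frequency_features(text)}

text = "great great great food but slow service"
freq = frequency_features(text)   # "great" is counted three times
binary = binary_features(text)    # "great" is recorded once
```

With binary features, a word repeated many times in one review carries no more weight than a word used once.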

Features: Negation. Negation words change the text polarity. Add the prefix NOT- to every word between the negation word and the next punctuation mark. Example: "I did not like the restaurant location, but the food..." becomes "I did not NOT-like NOT-the NOT-restaurant NOT-location, but the food..."
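A minimal sketch of this negation marking follows. The list of negation words, the treatment of tokens ending in "n't", and the regular-expression tokenizer are assumptions for illustration; the slide specifies only the prefixing rule itself.

```python
import re

# Assumed negation words; a real system would use a longer list.
NEGATION_WORDS = {"not", "no", "never"}

def mark_negation(text):
    """Prefix NOT- to every word between a negation and the next punctuation."""
    tokens = re.findall(r"[\w']+|[.,;:!?]+", text)
    marked, negated = [], False
    for token in tokens:
        if token[0] in ".,;:!?":
            negated = False              # punctuation ends the negation scope
            if marked:
                marked[-1] += token      # re-attach punctuation to the previous word
            else:
                marked.append(token)
        elif negated:
            marked.append("NOT-" + token)
        else:
            marked.append(token)
            if token.lower() in NEGATION_WORDS or token.lower().endswith("n't"):
                negated = True           # start marking from the next word on
    return " ".join(marked)

result = mark_negation("I did not like the restaurant location, but the food...")
```

The marked tokens (for example NOT-like) then enter the feature set as separate words, so a classifier can learn different weights for "like" and "NOT-like".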

Features: Other emotions. Consider emoticons such as :) and :( as additional features, as well as smilies.
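One possible way to turn emoticons into features is to count them per polarity. The emoticon lists below are small, assumed examples, and the sketch relies on emoticons being separated from other tokens by whitespace.

```python
# Assumed, deliberately short emoticon lists for illustration.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}

def emoticon_features(text):
    """Count positive and negative emoticons among whitespace-separated tokens."""
    tokens = text.split()
    return {
        "pos_emoticons": sum(token in POSITIVE_EMOTICONS for token in tokens),
        "neg_emoticons": sum(token in NEGATIVE_EMOTICONS for token in tokens),
    }

features = emoticon_features("The food was great :) but the queue :( :(")
```

These two counts can be appended to the word features of the same text.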

Fine-grained Analysis: Dealing with finer classes of sentiment: -3, -2, -1, +1, +2, +3 (SAP HANA database)

Fine-grained Analysis: Approaches. (1) Use a multiclass classifier (6 classes in this case). (2) Use a two-level classifier. First level: polarity classifier (positive or negative). Second level: strength classifier (1 or 2 or 3).
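The two-level approach can be sketched as a function that combines the outputs of two classifiers into one of the six classes. The two toy classifiers below are hypothetical stand-ins for trained models and exist only to make the example runnable.

```python
def two_level_classify(text, polarity_classifier, strength_classifier):
    """Combine a polarity decision and a strength decision into -3..-1 or +1..+3."""
    polarity = polarity_classifier(text)   # first level: "positive" or "negative"
    strength = strength_classifier(text)   # second level: 1, 2 or 3
    return strength if polarity == "positive" else -strength

# Hypothetical stand-ins for trained classifiers.
def toy_polarity(text):
    return "negative" if "bad" in text.lower().split() else "positive"

def toy_strength(text):
    # One point of strength per "very", capped at 3.
    return min(3, 1 + text.lower().split().count("very"))

score_strong = two_level_classify("very very good", toy_polarity, toy_strength)
score_negative = two_level_classify("very bad", toy_polarity, toy_strength)
```

Compared with a single 6-class classifier, each of the two levels solves a smaller problem, at the cost of training and maintaining two models.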

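Rule-based Approach. Training: texts T1, T2, ..., Tn with classes C1, C2, ..., Cn yield a dictionary of negative words (bad, hate, lie, ugly, poor, ...) and positive words (good, love, brave, intelligent, nice, ...). Testing: the dictionary is used to predict the unknown class Cn+1 of a new text Tn+1.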

Rule-based Approach: Look for opinionated words in each text. Classify the text based on the number of positive and negative words.
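A minimal sketch of this counting rule, using the example words from the rule-based training slide as the dictionary. Returning "neutral" on a tie is an assumption; the slides do not say how ties are handled.

```python
import re

# Example dictionary entries taken from the slide; real lexicons are much larger.
POSITIVE_WORDS = {"good", "love", "brave", "intelligent", "nice"}
NEGATIVE_WORDS = {"bad", "hate", "lie", "ugly", "poor"}

def classify(text):
    """Classify a text by comparing its positive and negative word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"  # assumption: ties and texts without opinion words

label = classify("I love this nice hotel")
```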

Rule-based Approach: Different rules can be considered for classification: fine-grained dictionary, negation words, booster words, idioms, emoticons, mixed opinions, linguistic features of the language.

Rule-based Approach: Fine-grained Dictionary. Examples: "It was a good song." "The song was excellent."

Rule-based Approach: Negation Words. Examples: "It was a good song." "The song was not good."

Rule-based Approach: Booster Words. Examples: "The song was interesting." "The song was very interesting." "The song was somewhat interesting."
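The fine-grained dictionary, negation, and booster rules can be combined in one scoring function. The sketch below rests on several assumptions: the word scores and booster values are invented for illustration, a booster is only recognized directly before an opinion word, and negation simply flips the sign of the score.

```python
import re

# Hypothetical scores; a real fine-grained dictionary would supply these.
WORD_SCORES = {"good": 2, "excellent": 3, "interesting": 2, "bad": -2}
BOOSTERS = {"very": 1, "somewhat": -1}    # change in strength (assumed values)
NEGATIONS = {"not", "no", "never"}

def score_text(text):
    """Sum the scores of opinion words, adjusted by boosters and negations."""
    words = re.findall(r"[a-z']+", text.lower())
    total = 0
    for i, word in enumerate(words):
        if word not in WORD_SCORES:
            continue
        score = WORD_SCORES[word]
        previous = words[i - 1] if i > 0 else ""
        negator = previous
        if previous in BOOSTERS:
            # Strengthen or weaken the magnitude, but keep it at least 1.
            sign = 1 if score > 0 else -1
            score = sign * max(1, abs(score) + BOOSTERS[previous])
            negator = words[i - 2] if i > 1 else ""
        if negator in NEGATIONS:
            score = -score                # assumption: negation flips polarity
        total += score
    return total

scores = [score_text(s) for s in (
    "It was a good song.",
    "The song was excellent.",
    "The song was not good.",
    "The song was very interesting.",
    "The song was somewhat interesting.",
)]
```

Under these assumed values, "excellent" outranks "good", "not good" becomes negative, and "very" or "somewhat" moves the strength of "interesting" up or down by one step.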

Rule-based Approach: Idioms. Example: "shock horror"

Rule-based Approach: Mixed Opinions. Example: "The song was good, but I think its title was strange."
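One simple way to detect a mixed opinion is to score each clause separately. The sketch below assumes that clauses are separated by the word "but" and uses a tiny invented word list in which "strange" counts as negative; neither choice comes from the slides.

```python
import re

# Assumed mini-dictionary for illustration.
POSITIVE_WORDS = {"good", "amazing", "nice"}
NEGATIVE_WORDS = {"strange", "dirty", "bad"}

def clause_scores(text):
    """Split the text at 'but' and score every clause on its own."""
    scores = []
    for clause in re.split(r"\bbut\b", text.lower()):
        words = re.findall(r"[a-z']+", clause)
        positive = sum(word in POSITIVE_WORDS for word in words)
        negative = sum(word in NEGATIVE_WORDS for word in words)
        scores.append(positive - negative)
    return scores

def is_mixed(text):
    """A text is mixed if it has both a positive and a negative clause."""
    scores = clause_scores(text)
    return any(s > 0 for s in scores) and any(s < 0 for s in scores)

example = "The song was good, but I think its title was strange."
example_scores = clause_scores(example)
```

Scoring the whole sentence at once would cancel the two opinions out, while per-clause scores keep both visible.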

Opinion Dictionary (English): Subjectivity Clues (2005), SentiSpin (2005), SentiWordNet (2006), Polarity Enhancement (2009), SentiStrength (2010)

Opinion Dictionary (German): GermanPolarityClues (2010), SentiWortSchatz (2010), GermanSentiStrength (2012)

Machine Learning with Opinion Dictionary: Use opinion words as features in the algorithms and ignore the other words in the text.
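Restricting the features to dictionary words can be sketched as a fixed-length vector with one position per opinion word. The short word list and the use of binary values are assumptions for this example; any opinion dictionary could supply the vocabulary.

```python
import re

# Assumed mini-dictionary; the order of this list defines the vector positions.
OPINION_WORDS = ["bad", "good", "hate", "love", "nice", "poor", "ugly"]

def opinion_feature_vector(text):
    """Return one binary feature per opinion word; all other words are ignored."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return [1 if word in words else 0 for word in OPINION_WORDS]

vector = opinion_feature_vector("I love the nice view but hate the poor service")
```

The resulting vectors can be passed to any supervised classifier in place of features built from all words.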