Natural Language Processing

Natural Language Processing
Sentiment Analysis
Potsdam, 7 June 2012
Saeedeh Momtazi, Information Systems Group
Based on the slides of the course book

Outline
1 Applications
2 Task
3 Machine Learning Approach
4 Rule-based Approach

Hotel Reviews

Product Reviews
- Picture quality
- Ease of use
- Size
- Weight
- Color
- Zoom

Social Media

Event Analysis and Prediction
- Analyzing the side effects of events in different communities
- Predicting election results
- Predicting the stock exchange
- ...

Sentiment Analysis Levels
[Diagram: Text splits into Fact and Opinion; Opinion is labeled by polarity (+) or by emotion (happy, surprised, ... angry, afraid, ...)]

Advanced Sentiment Analysis
- Opinion holder and opinion target / aspect
  "Students like Wikipedia because it is easy to use and it sounds authoritative."
  (opinion holder: Students; target: Wikipedia)
  "I had a nice stay in this hotel and the rooms were very clean."
  (aspect: the rooms)
- Mixed opinions
  "The restaurant has an amazing view but it is very dirty."

Other Names
- Opinion mining
- Opinion extraction
- Sentiment mining
- Subjectivity detection
- Subjectivity analysis

Sentiment Analysis Approaches
- Machine learning methods: classification
- Rule-based methods: dictionary oriented

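Machine Learning Approach
[Diagram]
- Training: texts T_1, T_2, ..., T_n with known classes C_1, C_2, ..., C_n are converted to features f_1, f_2, ..., f_n, which are used to build the model.
- Testing: a new text T_n+1 with unknown class (?) is converted to features f_n+1, and the model predicts its class C_n+1.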

Sentiment Classification
Any kind of supervised classifier can be used:
- K Nearest Neighbor
- Support Vector Machines
- Naïve Bayes
- Maximum Entropy
- Logistic Regression
- ...
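As an illustration of one classifier from this list, the following is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing. The function names, the whitespace tokenization, and the four-sentence training set are assumptions made for the example; they are not taken from the slides.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(texts, labels):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()  # assumed: simple whitespace tokenization
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # words never seen in training are skipped
            # add-one (Laplace) smoothed word likelihood
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set, invented for illustration only.
train_texts = [
    "the rooms were very clean",
    "i had a nice stay",
    "the restaurant is very dirty",
    "the song was not good",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_naive_bayes(train_texts, train_labels)
```

With this toy model, `classify("a nice clean hotel", model)` selects the positive class, because "nice" and "clean" occur only in the positive training texts.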

Features: Word
- All words or adjectives only? Using all words works better than using adjectives only.
- Word occurrence or frequency? Word occurrence is more useful than frequency.
- Use a binary value for each word: replace every word count higher than 0 in a text by 1.
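The difference between frequency and occurrence features can be sketched as follows. The function names and the whitespace tokenization are assumptions for the example.

```python
from collections import Counter


def word_frequency(text):
    """Frequency features: how many times each word appears in the text."""
    return Counter(text.lower().split())


def word_occurrence(text):
    """Binary occurrence features: every count higher than 0 becomes 1."""
    return {word: 1 for word in word_frequency(text)}
```

For the text "good good food", the frequency features give "good" a count of 2, while the occurrence features give both "good" and "food" the value 1.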

Features: Negation
- Negation words change the polarity of the text.
- Add the prefix NOT- to every word between a negation word and the next punctuation mark.
  I did not like the restaurant location, but the food...
  I did not NOT-like NOT-the NOT-restaurant NOT-location, but the food...
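A minimal sketch of this negation marking is given below. The set of negation words, the punctuation set, and the regular-expression tokenizer are assumptions chosen for the example.

```python
import re

NEGATION_WORDS = {"not", "no", "never"}  # assumed list of negation words
PUNCTUATION = set(".,!?;:")              # assumed set of scope-ending marks


def mark_negation(text):
    """Prefix NOT- to every word between a negation word and the next punctuation mark."""
    tokens = re.findall(r"[\w'-]+|[.,!?;:]", text)
    marked = []
    negating = False
    for token in tokens:
        if token in PUNCTUATION:
            negating = False          # punctuation ends the negation scope
            marked.append(token)
        elif negating:
            marked.append("NOT-" + token)
        else:
            marked.append(token)
            if token.lower() in NEGATION_WORDS:
                negating = True       # start marking from the next word
    return marked
```

Applied to the slide's example sentence, the words "like", "the", "restaurant", and "location" receive the prefix, and marking stops at the comma, so "but the food" is left unchanged.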

Features: Other emotions
- Consider emoticons as additional features, for example :) and :(
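One possible way to add emoticons to binary word features is sketched here. The feature names `EMO_POS` and `EMO_NEG`, the four supported emoticons, and the tokenizer are assumptions for the example.

```python
import re

# Assumed mapping from emoticons to feature names.
EMOTICONS = {":)": "EMO_POS", ":-)": "EMO_POS", ":(": "EMO_NEG", ":-(": "EMO_NEG"}
EMOTICON_PATTERN = re.compile(r":-?[()]")


def extract_features(text):
    """Binary word features plus one binary feature per emoticon type."""
    features = {}
    for emoticon in EMOTICON_PATTERN.findall(text):
        features[EMOTICONS[emoticon]] = 1
    # Remove emoticons before tokenizing the remaining words.
    remaining = EMOTICON_PATTERN.sub(" ", text)
    for word in re.findall(r"[a-z']+", remaining.lower()):
        features[word] = 1
    return features
```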

Fine-grained Analysis
- Dealing with finer classes of sentiment: -3, -2, -1, +1, +2, +3
- Approaches:
  - Using a multiclass classifier (6 classes in this case)
  - Using a two-level classifier
    - First level: polarity classifier (positive or negative)
    - Second level: strength classifier (1, 2, or 3)
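The two-level approach can be sketched as a function that combines the outputs of two classifiers. The `toy_polarity` and `toy_strength` functions below are hypothetical keyword-based stand-ins for trained models, used only so that the example runs.

```python
def two_level_classify(text, polarity_classifier, strength_classifier):
    """Combine a polarity decision and a strength decision into one of -3..-1, +1..+3."""
    sign = 1 if polarity_classifier(text) == "positive" else -1
    strength = strength_classifier(text)  # expected to return 1, 2, or 3
    return sign * strength


# Hypothetical stand-ins for trained classifiers (assumptions, not real models).
def toy_polarity(text):
    words = set(text.lower().split())
    return "positive" if words & {"good", "excellent"} else "negative"


def toy_strength(text):
    words = set(text.lower().split())
    if words & {"excellent", "terrible"}:
        return 3
    if "very" in words:
        return 2
    return 1
```

In practice both stand-ins would be replaced by supervised classifiers trained on labeled data: the first on polarity labels, the second on strength labels.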

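Rule-based Approach
[Diagram]
- Training: texts T_1, T_2, ..., T_n with known classes C_1, C_2, ..., C_n are used to build a dictionary of opinion words.
  - Positive words: good, love, brave, intelligent, nice, ...
  - Negative words: bad, hate, lie, ugly, poor, ...
- Testing: a new text T_n+1 with unknown class (?) is assigned a class C_n+1 using the dictionary.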

Rule-based Approach
- Looking for opinionated words in each text
- Classifying the text based on the number of positive and negative words
- Considering different rules for classification:
  - Fine-grained dictionary
  - Negation words
  - Booster words
  - Idioms
  - Emoticons
  - Mixed opinions
  - Linguistic features of the language
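The basic counting step can be sketched as follows, using the example words from the dictionary slide. Returning "neutral" on a tie is an assumption of this sketch, as are the function name and the tokenizer.

```python
import re

POSITIVE_WORDS = {"good", "love", "brave", "intelligent", "nice"}
NEGATIVE_WORDS = {"bad", "hate", "lie", "ugly", "poor"}


def classify_by_counts(text):
    """Classify a text by comparing its number of positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"  # assumed behavior when the counts are equal
```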

Rule-based Approach: Fine-grained Dictionary
- It was a good song.
- The song was excellent.

Rule-based Approach: Negation Words
- The song was good.
- The song was not good.

Rule-based Approach: Booster Words
- The song was interesting.
- The song was very interesting.
- The song was somewhat interesting.
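The fine-grained dictionary, negation, and booster rules can be combined in one scoring function, sketched below. All numeric scores and multipliers are assumed values chosen for illustration, and the rule that a modifier affects only the opinion word directly after it is a simplifying assumption.

```python
import re

WORD_SCORES = {"good": 1, "interesting": 1, "excellent": 2}  # assumed fine-grained scores
NEGATION_WORDS = {"not"}
BOOSTERS = {"very": 1.5, "somewhat": 0.5}                    # assumed multipliers


def score_text(text):
    """Sum opinion-word scores, adjusted by directly preceding negation and booster words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    multiplier = 1.0
    for token in tokens:
        if token in NEGATION_WORDS:
            multiplier *= -1.0                # negation flips the polarity
        elif token in BOOSTERS:
            multiplier *= BOOSTERS[token]     # booster scales the strength
        elif token in WORD_SCORES:
            total += multiplier * WORD_SCORES[token]
            multiplier = 1.0
        else:
            multiplier = 1.0                  # any other word resets the modifiers
    return total
```

Under these assumed values, "The song was excellent." scores higher than "It was a good song.", "not good" receives a negative score, and "very interesting" scores higher than "somewhat interesting".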

Rule-based Approach: Idioms
- shock horror

Rule-based Approach: Mixed Opinions
- The song was good, but I think its title was strange.
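One simple way to detect a mixed opinion is to score each clause separately. The sketch below splits the text at the word "but", which is an assumption of this example rather than a rule stated in the slides; the small word lists are also hypothetical.

```python
import re

POSITIVE_WORDS = {"good", "amazing"}    # hypothetical positive entries
NEGATIVE_WORDS = {"strange", "dirty"}   # hypothetical negative entries


def clause_scores(text):
    """Score each clause (split at 'but') as positive count minus negative count."""
    scores = []
    for clause in re.split(r"\bbut\b", text.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        positive = sum(token in POSITIVE_WORDS for token in tokens)
        negative = sum(token in NEGATIVE_WORDS for token in tokens)
        scores.append(positive - negative)
    return scores


def is_mixed(text):
    """A text is mixed if it has at least one positive and one negative clause."""
    scores = clause_scores(text)
    return any(score > 0 for score in scores) and any(score < 0 for score in scores)
```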

Rule-based Approach: German Linguistic Features
- English: I do not love the song.
- German: Ich liebe nicht das Lied.
- German: Ich liebe das Lied nicht.

Opinion Dictionary
- English
  - Subjectivity Clues (2005)
  - SentiSpin (2005)
  - SentiWordNet (2006)
  - Polarity Enhancement (2009)
  - SentiStrength (2010)
- German
  - GermanPolarityClues (2010)
  - SentiWortSchatz (2010)
  - GermanSentiStrength (2012)

Machine Learning with Opinion Dictionary
- Use the opinion words from the dictionary as features in the learning algorithms.
- Ignore all other words in the text.
- Adjectives alone do not work well, but opinion words are the best features to use.
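Restricting the features to opinion words can be sketched as a filter applied before training. The `OPINION_WORDS` set below is a small hypothetical stand-in for a real opinion dictionary.

```python
import re

# Hypothetical stand-in for a real opinion dictionary.
OPINION_WORDS = {"good", "nice", "clean", "amazing", "bad", "dirty", "strange"}


def opinion_word_features(text):
    """Binary features for opinion words only; all other words are ignored."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {token: 1 for token in tokens if token in OPINION_WORDS}
```

For the sentence "The restaurant has an amazing view but it is very dirty.", only "amazing" and "dirty" remain as features, and the resulting dictionaries can be passed to any of the supervised classifiers listed earlier.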