Semi-supervised emotion lexicon expansion with label propagation


Mario Giulianelli [1], Daniël de Kok [2]
[1] University of Amsterdam
[2] Seminar für Sprachwissenschaft, University of Tübingen
CLIN, 2018

Emotion and sentiment
- Sentiment analysis commonly refers to the task of polarity annotation: a piece of text is positioned on a value scale from negative to positive.
- Emotion analysis replaces the value scale with a set of m basic emotions: a text is assigned to an emotion class, or it is mapped onto an m-dimensional space.
- Our work: document-based

Challenges of emotion analysis
- Lack of contextual information: judgements on affective orientation are subjective and susceptible to cross-cultural differences.
  - "Time to start this research paper"
  - "Am not gonna watch Barcelona match today"
- Inter-annotator agreement: trained annotators agree with a simple-average Pearson correlation of 53.67, and with a frequency-based average correlation of 43 (Strapparava and Mihalcea, 2007).
- Insufficient lexical coverage:
  - only 3,462 [1] emotion words in the NRC Emotion Lexicon
  - one third of the Hashtag Corpus contains no lexicon words
  - a tweet contains on average 1.09 lexicon words
[1] for anger, disgust, fear, joy, sadness, and surprise.

Approaches to emotion analysis
Corpus-based: emotion analysis as a supervised classification problem.
- Datasets: news (SemEval-2007 Affective Text), tweets (Hashtag Emotion Corpus), blog posts, fairy tales
- Features: weighted PMI, SentiWordNet scores, synonyms and lexical contrast from WordNet
Lexicon-based: relies on labeled dictionaries to calculate the emotional orientation of a text from the words and phrases that constitute it.
- Lexica: WordNet Affect, Hashtag Emotion Lexicon, NRC Emotion Lexicon

Approaches to emotion analysis
Lexica and corpora are complementary sources of information and can be used jointly (Strapparava and Mihalcea, 2008).
- Corpus-based approaches learn to use contextual information.
- Lexicon-based approaches typically have a wider coverage of emotion-bearing words but are context-independent.

Problems
- Narrow coverage: "Saddened by the terrifying events in Virginia."
- Affective content but emotionally neutral words: "I want cake. I bet we don't have any."
- Indirect affective words: "I am going to have a monster year."
- Compositionality: "Beating poverty in a small way."
- Implicatures: "I'm not actually writing a physics exam today."
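A minimal sketch of how a plain lexicon lookup runs into these problems. The four-entry lexicon, the tokenisation, and the function name are hypothetical and chosen only for illustration; they are not taken from the NRC Emotion Lexicon or from the authors' system.

```python
from collections import Counter

# Hypothetical toy lexicon (illustration only; real lexica are far larger).
TOY_LEXICON = {
    "terrifying": {"fear"},
    "excited": {"joy"},
    "grateful": {"joy"},
    "monster": {"fear"},
}

def emotion_counts(text, lexicon):
    """Count emotion labels of the lexicon words that occur in the text."""
    tokens = [t.strip(".,!?#").lower() for t in text.split()]
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return counts

# Narrow coverage: "saddened" is missing from the toy lexicon, so only fear is found.
narrow = emotion_counts("Saddened by the terrifying events in Virginia.", TOY_LEXICON)
# Emotionally neutral words: no lexicon word matches at all.
neutral = emotion_counts("I want cake. I bet we don't have any.", TOY_LEXICON)
# Indirect affective words: "monster" is matched as fear regardless of context.
indirect = emotion_counts("I am going to have a monster year.", TOY_LEXICON)
```

With this toy lexicon, the first sentence is scored for fear only, the second receives no score, and the third is scored for fear even though the intended reading is positive.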

Solution
Assumption: all terms in a text contribute to its affective content.
Use transductive learning to extend the coverage of an existing emotion lexicon, thereby:
- addressing the disproportion between lexicon words and unseen types
- leveraging latent information within the (semantic) space of lexicon words

Label propagation (Zhu and Ghahramani, 2002)
Construct a fully connected graph:
- labeled and unlabeled words are vertices
- edges are weighted by the distances between distributional word representations:
  $w_{ij} = \exp\left(-\frac{\mathrm{dist}(x_i, x_j)^2}{\sigma^2}\right)$
Compute a probabilistic transition matrix $T$,
  $T_{ij} = P(i \leftarrow j) = \frac{w_{ij}}{\sum_k w_{kj}}$,
and a label matrix $Y$ that stores, for each word, its probability distribution over labels.
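The graph construction can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' code: it assumes Euclidean distance for dist, uses made-up 2-dimensional vectors, and the function names are hypothetical.

```python
import math

def edge_weights(vectors, sigma=1.0):
    """w_ij = exp(-dist(x_i, x_j)^2 / sigma^2), assuming Euclidean distance."""
    n = len(vectors)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
            weights[i][j] = math.exp(-sq_dist / sigma ** 2)
    return weights

def transition_matrix(weights):
    """T_ij = w_ij / sum_k w_kj, so every column of T sums to 1."""
    n = len(weights)
    col_sums = [sum(weights[k][j] for k in range(n)) for j in range(n)]
    return [[weights[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

# Made-up toy vectors: two nearby words and one distant word.
vectors = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]]
W = edge_weights(vectors, sigma=1.0)
T = transition_matrix(W)
```

Nearby words receive weights close to 1 and distant words weights close to 0; sigma controls how quickly the weight decays with distance.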

Label propagation (Zhu and Ghahramani, 2002)
Iterative algorithm:
1. Propagate $Y \leftarrow TY$
2. Row-normalise $Y$
3. Repeat until convergence
Closed-form solution:
- Partition the transition matrix: $T = \begin{bmatrix} T_{ll} & T_{lu} \\ T_{ul} & T_{uu} \end{bmatrix}$
- Compute the solution directly: $Y_u = (I - T_{uu})^{-1} T_{ul} Y_l$
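The two variants can be compared on a toy graph. The sketch below rests on two assumptions taken from Zhu and Ghahramani (2002) rather than from the slide: labeled rows are clamped back to their known labels after every iteration, and the closed form is evaluated on the row-normalised transition matrix. The weights, the two-label setup, and all function names are hypothetical.

```python
def row_normalise(m):
    """Scale every row so that it sums to 1."""
    return [[v / total for v in row] for row, total in ((r, sum(r)) for r in m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def propagate(T, Y_l, n_unlabeled, tol=1e-12, max_iter=10_000):
    """Iterative variant; labeled nodes are assumed to come first in T."""
    n_labels = len(Y_l[0])
    Y = [r[:] for r in Y_l] + [[1.0 / n_labels] * n_labels for _ in range(n_unlabeled)]
    for _ in range(max_iter):
        new_Y = row_normalise(matmul(T, Y))        # steps 1 and 2
        new_Y[:len(Y_l)] = [r[:] for r in Y_l]     # clamp labeled rows (assumption)
        delta = max(abs(a - b) for r1, r2 in zip(new_Y, Y) for a, b in zip(r1, r2))
        Y = new_Y
        if delta < tol:
            break
    return Y[len(Y_l):]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + B[i][:] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def closed_form(T, Y_l):
    """Y_u = (I - T_uu)^-1 T_ul Y_l on the row-normalised matrix (assumption)."""
    l = len(Y_l)
    T_bar = row_normalise(T)
    T_ul = [row[:l] for row in T_bar[l:]]
    T_uu = [row[l:] for row in T_bar[l:]]
    u = len(T) - l
    A = [[(1.0 if i == j else 0.0) - T_uu[i][j] for j in range(u)] for i in range(u)]
    return solve(A, matmul(T_ul, Y_l))

# Made-up symmetric weights: nodes 0 and 1 are labeled, nodes 2 and 3 are not.
W = [[1.0, 0.1, 0.8, 0.2],
     [0.1, 1.0, 0.2, 0.7],
     [0.8, 0.2, 1.0, 0.3],
     [0.2, 0.7, 0.3, 1.0]]
col_sums = [sum(W[k][j] for k in range(4)) for j in range(4)]
T = [[W[i][j] / col_sums[j] for j in range(4)] for i in range(4)]
Y_l = [[1.0, 0.0], [0.0, 1.0]]   # node 0 has label 0, node 1 has label 1

iterative = propagate(T, Y_l, n_unlabeled=2)
direct = closed_form(T, Y_l)
```

On this toy graph both variants give the same label distributions: node 2, which is strongly connected to node 0, leans towards label 0, and node 3 leans towards label 1.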

Label propagation with word embeddings
Use cosine similarity to weight edges:
  $w_{ij} = \sigma\left(a \, \frac{x_i \cdot x_j}{\lVert x_i \rVert_2 \lVert x_j \rVert_2} + b\right)$
Replace $a \in \mathbb{R}$ with $\alpha \in \mathbb{R}^d$, parameters that control edge weights along the d dimensions of the chosen word representation:
  $w_{ij} = \sigma\left(\frac{\alpha^\top (x_i \odot x_j)}{\lVert x_i \rVert_2 \lVert x_j \rVert_2} + b\right)$
Parameter optimisation:
- Minimise $H = -\sum_{ij} Y_{ij} \log Y_{ij}$ using gradient descent.
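The two weighting schemes and the entropy objective can be written out as follows. This is a sketch under one assumption: the vector-valued parameter is read as a per-dimension weight on the element-wise product of the two embeddings, which reduces to the scalar case when all entries of alpha are equal. Function names and example values are hypothetical, and the gradient-descent step itself is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))

def weight_scalar(x_i, x_j, a, b):
    """sigma(a * cos(x_i, x_j) + b) with a single scalar parameter a."""
    cosine = sum(p * q for p, q in zip(x_i, x_j)) / (l2_norm(x_i) * l2_norm(x_j))
    return sigmoid(a * cosine + b)

def weight_vector(x_i, x_j, alpha, b):
    """Per-dimension parameters alpha weight each term of the dot product (assumption)."""
    weighted = sum(al * p * q for al, p, q in zip(alpha, x_i, x_j))
    return sigmoid(weighted / (l2_norm(x_i) * l2_norm(x_j)) + b)

def label_entropy(Y):
    """H = -sum_ij Y_ij log Y_ij, skipping zero entries (0 log 0 is taken as 0)."""
    return -sum(p * math.log(p) for row in Y for p in row if p > 0.0)

# Made-up example vectors and parameters.
x, y = [1.0, 2.0], [2.0, 1.0]
w_scalar = weight_scalar(x, y, a=2.0, b=-1.0)
w_vector = weight_vector(x, y, alpha=[2.0, 2.0], b=-1.0)
```

A low entropy means that the propagated label distributions are confident, which is why it can serve as a training signal when no gold labels are available for the unlabeled words.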

Batch-based label propagation
The size of the transition matrix can cause memory issues:
- $a \in \mathbb{R} \Rightarrow T \in \mathbb{R}^{V \times V}$: approx. 2GB
- $\alpha \in \mathbb{R}^d \Rightarrow T \in \mathbb{R}^{V \times V \times d}$: approx. 600GB
(V = 32,930; half-precision; 300-dimensional vectors)
Label propagation in batches:
- Randomly select a subset of the vocabulary of size U < V (possibly fixing the distribution of labeled and unlabeled instances to be equal to the proportion that they have in the original transition probability matrix)
- Compute the submatrix $T \in \mathbb{R}^{U \times U \times d}$
- Propagate labels within the submatrix
- Repeat M times for each submatrix
- Repeat for N submatrices
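The memory figures and the batch selection step can be checked with a short sketch. It assumes 2 bytes per half-precision value; the sampling helper, its name, and the toy vocabulary sizes are hypothetical and only illustrate the idea of keeping the labeled/unlabeled proportion fixed.

```python
import random

def tensor_bytes(vocab_size, dims=1, bytes_per_value=2):
    """Bytes needed for a V x V (x d) tensor; half precision is 2 bytes per value."""
    return vocab_size ** 2 * dims * bytes_per_value

def sample_batch(labeled, unlabeled, batch_size, rng):
    """Draw a batch that keeps the labeled/unlabeled proportion of the full vocabulary."""
    share = len(labeled) / (len(labeled) + len(unlabeled))
    n_labeled = round(batch_size * share)
    return rng.sample(labeled, n_labeled), rng.sample(unlabeled, batch_size - n_labeled)

GIB = 2 ** 30
scalar_gib = tensor_bytes(32_930) / GIB            # roughly 2 GiB
vector_gib = tensor_bytes(32_930, dims=300) / GIB  # roughly 600 GiB

# Toy vocabulary: 100 labeled and 900 unlabeled word ids (made-up sizes).
rng = random.Random(0)
batch_labeled, batch_unlabeled = sample_batch(
    list(range(100)), list(range(100, 1000)), batch_size=50, rng=rng)
```

With V = 32,930, the scalar case needs about 2 GiB and the 300-dimensional case about 606 GiB, in line with the figures on the slide. A batch of 3,000 words with d = 300 would need about 5 GiB instead.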

Representing words / linguistic units
Specialised word embeddings:
- Learn emotion-specific word vectors directly from a large annotated corpus by extending an existing general-purpose embedding algorithm (e.g. Collobert and Weston, Skip-gram)
- Use pretrained embeddings as weights for an emotion classifier and update them during training (our approach)
Other features:
- Character-level models of emotion intensity (Lakomkin et al., 2017)
- Additional lexical resources: WordNet, SentiWordNet

Experiments
We compare four emotion classifiers:
- One-vs-all SVM (Mohammad and Kiritchenko, 2015)
- Bidirectional LSTM
- Bidirectional LSTM model with an emotion lexicon (NRC Lexicon)
- Bidirectional LSTM model with the extended emotion lexicon obtained through label propagation

Results: emotion classification

Classification on the Hashtag Emotion Corpus

| Classifier | P | R | F1 |
| Mohammad and Kiritchenko, 2015 | 55.1 | 45.6 | 49.9 |
| Bidirectional LSTM | 55.0 | 55.0 | 55.0 |
| Bidirectional LSTM + emotion lexicon | 55.2 | 55.2 | 55.2 |
| Bidirectional LSTM + expanded lexicon [2] | 56.2 | 56.2 | 56.2 |

Domain adaptivity: classification on the SemEval-2007 headlines

| Classifier | P | R | F1 |
| Mohammad and Kiritchenko, 2015 | 46.7 | 38.6 | 42.2 |
| Bidirectional LSTM | 38.8 | 50.3 | 43.8 |
| Bidirectional LSTM + emotion lexicon | 39.2 | 50.9 | 44.3 |
| Bidirectional LSTM + expanded lexicon [2] | 43.1 | 48.9 | 45.9 |

[2] with scalar parameter a
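For readers who want to relate the three columns, F1 is the harmonic mean of precision and recall. The helper below is a generic sketch, not the authors' evaluation script, and its name is hypothetical. Because the published P and R values are already rounded, recomputed F1 scores can differ from the reported ones in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall: F1 = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# SVM baseline on the Hashtag Emotion Corpus: P = 55.1, R = 45.6.
svm_hashtag_f1 = round(f1_score(55.1, 45.6), 1)
# Plain bidirectional LSTM on the SemEval-2007 headlines: P = 38.8, R = 50.3.
lstm_headlines_f1 = round(f1_score(38.8, 50.3), 1)
```

Both recomputed values match the reported scores (49.9 and 43.8).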

Survey
Test the classification accuracy of an untrained person with respect to an emotion-annotated dataset, the Hashtag Corpus.
- 33 participants: undergraduate and graduate students
- task: read 25 tweets and assign each to one emotion class
- 825 unique tweets
Three main types of tweets:
- "I'm so excited for starting gift shopping early! #joy"
- "Grateful for the sudden ability to make amazing omelettes! #surprise"
- "Dropped my phone in coffee shitty day #joy"

Humans as classifiers

| Emotion class | Survey P | Survey R | Survey F1 | Our model P | Our model R | Our model F1 |
| anger | 25 | 50 | 33 | 38 | 27 | 32 |
| disgust | 18 | 70 | 29 | 40 | 18 | 25 |
| fear | 48 | 22 | 30 | 58 | 52 | 55 |
| joy | 52 | 46 | 49 | 66 | 76 | 71 |
| sadness | 50 | 52 | 51 | 40 | 44 | 42 |
| surprise | 40 | 23 | 29 | 53 | 46 | 49 |
| average | 40.9 | 40.4 | 40.6 | 56.2 | 56.2 | 56.2 |

Assigning an emotion to a short paragraph is a hard task for both a human and a statistical classifier. More contextual information is required than is available in the paragraph itself.

Summary
Conclusions:
- Label propagation can be used to extend the coverage of an existing emotion lexicon.
- Access to an expanded emotion lexicon can improve emotion classification, as it combines context-sensitivity with wide, context-independent lexical coverage.
Outlook:
- Can character-level models of emotion intensity (Lakomkin et al., 2017) be used for label propagation?
- Enrich word representations with lexical contrast information.

Thank you!

Intrinsic evaluation
Average Kullback-Leibler divergence for 10-fold cross-validation on the NRC Emotion Lexicon.

| Lexicon expansion | KL divergence |
| Uniform distribution | 1.34 |
| Majority class (Hashtag Corpus) | 21.32 |
| Prior class distribution (Hashtag Corpus) | 1.53 |
| Label propagation ($a \in \mathbb{R}$) | 1.31 |
| Batch label propagation ($a \in \mathbb{R}$) | 1.31 |
| Batch label propagation [3] ($\alpha \in \mathbb{R}^{300}$) | 14.37 |

[3] 500 batches of size 3,000; 5 epochs per batch
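The metric in this table can be illustrated with a small sketch. The direction of the divergence (gold distribution against predicted distribution) and the smoothing constant are assumptions, since the slide does not specify them; the six-class example distributions are made up.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_i p_i log(p_i / q_i); q is floored at eps (assumption)."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)

# Made-up gold distribution for one word over six emotion classes.
gold = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
uniform = [1.0 / 6] * 6
wrong_one_hot = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

kl_uniform = kl_divergence(gold, uniform)      # log 6, about 1.79
kl_wrong = kl_divergence(gold, wrong_one_hot)  # large: all mass on a wrong class
```

A prediction that puts all of its mass on a wrong class is penalised far more heavily than a uniform prediction, which is one way to read the large gap between the majority-class and uniform rows.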