Opinion Mining and Sentiment Analysis
She Feng, Shanghai Jiao Tong University
sjtufs@gmail.com
April 15, 2016
Outline
- What & Why?
- Data
- Tasks
- Interesting methods: Topic Models, Neural Networks
What is Opinion Mining / Sentiment Analysis?
Two types of textual information: facts and opinions. Note: facts can imply opinions.
Most text information processing systems focus on facts: web search, chat bots.
Sentiment analysis focuses on opinions: identifying and extracting subjective information.
Why is Sentiment Analysis important?
It's useful. It's necessary. And it's difficult:
"Honda's Accord and Toyota's Camry are nice sedans."
"Honda's Accord and Toyota's Camry are nice sedans, but hardly the best cars on the road."
Opinions are complicated. Human language is ambiguous.
Data
Let's look at what kind of data is used for sentiment analysis.
Data: user-generated content
- most of it is unlabeled
- the labels are noisy (#Panama #PanamaPapers)
- the texts are also noisy: varying lengths, most very short
- overall, the amount of useful training data is a bottleneck for sentiment analysis
Data: knowledge bases
WordNet: a lexical database for English with an emphasis on synonymy. Nouns, verbs, and adjectives are grouped into synonym sets (synsets); words are linked by lexical and conceptual relations, forming a net. Not specifically sentiment-oriented, but helps with identifying opinion targets. http://wordnet.princeton.edu/
Data: knowledge bases
SentiWordNet: a lexical database based on WordNet synsets. Each synset is assigned three sentiment scores: positivity, negativity, objectivity. http://sentiwordnet.isti.cnr.it/
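To make the idea concrete, here is a minimal sketch of a SentiWordNet-style lookup. The mini-lexicon and its scores are invented for illustration; the real resource assigns per-synset scores that sum to 1, but this is not its actual API.

```python
# Hypothetical mini-lexicon in the SentiWordNet style: each word carries
# positivity, negativity, and objectivity scores summing to 1.
MINI_LEXICON = {
    "clean": {"pos": 0.625, "neg": 0.0,  "obj": 0.375},
    "dirty": {"pos": 0.0,   "neg": 0.75, "obj": 0.25},
    "hotel": {"pos": 0.0,   "neg": 0.0,  "obj": 1.0},
}

def sentence_polarity(tokens):
    """Sum (pos - neg) over tokens found in the lexicon."""
    score = 0.0
    for tok in tokens:
        entry = MINI_LEXICON.get(tok.lower())
        if entry:
            score += entry["pos"] - entry["neg"]
    return score

print(sentence_polarity("The hotel was clean".split()))  # positive score
print(sentence_polarity("The hotel was dirty".split()))  # negative score
```

Real lexicon-based systems add negation handling, intensifiers, and word-sense disambiguation on top of this kind of lookup.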
Data: knowledge bases
ProBase: a hyponym-hypernym dataset with 2.7 million concepts. http://probase.msra.cn/
Data: knowledge bases
Good for: entity identification, summarization, unsupervised settings.
Opinion in user-generated texts
Let's look at some examples of review sentences.
Opinion in user-generated texts
We care about the following about an opinion:
- target object
- aspect of the object
- sentiment value
- opinion holder
- time
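The five fields above can be sketched as a small record type. This is our own illustrative structure, not a standard library; the field names simply mirror the list above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Opinion:
    """One extracted opinion: the quintuple described above."""
    target: str                  # the object the opinion is about, e.g. "hotel"
    aspect: Optional[str]        # a facet of the target; None if unknown ("/")
    value: str                   # sentiment value, e.g. "+", "-", "?"
    holder: Optional[str] = None # opinion holder; often implicit in reviews
    time: Optional[str] = None   # when the opinion was expressed

op = Opinion(target="hotel", aspect="room", value="+")
print(op)
```

An extraction system would emit one such record per (target, aspect) pair found in a review.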
Opinion in user-generated texts

target  aspect    value  holder  time
hotel   room      +      /       /
hotel   utility   ?      /       /
hotel   bathroom  +      /       /

"The room was extremely clean and well kept. The only complaint we had was no refrigerator but since we were only there for 2 nights so it was not a big deal. Bathroom was ample size."

The refrigerator sentence is ambiguous: a complaint, immediately softened.
Opinion in user-generated texts

target  aspect    value  holder  time
hotel   location  great  /       /

"Location was great with lots of taxis or Uber to take you anywhere you want, we also walked to a couple of restaurants when you come out of the hotel and take a right."

Explicit aspects: the aspect word ("location") appears in the text.
Opinion in user-generated texts

target  aspect    value  holder  time
hotel   location  good   /       /

"There are restaurants right out the front door. The little pub was our favorite. Lots of restaurants within walking distance as well."

Implicit aspects: the location is never named, yet the sentences clearly praise it.
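One simple way to surface implicit aspects is a cue-word heuristic: map indicator words to the aspect they imply. The cue lists below are invented for illustration; real systems learn such mappings from data.

```python
# Hypothetical cue words per aspect; an aspect fires if any cue appears.
ASPECT_CUES = {
    "location": {"restaurants", "walking", "distance", "taxis", "uber"},
    "room": {"bed", "refrigerator", "bathroom", "clean"},
}

def guess_aspects(sentence):
    """Return the aspects whose cue words appear in the sentence."""
    tokens = {t.strip(".,!?").lower() for t in sentence.split()}
    return sorted(a for a, cues in ASPECT_CUES.items() if tokens & cues)

print(guess_aspects("Lots of restaurants within walking distance as well."))
# ['location'] -- the implicit aspect, recovered without the word "location"
```

The limits are obvious (no disambiguation, hand-built lists), which is one reason the topic-model approaches below are attractive: they induce aspect word clusters automatically.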
Tasks (roughly ordered from unstructured to structured output):
- Review summarization
- Sentiment classification
- Subjectivity / objectivity identification
- Opinion holder / target identification
- Sexism / racism detection
- Sarcasm / irony detection
- Humor detection
- Fake review detection

Aspect-based Opinion Mining (the most structured of these)
Aspect-based Opinion Mining
[figure: example review aspects with per-aspect ratings]
Sub-tasks in Aspect-based Opinion Mining:
- aspect extraction
- aspect identification
- sentiment score prediction
Sub-tasks in Aspect-based Opinion Mining
Input: a set of reviews (of some product). We focus on the joint inference of aspects and sentiment scores.
Aspect-based Opinion Mining: LDA
[figure: LDA plate diagram — per-document doc-topic distributions and per-topic topic-word distributions over the corpus]
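The doc-topic and topic-word picture can be made concrete with a tiny collapsed Gibbs sampler for LDA. The toy corpus, hyperparameters, and variable names below are all illustrative; this is a sketch of vanilla LDA, not of the aspect-model variants on the following slides.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, K=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns doc-topic dists and topic-word counts."""
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})
    ndk = [[0] * K for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(K)]  # topic-word counts
    nk = [0] * K                                # tokens per topic
    z = []                                      # topic assignment per token
    for d, doc in enumerate(docs):              # random initialization
        zs = []
        for w in doc:
            k = rng.randrange(K)
            zs.append(k); ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zs)
    for _ in range(iters):                      # resample each token's topic
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                weights = [(ndk[d][j] + alpha) * (nkw[j][w] + beta) / (nk[j] + V * beta)
                           for j in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    theta = [[(c + alpha) / (sum(row) + K * alpha) for c in row] for row in ndk]
    return theta, nkw

docs = [["room", "clean", "bed"], ["food", "tasty", "menu"],
        ["clean", "room"], ["menu", "food"]]
theta, nkw = lda_gibbs(docs)
print(theta)  # per-document topic mixtures
```

On this toy corpus the sampler tends to separate the "room" words from the "food" words into the two topics, which is exactly the behavior the aspect models below build on.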
Aspect-based Opinion Mining: model variants
- Phrase-LDA: text is modeled as (head, modifier) phrases, e.g. "a nice hotel" → head "hotel", modifier "nice".
- Separate-LDA: separate aspect and rating distributions, e.g. "It is very clean" vs. "It is very dirty".
- Dependency-LDA: the rating depends on the aspect.
- Separate-Phrase-LDA: separate aspect and rating distributions; aspects generate heads, ratings generate modifiers.
- Dependency-Phrase-LDA: heads from aspects, modifiers from ratings, with the rating depending on the aspect.
Aspect-based Opinion Mining
The variants combine two design choices: Separate vs. Dependency, and Word vs. Phrase.
Aspect-based Opinion Mining: results

Model              Precision  Recall  MSE   Perplexity
LDA                0.54       0.51    1.25  813.11
Separate           0.54       0.52    1.22  795.72
Dependency         0.58       0.55    1.18  748.26
Phrase             0.81       0.73    0.96  587.82
Separate-Phrase    0.83       0.73    0.93  335.02
Dependency-Phrase  0.87       0.78    0.85  131.80
Sentiment Classification
"The movie was fantastic!"
Settings: binary, multi-class, regression, ranking.
Popular datasets: IMDB (50,000 reviews; 3/10 classes), Stanford Sentiment Treebank (11,855 sentences; 5 classes).
Let's use Deep Learning!
Quick Quiz
y = σ(Wx + b)
What's the role / function of the bias b?
True / False: It's sufficient for symmetry breaking in a neural network to initialize all W to 0, provided biases are random.
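A quick numeric sketch of the bias question, with illustrative weights: without b, a sigmoid unit is pinned to 0.5 at x = 0 no matter what W is; the bias is what lets the unit shift its activation threshold.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def unit(x, w, b=0.0):
    """One sigmoid unit: sigma(w*x + b)."""
    return sigmoid(w * x + b)

print(unit(0.0, w=2.0))          # 0.5, regardless of w: no bias, no shift
print(unit(0.0, w=2.0, b=-3.0))  # well below 0.5: the bias moved the threshold
```

The quiz's True/False question is worth working through on paper with the gradient of this unit; the answer hinges on whether zero weights make the hidden units' gradients identical.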
Recurrent Neural Network for Sentiment Classification
h_i = RNN(h_{i-1}, x_i)
Vanilla RNN cell: h_i = tanh(W h_{i-1} + U x_i)
The input sequence can be very long, which makes training the vanilla cell hard. Gated cells help:
- LSTM: Long Short-Term Memory
- GRU: Gated Recurrent Unit
The GRU update:
r_i = σ(W_r x_i + U_r h_{i-1})
z_i = σ(W_z x_i + U_z h_{i-1})
h'_i = tanh(U(r_i ∘ h_{i-1}) + W x_i)
h_i = (1 - z_i) ∘ h'_i + z_i ∘ h_{i-1}
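The GRU equations above can be traced step by step in pure Python. For readability this sketch uses scalar state and input with illustrative weights; real cells use matrices and learned parameters.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(h_prev, x, Wr, Ur, Wz, Uz, W, U):
    """One GRU step, mirroring the update equations above (scalar case)."""
    r = sigmoid(Wr * x + Ur * h_prev)             # reset gate r_i
    z = sigmoid(Wz * x + Uz * h_prev)             # update gate z_i
    h_cand = math.tanh(U * (r * h_prev) + W * x)  # candidate state h'_i
    return (1 - z) * h_cand + z * h_prev          # interpolate old and new

h = 0.0
for x in [1.0, -0.5, 0.3]:  # a tiny input sequence
    h = gru_step(h, x, Wr=0.5, Ur=0.5, Wz=0.5, Uz=0.5, W=1.0, U=1.0)
print(h)  # final state, fed to a classifier in a sentiment model
```

Note how z interpolates between the old state and the candidate: with z near 1 the state is carried through unchanged, which is what lets gradients survive long sequences.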
IMDB results

Model                                   3 folds
Unigrams                                82.8
Unigrams and Osgood                     82.8
Unigrams and Turney                     83.2
Unigrams, Turney, Osgood                82.8
Lemmas                                  84.1
Lemmas and Osgood                       83.1
Lemmas and Turney                       84.2
Lemmas, Turney, Osgood                  83.8
Pang et al. 2002 (SVM on unigrams)      82.9
Hybrid SVM (Turney and Lemmas)          84.4
Hybrid SVM (Turney/Osgood and Lemmas)   84.6
LSTM                                    84.9
Stanford Sentiment Treebank results

Model                                     5 folds
LSTM                                      45.6
RNTN (Socher et al., 2013)                45.7
DCNN (Blunsom et al., 2014)               48.5
CNN-non-static (Kim, 2014)                48.0
DRNN (Irsoy and Cardie, 2014)             49.8
Dependency Tree-LSTM (Tai et al., 2015)   48.4
Constituency Tree-LSTM w. GloVe vectors   51.0
Other Important (Big) Topics
- Domain adaptation
- Bias in the system
Related materials
Survey:
- Opinion Mining and Sentiment Analysis, Bo Pang and Lillian Lee, 2008
Topic models for opinion mining:
- On the Design of LDA Models for Aspect-based Opinion Mining
- Modeling Online Reviews with Multi-grain Topic Models
Neural networks for sentiment classification:
- The Unreasonable Effectiveness of RNNs, Andrej Karpathy
- Recurrent Neural Networks Tutorial, WildML
- Understanding LSTM Networks, Christopher Olah
Thank you! Questions?
Backup Slides
Deep Learning for NLP
Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves. Fei Tian, Bin Gao, Di He, Tie-Yan Liu, 2016.
Deep Learning for NLP
\[
P(d_i \mid \alpha, \Theta)
= \int \mathrm{Dir}(\theta_i \mid \alpha) \prod_{j=1}^{N_i} \sum_{k=1}^{K} \theta_{ik}\, P(s_{ij} \mid k, \Theta)\, d\theta_i
= \int \mathrm{Dir}(\theta_i \mid \alpha) \prod_{j=1}^{N_i} \sum_{k=1}^{K} \theta_{ik} \prod_{t=1}^{T_{ij}} P(y_t \mid y_{t-1}, \ldots, y_1; k)\, d\theta_i \tag{3}
\]
Deep Learning for NLP
Can syntax help? Can structure help? How does syntactic structure guide the composition of the meaning of a sentence? How can we use it in deep learning for NLP?
Deep Learning for NLP
[figure: chain-structured vs. tree-structured LSTM]
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. Kai Sheng Tai, Richard Socher, Christopher D. Manning, 2015.
Deep Learning for NLP: Structural Depth
Deep Recursive Neural Networks for Compositionality in Language. Ozan Irsoy, Claire Cardie, 2014.
Deep Learning for NLP: Temporal Depth
[figure: deep recursive network over "that movie was cool"]
Deep Recursive Neural Networks for Compositionality in Language. Ozan Irsoy, Claire Cardie, 2014.
Character level?
In RNNs, we use randomly initialized word vectors. They don't capture lexical similarity: compose & composition.
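A quick sketch of what character-level information buys us: character n-gram overlap sees the shared stem of "compose" and "composition" that independent word vectors miss. Jaccard over trigrams is our illustrative choice of similarity, not anything from the slides.

```python
def char_ngrams(word, n=3):
    """Character n-grams with boundary markers, as a set."""
    padded = f"#{word}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b)

sim_related = jaccard(char_ngrams("compose"), char_ngrams("composition"))
sim_unrelated = jaccard(char_ngrams("compose"), char_ngrams("banana"))
print(sim_related, sim_unrelated)  # related pair scores higher
```

Character-aware embedding models exploit exactly this kind of sub-word overlap instead of treating each word as an atomic symbol.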