Deep Convolutional Neural Network based Approach for Aspect-based Sentiment Analysis


Lamei Xu, Jin Lin, Lina Wang, Chunyong Yin, Jin Wang
College of Information Engineering, Shanghai Maritime University, Shanghai, China
{lmxu, jinliu,

Abstract. Sentiment analysis is an important task in natural language processing and has a wide range of applications. This paper describes our deep learning approach to multilingual aspect-based sentiment analysis. Our model uses a deep convolutional neural network for both aspect extraction and aspect-based sentiment analysis. We treat aspect extraction as a multi-label classification problem, outputting probabilities over aspects parameterized by a threshold. To determine the sentiment towards an aspect, we concatenate an aspect vector with every word embedding and apply a convolution over the result. Experimental results show that our system performs comparably well on Yelp reviews.

Keywords: Convolutional Neural Network, Word2vec, Aspect-based sentiment analysis

1 Introduction

The advent of web technologies has created an unprecedented opportunity for online users to share and explain their experiences and opinions on many subjects. This information has become very valuable for companies, politicians, etc., who are interested in what users say about their products. Major research studies have adopted natural language processing and text mining techniques to better understand and process this kind of information. Such efforts have come to be known as opinion mining or sentiment analysis [1]. Aspect-level sentiment classification is a fundamental task in the field of sentiment analysis. It aims at extracting aspects from the review text and then inferring the sentiment polarity (e.g. positive, negative) of each aspect [2].
For example, a review of a restaurant is likely to discuss distinct aspects such as food, drink, service, and price, and a single product can trigger a positive opinion about one feature and a negative opinion about another. Recently, deep convolutional neural networks (CNNs), which use layers of convolving filters applied to local features, have demonstrated remarkable results for text classification and sentiment analysis [3]. Despite these results, they have largely gone untested for aspect-based sentiment analysis, particularly in the multilingual setting.

ISSN: ASTL Copyright 2017 SERSC

The outline of the paper is as follows. Section 2 describes recent approaches to aspect-based polarity analysis, Section 3 presents our method for aspect detection and aspect-based sentiment analysis, and Section 4 provides our preliminary test results.

2 Related Research

Aspect-based sentiment analysis has been the subject of several interesting works. Traditionally, aspect-based polarity analysis is split into an aspect extraction subtask and a sentiment analysis subtask. Previous approaches to aspect extraction framed the task as a multi-class classification problem and relied mostly on CRFs that leveraged a plethora of common features, e.g. NER, POS tagging, parsing, semantic analysis, bag-of-words, as well as domain-dependent ones, such as word clusters learnt from Amazon and Yelp data. Previous sentiment analysis approaches have used different classifiers with a wide range of features based on n-grams, POS, negation words, and a large array of sentiment lexica [4].

Approaches such as neural networks, which require less language-specific data and engineering, have become attractive. Several sentiment analysis submissions applied neural networks. The system by Kiritchenko [5] used various innovative linguistic features, publicly available sentiment lexicon corpora and automatically generated polarity lexicons, and achieved the best performance in polarity classification. Tang [6] uses a target-dependent LSTM to determine sentiment towards a target word, while Nguyen and Shirai [7] use a recursive neural network that leverages both constituency and dependency trees.

3 Methodology

3.1 Formulas

Our CNN model is an extension of the CNN-rand structure used by Kim [3]. The model takes a sentence as input, and we represent the sentence as the concatenation of its word embeddings, as described by Equation (1):

$$x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n \tag{1}$$

where $x_i \in \mathbb{R}^k$ is the $k$-dimensional word vector for the $i$-th word in a sentence of $n$ words, and $\oplus$ is the concatenation operator.
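As an illustration of Equation (1), the following minimal Python sketch builds the concatenated sentence representation from per-word vectors. The toy vocabulary, the embedding size k = 4, the random initialization and the function name `concatenate_embeddings` are assumptions made for the example and are not taken from our system.

```python
import random

random.seed(0)

k = 4  # toy embedding size, chosen only for readability
vocab = ["the", "food", "was", "great"]

# Hypothetical embedding table: one random k-dimensional vector per word.
embeddings = {w: [random.uniform(-1.0, 1.0) for _ in range(k)] for w in vocab}


def concatenate_embeddings(tokens):
    """Return x_{1:n}: the word vectors x_1 ... x_n joined end to end."""
    x = []
    for token in tokens:
        x.extend(embeddings[token])  # append the k values of x_i
    return x


tokens = ["the", "food", "was", "great"]
n = len(tokens)
x = concatenate_embeddings(tokens)

# Slicing the flat vector every k values recovers the individual word
# vectors, i.e. the rows of the equivalent n-by-k sentence matrix.
rows = [x[i * k:(i + 1) * k] for i in range(n)]
```

The flat vector `x` has n * k entries; viewing it as n rows of length k gives the sentence matrix form that is convenient when sliding a window over words.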
Generally, a convolution layer applies a filter with weights $w \in \mathbb{R}^{hk}$ to a window of $h$ words and generates a new feature $c_i$:

$$c_i = f(w \cdot x_{i:i+h-1} + b) \tag{2}$$

where $b \in \mathbb{R}$ is a bias term and $f$ is a non-linear function, here ReLU [8]. This filter is applied to each possible window of $h$ words in the sentence to generate a feature map:

$$c = [c_1, c_2, \ldots, c_{n-h+1}] \tag{3}$$

We apply a max-pooling operation [9] over the feature map and take the maximum value $\hat{c} = \max\{c\}$ as the feature corresponding to this filter. A softmax layer takes the concatenation of the maximum values of the feature maps produced by all filters and computes a probability distribution over the possible categories.

3.2 Hyper-parameters

We use the following hyper-parameters, which are similar to those of Kim [3]: filter lengths of 3, 4 and 5 for aspect extraction and of 4, 5 and 6 for sentiment analysis, a word embedding size of 200, a mini-batch size of 10, a maximum sentence length of 100 tokens, a dropout rate of 0.5, and a maximum $l_2$ norm of 3. Word embeddings are initialized with 300-dimensional word2vec vectors [10] trained on 100 billion words from Google News using the continuous bag-of-words architecture. Words not present in the set of pre-trained words are initialized randomly.

3.3 Aspect Detection

To extract aspect distributions from the corpus, we treat the problem as a multi-label classification problem and train a convolutional neural network to output probabilities over aspects, minimizing the cross-entropy loss. To choose the threshold, we define the probability $p$ of an aspect $a$ given a sentence $s$ as $p(a \mid s) = 1/n$ if $a$ appears in $s$ and $s$ contains $n$ aspects, and $p(a \mid s) = 0$ otherwise. We define a threshold $f$ and select aspects according to it; after training, we found that a threshold value of 5 performed best. Thus, all aspects that occur fewer than 5 times are mapped to an "other" aspect.

3.4 Sentiment Analysis

The sentiment analysis task is to detect the correct sentiment label for each aspect in a sentence; the possible sentiment labels are positive and negative. To this end, the word embeddings described in Section 3.2 can be used.
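The aspect-aware input used in this section, together with the convolution and max-pooling of Section 3.1, can be sketched roughly as follows. This is an illustrative pure-Python sketch under stated assumptions, not our implementation: the toy vocabulary, the embedding size k = 3, the window size h = 2, the random filter weights, the bias value and the aspect tokens "food" and "quality" are all hypothetical.

```python
import random

random.seed(0)

k = 3  # toy embedding size; word and aspect tokens share one space


def random_vector(size):
    return [random.uniform(-1.0, 1.0) for _ in range(size)]


# Hypothetical shared embedding table for word tokens and aspect tokens.
embeddings = {t: random_vector(k)
              for t in ["the", "pasta", "was", "cold", "food", "quality"]}


def aspect_vector(aspect_tokens):
    """Average the embeddings of the aspect's tokens."""
    vectors = [embeddings[t] for t in aspect_tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sentence_matrix(tokens, aspect_tokens):
    """Append the aspect vector to every word vector: n rows of size 2k."""
    a = aspect_vector(aspect_tokens)
    return [embeddings[t] + a for t in tokens]


def relu(z):
    return max(0.0, z)


def feature_map(matrix, w, b, h):
    """Compute c_i = ReLU(w . x_{i:i+h-1} + b) for every window of h rows."""
    features = []
    for i in range(len(matrix) - h + 1):
        window = [v for row in matrix[i:i + h] for v in row]  # flatten h rows
        features.append(relu(sum(wi * xi for wi, xi in zip(w, window)) + b))
    return features


tokens = ["the", "pasta", "was", "cold"]
aspect = ["food", "quality"]  # hypothetical aspect split into tokens
h = 2                         # toy window size

matrix = sentence_matrix(tokens, aspect)
w = random_vector(h * 2 * k)  # one filter spanning h rows of width 2k
c = feature_map(matrix, w, b=0.1, h=h)
c_hat = max(c)                # max-pooling over the feature map
```

In a full model, several such filters would each contribute one pooled value, and the concatenated values would feed the softmax layer; training the weights is outside the scope of this sketch.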
We embed the tokens of all aspects in the same embedding space as the word tokens so that they share the semantics of the embeddings. We then look up the embedding of every aspect token and average them to obtain the aspect vector [11]. In this way, the model can learn from aspects that share the same entity. Finally, the resulting aspect vector is concatenated with each word vector to produce a sentence matrix. The convolution, max-pooling and softmax layers of the convolutional neural network are applied to this matrix.

4 Experiments

In this section, we present and discuss the obtained results. We use the Yelp Dataset [12] reviews, which consist of restaurant and computer reviews, as summarized in Table 1.

Table 1. Statistics of the dataset.
Data    Domain         Positive sentences    Negative sentences
Yelp    Restaurants
Yelp    Computers

Experimental results are given in Table 2 and Fig. 1. We find that the feature-based SVM is an extremely strong performer and substantially outperforms the other baseline methods, which demonstrates the importance of a powerful representation for aspect-level sentiment classification. One of the major reasons why the polarity method did not perform better is that we adapted a method designed for identifying two categories (positive and negative) to data that has three categories (positive, negative and neutral).

Table 2. Model performance on the training dataset.
Model            Yelp Restaurants    Yelp Computers
Feature + SVM
LSTM
Our model

Fig. 1. Accuracy of models on test set.

5 Conclusions

In this paper, we described our novel approach to aspect-based sentiment analysis, covering aspect category identification, extraction of opinion target expressions and polarity identification, which employs a convolutional neural network for aspect extraction and sentiment analysis. To advance our model, further work is needed on better identification of neutral cases. We will also explore how our CNN systems can be further enhanced.

References

1. Bing Liu. Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.
2. Pang, B., Lee, L. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2008, 2(1-2).
3. Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014).
4. M. Pontiki, D. Galanis, H. Papageorgiou, S. Manandhar, and I. Androutsopoulos. SemEval-2015 Task 12: Aspect based sentiment analysis. In Proceedings of SemEval 2015, pages 27-35, Denver, Colorado.
5. Kalchbrenner, N., Grefenstette, E., Blunsom, P. A Convolutional Neural Network for Modelling Sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014.
6. Duyu Tang, Bing Qin, Xiaocheng Feng, and Ting Liu. Target-Dependent Sentiment Classification with Long Short Term Memory. arXiv preprint.

7. Thien Hai Nguyen and Kiyoaki Shirai. PhraseRNN: Phrase recursive neural network for aspect-based sentiment analysis. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015.
8. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint.
9. Nair and Hinton.
10. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality.
11. Socher, R., Perelygin, A., Wu, J. Y., Chuang, J. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP.
12. Go, A., Bhayani, R., Huang, L. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, 2009.
13. Min-Ling Zhang and Zhi-Hua Zhou. Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization. IEEE Transactions on Knowledge and Data Engineering, 18(10).
