White Paper: Using Sentiment Analysis for Gaining Actionable Insights


Sentiment analysis is a growing business trend that allows companies to better understand their brand, products, and services by analyzing the attitudes, opinions, and emotions expressed by an online audience.

Author: Olena Domanska, Data Science Engineer, CoreValue

Using sentiment analysis for gaining actionable insights

Consumer opinions undoubtedly affect a company's reputation and should be of high interest to businesses, as they can prove to be extremely valuable assets. Actionable insights give businesses an advantage over their competition and help them maintain a competitive edge in the market. Today it is easy for consumers to loudly express their satisfaction and their frustration with a company or a product through social media, forums, blogs, and review platforms, which can greatly impact public opinion. Sentiment analysis allows businesses to analyze public opinion about a product or service in order to unlock the hidden value contained within. This information, when used correctly, enables them to make better informed business decisions.

The notion of sentiment analysis

Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information found in source materials. By harnessing the power of sentiment analysis and wrangling all the opinion-related information it contains, businesses can extract tremendous value and use it to their advantage. This data mining requires significant effort, however, as it involves comparing products and services, defining subjectivity and probability, classifying emotional components, reasoning about opinions, and summarizing. In layman's terms, a sentiment analysis engine lurks in social platforms, processes tons of unrestricted data, and derives actionable insights that are directly related to business results.

Core techniques of sentiment analysis

There are two fundamental ways to approach sentiment analysis: supervised and unsupervised (or lexicon-based).

The lexicon-based approach rests on the assumption that the contextual sentiment orientation of a text can be calculated by summing up the sentiment scores of its separate words or phrases. Essentially, this technique relies on external lexical resources that map words to a categorical class (positive, negative, neutral) or to a numerical sentiment score. As a result, its effectiveness strongly depends on the quality and adequacy of the chosen resource. While the obvious advantage of the approach is avoiding the arduous step of labeling training data, one must also be aware of its limitations. For example, a word associated with a positive or negative sentiment may have the opposite orientation in a different application domain, and a sentence containing sentiment words may not express any sentiment at all (as in interrogative and conditional sentences). Sentences with a sarcastic tone often warp the polarity of sentiment words, and many sentences without sentiment words can still imply opinions.

Supervised techniques, on the other hand, work with the notion of training data. Specifically, training samples and the corresponding output values are fed into the algorithm before it is applied to the actual data set. This enables the algorithm to handle new, unknown data in the future and to provide more accurate sentiment classification in the specific domains for which it has been trained. The most common supervised learning methods are Naive Bayes classification and Support Vector Machines (SVM), although researchers apply many others as well, including maximum entropy, random forest, neural networks, and regression trees.
Recent work in the area suggests that supervised approaches tend to outperform unsupervised ones. But is this really true? In this article, we try to verify this assumption on real data.

Sentiment analysis in action

Theory aside, the real questions seem to be: How effective is sentiment analysis in practice? And which approach is more accurate, supervised or unsupervised? To figure this out, we decided to analyze all the available reviews of HubSpot, one of the popular marketing automation platforms, from a natural language processing perspective. The script for the following analysis can be found on GitHub.

To perform our analysis, we began by closely examining the data we collected in order to discover the most frequently used words, and we built associations between them to identify clusters and themes within the reviews. We then examined how review topics changed over time. Finally, we identified the sentiments of consumer opinions by applying unsupervised and supervised methods in turn, comparing how each performed on our real data.

Exploratory Phase

Here is a sample of the data we gathered:

"HubSpot is our main marketing platform. It's currently used to automate our marketing programs, including email marketing, landing pages, social media and our blog. We also use the tool to score leads, and automate our lead nurturing process. It's easy to measure the success of our programs through the reporting. HubSpot is great for automating workflow emails, creating new campaigns, landing page creation and compiling lists. They have a great training program for gearing up with HubSpot..."

Each review was scored by its reviewer on a scale from 1 to 10. As the distribution of these scores shows, it is strongly left-skewed, with a distinct peak at the highest value, so it is interesting to see how algorithms perform on such unbalanced real data.

To start off, we determined the most frequently used words by building a word cloud with the help of Wordle; the screenshot below illustrates the results.
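The word cloud itself was produced with the Wordle web tool, but the term frequencies that feed it can be computed directly in R with the tm and wordcloud packages. The following is only an illustrative sketch, not the code from the GitHub script; the data frame reviews and its text column are assumptions about how the scraped reviews are stored.

library(tm)         # text cleaning and document-term matrices
library(SnowballC)  # stemming support used by tm
library(wordcloud)  # local stand-in for the Wordle web tool

# Assumed input: a data frame `reviews` with one review per row in `reviews$text`
corpus <- Corpus(VectorSource(reviews$text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument)
corpus <- tm_map(corpus, stripWhitespace)

# Total occurrences of every term across all reviews
dtm  <- DocumentTermMatrix(corpus)
freq <- sort(colSums(as.matrix(dtm)), decreasing = TRUE)

# Word cloud of the 100 most frequent terms
wordcloud(names(freq), freq, max.words = 100, random.order = FALSE)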

After examining the word cloud, we concluded that people mainly discuss the features of HubSpot's inbound marketing platform and describe them with words such as easy, great, and amazing. Even though word clouds give us an understanding of which words are most popular in reviews, they do not tell us the numerical proportions of their occurrence frequencies. To do this, we built a so-called document-term matrix, which shows which terms each review contains and how often they appear. Summing the occurrences of each term across all reviews, we get the following histogram:
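In code, such a matrix and the term totals behind this histogram can be obtained roughly as follows; this sketch reuses the corpus built in the previous block, so the same assumptions apply.

# One row per review, one column per term; entries are raw term counts
dtm <- DocumentTermMatrix(corpus)

# Summing each column over all reviews gives the per-term totals
# (e.g. hubspot, market, use, tool) discussed in the text
term_totals <- sort(colSums(as.matrix(dtm)), decreasing = TRUE)
head(term_totals, 10)
barplot(head(term_totals, 20), las = 2, main = "Most frequent terms")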

The most frequently used words in the review texts are hubspot (2531 occurrences), market (1418), use (1393), and tool (690). But how are these words connected to each other? To answer this, we built the following graph, illustrating the associations between these words; the thicker the line connecting two words, the higher the probability of their co-occurrence in a review. We see that the word groups hubspot, lead, tool; hubspot, can, help; and email, content, page are usually present together. At this point, the content of the reviews starts to become clearer.
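Such an association graph can be approximated with tm's correlation utilities. A minimal sketch, reusing the dtm from above: the correlation thresholds are illustrative assumptions, and drawing the graph requires the Rgraphviz package.

# Terms whose per-review counts correlate with the most frequent words
findAssocs(dtm, terms = c("hubspot", "market", "use", "tool"), corlimit = 0.2)

# Optional co-occurrence graph of frequent terms, similar to the figure above
plot(dtm, terms = findFreqTerms(dtm, lowfreq = 200), corThreshold = 0.15)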

Next, we thought it would be interesting to find out which directions of discussion are hidden in the review texts.

Topics Identification

To identify topics, we grouped the reviews into clusters using hierarchical clustering: we first performed an agglomerative (bottom-up) clustering of the reviews and then constructed the tree, keeping only a few of the uppermost clusters. It was necessary to pick a threshold level to form the groups, so we went with the simplest and most popular solution, which is to inspect the dendrogram.

Hierarchical Clustering: 30 uppermost clusters of reviews

In our case, a threshold at the 0.4 level seemed to be a reasonable choice. It revealed 3 clusters, depicted by black, green, and red boxes containing 73, 12, and 15 percent of the reviews, respectively, meaning that 73 out of every 100 reviewers discussed the same topic. But what was the topic they discussed? We determined the topic of each cluster based on probabilistic modeling of term frequency. Below are a few of the most frequently used words for each topic:

Topic 1: "marketing, hubspot, tool, customer, lead, inbound, sale"
Topic 2: "hubspot, email, social, page, blog, content, website, manage"
Topic 3: "hubspot, time, can, make, help, need, get"

On the basis of these word lists, we inferred the theme of the reviews within each cluster. For instance, the first group of reviews concentrates on the idea that HubSpot is a leading marketing tool for increasing customer sales, while the second group is devoted to discussing HubSpot tools such as the social media and blog post publisher, the landing page creator, content management, and website visitor tracking.
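For reference, the clustering step could look roughly like the sketch below. The paper does not state the exact distance measure, so cosine distance is assumed here, and the sparsity and cut levels are illustrative.

library(proxy)   # supplies a cosine distance method for dist()

# Reviews as term-count vectors, with very rare terms dropped
dtm_small <- removeSparseTerms(dtm, sparse = 0.95)
m <- as.matrix(dtm_small)

d  <- dist(m, method = "cosine")          # pairwise distances between reviews
hc <- hclust(d, method = "ward.D2")       # agglomerative (bottom-up) clustering
plot(hc, labels = FALSE)                  # dendrogram used to pick the threshold

clusters <- cutree(hc, h = 0.4)           # cut at the 0.4 level, as in the text
round(100 * prop.table(table(clusters)))  # share of reviews per cluster

# Most frequent terms per cluster, as a rough label for its topic
for (k in sort(unique(clusters))) {
  counts <- colSums(m[clusters == k, , drop = FALSE])
  cat("Topic", k, ":", names(sort(counts, decreasing = TRUE))[1:7], "\n")
}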

Let's now look at the trending topics across the three-year time frame. According to the plot, there were just a few reviews of HubSpot until the middle of 2013, with topic 1 remaining prevalent for yet another year. The highest concentration of reviews occurs at the beginning of 2015, followed by a slowdown that still continues.

We then proceeded to the sentiment analysis of the reviews, applying unsupervised and supervised approaches and comparing their accuracy.

Sentiment Analysis

We began with the unsupervised (lexicon-based) approach, which estimates a review's sentiment by counting the occurrences of "positive" and "negative" words using Hu and Liu's opinion lexicon. It categorizes around 6,800 words as positive or negative and is available for download. Other useful resources for lexicon-based sentiment analysis include the MPQA Subjectivity Lexicon, SentiWordNet, and SenticNet. To assign a numeric score to each review, we simply subtract the number of negative words from the number of positive words that occur.

A new question arose: should we take the length of the review into account? Consider two reviews. The first is simply "fantastic" (one word long, so its sentiment score is equal to one), while the second is several pages long, expressing both positive and negative thoughts about the product, yet its total score is also equal to one. Obviously, we should rank these reviews differently. One way to take this peculiarity into account is to normalize the score by the length of the review. To settle the question, we calculated scores for the HubSpot reviews both with and without normalization and evaluated the sum of squared errors (SSE) in each case. As it turns out, normalization cut the SSE in half (see the details on GitHub).

Based on these arguments, we carried out our analysis in three steps: count the sentiment score for each review, normalize the score by the review length, and map the obtained scores to the interval [1, 10], rounding them, since every review has a rating assigned by its reviewer in the range of 1 to 10. As a result, we obtained a 10-class classification problem (NORMAL formulation). However, sentiment classification is usually formulated as a two-class problem, positive versus negative (BINARY formulation), where a review with a rating from 1 to 4 is considered negative and a review with a rating from 5 to 10 is considered positive. It is also possible to introduce a neutral class and consider a three-class problem, in which a review rated 1 to 3 is negative, 4 to 6 neutral, and 7 to 10 positive (BASIC formulation). We obtained the corresponding distributions of review classes for each of these formulations.

After comparing the distribution of review scores assigned by reviewers with the distributions of scores obtained with the lexicon-based approach, we concluded that the unsupervised (lexicon-based) technique performs well only in the case of binary classification, where its accuracy reaches 0.69; the normal and basic formulations achieve accuracies of only 0.0065 and 0.19, respectively. This reconfirms the findings of other researchers that the accuracy of classification algorithms depends significantly on the number of classes considered. Taking into account that, according to a series of experiments on Mechanical Turk, human annotators only agree 79% of the time, the algorithm gives a competitive result in the two-class case. From a quantitative perspective, it found that 68.8% of people were positive about the HubSpot product.
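A condensed sketch of the lexicon-based scoring described above is given here. The file names are those of Hu and Liu's opinion lexicon; the linear rescaling to [1, 10] and the column reviews$rating are assumptions, so the exact mapping in the original GitHub script may differ.

# Hu & Liu opinion lexicon: plain word lists, header lines start with ';'
pos_words <- scan("positive-words.txt", what = "character", comment.char = ";")
neg_words <- scan("negative-words.txt", what = "character", comment.char = ";")

# Sentiment score of one review: (#positive - #negative) words,
# optionally normalized by review length as discussed above
score_review <- function(text, normalize = TRUE) {
  words <- unlist(strsplit(gsub("[^a-z ]", " ", tolower(text)), "\\s+"))
  words <- words[nchar(words) > 0]
  s <- sum(words %in% pos_words) - sum(words %in% neg_words)
  if (normalize && length(words) > 0) s <- s / length(words)
  s
}

scores <- sapply(reviews$text, score_review)

# Map raw scores onto the reviewers' 1..10 scale (NORMAL formulation)
predicted <- round(1 + 9 * (scores - min(scores)) / (max(scores) - min(scores)))

# BINARY formulation: ratings 1-4 negative, 5-10 positive
pred_bin   <- ifelse(predicted >= 5, "positive", "negative")
actual_bin <- ifelse(reviews$rating >= 5, "positive", "negative")
mean(pred_bin == actual_bin)   # accuracy of the lexicon-based classifier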

Now let's consider the supervised approach. We start with a Naive Bayes classifier, which applies Bayes' theorem to predict the class of a given text using a number of previously classified samples of the same type. We divide our reviews into two groups, train and test, in order to evaluate the accuracy of the method on the test set. As with the lexicon-based approach, we consider the NORMAL, BASIC, and BINARY formulations. The accuracy of the method turns out to be 0.63, 0.99, and 1 for the normal, basic, and binary formulations, respectively. In other words, compared to the unsupervised method, one of the simplest supervised models, the Naive Bayes classifier, was able to reach up to 100% accuracy on our skewed data.

In the R landscape, the excellent RTextTools package was developed by Timothy P. Jurka and colleagues for automatic text classification. The package includes nine algorithms for ensemble classification and is designed to conduct supervised learning in fewer than 10 steps. For sentiment classification of the HubSpot reviews, we chose five of the available algorithms: support vector machine (SVM), maximum entropy (MAXENT), random forest (RF), classification/regression tree (TREE), and neural networks (NNET), and ran them for every formulation: NORMAL, BASIC, and BINARY. All of these methods showed almost 100% accuracy in the case of two or three classes and accuracies in the interval [0.59, 0.72] in the case of 10 classes. The comparison of the obtained results can be seen in the plot below; a rough sketch of the supervised pipeline follows.
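The sketch below shows one plausible way to wire this up for the BINARY formulation, combining e1071's Naive Bayes with the five RTextTools algorithms named above. The 80/20 split, the label encoding, and the reviews data frame are assumptions rather than the original script; RTextTools has at times been archived on CRAN, so it may need to be installed from the archive.

library(RTextTools)  # create_matrix, create_container, train_models, ...
library(e1071)       # naiveBayes

labels <- ifelse(reviews$rating >= 5, 2, 1)   # BINARY: 1 = negative, 2 = positive
n      <- length(labels)
split  <- round(0.8 * n)                      # assumed 80/20 train/test split

doc_matrix <- create_matrix(reviews$text, language = "english",
                            removeStopwords = TRUE, stemWords = TRUE)
container  <- create_container(doc_matrix, labels,
                               trainSize = 1:split, testSize = (split + 1):n,
                               virgin = FALSE)

# Naive Bayes baseline on the same document-term matrix
m  <- as.matrix(doc_matrix)
nb <- naiveBayes(m[1:split, ], as.factor(labels[1:split]))
nb_pred <- predict(nb, m[(split + 1):n, ])
mean(as.character(nb_pred) == as.character(labels[(split + 1):n]))  # NB accuracy

# The five RTextTools algorithms used in the paper
models    <- train_models(container, algorithms = c("SVM", "MAXENT", "RF", "TREE", "NNET"))
results   <- classify_models(container, models)
analytics <- create_analytics(container, results)
summary(analytics)                            # per-algorithm precision, recall, accuracy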

Conclusions

In this article, we provided a thorough comparison of unsupervised and supervised approaches to sentiment analysis, using HubSpot platform reviews as an example. Specifically, we used and evaluated the results of seven different models: the lexicon-based approach, the Naive Bayes classifier, support vector machines, maximum entropy, random forest, classification trees, and neural networks. Our examination shows that when the data is skewed, both the lexicon-based (unsupervised) and the machine learning (supervised) techniques perform very well in terms of accuracy in the case of binary classification. As expected, the machine learning methods outperformed the lexicon-based method.

Overall, sentiment analysis proves to be a relatively simple and effective tool for extracting valuable opinion-based information from source data. This creates the potential for further growth of sentiment analysis by expanding its usage into new areas, much to the benefit of the businesses that harness the power of this method.

About CoreValue

CoreValue, a software and technology services firm headquartered in New Jersey with development labs in Eastern Europe, provides mobility and traditional cloud-based CRM implementation services and mobile applications in the Pharmaceutical, Medical, Media, and Telecommunication verticals. Customers trust CoreValue to provide infrastructure services utilizing premier staff in Data Science, Data Management, Database Services, Quality Assurance, and traditional development.

CoreValue Services
18 Overlook Ave, Suite 9
Rochelle Park, NJ 07662
908-312-4070
info@corevalue.net