
TextBlob VS VADER For Sentiment Analysis Using Python


TextBlob and VADER are two of the most widely used sentiment analysis libraries in Python. Unlike machine learning approaches to sentiment analysis, TextBlob (with its default analyzer) and VADER use a lexicon-based method. A lexicon maps words to sentiment scores, and the sentiment of a sentence is an aggregation of the scores of its terms.
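To make the lexicon idea concrete, here is a minimal sketch of a lexicon-based scorer. The lexicon, its scores, and the function name `toy_polarity` are made up for illustration; neither TextBlob nor VADER works exactly like this, and real lexicons contain thousands of scored entries.

```python
import re

# Hypothetical toy lexicon: word -> sentiment score between -1 and 1.
TOY_LEXICON = {"fantastic": 0.8, "good": 0.5, "bad": -0.5, "terrible": -0.8}


def toy_polarity(text):
    """Average the lexicon scores of the words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[word] for word in words if word in TOY_LEXICON]
    # Text without any lexicon word is treated as neutral.
    return sum(scores) / len(scores) if scores else 0.0


print(toy_polarity("A fantastic website with good tutorials"))
print(toy_polarity("A terrible day"))
print(toy_polarity("The sky is blue"))
```

Averaging is only one possible aggregation. A scorer could also sum the word scores and rescale the total, which changes how repeated words affect the result.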

Both libraries output a polarity score between -1 and 1, where -1 represents highly negative sentiment and 1 represents highly positive sentiment. A value near 0 represents neutral sentiment.
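If you need a label rather than a number, one option is to cut the score at a threshold. The sketch below uses ±0.05, the cutoff the VADER documentation suggests for its compound score. Applying the same cutoff to TextBlob is an assumption, and `label_sentiment` is a hypothetical helper name.

```python
def label_sentiment(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a coarse sentiment label."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    # Scores close to 0 are treated as neutral.
    return "neutral"


print(label_sentiment(0.5574))
print(label_sentiment(-0.3))
print(label_sentiment(0.01))
```

The right threshold depends on your data, so it is worth checking it against a few labeled examples before relying on it.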

A critical difference between TextBlob and VADER is that VADER is designed for social media text. It therefore puts a lot of effort into handling features that typically appear in social media content, such as emojis, repeated words, and emphatic punctuation (exclamation marks, for example).

In this article, we will compare the performance of TextBlob and VADER using sample sentences and see which one performs better!


Step 1: Import Libraries

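First, we need to install the vaderSentiment package. TextBlob must be installed as well (pip install textblob) if it is not already available in your environment.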

pip install vaderSentiment

You will see the output below once the installation succeeds. Note that your version might differ from mine.

Successfully installed vaderSentiment-3.3.2

Now import the packages for VADER and TextBlob.

# Import the packages for sentiment analysis
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from textblob import TextBlob

Step 2: Define Functions For VADER and TextBlob

Let’s create a separate scoring function for VADER and for TextBlob so that we can reuse them in every step below.

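# VADER: create the analyzer once and reuse it for every call
vader_sentiment = SentimentIntensityAnalyzer()
def vader_sentiment_scores(text):
    # polarity_scores returns a dict with 'neg', 'neu', 'pos', and 'compound'
    score = vader_sentiment.polarity_scores(text)
    # 'compound' is the normalized overall score between -1 and 1
    return score['compound']

# TextBlob
def textblob_sentiment_scores(text):
    textblob_sentiment = TextBlob(text)
    # sentiment.polarity is a float between -1 and 1
    score = textblob_sentiment.sentiment.polarity
    return score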

Step 3: Check Sentiment Difference

We get the sentiment score for a sentence using VADER and TextBlob separately. Both give us a positive sentiment score, and VADER has a higher value than TextBlob.

text = 'grabngoinfo.com is a fantastic website for step by step machine learning tutorials.'

print(f"Sentence: {text} \nVADER sentiment score: {vader_sentiment_scores(text)} \nTextBlob sentiment score: {textblob_sentiment_scores(text)}")

Output:

Sentence: grabngoinfo.com is a fantastic website for step by step machine learning tutorials. 

VADER sentiment score: 0.5574 

TextBlob sentiment score: 0.4
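The same print statement is repeated in every step of this tutorial. As an optional refactoring, the sketch below wraps the comparison in two small helpers. The names `compare_scores` and `format_report` are hypothetical, and the stand-in scorers exist only so the snippet runs on its own.

```python
def compare_scores(text, scorers):
    """Return {name: score} for each scoring function in `scorers`."""
    return {name: scorer(text) for name, scorer in scorers.items()}


def format_report(text, scores):
    """Build the same kind of report that the tutorial prints in each step."""
    lines = [f"Sentence: {text}"]
    lines += [f"{name} sentiment score: {score}" for name, score in scores.items()]
    return "\n".join(lines)


# Stand-in scorers that always return 0.0, used only to keep the sketch
# self-contained. In the tutorial, pass vader_sentiment_scores and
# textblob_sentiment_scores from Step 2 instead.
demo_scorers = {"VADER": lambda text: 0.0, "TextBlob": lambda text: 0.0}

text = "grabngoinfo.com is a fantastic website."
print(format_report(text, compare_scores(text, demo_scorers)))
```

Passing the scorers in as a dictionary also makes it easy to add a third library to the comparison later.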

Step 4: Check Impact of Capitalization

In this step, we use the same sentence and only change the word “fantastic” to upper case. VADER treats the capitalized version as stronger sentiment and increases the score from 0.5574 to 0.6523. TextBlob, in contrast, does not distinguish between the upper-case and lower-case versions of the word, so its score stays at 0.4.

text = 'grabngoinfo.com is a FANTASTIC website for step by step machine learning tutorials.'

print(f"Sentence: {text} \nVADER sentiment score: {vader_sentiment_scores(text)} \nTextBlob sentiment score: {textblob_sentiment_scores(text)}")

Output:

Sentence: grabngoinfo.com is a FANTASTIC website for step by step machine learning tutorials. 

VADER sentiment score: 0.6523 

TextBlob sentiment score: 0.4
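The two VADER scores can be reproduced by hand, which helps show where the increase comes from. The sketch below assumes that “fantastic” has a valence of 2.6 in the VADER lexicon, that an all-caps word in mixed-case text adds 0.733 to that valence, and that the compound score is the raw sum x rescaled as x / sqrt(x² + 15). These values are taken from my reading of the vaderSentiment source, so treat them as assumptions rather than a stable API.

```python
import math

FANTASTIC_VALENCE = 2.6  # assumed VADER lexicon entry for "fantastic"
CAPS_BOOST = 0.733       # assumed boost for an ALL-CAPS word in mixed-case text
ALPHA = 15               # assumed normalization constant


def normalize(raw_sum, alpha=ALPHA):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return raw_sum / math.sqrt(raw_sum * raw_sum + alpha)


lower = round(normalize(FANTASTIC_VALENCE), 4)
upper = round(normalize(FANTASTIC_VALENCE + CAPS_BOOST), 4)
print(lower)  # 0.5574
print(upper)  # 0.6523
```

Both values match the output above, which suggests that “fantastic” is the only word contributing to the VADER score of this sentence.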

Step 5: Check Impact of Repeated Words

In this step, we repeat the word “FANTASTIC” three times to see whether that affects the sentiment score. VADER increases the score from 0.6523 to 0.9325. TextBlob's score is effectively unchanged: the trailing digit in 0.4000000000000001 is floating-point rounding, not a real increase. The results show that VADER treats repeated words as stronger sentiment, while TextBlob does not account for the repetition.

text = 'grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials.'

print(f"Sentence: {text} \nVADER sentiment score: {vader_sentiment_scores(text)} \nTextBlob sentiment score: {textblob_sentiment_scores(text)}")

Output:

Sentence: grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials. 

VADER sentiment score: 0.9325 

TextBlob sentiment score: 0.4000000000000001
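One way to explain this difference is that VADER adds up word valences before rescaling, while TextBlob's default analyzer averages word polarities. The sketch below is a simplified model of both behaviors, not the libraries' actual code. It assumes a valence of 3.333 for each capitalized “FANTASTIC” in VADER (2.6 plus a 0.733 caps boost) and a polarity of 0.4 in TextBlob.

```python
import math


def vader_style(valences, alpha=15):
    """Sum the valences, then rescale the total into (-1, 1)."""
    total = sum(valences)
    return round(total / math.sqrt(total * total + alpha), 4)


def textblob_style(polarities):
    """Average the polarities, so repeating a word leaves the mean unchanged."""
    return sum(polarities) / len(polarities)


print(vader_style([3.333]))         # 0.6523
print(vader_style([3.333] * 3))     # 0.9325
print(textblob_style([0.4]))        # 0.4
print(textblob_style([0.4] * 3))    # 0.4000000000000001
```

The last line also shows where the odd trailing digit comes from: adding 0.4 three times and dividing by 3 does not give exactly 0.4 in binary floating point.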

Step 6: Check Impact of Punctuation

In this step, we change the final punctuation from a period to an exclamation mark. The sentiment score increases for both libraries: VADER goes from 0.9325 to 0.9359, and TextBlob from 0.4 to 0.4333.

text = 'grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials!'

print(f"Sentence: {text} \nVADER sentiment score: {vader_sentiment_scores(text)} \nTextBlob sentiment score: {textblob_sentiment_scores(text)}")

Output:

Sentence: grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials! 

VADER sentiment score: 0.9359 

TextBlob sentiment score: 0.43333333333333335
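Both increases can be reproduced with a little arithmetic. The sketch below assumes that VADER adds 0.292 to the valence sum for each exclamation mark (counting at most four) and that TextBlob's default analyzer multiplies the polarity of the last scored word before the “!” by 1.25. Both rules are my reading of the libraries' source code, and `punctuation_boost` is a hypothetical helper name.

```python
import math


def punctuation_boost(text, per_mark=0.292, max_marks=4):
    """Assumed VADER rule: each '!' adds a fixed amount, up to four marks."""
    return min(text.count("!"), max_marks) * per_mark


def normalize(raw_sum, alpha=15):
    """Rescale an unbounded valence sum into (-1, 1)."""
    return raw_sum / math.sqrt(raw_sum * raw_sum + alpha)


# VADER: three capitalized "FANTASTIC" words at an assumed 3.333 each.
raw_sum = 3 * 3.333
vader_period = round(normalize(raw_sum + punctuation_boost("tutorials.")), 4)
vader_exclaim = round(normalize(raw_sum + punctuation_boost("tutorials!")), 4)
print(vader_period, vader_exclaim)  # 0.9325 0.9359

# TextBlob: the mean of three polarities, with the last one boosted by 1.25.
polarities = [0.4, 0.4, 0.4 * 1.25]
textblob_exclaim = round(sum(polarities) / len(polarities), 4)
print(textblob_exclaim)  # 0.4333
```

Because VADER rescales the sum, the exclamation mark adds very little once the score is already close to 1.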

Step 7: Check Impact of Emojis

In this step, we add three emojis (a thumbs up, a star-struck face, and a heart) at the end of the sentence. People usually use these three emojis to express positive sentiment. However, VADER’s sentiment score did not increase, and TextBlob’s score even dropped back to 0.4, losing the gain from the exclamation mark in Step 6.

text = 'grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials!👍 🤩 ❤️'

print(f"Sentence: {text} \nVADER sentiment score: {vader_sentiment_scores(text)} \nTextBlob sentiment score: {textblob_sentiment_scores(text)}")

Output:

Sentence: grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials!👍 🤩 ❤️ 

VADER sentiment score: 0.9359 

TextBlob sentiment score: 0.4000000000000001

Now let’s try a grinning face emoji instead. TextBlob's score stays at the Step 6 value of 0.4333, while VADER's positive sentiment increases from 0.9359 to 0.9538.

text = 'grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials! 😁'

print(f"Sentence: {text} \nVADER sentiment score: {vader_sentiment_scores(text)} \nTextBlob sentiment score: {textblob_sentiment_scores(text)}")

Output:

Sentence: grabngoinfo.com is a FANTASTIC FANTASTIC FANTASTIC website for step by step machine learning tutorials! 😁 

VADER sentiment score: 0.9538 

TextBlob sentiment score: 0.43333333333333335
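A likely reason why one emoji moves the VADER score and another does not is that VADER replaces each emoji with a text description and then scores the description's words like any other words. The toy sketch below illustrates that idea only. The descriptions, the word scores, and the function names are hypothetical and are not taken from VADER's emoji lexicon.

```python
# Hypothetical emoji descriptions and word scores, for illustration only.
EMOJI_DESCRIPTIONS = {"😁": "grinning face", "👍": "thumbs up"}
TOY_LEXICON = {"fantastic": 0.8, "grinning": 0.6}


def expand_emoji(text):
    """Replace each known emoji with its text description."""
    for emoji, description in EMOJI_DESCRIPTIONS.items():
        text = text.replace(emoji, f" {description} ")
    # Collapse the extra whitespace introduced by the replacement.
    return " ".join(text.split())


def toy_score(text):
    """Sum the toy lexicon scores of the words in the expanded text."""
    words = expand_emoji(text).lower().split()
    return sum(TOY_LEXICON.get(word, 0.0) for word in words)


print(expand_emoji("fantastic 😁"))
print(toy_score("fantastic 😁"))
print(toy_score("fantastic 👍"))
```

In this toy setup the grinning face raises the score because “grinning” is in the lexicon, while the thumbs up changes nothing because neither “thumbs” nor “up” is scored.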

Summary

Based on the comparison above, VADER responds to more signals than TextBlob: it takes capitalization, repeated words, exclamation marks, and some emojis into account when scoring the text. In these examples, TextBlob reacted only to the exclamation mark.


