Emoji Sentiment Analysis in R

I am working on a project that uses tweets containing emojis and emoticons. My main goal is to get a combined sentiment score for each tweet (text + emoticons). These emoticons are probably the most meaningful part of the data, so they cannot be neglected. I have converted the encoding of the emojis and emoticons with iconv, but I only get a sentiment score for the text, not for the emojis. I am using VADER sentiment for this, but if another sentiment library or lexicon can also score the emojis, a pointer to it would be highly appreciated.

Tweets:

dput(df_emoji$Description)
c("DoorDash or Uber method asap<f0><9f><98><ad> cause I be starving<f0><9f><98><ad><f0><9f><98><ad>", 
"such a real ahh niqq cuz I be having myself weak asl<f0><9f><98><82>", 
"shii made me laugh so fuccin hard bro<f0><9f><98><82><f0><9f><98><82><f0><9f><98><82><f0><9f><98><82>", 
"Hart and Will Ferrell made a Gem in Get hard fr<f0><9f><98><82><f0><9f><98><82><f0><9f><98><82>", 
"@NigerianAmazon Chill<f0><9f><a4><a3><f0><9f><98><ad>", "so bomedy <f0><9f><98><82><f0><9f><98><82><f0><9f><98><82>", 
"is that ass Gotdam<f0><9f><98><82><f0><9f><98><82><f0><9f><98><82>", 
"wild<f0><9f><98><82><f0><9f><98><82><f0><9f><98><82>", 
"them late night DoorDash<e2><80><99>s be goin crazy<f0><9f><a4><a3>", 
"of the week<f0><9f><98><82><f0><9f><98><82><f0><9f><98><82><f0><9f><98><82>"
)

Code:

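library(tidyr)  # separate()
library(vader)  # vader_df()

# Re-encode to ASCII; every non-ASCII byte becomes a "<xx>" hex escape
emoji_senti <- data.frame(text = iconv(data_sample$text, "latin1", "ASCII", "byte"), 
                      stringsAsFactors = FALSE)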
column1 <- separate(emoji_senti, text, into = c("Bytes", "Description"), sep = "\\ ")
column2 <- separate(emoji_senti, text, into = c("Bytes", "Description"), sep = "^[^\\s]*\\s")
df_emoji <- data.frame(Bytes = column1$Bytes, Description = column2$Description)

# Score each tweet with VADER and stack the per-tweet results
allvals_emoji <- NULL
for (i in seq_along(df_emoji$Description)) {
  outs <- vader_df(df_emoji$Description[i])
  allvals_emoji <- rbind(allvals_emoji, outs)
}
allvals_emoji

Note that the first tweet has only 9 English words, each of which gets a score, but the byte-converted emojis are not scored at all.

    #                                   word_scores compound   pos   neu   neg but_count
    # 1                 {0, 0, 0, 0, 0, 0, 0, 0, 0}    0.000 0.000 1.000 0.000         0
    # 2  {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1.9, 0, 0}   -0.440 0.000 0.805 0.195         0
    # 3        {0, 0, 0, 2.6, 0, 0, -0.67835, 0, 0}    0.444 0.293 0.570 0.137         0
    # 4        {0, 0, 0, 0, 0, 0, 0, 0, 0, -0.4, 0}   -0.103 0.000 0.877 0.123         0
    # 5                                      {0, 0}    0.000 0.000 1.000 0.000         0
    # 6                                {0, 0, 0, 0}    0.000 0.000 1.000 0.000         0
    # 7                          {0, 0, -2.5, 0, 0}   -0.542 0.000 0.533 0.467         0
    # 8                                      {0, 0}    0.000 0.000 1.000 0.000         0
    # 9                       {0, 0, 0, 0, 0, 0, 0}    0.000 0.000 1.000 0.000         0
    # 10                               {0, 0, 0, 0}    0.000 0.000 1.000 0.000 
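
One thing I am considering, as a rough sketch only: in the output above the byte escapes are glued to the preceding word (for example "starving<f0><9f><98><ad>" is counted as a single token), so the emoji and the word next to it both end up unscored. The snippet below swaps each escape for a plain English word, surrounded by spaces, before scoring. The `emoji_map` lookup and the `replace_emoji_bytes()` helper are hypothetical names I made up, and the replacement words are my own guesses rather than an official emoji lexicon. Whether those words actually carry a score depends on the lexicon in use.

```r
# Hypothetical lookup table: byte-escaped emoji -> plain word (my own choice of words).
emoji_map <- c(
  "<f0><9f><98><82>" = " laughing ",  # U+1F602 face with tears of joy
  "<f0><9f><98><ad>" = " crying ",    # U+1F62D loudly crying face
  "<f0><9f><a4><a3>" = " laughing "   # U+1F923 rolling on the floor laughing
)

# Hypothetical helper: replace every known escape, then tidy up whitespace.
replace_emoji_bytes <- function(x, map) {
  for (pattern in names(map)) {
    # fixed = TRUE treats "<f0>..." as a literal string, not a regex
    x <- gsub(pattern, map[[pattern]], x, fixed = TRUE)
  }
  # Collapse the runs of spaces introduced by the replacements
  trimws(gsub("\\s+", " ", x))
}

tweets <- c(
  "wild<f0><9f><98><82><f0><9f><98><82>",
  "@NigerianAmazon Chill<f0><9f><a4><a3><f0><9f><98><ad>"
)
cleaned <- replace_emoji_bytes(tweets, emoji_map)
print(cleaned)
```

The cleaned strings could then be passed to vader_df() in place of df_emoji$Description. Escapes that are not in the map (such as <e2><80><99>, which is a curly apostrophe) are left untouched, so the table would need an entry for every emoji that appears in the data.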
