My data frame consists of one column containing many sentences.
I know I need to use this:
library(tidytext)
get_sentiments("bing")
I have spent considerable time following various tutorials to figure out the frequency of positive and negative words in each row. I am at a loss for what to do, as I have exhausted so much time already. Any help would be appreciated!
See the vignette of the tidytext package for some examples.
After reading the first few pages, I was able to put together the following code:
library(tidytext)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
sentiment_table <- get_sentiments("bing")
my_sentences <- data.frame(
  text1 = c(
    "This good man accuses him of erratic and strange behaviour",
    "Uncertainty is a tragic given these days"
  )
)
x <- my_sentences |>
  mutate(sentence_number = row_number()) |>
  unnest_tokens(word, text1) |>
  inner_join(sentiment_table, by = "word") |>
  print()
#>   sentence_number    word sentiment
#> 1               1    good  positive
#> 2               1 accuses  negative
#> 3               1 erratic  negative
#> 4               1 strange  negative
#> 5               2  tragic  negative
Created on 2022-09-24 with reprex v2.0.2
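The joined table has one row per matched word, so the per-row frequency asked about in the question still needs a counting step. A minimal sketch of one way to do that, assuming the same example data as above: count() from dplyr tallies the words per sentence and sentiment, and pivot_wider() from tidyr (an extra package not used in the code above) spreads the result into one column per sentiment. The name sentiment_counts is only illustrative.

```r
library(tidytext)
library(dplyr)
library(tidyr)  # assumption: tidyr is installed; it is only needed for pivot_wider()

my_sentences <- data.frame(
  text1 = c(
    "This good man accuses him of erratic and strange behaviour",
    "Uncertainty is a tragic given these days"
  )
)

sentiment_counts <- my_sentences |>
  mutate(sentence_number = row_number()) |>
  unnest_tokens(word, text1) |>
  inner_join(get_sentiments("bing"), by = "word") |>
  # one row per sentence/sentiment pair, with the number of matched words in n
  count(sentence_number, sentiment) |>
  # one column per sentiment; a sentiment that is absent from a sentence becomes 0
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0)

print(sentiment_counts)
```

Note that a sentence without any word from the lexicon drops out at the inner_join() step, so it will not appear in the result at all. If you need those rows as zeros, you would have to join the counts back onto the numbered sentences.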
Hello @Rapidz,
in the last part of my code, replace my_sentences with the name of your data frame
and replace text1 with the name of your column (the one that contains the text strings you want to categorize).
If this does not help, show us the code you are using, the first few lines of the input data.frame, and any messages you get. In that case, the guidance on producing a minimal reproducible example might help.