Replace words in a data frame

Pingi · August 23, 2021, 2:41pm

Hey community!

I am doing an LSTM analysis of tweets and am facing the following issue:

I want to replace the words in a data frame with the numeric value of the word-frequency of every word.

Therefore I used the following code:

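#LSTM

#wordcount

# install.packages(c("dplyr", "tidytext"))  # run once if the packages are missing

library(dplyr)     # %>%, count(), select(), left_join(), mutate()

library(tidytext)  # unnest_tokens()

prof.tm<-unnest_tokens(twitter, word, text)

word.freq<-prof.tm %>% count(word, sort = TRUE)

# rank column: 1 for the most frequent word, 2 for the next, and so on

word.freq<-cbind(word.freq,"nr"=seq_len(nrow(word.freq)))

word.freq2<-word.freq %>%

select(nr, word)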

tweet <- twitter$text

tweettxt <- data.frame(

stringsAsFactors = F,

tweetwords = (strsplit(tweet," ")[[1]])

)

# combine the two tables: column `n` will contain the frequencies, `nr` the ranks

tweetnum <- tweettxt %>%

left_join(word.freq,by=c('tweetwords'='word')) %>%

mutate (n = ifelse(is.na(n),0,n),

nr = ifelse(is.na(nr),Inf,nr))

tweetchar = paste("[",tweetnum$nr,"]",sep='',collapse = ' ')

Do you know how I can use this code for every tweet in the dataset and not only for one tweet?

And how can I create a dataset of the results and not only values?

I hope I have made my point clear, and I look forward to any help!

nirgrahamuk · August 23, 2021, 3:11pm

Hello.
Thanks for providing code; however, you haven't provided example data, so I recommend you take further steps to make it more convenient for other forum users to help you.

Format text to appear as code in the forum. This looks much better and keeps posts from becoming cumbersomely long. To do this, put a line of triple backticks before and after the code, like this:

```
my code goes here
```

Share data as code: use tools such as the datapasta package or the base function dput() to share a portion of your data in code form, i.e. in a form that can be copied from the forum and pasted into an R session.

For more details on these tips, see the reprex guide.
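
As an illustration of the data-sharing tip, here is a minimal sketch using `dput()`. The data frame `twitter_sample` and its contents are made up for the example. With the real data, one would presumably call something like `dput(head(twitter, 10))` instead.

```r
# Hypothetical stand-in for the real `twitter` data frame
twitter_sample <- data.frame(
  text = c("first example tweet", "second example tweet"),
  stringsAsFactors = FALSE
)

# Print R code that recreates the object;
# that printed output is what gets pasted into the forum post
dput(twitter_sample)
```

Other users can then assign the pasted output to a name in their own session and work with an identical copy of the sample.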

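
Returning to the original question, one possible way to handle every tweet at once is to keep a tweet identifier through tokenization and then collapse the ranks per tweet. This is only a sketch under assumptions: the `tweet_id` column and the three example tweets are hypothetical, and tokens come from `unnest_tokens()` (which lowercases and strips punctuation) rather than from `strsplit(tweet, " ")`, so the results can differ from the single-tweet code above.

```r
library(dplyr)
library(tidytext)

# Hypothetical example data; the real `twitter` data frame would be used instead
twitter <- data.frame(
  tweet_id = 1:3,
  text = c("the cat sat", "the dog sat down", "a cat"),
  stringsAsFactors = FALSE
)

# One row per word; `tweet_id` is carried along for every token
tokens <- twitter %>% unnest_tokens(word, text)

# Frequency table: `n` holds the counts, `nr` the rank by frequency
word.freq <- tokens %>%
  count(word, sort = TRUE) %>%
  mutate(nr = row_number())

# Attach frequency and rank to every token of every tweet
tweetnum <- tokens %>% left_join(word.freq, by = "word")

# Collapse back to one row per tweet, as a data frame rather than a single value
result <- tweetnum %>%
  group_by(tweet_id) %>%
  summarise(tweetchar = paste0("[", nr, "]", collapse = " "), .groups = "drop")

print(result)
```

Because `word.freq` is built from the same tokens, every word finds a match in the join, so the `is.na()` handling from the original code is not needed in this sketch. `tweetnum` keeps the long format (one row per word), which may be the more convenient starting point for building model input.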