Rapidz
September 23, 2022, 2:39am
1
I am using the Eco-hotel Data Set from the UCI Machine Learning Repository.
I am trying to figure out how to count the frequency of certain words like "room" or "vacation" within each row. I figured out the code to make it work for columns, but I need it for rows.
There are 16 columns in this dataset, but I need the frequency of certain words for each row. If anyone could lend some insights, it would be greatly appreciated.
Here is my code:
library(tidyverse)
EcoResort %>%
  summarize(across(everything(), ~ sum(str_detect(., 'room'))))
Hello, I guess rowwise() and c_across() may help you.
I have created an imaginary dataset for the reprex.
PS: I am assuming the first column of the dataset is an id and the other columns are strings such as reviews.
library(tidyverse)
review1 <- c("clean room nice vacation", "empty mini bar", "nice hotel", "bad vacation", "nice view in the room")
review2 <- c("tidy room", "pool is nice", "nice vacation", "bad service", "awful breakfast")
df <- tibble(
  reviewid = 1:5,
  r1 = review1,
  r2 = review2
)
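word_freq_per_row <- function(df, query){
  if(!is.character(query)){stop("query must be character")}
  df %>%
    # treat each row as its own group, keeping the first column (the id)
    rowwise(1) %>%
    # 1 if the cell contains the query, 0 otherwise
    summarize(across(everything(), ~ sum(str_detect(., query)))) %>%
    # add up the matching cells in each row
    mutate(num_of_occurences = sum(c_across(everything()))) %>%
    select(reviewid, num_of_occurences)
}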
word_freq_per_row(df, "room")
# # A tibble: 5 x 2
# # Groups: reviewid [5]
# reviewid num_of_occurences
# <int> <int>
# 1 1 2
# 2 2 0
# 3 3 0
# 4 4 0
# 5 5 1
word_freq_per_row(df, "nice")
# # A tibble: 5 x 2
# # Groups: reviewid [5]
# reviewid num_of_occurences
# <int> <int>
# 1 1 1
# 2 2 1
# 3 3 2
# 4 4 0
# 5 5 1
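One caveat about the function above: str_detect() only reports whether a cell contains the word, so a review that mentions "room" twice still counts once, and "room" also matches inside "bathroom". If you need the actual number of occurrences, a rough sketch along these lines might work. Note that count_word_per_row(), its id_col and whole_word arguments, the n_occurrences column and the toy data are all made up for illustration, and I am assuming every column other than the id holds text.

library(dplyr)
library(stringr)

# Toy data, invented for this example: one id column plus text columns.
reviews <- tibble(
  reviewid = 1:3,
  r1 = c("clean room, big room", "empty mini bar", "nice bathroom"),
  r2 = c("tidy room", "pool is nice", "nice vacation")
)

# Hypothetical helper, not part of any package.
# id_col names the identifier column; all other columns are treated as text.
count_word_per_row <- function(df, query, id_col = "reviewid", whole_word = FALSE) {
  stopifnot(is.character(query), length(query) == 1)
  # fixed() matches the literal string anywhere in the cell.
  # With whole_word = TRUE the query is wrapped in \\b and read as a regex,
  # so it should not contain regex metacharacters.
  pat <- if (whole_word) regex(paste0("\\b", query, "\\b")) else fixed(query)
  text_cols <- setdiff(names(df), id_col)
  df %>%
    # str_count() counts matches per cell; rowSums() adds them across columns.
    mutate(n_occurrences = rowSums(
      across(all_of(text_cols), ~ str_count(.x, pat)),
      na.rm = TRUE  # treat missing cells as zero matches
    )) %>%
    select(all_of(c(id_col, "n_occurrences")))
}

room_any   <- count_word_per_row(reviews, "room")
room_whole <- count_word_per_row(reviews, "room", whole_word = TRUE)

With this toy data, room_any should count "bathroom" in row 3 as a match, while room_whole should not. Using rowSums() over across() also avoids rowwise(), which may be faster on a larger dataset, but I have not benchmarked it.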