Can someone help me finish this function:
# Packages used below
library(dplyr)
library(tidytext)
# Transformation
Function_Thingie <- function(input_text_df, input_reg_ex){
  # Takes a tibble with an id column named `id` and a text column named `text`.
  # `input_reg_ex` is a regular expression supplied by the user.
  # The function should keep only the rows of `input_text_df` whose `text`
  # matches `input_reg_ex`, then tokenize, remove stop words, and finally
  # return the 15 most frequent words.
  # Tokenizing: one row per word
  input_text_df %>%
    unnest_tokens(word, text)
}
# Side-effect
counts <- function(tokens){
  # Remove stop words first so they cannot appear in the top 15
  # (stop_words is the data set that ships with tidytext)
  top_15 <- tokens %>%
    anti_join(stop_words, by = "word") %>%
    # Count how often each word occurs, most frequent first
    count(word, sort = TRUE) %>%
    # Keep the 15 most frequent words
    slice(1:15)
  top_15
}
I have written part of the function, but I can't figure out how to do the rest. I think what I have so far is correct, but I am far from sure.
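
For reference, here is one way the steps described in the comments (filter by regex, tokenize, remove stop words, return the top 15) might fit together in a single pipeline. This is only a minimal sketch under some assumptions: dplyr (1.0.0 or later, for `slice_head`), stringr and tidytext are installed; the function name `top_words_matching`, the `n_words` argument and the example tibble are hypothetical and not part of the original task.

```r
library(dplyr)
library(stringr)
library(tidytext)

# Hypothetical name: filter, tokenize, drop stop words, count, keep the top rows
top_words_matching <- function(input_text_df, input_reg_ex, n_words = 15) {
  input_text_df %>%
    # Keep only the rows whose text matches the regular expression
    filter(str_detect(text, input_reg_ex)) %>%
    # One row per word; unnest_tokens lowercases tokens by default
    unnest_tokens(word, text) %>%
    # Drop stop words using the stop_words data set from tidytext
    anti_join(stop_words, by = "word") %>%
    # Count each word, most frequent first
    count(word, sort = TRUE) %>%
    # Keep at most n_words rows
    slice_head(n = n_words)
}

# Made-up example data with the expected `id` and `text` columns
example_df <- tibble(
  id = 1:3,
  text = c("Cats chase mice and cats sleep",
           "Dogs chase cats",
           "Birds sing songs")
)

# Only the first two rows contain "chase", so "birds" never reaches the counts
result <- top_words_matching(example_df, "chase")
print(result)
```

Filtering before tokenizing keeps unmatched rows out of the counts, and removing stop words before counting means they cannot take up slots in the top 15. Words tied at the cutoff are kept or dropped according to row order, so `slice_max(n, n = 15)` may be preferable if ties should all be kept.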