The problem arises with the mutate function when I try to add a new column: it returns only the else value for every row of the data frame, and I don't know why. I've applied the same function to austen_books and there it works. I've even converted the doc_id variable to a factor, but to no avail.
To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide to see how to create one:
That's not how the %in% operator works. Is this what you are trying to do?
doc_id <- c("Title A.txt",
"Title A.txt",
"Title B.txt",
"Title B.txt",
"Title B.txt")
df <- as.data.frame(doc_id)
df$category <- ifelse(grepl("A", doc_id), "A", "B")
print(df)
#>        doc_id category
#> 1 Title A.txt        A
#> 2 Title A.txt        A
#> 3 Title B.txt        B
#> 4 Title B.txt        B
#> 5 Title B.txt        B
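Since the original question was about mutate(), here is a minimal sketch of the same idea inside a dplyr pipeline. It assumes dplyr is installed; the toy data and the "Title A" pattern are stand-ins, so your real doc_id values and patterns will differ.

```r
library(dplyr)

# Toy data standing in for the real corpus (hypothetical values)
df <- data.frame(
  doc_id = c("Title A.txt", "Title A.txt", "Title B.txt"),
  stringsAsFactors = FALSE
)

# grepl() does the partial match; fixed = TRUE treats the pattern as a
# literal substring rather than a regular expression
df <- df %>%
  mutate(category = if_else(grepl("Title A", doc_id, fixed = TRUE), "A", "B"))

print(df)
```

With more than two categories, dplyr's case_when() could probably replace the nested if_else() calls.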
Because you're trying to do a partial match. %in% returns TRUE only where an element matches one of the values in the target vector exactly, as a whole string. For partial matches, you have to use grepl(), grep(), or a similar pattern-matching function.
# This works.
doc_id %in% "Title A.txt"
#> [1]  TRUE  TRUE FALSE FALSE FALSE
# This doesn't.
doc_id %in% "A"
#> [1] FALSE FALSE FALSE FALSE FALSE
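If the complete file names are known up front, %in% is still usable, as long as the right-hand side holds whole values rather than fragments. A small sketch, where a_files is a made-up lookup vector and not something from the original data:

```r
doc_id <- c("Title A.txt", "Title A.txt", "Title B.txt",
            "Title B.txt", "Title B.txt")

# Hypothetical vector of complete file names that belong to category A
a_files <- c("Title A.txt", "Another A title.txt")

# Exact, whole-string matching: TRUE where doc_id equals any element of a_files
category <- ifelse(doc_id %in% a_files, "A", "B")
print(category)
#> [1] "A" "A" "B" "B" "B"
```

This avoids pattern matching altogether, which can be safer when a short pattern such as "A" might accidentally match other titles.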