Hi,
I have this simple data frame:
source <- data.frame(stringsAsFactors = FALSE,
  URN = c("test1", "test2", "test3", "test4", "test5",
          "test6", "test7", "test8", "test9", "test10", "test11"),
  Test.heading = c("goede uitleg", "Goede sfeer",
                   "juiste behandeling", NA, "vakkundige afhandeling",
                   "No comment", NA, NA, "goede uitleg", "goede uitleg", "-"),
  Recommendation.comment = c("-", "xxx", NA, NA, "zzz", "uw verkoper",
                             "correcte verkoop", "goede uitleg", "professioneel !!!",
                             "Goeie service", "nee"),
  Other.comment = c("ab", "zeer goede uitleg over de aankoop",
                    "uitleg en vriendelijkheid van de verkoper", NA,
                    "Bob gaf heel goede uitleg!", NA, "eerder genoemd", NA, NA,
                    "geen opmerkingen\r\n", "Genoeg info")
)
Now I need to merge all string variables into one. I know I can do that this way:
source$comment <- paste(source$Test.heading, source$Recommendation.comment, source$Other.comment, sep = ", ")
but I would like to:
- Ignore NAs
- Ignore blank statements matching this pattern:
blank_statements <- regex("no\\scomment?|nee|neen|^\\s*n.?a.?\\s*$", ignore_case = TRUE)
- Ignore comments with fewer than 3 characters:
str_length(string = source$Test.heading) < 3
str_length(string = source$Recommendation.comment) < 3
str_length(string = source$Other.comment) < 3
- Ignore comments with the same character repeated multiple times:
str_detect(source$Test.heading, "(.)\\1{3,}")
str_detect(source$Recommendation.comment, "(.)\\1{3,}")
str_detect(source$Other.comment, "(.)\\1{3,}")
I would like to apply these rules to all source string variables before merging them into one called 'All Comments'. Basically, the merged string should not contain any of the words and statements mentioned above, so:
• in row 1, instead of "goede uitleg, -, ab", we should have "goede uitleg",
• in row 2, instead of "Goede sfeer, xxx, zeer goede uitleg over de aankoop", we should have "Goede sfeer, zeer goede uitleg over de aankoop",
• in row 8, instead of "NA, goede uitleg, NA", we should have "goede uitleg",
• in row 11, instead of "-, nee, Genoeg info", we should have "Genoeg info", etc.
Can you help please?
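A minimal sketch of one way the rules above could be combined, not a definitive solution. It uses base R equivalents (trimws, nchar, grepl) instead of stringr so it runs without extra packages. The helper name clean_comments and the column name All.Comments are hypothetical. Two patterns are adjusted as assumptions: the blank pattern is anchored to the whole comment, because an unanchored "nee" would also match inside "professioneel", and the repeated-character pattern is "^(.)\\1+$", because "(.)\\1{3,}" needs four identical characters and would not catch "xxx".

```r
# Sample data from the post.
source <- data.frame(
  stringsAsFactors = FALSE,
  URN = paste0("test", 1:11),
  Test.heading = c("goede uitleg", "Goede sfeer", "juiste behandeling", NA,
                   "vakkundige afhandeling", "No comment", NA, NA,
                   "goede uitleg", "goede uitleg", "-"),
  Recommendation.comment = c("-", "xxx", NA, NA, "zzz", "uw verkoper",
                             "correcte verkoop", "goede uitleg",
                             "professioneel !!!", "Goeie service", "nee"),
  Other.comment = c("ab", "zeer goede uitleg over de aankoop",
                    "uitleg en vriendelijkheid van de verkoper", NA,
                    "Bob gaf heel goede uitleg!", NA, "eerder genoemd", NA, NA,
                    "geen opmerkingen\r\n", "Genoeg info")
)

# Assumption: anchored to the whole comment so that words which merely
# contain "nee" (e.g. "professioneel") are kept.
blank_pattern <- "^\\s*(no\\scomment?|nee|neen|n.?a.?)\\s*$"

# Assumption: a comment made of one character repeated (e.g. "xxx", "zzz").
repeat_pattern <- "^(.)\\1+$"

# Hypothetical helper: trim whitespace, then set every unwanted comment to NA.
clean_comments <- function(x) {
  x <- trimws(x)
  drop <- is.na(x) |
    nchar(x) < 3 |
    grepl(blank_pattern, x, ignore.case = TRUE, perl = TRUE) |
    grepl(repeat_pattern, x, perl = TRUE)
  x[drop] <- NA_character_
  x
}

cols <- c("Test.heading", "Recommendation.comment", "Other.comment")
cleaned <- as.data.frame(lapply(source[cols], clean_comments),
                         stringsAsFactors = FALSE)

# Join the surviving comments of each row; rows with nothing left become "".
source$All.Comments <- unname(apply(cleaned, 1, function(row) {
  paste(row[!is.na(row)], collapse = ", ")
}))
```

With the sample data, rows 1 and 8 come out as "goede uitleg", row 11 as "Genoeg info", and row 4 (all NA) as an empty string. Note that "geen opmerkingen" in row 10 is kept, since it is not covered by the blank pattern.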