I need to search a data.frame for duplicated entries. An entry counts as duplicated when another entry has the same two parts in the opposite order: for example, name 1 "AAAA_AAAB" and name 2 "AAAB_AAAA" are duplicates of each other. When a duplicated name is found, I need the code to identify the match and categorize it as "Confronted_Traffic". To illustrate my issue, here is a short reprex:
name <- data.frame(stringsAsFactors = FALSE,
  name = c("AAAA_AAAB", "AAAC_AAAD", "AAAD_AAAE",
           "AAAB_AAAA", "AAAD_AAAC", "AAAE_AAAD",
           "AAAB_AAAA")
)
# To simplify data management, I replace each name with a numeric ID
name_ID <- data.frame(stringsAsFactors = FALSE,
  name_ID = c(1, 2, 3, 4, 5, 6, 7)
)
# Expected output: one row per (name_ID, matching name_ID) pair
solution <- data.frame(stringsAsFactors = FALSE,
  name_ID = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7),
  Confronted_Traffic = c(4, 7, 5, 6, 1, 7, 2, 3, 1, 4)
)
As the example shows, I first replace the names with numbers to simplify data management, since I have around 96000 rows. Then I identify the duplicated data. For name_ID 1, two duplicates were found, name_ID 4 and name_ID 7, which is why name_ID 1 appears in both the first and second rows: each row pairs it with one of its duplicates. name_ID 2 has a single duplicate, name_ID 5, and name_ID 3 has a single duplicate, name_ID 6. name_ID 4 and name_ID 7 belong to the same group already found for name_ID 1, so each is listed against the other two members of that group. The same applies to name_ID 5 and name_ID 6, which mirror the matches already found for name_ID 2 and name_ID 3.
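For reference, the matching described above could be sketched in base R roughly as follows. This is only one possible approach, not a confirmed solution. It assumes every name has exactly two parts separated by a single underscore, and that identical names count as matches too (as the expected output implies, since name_ID 4 and 7 are paired with each other). The `key`, `left`, `right` and `pairs` names are made up for illustration.

```r
# Reprex data from the question
name <- data.frame(
  name = c("AAAA_AAAB", "AAAC_AAAD", "AAAD_AAAE",
           "AAAB_AAAA", "AAAD_AAAC", "AAAE_AAAD",
           "AAAB_AAAA"),
  stringsAsFactors = FALSE
)

# Numeric ID for each row (assumption: IDs are just row positions)
ids <- seq_len(nrow(name))

# Order-independent key: split on "_" and sort the two parts, so that
# "AAAA_AAAB" and "AAAB_AAAA" both become "AAAA_AAAB"
parts <- strsplit(name$name, "_", fixed = TRUE)
key <- vapply(parts, function(p) paste(sort(p), collapse = "_"), character(1))

# Self-join on the key: every row is paired with every row sharing its key
left  <- data.frame(key = key, name_ID = ids, stringsAsFactors = FALSE)
right <- data.frame(key = key, Confronted_Traffic = ids, stringsAsFactors = FALSE)
pairs <- merge(left, right, by = "key")

# Drop self-matches, keep the two ID columns, and sort for readability
pairs <- pairs[pairs$name_ID != pairs$Confronted_Traffic,
               c("name_ID", "Confronted_Traffic")]
pairs <- pairs[order(pairs$name_ID, pairs$Confronted_Traffic), ]
rownames(pairs) <- NULL
```

On the reprex, `pairs` should contain the same ten rows as `solution`. The self-join grows with the square of each group's size, so this should be manageable for around 96000 rows as long as no single pair of endpoints repeats a very large number of times.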