Hello, first post here! I'm trying to use reprex, but I'm not sure if I did it right.
I am trying to identify and modify duplicate values in my dataset:
df
#> function (x, df1, df2, ncp, log = FALSE)
#> {
#> if (missing(ncp))
#> .Call(C_df, x, df1, df2, log)
#> else .Call(C_dnf, x, df1, df2, ncp, log)
#> }
#> <bytecode: 0x0000000013e75248>
#> <environment: namespace:stats>
Created on 2020-06-26 by the reprex package (v0.3.0)
I have two types of duplicate values that I want to remove and/or change:
- Rows 7 and 8 are duplicates of each other across all columns.
- Rows 3 and 4 are duplicates of each other across all columns except the "values" column.
I figured out how to remove these duplicates using this code:
df2 <- df %>% distinct(ID, FamId, question, wave, .keep_all = TRUE)
#> Error in df %>% distinct(ID, FamId, question, wave, .keep_all = TRUE): could not find function "%>%"
Created on 2020-06-26 by the reprex package (v0.3.0)
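Both reprex outputs above suggest the snippets were not self-contained: `df` printed the `stats::df` function because no data frame of that name was defined inside the reprex, and `%>%` was not found because dplyr was not loaded. A minimal sketch of a self-contained version follows. The data frame `dat` and all of its values are made up for illustration; only the column names come from the `distinct()` call above.

```r
library(dplyr)  # provides %>% and distinct()

# Hypothetical stand-in for the real data (values invented for illustration).
# Rows 3 and 4 match on everything except "values"; rows 7 and 8 match fully.
dat <- data.frame(
  ID       = c(1, 1, 1, 1, 2, 2, 2, 2),
  FamId    = c(10, 10, 10, 10, 20, 20, 20, 20),
  question = "q1",
  wave     = c(1, 4, 8, 8, 1, 4, 8, 8),
  values   = c(3, 2, 5, 4, 1, 2, 3, 3)
)

# Keep the first row for each ID/FamId/question/wave combination.
deduped <- dat %>%
  distinct(ID, FamId, question, wave, .keep_all = TRUE)

nrow(deduped)
```

Naming the object `dat` rather than `df` avoids silently picking up the built-in `stats::df` when the data has not been created.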
However, I think the duplicates are a data entry problem (this is a secondary data analysis), so rather than dropping them I would like to change the second "8" in the wave column to "12", so that the column reads 1 4 8 12 1 4 8 12. How can I identify the duplicates and then change the wave value of each duplicate to 12?
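To make the goal concrete, here is a rough sketch of one possible approach: number the rows within each ID/FamId/question/wave group, then recode the second occurrence of wave 8 to 12. The data frame `dat`, its values, and the helper column `occurrence` are assumptions made up for illustration, not the real data.

```r
library(dplyr)

# Hypothetical stand-in for the real data (values invented for illustration).
dat <- data.frame(
  ID       = c(1, 1, 1, 1, 2, 2, 2, 2),
  FamId    = c(10, 10, 10, 10, 20, 20, 20, 20),
  question = "q1",
  wave     = c(1, 4, 8, 8, 1, 4, 8, 8),
  values   = c(3, 2, 5, 4, 1, 2, 3, 3)
)

fixed <- dat %>%
  group_by(ID, FamId, question, wave) %>%
  mutate(occurrence = row_number()) %>%  # 1 = first row in the group, 2 = repeat
  ungroup() %>%
  # Recode only the repeated wave-8 rows; every other row keeps its wave.
  mutate(wave = if_else(wave == 8 & occurrence == 2, 12, wave)) %>%
  select(-occurrence)

fixed$wave
#> [1]  1  4  8 12  1  4  8 12
```

This treats both kinds of duplicate the same way and assumes the rows are already in the order in which the waves were recorded, so that the second "8" really is the later wave.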
I am happy to clarify, or to retry the reprex if I didn't do it correctly.
Thanks so much, happy to be joining the R community.
Tim