Hi Community,
I need to replace the NA values in one column with the values from another column, but I keep getting this warning message.
Any suggestions are welcome.
claims$newprocall <- claims$newproc.x
claims$newprocall[is.na(claims$newproc.x)] <- reducedclaims$newproc.y
Warning message:
number of items to replace is not a multiple of replacement length
claims$newprocall <- claims$newproc.x
This line replaces the entire column newprocall (in the claims data frame) with the values in a different column of the claims data frame (newproc.x).
claims$newprocall[is.na(claims$newproc.x)]
This selects all the values in the newprocall column (in the claims data frame) that have the same index number as the NA values in the newproc.x column (in the same claims data frame).
reducedclaims$newproc.y
This selects every element of the column newproc.y, which comes from a different data frame (reducedclaims, rather than claims).
First, I'm not sure this code is doing what you want. Do you want to replace the NAs in both claims$newprocall and claims$newproc.x with values pulled from some other column? How do you know which are the right values to pick from the other column, if the other column is in a totally different data frame?
Second, the warning message: this is telling you that the thing on the left-hand side of the arrow (the values in claims$newprocall that have the same index numbers as the NA values in claims$newproc.x) is a different length from the thing on the right-hand side of the arrow (all of the values in reducedclaims$newproc.y). If the number of items to replace on the left is a whole multiple of the length of the right-hand side, R silently recycles the right-hand side to fill them. Otherwise, R still does the assignment, taking values from the start of the right-hand side (recycling or truncating as needed), but it warns you because the lengths don't line up. That is usually a sign that values are not landing in the rows you intended.
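To make the length mismatch concrete, here is a minimal sketch with toy vectors. The names x and y and their values are made up for illustration; they only stand in for claims$newprocall and reducedclaims$newproc.y.

```r
# Toy stand-ins (hypothetical values, not the real claims data)
x <- c("Chemo", NA, NA, NA, "Chemo")  # plays the role of claims$newprocall
y <- c("A", "B", "C", "D", "E")       # plays the role of reducedclaims$newproc.y

sum(is.na(x))  # 3 items to replace
length(y)      # 5 replacement values

# 3 is not a multiple of 5, so R warns and uses only the first 3 values of y
x1 <- x
x1[is.na(x1)] <- y
x1
#> [1] "Chemo" "A"     "B"     "C"     "Chemo"

# Indexing the right-hand side with the same logical vector makes the
# lengths match, and each NA gets the value from its own row
idx <- is.na(x)
x2 <- x
x2[idx] <- y[idx]
x2
#> [1] "Chemo" "B"     "C"     "D"     "Chemo"
```

Note that the first version runs without stopping, but the NA slots are filled with the wrong rows' values ("A", "B", "C" rather than "B", "C", "D").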
I know that's all a bit general — it's hard to be more specific without a reproducible example to work with. A small reproducible example would also help me understand what it is you are trying to do. For tips on how to create an example that will help people help you more, start here: FAQ: Tips for writing R-related questions
Thank you for your reply.
I am trying to create a new column named newprocall.
I am using values in 2 columns namely newproc.x and newproc.y to create this new column.
First, I am assigning the newproc.x values to newprocall. Then, wherever newproc.x is NA, the newproc.y value should be assigned to newprocall instead.
Thanks! I understand much better what you're trying to do now. (It's even more helpful if you can present examples as runnable code, so helpers like me don't have to reformat them.)
In this case, I think your best bet is to use ifelse():
claims <- data.frame(
newproc.x = c("Chemo", rep(NA, 5), "Chemo"),
newproc.y = rep("Other", 7),
stringsAsFactors = FALSE
)
claims
#> newproc.x newproc.y
#> 1 Chemo Other
#> 2 <NA> Other
#> 3 <NA> Other
#> 4 <NA> Other
#> 5 <NA> Other
#> 6 <NA> Other
#> 7 Chemo Other
claims$newprocall <- ifelse(is.na(claims$newproc.x), claims$newproc.y, claims$newproc.x)
claims
#> newproc.x newproc.y newprocall
#> 1 Chemo Other Chemo
#> 2 <NA> Other Other
#> 3 <NA> Other Other
#> 4 <NA> Other Other
#> 5 <NA> Other Other
#> 6 <NA> Other Other
#> 7 Chemo Other Chemo
To make it work using subsetting, like you were trying to do, you need to write:
claims$newprocall <- claims$newproc.x
claims$newprocall[is.na(claims$newprocall)] <- claims$newproc.y[is.na(claims$newprocall)]
claims
#> newproc.x newproc.y newprocall
#> 1 Chemo Other Chemo
#> 2 <NA> Other Other
#> 3 <NA> Other Other
#> 4 <NA> Other Other
#> 5 <NA> Other Other
#> 6 <NA> Other Other
#> 7 Chemo Other Chemo
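If newproc.y really does live in a separate data frame, as reducedclaims$newproc.y in the original code suggests, the rows have to be lined up by some shared key before filling in the NAs. The sketch below assumes a key column called claim_id, which is a hypothetical name that does not appear anywhere in this thread, and uses made-up values; substitute whatever column actually identifies a claim in both data frames.

```r
# Hypothetical data: claim_id is an assumed key, not a column from the thread
claims <- data.frame(
  claim_id  = 1:5,
  newproc.x = c("Chemo", NA, NA, "Chemo", NA),
  stringsAsFactors = FALSE
)
reducedclaims <- data.frame(
  claim_id  = c(5, 2, 3),
  newproc.y = c("Other5", "Other2", "Other3"),
  stringsAsFactors = FALSE
)

# For each row of claims, find the row of reducedclaims with the same
# claim_id (NA where there is no match)
pos <- match(claims$claim_id, reducedclaims$claim_id)

# Reorder newproc.y so it has one value per row of claims
y_aligned <- reducedclaims$newproc.y[pos]

# Keep newproc.x where it exists, otherwise fall back to the aligned newproc.y
claims$newprocall <- ifelse(is.na(claims$newproc.x), y_aligned, claims$newproc.x)
claims$newprocall
#> [1] "Chemo"  "Other2" "Other3" "Chemo"  "Other5"
```

Because y_aligned has the same length as claims$newproc.x, there is nothing to recycle and no warning. Rows with no match in reducedclaims would simply stay NA.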