How to avoid "NAs introduced by coercion" warning when no NAs are introduced

After reading in and combining a bunch of CSV files, I have a character column called response_char that contains a mix of text and numeric responses.

Using case_when, I want to create a new column response_num by applying a set of conditional rules: map the text responses to numbers with explicit rules (when trial == "two"), and coerce the responses that are already numeric to numbers (when trial == "one").

In the reproducible example below, I get "Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion", even though no NA actually ends up in the result.

Any idea how I can avoid this type of warning? Is this a bug?

library(dplyr, warn.conflicts = FALSE)
df <- tibble(trial = c("one", "two"), response_char = c(1.1, "No"))

df
#> # A tibble: 2 x 2
#>   trial response_char
#>   <chr> <chr>        
#> 1 one   1.1          
#> 2 two   No

df %>% 
  mutate(
    response_num = 
      case_when(
        trial == "two" & response_char == "No" ~ 0,
        trial == "one" ~ as.numeric(response_char) # This line creates the warning
        )
    )
#> Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion
#> # A tibble: 2 x 3
#>   trial response_char response_num
#>   <chr> <chr>                <dbl>
#> 1 one   1.1                    1.1
#> 2 two   No                     0

Created on 2021-04-13 by the reprex package (v2.0.0)

The documentation mentions this in the examples section.

case_when() evaluates all RHS expressions, and then constructs its
result by extracting the selected (via the LHS expressions) parts.
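To make that concrete, here is a minimal base R sketch of what happens to the right-hand side in the example above. The vector names are only illustrative, and suppressWarnings() is used here purely so the demonstration runs quietly:

```r
# Illustrative data mirroring the reprex above
trial <- c("one", "two")
response_char <- c("1.1", "No")

# The RHS is evaluated on the whole column, so as.numeric() also sees "No".
# Without suppressWarnings() this call emits "NAs introduced by coercion".
rhs_all <- suppressWarnings(as.numeric(response_char))
is.na(rhs_all)  # the second element is NA, but case_when() never selects it

# Converting only the rows that will actually be used raises no warning.
rhs_one <- as.numeric(response_char[trial == "one"])
rhs_one
```

The NA exists only in the intermediate vector, which is why the warning appears even though the final column has no NA.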

Thanks for your response @nirgrahamuk. I thought case_when() would evaluate the RHS only in the rows where the LHS is TRUE... but as you point out, the documentation shows that this is not the case:

y <- seq(-2, 2, by = .5)
case_when(
  y >= 0 ~ sqrt(y),
  TRUE   ~ y
)
# [1] -2.0000000 -1.5000000 -1.0000000 -0.5000000 0.0000000  0.7071068  1.0000000  1.2247449 1.4142136
# Warning message: In sqrt(y) : NaNs produced

Any idea how to do something like the above while avoiding the warnings?

There is an ugly hack that "solves" the issue, but I am hoping there is a better way:

df %>% 
  mutate(
    response_num = 
      case_when(
        trial == "two" & response_char == "No" ~ "0",
        trial == "one" ~  response_char)
    ) %>% 
  mutate(response_num = as.numeric(response_num))

If you expect the warnings and want to suppress them, this is the standard approach:

suppressWarnings({
  df %>%
    mutate(
      response_num =
        case_when(
          trial == "two" & response_char == "No" ~ 0,
          trial == "one" ~ as.numeric(response_char)
        )
    )
})

Thanks @nirgrahamuk. I know about suppressWarnings(), but sadly I can't use it here because I need to track any unexpected NAs that might appear during processing.
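One possible way to keep the warning meaningful, offered as a sketch rather than an official recommendation: blank out the rows that belong to other trials before calling as.numeric(), so the conversion only ever sees values it is expected to parse. This relies on as.numeric() returning NA silently for an input that is already NA. The name result is just for illustration:

```r
library(dplyr, warn.conflicts = FALSE)

df <- tibble(trial = c("one", "two"), response_char = c("1.1", "No"))

result <- df %>%
  mutate(
    response_num = case_when(
      trial == "two" & response_char == "No" ~ 0,
      # Mask rows from other trials with NA before converting.
      # as.numeric(NA_character_) is NA without a warning, whereas an
      # unparseable value in a trial "one" row would still warn.
      trial == "one" ~ as.numeric(
        if_else(trial == "one", response_char, NA_character_)
      )
    )
  )

result
```

With this approach, a value such as "abc" in a trial == "one" row would still trigger "NAs introduced by coercion", so unexpected NAs stay visible, while the expected "No" in trial "two" no longer does.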
