How do I keep track of the rows removed by drop_na() so I can restore them later?

Modelling fails if the data contain NA values, so we are forced to do something along the lines of:

library(tidytable)
df_dropped = df |> drop_na.()

Then, let's say the modelling has finished and produced a column of predictions, y_hat.

The problem is that y_hat has as many rows as df_dropped, not df, so it cannot be attached to df directly.

So I would like to put the predictions back in their right place in df, with NA as y_hat for the dropped rows. I would like to do the same for the covariates as well.

How can I do this?

I think the easiest way to do this would be to use vctrs. vctrs is a backend package that is used in both tidytable and dplyr.

With vctrs you can use vec_detect_complete(), which returns a logical vector that is TRUE for every row with no missing values, inside of a filter.():

library(tidytable, warn.conflicts = FALSE)
library(vctrs)

df <- tidytable(x = c(1, NA, 3, 4), y = c(1, 2, NA, 4))

complete_locs <- vec_detect_complete(df)

not_na_df <- df %>%
  filter.(complete_locs)

not_na_df
#> # A tidytable: 2 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     4     4

na_df <- df %>%
  filter.(!complete_locs)

na_df
#> # A tidytable: 2 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1    NA     2
#> 2     3    NA

If you need to preserve row order so that the NA predictions are "in the right place" in df, you can do something like this:

library(tidytable, warn.conflicts = FALSE)
library(vctrs)

df <- tidytable(x = c(1, NA, 3, 4), y = c(1, 2, NA, 4))

complete_locs <- vec_detect_complete(df)

not_na_df <- df %>%
  filter.(complete_locs) %>%
  mutate.(prediction = 1) # Insert actual prediction here

na_df <- df %>%
  filter.(!complete_locs) %>%
  mutate.(prediction = NA)

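# Start from the full df so the result has the original number of rows
pred_df <- df %>%
  mutate.(prediction = 0) # 0 is a simple placeholder

# Overwrite each group of rows at its original position
vec_slice(pred_df, complete_locs) <- not_na_df
vec_slice(pred_df, !complete_locs) <- na_df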

pred_df
#> # A tidytable: 4 × 3
#>       x     y prediction
#>   <dbl> <dbl>      <dbl>
#> 1     1     1          1
#> 2    NA     2         NA
#> 3     3    NA         NA
#> 4     4     4          1
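
If only the prediction column needs to go back into df, a shorter variant of the same idea is to fill a column with NA and assign into the complete rows by logical index. This is a minimal base R sketch rather than tidytable code. It assumes complete.cases() as a stand-in for vec_detect_complete(), and x * 2 is only a placeholder for a real model's predictions:

```r
df <- data.frame(x = c(1, NA, 3, 4), y = c(1, 2, NA, 4))

# TRUE for rows without any missing value
complete_locs <- complete.cases(df)

# Rows that would be passed to the model
df_dropped <- df[complete_locs, ]

# Placeholder "prediction"; replace with the real model output
y_hat <- df_dropped$x * 2

# Start with NA everywhere, then fill in the rows that were modelled
df$prediction <- NA_real_
df$prediction[complete_locs] <- y_hat

df
```

Because df itself is never filtered, the covariates stay in their original rows and only the prediction column has to be aligned.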