pathos
March 30, 2022, 7:30pm
1
Modelling fails if the data contain NA, so we are forced to do something along the lines of:
library(tidytable)
df_dropped = df |> drop_na.()
Then let's say modelling finishes and produces a column of predictions, y_hat.
The problem is that the number of rows will not match between df and y_hat (df_dropped and y_hat would match).
So I would like to attach the predictions to their right place in df, with NA as y_hat for the dropped rows. I would like to do the same for the covariates as well.
How can I do this?
I think the easiest way to do this would be to use vctrs. vctrs is a backend package that is used in both tidytable and dplyr.
With vctrs you can use vec_detect_complete() inside of a filter.():
library(tidytable, warn.conflicts = FALSE)
library(vctrs)

df <- tidytable(x = c(1, NA, 3, 4), y = c(1, 2, NA, 4))

complete_locs <- vec_detect_complete(df)

not_na_df <- df %>%
  filter.(complete_locs)

not_na_df
#> # A tidytable: 2 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     4     4

na_df <- df %>%
  filter.(!complete_locs)

na_df
#> # A tidytable: 2 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1    NA     2
#> 2     3    NA
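To make the filtering step less opaque, here is a small sketch of what vec_detect_complete() returns for the same example data. It uses a base data.frame instead of a tidytable purely to keep the snippet dependency-light; that swap is my assumption, not something the answer above requires.

```r
library(vctrs)

# Same values as the example above, stored in a base data.frame
df <- data.frame(x = c(1, NA, 3, 4), y = c(1, 2, NA, 4))

# One logical per row: TRUE when no column in that row is missing
complete_locs <- vec_detect_complete(df)
complete_locs
#> [1]  TRUE FALSE FALSE  TRUE

# Row positions that would be dropped before modelling
which(!complete_locs)
#> [1] 2 3
```

Because the result is a plain logical vector with one entry per row of df, it can be reused later to put results back in their original positions.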
If you need to preserve row order so that the NA predictions are "in the right place" in df, you can do something like this:
library(tidytable, warn.conflicts = FALSE)
library(vctrs)

df <- tidytable(x = c(1, NA, 3, 4), y = c(1, 2, NA, 4))

complete_locs <- vec_detect_complete(df)

not_na_df <- df %>%
  filter.(complete_locs) %>%
  mutate.(prediction = 1) # Insert actual prediction here

na_df <- df %>%
  filter.(!complete_locs) %>%
  mutate.(prediction = NA)

pred_df <- df %>%
  mutate.(prediction = 0) # 0 is a simple placeholder

vec_slice(pred_df, complete_locs) <- not_na_df
vec_slice(pred_df, !complete_locs) <- na_df

pred_df
#> # A tidytable: 4 × 3
#>       x     y prediction
#>   <dbl> <dbl>      <dbl>
#> 1     1     1          1
#> 2    NA     2         NA
#> 3     3    NA         NA
#> 4     4     4          1
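For what it's worth, here is a rough sketch of the same idea with an actual model in place of the constant 1. The lm() fit, the five-row toy data, and the column name y_hat are all assumptions for illustration (the original question does not say which model is used). Instead of slicing two data frames back together, it pre-fills the prediction column with NA and assigns only into the complete rows.

```r
library(vctrs)

# Hypothetical toy data; rows 2 and 3 each contain a missing value
df <- data.frame(
  x = c(1, NA, 3, 4, 5),
  y = c(1.1, 2, NA, 3.9, 5.2)
)

# TRUE for rows with no missing values in any column
complete_locs <- vec_detect_complete(df)

# Fit only on the complete rows (lm() is just a stand-in model)
fit <- lm(y ~ x, data = df[complete_locs, ])

# Start with NA everywhere, then fill in predictions for complete rows
df$y_hat <- NA_real_
df$y_hat[complete_locs] <- unname(predict(fit))

df
```

df keeps all of its original rows and covariates, and y_hat is NA exactly where a row was dropped before fitting. This only works if the model returns one prediction per row of the filtered data, in the same order.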