Hello everyone!
I would disagree with @bustosmiguel. Simply replacing missing values with 0 is usually not what you want and can give badly wrong results. na.omit() is the right call there. I suspect from your description that either there is a column with only NAs in your dataset or there is at least one NA in each row, so that na.omit() returns an empty data.frame. Since you are using the tidyverse anyway, you can use filter() to drop the rows that have NAs in the columns you want.
In your case this would be something like:
AA22 %>%
  filter(
    !is.na(Time3),
    !is.na(Community),
    ... # repeat for all
  )
You can make this less cumbersome by using if_any() and any_of():
AA22 %>%
  filter(
    !if_any(
      any_of(c("Time3",
               "Community",
               ... # add all columns
      )),
      is.na
    )
  )
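To make the difference concrete, here is a minimal sketch on made-up data. The tibble `toy`, its values, the all-NA `Notes` column and the vector `model_cols` are assumptions for illustration only, not your actual AA22. It needs dplyr 1.0.4 or later for if_any().

```r
library(dplyr)

# Toy stand-in for AA22 (hypothetical values); Notes is NA in every row
toy <- tibble(
  Time3     = c(1.2, NA, 3.4, 4.1),
  Community = c("a", "b", NA, "d"),
  Notes     = rep(NA_character_, 4)
)

# na.omit() on the full data drops every row, because Notes is always NA
n_after_omit <- nrow(na.omit(toy))

# Checking for NAs only in the model columns keeps the usable rows
model_cols <- c("Time3", "Community")
kept <- toy %>%
  filter(!if_any(any_of(model_cols), is.na))
n_kept <- nrow(kept)
```

Note that any_of() silently ignores names that are not in the data, so a misspelled column name would not be checked at all. all_of() throws an error in that case instead, which may be the safer choice here.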
However, if this also results in an empty data frame, you don't have any complete cases. In that case it may be necessary to leave certain predictors out of your glm. Check which columns have many missing values by running:
AA22 %>%
  summarise(
    across(
      everything(),
      ~ sum(is.na(.x)) # number of NAs in each column
    )
  )
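As a rough illustration of what that check returns, here is the same summary on made-up data, plus the share of missing values per column, which can be easier to compare across columns. The tibble `toy` and its values are hypothetical.

```r
library(dplyr)

# Toy stand-in for AA22 (hypothetical values)
toy <- tibble(
  Time3     = c(1.2, NA, 3.4, 4.1),
  Community = c("a", "b", NA, "d"),
  Notes     = rep(NA_character_, 4)
)

# Number of NAs per column
na_counts <- toy %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Proportion of NAs per column (1 means the column is entirely missing)
na_share <- toy %>%
  summarise(across(everything(), ~ mean(is.na(.x))))
```

A column whose share is 1, like `Notes` in this sketch, is the kind that makes na.omit() return an empty data frame.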
In base R, you could also subset your dataset to only those columns that you want to include in the glm:
AA22[c("Time3", "Community", ...)] # add your columns
Running na.omit() on that subset should then return the same rows as the dplyr approach above (but without the additional columns).
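A small base R sketch of this, again on made-up data (`toy`, its values and `model_cols` are assumptions for illustration). It also shows complete.cases(), a base R alternative not mentioned above, in case you want to keep the additional columns.

```r
# Toy stand-in for AA22 (hypothetical values); Notes is NA in every row
toy <- data.frame(
  Time3     = c(1.2, NA, 3.4, 4.1),
  Community = c("a", "b", NA, "d"),
  Notes     = rep(NA_character_, 4)
)

model_cols <- c("Time3", "Community")

# Subset to the model columns first, then drop incomplete rows
subset_omit <- na.omit(toy[model_cols])

# Alternative: use complete.cases() on the model columns as a row filter,
# which keeps all columns of the original data frame
keep_all <- toy[complete.cases(toy[model_cols]), ]
```

Both results contain the same rows; `subset_omit` has only the two model columns, while `keep_all` still carries `Notes`.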
Hope this helps!
Best,
Valentin