Spatio-Temporal CV

Ekaterina_Khadonova · November 6, 2025, 4:34pm

Hi, after a recent Posit conference I was wondering whether there is a way to build splits that are both spatially and temporally explicit, so that cross-validation accounts for both at once. Spatial datasets that are also time series really need this, but there does not seem to be an integrated way to do it.

This book covers the use of the mlr3 package, which does that. I was wondering whether wrapping something like this would be possible.

See a block I found of interest below:

library(mlr3)
library(mlr3spatiotempcv)
task_st = tsk("cookfarm_mlr3")
task_st$set_col_roles("SOURCEID", roles = "space")
task_st$set_col_roles("Date", roles = "time")
resampling = rsmp("sptcv_cstf", folds = 5)
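
For reference, here is a minimal sketch of how that resampling definition could be turned into concrete folds. It assumes the `cookfarm_mlr3` example task that ships with mlr3spatiotempcv and relies on the standard mlr3 `instantiate()`, `train_set()` and `test_set()` methods. The names `train_ids` and `test_ids` are only illustrative.

```r
library(mlr3)
library(mlr3spatiotempcv)

# Same setup as above: one column marks location, one marks time
task_st = tsk("cookfarm_mlr3")
task_st$set_col_roles("SOURCEID", roles = "space")
task_st$set_col_roles("Date", roles = "time")

resampling = rsmp("sptcv_cstf", folds = 5)

# Materialise the folds for this task (fold assignment is random)
resampling$instantiate(task_st)

# Row ids used for fitting and for evaluation in the first fold
train_ids = resampling$train_set(1)
test_ids  = resampling$test_set(1)

# With both roles set, the two sets should share no rows
length(intersect(train_ids, test_ids))
```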

Presently, I have been using either one of these:

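library(dplyr)    # %>%, arrange()
library(rsample)  # initial_time_split(), training()
library(timetk)   # time_series_cv()

# Split into train and test datasets - can be temporal OR spatial
set.seed(42)
presence_split <- initial_time_split(
  presence_df %>% arrange(date),
  prop = 0.75)
presence_train <- training(presence_split)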

# If former, create time-series CV with controlled overlapping and lag-aware splits
temporal_folds <- time_series_cv(
  presence_train,
  date_var   = date,
  initial    = "3 years", 
  assess     = "1 year", 
  skip       = "1 year",
  cumulative = TRUE,
  slice_limit = 10)  


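# Alternatively, perform a spatial CV, perhaps on the same time-split dataset?
# This has been my current approach, but I am not very confident in it.
library(sf)       # st_as_sf()
library(blockCV)  # cv_spatial()

# crs = 26915 (NAD83 / UTM zone 15N) assumes the coordinates are already projected;
# if lon/lat are in degrees, use crs = 4326 here and uncomment st_transform()
presence_train_sf <- presence_train %>%
  st_as_sf(coords = c("lon", "lat"), crs = 26915)
# sf::st_transform(crs = 26915)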

spatial_folds <- cv_spatial(
  presence_train_sf, k = 10,
  selection = "systematic")
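
One possible way to combine the two ideas by hand, sketched under assumptions rather than offered as an established tidymodels workflow: give every location a space group and every date a time group, then for fold i assess on rows that fall in both group i's and fit on rows that fall in neither. The `toy` data frame, its `site`, `date` and `y` columns, and `k = 3` are hypothetical stand-ins for `presence_train`. The last step wraps the indices with `rsample::make_splits()` and `rsample::manual_rset()` so the result can be used like any other rset, and it is skipped if rsample is not installed.

```r
# Hypothetical toy data: 6 sites observed on 8 dates
set.seed(42)
toy <- expand.grid(
  site = paste0("s", 1:6),
  date = as.Date("2020-01-01") + 30 * (0:7),
  stringsAsFactors = FALSE
)
toy$y <- rnorm(nrow(toy))

k <- 3  # number of folds (assumption)

# Space groups: sites assigned at random to k groups
sites <- unique(toy$site)
space_grp <- setNames(sample(rep_len(seq_len(k), length(sites))), sites)

# Time groups: dates cut into k contiguous blocks
dates <- sort(unique(toy$date))
time_grp <- setNames(
  as.integer(ceiling(seq_along(dates) * k / length(dates))),
  as.character(dates)
)

toy$space_fold <- unname(space_grp[toy$site])
toy$time_fold  <- unname(time_grp[as.character(toy$date)])

# Fold i: assess on (space group i AND time group i),
# fit on rows that are in neither; all other rows are dropped
folds <- lapply(seq_len(k), function(i) {
  list(
    analysis   = which(toy$space_fold != i & toy$time_fold != i),
    assessment = which(toy$space_fold == i & toy$time_fold == i)
  )
})

# Optional: wrap the index lists as an rsample rset
if (requireNamespace("rsample", quietly = TRUE)) {
  splits <- lapply(folds, function(f) rsample::make_splits(f, data = toy))
  spacetime_folds <- rsample::manual_rset(
    splits,
    ids = paste0("Fold", seq_len(k))
  )
}
```

Two caveats with this sketch: rows that match on only one dimension are discarded, so each fold uses less data, and the analysis set can contain dates later than the assessment block, so it is not a forward-chaining scheme like `time_series_cv()`. A block or cluster ID from the spatial step could be substituted for the site ID when forming the space groups.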

I would appreciate any insight you might have into this. Thanks!
