I want to simulate several types of time-series (TS) trajectories and then compute the R² between each simulated TS and my actual TS data to identify which simulated trajectory best explained by my real-world observations. My objective is to generate simulated NTL (nighttime lights) trajectories representing possible responses to a disruption and compare these with my actual NTL time-series using R². The disruption has known start and end dates (see code below).
I want to simulate TS that follow four conceptual patterns across three periods: pre-disruption, during-disruption, and post-disruption.
- Full Recovery: initial condition (pre-), decrease (during-), increase above pre-level (post-)
- Partial Recovery: initial condition (pre-), decrease (during-), increase below pre-level (post-)
- Chronic Vulnerability: initial condition (pre-), decrease (during-), further decrease (post-)
- High Resilience: initial condition (pre-), increase (during-), stable or mild increase (post-)
Conceptually:
- Pre-disruption: all start from an initial level
- During-disruption: either increase or decrease
- Post-disruption: trajectory depends on the "during" phase
I haven’t found examples on simulating TS with such structural constraints across distinct phases, I created a deterministic method for one city (I have several):
library(dplyr)
library(ggplot2)
library(tidyr)
# lockdown dates
lockdown_dates_retail <- list(
ba = as.Date(c("2020-03-01", "2021-05-01"))
)
# input df
df
# normalize ba column and assign disruption periods
df <- df |>
mutate(
z_ba = scale(ba),
period = case_when(
date < lockdown_dates_retail$ba[1] ~ "pre",
date >= lockdown_dates_retail$ba[1] & date <= lockdown_dates_retail$ba[2] ~ "lockdown",
date > lockdown_dates_retail$ba[2] ~ "post",
TRUE ~ NA_character_
)
)
ggplot(df, aes(x = date, y = z_ba, color = period)) +
geom_line() +
labs(
title = "Original NTL Time Series for BA with Periods",
x = "Date",
y = "NTL"
) +
theme_minimal()
# function to create trajectories for the ba city
create_city_templates <- function(city_data, lockdown_start_date, lockdown_end_date) {
dates <- unique(city_data$date)
n_points <- length(dates)
lockdown_start_idx <- which(dates == lockdown_start_date)
lockdown_end_idx <- which(dates == lockdown_end_date)
pre_lockdown_mean <- mean(city_data$z_ba[city_data$date < lockdown_start_date], na.rm = TRUE)
full_recovery <- rep(pre_lockdown_mean, n_points)
full_recovery[lockdown_start_idx:lockdown_end_idx] <- seq(pre_lockdown_mean, pre_lockdown_mean - 1.5, length.out = lockdown_end_idx - lockdown_start_idx + 1)
full_recovery[(lockdown_end_idx + 1):n_points] <- seq(pre_lockdown_mean - 1.5, pre_lockdown_mean + 1, length.out = n_points - lockdown_end_idx)
partial_recovery <- rep(pre_lockdown_mean, n_points)
partial_recovery[lockdown_start_idx:lockdown_end_idx] <- seq(pre_lockdown_mean, pre_lockdown_mean - 1.5, length.out = lockdown_end_idx - lockdown_start_idx + 1)
partial_recovery[(lockdown_end_idx + 1):n_points] <- seq(pre_lockdown_mean - 1.5, pre_lockdown_mean - 0.3, length.out = n_points - lockdown_end_idx)
chronic_vulnerability <- rep(pre_lockdown_mean, n_points)
chronic_vulnerability[lockdown_start_idx:lockdown_end_idx] <- seq(pre_lockdown_mean, pre_lockdown_mean - 1.5, length.out = lockdown_end_idx - lockdown_start_idx + 1)
chronic_vulnerability[(lockdown_end_idx + 1):n_points] <- seq(pre_lockdown_mean - 1.5, pre_lockdown_mean - 2.5, length.out = n_points - lockdown_end_idx)
high_resilience <- rep(pre_lockdown_mean, n_points)
high_resilience[lockdown_start_idx:lockdown_end_idx] <- seq(pre_lockdown_mean, pre_lockdown_mean + 0.5, length.out = lockdown_end_idx - lockdown_start_idx + 1)
high_resilience[(lockdown_end_idx + 1):n_points] <- seq(pre_lockdown_mean + 0.5, pre_lockdown_mean + 1.5, length.out = n_points - lockdown_end_idx)
list(
dates = dates,
full_recovery = full_recovery,
partial_recovery = partial_recovery,
chronic_vulnerability = chronic_vulnerability,
high_resilience = high_resilience
)
}
# plot templates for BA
city_templates <- create_city_templates(df, lockdown_dates_retail$ba[1], lockdown_dates_retail$ba[2])
templates_df <- data.frame(
date = rep(city_templates$dates, 4),
value = c(
city_templates$full_recovery,
city_templates$partial_recovery,
city_templates$chronic_vulnerability,
city_templates$high_resilience
),
template = rep(
c("Full Recovery", "Partial Recovery", "Chronic Vulnerability", "High Resilience"),
each = length(city_templates$dates)
)
)
ggplot() +
annotate("rect",
xmin = lockdown_dates_retail$ba[1], xmax = lockdown_dates_retail$ba[2],
ymin = -Inf, ymax = Inf,
alpha = 0.2, fill = "gray") +
geom_line(data = templates_df, aes(x = date, y = value, color = template), size = 1) +
labs(
title = "Theoretical Resilience Templates for BA",
x = "Date", y = "Z-scored NTL",
color = "Template"
) +
theme_minimal()
How can I simulate such time series with constraints on slope direction (positive or negative) within specific time intervals?
Dataset:
> dput(df)
structure(list(date = c("01-01-18", "01-02-18", "01-03-18", "01-04-18",
"01-05-18", "01-06-18", "01-07-18", "01-08-18", "01-09-18", "01-10-18",
"01-11-18", "01-12-18", "01-01-19", "01-02-19", "01-03-19", "01-04-19",
"01-05-19", "01-06-19", "01-07-19", "01-08-19", "01-09-19", "01-10-19",
"01-11-19", "01-12-19", "01-01-20", "01-02-20", "01-03-20", "01-04-20",
"01-05-20", "01-06-20", "01-07-20", "01-08-20", "01-09-20", "01-10-20",
"01-11-20", "01-12-20", "01-01-21", "01-02-21", "01-03-21", "01-04-21",
"01-05-21", "01-06-21", "01-07-21", "01-08-21", "01-09-21", "01-10-21",
"01-11-21", "01-12-21", "01-01-22", "01-02-22", "01-03-22", "01-04-22",
"01-05-22", "01-06-22", "01-07-22", "01-08-22", "01-09-22", "01-10-22",
"01-11-22", "01-12-22", "01-01-23", "01-02-23", "01-03-23", "01-04-23",
"01-05-23", "01-06-23", "01-07-23", "01-08-23", "01-09-23", "01-10-23",
"01-11-23", "01-12-23"), ba = c(5.631965012, 5.652943903, 5.673922795,
5.698648054, 5.723373314, 5.749232037, 5.77509076, 5.80020167,
5.82531258, 5.870469864, 5.915627148, 5.973485875, 6.031344603,
6.069760262, 6.10817592, 6.130933313, 6.153690706, 6.157266393,
6.16084208, 6.125815676, 6.090789273, 6.02944691, 5.968104547,
5.905129394, 5.842154242, 5.782085265, 5.722016287, 5.666351167,
5.610686047, 5.571689415, 5.532692782, 5.516260933, 5.499829083,
5.503563375, 5.507297667, 5.531697846, 5.556098024, 5.583567118,
5.611036212, 5.636610944, 5.662185675, 5.715111139, 5.768036603,
5.862347902, 5.956659202, 6.071535763, 6.186412324, 6.30989678,
6.433381236, 6.575014889, 6.716648541, 6.860849606, 7.00505067,
7.099267331, 7.193483993, 7.213179035, 7.232874077, 7.203921341,
7.174968606, 7.12081735, 7.066666093, 6.994413881, 6.922161669,
6.841271288, 6.760380907, 6.673688099, 6.586995291, 6.502777891,
6.418560491, 6.338127583, 6.257694675, 6.179117301)), class = "data.frame", row.names = c(NA,
-72L))
Session info
R version 4.5.0 (2025-04-11 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
LAPACK version 3.12.1
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_1.1.4
loaded via a namespace (and not attached):
[1] tidyselect_1.2.1 compiler_4.5.0 magrittr_2.0.3 R6_2.6.1 generics_0.1.4 cli_3.6.5 tools_4.5.0
[8] pillar_1.10.2 glue_1.8.0 rstudioapi_0.17.1 tibble_3.2.1 vctrs_0.6.5 lifecycle_1.0.4 pkgconfig_2.0.3
[15] rlang_1.1.6