My data looks like this
|id | time| y | x1|
|1 | 1 | 2312 | 34345|
|1 | 2 | 2343 | 234566|
|1 | 3 | 5654 | 4532234|
|2 | 1 | 4234 | 453256|
|2 | 2 | 7647 | 8653|
|2 | 3 | 3457 | 123245|
|3 | 1 | 2453 | 235454|
|3 | 2 | 7654 | 3345675|
|3 | 3 | 7653 | 2542665|
I want a loop that takes different combinations of id (identifier for cities) and time (identifier for years). For example: take two cities for two years and for three years, then take three cities for two years and for three years. I have a function that estimates an interaction matrix, and I want to run it on each of these combinations of number of cities and number of years. How can I construct the loop?
I like approach 2, which decides upfront what to do, as that can be sense checked, i.e. you can confirm that the total number of possibilities is sensible.
As a tip: if you want every combination of ids, your example has 3 unique ids and therefore 7 possible non-empty combinations. You can find them like so:
library(tidyverse)
d1 <- read_delim(file = "|id|time|y|x1|
|1|1|2312|34345|
|1|2|2343|234566|
|1|3|5654|4532234|
|2|1|4234|453256|
|2|2|7647|8653|
|2|3|3457|123245|
|3|1|2453|235454|
|3|2|7654|3345675|
|3|3|7653|2542665|", delim = "|") |> select(where(is.numeric))
d1
(uid <- unique(d1$id))
(id_combinations <- map(
  seq_along(uid),
  \(x) combn(x = uid, m = x, simplify = FALSE)
) |> flatten())
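The same "decide upfront" idea can be applied to the number of cities and years from the original question. The following is only a minimal base-R sketch: it assumes that "n cities for k years" means the first n ids and the first k time periods, `estimate_fn` is a hypothetical placeholder for your own estimation function, and `d1` is rebuilt as a plain data frame so the block runs on its own.

```r
# Toy data mirroring the example in the question (3 cities x 3 years).
d1 <- data.frame(
  id   = rep(1:3, each = 3),
  time = rep(1:3, times = 3),
  y    = c(2312, 2343, 5654, 4234, 7647, 3457, 2453, 7654, 7653),
  x1   = c(34345, 234566, 4532234, 453256, 8653, 123245, 235454, 3345675, 2542665)
)

# Hypothetical stand-in for the real estimation function:
# it only reports the size of the subset it was given.
estimate_fn <- function(dat) {
  c(
    n_ids   = length(unique(dat$id)),
    n_times = length(unique(dat$time)),
    n_rows  = nrow(dat)
  )
}

ids   <- sort(unique(d1$id))
times <- sort(unique(d1$time))

# Decide upfront which (cities, years) sizes to run, so the plan can be inspected first.
plan <- expand.grid(n_ids = 2:length(ids), n_times = 2:length(times))

results <- vector("list", nrow(plan))
for (r in seq_len(nrow(plan))) {
  keep_ids   <- ids[seq_len(plan$n_ids[r])]
  keep_times <- times[seq_len(plan$n_times[r])]
  sub <- d1[d1$id %in% keep_ids & d1$time %in% keep_times, ]
  results[[r]] <- estimate_fn(sub)
}
names(results) <- paste0("ids", plan$n_ids, "_times", plan$n_times)
```

Printing `plan` before running the loop shows exactly how many fits will be attempted, which is the sense check mentioned above. If you want specific sets of cities rather than the first n, the elements of `id_combinations` could take the place of `keep_ids`.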
unique_ids <- unique(main_data$id)
results <- list()
# Loop over increasing number of IDs (cities)
for (i in 5:length(unique_ids)) {
  # Subset the IDs for the current iteration
  selected_ids <- unique_ids[1:i] # Take the first i IDs
  # Filter the main data based on the selected cities
  data <- main_data[main_data$id %in% selected_ids, c("id", "time", "y", "x1", "x2", "x3")]
  # Run the recoverNetwork function on the subset data
  rn <- recoverNetwork(data, lambda = c(0.10, 0.10, 0.10))
  # Extract the result
  W_matrix <- rn$unpenalisedgmm$W
  results[[i]] <- W_matrix
}
So this is what I am trying to do. Each iteration uses one more id than the previous iteration. However, I encounter this indexing error:
Error in ku_format_slice(key$row, nrow) :
Index is out of bounds for axis with size 10
length(unique(main_data$id)) is 10, and each id has data for 11 years.
Do you know why I am encountering this error when subsetting? I guess there is a logical oversight somewhere.
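One way to narrow this down is to record, for every iteration, the size of the subset and whether the fit failed, rather than letting the first error stop the loop. The sketch below is an illustration only: `main_data` is toy data with 10 ids and 11 periods, and `fit_fn` is a hypothetical stand-in for the real estimation call that is made to fail for one subset size so that there is something to catch.

```r
# Toy stand-ins (assumptions): 10 ids with 11 periods each, and a fit function
# that fails for one subset size so the error handling has something to catch.
main_data <- data.frame(id = rep(1:10, each = 11), time = rep(1:11, times = 10))
fit_fn <- function(dat) {
  if (length(unique(dat$id)) == 7) stop("simulated failure")
  nrow(dat)
}

unique_ids <- unique(main_data$id)
results <- list()
run_log <- data.frame()

for (i in 5:length(unique_ids)) {
  selected_ids <- unique_ids[seq_len(i)]
  dat <- main_data[main_data$id %in% selected_ids, ]

  # Capture the error message instead of stopping the whole loop.
  out <- tryCatch(
    list(ok = TRUE, value = fit_fn(dat)),
    error = function(e) list(ok = FALSE, value = conditionMessage(e))
  )

  # One log row per iteration: which i, how many rows, and whether it failed.
  run_log <- rbind(run_log, data.frame(i = i, n_rows = nrow(dat), failed = !out$ok))

  # Store successful fits by name so there are no empty slots for i = 1..4.
  if (out$ok) results[[paste0("n_ids_", i)]] <- out$value
}
run_log
```

With the real function in place of `fit_fn`, the rows of `run_log` where `failed` is `TRUE` show which values of `i` trigger the error, which makes it easier to post a small reproducible example.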
Thank you.
Your approach works on the example data below. Can you provide an example of data where it fails?
library(recoverNetwork)
main_data <- gendata(setting=1,1)
# Make it smaller because I get bored waiting for a long time.
# The trailing comma makes this a row filter for plain data frames as well.
main_data <- main_data[main_data$id <= 7, ]
unique_ids <- unique(main_data$id)
results <- list()
# Loop over increasing number of IDs (cities)
for (i in 5:length(unique_ids)) {
  # Subset the IDs for the current iteration
  selected_ids <- unique_ids[1:i] # Take the first i IDs
  # Filter the main data based on the selected cities
  data <- main_data[main_data$id %in% selected_ids, c("id", "time", "y", "x1")]
  # Run the recoverNetwork function on the subset data
  rn <- recoverNetwork(data, lambda = c(0.10, 0.10, 0.10))
  # Extract the result
  W_matrix <- rn$unpenalisedgmm$W
  results[[i]] <- W_matrix
  print(paste0("Done i=", i))
}
It doesn't seem to work for this data:
data <- gendata(setting=15,seed=1)
I get the error below. Do you know why it worked for main_data <- gendata(setting=1,1) but not for the data above? Both have only one x variable (x1).
Initial conditions
Elastic Net
Error in glmnet::glmnet(ZX, ZY, lambda = lambda, alpha = alpha, penalty.factor = pen, :
x should be a matrix with 2 or more columns
Error in optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower, :
Cannot evaluate function at initial parameters
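The first message comes from an input check inside `glmnet::glmnet()` itself, which refuses a predictor matrix with fewer than two columns. The sketch below reproduces that message in isolation, assuming the glmnet package is installed. The data here are random numbers made up for illustration and have nothing to do with `gendata()`.

```r
# Reproduces the glmnet message in isolation (assumes glmnet is installed).
set.seed(1)
y <- rnorm(20)
x_one_col <- matrix(rnorm(20), ncol = 1)  # a single predictor column
x_two_col <- matrix(rnorm(40), ncol = 2)  # two predictor columns

# With one column, glmnet stops in its own argument check.
msg <- tryCatch(
  {
    glmnet::glmnet(x_one_col, y)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# With two columns, the same call goes through.
fit <- glmnet::glmnet(x_two_col, y)
```

If the matrix `ZX` built inside `recoverNetwork` ends up with a single column for the setting-15 data, that would produce the same message. Whether that is what happens here is only a guess and would need checking against the package source, for example by comparing the dimensions of the data returned by the two `gendata()` calls.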