Threshold VAR model coding problems (structural VAR)

filiptv · January 10, 2024, 10:28am

Hello

I am having a problem with coding a tri-variate threshold VAR model . I am using the tsdyn package to do the linearity tests and to find the threshold. The linearity tests doesnt work but thats a minor problem I couldnt find a package doing the general impulse response function for the threshold VAR modelling , so I use a piece of code I found on the internet and through my department by Alexander Haider.
So I get this error message when I try to create the GIRF (general IRF) of the threshold VAR model:
tv3 <- TVAR(z, lag=5, nthresh=1, thDelay=5, trim=0.15, mTh=2)
Best unique threshold 0.8378869

ImpulseResp <- GIRF(tv3, hor=20, shVar=3, n.hist=274, replic=100, ci=c(0.05,0.95))
Error in irf_1_shock_ave(object = object, shock = M$shock[[i]], hist = M$hist[[i]], :
unused arguments (hor = 20, shVar = 3, replic = 100, ci = c(0.05, 0.95))

Here is the code:
I processed the data as follows: Import and reshape the data , I create a 3 year lagged variable for the variables of household debt and non-financial firm debt as percentage of GDP and the gdp growth in quarters as a change to the former year's respective quarter

GDP <- read_csv("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/GDP.csv",
                col_names = FALSE,
                skip = 17)
hdebt.nominal <- read_excel("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/totcredit.xlsx", 
                            sheet = "Quarterly Series", range = "AOG59:AOG336", 
                            col_names = FALSE)

non_fins_debt_nominal <- read_excel("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/totcredit.xlsx", 
                                    sheet = "Quarterly Series", range = "AOK59:AOK336", 
                                    col_names = FALSE)

us.gdp.growth <- read_csv("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/GDP_growth.csv", 
                          col_names = FALSE, skip = 29)


GDP$X2 <- as.numeric(GDP$X2)

us.gdp.growth$X2 <- as.numeric(us.gdp.growth$X2)

###Creating the lagged vairables and converting them to time series

GDP <- read_csv("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/GDP.csv",
                col_names = FALSE,
                skip = 17)
hdebt.nominal <- read_excel("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/totcredit.xlsx", 
                            sheet = "Quarterly Series", range = "AOG59:AOG336", 
                            col_names = FALSE)

non_fins_debt_nominal <- read_excel("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/totcredit.xlsx", 
                                    sheet = "Quarterly Series", range = "AOK59:AOK336", 
                                    col_names = FALSE)

us.gdp.growth <- read_csv("C:/Users/Fil/Desktop/Filer/Porject/Ny mapp/GDP_growth.csv", 
                          col_names = FALSE, skip = 29)


GDP$X2 <- as.numeric(GDP$X2)

us.gdp.growth$X2 <- as.numeric(us.gdp.growth$X2)

## Making the lagged variables

hdebt.l3 <- hdebt.nominal[1:length(hdebt.nominal)]/GDP$X2[1:(length(GDP$X2)-12)]

non.fin.debt.l3 <- non_fins_debt_nominal[1:length(non_fins_debt_nominal)]/GDP$X2[1:(length(GDP$X2)-12)]


hdebt <- ts(hdebt.l3, start = c(1954,1), end = c(2023,3), frequency = 4)

n.fin.debt <- ts(non.fin.debt.l3, start = c(1954,1), end = c(2023,3), frequency = 4)

growth <- ts(us.gdp.growth$X2, start = c(1954,1), end = c(2023,3), frequency = 4)

### Plotting all variables #####################################################

png("hdebt.png", width = 700, height = 700)
plot(hdebt, ylab="", main="Household debt")
dev.off()

png("growth.png", width = 700, height = 700)
plot(growth, ylab="", main="Growth")
dev.off()

png("n.fin.debt.png", width = 700, height = 700)
plot(n.fin.debt, ylab="", main="Non financial debt")
dev.off()

z <- na.omit(cbind(growth, hdebt, n.fin.debt)) 

### Estimating the Threshold #####################################################


tv1 <- TVAR(z, lag=4, nthresh=1, thDelay=2, trim=0.15, mTh=2)  # Time delay = 1 year = 4 quarters

summary(tv1)

### Plot the Threshold Variable with the threshold value #######################
# <- XX
Threshold <-  0.8361859 ## Threshold found at 0.8361 of GDP

png("Threshold.png", width = 900, height = 350)
plot(hdebt,
     xlab="Time", 
     ylab="",
     main = "Threshold Variable: Household debt as percentage of 3 year lagged GDP",
     lwd=1)
abline(h=Threshold, col=524, lwd=1.5)
legend('topleft', 
       legend=c('hdebt','Threshold value'),
       col = c("black", 524), lty=1)
dev.off()

### Linearity Tests #############################################################


test1 <-  print(TVAR.LRtest(z, lag=4, mTh=c(1,1),thDelay=1, nboot=2, plot=FALSE, trim=0.15, test="1vs", model="TAR",na.rm=TRUE))

##mTh=c(1,1)

######################## Code by Alexander Haider ############################## 

#res <- est.res(hor=20, shk=-1, replic=100, shVar=2, i =1, n.hist=500, ci=c(0.05,0.95), quiet=F)
GIRF1 <- function(res, hor=24, shk=1, shVar = 1, n.hist =500, replic=100, ci=c(0.95,0.975), quiet=F) {
  
  if (!quiet) cat("Shocked variable:", colnames(res$model)[shVar], "\nShock Size:", shk, "\n")
  
  #Regimes (NA is deleted later)
  regimes <- regime(res)
  
  #we need horizon + 1 residuals (+1 for current period) -> increase hor by 1
  #we take a particular history (i.e. the lagged values of a specific point) in a regime and predict 
  #its current value and n periods in the future
  #based on the model and th history-->  increase horizon by 1
  hor <- hor+4 ### I put 4 residuals in order to have a year
  
  threshV <- which(res$model.specific$transCombin == 1) #which variable is used as threshold variable
  datS <- res$model[,1:res$k] #get dataset
  rownames(datS) <- 1:nrow(datS)
  
  #starting positions for simulations --> sample of observations w.r.t. regime are saved in hist.pos
  #in the end we start in period (hist.pos[,i]-1) in SimTVAR! 
  #hist.pos samples through observations according to their regime. But then we have to use the history
  #of the observations to genearte the IR (thus we start in (hist.pos[,i]-1) --> therefore we also
  #need hor+1 residuals)
  hist.pos <- matrix(NA, nrow=n.hist, ncol=max(regimes, na.rm=T))
  for (i in 1:max(regimes, na.rm=T)) {
    regime.rows <- suppressWarnings(as.numeric(rownames(datS[regimes==i,])))
    regime.rows <- regime.rows[!is.na(regime.rows)]
    
    hist.pos[, i] <- sample(regime.rows, size=n.hist, replace = T ) #sample through positions
  }
  
  #deleting NAs from regimes
  regimes <- regimes[!is.na(regime(res))]
  
 
  #===============
  #RESIDUALS
  #==============
  
  # Var-Covar of residuals
  Sigma <- (t(res$residuals) %*% res$residuals) / dim(res$residuals)[1] 
  
  #Cholesky
  P <- t( chol(Sigma) ) #lower triangular (chol(Sigma) is upper triangular)
  P.inv <- solve(P)
  str_err <- t( P.inv %*% t(res$residuals) ) #structural errors (transformed back in SimTVAR function for sim)
  
  #sample bootstrap residuals
  #number of bootstrap residuals needed (replications per history * n.history * horizon)
  boot_size <- replic * n.hist * hor 
  boot_sample <- array(dim=c(boot_size, res$k, max(regimes))) 
  for (i in 1:max(regimes)) {
    #position of bootsample to keep observations together
    boot_pos <- sample(x = 1:dim(str_err)[1], size = boot_size, replace = TRUE) 
    boot_sample[ , ,i] <- str_err[boot_pos, ] #samples
  }
  
  
  #===============
  #Simulation
  #==============
  
  #results for each history will be saved here
  avgDiff <- array(dim = c(hor, ncol(datS), n.hist), NA) 
  #final results -> girf and confidence intervals
  girf <- array(dim = c(hor, ncol(datS), max(regimes)), NA,  
                dimnames = list(1:hor, colnames(datS), names(res$coefficients) ))
  ci_lo <- array(dim = c(hor, ncol(datS), max(regimes)), NA,  
                 dimnames = list(1:hor, colnames(datS), names(res$coefficients) ))
  ci_hi <- array(dim = c(hor, ncol(datS), max(regimes)), NA,  
                 dimnames = list(1:hor, colnames(datS), names(res$coefficients) ))
  
  if (!quiet) prog_b <- txtProgressBar(min=0, max= max(regimes)*n.hist, style=3)
  
  #simulate over all regimes
  for (i in 1:max(regimes)) {
    
    #simulate over all sampled histories within a regime
    for (j in 1:n.hist){
      startV <- hist.pos[j, i]
      #get the residuals for specific history (size= replic * horizon)
      resid_boot <- boot_sample[(replic*hor*(j-1)+1) : (replic*hor*j), , i] 
      
      #call simTVAR (second function)
      resultDiff <- lapply(1:replic, simTVAR, res, datS, hor, shk, threshV, 
                           shVar, startV, resid_boot, P)
      
      avgDiff[, , j] <- Reduce("+", resultDiff) / replic #saving means of all histories of part. regime
      
      if (!quiet) setTxtProgressBar(prog_b, n.hist*(i-1)+j)
    }
    
    #GIRF for regimes (final result) & Confidence interval
    ci_lo[, , i] <- apply(avgDiff, MARGIN = c(1,2), quantile, ci[1])
    ci_hi[, , i] <- apply(avgDiff, MARGIN = c(1,2), quantile, ci[2])
    girf[, , i] <- apply(avgDiff, MARGIN=c(1, 2), mean) 
    
  }
  
  if (!quiet) close(prog_b)
  
  #return Value
  return_val <- list(girf = girf, ci_lo = ci_lo, ci_hi = ci_hi, horizon = hor, 
                     shockedVar = shVar, shocksize = shk, estres=res)
  class(return_val) <- "girf"
  return(return_val)
}  


#k=1; results=res;dat=datS;horizon=hor;sh=shk;shvar=shVar;history=startV; r_boot <- resid_boot; Pm<-P
simTVAR <- function(k, results, dat, horizon, sh, threshV, shvar, history, r_boot, Pm) {
  #k...for apply function (index)
  #dat...data
  #horizon...impulse response horizon
  #sh...shock size (in standard deviations)
  #threshV...which variable is used as threshold Variable
  #shVar...shocked Variable
  #history...history of variable
  
  #Bootstrap errors
  boot_err <- r_boot[(horizon*(k-1)+1): (horizon*k), ]
  shock <- c(rep(0, shvar-1), sh, rep(0,nrow(results$coeffmat)-shvar))
  boot_err_delta <- rbind(boot_err[1,] + shock, boot_err[-1,])
  #boot_err_delta <- rbind(shock, boot_err[-1,])
  
  #Transform back -> Cholesky decomp
  boot_err <- t( Pm %*% t(boot_err) )
  boot_err_delta <- t( Pm %*% t(boot_err_delta) )
  
  #simulation without addtional shock --> innov = resid
  #HISTORY IS LAGGED BY 1 S.T. WE GET THE LAGGED VALUES!!!!!!!
  simul <- TVAR.sim(B = results$coeffmat, 
                    Thresh = results$model.specific$Thresh, 
                    nthres = results$model.specific$nthresh,
                    n = horizon, lag = results$lag, include = results$include,
                    thDelay = results$model.specific$thDelay, mTh = threshV,
                    starting = dat[(history - results$lag) : (history-1), ], innov = boot_err)
  
  #starting = dat[(history - results$lag + 1) : history, ], innov = boot_err)                  
  #starting = dat[(history - results$lag) : (history-1), ], innov = boot_err)
  
  #with resid_delta
  simul_delta <- TVAR.sim(B = results$coeffmat, 
                          Thresh = results$model.specific$Thresh, 
                          nthres = results$model.specific$nthresh,
                          n = horizon, lag = results$lag, include = results$include,
                          thDelay = results$model.specific$thDelay, mTh = threshV,
                          starting = dat[(history - results$lag) : (history-1), ], innov = boot_err_delta)
  #starting = dat[(history - results$lag + 1) : history, ], innov = boot_err_delta)                        
  #starting = dat[(history - results$lag) : (history-1), ], innov = boot_err_delta)
  
  difference <- simul_delta - simul
  
  return(difference)  
}

plot.girf <- function(impR, response = 1, ci=T) {
  impulse <- impR$girf
  ci_l <- impR$ci_lo
  ci_h <- impR$ci_hi
  
  if (is.character(response)) suppressWarnings(response <- which(colnames(impulse)==response))
  
  for (i in response) {
    plot(impulse[ , i, 'Bdown'], type='l', ylim = c(-0.6,0.5),
         main = paste('Response of GDP growth to a financial stress shock'), 
         ylab="response", xlab="horizon")
    lines(impulse[, i, 'Bup'], col='blue')
    abline(h=0, lty=4, col='red')
    
    if (ci) {
      lines(ci_l[, i, 'Bdown'], col='red', lty=2)
      lines(ci_h[, i, 'Bdown'], col='red', lty=2)
      lines(ci_l[, i, 'Bup'], col='red', lty=2)
      lines(ci_h[, i, 'Bup'], col='red', lty=2)
    }
    
    if (dim(impulse)[3] == 3) { #3 regimes?
      lines(impulse[ , i, 'Bmiddle'], type='l', col='green') #middle regime only with 3 regimes
      
      if (ci) {
        lines(ci_l[, i, 'Bmiddle'], col='red', lty=2)
        lines(ci_h[, i, 'Bmiddle'], col='red', lty=2)
      }
      
      legend("bottomright",cex = .8,ncol=1, pch=20,
             c("low","middle", "high"), 
             lty=c(1,1), 
             lwd=c(2.5,2.5),col=c("black","green","blue")) 
    } else {
      legend("bottomright",cex = .8,ncol=1, pch=20,
             c("low", "high"), 
             lty=c(1,1), 
             lwd=c(2.5,2.5),col=c("black", "blue")) 
    }
    
  }
  
} 
################################################################################

### Generalized Impulse Response Functions #####################################

# Response of GDP growth to a Household debt  shock

tv3 <- TVAR(z, lag=5, nthresh=1, thDelay=5, trim=0.15, mTh=2)

ImpulseResp <- GIRF(tv3, hor=20, shVar=3, n.hist=274, replic=100, ci=c(0.05,0.95))

impulse <- ImpulseResp$girf
ci_l <- ImpulseResp$ci_lo
ci_h <- ImpulseResp$ci_hi

_______________________

Here is the error message on the linearity test I got:

test1 <-  print(TVAR.LRtest(z, lag=4, mTh=c(1,1),thDelay=1, nboot=2, plot=FALSE, trim=0.15, test="1vs", model="TAR",na.rm=TRUE))
Error in TVAR.LRtest(z, lag = 4, mTh = c(1, 1), thDelay = 1, nboot = 2,  : 
  unused argument (na.rm = TRUE)

The problem with the test seems to be that while I dont have any NA's in my processed data the test would not run if I dont use the argument na.rm without which the test doesnt run: this is the error code i get for the test:

test1 <-  print(TVAR.LRtest(z, lag=4, mTh=c(1,1),thDelay=1, nboot=2, plot=FALSE, trim=0.15, test="1vs", model="TAR"))
Warning: the thDelay values do not correspond to the univariate implementation in tsdyn
Error in quantile.default(LRtestboot12, probs = c(0.9, 0.95, 0.975, 0.99)) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE
In addition: Warning messages:
1: In FUN(newX[, i], ...) :
  Tolerance reached, ndp possibly underestimated.
2: In FUN(newX[, i], ...) :
  Tolerance reached, ndp possibly underestimated.
3: In FUN(newX[, i], ...) :
  Tolerance reached, ndp possibly underestimated.
4: In FUN(newX[, i], ...) :
  Tolerance reached, ndp possibly underestimated.
5: In FUN(newX[, i], ...) :
  Tolerance reached, ndp possibly underestimated.
6: In FUN(newX[, i], ...) :
  Tolerance reached, ndp possibly underestimated.
7: In matrix(mTh, ncol = 1, nrow = k) :
  data length [2] is not a sub-multiple or multiple of the number of rows [3]

I suspect that for the test the First and last row of my matrix are identical that might be the source of the problem however I m not sure , as for the GIRF I m not able to find the sollution however much aI change the parameters and experiment, I cannot start to see what the problem for the GIRF , Help (I m a student)

filiptv · February 7, 2024, 6:22am

No one interested? Or is it not appropriate?

nirgrahamuk · February 7, 2024, 10:54am

You have shared a fairly long code (unformatted so hard to read) , with private data so the code can not be run by forum users. I expect these are reasons why engagement may be low.

If your problems dont relate to loading data from csv; then dont include code that does so ; its unecessary, you can share representations of R objects via methods of dput. Your goal should be to isolate and minimise the code to that which captures the essence of the issue.

Here is a link to the formatting guide , and the reprex guide.

FAQ: How to do a minimal reproducible example ( reprex ) for beginners Guides & FAQs

A minimal reproducible example consists of the following items: A minimal dataset, necessary to reproduce the issue The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages. Let's quickly go over each one of these with examples: Minimal Dataset (Sample Data) You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame head(iris) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 1 5.1 3.5 1.4 0.…

I will go ahead, and throw up a simple format of your code at this time.

system · March 20, 2024, 10:55am

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.