map (from purrr) function running too slow

Hi everybody,

I am trying to replicate the code from Len Kiefer's blog: Vulnerable Housing · Len Kiefer

It is running okay till I get to here:

# map over each quarter, fitting the skewed t distribution by minimizing loss
# (you don't want to know about df2, df3 or their cousins df2a, df2b, ... , df3x5)
df4 <-
  df %>%
  mutate(par=map(date,possible_myf))

To give the context, here is the excerpt describing what the code is trying to do:

"Next, we’ll need a loss function. This function will penalize our parameters for giving us quantiles far from the ones we plotted above. Then we will feed this loss function into the optim function to seek a minimum. We’ll use purrr::possibly to have the function skip cases where the routine doesn’t find a minimum. Then we’ll solve the optimization step for each date, fitting a different distribution to the conditional quantile for each period.


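library(sn)  # provides qst(), the skew-t quantile function

myloss <- function(par, x){
  # skew-t quantiles implied by the current parameters
  myq <- qst(p=c(0.05, 0.25, 0.75, 0.95),
             xi=par[1],
             omega=par[2],
             alpha=par[3],
             nu=par[4]
  )
  # squared distance between implied and target quantiles
  sum((myq - x)^2)
}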

# initialize parameters
# use unconditional mean and sd of log sales for xi and omega, set alpha to 0 and nu at 30
par_init=c(xi=mean(df$lsales),omega=sd(df$lsales),alpha=0,nu=30)

# function which takes a date as an input and estimates the skew-t parameters
myf <- function(dd=min(df$date),par0=par_init){
  # find date
  i=which(df$date==dd)
  # solve best fit to skew-t distribution
  # do this for row i (corresponding to date dd) of predicted quantiles
  p1 <- optim( par0, myloss, x=dfp[i,])$par
}

possible_myf <- possibly(myf, otherwise="error")

# map over each quarter, fitting the skewed t distribution by minimizing loss
# (you don't want to know about df2, df3 or their cousins df2a, df2b, ... , df3x5)
df4 <-
  df %>%
  mutate(par=map(date,possible_myf))

Any comments/suggestions would be much appreciated! Thanks!

The first thing I'd do is run the optimization on only 1, and then 5, elements of date to see how long each fit takes, and from that estimate how long the full set will take.
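Something like the following is a rough sketch of that timing check. It is not the blog's code: toy_loss, targets and fit_one are made-up stand-ins that fit a normal distribution to four quantiles using base R only, so it runs without the sn package or the original data. With the real code you would time possible_myf on df$date[1] and df$date[1:5] in the same way.

```r
# Toy stand-in for the skew-t fit (hypothetical, base R only):
# recover the mean and sd of a normal from four target quantiles.
probs <- c(0.05, 0.25, 0.75, 0.95)

toy_loss <- function(par, x) {
  # exp() keeps the standard deviation positive during the search
  q <- qnorm(probs, mean = par[1], sd = exp(par[2]))
  sum((q - x)^2)
}

# One vector of target quantiles per "date" (made-up values)
n_dates <- 200
targets <- lapply(seq_len(n_dates),
                  function(i) qnorm(probs, mean = i / 100, sd = 1.5))

# Fit one date with optim's default method
fit_one <- function(x) optim(c(0, 0), toy_loss, x = x)$par

# Time 1 fit, then 5 fits, and extrapolate to the full set
t1 <- system.time(fit_one(targets[[1]]))[["elapsed"]]
t5 <- system.time(fits5 <- lapply(targets[1:5], fit_one))[["elapsed"]]
est_total <- t5 / 5 * n_dates

cat("1 fit:", t1, "s; 5 fits:", t5, "s; estimated total:", est_total, "s\n")
```

If the estimate for the full set comes out at hours rather than seconds, that tells you it is worth changing the optimizer before running the whole map.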

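The second thing I'd do is pass method = "BFGS" in the optim call. The default is Nelder-Mead, a derivative-free simplex search that often needs many function evaluations; BFGS is a quasi-Newton method that usually converges in fewer iterations on a smooth loss like this one, although that is not guaranteed for every problem.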
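As a small sketch of what switching the method looks like, here is the same kind of fit run with both methods. Again this is an assumption-laden toy (toy_loss and target are hypothetical, base R only), not the skew-t fit from the blog; in myf the change would just be adding method = "BFGS" to the existing optim call.

```r
# Toy problem (hypothetical): recover mean = 2, sd = 1.5 of a normal
# from four target quantiles.
probs  <- c(0.05, 0.25, 0.75, 0.95)
target <- qnorm(probs, mean = 2, sd = 1.5)

toy_loss <- function(par, x) {
  # exp() keeps the standard deviation positive during the search
  q <- qnorm(probs, mean = par[1], sd = exp(par[2]))
  sum((q - x)^2)
}

# Default method (Nelder-Mead)
time_nm <- system.time(
  fit_nm <- optim(c(0, 0), toy_loss, x = target)
)[["elapsed"]]

# BFGS; no gradient function is supplied, so optim approximates it
# with finite differences
time_bfgs <- system.time(
  fit_bfgs <- optim(c(0, 0), toy_loss, x = target, method = "BFGS")
)[["elapsed"]]

# Compare the estimates (mean, sd), convergence codes and timings
est_nm   <- c(fit_nm$par[1],   exp(fit_nm$par[2]))
est_bfgs <- c(fit_bfgs$par[1], exp(fit_bfgs$par[2]))
print(rbind(NelderMead = est_nm, BFGS = est_bfgs))
print(c(NelderMead = fit_nm$convergence, BFGS = fit_bfgs$convergence))
print(c(NelderMead = time_nm, BFGS = time_bfgs))
```

On a problem this small both timings will be close to zero, so check the speed-up on a few rows of the real data rather than trusting the toy.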


Thank you very much! Using "BFGS" reduced the running time substantially!

