Estimating an adalasso model

I am trying to estimate an adalasso model, as used in the paper [Real-time inflation forecasting with high-dimensional models: The case of Brazil](https://www.econ.puc-rio.br/mgarcia/170424realtimeinflationforecasting.pdf).

To estimate an adalasso, the first step is to estimate the coefficients with a plain Lasso model and then use them to compute a penalty factor (weight) for each variable.

The glmnet() function from the glmnet package has a penalty.factor argument that lets you supply a separate penalty factor for each variable.
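
With glmnet alone, the two steps would look roughly like this (just a sketch: x, y and the first-step lambda1 are placeholders for my data and the lambda chosen in step one):

library(glmnet)

# step 1: plain lasso to get initial coefficient estimates
init.lasso <- glmnet(x, y, alpha = 1)
init.coefs <- coef(init.lasso, s = lambda1)[-1, 1]  # drop the intercept

# step 2: adaptive lasso, penalizing each variable by 1 / |initial estimate|
# (glmnet accepts Inf penalty factors, which simply exclude that variable)
adalasso <- glmnet(x, y, alpha = 1, penalty.factor = 1 / abs(init.coefs))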

However, in tidymodels the model spec only accepts a single number for penalty, for example:

model.spec <- 
    linear_reg(penalty = 0.1, mixture = 1) %>%
    set_engine("glmnet")

I tried to modify it to the following:

model.spec <- 
    linear_reg(mixture = 1) %>%
    set_engine("glmnet", penalty.factor = penalty.factor.ada)

but I got this error:

Error in `.check_glmnet_penalty_fit()`:
! For the glmnet engine, `penalty` must be a single number (or a value of `tune()`).
• There are 0 values for `penalty`.
• To try multiple values for total regularization, use the tune package.
• To predict multiple penalties, use `multi_predict()`

Any ideas how to estimate an adalasso using the tidymodels package?


I'm not extremely familiar with tidymodels, but I think it's pretty simple here: you do need to specify a penalty within linear_reg(), but you can then freely pass a penalty.factor in set_engine(). Your examples above have one or the other; you need both (which makes sense, as the underlying {glmnet} function takes penalty.factor in addition to its lambda parameter).
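
Concretely, something like this should work (the 0.1 and penalty.factor.ada are placeholders for your chosen lambda and your vector of adaptive weights):

model.spec <-
  linear_reg(penalty = 0.1, mixture = 1) %>%
  set_engine("glmnet", penalty.factor = penalty.factor.ada)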

Here is a quick validation (I mixed tidymodels and base R code, so it's ugly, but it's just to show the result):

library(tidymodels)

data(Chicago)

n <- nrow(Chicago)
Chicago <- Chicago %>% select(ridership, Clark_Lake, Quincy_Wells)


# Using only standard glmnet approach
# Adapted from https://rpubs.com/kaz_yos/alasso
library(glmnet)

initial_lasso <- glmnet(x = Chicago |> select(-ridership) |> as.matrix(), y = Chicago$ridership,
                       alpha = 1)

# select "best" lambda (here we keep closest to 0.1)
penalty.factor.ada <- initial_lasso$beta[,which.min(initial_lasso$lambda - 0.1)]

# run adaptive lasso with the weights from above
alasso1 <- glmnet(x = Chicago |> select(-ridership) |> as.matrix(), y = Chicago$ridership,
                  alpha = 1,
                  penalty.factor = 1 / abs(penalty.factor.ada))

# check results
coef(alasso1, s = 0.1)
#> 3 x 1 sparse Matrix of class "dgCMatrix"
#>                     s1
#> (Intercept)  1.6910267
#> Clark_Lake   0.8767005
#> Quincy_Wells .



# All tidymodels
mod_simple_lasso <- 
  linear_reg(mixture = 1, penalty = 0.1) %>%
  set_engine("glmnet")

tm_initial_lasso <- mod_simple_lasso |>
  fit(ridership ~ ., data = Chicago)

# select "best" lambda (here we keep closest to 0.1)
extracted_coefs <- tm_initial_lasso$fit$beta[,which.min(tm_initial_lasso$fit$lambda - 0.1)]
tm_penalty.factor.ada <- 1 / abs(extracted_coefs)

model.spec <- 
  linear_reg(mixture = 1, penalty = .1) %>%
  set_engine("glmnet", penalty.factor = tm_penalty.factor.ada)

linreg_reg_fit <- model.spec |>
  fit(ridership ~ ., data = Chicago)

# same result
coef(linreg_reg_fit$fit, s = 0.1)
#> 3 x 1 sparse Matrix of class "dgCMatrix"
#>                     s1
#> (Intercept)  1.6910267
#> Clark_Lake   0.8767005
#> Quincy_Wells .
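
As a side note, if you want to avoid reaching into $fit for the final coefficients, I believe (from memory) parsnip's tidy() method for glmnet fits returns them at the penalty stored in the spec:

# should give a tibble with term, estimate and penalty columns
tidy(linreg_reg_fit)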

Created on 2023-10-16 with reprex v2.0.2


It works! Thank you!
