Dear Community,
Could you help me retrieve the subsets of explanatory variables returned by the “ols_step_best_subset” function from the olsrr package?
I would like to build some linear and nonlinear models based on the subsets of variables selected by particular criteria (AIC, BIC, Mallows’s Cp, etc.). To do this I’m using the “ols_step_best_subset” function (Variable Selection Methods).
An example for the Mallows’s Cp criterion:
model=lm(y~., data=AItraining)
library(olsrr)
SUBSETS<-ols_step_best_subset(model)
CP<-SUBSETS$cp
PRED<-SUBSETS$predictors
CP_MATRIX<-as.matrix(CP)
PRED_MATRIX<-as.matrix(PRED)
CP_VAR<-data.frame(CP_MATRIX,PRED_MATRIX)
library(dplyr)
V_CP<-filter(CP_VAR, CP_MATRIX == min(CP_MATRIX))[2]
The output shows that in this case the selected subset consists of 2 explanatory variables (x1 and x12):
V_CP
PRED_MATRIX
1 x1 x12
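The selection step above could also be sketched in base R without the intermediate matrices. This is only a rough sketch: the small table below is a made-up stand-in for the cp and predictors columns used above (the Cp values are hypothetical), and it assumes each predictors entry is a single space-separated string such as "x1 x12".

```r
# Hypothetical stand-in for the cp / predictors columns of SUBSETS
# (values invented for illustration, not real olsrr output)
subsets_tbl <- data.frame(
  cp         = c(5.2, 3.1, 4.0),
  predictors = c("x1", "x1 x12", "x1 x5 x12"),
  stringsAsFactors = FALSE
)

# Index of the row with the smallest Cp value
best_row <- which.min(subsets_tbl$cp)

# The predictor string for that row, as plain character
best_predictors <- as.character(subsets_tbl$predictors[best_row])
best_predictors
```

which.min returns the first minimum only, so ties would be resolved in favour of the smaller subset listed first.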
I’d like to build a model from these 2 variables, but I don’t know how to do this automatically. The target model would be, say:
y~x1+x12
At this point I’m doing it manually:
data_train_CP<-as.data.frame(cbind(AItraining$y,AItraining$x1,AItraining$x12))
names(data_train_CP)=c("y","x1","x12")
model_CP=lm(y~., data=data_train_CP)
summary(model_CP)
variables_CP_test<-as.data.frame(cbind(AItesting$x1,AItesting$x12))
names(variables_CP_test)=c("x1","x12")
variables_CP_test
Forecasts for the training dataset:
predict(model_CP)
Forecasts for the testing dataset:
predict(model_CP, newdata=AItesting)
The code works, but only when the variables are selected manually. I don’t know how to write code that builds this model with variables x1 and x12 “automatically” from the “PRED_MATRIX” column. Could you help me with this?
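One possible direction, as a minimal sketch rather than a confirmed solution: split the selected predictor string into variable names and build the formula with reformulate() from base R. The toy data frames below are hypothetical stand-ins for AItraining and AItesting, and best_predictors is assumed to hold the single string taken from the “PRED_MATRIX” column (e.g. as.character(V_CP$PRED_MATRIX)).

```r
# Hypothetical toy data standing in for AItraining / AItesting
set.seed(1)
AItraining <- data.frame(x1 = rnorm(50), x5 = rnorm(50), x12 = rnorm(50))
AItraining$y <- 2 * AItraining$x1 - AItraining$x12 + rnorm(50)
AItesting <- data.frame(x1 = rnorm(10), x5 = rnorm(10), x12 = rnorm(10))

# Assumed form of the selected subset: one space-separated string
best_predictors <- "x1 x12"

# Split the string into individual variable names
vars <- strsplit(best_predictors, " ", fixed = TRUE)[[1]]

# Build the formula y ~ x1 + x12 from the variable names
f <- reformulate(vars, response = "y")

# Fit the model on the training data using only the selected variables
model_CP <- lm(f, data = AItraining)
summary(model_CP)

# Forecasts for the training and testing datasets
pred_train <- predict(model_CP)
pred_test  <- predict(model_CP, newdata = AItesting)
```

Because the formula names the variables explicitly, there is no need to build a separate data frame containing only y, x1 and x12; the full training and testing data frames can be passed as they are.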