Hello everyone,
"train_data" is a list of 100 data frames. I would like to fit a separate random forest to each data frame, resulting in 100 individual models (RF_models). My problem is the parameter "mtry".
Instead of using a single mtry value for all data frames, I have prepared a vector (models_mtry) with 100 specific values, one optimally tuned value per data frame, and I want the script to use the corresponding value for each data frame. Here "corresponding" means that the first value of the vector is used for the first data frame in the list, the second value for the second data frame, and so on.
My code apparently isn't complete, because it always just uses the first value of the vector for all dataframes. I suspect I'll have to use an index for "mtry" with an additional variable included in the function. But, alas, no cigar.
RF_models <- lapply(train_data, function(i) {
  randomForest(i[-25], i$classes, mtry = models_mtry, ntree = 500,
               sampsize = smp.size, strata = i$classes)
})
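For reference, here is a minimal sketch of the element-wise pairing described above, using base R's Map() to walk through the list and the mtry vector in parallel. The toy data, the helper make_df(), the three mtry values and ntree = 50 are made up for illustration only. In the real call, i[-25], ntree = 500, sampsize and strata would stay as in the code above.

```r
library(randomForest)

set.seed(42)

# Hypothetical stand-in for train_data: 3 small data frames,
# each with 4 numeric predictors and a factor column "classes".
make_df <- function(n = 60) {
  data.frame(
    x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n),
    classes = factor(rep(c("a", "b"), length.out = n))
  )
}
train_data  <- list(make_df(), make_df(), make_df())
models_mtry <- c(1, 2, 3)  # made-up values, one per data frame

# Map() passes the k-th data frame together with the k-th mtry value.
RF_models <- Map(
  function(df, m) {
    randomForest(
      x = df[names(df) != "classes"],  # predictors only
      y = df$classes,
      mtry = m,                        # a single value, not the whole vector
      ntree = 50
    )
  },
  train_data, models_mtry
)

# The same pairing written with an explicit index.
RF_models_idx <- lapply(seq_along(train_data), function(k) {
  df <- train_data[[k]]
  randomForest(
    x = df[names(df) != "classes"],
    y = df$classes,
    mtry = models_mtry[k],
    ntree = 50
  )
})

# Check which mtry each fitted model actually used.
used_mtry <- sapply(RF_models, function(m) m$mtry)
print(used_mtry)
```

Both variants assume that models_mtry has the same length and order as train_data. Map() recycles a shorter argument, so a length check beforehand (for example stopifnot(length(models_mtry) == length(train_data))) may be worth adding.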
Thanks in advance for your support.