mmce.test.mean = NA for Neuralnet & Random Forest Hyperparameter Tuning in mlr

I am using mlr to create ML models for research. I would use mlr3, but the paper that I am building on uses mlr, so I am sticking with the older library.

I am having an issue with the performance metrics from hyperparameter tuning for two of the models I'm using: the neural network and the random forest.

Here is my code for a function that I use to return a predictor of a given type (NB: TUNEITERS = 100L and RESAMPLING = cv5):

getPredictor <- function(ml_alg_id, 
                         data,
                         data_id,
                         target,
                         target_values) {
  
  # Task for classification.
  data.task = makeClassifTask(id = data_id,
                              data = data,
                              target = target,
                              positive = target_values[POS_CLV_INDEX])
  
  # Initialise parallelisation.
  parallelMap::parallelStartSocket(parallel::detectCores(), level = "mlr.tuneParams")
  
  # Choose & train the model and set the predictor.
  pred = NULL
  if (ml_alg_id == NN_ALG_ID) {
    
    # Learner: Neural network.
    lrn = makeLearner("classif.nnet",
                      predict.type = "prob",
                      fix.factors.prediction = TRUE)
    
    # Normalisation/dummy encode.
    data.lrn = cpoScale() %>>% cpoDummyEncode() %>>% lrn
    
    # Parameters for tuning.
    param_grid = makeParamSet(
      makeNumericParam("size", lower = 1, upper = 20),
      makeNumericParam("decay", lower = 0.1, upper = 0.9)
    )
    
    # Random search for tuning method.
    tune_control = makeTuneControlRandom(maxit = TUNEITERS)
    
    # Tune.
    data.lrn.tuned = tuneParams(data.lrn, 
                                task = data.task, 
                                resampling = RESAMPLING, 
                                par.set = param_grid, 
                                control = tune_control)
    
    # Train the model.
    data.model = mlr::train(data.lrn.tuned$learner, data.task)
    
    # Set as predictor.
    pred = Predictor$new(model = data.model,
                         data = data,
                         class = target_values[POS_CLV_INDEX])
  }
  else if (ml_alg_id == RF_ALG_ID) {
    
    # Learner: Random Forest.
    lrn = makeLearner("classif.randomForest", 
                      predict.type = "prob", 
                      fix.factors.prediction = TRUE)
    
    # Parameters for tuning.
    param_grid = makeParamSet(
      makeIntegerParam("ntree", lower = 50, upper = 500),
      makeIntegerParam("mtry", lower = 1, upper = ncol(data) - 1)
    )
    
    # Random search for tuning method.
    tune_control = makeTuneControlRandom(maxit = TUNEITERS)
    
    # Tune.
    lrn.tuned = tuneParams(lrn, 
                           task = data.task, 
                           resampling = RESAMPLING, 
                           par.set = param_grid, 
                           control = tune_control)
    
    # Train the model.
    data.model = mlr::train(lrn.tuned$learner, data.task)
    
    # Set as predictor.
    pred = Predictor$new(model = data.model,
                         data = data,
                         class = target_values[POS_CLV_INDEX])
  }
  else if (ml_alg_id == SVM_ALG_ID) {
    
    # Learner: Support Vector Machine.
    lrn = makeLearner("classif.svm", predict.type = "prob")
    
    # Normalisation/dummy encode.
    data.lrn = cpoScale() %>>% cpoDummyEncode() %>>% lrn
    
    # Parameters for tuning.
    param.set = pSS(
      cost: numeric[0.01, 1]
    )
    
    # Tune.
    ctrl = makeTuneControlRandom(maxit = TUNEITERS * length(param.set$pars))
    lrn.tuning = makeTuneWrapper(lrn, RESAMPLING, list(mlr::acc), param.set, ctrl, show.info = FALSE)
    res = tuneParams(lrn, data.task, RESAMPLING, par.set = param.set, control = ctrl,
                     show.info = FALSE)
    performance = resample(lrn.tuning, data.task, RESAMPLING, list(mlr::acc))$aggr
    data.lrn = setHyperPars2(data.lrn, res$x) 
    
    # Train the model.
    data.model = mlr::train(data.lrn, data.task)
    
    # Set as predictor.
    pred = Predictor$new(model = data.model, 
                         data = data, 
                         class = target_values[POS_CLV_INDEX],
                         conditional = FALSE)
    
    # Fit conditional inference trees.
    ctr = partykit::ctree_control(maxdepth = 5L)
    set.seed(1234)
    pred$conditionals = fit_conditionals(pred$data$get.x(), ctrl = ctr)
  }
  else {
    stop("Error: Invalid ML algorithm ID passed to getPredictor()")
  }
  
  # Stop parallelisation.
  parallelMap::parallelStop()
  
  return(pred)
}

Here is the message I receive for the result of a neuralnet hyperparameter tuning:

I get mmce.test.mean=NA for random forests too.

As shown, I am using dummy encoding with mlrCPO for the neuralnet model. I am using no such encoding for random forest as I believe it can handle heterogeneous datasets.

What am I doing wrong here to cause mmce.test.mean=NA?

The issue was with this line. After I changed it to makeIntegerParam(), it worked. It seems the parameter was being skipped.
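For anyone landing here later, here is a minimal sketch of the kind of change described. It assumes the line in question was the "size" entry of the neural network parameter set; the post does not say which line it was, so treat that as a guess. nnet's size is the number of hidden units and needs whole numbers, whereas makeNumericParam() lets random search draw fractional values. sampleValue() and isFeasible() come from ParamHelpers, the package mlr uses for parameter sets, and are only used here to check the result.

```r
library(ParamHelpers)

# Assumption: "size" was the offending parameter. Declaring it as an integer
# parameter means random search only ever proposes whole numbers of hidden units.
param_grid = makeParamSet(
  makeIntegerParam("size", lower = 1L, upper = 20L),
  makeNumericParam("decay", lower = 0.1, upper = 0.9)
)

# Draw one random configuration, as random search would, and inspect it.
set.seed(1234)
x = sampleValue(param_grid)
print(x)

# The sampled configuration should be valid for the parameter set.
print(isFeasible(param_grid, x))
```

The rest of the tuning code stays the same. Note that the random forest parameter set in the question already uses makeIntegerParam(), so this sketch may not account for the NA seen there.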
