"function is not appropriate"

I am trying to use the "caret" library to train a multiclass classification algorithm.

I found the following website which supposedly contains a function that can be used for multiclass classification problem:Error metrics for multi-class problems in R: beyond Accuracy and Kappa | R-bloggers

multiClassSummary <- cmpfun(function (data, lev = NULL, model = NULL){
  #Load Libraries
  #Check data
  if (!all(levels(data[, "pred"]) == levels(data[, "obs"]))) 
    stop("levels of observed and predicted data do not match")
  #Calculate custom one-vs-all stats for each class
  prob_stats <- lapply(levels(data[, "pred"]), function(class){
    #Grab one-vs-all data for the class
    pred <- ifelse(data[, "pred"] == class, 1, 0)
    obs  <- ifelse(data[,  "obs"] == class, 1, 0)
    prob <- data[,class]

    #Calculate one-vs-all AUC and logLoss and return
    cap_prob <- pmin(pmax(prob, .000001), .999999)
    prob_stats <- c(auc(obs, prob), logLoss(obs, cap_prob))
    names(prob_stats) <- c('ROC', 'logLoss')
  prob_stats <- do.call(rbind, prob_stats)
  rownames(prob_stats) <- paste('Class:', levels(data[, "pred"]))
  #Calculate confusion matrix-based statistics
  CM <- confusionMatrix(data[, "pred"], data[, "obs"])
  #Aggregate and average class-wise stats
  #Todo: add weights
  class_stats <- cbind(CM$byClass, prob_stats)
  class_stats <- colMeans(class_stats)

  #Aggregate overall stats
  overall_stats <- c(CM$overall)
  #Combine overall with class-wise stats and remove some stats we don't want 
  stats <- c(overall_stats, class_stats)
  stats <- stats[! names(stats) %in% c('AccuracyNull', 
    'Prevalence', 'Detection Prevalence')]
  #Clean names and return
  names(stats) <- gsub('[[:blank:]]+', '_', names(stats))

I would like to use this function to train a "Decision Tree" model using the "F1 Score" (or any metric suitable for a multiclass problem). For example:


train.control <- trainControl(method = "repeatedcv", number = 10, repeats = 3, 
                              summaryFunction = multiClassSummary, classProbs = TRUE)

train_model <- train(my_data$response ~., data = my_data, method = "C5.0",
                 trControl=train.control ,
                 preProcess = c("center", "scale"),
                 tuneLength = 15,
                 metric = "F1")

But this produces the following error:

Error in ctrl$summaryFunction(testOutput, lev, method):
Your outcome has 4 levels. The prSummary() function isn't appropriate.

Can someone please show me how to fix this error? Thanks!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.