Error verified with confusionMatrix()

confusionMatrix() helps to discover that the sensitivity() and specificity() functions are wrong, since they produce negative and positive predictive values instead of true positive and true negative rates.

Hi, is this a statement, or do you have a question?

If it is a question, can you provide a reproducible example?

Below are examples that should help clarify what's happening. confusionMatrix() operates differently depending on the order of the factor levels in the data. For two-way tables, there is a positive argument to specify which level should be treated as the positive class; otherwise, the default is to use the first level (a short sketch of this argument follows the reprex below). Note also that the predictive values come from the separate posPredValue() and negPredValue() functions and, unlike sensitivity and specificity, depend on the prevalence.

To see how this plays out in the data you were looking at, please provide a reprex, as suggested by @williaml.

library(caret)
#> Loading required package: ggplot2
#> Loading required package: lattice

# from ?confusionMatrix
# what the function does with a two-class outcome
lvs <- c("normal", "abnormal")
truth <- factor(rep(lvs, times = c(86, 258)),
                levels = rev(lvs))
pred <- factor(
  c(
    rep(lvs, times = c(54, 32)),
    rep(lvs, times = c(27, 231))),
  levels = rev(lvs))

xtab <- table(pred, truth)

confusionMatrix(xtab)
#> Confusion Matrix and Statistics
#> 
#>           truth
#> pred       abnormal normal
#>   abnormal      231     32
#>   normal         27     54
#>                                           
#>                Accuracy : 0.8285          
#>                  95% CI : (0.7844, 0.8668)
#>     No Information Rate : 0.75            
#>     P-Value [Acc > NIR] : 0.0003097       
#>                                           
#>                   Kappa : 0.5336          
#>                                           
#>  Mcnemar's Test P-Value : 0.6025370       
#>                                           
#>             Sensitivity : 0.8953          
#>             Specificity : 0.6279          
#>          Pos Pred Value : 0.8783          
#>          Neg Pred Value : 0.6667          
#>              Prevalence : 0.7500          
#>          Detection Rate : 0.6715          
#>    Detection Prevalence : 0.7645          
#>       Balanced Accuracy : 0.7616          
#>                                           
#>        'Positive' Class : abnormal        
#> 
confusionMatrix(pred, truth)
#> Confusion Matrix and Statistics
#> 
#>           Reference
#> Prediction abnormal normal
#>   abnormal      231     32
#>   normal         27     54
#>                                           
#>                Accuracy : 0.8285          
#>                  95% CI : (0.7844, 0.8668)
#>     No Information Rate : 0.75            
#>     P-Value [Acc > NIR] : 0.0003097       
#>                                           
#>                   Kappa : 0.5336          
#>                                           
#>  Mcnemar's Test P-Value : 0.6025370       
#>                                           
#>             Sensitivity : 0.8953          
#>             Specificity : 0.6279          
#>          Pos Pred Value : 0.8783          
#>          Neg Pred Value : 0.6667          
#>              Prevalence : 0.7500          
#>          Detection Rate : 0.6715          
#>    Detection Prevalence : 0.7645          
#>       Balanced Accuracy : 0.7616          
#>                                           
#>        'Positive' Class : abnormal        
#> 
confusionMatrix(xtab, prevalence = 0.25)
#> Confusion Matrix and Statistics
#> 
#>           truth
#> pred       abnormal normal
#>   abnormal      231     32
#>   normal         27     54
#>                                           
#>                Accuracy : 0.8285          
#>                  95% CI : (0.7844, 0.8668)
#>     No Information Rate : 0.75            
#>     P-Value [Acc > NIR] : 0.0003097       
#>                                           
#>                   Kappa : 0.5336          
#>                                           
#>  Mcnemar's Test P-Value : 0.6025370       
#>                                           
#>             Sensitivity : 0.8953          
#>             Specificity : 0.6279          
#>          Pos Pred Value : 0.4451          
#>          Neg Pred Value : 0.9474          
#>              Prevalence : 0.2500          
#>          Detection Rate : 0.6715          
#>    Detection Prevalence : 0.7645          
#>       Balanced Accuracy : 0.7616          
#>                                           
#>        'Positive' Class : abnormal        
#> 

# from ?negPredValue
# (the help page repeats the same lvs/truth/pred/xtab setup as above,
# so those objects are simply reused here)

sensitivity(pred, truth)
#> [1] 0.8953488
sensitivity(xtab)
#> [1] 0.8953488
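# manual check: sensitivity() is the true positive rate TP / (TP + FN),
# with "abnormal" (the first factor level) as the positive class
xtab["abnormal", "abnormal"] / sum(xtab[, "abnormal"])
#> [1] 0.8953488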
posPredValue(pred, truth)
#> [1] 0.878327
posPredValue(pred, truth, prevalence = 0.25)
#> [1] 0.4450867

specificity(pred, truth)
#> [1] 0.627907
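# manual check: specificity() is the true negative rate TN / (TN + FP)
xtab["normal", "normal"] / sum(xtab[, "normal"])
#> [1] 0.627907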
negPredValue(pred, truth)
#> [1] 0.6666667
negPredValue(xtab)
#> [1] 0.6666667
negPredValue(pred, truth, prevalence = 0.25)
#> [1] 0.9473684
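# manual check via Bayes' rule: both predictive values follow from
# sensitivity, specificity, and the assumed prevalence
sens <- sensitivity(pred, truth)
spec <- specificity(pred, truth)
p <- 0.25
(sens * p) / (sens * p + (1 - spec) * (1 - p))        # PPV
#> [1] 0.4450867
(spec * (1 - p)) / ((1 - sens) * p + spec * (1 - p))  # NPV
#> [1] 0.9473684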


# sweep prevalence from near 0 to 0.99: sensitivity and specificity are
# fixed properties of the classifier, but the predictive values are not
prev <- seq(0.001, 0.99, length = 20)
npvVals <- ppvVals <- prev * NA
for (i in seq_along(prev)) {
  ppvVals[i] <- posPredValue(pred, truth, prevalence = prev[i])
  npvVals[i] <- negPredValue(pred, truth, prevalence = prev[i])
}

# plot: PPV and NPV move with prevalence, while sensitivity and
# specificity (the dashed reference lines) do not depend on it
plot(prev, ppvVals,
     ylim = c(0, 1),
     type = "l",
     ylab = "",
     xlab = "Prevalence (i.e. prior)")
points(prev, npvVals, type = "l", col = "red")
abline(h=sensitivity(pred, truth), lty = 2)
abline(h=specificity(pred, truth), lty = 2, col = "red")
legend(.5, .5,
       c("ppv", "npv", "sens", "spec"),
       col = c("black", "red", "black", "red"),
       lty = c(1, 1, 2, 2))

Created on 2023-11-08 with reprex v2.0.2
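For completeness, here is a minimal sketch of the positive argument mentioned above (the toy vectors are invented purely for illustration):

library(caret)

# two-level outcome; "bad" is the first level, so by default it would
# be treated as the positive class
obs  <- factor(c("good", "bad", "good", "good", "bad"),
               levels = c("bad", "good"))
pred <- factor(c("good", "good", "good", "bad", "bad"),
               levels = c("bad", "good"))

# override the default: all statistics are now reported relative to
# "good" as the positive class
cm <- confusionMatrix(pred, obs, positive = "good")
cm$positive
#> [1] "good"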
