Basic Usage

Two-class

set.seed(1234)

# Simulate a binary outcome from a logistic model
N = 2500
x = rnorm(N)
lp = 0 + .5*x
y = rbinom(N, size = 1, prob = plogis(lp))

model = glm(y ~ x, family = binomial)

# Classify as 1 when the predicted probability exceeds .5 (i.e., linear predictor > 0)
predict_class = as.integer(predict(model) > 0)
library(confusionMatrix)
confusion_matrix(predict_class, y)
$Accuracy
# A tibble: 1 x 5
  Accuracy `Accuracy LL` `Accuracy UL` `Accuracy Guessing` `Accuracy P-value`
     <dbl>         <dbl>         <dbl>               <dbl>              <dbl>
1    0.591         0.572         0.611               0.513           2.88e-15

$Other
# A tibble: 1 x 19
  Positive     N `N Positive` `N Negative` `Sensitivity/Re… `Specificity/TN…
  <chr>    <int>        <int>        <int>            <dbl>            <dbl>
1 0         2500         1283         1217            0.634            0.546
# … with 13 more variables: `PPV/Precision` <dbl>, NPV <dbl>, `F1/Dice` <dbl>,
#   Prevalence <dbl>, `Detection Rate` <dbl>, `Detection Prevalence` <dbl>,
#   `Balanced Accuracy` <dbl>, FDR <dbl>, FOR <dbl>, `FPR/Fallout` <dbl>,
#   FNR <dbl>, `D Prime` <dbl>, AUC <dbl>

$`Association and Agreement`
# A tibble: 1 x 6
  Kappa `Adjusted Rand`  Yule   Phi Peirce Jaccard
  <dbl>           <dbl> <dbl> <dbl>  <dbl>   <dbl>
1 0.180          0.0329 0.351 0.181  0.180   0.394

Multi-class

The function works for the multi-class setting as well, though not all statistics are applicable or available there. Note that the ‘Other’ statistics are provided for each class, as well as averaged across classes.

# Simulate predicted and observed classes with four unbalanced levels
p_multi = sample(letters[1:4], 250, replace = TRUE, prob = 1:4)
o_multi = sample(letters[1:4], 250, replace = TRUE, prob = 1:4)

confusion_matrix(p_multi, o_multi)
Warning in calc_agreement(conf_mat): Some association metrics may not be
calculated due to lack of 2x2 table
$Accuracy
# A tibble: 1 x 5
  Accuracy `Accuracy LL` `Accuracy UL` `Accuracy Guessing` `Accuracy P-value`
     <dbl>         <dbl>         <dbl>               <dbl>              <dbl>
1    0.344         0.285         0.406               0.388              0.933

$Other
# A tibble: 5 x 17
  Class     N `Sensitivity/Re… `Specificity/TN… `PPV/Precision`   NPV `F1/Dice`
  <chr> <dbl>            <dbl>            <dbl>           <dbl> <dbl>     <dbl>
1 a      24             0.0833            0.907          0.0870 0.903    0.0851
2 b      54             0.278             0.806          0.283  0.802    0.280 
3 c      75             0.333             0.714          0.333  0.714    0.333 
4 d      97             0.454             0.641          0.444  0.649    0.449 
5 Aver…  62.5           0.287             0.767          0.287  0.767    0.287 
# … with 10 more variables: Prevalence <dbl>, `Detection Rate` <dbl>,
#   `Detection Prevalence` <dbl>, `Balanced Accuracy` <dbl>, FDR <dbl>,
#   FOR <dbl>, `FPR/Fallout` <dbl>, FNR <dbl>, `D Prime` <dbl>, AUC <dbl>

$`Association and Agreement`
# A tibble: 1 x 6
   Kappa `Adjusted Rand` Yule  Phi   Peirce Jaccard
   <dbl>           <dbl> <lgl> <lgl> <lgl>  <lgl>  
1 0.0652       0.0000361 NA    NA    NA     NA     
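
The ‘Average’ row in the per-class output appears to be the simple (macro) mean of the class-specific values. A quick arithmetic check against the sensitivities printed above:

mean(c(0.0833, 0.278, 0.333, 0.454))  # ~0.287, matching the 'Average' row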

Available Statistics

Accuracy

  • Accuracy
  • Lower and upper confidence limits for accuracy
  • Guessing rate (the accuracy obtained by always predicting the most common class)
  • Test of accuracy vs. the guessing rate (p-value; see the sketch below)
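
The accuracy test can be reproduced, at least approximately, with a one-sided binomial test of the number of correct classifications against the guessing rate. This is only an illustrative sketch, not necessarily the package's internal computation:

# Illustrative only: binomial test of accuracy vs. the guessing rate,
# using predict_class and y from the two-class example above
n_correct = sum(predict_class == y)
guessing  = max(mean(y), 1 - mean(y))  # proportion of the most common class
binom.test(n_correct, length(y), p = guessing, alternative = 'greater')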

Additional Statistics

  • Relative class sizes
  • Sensitivity/Recall/True Positive Rate
  • Specificity/True Negative Rate
  • Positive Predictive Value/Precision
  • Negative Predictive Value
  • F1 score/Dice coefficient
  • Prevalence
  • Detection Rate
  • Detection Prevalence
  • Balanced Accuracy
  • False Discovery Rate
  • False Omission Rate
  • False Positive Rate/Fallout
  • False Negative Rate
  • D’ (d-prime)
  • Area under the ROC curve (AUC)

Association and Agreement

  • Kappa
  • Adjusted Rand
  • Yule
  • Phi
  • Peirce/Youden J
  • Jaccard
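
Most of the per-class statistics above are simple functions of the underlying (one-vs-rest) 2x2 counts. For example, Peirce/Youden's J is just sensitivity + specificity - 1, which for the two-class output above is 0.634 + 0.546 - 1 ≈ 0.180, matching the reported value. The helper below is a minimal sketch using standard definitions; the function name is made up for illustration and is not part of the package:

# Standard definitions applied to hypothetical one-vs-rest counts;
# illustrative only, not the package's code
rates_from_counts = function(tp, fp, fn, tn) {
  sens = tp / (tp + fn)                          # Sensitivity/Recall/TPR
  spec = tn / (tn + fp)                          # Specificity/TNR
  prec = tp / (tp + fp)                          # PPV/Precision
  c(
    sensitivity = sens,
    specificity = spec,
    precision   = prec,
    F1          = 2 * prec * sens / (prec + sens),
    youden_J    = sens + spec - 1,               # Peirce/Youden's J
    d_prime     = qnorm(sens) - qnorm(1 - spec)  # D'
  )
}

rates_from_counts(tp = 40, fp = 10, fn = 20, tn = 80)  # made-up counts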

Long vs. Wide

The default output is in wide format, which provides a simple way to pluck any particular statistic.

cm = confusion_matrix(predict_class, y)

cm[['Other']][['N']]
[1] 2500
cm[['Association and Agreement']][['Yule']]
[1] 0.3514883
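
Any of the other columns can be pulled the same way, for example the accuracy estimate or its p-value (column names as shown in the output above):

cm[['Accuracy']][['Accuracy']]           # ~0.591
cm[['Accuracy']][['Accuracy P-value']]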

However, if you want to present many statistics at once, the longer format may be preferable.

confusion_matrix(predict_class, y, longer = TRUE)
$Accuracy
# A tibble: 5 x 2
  Statistic            Value
  <chr>                <dbl>
1 Accuracy          5.91e- 1
2 Accuracy LL       5.72e- 1
3 Accuracy UL       6.11e- 1
4 Accuracy Guessing 5.13e- 1
5 Accuracy P-value  2.88e-15

$Other
# A tibble: 18 x 3
   Positive Statistic                 Value
   <chr>    <chr>                     <dbl>
 1 0        N                      2500    
 2 0        N Positive             1283    
 3 0        N Negative             1217    
 4 0        Sensitivity/Recall/TPR    0.634
 5 0        Specificity/TNR           0.546
 6 0        PPV/Precision             0.595
 7 0        NPV                       0.586
 8 0        F1/Dice                   0.614
 9 0        Prevalence                0.513
10 0        Detection Rate            0.326
11 0        Detection Prevalence      0.547
12 0        Balanced Accuracy         0.590
13 0        FDR                       0.405
14 0        FOR                       0.414
15 0        FPR/Fallout               0.454
16 0        FNR                       0.366
17 0        D Prime                   0.458
18 0        AUC                       0.627

$`Association and Agreement`
# A tibble: 6 x 2
  Statistic      Value
  <chr>          <dbl>
1 Kappa         0.180 
2 Adjusted Rand 0.0329
3 Yule          0.351 
4 Phi           0.181 
5 Peirce        0.180 
6 Jaccard       0.394 

Either way, the tibble output will work nicely with various packages for formatting tables.
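
For example, a minimal sketch using knitr::kable() (assuming knitr is installed; any table-formatting package would work similarly):

library(knitr)

cm_long = confusion_matrix(predict_class, y, longer = TRUE)

# Render the per-class statistics as a plain markdown table
kable(cm_long$Other, digits = 3)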