tidycomm and irr both offer very nice options for calculating various statistics of intercoder reliability, but to my knowledge they do not offer a way to show the intermediate agreement calculations for each item or unit. The example below shows what I mean, using percent agreement. Having the intermediate calculations would be helpful when reviewing coding with a team, because the results could easily be arranged by p_agree to identify items with low agreement.
I've looked around quite a bit for a function that shows agreement by item/unit for various statistics and accepts an arbitrary number of coders. The approach below would be difficult to repurpose for other statistics (e.g., Krippendorff's alpha) or for a different number of coders, because each pair of coders needs its own hand-written comparison column. Does anyone know of a package or function that I've overlooked? If not, is there a simple way to calculate percent agreement (to take a simple statistic as a starting point) for each item with an arbitrary number of coders?
Thanks
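library(tidyverse)

# Example ratings: 10 items coded by three coders (r1, r2, r3)
irr_data <- tibble(r1 = rep(c("yes", "no"), times = 5),
                   r2 = rep(c("yes", "no"), each = 5),
                   r3 = c(rep("yes", times = 3),
                          rep("no", times = 5),
                          rep("yes", times = 2)))

irr_data %>%
  # One indicator per coder pair: 1 if the two coders agree on the item, else 0
  mutate(r1_r2 = if_else(r1 == r2, 1, 0),
         r1_r3 = if_else(r1 == r3, 1, 0),
         r2_r3 = if_else(r2 == r3, 1, 0),
         # Agreeing pairs per item, then the share of the 3 possible pairs
         n_agree = rowSums(across(r1_r2:r2_r3)),
         p_agree = n_agree/3) %>%
  # Items with the lowest agreement come first
  arrange(p_agree)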
#> # A tibble: 10 x 8
#>    r1    r2    r3    r1_r2 r1_r3 r2_r3 n_agree p_agree
#>    <chr> <chr> <chr> <dbl> <dbl> <dbl>   <dbl>   <dbl>
#>  1 no    yes   yes       0     0     1       1   0.333
#>  2 no    yes   no        0     1     0       1   0.333
#>  3 yes   yes   no        1     0     0       1   0.333
#>  4 yes   no    no        0     0     1       1   0.333
#>  5 yes   no    yes       0     1     0       1   0.333
#>  6 no    no    yes       1     0     0       1   0.333
#>  7 yes   yes   yes       1     1     1       3   1
#>  8 yes   yes   yes       1     1     1       3   1
#>  9 no    no    no        1     1     1       3   1
#> 10 no    no    no        1     1     1       3   1
Created on 2022-11-23 by the reprex package (v2.0.1)
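
For reference, here is a rough sketch of how the pairwise approach above might generalize to any number of coders. It relies only on base R's combn() to enumerate coder pairs. The function name item_agreement and its coders argument are hypothetical names made up for illustration; they do not come from tidycomm or irr. The sketch assumes every column passed in is a coder column and that there are no missing ratings (an NA would propagate into n_agree).

```r
# Hypothetical helper (not part of tidycomm or irr): per-item pairwise
# percent agreement for an arbitrary number of coder columns.
item_agreement <- function(data, coders = names(data)) {
  ratings <- as.matrix(data[coders])

  # Every unordered pair of coders, one pair per column (2 x choose(k, 2))
  pairs <- combn(ncol(ratings), 2)

  # Logical matrix: rows are items, columns are coder pairs
  agree <- ratings[, pairs[1, ], drop = FALSE] ==
    ratings[, pairs[2, ], drop = FALSE]

  data$n_agree <- rowSums(agree)              # agreeing pairs per item
  data$p_agree <- data$n_agree / ncol(pairs)  # share of all possible pairs

  # Lowest-agreement items first
  data[order(data$p_agree), ]
}

# Same ratings as in the example above, built as a plain data frame
irr_data <- data.frame(
  r1 = rep(c("yes", "no"), times = 5),
  r2 = rep(c("yes", "no"), each = 5),
  r3 = c(rep("yes", 3), rep("no", 5), rep("yes", 2))
)

result <- item_agreement(irr_data)
```

With three coders, combn() produces the same three pairs that were written out by hand above, so n_agree and p_agree should match the manual version. Adding a fourth coder column would change the denominator to six pairs without any further edits. This covers percent agreement only; a per-item breakdown for a chance-corrected statistic such as Krippendorff's alpha would need a different calculation.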