`expect_equal()` test passes, even when difference is greater than tolerance threshold

lorae · November 12, 2024, 3:04pm

Can anyone kindly clarify how the tolerance argument works in the expect_equal function in testthat? I have some examples that appear as if they should be failing tests but actually pass.

library("testthat")

test_that('Two is equal to two', {expect_equal(2, 4.000001/2, tolerance = 1e-4)})

As expected, the output here is "Test passed".

test_that('Two is equal to two', {expect_equal(2, 4.8/2, tolerance = 1)})

As expected, the output here is also "Test passed".

test_that('Two is equal to two', {expect_equal(2, 4.8/2, tolerance = 0.4)})

Since 2.4 - 2 = 0.4, this should be the lowest tolerance level for which the test passes. Indeed, the output is "Test passed".

test_that('Two is equal to two', {expect_equal(2, 4.8/2, tolerance = 0.2)})

Well, that's confusing... the output is still "Test passed".

test_that('Two is equal to two', {expect_equal(2, 4.8/2, tolerance = 0.1)})

Finally, the expected error:

── Failure: Two is equal to two ────────────────────────────────────────────────
2 not equal to 4.8/2.
1/1 mismatches
[1] 2 - 2.4 == -0.4

Error:
! Test failed
Backtrace:
    ▆
 1. ├─testthat::test_that(...)
 2. │ └─withr (local) `<fn>`()
 3. └─reporter$stop_if_needed()
 4.   └─rlang::abort("Test failed", call = NULL)

Can someone please explain what's going on here? Why is the test still passing when the difference is greater than the tolerance threshold?

nirgrahamuk · November 12, 2024, 4:02pm

The ?expect_equal documentation gives the formula for the difference that is compared against tolerance:

 mean(abs(x - y) / mean(abs(y))) < tolerance

So here are your evaluations:

x <- 2
y <- 4.8 / 2

mean(abs(x - y) / mean(abs(y)))        # relative difference, about 0.167
mean(abs(x - y) / mean(abs(y))) < 1    # TRUE
mean(abs(x - y) / mean(abs(y))) < 0.4  # TRUE
mean(abs(x - y) / mean(abs(y))) < 0.2  # TRUE
mean(abs(x - y) / mean(abs(y))) < 0.1  # FALSE

lorae · November 12, 2024, 4:47pm

Thank you very much for the clarification! From the documentation for compare {waldo}:

If non-NULL, used as threshold for ignoring small floating point difference when comparing numeric vectors. Using any non-NULL value will cause integer and double vectors to be compared based on their values, not their types, and will ignore the difference between NaN and NA_real_.

It uses the same algorithm as all.equal(), i.e., first we generate x_diff and y_diff by subsetting x and y to look only locations with differences. Then we check that mean(abs(x_diff - y_diff)) / mean(abs(y_diff)) (or just mean(abs(x_diff - y_diff)) if y_diff is small) is less than tolerance.

It seems like sometimes it's doing an absolute difference when "y_diff is small." Do you know what "small" means in this context?

Also, do you know of any way to use absolute tolerance threshold in my testing? (I'm open to using other functions aside from expect_equal if they have this option).

nirgrahamuk · November 14, 2024, 1:00pm

Source: waldo/R/num_equal.R at 64de755c9a14ad9248f04211fdaf0eb735efbbf3 · r-lib/waldo
The implementation is here:

  x_diff <- x[!same]
  y_diff <- y[!same]

  avg_diff <- mean(abs(x_diff - y_diff))
  avg_y <- mean(abs(y_diff))

  # compute relative difference when y is "large" but finite
  if (is.finite(avg_y) && avg_y > tolerance) {
    avg_diff <- avg_diff / avg_y
  }

  avg_diff < tolerance

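So "small" means that the mean absolute value of y_diff is not larger than the tolerance itself (or is not finite). Only when avg_y is finite and greater than tolerance is the difference divided by it, which gives a relative comparison; otherwise the absolute difference is compared against tolerance directly.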
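To see both branches in action, the excerpt above can be wrapped in a small standalone function. This is only a sketch: `within_tolerance` is a hypothetical name, not part of waldo, and `same` is assumed here to be plain elementwise equality with no NA handling.

```r
# Hypothetical helper mirroring the waldo excerpt; not waldo's actual API.
within_tolerance <- function(x, y, tolerance) {
  same <- x == y                 # assumption: simple equality, no NA handling
  if (all(same)) return(TRUE)    # nothing differs, so nothing to compare

  x_diff <- x[!same]
  y_diff <- y[!same]

  avg_diff <- mean(abs(x_diff - y_diff))
  avg_y <- mean(abs(y_diff))

  # relative difference only when avg_y is finite and larger than tolerance
  if (is.finite(avg_y) && avg_y > tolerance) {
    avg_diff <- avg_diff / avg_y
  }

  avg_diff < tolerance
}

x <- 2
y <- 4.8 / 2

within_tolerance(x, y, 0.2)   # relative branch: 0.4 / 2.4 is about 0.167, so TRUE
within_tolerance(x, y, 0.1)   # relative branch: about 0.167 is not < 0.1, so FALSE

# When y is small compared to the tolerance, the absolute difference is used.
within_tolerance(0.001, 0.002, 0.01)    # absolute branch: 0.001 < 0.01, so TRUE
within_tolerance(0.001, 0.002, 0.0015)  # relative branch: 0.5 is not < 0.0015, so FALSE
```

The last call shows the switch: the absolute difference (0.001) is below the tolerance, but because avg_y exceeds the tolerance the relative difference is used and the comparison fails.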

nirgrahamuk · November 14, 2024, 1:02pm

You can define your own testthat expectation functions. You could perhaps lift the waldo implementation that expect_equal relies on and remove the part about the division.

A guide to the general case is here:
Custom expectations • testthat
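Short of writing a full custom expectation, one simple way to get an absolute tolerance is to compute the check yourself and hand the result to expect_true(). The following is a minimal sketch; `abs_close` is a hypothetical helper, not something testthat provides, and it assumes x and y are numeric vectors without NAs.

```r
library(testthat)

# Hypothetical helper (not part of testthat): TRUE when every element of x
# is within `tolerance` of the matching element of y, in absolute terms.
abs_close <- function(x, y, tolerance) {
  all(abs(x - y) <= tolerance)
}

test_that("absolute tolerance is respected", {
  # |2 - 2.4| is about 0.4, so a tolerance of 0.5 passes and 0.2 does not
  expect_true(abs_close(2, 4.8 / 2, tolerance = 0.5))
  expect_false(abs_close(2, 4.8 / 2, tolerance = 0.2))
})
```

The trade-off is a less informative failure message than expect_equal() produces, since expect_true() only reports that the condition was not TRUE.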
