This blog has relocated to https://coolbutuseless.github.io and associated packages are now hosted at https://github.com/coolbutuseless.

29 April 2018

mikefc

Assert a logical statement over a code block

Most (all?) testing packages in R are built around the idea of testing a value at a particular moment in time e.g. “Check that a == 2 right now”.

This post proposes a way to test the expected change in value of a particular expression before & after some particular code is run. E.g. I might not know how long a data.frame is, but I do know that it should grow by 2 rows during the execution of some function.

Quick survey of testing packages

  • testthat
    • Easy testing within packages and in scripts
  • ensurer
    • Does interesting things with creating type-safe functions and checking return values
  • assertr
    • A great ROpenSci package for testing within a pipe
  • checkmate
    • Great high-level checking functions meant for checking arguments at the top of a function call.
    • Combines lots of standard checks into single calls to improve readability and reduce clutter.

Missing test type - testing over a code block

All the testing packages (AFAIK) offer testing of values at a single point in time e.g.

  • What is the value of this variable right now?
  • Is this variable NULL, right now?

However, I sometimes want a comparison test of how a value has changed before/after a block of code.

E.g. if you want to test that the length of the vector a is unchanged after running a section of code, it would look something like:

original_length <- length(a)
{
  ... block of code that does something with 'a' ...
}
stopifnot(length(a) == original_length)

I will often write this type of code to sanity check that I haven’t lost any rows from a data.frame during a complex merge calculation - especially when there might be many corner cases I thought I’d already handled!
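As a concrete version of that pattern, here is a minimal sketch of the manual before/after check wrapped around a merge. The `orders` and `customers` data.frames are made up purely for illustration.

```r
# Hypothetical example data, invented for this sketch
orders    <- data.frame(order_id = 1:4, customer_id = c(1L, 2L, 2L, 3L))
customers <- data.frame(customer_id = 1:3, name = c("Ann", "Bob", "Cy"))

# Remember the row count before the block of code runs
original_rows <- nrow(orders)

# The block of code under scrutiny: a left join should keep every order
orders <- merge(orders, customers, by = "customer_id", all.x = TRUE)

# Fails loudly if the merge dropped or duplicated any rows
stopifnot(nrow(orders) == original_rows)
```

The bookkeeping variable (`original_rows`) and the final `stopifnot()` are the parts that a block-level assertion could take over.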

assert_over()

The following code tests the validity of a statement over a code block.

The statement to test is of the form [after] [logical operator] [before].

E.g.

  • a == a + 1 – when we expect the value of a to be incremented by 1
  • nrow(df) == nrow(df) - 3 – when we expect 3 rows to be removed from the data.frame df
#-----------------------------------------------------------------------------
#' Check the validity of a statement evaluated over a code block
#' @param statement [after expression] [logical operator] [before expression]
#'                  e.g.  `a == a + 1`
#' @param code code block to be evaluated
#' @return If statement passes, return an `invisible(TRUE)`, otherwise
#'         raise an error
#-----------------------------------------------------------------------------
assert_over <- function(statement, code) {

  # Capture the statement so we can manipulate it
  exp <- substitute(statement)

  # Check that it is a sane statement of the form:  [after] == [before] or similar
  stopifnot(length(exp) == 3)
  stopifnot(as.character(exp[[1]]) %in% c('==', '!=', '>=', '>', '<', '<='))

  # Evaluate in order
  #  - the 'before' side of the statement
  #  - the actual code block
  #  - the 'after' side of the statement
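  # Evaluate both sides in the caller's environment so that variables local
  # to a calling function are found. `code` is a lazily evaluated argument,
  # so the block only runs (in the caller) when it is first touched here.
  env    <- parent.frame()
  before <- eval(exp[[3]], env)
  res    <- eval(code)
  after  <- eval(exp[[2]], env)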

  # The statement passes if the `before` and `after` values obey whatever 
  # logical argument the user provided
  statement_passed <- eval(as.call(list(exp[[1]], after, before)))

  if (!statement_passed) {
    stop("The following statement is not true: ", deparse(exp), call.=FALSE)
  }

  invisible(TRUE)
}
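To see why the function indexes the captured statement with `[[1]]`, `[[2]]` and `[[3]]`, it may help to take a statement apart by hand. In this small sketch `quote()` stands in for what `substitute()` captures inside the function, and the name `stmt` is just for illustration.

```r
# quote() stands in for what substitute() would capture inside the function
stmt <- quote(nrow(df) == nrow(df) - 3)

length(stmt)             # 3: operator, left-hand side, right-hand side
as.character(stmt[[1]])  # "=="
deparse(stmt[[2]])       # "nrow(df)"      (the 'after' expression)
deparse(stmt[[3]])       # "nrow(df) - 3"  (the 'before' expression)

# Rebuilding a call from the operator and two already-evaluated values
rebuilt <- as.call(list(stmt[[1]], 7, 7))
eval(rebuilt)            # TRUE, because 7 == 7
```

Because the operator is just the first element of the call, the same rebuild step works for any of the six comparison operators the function accepts.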

Test that we can write a code block which passes the assertion i.e. expect that the length of a increases by 1, and it does.

a <- c(1, 2, 3)

assert_over(
  length(a) == length(a) + 1,  
  {
    a <- c(a, 4)
  }
)

Test that we can write a code block which causes an error i.e. expect b to more than double in value, but it doesn’t.

b <- 2

assert_over(
  b > 2 * b,  
  {
    b <- 3
  }
)

# Error: The following statement is not true: b > 2 * b

Conclusion

  • Proof-of-concept seems plausible.
  • Extensions
    • Common special case is to test that a particular expression doesn’t change e.g. nrow(df) is a constant. Write a similar function to do this i.e. assert_unchanged(statement, code)
    • Be able to run multiple tests over a code block. The call signature of such a function is going to need thinking through i.e. is it more sensible to:
      1. Have a signature like assert_over(code, ...) which shifts the code to be first, followed by a varying number of statements, or
      2. Have a call signature of assert_over(...) and automatically interpret the last item in the argument list as the code block, with the preceding arguments as the tests? (This is similar to how ensurer does some argument unpacking)
    • More checks needed to see that this behaves nicely when calling functions, or for more complex statements.
    • More verbose error with the actual evaluated values of the before/after shown e.g. “Error expecting b > 2 * b, but 3 > 4 is not true”
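
The first and last extensions in the list could be sketched roughly as follows. This is only one possible shape, not a finished design: `assert_unchanged()` is the hypothetical helper named above, the argument name `expression` is my own choice, and the error wording is an assumption.

```r
# Hypothetical helper sketching the 'assert_unchanged' extension
assert_unchanged <- function(expression, code) {
  expr <- substitute(expression)
  env  <- parent.frame()   # evaluate where the caller's variables live

  before <- eval(expr, env)
  force(code)              # lazy evaluation: the block runs here, in the caller
  after  <- eval(expr, env)

  # Report the evaluated before/after values, not just the expression
  if (!identical(before, after)) {
    stop("Expected ", deparse(expr), " to be unchanged, but it went from ",
         format(before), " to ", format(after), call. = FALSE)
  }
  invisible(TRUE)
}

df <- data.frame(x = c(3, 1, 2))

# Passes: reordering rows does not change the row count
assert_unchanged(nrow(df), {
  df <- df[order(df$x), , drop = FALSE]
})

# Fails: one row is dropped, so the message includes both values
result <- tryCatch(
  assert_unchanged(nrow(df), {
    df <- df[-1, , drop = FALSE]
  }),
  error = function(e) conditionMessage(e)
)
result
# "Expected nrow(df) to be unchanged, but it went from 3 to 2"
```

Using `identical()` keeps the check strict; a version that tolerates floating point noise would need something like `all.equal()` instead.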