Testing multiple implementations of a trait in Rust

I’ve been hacking on a small practice project in Rust where I implement the same
data structure in several different ways. When testing this project, I want to
run exactly the same set of tests on several types that implement the same
trait.

As a demonstrative example, let’s take the following trait:

pub trait Calculator {
    fn new() -> Self;
    fn add(&self, a: u32, b: u32) -> u32;
}

A straightforward implementation could be Foo:

pub struct Foo {}

impl Calculator for Foo {
    fn new() -> Self {
        Self {}
    }

    fn add(&self, a: u32, b: u32) -> u32 {
        a + b
    }
}

Or, if you enjoy the Peano axioms, a somewhat more involved
implementation could be Bar:

pub struct Bar {}

impl Calculator for Bar {
    fn new() -> Self {
        Self {}
    }

    // Peano-style addition: a + 0 = a, and a + b = (a + (b - 1)) + 1.
    fn add(&self, a: u32, b: u32) -> u32 {
        if b == 0 {
            a
        } else {
            self.add(a, b - 1) + 1
        }
    }
}

Our task is to write the same set of tests once, and invoke it on both
Foo and Bar with as little boilerplate as possible. Let’s examine
several approaches for doing this [1].

Straightforward trait-based testing

The most basic approach to testing our types would be something like:

#[cfg(test)]
mod tests {
    use crate::calculator::{Bar, Calculator, Foo};

    fn trait_tester<C: Calculator>() {
        let c = C::new();
        assert_eq!(c.add(2, 3), 5);
        assert_eq!(c.add(10, 43), 53);
    }

    #[test]
    fn test_foo() {
        trait_tester::<Foo>();
    }

    #[test]
    fn test_bar() {
        trait_tester::<Bar>();
    }
}

The trait_tester function can be invoked on any type that implements the
Calculator trait and can host a collection of tests. “Concrete” test
functions like test_foo then call trait_tester; the concrete test
functions are what the Rust testing framework sees because they’re marked with
the #[test] attribute.

On the surface, this approach seems workable; looking deeper, however, there
is a serious issue.

Suppose we want to write multiple test functions that test different
features and usages of our Calculator. We could add
trait_tester_feature1, trait_tester_feature2, etc. Then, the concrete
test functions would look something like:

#[test]
fn test_foo() {
    trait_tester::<Foo>();
    trait_tester_feature1::<Foo>();
    trait_tester_feature2::<Foo>();
}

#[test]
fn test_bar() {
    trait_tester::<Bar>();
    trait_tester_feature1::<Bar>();
    trait_tester_feature2::<Bar>();
}

Taken to the limit, there’s quite a bit of repetition here. In a realistic
project the number of tests can easily run into the dozens.

The problem doesn’t end here, though; in Rust, the unit of testing is
test_foo, not the trait_tester* functions. This means that only
test_foo will show up in the testing report, and there’s no easy way to run
only trait_tester_feature1, etc. A failed assertion in one trait_tester*
function also aborts the rest of test_foo, so the later checks never run.
Moreover, test parallelization can only happen between #[test] functions.

The fundamental issue here is: what we really want is to mark each of
the trait_tester* functions with #[test], but this isn’t trivial because
#[test] is a compile-time feature, and the compiler is supposed to know what
concrete types partake in each #[test] function definition.

Thankfully, Rust has just the tool for generating code at compile time.

First attempt with macros

Macros can help us generate functions tagged with #[test] at compile time.
Let’s try this:

macro_rules! calculator_tests {
    ($($name:ident: $type:ty,)*) => {
        $(
            #[test]
            fn $name() {
                let c = <$type>::new();
                assert_eq!(c.add(2, 3), 5);
                assert_eq!(c.add(10, 43), 53);
            }
        )*
    }
}

#[cfg(test)]
mod tests {
    use crate::calculator::{Bar, Calculator, Foo};

    calculator_tests! {
        foo: Foo,
        bar: Bar,
    }
}

The calculator_tests macro generates multiple #[test]-tagged functions,
one per type. If we run cargo test, we’ll see that the Rust testing
framework recognizes and runs them:

[...]
test typetest::tests::bar ... ok
test typetest::tests::foo ... ok
[...]

However, there’s an issue: how do we add more testing functions per type, as
discussed previously? If only we could do something like fn ${name}_feature1
to name a function. But alas, we cannot! On stable Rust, declarative macros have
no built-in way to paste fragments together into a new identifier. It might be
possible somehow (with a procedural macro or a helper crate, for example), but I
didn’t find a straightforward way to do it. Luckily, there’s a better solution.

Second attempt with macros

Instead of encoding the type variant in the function name, we can use a Rust
sub-module:

macro_rules! calculator_tests {
    ($($name:ident: $type:ty,)*) => {
        $(
            mod $name {
                use super::*;

                #[test]
                fn test() {
                    let c = <$type>::new();
                    assert_eq!(c.add(2, 3), 5);
                    assert_eq!(c.add(10, 43), 53);
                }
            }
        )*
    }
}

#[cfg(test)]
mod tests {
    use crate::calculator::{Bar, Calculator, Foo};

    calculator_tests! {
        foo: Foo,
        bar: Bar,
    }
}

Now all functions are named test, but they’re namespaced inside a module
with a configurable name. And yes, now we can easily add more testing functions:

macro_rules! calculator_tests {
    ($($name:ident: $type:ty,)*) => {
        $(
            mod $name {
                use super::*;

                #[test]
                fn test() {
                    let c = <$type>::new();
                    assert_eq!(c.add(2, 3), 5);
                    assert_eq!(c.add(10, 43), 53);
                }

                #[test]
                fn test_feature1() {
                    let c = <$type>::new();
                    assert_eq!(c.add(6, 9), 15);
                }
            }
        )*
    }
}

If we run cargo test, it works as expected:

test typetestmod::tests::bar::test ... ok
test typetestmod::tests::bar::test_feature1 ... ok
test typetestmod::tests::foo::test_feature1 ... ok
test typetestmod::tests::foo::test ... ok

Each test has its own full path, and is invoked separately. We can select which
tests to run from the command line by passing a name filter: cargo test bar::
runs only the tests for Bar, say, and cargo test feature1 runs the feature1
tests for all types. Also notice that the test names are reported “out of
order”; this is because they are all run concurrently!

To conclude, with some macro hackery the goal is achieved. We can now write any
number of tests in a generic way, and invoke all these tests on multiple types
with minimal duplication – just one extra line per type [2].

It’s not all perfect, though. Macros add a layer of indirection, and it leaks
into the error messages. If one of the assert_eq! invocations fails, the
reported line is at the point of macro instantiation, which is the same line
for all tests for any given type. This is quite inconvenient and makes debugging
failures more challenging. It could be that I’m missing something obvious, or
maybe this is a limitation of the Rust compiler. If you know how to fix this,
please drop me a line!
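
One possible mitigation, offered as a sketch and not as something taken from the original project: keep the assertions in ordinary generic functions and let the macro generate only thin #[test] wrappers. Since assert_eq! is then written in regular source code and not inside a macro body, a failure should be reported at the line of the assertion itself. The helper names check_basic and check_feature1 are hypothetical, and the trait and Foo are repeated here only to keep the sketch self-contained.

```rust
pub trait Calculator {
    fn new() -> Self;
    fn add(&self, a: u32, b: u32) -> u32;
}

pub struct Foo {}

impl Calculator for Foo {
    fn new() -> Self {
        Self {}
    }

    fn add(&self, a: u32, b: u32) -> u32 {
        a + b
    }
}

// Hypothetical generic checks. They are plain functions, so each
// assert_eq! has its own location in the source file.
pub fn check_basic<C: Calculator>() {
    let c = C::new();
    assert_eq!(c.add(2, 3), 5);
    assert_eq!(c.add(10, 43), 53);
}

pub fn check_feature1<C: Calculator>() {
    let c = C::new();
    assert_eq!(c.add(6, 9), 15);
}

#[cfg(test)]
mod tests {
    use super::*;

    // The macro only generates one-line #[test] wrappers, one module per type.
    macro_rules! calculator_tests {
        ($($name:ident: $type:ty,)*) => {
            $(
                mod $name {
                    use super::*;

                    #[test]
                    fn test() {
                        check_basic::<$type>();
                    }

                    #[test]
                    fn test_feature1() {
                        check_feature1::<$type>();
                    }
                }
            )*
        }
    }

    calculator_tests! {
        foo: Foo,
    }
}
```

The trade-off is that every new check needs both a generic function and a wrapper line in the macro, and the checked types must implement Calculator for the generic functions to accept them.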

[1]
The full source code for this post can be found
on GitHub.

[2]
Sharp-eyed readers will note that using this approach the common trait
isn’t actually needed at all! Macros work by textual substitution (AST
substitution, to be precise), so the generated code creates a concrete
type and invokes its methods. The macro-based tests would work even if
Foo and Bar didn’t declare themselves as implementing the
Calculator trait.
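
A minimal sketch of this point, assuming a hypothetical type Baz that has inherent new and add methods and never implements Calculator. The macro-generated test still compiles, because the macro only substitutes the type into the generated code:

```rust
// Hypothetical type with inherent methods only; no Calculator trait
// is defined or implemented anywhere in this sketch.
pub struct Baz {}

impl Baz {
    pub fn new() -> Self {
        Self {}
    }

    pub fn add(&self, a: u32, b: u32) -> u32 {
        a + b
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    macro_rules! calculator_tests {
        ($($name:ident: $type:ty,)*) => {
            $(
                mod $name {
                    use super::*;

                    #[test]
                    fn test() {
                        // Resolves to the inherent Baz::new and Baz::add.
                        let c = <$type>::new();
                        assert_eq!(c.add(2, 3), 5);
                        assert_eq!(c.add(10, 43), 53);
                    }
                }
            )*
        }
    }

    calculator_tests! {
        baz: Baz,
    }
}
```

If the tests should be tied to the trait, one option is to write the calls inside the macro as <$type as Calculator>::new() and Calculator::add(&c, 2, 3); those forms only compile for types that implement Calculator.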
