Test Suite
The test suite is run under normal conditions, then with various verification tools:
- Clang sanitisers (ASan, MSan, UBSan)
- Valgrind
- TIS interpreter
- CompCert, and its interpreter.
Constant time comparisons
Comparisons are pretty straightforward, but the bit twiddling involved still leaves room for error. The test suite uses property based tests that generate something like a couple million tests. They were designed through mutation testing: the comparison routines were modified incorrectly, to see if the tests would catch the error.
Initial versions of the tests didn’t catch them all. The current version does. It is now pretty hard to implement incorrect comparisons that would pass the tests by accident.
ChaCha20
Some test vectors came from RFC 7539. The rest were generated with libsodium. They test every input length, from zero to twice the ChaCha block size.
A couple test vectors were generated specifically to test counter overflow (it is expected to go back to zero).
Property based tests test the consistency of the streaming interface, the consistency of the counter interface, and when the input and output buffers are the same.
XChaCha20
Test vectors were generated by libsodium. No further tests were conducted, because they would be redundant with those of ChaCha20 (XChaCha20 differs only by its initialisation).
HChaCha20
XChaCha20 depends on HChaCha20, so there is no need for further tests (any error in HChaCha20 would fail the XChaCha20 tests)
Poly1305
This one is tricky, because modular arithmetic often allows subtle errors to creep in in ways that are very hard to test.
We use test vectors from RFC 7539, which are specifically designed to catch errors that random tests wouldn’t (most notably limb overflows). Other test vectors were generated by libsodium. They test every input length, from zero to twice the Poly1305 block size.
Property based tests test the consistency of the streaming interface and overlapping input/output buffers (which are allowed).
Additionally, a proof of correctness was conducted to further increase confidence. It is not machine checked, but it is fairly easy to review.
BLAKE2b
Test vectors were generated by libsodium. They span every input length from zero to twice the BLAKE2b block size, all possible key sizes (from 0 to 64), and all possible output sizes (from 1 to 64).
Property based tests test the consistency of the streaming interface, and overlapping input/output buffers (which are allowed).
One line of code proved impossible to test: the overflow of the 64-bit limb of the 128-bit counter:
static void blake2b_incr(crypto_blake2b_ctx *ctx)
{
u64 *x = ctx->input_offset;
size_t y = ctx->input_idx;
x[0] += y;
if (x[0] < y) {
x[1]++; // NEVER REACHED
}
}
We do not care, for two reasons:
- That line of code, and the associated conditional, are easily reviewed manually.
- That line of code will not be reached in production anyway: it would mean hashing more than 2^64 bytes of data, which isn’t going to happen any time soon. Even if there was a bug, it won’t be triggered.
SHA-512
Test vectors were generated by libsodium. They span every input length from zero to twice the SHA-512 block size.
Property based tests test the consistency of the streaming interface and overlapping input/output buffers (which are allowed).
Like BLAKE2b, one line of code proved impossible to test:
static void sha512_incr(u64 x[2], u64 y)
{
x[1] += y;
if (x[1] < y) {
x[0]++; // NEVER REACHED
}
}
We do not care, for the same reasons as BLAKE2b.
HMAC-SHA-512
Test vectors were generated by libsodium. We also added Wycheproof test vectors.
Property based tests test the consistency of the streaming interface and overlapping input/output buffers (which are allowed).
Argon2i
Test vectors were generated by libsodium. They span every working buffer size, from 508 blocks to 516 blocks (exceeding 512 blocks is important to reach all code paths in Argon2i), output hash sizes 63, 64, and 65, and every number of iterations from 3 to 6.
There’s also a test vector from the reference implementation, to make sure the key and additional data aren’t swapped.
Property based tests test overlapping buffers.
X25519
This one also involves modular arithmetic. Confidence was reached by leveraging the ref10 reference implementation. The code was heavily edited for readability and compactness, but the structure stayed the same. This most likely preserved the absence of subtle limb overflows.
Test vectors were generated by libsodium, and the first 1000 iterations of the Monte Carlo test are performed (so were the remaining 999000 iterations, but they are turned off by default). We also added the Wycheproof X25519 test vectors since 2.0.6.
EdDSA
Like X25519, EdDSA leverages existing implementations (ref10 and TweetNaCl). Subtle limb overflow errors are therefore as unlikely.
Test vectors were generated by Ed25519-Donna, which unlike libsodium can use a custom hash instead of the usual SHA-512. We also added the Wycheproof Ed25519 test vectors since 2.0.4, to make sure the vulnerability found in 2.0.3 doesn’t come back (we compile Monocypher with -DED25519_SHA-512 to be compatible with those test vectors).
Property based tests test round trip (signed messages should pass verification, forged messages should be rejected), random input (including points outside the curve for verification), overlapping buffers (which are allowed), and the consistency of the incremental interface.
Authenticated encryption
Test vectors were generated by libsodium (version 2.0.0 and later are compatible with libsodium), and property based tests test round trip and the consistency of the incremental interface.