Currently, at the end of a test vector (valid or invalid), we iterate over the flags one by one, unsetting each in turn and checking that doing so flips the transaction between success and failure.
This is called ‘minimality’, in that we want to check that each flag is sufficient on its own. However, I think we should instead be testing all 2**(number of specified flags) combinations, to ensure that out of all combinations of flags, only the one we have specified is valid (or invalid).
Otherwise, we might have e.g. subsets of flags that have interactions we’re not detecting here.
Interestingly, the average-case runtime here should actually be better (we don’t usually set that many flags, so 2**k with small k beats the fixed 32 iterations of the one-by-one checker), but the worst case is 2**32 flag combinations. It’s up to the test writers not to abuse this check.
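As a minimal sketch of what the exhaustive checker could look like (names are illustrative, not from any real codebase), the standard `sub = (sub - 1) & mask` bit trick enumerates exactly the 2**popcount(mask) subsets of the set flags, which is where the average-case win over 32 fixed iterations comes from:

```cpp
#include <cstdint>
#include <vector>

// Enumerate every subset of the flags set in `mask`, so a checker can
// verify that only the exact specified combination flips the
// transaction's validity. Walks all 2**popcount(mask) subsets.
std::vector<uint32_t> AllSubsetsOf(uint32_t mask) {
    std::vector<uint32_t> subsets;
    uint32_t sub = mask;
    while (true) {
        subsets.push_back(sub);
        if (sub == 0) break;
        // Bit trick: steps to the next-smaller subset of `mask`.
        sub = (sub - 1) & mask;
    }
    return subsets;
}
```

For a test with three flags set, this yields 2**3 = 8 subsets, versus 32 iterations for the one-by-one checker; with all 32 flags set it degenerates to 2**32, hence the caveat above.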
Contrived example demonstrating the problem:
```cpp
const auto sum = flag_a_set + flag_b_set + flag_c_set;
// require that the parity of the sum is even
if (sum & 1) throw "Oops";
```
Having set “a,b,c” for a test that expects failure, the one-by-one checker would check [“a,b”, “a,c”, “b,c”] and see no failures, since each sum is 2 (even parity). However, [“a”], [“b”], and [“c”] would each fail, since the sum is 1 (odd parity).
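The trap can be reproduced directly. In this sketch (all names hypothetical), the one-by-one checker sees nothing amiss on the parity example, while an exhaustive subset walk catches the size-1 failures:

```cpp
#include <cstdint>
#include <initializer_list>

constexpr uint32_t FLAG_A = 1, FLAG_B = 2, FLAG_C = 4;

// The contrived "transaction": it fails whenever an odd number of the
// three flags is set.
bool Passes(uint32_t flags) {
    int sum = !!(flags & FLAG_A) + !!(flags & FLAG_B) + !!(flags & FLAG_C);
    return (sum & 1) == 0;
}

// One-by-one minimality check: unset each specified flag individually
// and report whether any of those size-(k-1) subsets also fails.
bool OneByOneFindsFailure(uint32_t specified) {
    for (uint32_t f : {FLAG_A, FLAG_B, FLAG_C})
        if ((specified & f) && !Passes(specified & ~f)) return true;
    return false;
}

// Exhaustive check: walk all 2**k subsets of the specified flags and
// report whether any proper subset also fails.
bool ExhaustiveFindsFailure(uint32_t specified) {
    for (uint32_t sub = specified;; sub = (sub - 1) & specified) {
        if (sub != specified && !Passes(sub)) return true;
        if (sub == 0) return false;
    }
}
```

With `specified = FLAG_A | FLAG_B | FLAG_C`, `OneByOneFindsFailure` returns false (every size-2 subset passes), while `ExhaustiveFindsFailure` returns true, because the singleton subsets fail too.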
I’m not aware of any code that is written in this specific manner, but similar circumstances might arise naturally in the code.