Added a benchmark for the rolling bloom filter (with parameters corresponding to the tx rejection cache), which showed that on average, adding+checking one item takes around 2.1us, but the refresh action that wipes all old generation items every 60000 iterations takes 65ms, which is very significant.
Thus, this patch also changes the implementation from one that stores 16 2-bit integers in uint32_t’s, to one that stores the first bit of 64 2-bit integers in one uint64_t and the second bit in another. This allows for 450x faster refreshing (0.14ms) and 2.2x faster average adding+checking (0.93us).
All benchmarks done on an Intel Core i7-4800MQ CPU, running at 2.6 GHz, with binaries compiled with GCC 5.3.1.