Improve CRollingBloomFilter performance: replace modulus with FastMod #13176

pull martinus wants to merge 1 commits into bitcoin:master from martinus:optimize-CRollingBloomFilter changing 1 files +11 −2
  1. martinus commented at 12:15 PM on May 6, 2018: contributor

    Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3:

    RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before
    RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod
    

    Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems.

  2. replace modulus with FastMod
    Replaces the slow modulo operation with a much faster 32bit multiplication & shift. This works
    because the hash should be uniformly distributed between 0 and 2^32-1. This speeds up the benchmark
    by a factor of about 1.3:
    
    RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before
    RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod
    
    Be aware that this changes the position of the bits that are toggled, so this should probably
    not be used for CBloomFilter which is serialized.
    9aac9f90d5
  3. fanquake added the label Refactoring on May 6, 2018
  4. fanquake requested review from sipa on May 6, 2018
  5. sipa commented at 10:04 PM on May 6, 2018: member

    utACK 9aac9f90d5e56752cc6cbfac48063ad29a01143c

    Using FastMod instead of '%' looks correct here, and it indeed can't be used for normal bloom filters (which are normative for the BIP37 protocol).

  6. laanwj added the label Utils/log/libs on May 7, 2018
  7. laanwj removed the label Refactoring on May 7, 2018
  8. laanwj commented at 12:02 PM on May 7, 2018: member

    Changed the tag: as the result of FastMod is not the same as % this is not only a refactor.

    It looks like a feasible alternative in this case (h should be evenly distributed 0..2^32-1 by MurmurHash), and ~25% speed-up is nice. CRollingBloomFilter is used in the net processing code to keep track of recent rejects. This is a fast path (reject-quickly), so I think this matters.

    utACK 9aac9f90d5e56752cc6cbfac48063ad29a01143c

  9. in src/bloom.cpp:252 in 9aac9f90d5
     244 | @@ -245,6 +245,14 @@ static inline uint32_t RollingBloomHash(unsigned int nHashNum, uint32_t nTweak,
     245 |      return MurmurHash3(nHashNum * 0xFBA4C795 + nTweak, vDataToHash);
     246 |  }
     247 |  
     248 | +
     249 | +// A replacement for x % n. This assumes that x and n are 32bit integers, and x is a uniformly random distributed 32bit value
     250 | +// which should be the case for a good hash.
     251 | +// See https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
     252 | +static inline uint32_t FastMod(uint32_t x, size_t n) {
    


    Empact commented at 5:18 PM on May 7, 2018:

    nit: Could replace inline with constexpr


    laanwj commented at 8:58 PM on May 10, 2018:

    Would that make a difference in the generated code?


    sipa commented at 6:06 PM on May 11, 2018:

    That's very unlikely - the sort of optimizations that apply to constexpr things also apply to things which the compiler can infer being constant expressions (which is obvious here).

    The advantage to using constexpr is making sure those optimizations aren't made impossible for some reason.


    Empact commented at 12:04 AM on May 13, 2018:

    Agreed it's unlikely. My inclination is to minimize surface area and maximally restrain the code to reduce the space of errors that might occur. It's not always meaningful, but it can be a helpful precaution and can pay incidental dividends, e.g. by surfacing changes to the use of variables over time that were not intended. But yeah, it's a nit rather than an objection.

  10. jimpo commented at 7:01 AM on May 8, 2018: contributor

    utACK 9aac9f9

  11. gmaxwell commented at 12:18 AM on May 15, 2018: contributor

    Perhaps share code and or description with cuckoocache.h compute_hashes?

  12. martinus commented at 6:18 AM on May 15, 2018: contributor

    @gmaxwell oh right, compute_hashes does the same. What would the preferred way to remove this duplication, refactor cuckoocache.h and to use a static method fast_mod, include & use that in bloom.cpp?

  13. laanwj commented at 4:36 PM on May 18, 2018: member

    Normally sharing common code is a good thing but I'm not sure that is worth it here, this is only a few lines, and would involve sharing code between otherwise unrelated units. Would be different if e.g. bloom.cpp already included cuckoocache.h but now the way to now handle this would be to create a new header...

  14. laanwj merged this on May 18, 2018
  15. laanwj closed this on May 18, 2018

  16. laanwj referenced this in commit d9ebb63919 on May 18, 2018
  17. martinus deleted the branch on May 22, 2018
  18. codablock referenced this in commit c46a93149d on Apr 11, 2019
  19. UdjinM6 referenced this in commit 241f76f9bf on Apr 11, 2019
  20. jasonbcox referenced this in commit 1872dec713 on Oct 11, 2019
  21. jonspock referenced this in commit 7310855122 on Dec 27, 2019
  22. jonspock referenced this in commit 0556fd0dad on Dec 29, 2019
  23. zkbot referenced this in commit be459619a8 on Mar 5, 2021
  24. zkbot referenced this in commit 78de0cdf46 on Apr 15, 2021
  25. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me