9% less memory: make SaltedOutpointHasher noexcept #16957

pull martinus wants to merge 1 commit into bitcoin:master from martinus:2019-09-SaltedOutpointHasher-noexcept changing 1 file +9 −1
  1. martinus commented at 7:05 pm on September 24, 2019: contributor

    If the hash is not noexcept, unordered_map has to assume that it can throw an exception, so care needs to be taken when rehashing. libstdc++ solves this by simply caching the hash value, which increases the memory of each node by 8 bytes. Adding noexcept prevents this caching. In my experiments with -reindex-chainstate -stopatheight=594000, memory usage (maximum resident set size) decreased by 9.4% while runtime increased by 1.6% due to the additional hashing. Additionally, memusage::DynamicUsage() is now more accurate and no longer underestimates.
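
    A minimal sketch of what the change amounts to (the member layout and salt initialization of the hasher in src/coins.h are approximated here, not copied verbatim; COutPoint and SipHashUint256Extra are the existing type and function from the codebase):

     class SaltedOutpointHasher
     {
     private:
         /** Salt */
         const uint64_t k0, k1;

     public:
         SaltedOutpointHasher();

         // Marking the call operator noexcept tells libstdc++'s unordered_map that
         // hashing cannot throw, so it no longer caches the hash value in every
         // node (8 bytes per entry). SipHashUint256Extra does not throw, so the
         // annotation is safe.
         size_t operator()(const COutPoint& id) const noexcept
         {
             return SipHashUint256Extra(k0, k1, id.hash, id.n);
         }
     };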

    branch                                  runtime (h:mm:ss)   max RSS (kbyte)
    master                                  4:13:59             7696728
    2019-09-SaltedOutpointHasher-noexcept   4:18:11             6971412
    change                                  +1.65%              -9.42%

    [graph: comparison of sync progress, master vs. 2019-09-SaltedOutpointHasher-noexcept]

  2. MarcoFalke added the label Resource usage on Sep 24, 2019
  3. jamesob commented at 7:16 pm on September 24, 2019: member
    Concept ACK. Very exciting. Will start some benchmarks tonight.
  4. emilengler commented at 7:45 pm on September 24, 2019: contributor
    Concept ACK, good find
  5. kristapsk commented at 8:04 pm on September 24, 2019: contributor
    Concept ACK
  6. sipa commented at 8:10 pm on September 24, 2019: member

    Wow, awesome.

    ACK

  7. promag commented at 8:21 pm on September 24, 2019: member
    Little big PR! Concept ACK.
  8. sipa commented at 8:25 pm on September 24, 2019: member
    Is it slower?
  9. martinus commented at 8:32 pm on September 24, 2019: contributor
    Yes, it seems to be slower, by 1.6%, but I’m not sure how much of this is random test variation. I think the lower memory usage more than offsets this: if you increase dbcache by about 9%, memory usage will be about the same as before, but I’m pretty sure runtime will be faster because of better caching.
  10. DrahtBot commented at 10:35 pm on September 24, 2019: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #16910 (wallet: reduce loading time by using unordered maps by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  11. martinus commented at 5:23 am on September 25, 2019: contributor

    I did another run with the change and -dbcache=5471, so I allowed more memory. The result:

    branch                   runtime (h:mm:ss)   max RSS (kbyte)
    master -dbcache=5000     4:13:59             7696728
    noexcept -dbcache=5000   4:18:11             6971412
    noexcept -dbcache=5471   4:01:16             7282044

    This time synchronization was about 5% faster than master while still using about 5% less memory. Here is a graph with all the results:

    [graph: sync progress for the runs above]

  12. fanquake added this to the milestone 0.20.0 on Sep 25, 2019
  13. practicalswift commented at 8:12 am on September 25, 2019: contributor

    ACK 9538fce097a4acbb669e94770bce4ab8bff62031 – diff looks correct and SipHashUint256Extra does not throw

    Nice find! I believe there are similar opportunities waiting to be found: we underuse noexcept :)

    Somewhat related previous discussions about the pros/cons of noexcept:

    • #15926 (tinyformat: Add format_no_throw)
    • #13382 (util: Don't throw in GetTime{Millis,Micros}(). Mark as noexcept.)
  14. laanwj commented at 8:14 am on September 25, 2019: member
    ACK 9538fce097a4acbb669e94770bce4ab8bff62031, this is trivially correct; I had no idea that noexcept could make so much of a concrete difference in generated code.
  15. jonasschnelli commented at 8:35 am on September 25, 2019: contributor
    Nice! utACK 9538fce097a4acbb669e94770bce4ab8bff62031. Should we increase the dbcache by +~9% to not reduce the sync/reindex performance with default parameters (in this or in another PR)?
  16. jamesob commented at 2:22 pm on September 25, 2019: member

    First bench run coincides with @martinus’ findings:

     -reindex-chainstate to 550,000 (dbcache=5000)

     bench-s      martinus/2019-09-SaltedOutp...       2:41:35    1.00  6982.89MB
     bench-s      master                               2:38:42    0.98  7469.20MB

     on

     Hostname:            bench-s
     Kernel:              Linux 4.19.0-5-amd64
     OS:                  Debian GNU/Linux 10
     RAM (GB):            31.35
     Architecture:        x86_64
     CPU(s):              8
     Thread(s) per core:  2
     Core(s) per socket:  4
     Model name:          Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
    
  17. ryanofsky commented at 2:56 pm on September 25, 2019: member

    when rehashing care needs to be taken. libstdc++ solves this by simply caching the hash value

    Curious if this applies to libc++ as well as libstdc++

  18. MarcoFalke commented at 5:45 pm on September 25, 2019: member
    I’d prefer if the doxygen comment was extended a bit on the effects of the noexcept keyword. Maybe throw some links in there (https://gcc.gnu.org/onlinedocs/gcc-7.2.0/libstdc++/manual/manual/unordered_associative.html#containers.unordered.cache) ?
  19. jamesob commented at 6:21 pm on September 25, 2019: member

    Curious if this applies to libc++ as well as libstdc++

    @sdaftuar and I have tried reading through the libc++ implementation of hash tables to find the equivalent caching clause to what’s in libstdc++, but so far we haven’t been able to find it. Does anyone know where this happens in libc++ (clang)?

    I’ll report back soon with clang-based benches.

  20. martinus commented at 6:47 pm on September 25, 2019: contributor

    I’d prefer if the doxygen comment was extended a bit on the effects of the noexcept keyword. Maybe throw some links in there (https://gcc.gnu.org/onlinedocs/gcc-7.2.0/libstdc++/manual/manual/unordered_associative.html#containers.unordered.cache) ?

    Ah, I didn’t know this was documented; thanks for the link. I’ve added a comment.
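
    For reference, the added doxygen comment is along these lines (wording approximated here, not quoted from the commit; the link is the one suggested above):

     /**
      * Having the hash noexcept allows libstdc++'s unordered_map to recalculate
      * the hash during rehash, so it does not have to cache the value. This
      * reduces node size by 8 bytes.
      * See https://gcc.gnu.org/onlinedocs/gcc-7.2.0/libstdc++/manual/manual/unordered_associative.html
      */
     size_t operator()(const COutPoint& id) const noexcept
     {
         return SipHashUint256Extra(k0, k1, id.hash, id.n);
     }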

  21. make SaltedOutpointHasher noexcept
    If the hash is not noexcept, unordered_map has to assume that it can throw an exception, so care needs to be taken when rehashing. libstdc++ solves this by simply caching the hash value, which increases the memory of each node by 8 bytes. Adding noexcept prevents this caching. In my experiments with -reindex-chainstate -stopatheight=594000, memory usage has decreased by 9.4% while runtime has increased by 1.6% due to additional hashing. Additionally, memusage::DynamicUsage() is now more accurate and does not underestimate.
    67d99900b0
  22. martinus force-pushed on Sep 25, 2019
  23. jamesob commented at 2:15 pm on September 26, 2019: member

    Tested ACK https://github.com/bitcoin/bitcoin/pull/16957/commits/67d99900b0d770038c9c5708553143137b124a6c

    Clang results are in and they’re consistent with what we’ve seen so far (benched IBD 500,000-535,000 at dbcache=4000).

    [graph: ibd.local.range.500000.535000]

    2019-09-SaltedOutpointHasher-noexcept vs. master (absolute)

    bench name                                   x   2019-09-SaltedOutpointHasher-noexcept   master
    ibd.local.range.500000.535000.total_secs     3   3447.5602 (± 14.9648)                   3434.1072 (± 3.6158)
    ibd.local.range.500000.535000.peak_rss_KiB   3   5973120.0000 (± 15741.9850)             6472862.6667 (± 11442.6395)

    2019-09-SaltedOutpointHasher-noexcept vs. master (relative)

    bench name                                   x   2019-09-SaltedOutpointHasher-noexcept   master
    ibd.local.range.500000.535000.total_secs     3   1.004                                   1.000
    ibd.local.range.500000.535000.peak_rss_KiB   3   1.000                                   1.084
  24. practicalswift commented at 3:06 pm on September 26, 2019: contributor
    We should also bump the default dbcache to make sure IBD is not slowed down by the merge of this PR, right?
  25. jamesob commented at 3:17 pm on September 26, 2019: member

    We should also bump the default dbcache to make sure IBD is not slowed down by the merge of this PR, right?

    I think that’s a good idea. I’ll rerun a bench comparing master:-dbcache=450 (default) and martinus:-dbcache=500 to ensure the latter undershoots the former in terms of memory usage.

  26. jamesob commented at 3:21 pm on September 26, 2019: member

    It’s worth pointing out that when memory is not a limiting factor (e.g. a miner sets -dbcache=15000 for highest possible speed), this change will impose a minor performance penalty on the order of what we’ve seen above (<1.6%).

    I still think this change is worthwhile because using noexcept here is the “correct” phrasing of this program, and the net expected performance gain is positive since most users are limited by RAM. Still, if we at some point offered a configuration setting that optimized for speed above all else, this is something we would want to tweak.

  27. TheBlueMatt commented at 4:11 pm on September 26, 2019: member
    I’m somewhat confused by the “increase dbcache for equivalent performance” thing - shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage and thus increase the capacity of the map? Is there some way to get the performance back with some reasonable pre-allocation of the map?
  28. jamesob commented at 4:17 pm on September 26, 2019: member

    shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage and thus increase the capacity of the map?

    Nah unfortunately I don’t think so - because our memusage estimator optimistically assumes that each node in the hashmap consumes only as much memory as it takes to store the pair of key and value, plus each bucket pointer.

    Is there some way to get the performance back with some reasonable pre-allocation of the map?

    I’ve tried this with a call to cacheCoins.reserve(...) in init and didn’t see much of a difference, but might be worth investigating again.
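
    For context, the estimator in src/memusage.h is roughly the following (reproduced from memory here, so names and details may differ slightly from the actual source):

     // Node layout the estimator assumes: the stored pair plus one 'next' pointer.
     // It does not account for a cached hash value, which is why DynamicUsage()
     // underestimated when the (throwing) hasher made libstdc++ store an extra
     // 8 bytes per node.
     template<typename X>
     struct unordered_node : private X
     {
     private:
         void* ptr;
     };

     template<typename X, typename Y, typename Z>
     static inline size_t DynamicUsage(const std::unordered_map<X, Y, Z>& m)
     {
         return MallocUsage(sizeof(unordered_node<std::pair<const X, Y>>)) * m.size() +
                MallocUsage(sizeof(void*) * m.bucket_count());
     }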

  29. martinus commented at 4:55 pm on September 26, 2019: contributor

    I’m somewhat confused by the “increase dbcache for equivalent performance” thing - shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage

    The problem was that DynamicMemUsage was wrong before: it underestimated memory usage. Now it’s more correct.

    I think it might be possible to regain the performance loss by caching the hash ourselves. We have some padding in the map’s key, and it might be possible to use these bytes to store the hash value (or at least 4 bytes of it) without having to increase the memory.

  30. TheBlueMatt commented at 5:18 pm on September 26, 2019: member

    Ah, I’m too tired and misread the dropping of Z there. I suppose maybe it makes sense to fix it (or note in the release notes that people may want to slightly increase their dbcache).

  31. jamesob commented at 9:50 pm on September 26, 2019: member

    Interestingly, the difference isn’t pronounced near default dbcache values. This bench compares master at dbcache=450 (default) to this branch at dbcache=500. Master’s runtime is 1.042x longer with this configuration.

    Maybe the runtime difference isn’t as pronounced since this is only for a relatively small subset of the chain (35,000 blocks). In any case, it at least shows that a new dbcache default of 500 with this change still consumes less memory than master with the current default.

    [graph: ibd.local.range.500000.535000]

  32. fanquake added the label Needs release note on Sep 27, 2019
  33. laanwj referenced this in commit 6b9405e319 on Sep 30, 2019
  34. laanwj merged this on Sep 30, 2019
  35. laanwj closed this on Sep 30, 2019

  36. laanwj commented at 7:50 am on September 30, 2019: member

    It seems safe to merge this for 0.19.

    It’s clear that there are still opportunities left for optimization in and around the UTXO hash data structure, let’s look further at that for 0.20.

  37. fanquake removed this from the milestone 0.20.0 on Sep 30, 2019
  38. fanquake added this to the milestone 0.19.0 on Sep 30, 2019
  39. MarcoFalke commented at 5:41 pm on September 30, 2019: member
    Does this need release notes with potential action items for node operators, and/or should the default be bumped?
  40. martinus commented at 7:23 pm on September 30, 2019: contributor
    I’d say the default should be bumped from 450 to 500, and node operators can increase dbcache by 10% without needing more RAM than before.
  41. sidhujag referenced this in commit 340643971d on Oct 2, 2019
  42. MarcoFalke removed the label Needs release note on Nov 6, 2019
  43. deadalnix referenced this in commit 7caa48fc89 on Oct 20, 2020
  44. PastaPastaPasta referenced this in commit b2a0c7b880 on Jun 27, 2021
  45. PastaPastaPasta referenced this in commit 5d98102431 on Jun 28, 2021
  46. PastaPastaPasta referenced this in commit fb7cd146b5 on Jun 29, 2021
  47. PastaPastaPasta referenced this in commit 007846d7b6 on Jul 1, 2021
  48. PastaPastaPasta referenced this in commit 706de7de6f on Jul 1, 2021
  49. PastaPastaPasta referenced this in commit 8a66639707 on Jul 12, 2021
  50. PastaPastaPasta referenced this in commit 502a36c5ef on Jul 13, 2021
  51. PastaPastaPasta referenced this in commit bca013ddd6 on Jul 13, 2021
  52. martinus deleted the branch on Nov 8, 2021
  53. DrahtBot locked this on Nov 8, 2022
