martinus
commented at 7:05 pm on September 24, 2019:
contributor
If the hash is not noexcept, unordered_map has to assume that it can throw an exception. Thus, when rehashing, care needs to be taken. libstdc++ solves this by simply caching the hash value, which increases the memory of each node by 8 bytes. Adding noexcept prevents this caching. In my experiments with -reindex-chainstate -stopatheight=594000, memory usage (maximum resident set size) decreased by 9.4% while runtime increased by 1.6% due to the additional hashing. Additionally, memusage::DynamicUsage() is now more accurate and no longer underestimates.
| | runtime (h:mm:ss) | max RSS (kB) |
| --- | --- | --- |
| master | 4:13:59 | 7696728 |
| 2019-09-SaltedOutpointHasher-noexcept | 4:18:11 | 6971412 |
| change | +1.65% | -9.42% |
Comparison of progress: master vs. 2019-09-SaltedOutpointHasher-noexcept
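For readers skimming the thread, the change itself is tiny: it marks the hasher's call operator noexcept. A rough, abbreviated sketch of the class in question (based on src/coins.h; the constructor and salt initialization are omitted):

```cpp
// Abbreviated sketch of the hasher this PR touches, not the full class.
class SaltedOutpointHasher
{
    const uint64_t k0, k1; // per-instance SipHash salt

public:
    SaltedOutpointHasher();

    // The added `noexcept` is the whole change: it tells libstdc++ that rehashing
    // can safely re-invoke the hasher, so it stops caching a hash value per node.
    size_t operator()(const COutPoint& id) const noexcept
    {
        return SipHashUint256Extra(k0, k1, id.hash, id.n);
    }
};
```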
MarcoFalke added the label
Resource usage
on Sep 24, 2019
jamesob
commented at 7:16 pm on September 24, 2019:
member
Concept ACK. Very exciting. Will start some benchmarks tonight.
emilengler
commented at 7:45 pm on September 24, 2019:
contributor
Concept ACK, good find
kristapsk
commented at 8:04 pm on September 24, 2019:
contributor
Concept ACK
sipa
commented at 8:10 pm on September 24, 2019:
member
Wow, awesome.
ACK
promag
commented at 8:21 pm on September 24, 2019:
member
Little big PR! Concept ACK.
sipa
commented at 8:25 pm on September 24, 2019:
member
Is it slower?
martinus
commented at 8:32 pm on September 24, 2019:
contributor
Yes, it seems to be slower, by 1.6%, but I’m not sure how much of this is random test variation. I think the lower memory usage more than offsets this: if you increase dbcache by about 9%, memory usage will be about the same as before, but I’m pretty sure runtime will be faster because of better caching.
DrahtBot
commented at 10:35 pm on September 24, 2019:
member
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#16910 (wallet: reduce loading time by using unordered maps by achow101)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
martinus
commented at 5:23 am on September 25, 2019:
contributor
I did another run with the change and -dbcache=5471, so I allowed more memory. The result:
| | runtime (h:mm:ss) | max RSS (kB) |
| --- | --- | --- |
| master -dbcache=5000 | 4:13:59 | 7696728 |
| noexcept -dbcache=5000 | 4:18:11 | 6971412 |
| noexcept -dbcache=5471 | 4:01:16 | 7282044 |
This time synchronization was about 5% faster than master while still using 5% less memory. Here is a graph with all the results:
fanquake added this to the milestone 0.20.0
on Sep 25, 2019
practicalswift
commented at 8:12 am on September 25, 2019:
contributor
ACK 9538fce097a4acbb669e94770bce4ab8bff62031 – diff looks correct and SipHashUint256Extra does not throw
Nice find! I believe there are similar opportunities waiting to be found: we underuse noexcept :)
Somewhat related previous discussions about the pros/cons of noexcept:
#13382 (util: Don't throw in GetTime{Millis,Micros}(). Mark as noexcept.)
laanwj
commented at 8:14 am on September 25, 2019:
member
ACK 9538fce097a4acbb669e94770bce4ab8bff62031, this is trivially correct; I had no idea that noexcept could make so much of a concrete difference in the generated code.
jonasschnelli
commented at 8:35 am on September 25, 2019:
contributor
Nice!
utACK 9538fce097a4acbb669e94770bce4ab8bff62031.
Should we increase the dbcache by ~9% so as not to reduce the sync/reindex performance with default parameters (in this or in another PR)?
jamesob
commented at 2:22 pm on September 25, 2019:
member
First bench run coincides with @martinus’ findings:
-reindex-chainstate to 550,000 (dbcache=5000)

bench-s  martinus/2019-09-SaltedOutp...  2:41:35  1.00  6982.89MB
bench-s  master                          2:38:42  0.98  7469.20MB

on

Hostname: bench-s
Kernel: Linux 4.19.0-5-amd64
OS: Debian GNU/Linux 10
RAM (GB): 31.35
Architecture: x86_64
CPU(s): 8
Thread(s) per core: 2
Core(s) per socket: 4
Model name: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
ryanofsky
commented at 2:56 pm on September 25, 2019:
member
when rehashing care needs to be taken. libstdc++ solves this by simply caching the hash value
Curious if this applies to libc++ as well as libstdc++
MarcoFalke
commented at 5:45 pm on September 25, 2019:
member
jamesob
commented at 6:21 pm on September 25, 2019:
member
Curious if this applies to libc++ as well as libstdc++
@sdaftuar and I have tried reading through the libc++ implementation of hash tables to find the equivalent caching clause to what’s in libstdc++, but so far we haven’t been able to find it. Does anyone know where this happens in libc++ (clang)?
I’ll report back soon with clang-based benches.
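One property can be checked without digging through either standard library: whether the hasher is invocable without throwing, which is what libstdc++ keys its caching on. A small compile-time check along these lines (assuming the hasher still lives in coins.h):

```cpp
// Compile-time sanity check of the hasher property that drives libstdc++'s
// hash-caching decision; it says nothing about what libc++ does with it.
#include <utility>   // std::declval
#include "coins.h"   // SaltedOutpointHasher, COutPoint

static_assert(noexcept(std::declval<const SaltedOutpointHasher&>()(
                  std::declval<const COutPoint&>())),
              "SaltedOutpointHasher::operator() is expected to be noexcept after this PR");
```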
martinus
commented at 6:47 pm on September 25, 2019:
contributor
Ah, I didn’t know this was documented, thanks for the link. I’ve added a comment
make SaltedOutpointHasher noexcept
If the hash is not noexcept, unordered_map has to assume that it can throw an exception. Thus, when rehashing, care needs to be taken. libstdc++ solves this by simply caching the hash value, which increases the memory of each node by 8 bytes. Adding noexcept prevents this caching. In my experiments with -reindex-chainstate -stopatheight=594000, memory usage decreased by 9.4% while runtime increased by 1.6% due to the additional hashing. Additionally, memusage::DynamicUsage() is now more accurate and no longer underestimates.
67d99900b0
martinus force-pushed
on Sep 25, 2019
jamesob
commented at 2:15 pm on September 26, 2019:
member
Clang results are in and they’re consistent with what we’ve seen so far (benched IBD 500,000-535,000 at dbcache=4000).
2019-09-SaltedOutpointHasher-noexcept vs. master (absolute)

| bench name | x | 2019-09-SaltedOutpointHasher-noexcept | master |
| --- | --- | --- | --- |
| ibd.local.range.500000.535000.total_secs | 3 | 3447.5602 (± 14.9648) | 3434.1072 (± 3.6158) |
| ibd.local.range.500000.535000.peak_rss_KiB | 3 | 5973120.0000 (± 15741.9850) | 6472862.6667 (± 11442.6395) |

2019-09-SaltedOutpointHasher-noexcept vs. master (relative)

| bench name | x | 2019-09-SaltedOutpointHasher-noexcept | master |
| --- | --- | --- | --- |
| ibd.local.range.500000.535000.total_secs | 3 | 1.004 | 1.000 |
| ibd.local.range.500000.535000.peak_rss_KiB | 3 | 1.000 | 1.084 |
practicalswift
commented at 3:06 pm on September 26, 2019:
contributor
We should also bump the default dbcache to make sure IBD is not slowed down by the merge of this PR, right?
jamesob
commented at 3:17 pm on September 26, 2019:
member
We should also bump the default dbcache to make sure IBD is not slowed down by the merge of this PR, right?
I think that’s a good idea. I’ll rerun a bench comparing master:-dbcache=450 (default) and martinus:-dbcache=500 to ensure the latter undershoots the former in terms of memory usage.
jamesob
commented at 3:21 pm on September 26, 2019:
member
It’s worth pointing out that when memory is not a limiting factor (e.g. a miner sets -dbcache=15000 for highest possible speed), this change will impose a minor performance penalty on the order of what we’ve seen above (<1.6%).
I still think this change is worthwhile: using noexcept here is the “correct” phrasing of this program, and the net expected performance gain is positive since most users are limited by RAM. But it’s worth considering that if we at some point offered a configuration setting that optimized for speed above all else, this is something we would want to tweak.
TheBlueMatt
commented at 4:11 pm on September 26, 2019:
member
I’m somewhat confused by the “increase dbcache for equivalent performance” thing - shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage and thus increase the capacity of the map? Is there some way to get the performance back with some reasonable pre-allocation of the map?
jamesob
commented at 4:17 pm on September 26, 2019:
member
shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage and thus increase the capacity of the map?
Nah, unfortunately I don’t think so, because our memusage estimator optimistically assumes that each node in the hashmap consumes only as much memory as it takes to store the pair of key and value, plus each bucket pointer.
Is there some way to get the performance back with some reasonable pre-allocation of the map?
I’ve tried this with a call to cacheCoins.reserve(...) in init and didn’t see much of a difference, but might be worth investigating again.
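For reference, the optimistic per-node model described above looks roughly like the following simplified sketch, in the spirit of src/memusage.h rather than a verbatim copy of it (MallocUsage rounds a requested size up to what the allocator actually hands out):

```cpp
#include <unordered_map>
#include <utility>

// Simplified sketch of the estimate described above: one node allocation per
// element (the stored pair plus a next-pointer) and one pointer per bucket.
// A hash value cached inside each node, as libstdc++ adds for throwing hashers,
// is not modeled, which is why the estimate used to undershoot real usage.
template <typename X, typename Y, typename Z>
static size_t DynamicUsage(const std::unordered_map<X, Y, Z>& m)
{
    return MallocUsage(sizeof(std::pair<const X, Y>) + sizeof(void*)) * m.size()
         + MallocUsage(sizeof(void*) * m.bucket_count());
}
```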
martinus
commented at 4:55 pm on September 26, 2019:
contributor
I’m somewhat confused by the “increase dbcache for equivalent performance” thing - shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage
The problem was that DynamicMemUsage was wrong before: it underestimated memory usage. Now it’s more correct.
I think it might be possible to recover the lost performance by caching the hash ourselves. We have some padding in the map’s key, and it might be possible to use these bytes to store the hash value (or at least 4 bytes of it) without having to increase the memory.
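To make that idea concrete, here is a purely hypothetical sketch; none of these names exist in the tree, and whether the padding really absorbs the extra bytes depends on the actual node layout:

```cpp
// Hypothetical sketch only, not code from this PR: keep (part of) the salted hash
// next to the key so rehashing can reuse it instead of recomputing SipHash.
struct OutpointWithHash {
    COutPoint outpoint;
    uint32_t hash32;   // low 32 bits of the salted hash, filled in on insertion
};

struct CachedOutpointHasher {
    size_t operator()(const OutpointWithHash& k) const noexcept
    {
        return k.hash32;   // cheap during rehash: no SipHash call
    }
};

struct OutpointWithHashEqual {
    bool operator()(const OutpointWithHash& a, const OutpointWithHash& b) const noexcept
    {
        return a.outpoint == b.outpoint;   // equality ignores the cached hash
    }
};
```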
TheBlueMatt
commented at 5:18 pm on September 26, 2019:
member
Ah, I’m too tired and misread the dropping of Z there. I suppose maybe it makes sense to fix it (or note in the release notes that people may want to slightly increase their dbcache).
shouldn’t DynamicUsage(const std::unordered_map<X, Y, Z>& m) take into account the new slightly lower memory usage and thus increase the capacity of the map?
Nah unfortunately I don’t think so - because our memusage estimator optimistically assumes that each node in the hashmap consumes only as much memory as it takes to store the pair of key and value, plus each bucket pointer.
Is there some way to get the performance back with some reasonable pre-allocation of the map?
I’ve tried this with a call to cacheCoins.reserve(…) in init and didn’t see much of a difference, but might be worth investigating again.
jamesob
commented at 9:50 pm on September 26, 2019:
member
Interestingly, the difference isn’t pronounced near default dbcache values. This bench compares master at dbcache=450 (default) to this branch at dbcache=500. Master’s runtime is 1.042x longer with this configuration.
Maybe the runtime difference isn’t as pronounced since this is only for a relatively small subset of the chain (35,000 blocks). In any case, it at least shows that a new dbcache default of 500 with this change still consumes less memory than master with the current default.
fanquake added the label
Needs release note
on Sep 27, 2019
laanwj referenced this in commit
6b9405e319
on Sep 30, 2019
laanwj merged this
on Sep 30, 2019
laanwj closed this
on Sep 30, 2019
laanwj
commented at 7:50 am on September 30, 2019:
member
It seems safe to merge this for 0.19.
It’s clear that there are still opportunities left for optimization in and around the UTXO hash data structure, let’s look further at that for 0.20.
fanquake removed this from the milestone 0.20.0
on Sep 30, 2019
fanquake added this to the milestone 0.19.0
on Sep 30, 2019
MarcoFalke
commented at 5:41 pm on September 30, 2019:
member
Does this need release notes with potential action items for node operators and/or the default bumped?
martinus
commented at 7:23 pm on September 30, 2019:
contributor
I’d say the default should be bumped from 450 to 500, and node operators can increase dbcache by 10% without needing more RAM than before.
sidhujag referenced this in commit
340643971d
on Oct 2, 2019
MarcoFalke removed the label
Needs release note
on Nov 6, 2019
deadalnix referenced this in commit
7caa48fc89
on Oct 20, 2020
PastaPastaPasta referenced this in commit
b2a0c7b880
on Jun 27, 2021
PastaPastaPasta referenced this in commit
5d98102431
on Jun 28, 2021
PastaPastaPasta referenced this in commit
fb7cd146b5
on Jun 29, 2021
PastaPastaPasta referenced this in commit
007846d7b6
on Jul 1, 2021
PastaPastaPasta referenced this in commit
706de7de6f
on Jul 1, 2021
PastaPastaPasta referenced this in commit
8a66639707
on Jul 12, 2021
PastaPastaPasta referenced this in commit
502a36c5ef
on Jul 13, 2021
PastaPastaPasta referenced this in commit
bca013ddd6
on Jul 13, 2021