Use boost::unordered_map in CCoinsView #4413

pull laanwj wants to merge 2 commits into bitcoin:master from laanwj:2014_06_ccoinsview_hashmap changing 5 files +24 −17
  1. laanwj commented at 12:02 pm on June 25, 2014: member

    As we are not dependent on the ordering, a hash map could improve performance here. As suggested by @sipa.

    Consists of a first commit which makes the type easier to change - and should be easy to merge - and a second commit that actually changes the type. This needs benchmarking and extensive testing.

  2. typedef std::map<uint256, CCoins> to CCoinsMap
    This makes it possible to switch to a more efficient map type
    without changing all occurences manually.
    cd7072b0b3
  3. Change CCoinsMap to boost::unordered_map
    As we are not dependent on the ordering, a hash map could
    improve performance here. As suggested by @sipa.
    b1828db846
  4. laanwj commented at 12:28 pm on June 25, 2014: member

    We may need to check whether we don’t depend on iterators CCoinsViewCache iterators remaining stable under modifications.

    It doesn’t seem that we depend on that. The only place where we pass around the iterators is in FetchCoins/GetCoins/HaveCoins, but there we at most return a reference to a CCoins, not the iterator itself (and the doc http://www.boost.org/doc/libs/1_55_0/doc/html/boost/unordered_map.html mentions that pointers and references to elements are never invalidated).

    Outside of CCoinsViewCache (like in txdb) we only ever use constant CCoinsViewMaps.

  5. sipa commented at 12:51 pm on June 25, 2014: member
    Thanks for verifying. Sounds good.
  6. BitcoinPullTester commented at 9:18 pm on June 25, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4413_b1828db8465831e7d7fa842b39f14add03d7a7bd/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  7. laanwj commented at 9:53 am on June 26, 2014: member
    I checked re-indexing performance but it is not affected by this, at least not distinguishable within the variance. It consistently takes 4-4.5 hours. Now checking memory usage…
  8. laanwj commented at 7:24 am on June 27, 2014: member

    I’ve compared heap usage over time of a -reindex with and without the patch (using the heap profiler in gperftools):

    chart_ordered_unordered

    Not much difference in the overall trend or behavior; but a point of interest is that the memory usage with unordered map has fewer (and lower magnitude) sudden peaks.

  9. laanwj commented at 9:12 am on June 27, 2014: member

    I verified that the peak on x=33 for the ordered map is for a major part caused by memory allocation in CCoinsViewCache::BatchWrite / CCoinsViewCache::GetCoins.

    snapshot_ordered_31

    As we avoid these kinds of peaks with unordered_map - implying that there is less overhead - in my opinion it’s worth to merge this.

  10. gavinandresen commented at 4:51 pm on June 27, 2014: contributor
    Untested ACK.
  11. sipa commented at 3:44 pm on June 29, 2014: member
    Tested ACK.
  12. sipa commented at 5:21 pm on June 29, 2014: member

    There’s a mild exploitable performance weakness here: transaction signing can be grinded to make the constructed utxo entries end up in the same bucket, causing O(n) behaviour in that set.

    I think it’s low risk, but it can be easily avoided by using a more complex function uint256 -> 64bit hash that takes a secret parameter into account (which can be generated at random at startup).

  13. laanwj commented at 5:37 am on June 30, 2014: member
    Ouch. Maybe use a murmurhash with a random seed parameter set at start? We already have that in the source for bloom…
  14. sipa commented at 12:53 pm on June 30, 2014: member
    Having a random uint256 in CCoinsViewCache, generated at creation, and xor’ed with the txid before computing the hash (via some simple function) should do.
  15. laanwj commented at 1:34 pm on June 30, 2014: member
    So even murmurhash is not simple enough?
  16. laanwj commented at 10:11 am on July 1, 2014: member

    I don’t think the small win achieved here measures up against the extra brittleness involved in making the hashing non-deterministic between runs. In my experience this introduces problems (or at least extra work) while debugging and testing. std::map has a better worst-case performance, so it is a better fit.

    I’m going to pull the first commit to make it easier to experiment with alternative data structures here, but not the second.

  17. laanwj referenced this in commit dd638dd712 on Jul 1, 2014
  18. laanwj commented at 10:49 am on July 1, 2014: member
    First commit merged through dd638dd.
  19. laanwj closed this on Jul 1, 2014

  20. MathyV referenced this in commit bf897bc9ad on Nov 3, 2014
  21. reddink referenced this in commit 6e55972029 on May 27, 2020
  22. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-23 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me