[kernel 2a/n] Split hashing/index GetUTXOStats codepaths, decouple from coinstatsindex #24410

pull dongcarl wants to merge 15 commits into bitcoin:master from dongcarl:2022-02-libbitcoinkernel-p2-coinstats-minimal changing 11 files +146 −132
  1. dongcarl commented at 10:16 pm on February 21, 2022: contributor

    Part of: #24303 Depends on: #24322

    The GetUTXOStats function has 2 codepaths:

    • One which queries the CoinStatsIndex for the UTXO hash
    • One which actually performs the hashing

    For libbitcoinkernel, the only place where we call GetUTXOStats is in PopulateAndValidateSnapshots, which uses the SHA256D hash, and is therefore unable to use the CoinStatsIndex since that only provides MuHash hashes. Not that I think indices necessarily belong in libbitcoinkernel anyway.

    This PR separates these 2 aforementioned codepaths of GetUTXOStats, uses the hashing codepath in PopulateAndValidateSnapshots, and removes the need to link in index/coinstatsindex.cpp and node/coinstats.cpp.


    Logistically, this PR:

    • Extracts out the index_requested and hash_type members of CoinStats, which served as “in-params” to GetUTXOStats embedded within the CoinStats struct. This allows CoinStats to only consist of “out-param” members, and be returned by GetUTXOStats without needing to be an “in-out” param
    • Introduce the purely virtual UTXOHashers class, with 3 implementations: SHA256DHasher, MuHashHasher, and NullHasher. These replace the existing template-based polymorphism.
    • Split GetUTXOStats into:
      • CalculateUTXOStatsWithHasher(UTXOHasher&, ...), and
      • LookupUTXOStatsWithIndex(CoinStatsIndex&, ...)
    • Use CalculateUTXOStatsWithHasher directly where appropriate (src/validation.cpp and src/fuzz)
    • Move GetUTXOStats to rpc/blockchain, which is the only place that depends on GetUTXOStats’s weird fallback behaviour
    • Move LookupUTXOStatsWithIndex to index/coinstatsindex

    Code organization:

    • src/
      • kernel/ → only contains the hashing codepath
        • coinstats.cpp → hashing codepath implementations
        • coinstats.h → header for kernel/coinstats.cpp
      • index/ → only contains the index codepath
        • coinstatsindex.cpp → index codepath implementations
        • coinstatsindex.h
      • validation.cpp → only uses the hashing codepath
      • rpc/blockchain.cpp → uses both the hashing and index codepath, old GetUTXOStats fallback logic moved here as static
      • test/fuzz/coins_view.cpp → only uses the hashing codepath

    TODOs:

    • Commit messages could be fleshed out more

    Would love any feedback!

  2. DrahtBot added the label Needs rebase on Feb 21, 2022
  3. ryanofsky commented at 4:11 pm on February 24, 2022: contributor
    Approach ACK (a7883ac7a497f3d1cb52f5f71e71de5e5e9301f1). Looked through various commits and I think the changes all make sense
  4. dongcarl force-pushed on Apr 28, 2022
  5. dongcarl commented at 4:50 pm on April 28, 2022: contributor

    Pushed a7883ac7a4 -> c46aafaf11

    • Rebased over master
    • Made the scripted-diff actually work
  6. dongcarl marked this as ready for review on Apr 28, 2022
  7. DrahtBot removed the label Needs rebase on Apr 28, 2022
  8. DrahtBot added the label Build system on Apr 28, 2022
  9. DrahtBot added the label RPC/REST/ZMQ on Apr 28, 2022
  10. DrahtBot added the label UTXO Db and Indexes on Apr 28, 2022
  11. DrahtBot added the label Validation on Apr 28, 2022
  12. MarcoFalke commented at 6:13 pm on April 28, 2022: member
    Would an alternative be to treat assumeutxo as not part of the kernel for now (and completely move it out?)
  13. dongcarl commented at 7:02 pm on April 28, 2022: contributor

    @MarcoFalke

    Would an alternative be to treat assumeutxo as not part of the kernel for now (and completely move it out?)

    Sure, but I don’t prefer that alternative. I think it would actually be more work to extract assumeutxo from the kernel, plus if you look at the GetUTXOStats function and friends prior to this PR, it needed quite some cleaning up, which is done in this PR.

  14. jonatack commented at 7:11 pm on April 28, 2022: contributor
    Concept ACK
  15. dongcarl force-pushed on Apr 28, 2022
  16. dongcarl force-pushed on Apr 28, 2022
  17. DrahtBot commented at 1:00 am on April 29, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #25065 ([kernel 2c/n] Extract only what we use of init/common.cpp by dongcarl)
    • #24952 (rpc: Add sqlite format option for dumptxoutset by dunxen)
    • #24897 ([Draft / POC] Silent Payments by w0xlt)
    • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)
    • #19792 (rpc: Add dumpcoinstats by fjahr)
    • #13462 (Make SER_GETHASH implicit for CHashWriter and SerializeHash by Empact)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  18. willcl-ark commented at 10:28 am on April 29, 2022: contributor

    Perhaps it’s just my system, but the feature_coinstatsindex.py bitcoind does not get terminated after a successful run. This causes my script running tests against each commit to fail. All the other tests terminate just fine, and this happens running all tests and the test individually.

    The logs appear to show the test and node sutting down, but this does not appear to be the case in practice.

     0[I] [11:19] ubuntu:bitcoin (2022-02-libbitcoinkernel-p2-coinstats-minimal) | ₿ test/functional/feature_coinstatsindex.py
     12022-04-29T10:20:14.316000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_so5w4sxf
     22022-04-29T10:20:15.704000Z TestFramework (INFO): Test that gettxoutsetinfo() output is consistent with or without coinstatsindex option
     32022-04-29T10:20:15.758000Z TestFramework (INFO): Test that gettxoutsetinfo() can get fetch data on specific heights with index
     42022-04-29T10:20:15.850000Z TestFramework (INFO): Test gettxoutsetinfo() with index and verbose flag
     52022-04-29T10:20:15.963000Z TestFramework (INFO): Test that the index is robust across restarts
     62022-04-29T10:20:16.419000Z TestFramework (INFO): Test that the index works with -reindex
     72022-04-29T10:20:17.198000Z TestFramework (INFO): Test that -reindex-chainstate is disallowed with coinstatsindex
     82022-04-29T10:20:17.363000Z TestFramework (INFO): Test use_index option for nodes running the index
     92022-04-29T10:20:17.450000Z TestFramework (INFO): Test that index can handle reorgs
    10> /home/will/src/bitcoin/test/functional/feature_coinstatsindex.py(55)run_test()
    11-> self._test_index_rejects_hash_serialized()
    12(Pdb) l
    13 50             self.wallet = MiniWallet(self.nodes[0])
    14 51             self._test_coin_stats_index()
    15 52             self._test_use_index_option()
    16 53             self._test_reorg_index()
    17 54             import pdb; pdb.set_trace()
    18 55  ->         self._test_index_rejects_hash_serialized()
    19 56
    20 57         def block_sanity_check(self, block_info):
    21 58             block_subsidy = 50
    22 59             assert_equal(
    23 60                 block_info['prevout_spent'] + block_subsidy,
    24(Pdb) n
    252022-04-29T10:20:38.130000Z TestFramework (INFO): Test that the rpc raises if the legacy hash is passed with the index
    26--Return--
    27> /home/will/src/bitcoin/test/functional/feature_coinstatsindex.py(55)run_test()->None
    28-> self._test_index_rejects_hash_serialized()
    29(Pdb) n
    30> /home/will/src/bitcoin/test/functional/test_framework/test_framework.py(156)main()
    31-> exit_code = self.shutdown()
    32(Pdb) n
    332022-04-29T10:20:43.555000Z TestFramework (INFO): Stopping nodes
    342022-04-29T10:20:43.657000Z TestFramework (INFO): Cleaning up /tmp/bitcoin_func_test_so5w4sxf on exit
    352022-04-29T10:20:43.657000Z TestFramework (INFO): Tests successful
    36> /home/will/src/bitcoin/test/functional/test_framework/test_framework.py(157)main()
    37-> sys.exit(exit_code)
    38(Pdb) n
    39SystemExit: 0
    40> /home/will/src/bitcoin/test/functional/test_framework/test_framework.py(157)main()
    41-> sys.exit(exit_code)
    42(Pdb) c
    43[I] [11:20] ubuntu:bitcoin (2022-02-libbitcoinkernel-p2-coinstats-minimal) | ₿ ps -Af | rg bitcoind
    44will        6695    3480  0 Apr28 ?        00:14:46 bitcoind
    45will      586063    3480  0 11:20 pts/8    00:00:00 /home/will/src/bitcoin/src/bitcoind -datadir=/tmp/bitcoin_func_test_so5w4sxf/node1 -logtimemicros -debug -debugexclude=libevent -debugexclude=leveldb -uacomment=testnode1 -logthreadnames -logsourcelocations -coinstatsindex -reindex
    
  19. dongcarl force-pushed on Apr 29, 2022
  20. dongcarl commented at 6:55 pm on April 29, 2022: contributor
    @willcl-ark Interesting observation but I tested it on master and I see the same behaviour there too! (you should verify!) So this is unrelated to this PR’s changes. Perhaps you’d like to open a separate Issue about this?
  21. dongcarl force-pushed on Apr 29, 2022
  22. mzumsande commented at 7:49 pm on April 29, 2022: contributor

    @willcl-ark Interesting observation but I tested it on master and I see the same behaviour there too! (you should verify!) So this is unrelated to this PR’s changes. Perhaps you’d like to open a separate Issue about this?

    That was an oversight by me in #24789. I opened #25034 to fix this.

  23. willcl-ark commented at 7:52 pm on April 29, 2022: contributor

    @willcl-ark Interesting observation but I tested it on master and I see the same behaviour there too! (you should verify!) So this is unrelated to this PR’s changes. Perhaps you’d like to open a separate Issue about this?

    That was an oversight by me in #24789. I opened #25034 to fix this.

    Ah thanks Martin, that would explain it. I will cherry-pick that and re-run then :)

  24. willcl-ark commented at 10:19 pm on April 29, 2022: contributor
    @mzumsande cherry-picking https://github.com/bitcoin/bitcoin/commit/5345f752313c6bd7164109bdfd64ca3a55647c22 does indeed fix and allow all tests to pass at each commit in this PR
  25. dongcarl commented at 10:21 pm on April 29, 2022: contributor
    @willcl-ark Okay for me to hide your comments so as to not confuse future reviewers?
  26. willcl-ark commented at 10:22 pm on April 29, 2022: contributor

    @willcl-ark Okay for me to hide your comments so as to not confuse future reviewers?

    Please go ahead!

  27. dongcarl commented at 10:27 pm on April 29, 2022: contributor

    Pushed 756f1a755d138e36a9e50e7e4e46ae0bcd3975ca -> 30503c98a8bc97b582f072c34168b5279d65a492

    • Added commit “fuzz: Remove useless GetUTXOStats fuzz case”, adjusted subsequent commits
    • Removed some extraneous whitespace in scripted-diff

    Reviewers: Please let me know if the new “fuzz: Remove useless GetUTXOStats fuzz case” commit looks okay.

    Ping @practicalswift

  28. in src/index/coinstatsindex.h:69 in f2f72edaad outdated
    65@@ -62,4 +66,7 @@ class CoinStatsIndex final : public BaseIndex
    66 /// The global UTXO set hash object.
    67 extern std::unique_ptr<CoinStatsIndex> g_coin_stats_index;
    68 
    69+std::optional<kernel::CCoinsStats> GetUTXOStatsWithIndex(CoinStatsIndex& coin_stats_index, const CBlockIndex* pindex);
    


    mzumsande commented at 7:39 pm on April 30, 2022:

    commit f2f72edaadb03b52271a14906094530e23b3806c:

    Would it make sense to simplify the final interface in coinstatsindex? I didn’t try it out, but if CoinStatsIndex::LookUpStats() was refactored to set index_used = true internally and return an std::optional<kernel::CCoinsStats>, while the LookupBlockIndex() path was already done in rpc/blockchain.cpp, we could directly call that and get rid of the two GetUTXOStatsWithIndex() functions and have just one public CoinStatsIndex member function to return the statistics.


    dongcarl commented at 6:42 pm on May 2, 2022:
    Great idea! See: 7dbebb3a9c4ea691b11659f2b91e1e4b8e847251
  29. in src/rpc/blockchain.cpp:960 in 55c4a9ddee outdated
    929-                GetUTXOStats(coins_view, *blockman, prev_stats, hash_type, node.rpc_interruption_point, pindex->pprev, index_requested);
    930+                const std::optional<CCoinsStats> maybe_prev_stats = GetUTXOStats(coins_view, *blockman, hash_type, node.rpc_interruption_point, pindex->pprev, index_requested);
    931+                if (maybe_prev_stats.has_value()) {
    932+                    prev_stats = maybe_prev_stats.value();
    933+                } else {
    934+                    throw JSONRPCError(RPC_INTERNAL_ERROR, "Unable to read UTXO set");
    


    mzumsande commented at 7:45 pm on April 30, 2022:

    Commit 55c4a9ddeefdbe7a97bc9d908003693e3ff71968:

    I think the added JSONRPCError for the call of pindex->pprev makes this not a pure refactor (but an improvement, seems like an oversight that the result wasn’t checked before!). Maybe mention this in the commit message. Also, CacheSizes -> CCoinsStats in the commit msg?


    dongcarl commented at 6:43 pm on May 2, 2022:
    Right, I added some info in the commit message. Will be making the commit messages better in the coming days! 2155489fe6d103bdca26c0875058d563a3e8d5c0

    dongcarl commented at 10:45 pm on May 3, 2022:
    Commit messages are now complete! 😄
  30. mzumsande commented at 8:06 pm on April 30, 2022: contributor

    Concept ACK

    I was able to follow the commits step by step and the changes look sensible to me. One suggestion below about the coinstatsindex codepath.

  31. dongcarl force-pushed on May 2, 2022
  32. dongcarl commented at 6:40 pm on May 3, 2022: contributor

    Pushed 30503c98a8bc97b582f072c34168b5279d65a492 -> cfbe8efb08e42d404306c5d59d4a2e0719b2a788

    • Merged GetUTXOStatsWithIndex() logic into CoinStatsIndex::LookUpStats() as mentioned here: #24410 (review) (Thanks Martin!)
  33. dongcarl force-pushed on May 3, 2022
  34. dongcarl commented at 10:39 pm on May 3, 2022: contributor

    Pushed cfbe8efb08e42d404306c5d59d4a2e0719b2a788 -> 06dea7ea36eb1606cd4255e2875d0e4fe2fdd5a4

    • Rebased over master for merge of #25034
    • Fleshed out commit messages
    • Added previously missing (even before this PR) BOOST_CHECKs in “Move logic from GetUTXOStatsWithIndex to CoinStatsIndex::LookUpStats” (b3d18d7bfcfeba7ef3ead5de0ee9bd58300609e9)
    • Moved more declarations out of the “public header” in “coinstats: Split node/coinstats.h to kernel/**.h” (c4ab9ff40f1499f4abd2dd1915733c86bc4bccbf)
    • Added commit at the beginning: “kernel: Remove unnecessary index/blockfilterindex.cpp” to account for merge of #21726

    This PR is now “Very Ready for Review” 😄

  35. in src/index/coinstatsindex.cpp:322 in 06dea7ea36 outdated
    316@@ -316,28 +317,31 @@ static bool LookUpOne(const CDBWrapper& db, const CBlockIndex* block_index, DBVa
    317     return db.Read(DBHashKey(block_index->GetBlockHash()), result);
    318 }
    319 
    320-bool CoinStatsIndex::LookUpStats(const CBlockIndex* block_index, CCoinsStats& coins_stats) const
    321+std::optional<CCoinsStats> CoinStatsIndex::LookUpStats(const CBlockIndex* block_index) const
    322 {
    323+    CCoinsStats stats = kernel::MakeCoinStatsPrefilledWithBlockIndexInfo(block_index);
    


    Miminlaili commented at 10:47 pm on May 3, 2022:
  36. Miminlaili approved
  37. Miminlaili commented at 10:48 pm on May 3, 2022: none
  38. dongcarl force-pushed on May 4, 2022
  39. dongcarl force-pushed on May 4, 2022
  40. dongcarl commented at 6:45 pm on May 4, 2022: contributor

    Pushed 06dea7ea36eb1606cd4255e2875d0e4fe2fdd5a4 -> 660bb3066fff29bd2a16ea3adf9748d58f53363c

    • Also remove blockfilter.cpp from libbitcoinkernel_la_SOURCES (Thanks past Carl for leaving a TODO!)
  41. dongcarl commented at 7:48 pm on May 5, 2022: contributor

    I’ve gotten the feedback that the distinction between:

    • src/kernel/coinstats.h, and
    • src/kernel/include/coinstats.h

    Is probably not necessary in Phase 1 of the libbitcoinkernel project and potentially confusing and lead to churn since the public one (src/kernel/include/coinstats.h) is likely to be drastically changed in Phase 2. I will combine them in this and future libbitcoinkernel Phase 1 PRs if there’s not much opposition to that. Shouldn’t affect the meat of this PR that much.

  42. dongcarl force-pushed on May 5, 2022
  43. dongcarl commented at 10:09 pm on May 5, 2022: contributor

    Pushed 660bb3066fff29bd2a16ea3adf9748d58f53363c -> 8f50fe837eaa41b045c69a97617b2346d57bd879

  44. DrahtBot added the label Needs rebase on May 6, 2022
  45. in src/rpc/blockchain.cpp:812 in 8f50fe837e outdated
    807@@ -809,6 +808,28 @@ CoinStatsHashType ParseHashType(const std::string& hash_type_input)
    808     }
    809 }
    810 
    811+//! Calculate statistics about the unspent transaction output set
    812+
    


    fjahr commented at 3:07 pm on May 8, 2022:
    nit: unnecessary blank line

    dongcarl commented at 8:17 pm on May 10, 2022:
    Fixed as of push 8b853034a13f49b85cdf4168d9d49bd297f5b8ce
  46. dongcarl force-pushed on May 9, 2022
  47. DrahtBot removed the label Needs rebase on May 9, 2022
  48. in src/node/coinstats.h:32 in 82148cec1c outdated
    27@@ -28,9 +28,6 @@ enum class CoinStatsHashType {
    28 };
    29 
    30 struct CCoinsStats {
    31-    //! Which hash type to use
    32-    const CoinStatsHashType m_hash_type;
    


    theuni commented at 3:07 pm on May 9, 2022:
    As a neat side-effect, I believe this makes CCoinsStats moveable, which might actually be a helpful optimization.
  49. in src/node/coinstats.h:72 in 82148cec1c outdated
    70-    CCoinsStats(CoinStatsHashType hash_type) : m_hash_type(hash_type) {}
    71 };
    72 
    73 //! Calculate statistics about the unspent transaction output set
    74-bool GetUTXOStats(CCoinsView* view, node::BlockManager& blockman, CCoinsStats& stats, const std::function<void()>& interruption_point = {}, const CBlockIndex* pindex = nullptr);
    75+bool GetUTXOStats(CCoinsView* view, node::BlockManager& blockman, CCoinsStats& stats, CoinStatsHashType hash_type, const std::function<void()>& interruption_point = {}, const CBlockIndex* pindex = nullptr);
    


    theuni commented at 3:22 pm on May 9, 2022:

    Any reason not to handle this at build-time with a template param instead? I don’t feel strongly, but to me GetUTXOStats<CoinStatsHashType::MUHASH>(...) more effectively communicates that this is the “mode” of the function, as opposed to it being a random parameter.

    The implementation function is already templated anyway so I don’t think it’d be a big change.

    I know this function is going to change more in subsequent commits, so ignore ^^ if something else moots it.


    dongcarl commented at 8:22 pm on May 10, 2022:

    I think this is completely changed when we introduce ComputeUTXOStatsWithHasher(UTXOHasher& hasher, ...) with UTXOHasher& as the first parameter. Does that look alright?

    If it still looks unclear, I’m happy to add Doxygen comments to ComputeUTXOStatsWithHasher(UTXOHasher& hasher, ...) so that’s it’s more clear!


    dongcarl commented at 3:19 am on May 20, 2022:
    I’ve experimented with this on top of the latest push 90358d4b7402bfd4cd9b79ebf828e40f4eabd849, but the template parameter would be based on a runtime variable and that’s not allowed in C++ (at least in clang).
  50. in src/node/coinstats.h:73 in 82148cec1c outdated
    65@@ -69,12 +66,10 @@ struct CCoinsStats {
    66     CAmount total_unspendables_scripts{0};
    67     //! Total cumulative amount of coins lost due to unclaimed miner rewards up to and including this block
    68     CAmount total_unspendables_unclaimed_rewards{0};
    69-
    70-    CCoinsStats(CoinStatsHashType hash_type) : m_hash_type(hash_type) {}
    


    theuni commented at 3:26 pm on May 9, 2022:
    So the stats will no longer “remember” their hash type, but presumably that’s never a problem because the caller always knows?

    dongcarl commented at 3:23 am on May 20, 2022:
    Yup! Feel free to reopen this if it’s still a problem.
  51. in src/node/coinstats.cpp:83 in c853e71d4d outdated
    77@@ -78,6 +78,112 @@ static void ApplyHash(MuHash3072& muhash, const uint256& hash, const std::map<ui
    78     }
    79 }
    80 
    81+class UTXOHasher
    82+{
    83+public:
    


    theuni commented at 7:25 pm on May 9, 2022:

    Is there any reason to ever declare a base UTXOHasher? Maybe give it a protected constructor?

    Edit: I suppose the pure virtual function takes care of that, but imo a protected constructor is a nice way of communicating that the base class is not for direct use.


    dongcarl commented at 8:16 pm on May 10, 2022:

    Maybe give it a protected constructor?

    I’m a bit confused here, is the intent just to convey in code that it’s a “virtual base class” by protecting the constructor? Would a code comment suffice?

    Is there any reason to ever declare a base UTXOHasher?

    My initial thought was that leaving the abstract UTXOHasher in the header (as done in commits subsequent to the one you’re commenting on) could allow users to provide their specific implementation (e.g. they have some hash accelerator that we can’t make use of).

    This was very easily achieved by just leaving the class in the header, but I don’t want to waste any time thinking about this at all so if you feel strongly about it, I’ll just move it out.


    dongcarl commented at 3:17 am on May 20, 2022:
    No longer relevant as of 90358d4b7402bfd4cd9b79ebf828e40f4eabd849
  52. in src/node/coinstats.cpp:92 in c853e71d4d outdated
    87+    virtual ~UTXOHasher();
    88+};
    89+
    90+UTXOHasher::~UTXOHasher() = default;
    91+
    92+class NullHasher : public UTXOHasher {
    


    theuni commented at 7:26 pm on May 9, 2022:
    Any reason not to make these subclasses final?

    dongcarl commented at 8:17 pm on May 10, 2022:
    Fixed as of push 8b853034a13f49b85cdf4168d9d49bd297f5b8ce
  53. in src/node/coinstats.cpp:246 in 1f37d453ed outdated
    239@@ -241,31 +240,19 @@ static bool GetUTXOStats(CCoinsView* view, BlockManager& blockman, CCoinsStats&
    240     }
    241     if (!outputs.empty()) {
    242         ApplyStats(stats, prevkey, outputs);
    243-        ApplyHash(hash_obj, prevkey, outputs);
    244+        hasher.Apply(prevkey, outputs);
    245     }
    246 
    247-    FinalizeHash(hash_obj, stats);
    248+    stats.hashSerialized = hasher.Finalize();
    


    theuni commented at 7:34 pm on May 9, 2022:

    Hmm, since this is no longer set as a side-effect of finalization, I’m not sure hashSerialized still belongs in CCoinsStats? IIUC the caller is now essentially responsible for its sanity, no?

    Edit: Or maybe Finalize() should take a CCoinsStats reference?


    dongcarl commented at 8:32 pm on May 10, 2022:

    I’m not entirely sure what you mean here. Do you mean we should return a std::pair<std::optional<CCoinsStats>, uint256> instead?

    Related: #24410 (review)

    maybe Finalize() should take a CCoinsStats reference?

    I’m not a big fan of that, I’m happier having UTXOHashers not knowing about CCoinsStats and only knowing about things that are appropriate to their domain.


    dongcarl commented at 3:17 am on May 20, 2022:
    No longer relevant as of 90358d4b7402bfd4cd9b79ebf828e40f4eabd849
  54. in src/node/coinstats.cpp:223 in 8c8d3f186d outdated
    222+
    223+std::optional<CCoinsStats> GetUTXOStatsWithIndex(CoinStatsIndex& coin_stats_index, const CBlockIndex* pindex)
    224+{
    225+    CCoinsStats stats = MakeCoinStatsPrefilledWithBlockIndexInfo(pindex);
    226+
    227+    stats.index_used = true;
    


    theuni commented at 8:11 pm on May 9, 2022:
    Same comment as with hashSerialized. index_used does not seem to be a value internal to CCoinsStats anymore.

    dongcarl commented at 8:29 pm on May 10, 2022:

    Although I’ve moved some in-param members out of CCoinStats, I think it’s outside the scope of this PR to move out-param members out of CCoinStats but would be a worthwhile followup. This is because a proper cleanup of CCoinStats’ out-params would likely result in a complete split of the struct itself.

    Observe that:

    • The hashing UTXO stats code path only ever sets the CCoinsStats members:
      • CCoinsStats::CCoinsStats
        • nHeight
        • hashBlock
      • ApplyStats
        • nTransactions
        • nTransactionOutputs
        • total_amount
        • nBogoSize
      • ComputeUTXOStatsWithHasher
        • coins_count
        • hashSerialized
        • nDiskSize
    • The indexing UTXO stats code path only ever sets the CCoinsStats members:
      • CCoinsStats::CCoinsStats
        • nHeight
        • hashBlock
      • CoinstatsIndex::LookupStats
        • index_used
        • hashSerialized
        • nTransactionOutputs
        • nBogoSize
        • total_amount
        • total_subsidy
        • total_unspendable_amount
        • total_prevout_spent_amount
        • total_new_outputs_ex_coinbase_amount
        • total_coinbase_amount
        • total_unspendables_genesis_block
        • total_unspendables_bip30
        • total_unspendables_scripts
        • total_unspendables_unclaimed_rewards
    • The only 6 common fields that are set:
      • nHeight
      • hashBlock
      • nTransactionOutputs
      • total_amount
      • nBogoSize
      • hashSerialized

    So a proper cleanup would make it so that CCoinStats only contains the 6 common fields, and have other structs to contain the fields unique to each codepath. I don’t want to do that in this PR.


    ryanofsky commented at 2:32 pm on May 18, 2022:

    re: #24410 (review)

    So a proper cleanup would make it so that CCoinStats only contains the 6 common fields, and have other structs to contain the fields unique to each codepath. I don’t want to do that in this PR.

    Nice analysis! Agree that cleaning up output parameters seems like an unrelated change.

  55. in src/node/coinstats.h:80 in f05882c453 outdated
    75@@ -76,6 +76,8 @@ std::optional<CCoinsStats> GetUTXOStats(CCoinsView* view, node::BlockManager& bl
    76 uint64_t GetBogoSize(const CScript& script_pub_key);
    77 
    78 CDataStream TxOutSer(const COutPoint& outpoint, const Coin& coin);
    79+
    80+CCoinsStats MakeCoinStatsPrefilledWithBlockIndexInfo(const CBlockIndex* pindex);
    


    theuni commented at 8:16 pm on May 9, 2022:
    Why not make this a ctor which takes a CBlockIndex* and prefills itself?

    fjahr commented at 10:18 pm on May 9, 2022:
    Agree, I was going to leave a nit that this is a bit verbose for my taste ;)

    dongcarl commented at 8:17 pm on May 10, 2022:
    Fixed as of push 8b853034a13f49b85cdf4168d9d49bd297f5b8ce
  56. in src/kernel/coinstats.h:5 in 05979e6e63 outdated
    0@@ -0,0 +1,88 @@
    1+// Copyright (c) 2022 The Bitcoin Core developers
    2+// Distributed under the MIT software license, see the accompanying
    3+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
    4+
    5+#ifndef BITCOIN_KERNEL_COINSTATS_H
    


    theuni commented at 8:24 pm on May 9, 2022:
    Note to other reviewers, the node/coinstats.h is deleted in subsequent commits, so the fact that they share a filename isn’t really interesting.
  57. theuni commented at 8:26 pm on May 9, 2022: member
    Concept ACK. This is a pretty shallow review, I’m planning on following up with a more thorough one once I’ve explored more.
  58. in src/node/coinstats.cpp:175 in c853e71d4d outdated
    170+
    171+std::unique_ptr<UTXOHasher> MakeUTXOHasher(const CoinStatsHashType& hash_type)
    172+{
    173+    switch (hash_type) {
    174+    case(CoinStatsHashType::HASH_SERIALIZED): {
    175+        return MakeSHA256DHasher();
    


    fjahr commented at 8:53 pm on May 9, 2022:
    nit: These specific Make functions seem a bit redundant and could be probably be inlined

    dongcarl commented at 3:16 am on May 20, 2022:
    No longer relevant as of 90358d4b7402bfd4cd9b79ebf828e40f4eabd849
  59. in src/node/coinstats.cpp:233 in 8c8d3f186d outdated
    232     return stats;
    233 }
    234+
    235+std::optional<CCoinsStats> GetUTXOStatsWithIndex(CoinStatsIndex& coin_stats_index, CCoinsView* view, BlockManager& blockman)
    236+{
    237+    CBlockIndex* pindex = WITH_LOCK(cs_main, return blockman.LookupBlockIndex(view->GetBestBlock()));
    


    fjahr commented at 9:40 pm on May 9, 2022:
    nit: ::cs_main

    dongcarl commented at 8:17 pm on May 10, 2022:
    Fixed as of push 8b853034a13f49b85cdf4168d9d49bd297f5b8ce
  60. in src/rpc/blockchain.cpp:830 in dda351d7cf outdated
    825+            return g_coin_stats_index->LookUpStats(block_index);
    826+        }
    827+    }
    828+
    829+    auto hasher = kernel::MakeUTXOHasher(hash_type);
    830+    return kernel::GetUTXOStatsWithHasher(*hasher, view, blockman, interruption_point);
    


    fjahr commented at 10:11 pm on May 9, 2022:
    Maybe we could make an even clearer naming distinction here. While we can really only “get” the stats from the index the kernel computes them. So the kernel function could be renamed to ComputeUTXOStatsWithHasher. But it’s fine to leave it as is. I just found it an interesting thought and interested to hear what others think.

    dongcarl commented at 8:17 pm on May 10, 2022:
    Fixed as of push 8b853034a13f49b85cdf4168d9d49bd297f5b8ce
  61. fjahr commented at 10:32 pm on May 9, 2022: contributor

    Code review ACK 8f50fe837eaa41b045c69a97617b2346d57bd879

    Very nice work. I didn’t see any real issues, just some minor stylistic suggestions. Overall I think this is a great improvement.

    FWIW, I found 8c8d3f186df95b419fba8fddaeaac9b8912646dd more confusing than helpful with the functions that were introduced and removed in the following commit. But overall great work with splitting these commits to bite-sized chunks.

    I will also revisit some of the tests of this code now that I looked at it for a while, but that is tangential.

  62. dongcarl force-pushed on May 10, 2022
  63. josibake commented at 4:03 pm on May 11, 2022: contributor

    Concept ACK

    +1 on keeping the commits small and focused. I’ve done a quick read-through but plan to do a more thorough review later

  64. dongcarl commented at 6:05 pm on May 11, 2022: contributor

    Reviewers: I’m considering squashing 3 commits:

    • 3fc97a7e59a96fa0e1901f625bd52b3d8c598805 “coinstats: Introduce UTXOHasher and implementations”
    • 51c3b93490a05ae37559d5eba981bf8c6107a674 “coinstats: Use newly-introduced UTXOHashers”
    • 24b613e8135e13f2f81e3849f2a97b2ccd7b317f “coinstats: Remove old UTXO hashing implementations”

    However, I do not wish to make my commits less reviewable.

    Let me know your opinion with a 👍 or 👎 reaction to this comment.


    Edit May 16th, 2022: The squash has been done as of 0cc58453e2479a8877e2c0b463d3a7c215ceb77b

  65. michaelfolkson commented at 7:39 pm on May 11, 2022: contributor

    Spent some time trying to understand the direction of the project as a whole in the run up to today’s PR review club and to the extent it builds on multiprocess, AssumeUTXO work. Approach ACK on the project roadmap thus far.

    On this PR specifically Approach ACK, code review (limited familiarity with the coinstats code before this), no testing.

    As others have said great organization of commits and commit messages (and innovative use of thumbs up/down to get reviewer reaction :) )

  66. in src/rpc/blockchain.cpp:961 in 8b853034a1 outdated
    952+                const std::optional<CCoinsStats> maybe_prev_stats = GetUTXOStats(coins_view, *blockman, hash_type, node.rpc_interruption_point, pindex->pprev, index_requested);
    953+                if (maybe_prev_stats.has_value()) {
    954+                    prev_stats = maybe_prev_stats.value();
    955+                } else {
    956+                    throw JSONRPCError(RPC_INTERNAL_ERROR, "Unable to read UTXO set");
    957+                }
    


    theStack commented at 5:40 pm on May 15, 2022:

    nit: in order to be more consistent, could also check the error condition first (like done below for maybe_stats), or the other way round

    0                if (!maybe_prev_stats) {
    1                    throw JSONRPCError(RPC_INTERNAL_ERROR, "Unable to read UTXO set");
    2                } else {
    3                    prev_stats = maybe_prev_stats.value();
    4                }
    

    dongcarl commented at 3:59 pm on May 16, 2022:
    Fixed as of #24410 (comment)
  67. theStack commented at 6:01 pm on May 15, 2022: contributor

    Concept ACK

    While reviewing #24952 recently, I found that the GetUTXOStats interface is currently quite confusing, glad to see that many painpoints (mixing both in- and out-params in one struct, mixed hashing and indexing codepaths in one function…) are fixed with this PR. Lightly reviewed each commit, the changes are concise, well-documented in the commit descriptions and make sense to me; will look over more deeply a second time within the next days. As stated by other reviewers before, kudos for keeping the commits small and to the point 👌

  68. dongcarl force-pushed on May 16, 2022
  69. dongcarl commented at 3:54 pm on May 16, 2022: contributor

    Pushed 8b853034a13f49b85cdf4168d9d49bd297f5b8ce -> 0cc58453e2479a8877e2c0b463d3a7c215ceb77b

  70. dongcarl commented at 3:55 pm on May 16, 2022: contributor
    @theStack Great to hear that someone else found the interface thorny independently 😄 Looking forward to your review!
  71. in src/node/coinstats.h:47 in 4477312faa outdated
    40@@ -41,8 +41,6 @@ struct CCoinsStats {
    41     //! The number of coins contained.
    42     uint64_t coins_count{0};
    43 
    44-    //! Signals if the coinstatsindex should be used (when available).
    


    ryanofsky commented at 4:22 pm on May 17, 2022:

    In commit “coinstats: Extract index_requested in-member to in-param” (4477312faa6be6582337b0dc667d51deabcca266)

    This comment was useful and is dropped. Would be good to document GetUTXOStats parameter


    dongcarl commented at 6:01 pm on May 19, 2022:
    Fixed in 56d7d36ea7c3be0dc015bffc1d3bf8253a917abd
  72. in src/node/coinstats.cpp:45 in 2ec6b8cfb3 outdated
    58-            ss << VARINT(it->second.nHeight * 2 + it->second.fCoinBase ? 1u : 0u);
    59-        }
    60+public:
    61+    virtual void Prepare(const uint256& hash_block){};
    62+    virtual void Apply(const uint256& hash, const std::map<uint32_t, Coin>& outputs){};
    63+    virtual uint256 Finalize() = 0;
    


    ryanofsky commented at 2:13 am on May 18, 2022:

    In commit “coinstats: Introduce and switch to UTXOHashers” (2ec6b8cfb3bf0ddc19966c86a44990108a236a96)

    Just IMO and this is not important, but I think introducing this class that hardcodes a uint256 hash result in more places and uses virtual methods and requires heap allocation just seems less flexible, less type safe, less efficient than code that was here before.

    Not every hash will be a uint256 hash, and it seems to me a null hasher should not compute any hash at all.

    I also don’t know what the use-case would be for a more complicated kernel API:

    0std::unique_ptr<UTXOHasher> MakeUTXOHasher(CoinStatsHashType hash_type);
    1std::optional<CCoinsStats> ComputeUTXOStatsWithHasher(UTXOHasher& hasher, ...);
    

    over simpler:

    0std::optional<CCoinsStats> ComputeUTXOStats(CoinStatsHashType hash_type, ...);
    

    dongcarl commented at 3:16 am on May 20, 2022:
    Resolved as of 90358d4b7402bfd4cd9b79ebf828e40f4eabd849
  73. in src/rpc/blockchain.cpp:2331 in 3c07759445 outdated
    2323@@ -2320,8 +2324,11 @@ UniValue CreateUTXOSnapshot(
    2324 
    2325         chainstate.ForceFlushStateToDisk();
    2326 
    2327-        if (!GetUTXOStats(&chainstate.CoinsDB(), chainstate.m_blockman, stats, CoinStatsHashType::HASH_SERIALIZED, node.rpc_interruption_point)) {
    2328+        const std::optional<CCoinsStats> maybe_stats = GetUTXOStats(&chainstate.CoinsDB(), chainstate.m_blockman, CoinStatsHashType::HASH_SERIALIZED, node.rpc_interruption_point);
    2329+        if (!maybe_stats) {
    2330             throw JSONRPCError(RPC_INTERNAL_ERROR, "Unable to read UTXO set");
    2331+        } else {
    2332+            stats = maybe_stats.value();
    


    ryanofsky commented at 2:13 pm on May 18, 2022:

    In commit “coinstats: Return purely out-param CCoinsStats” (3c07759445d5ca94c6f7983240ea4b07b064f032)

    It is clunky to replace a single “stats” variable with two “stats” and “maybe_stats” variables, and to inefficiently copy data between them. Would suggest only keeping “maybe_stats” variable and dropping “stats”. (The commit already does this for the two other getstats calls, no reason not to do it for this call).


    dongcarl commented at 6:02 pm on May 19, 2022:
    Fixed in 9ec45ff9eba818ce7908344b614b550c3b276bdd
  74. in src/node/coinstats.cpp:23 in c20fc22508 outdated
    18@@ -19,6 +19,11 @@
    19 #include <map>
    20 
    21 namespace node {
    22+
    23+CCoinsStats::CCoinsStats(const CBlockIndex* pindex)
    


    ryanofsky commented at 2:28 pm on May 18, 2022:

    In commit “coinstats: Separate hasher/index lookup codepaths” (c20fc22508086eb3c754c2b6d125bef4dc385124)

    It would be better for this constructor to just take hash/height arguments instead of being coupled to CBlockIndex. No reason for CCoinsStats it to be tied to an internal node data type.


    dongcarl commented at 6:02 pm on May 19, 2022:
    Fixed in 7c0064d451de00a3e53dfd8d29c3fc9e73d4d11a
  75. ryanofsky approved
  76. ryanofsky commented at 6:46 pm on May 18, 2022: contributor

    Code review ACK 0cc58453e2479a8877e2c0b463d3a7c215ceb77b.

    End result looks good, but it seemed to me this PR had a lot of internal churn which was tedious to review. If function signature is changing, it’s nice it if changes one time instead of one parameter at a time. If code is moving, it’s nice if it moves one instead of moving multiple times. In general, unless there is a specific reason (like separating behavior changes from refactoring changes, or separating mechanical refactoring from conceptual refactoring) it’s nice if each line of code changes once per PR, so the changes in the PR can be comprehended without having to think about intermediate states which turn out to be irrelevant.

    But I’m venting a little. In general things look very good here.

  77. dongcarl force-pushed on May 19, 2022
  78. dongcarl commented at 6:05 pm on May 19, 2022: contributor

    Pushed 0cc58453e2479a8877e2c0b463d3a7c215ceb77b -> abeee4b928eacb661a5829f464f0cda10da171c2

    • Addressed Ryan’s review comments (thanks Ryan!)
  79. dongcarl commented at 8:12 pm on May 19, 2022: contributor

    Reviewers, I’m considering dropping the commit:

    • 9dc8093189e41849e91f8425f02e8f30563dfdf3 “coinstats: Introduce and switch to UTXOHashers”

    Originally I had intended for the UTXOHashers class to make the GetUTXOStats interface more extensible (one could imagine a user providing their own UTXOHashers implementation), but I think it might have been more confusing than not and might not be worth the trouble because YAGNI.

    See discussions:

    On the other hand, I also don’t want to impose too much of a re-review burden on folks, and it really doesn’t seem like it matters all that much in practice which way we go.

    Let me know your opinion with a reaction to this comment:

    • 👍 Drop the commit and remove UTXOHashers
    • 😕 Keep the commit and UTXOHashers

    Ping @mzumsande @theuni @fjahr @theStack @ryanofsky


    Update May 19th, 2022: This has been implemented as of 90358d4b7402bfd4cd9b79ebf828e40f4eabd849

  80. ryanofsky approved
  81. ryanofsky commented at 10:29 pm on May 19, 2022: contributor

    Code review ACK abeee4b928eacb661a5829f464f0cda10da171c2. Only changes since last review were implementing some suggestions from my review (avoiding CBlockIndex* constructor, consolidating stats variables, documenting index_requested param).

    I’d be happy to see hasher thing simplified and ComputeUTXOStats not to need a hasher parameter, but also happy to see this merged as is, so I abstain from the emoji vote above!

  82. in src/node/coinstats.cpp:62 in 9dc8093189 outdated
    75+
    76+std::unique_ptr<UTXOHasher> MakeNullHasher() {
    77+    return std::make_unique<NullHasher>();
    78+}
    79+
    80+class SHA256DHasher final : public UTXOHasher
    


    theStack commented at 11:10 pm on May 19, 2022:

    naming nit: I found the name SHA256DHasher a bit too specific and misleading, at first glance one could think that this is a helper for generally computing a SHA256D hash. IMHO the important criteria for naming is not what concrete hash function a UTXO hasher interally uses (e.g., MuHash also uses SHA256 for finalizing, and ChaCha20 for kind of “hashing” UTXOs to big integers via a stream cipher), but rather how it works on a more abstract conceptual level and in what way it commits to UTXOs etc. Not saying that’s is easy to come up with a better name, but I’d prefer to keep UTXO in it, prefixed with a name resembling the output fields of the gettxoutsetinfo fields, e.g. Legacy(Serialized)UTXOHasher or MuHashUTXOHasher.

    (Probably this comment will be obsolete anyway though if commit 9dc8093189e41849e91f8425f02e8f30563dfdf3 gets dropped)


    dongcarl commented at 3:20 am on May 20, 2022:
    Yeah that’s a good point, in any case I think it’s no longer relevant as of 90358d4b7402bfd4cd9b79ebf828e40f4eabd849
  83. theStack commented at 11:34 pm on May 19, 2022: contributor

    Reviewers, I’m considering dropping the commit:

    https://github.com/bitcoin/bitcoin/commit/9dc8093189e41849e91f8425f02e8f30563dfdf3 “coinstats: Introduce and switch to UTXOHashers”

    Originally I had intended for the UTXOHashers class to make the GetUTXOStats interface more extensible (one could imagine a user providing their own UTXOHashers implementation), but I think it might have been more confusing than not and might not be worth the trouble because YAGNI.

    In my first round of (very light) code-review I found the introduction of the UTXOHasher abstract interface quite a good idea, but then on the other hand ryanofsky brings up really strong counter-arguments (https://github.com/bitcoin/bitcoin/pull/24410#discussion_r875408643) that are hard to refute. Particularly “more heap allocations” and “more complicated kernel API” are valid points that I didn’t consider (the pinning of the hash size to 256 bits seems less of a problem to me though), so that convinced me to vote for dropping the commit.

  84. dongcarl force-pushed on May 20, 2022
  85. dongcarl commented at 3:16 am on May 20, 2022: contributor

    Pushed abeee4b928eacb661a5829f464f0cda10da171c2 -> 90358d4b7402bfd4cd9b79ebf828e40f4eabd849

    • Implemented #24410 (comment)
    • A few minor cleanups (scripted-diff simplification, extraneous comments, adjust commit messages)
  86. in src/node/coinstats.cpp:102 in e4b2969b54 outdated
     98@@ -94,24 +99,11 @@ static void ApplyStats(CCoinsStats& stats, const uint256& hash, const std::map<u
     99 
    100 //! Calculate statistics about the unspent transaction output set
    101 template <typename T>
    102-static bool GetUTXOStats(CCoinsView* view, BlockManager& blockman, CCoinsStats& stats, T hash_obj, const std::function<void()>& interruption_point, const CBlockIndex* pindex, CoinStatsHashType& hash_type, bool index_requested)
    103+static bool ComputeUTXOStats(CCoinsView* view, BlockManager& blockman, CCoinsStats& stats, T hash_obj, const std::function<void()>& interruption_point)
    


    mzumsande commented at 5:23 pm on May 20, 2022:
    e4b2969b54a3ddba507de1aeb17b6765ca1f102b (coinstats: Separate hasher/index lookup codepaths): The inner ComputeUTXOStats doesn’t use blockman anymore, so the argument can be dropped.

    dongcarl commented at 10:21 pm on May 20, 2022:
    Fixed in d03f4443e8deb477d0fd4b27056eaeec8a615ca3, thanks!
  87. in src/node/coinstats.cpp:160 in 2e1ff3915c outdated
    170+        break;
    171     }
    172     } // no default case, so the compiler can warn about missing cases
    173-    assert(false);
    174+
    175+    if (!success) {
    


    mzumsande commented at 6:24 pm on May 20, 2022:

    2e1ff3915cab0e902195514f04378f74e3453610 (coinstats: Return purely out-param CCoinsStats)

    This gives me a -Wmaybe-uninitialized compiler warning for success (although this is obviously impossible).


    dongcarl commented at 10:22 pm on May 20, 2022:
    Hmmm, do you think we need to do anything here? I’m not sure how best to fix this since we want to keep it so that the compiler can warn about missing cases

    mzumsande commented at 11:05 pm on May 20, 2022:
    Maybe just initialise success with true or false to appease the compiler? I guess the catch-22 ist that if we had multiple warnings like this on master, we’d easily miss legitimate warnings such as the one about missing cases, which could then lead to success actually being uninitialized in this spot… But I don’t feel strongly about this.

    MarcoFalke commented at 6:04 am on May 21, 2022:

    It is not UB (I think) to pass any underlying integral value as an enum class, so uninitialized success can in theory happen in a “well-formed” program.

    A few alternative fixes:

    • Initialize success with false
    • Keep the assert(false) and inline all return statements into the case statments: if (!Calc(...)) return std::nullopt; return stats;
    • Keep the assert(false), but wrap it with the whole switch (hash_type) scope into an inlined lambda
    • Something else … ?

    ryanofsky commented at 4:18 pm on May 23, 2022:

    In commit “coinstats: Return purely out-param CCoinsStats” (f735c92edc08047bdbe0f47ea0e7c319bcc62eeb)

    Agree with Marco, I think. I don’t understand why you wouldn’t initialize success to false. If function is called with (CoinStatsHashType)123 compiler is correct to warn that it has undefined behavior. It seems like it would be best to assert false in this case, second best to return nullopt if asserting is inconvenient, and third best to have undefined behavior.


    dongcarl commented at 7:07 pm on May 23, 2022:
    Fixed in 524463daf6a10b20a4e20116a68101a684929eda using a lambda, thanks for the help here folks!
  88. mzumsande commented at 6:45 pm on May 20, 2022: contributor

    Review the code one more time - this looks great! Also did some light testing that gettxoutsetinfo still works as before.

    Two points below that are minor but would be worth fixing imo.

  89. in src/node/coinstats.cpp:221 in e4b2969b54 outdated
    216+        } else {
    217+            return LookupUTXOStatsWithIndex(*g_coin_stats_index, view, blockman);
    218+        }
    219+    }
    220+
    221+    return ComputeUTXOStats(hash_type, view, blockman, interruption_point);
    


    ryanofsky commented at 7:10 pm on May 20, 2022:

    In commit “coinstats: Separate hasher/index lookup codepaths” (e4b2969b54a3ddba507de1aeb17b6765ca1f102b)

    Behavior precedes this PR, but it seems like pindex just gets ignored in this case. It might be nice to assert(pindex == nullptr) here or else have a comment “// pindex is ignored if coinstats index is not available or wasn’t requested. Coinstats for the view are returned instead. This is ok because [reasons]”


    dongcarl commented at 10:21 pm on May 20, 2022:
    Fixed in d03f4443e8deb477d0fd4b27056eaeec8a615ca3, thanks!
  90. in src/node/coinstats.cpp:203 in 528472c420 outdated
    198-    }
    199-
    200-    return stats;
    201-}
    202-
    203-std::optional<CCoinsStats> LookupUTXOStatsWithIndex(CoinStatsIndex& coin_stats_index, CCoinsView* view, BlockManager& blockman)
    


    ryanofsky commented at 7:14 pm on May 20, 2022:

    In commit “Move logic from LookupUTXOStatsWithIndex to CoinStatsIndex::LookUpStats” (528472c420dccbeb673dd77b64d327ab5271f23b)

    This might be an example of unnecessary churn. Not a big deal, but why go through trouble of adding these LookupUTXOStatsWithIndex functions in e4b2969b54a3ddba507de1aeb17b6765ca1f102b when they will be deleted in 528472c420dccbeb673dd77b64d327ab5271f23b

  91. ryanofsky approved
  92. ryanofsky commented at 7:20 pm on May 20, 2022: contributor
    Code review ACK 90358d4b7402bfd4cd9b79ebf828e40f4eabd849. I rereviewed from scratch but main change since last review seems to be not adding hasher objects. All looks good, feel free to ignore minor comments.
  93. DrahtBot added the label Needs rebase on May 20, 2022
  94. in src/rpc/blockchain.cpp:816 in 90358d4b74 outdated
    807@@ -808,6 +808,30 @@ CoinStatsHashType ParseHashType(const std::string& hash_type_input)
    808     }
    809 }
    810 
    811+/**
    812+ * Calculate statistics about the unspent transaction output set
    813+ *
    814+ * @param[in] index_requested Signals if the coinstatsindex should be used (when available).
    815+ */
    816+std::optional<kernel::CCoinsStats> GetUTXOStats(CCoinsView* view, node::BlockManager& blockman,
    


    theStack commented at 8:18 pm on May 20, 2022:

    nit: could be made static, since this function is not used outside this module

    0static std::optional<kernel::CCoinsStats> GetUTXOStats(CCoinsView* view, node::BlockManager& blockman,
    

    dongcarl commented at 10:20 pm on May 20, 2022:
    Fixed in e6e2b165fd377806e257fbb5378e22067a9a3957, thanks!
  95. theStack approved
  96. theStack commented at 8:20 pm on May 20, 2022: contributor

    Code-review ACK 90358d4b7402bfd4cd9b79ebf828e40f4eabd849 🏞️

    (I’m aware that this needs rebase, but re-ACKing should be fairly trivial)

  97. kernel: Remove unnecessary blockfilter{index,}.cpp
    It is no longer necessary to link in blockfilter.cpp and
    index/blockfilterindex.cpp after merge of PR#21726 since validation has
    been decouple from the blockfilterindex.
    52b1939993
  98. fuzz: Remove useless GetUTXOStats fuzz case
    In the GetUTXOStats fuzz case, GetUTXOStats is always called with a
    CCoinsViewCache. Which is guaranteed to throw a std::logic_error when
    its ::Cursor() method is called on the first line of GetUTXOStats.
    
    In the fuzz case, we basically catch this logic error and declare
    victory if we caught it.
    
    There is no point to fuzzing this deterministic logic.
    
    Confirmed with IWYU that the node/coinstats.h #include is no longer
    necessary.
    0848db9c35
  99. includes: Remove rpc/util.h -> node/coinstats.h
    Confirmed with IWYU that this is unnecessary.
    102294898d
  100. coinstats: Extract hash_type in-member to in-param
    Currently, CCoinsStats is a struct with both in-params and out-params
    where the hash_type and index_requested members are the only in-params.
    
    This change removes CCoinsStats' hash_type in-param member and adds it
    to the relevant functions instead.
    
    [META] In subsequent commits, all of CCoinsStats' members which serve as
           in-params will be moved out so as to make CCoinsStats a pure
           out-param struct.
    a789f3f2b8
  101. coinstats: Extract index_requested in-member to in-param
    This change removes CCoinsStats' index_requested in-param member and
    adds it to the relevant functions instead.
    46eb9fc56a
  102. dongcarl force-pushed on May 20, 2022
  103. dongcarl commented at 10:20 pm on May 20, 2022: contributor

    Pushed 90358d4b7402bfd4cd9b79ebf828e40f4eabd849 -> e6e2b165fd377806e257fbb5378e22067a9a3957

  104. dongcarl force-pushed on May 20, 2022
  105. DrahtBot removed the label Needs rebase on May 21, 2022
  106. dongcarl commented at 3:59 pm on May 21, 2022: contributor

    Pushed e6e2b165fd377806e257fbb5378e22067a9a3957 -> e42583f928d5461827a946db3baa7b3da9487253

    • Removed over-eager assertion added in the previous e6e2b165fd377806e257fbb5378e22067a9a3957 push: GetUTXOStats is sometimes called in rpc/blockchain.cpp with a non-null pindex even when we aren’t going to use the index. In all such cases the pindex represents view’s best block.
  107. theStack approved
  108. theStack commented at 11:56 pm on May 22, 2022: contributor
    re-ACK e42583f928d5461827a946db3baa7b3da9487253
  109. in src/node/coinstats.cpp:226 in 99b7dced92 outdated
    221+    // pindex is ignored if the coinstats index is not available or wasn't
    222+    // requested. Coinstats for the view are returned instead.
    223+    //
    224+    // This is ok because we never call GetUTXOStats in a way where we can't use
    225+    // the coinstatsindex and the specified pindex is not also the view's best
    226+    // block.
    


    ryanofsky commented at 4:29 pm on May 23, 2022:

    In commit “coinstats: Separate hasher/index lookup codepaths” (99b7dced926eeba1ac0b181feb18ebb7ce35504e)

    Seems like it would be easy for code calling this function to be unaware of this assumption. Could check it with assert(!pindex || pindex->GetBlockHash() == view->GetBestBlock());


    dongcarl commented at 7:08 pm on May 23, 2022:
    Right! Fixed in 1352e410a5b84070279ff28399083cb3ab278593
  110. ryanofsky approved
  111. ryanofsky commented at 4:33 pm on May 23, 2022: contributor
    Code review ACK e42583f928d5461827a946db3baa7b3da9487253. Looks good, I left new suggestions but they are not important. Changes since last review were all minor: adding comment, adding static, dropping unused parameter
  112. coinstats: Return purely out-param CCoinsStats
    In previous commits in this patchset, we removed all in-param members of
    CCoinsStats. Now that that's done, we can modify GetUTXOStats to return
    an optional CCoinsStats instead of a status bool. Callers are modified
    accordingly.
    
    In rpc/blockchain.cpp, we discover that GetUTXOStats' status bool when
    getting UTXO stats for pprev was not checked for error. We fix this as
    well.
    524463daf6
  113. coinstats: Separate hasher/index lookup codepaths
    Split out ComputeUTXOStats and LookupUTXOStatsWithIndex from
    GetUTXOStats, since the hashing and indexing codepaths are quite
    disparate in practice.
    
    Also allow add a constructor to CCoinsStats for it to be constructed
    from a a block height and hash. This is used in both codepaths.
    
    Also add a note in GetUTXOStats documenting a behaviour quirk that
    predates this patchset.
    
    [META] This allows the hashing codepath to be moved to a separate file
           in a future commit, decoupling callers that only rely on the
           hashing codepath from the indexing one. This is key for
           libbitcoinkernel, which needs to have the hashing codepath for
           AssumeUTXO, but does not wish to be coupled with indexes.
    1352e410a5
  114. Move logic from LookupUTXOStatsWithIndex to CoinStatsIndex::LookUpStats
    The indexing codepath logic in node/coinstats.cpp is simple enough to be
    moved into CoinStatsIndex::LookUpStats, avoiding an additional layer of
    function calls. Callers are modified accordingly.
    
    Also, add 2 missed BOOST_CHECKs to the coinstatsindex_initial_sync unit
    test.
    b7634fe02b
  115. coinstats: Move hasher codepath to kernel/coinstats
    As mentioned in a previous commit, the hashing codepath can now be moved
    to a separate file. This decouples callers that only rely on the hashing
    codepath from the indexing one.
    
    This is key for libbitcoinkernel, which needs to have the CoinsStats
    hashing codepath for AssumeUTXO, but does not wish to be coupled with
    indexes.
    
    Note that only the .cpp file is split in this commit, the header files
    will be split in a subsequent commit and the #includes to
    node/coinstats.h will be adjusted to only #include the necessary
    headers.
    35f73ce4b2
  116. coinstats: Split node/coinstats.h to kernel/coinstats.h
    Most of this commit is pure-move.
    
    After this change:
    
    - kernel/coinstats.h
        -> Contains declarations for:
           - enum class CoinStatsHashType
           - struct CCoinsStats
           - GetBogoSize(...)
           - TxOutSer(...)
           - ComputeUTXOStats(...)
    - node/coinstats.h
        -> Just GetUTXOStats, which will be removed as we change callers to
           directly use the hashing/indexing codepaths in future commits.
    80970985c9
  117. Use only kernel/coinstats.h in index/coinstatsindex.h
    Removes a circular dependency, horray!
    0e54456f04
  118. scripted-diff: Move src/kernel/coinstats to kernel::
    Introduces a new kernel:: namespace and move all of src/kernel/coinstats
    under it.
    
    In the verify script, lines like:
    
    line="$(grep -n 'namespace node {' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
    sed -i -e "${line}s@namespace node {@namespace kernel {@" -- src/kernel/coinstats.h
    
    Are intended to replace only the last instance of "namespace node" with
    "namespace kernel", this is to avoid replacing forward declarations of
    things inside the node:: namespace.
    
    -BEGIN VERIFY SCRIPT-
    sed -E -i 's@namespace node@namespace kernel@g' -- src/kernel/coinstats.cpp
    
    line="$(grep -n 'namespace node {' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
    sed -i -e "${line}s@namespace node {@namespace kernel {@" -- src/kernel/coinstats.h
    
    line="$(grep -n '// namespace node' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
    sed -i -e "${line}s@// namespace node@// namespace kernel@" -- src/kernel/coinstats.h
    
    things='(CCoinsStats|CoinStatsHashType|GetBogoSize|TxOutSer|ComputeUTXOStats)'
    git grep -lE 'node::'"$things" | xargs sed -E -i 's@node::'"$things"'@kernel::\1@g'
    sed -E -i 's@'"$things"'@kernel::\1@g' -- src/node/coinstats.cpp src/node/coinstats.h
    sed -E -i 's@BlockManager@node::\0@g' -- src/kernel/coinstats.cpp
    -END VERIFY SCRIPT-
    f329a9298c
  119. style-only: Rearrange using decls after scripted-diff faa52387e8
  120. kernel: Use ComputeUTXOStats in validation
    This is the "fruit of our labor" for this patchset.
    ChainstateManager::PopulateAndValidateSnapshot can now directly call
    ComputeUTXOStats(...).
    
    Our consensus engine is now fully decoupled from all indices.
    
    See the src/Makefile.am for some satisfying removals.
    f100687566
  121. dongcarl force-pushed on May 23, 2022
  122. dongcarl commented at 7:07 pm on May 23, 2022: contributor

    Push e42583f928d5461827a946db3baa7b3da9487253 -> c7e8803e47eadc68e46e9384a7e2c4543404cd1b

  123. coinstats: Move GetUTXOStats to rpc/blockchain
    rpc/blockchain.cpp is now the only user of the vestigial
    GetUTXOStats(...). And since GetUTXOStats(...)'s special fallback logic
    was only really relevant/meant for rpc/blockchain.cpp, we can just move
    it there.
    664a14ba7c
  124. dongcarl force-pushed on May 23, 2022
  125. dongcarl commented at 7:20 pm on May 23, 2022: contributor

    Pushed c7e8803e47eadc68e46e9384a7e2c4543404cd1b -> 664a14ba7ccb40aa82d35a59831acd35db1897a6

    • Use CHECK_NONFATAL instead of assert in GetUTXOStats after moving to rpc/blockchain.cpp
  126. laanwj commented at 8:12 am on May 24, 2022: member
    Code review ACK 664a14ba7ccb40aa82d35a59831acd35db1897a6
  127. in src/kernel/coinstats.cpp:145 in 664a14ba7c
    152+    CBlockIndex* pindex = WITH_LOCK(::cs_main, return blockman.LookupBlockIndex(view->GetBestBlock()));
    153+    CCoinsStats stats{Assert(pindex)->nHeight, pindex->GetBlockHash()};
    154+
    155+    bool success = [&]() -> bool {
    156+        switch (hash_type) {
    157+        case(CoinStatsHashType::HASH_SERIALIZED): {
    


    laanwj commented at 8:30 am on May 24, 2022:
    nit: I’ve never seen this syntax before, why case(CoinStatsHashType::HASH_SERIALIZED) instead of case CoinStatsHashType::HASH_SERIALIZED?

    laanwj commented at 8:31 am on May 24, 2022:
    Oh, that was already in the original code, never mind. Definitely no need to change it here.
  128. laanwj merged this on May 24, 2022
  129. laanwj closed this on May 24, 2022

  130. theStack commented at 12:50 pm on May 24, 2022: contributor
    post-merge re-ACK 664a14ba7ccb40aa82d35a59831acd35db1897a6
  131. sidhujag referenced this in commit 24aef646bd on May 28, 2022
  132. MarcoFalke referenced this in commit 53b1a2426c on Jul 1, 2022
  133. sidhujag referenced this in commit 077c58e94f on Jul 1, 2022
  134. bitcoin deleted a comment on Oct 4, 2022
  135. bitcoin deleted a comment on Oct 4, 2022
  136. bitcoin deleted a comment on Oct 4, 2022
  137. bitcoin locked this on Oct 4, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me