Add hash_type MUHASH for gettxoutsetinfo #19145

pull fjahr wants to merge 5 commits into bitcoin:master from fjahr:csi-3-muhash-rpc changing 10 files +195 −59
  1. fjahr commented at 10:08 pm on June 2, 2020: member
    This is another Pr in the series PRs for Coinstatsindex (see overview in #18000). This PR adds the hash_type option muhash to gettxoutsetinfo through which the user can calculate the serialized muhash of the utxo set. This PR does not use the index yet.
  2. fjahr renamed this:
    [WIP] Allow hash_type options for gettxoutsetinfo
    [WIP] Add hash_type options for gettxoutsetinfo
    on Jun 2, 2020
  3. DrahtBot added the label Build system on Jun 2, 2020
  4. DrahtBot added the label RPC/REST/ZMQ on Jun 2, 2020
  5. DrahtBot added the label Tests on Jun 2, 2020
  6. DrahtBot added the label Utils/log/libs on Jun 2, 2020
  7. DrahtBot commented at 1:22 am on June 3, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19806 (validation: UTXO snapshot activation by jamesob)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. fjahr renamed this:
    [WIP] Add hash_type options for gettxoutsetinfo
    Add hash_type options for gettxoutsetinfo
    on Jun 4, 2020
  9. fjahr commented at 1:32 pm on June 4, 2020: member
    Added another more extensive test, now ready for review.
  10. Sjors commented at 12:55 pm on June 9, 2020: member
    Concept ACK. In the followup PR that adds a MuHash index, the RPC documentation should point out that this index can be used to dramatically speed up the RPC call when used with muhash. And that muhash and none will be more responsive than the default (or just change the default, and note that as a breaking change to anyone who uses the legacy hash).
  11. fjahr force-pushed on Jun 11, 2020
  12. fjahr commented at 11:08 pm on June 11, 2020: member
    @MarcoFalke I would like to hear your thoughts on 2109165e3b61d26a0496d5a1efd1c635a7f6a76a. I had it only in the test at first but put it into the framework because I think it could be useful for other tests as well and since it’s just a method it does not affect the use of P2PDataStore in other tests. But the data it returns on each UTXO is kind of specific to my test.
  13. MarcoFalke removed the label Tests on Jun 11, 2020
  14. MarcoFalke removed the label Build system on Jun 11, 2020
  15. MarcoFalke commented at 11:22 pm on June 11, 2020: member
    Seems fine to put it wherever you want, as long as reviewers are happy. I guess the decision mostly depends on whether this will be used by other tests.
  16. fjahr force-pushed on Jun 12, 2020
  17. DrahtBot added the label Needs rebase on Jun 24, 2020
  18. fjahr force-pushed on Jun 29, 2020
  19. fjahr commented at 7:21 pm on June 29, 2020: member
    rebased
  20. DrahtBot removed the label Needs rebase on Jun 29, 2020
  21. fjahr force-pushed on Jun 30, 2020
  22. fjahr commented at 2:26 pm on June 30, 2020: member
    Fixed a merge issue and pulled in latest changes from dependency PRs.
  23. fjahr force-pushed on Jul 5, 2020
  24. fjahr renamed this:
    Add hash_type options for gettxoutsetinfo
    Add hash_type MUHASH for gettxoutsetinfo
    on Jul 5, 2020
  25. fjahr commented at 11:46 am on July 5, 2020: member
    Removed some code dublication
  26. MarcoFalke referenced this in commit b52e25cc1b on Jul 6, 2020
  27. DrahtBot added the label Needs rebase on Jul 6, 2020
  28. fjahr force-pushed on Jul 6, 2020
  29. fjahr commented at 2:00 pm on July 6, 2020: member
    Rebased and made small style fix in rpc/util.cpp.
  30. DrahtBot removed the label Needs rebase on Jul 6, 2020
  31. sidhujag referenced this in commit e98a4e94d1 on Jul 8, 2020
  32. DrahtBot added the label Needs rebase on Jul 30, 2020
  33. fjahr force-pushed on Aug 10, 2020
  34. DrahtBot removed the label Needs rebase on Aug 10, 2020
  35. DrahtBot added the label Needs rebase on Aug 26, 2020
  36. fjahr force-pushed on Aug 27, 2020
  37. DrahtBot removed the label Needs rebase on Aug 27, 2020
  38. fjahr force-pushed on Aug 28, 2020
  39. DrahtBot added the label Needs rebase on Sep 1, 2020
  40. fjahr force-pushed on Sep 2, 2020
  41. fjahr commented at 11:39 am on September 2, 2020: member
    Rebased after #19105 was merged.
  42. DrahtBot removed the label Needs rebase on Sep 2, 2020
  43. DrahtBot added the label Needs rebase on Sep 10, 2020
  44. fjahr force-pushed on Sep 11, 2020
  45. DrahtBot removed the label Needs rebase on Sep 11, 2020
  46. in src/Makefile.am:421 in 9a06f21e6b outdated
    396@@ -397,6 +397,8 @@ crypto_libbitcoin_crypto_base_a_SOURCES = \
    397   crypto/hmac_sha512.h \
    398   crypto/poly1305.h \
    399   crypto/poly1305.cpp \
    400+  crypto/muhash.h \
    401+  crypto/muhash.cpp \
    


    mjdietzx commented at 2:16 pm on November 24, 2020:
    Should be alphabetical order?

    fjahr commented at 0:30 am on January 14, 2021:
    Sorry, this was part of a previous PR that this one is based on, so I don’t want to touch this anymore.
  47. in test/functional/feature_utxo_set_hash.py:59 in 9a06f21e6b outdated
    54+            height += 1
    55+            block_time += 1
    56+
    57+            # Save first block because we want to spend its coinbase.
    58+            if i == 0:
    59+                block1 = block
    


    mjdietzx commented at 5:40 pm on November 24, 2020:

    Isn’t this sort of a roundabout way to setup what seems to be the actual test, at lines 74-87? I’m not seeing why conn = node.add_p2p_connection(P2PDataStore()) is necessary for what’s being tested here, and if the method added in this PR def estimate_utxo_set(self): is needed.

    Would something like this do what you need:

     0coinbase_blocks = list(map(lambda block: node.getblock(block), node.generate(100)))
     1spending = next(filter(lambda block: block["confirmations"] == 100, coinbase_blocks))
     2coinbase_blocks.remove(spending)
     3
     4tx = node.gettransaction(spending["tx"][0])
     5vtx = FromHex(CTransaction(), tx["hex"])
     6vtx.rehash()
     7tx1 = create_tx_with_script(vtx, 0, script_sig=b'\x51', amount=25 * COIN)
     8tx2 = create_tx_with_script(tx1, 0, script_sig=b'\x51', amount=25 * COIN)
     9
    10tx1id = node.sendrawtransaction(hexstring=tx1.serialize().hex(), maxfeerate=0)['hex']
    11tx2id = node.sendrawtransaction(hexstring=tx2.serialize().hex(), maxfeerate=0)['hex']
    12node.generateblock(output=node.getnewaddress(), transactions=[tx1id, tx2id])
    

    and then you form utxo_set here from coinbase_blocks, and tx, tx1, tx2 to calculate the MuHash of the set


    fjahr commented at 0:33 am on January 14, 2021:
    It was and I did it because I thought there was a good chance that this code could be used for other tests as well and it also made some things a bit easier. But overall of course it adds complexity and I am lacking a specific example where it would be used so I have removed the P2P based implementation and simplified it in a similar way as you imagined, I think. Please check it out.
  48. in test/functional/feature_utxo_set_hash.py:54 in 9a06f21e6b outdated
    70+        block2.solve()
    71+
    72+        conn.send_blocks_and_test([block2], node, success=True)
    73+
    74+        # Calculate the MuHash of the UTXO set estimate
    75+        muhash = MuHash3072()
    


    mjdietzx commented at 6:50 pm on November 24, 2020:
    Would this test be a little better if instead of forming the utxo set at the end, you maintained a running sum and added as blocks were mined, and then added/subtracted when UTXOs were spent and created in tx1, tx2?

    fjahr commented at 0:35 am on January 14, 2021:
    I don’t think so because we need to somehow keep track of what is really in the UTXO set and I think this would make test harder to follow if I did it this way. And I feel like this is more of the standard way of how we structure our tests.
  49. DrahtBot added the label Needs rebase on Jan 7, 2021
  50. fjahr force-pushed on Jan 14, 2021
  51. fjahr commented at 0:38 am on January 14, 2021: member
    Rebased and addressed feedback by @mjdietzx . Thanks for reviewing and sorry for the long wait, I was focussing on #19055 which is now merged and which makes this one my high prio. I also added a small refactor commit to improve encapsulation between MuHash3072 and Num3072.
  52. DrahtBot removed the label Needs rebase on Jan 14, 2021
  53. in src/node/coinstats.cpp:28 in 62e0f39510 outdated
    24@@ -24,31 +25,47 @@ static uint64_t GetBogoSize(const CScript& scriptPubKey)
    25            scriptPubKey.size() /* scriptPubKey */;
    26 }
    27 
    28-static void ApplyStats(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
    29+static void ApplyHash(CCoinsStats &stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs, std::map<uint32_t, Coin>::const_iterator it)
    


    FelixWeis commented at 6:48 am on January 14, 2021:
    nit: CCoinsStats& stats

    fjahr commented at 11:27 pm on January 14, 2021:
    Done
  54. laanwj added this to the "Blockers" column in a project

  55. fjahr force-pushed on Jan 14, 2021
  56. fjahr commented at 11:29 pm on January 14, 2021: member

    There is a deterministic regtest blockchain for the test rpc_getblockstats.py which could (additionally) test the finalized hash against constants, especially in the future when one tests for correct rolling behaviour between blocks.

    Thanks, that’s a good point but I think this will be even more interesting when we get to the actual index when the RPC can give hashes for specific heights so I think I will leave this for one of the follow-ups of this one.

  57. fjahr commented at 11:30 pm on January 14, 2021: member
    Addressed Feedback by @FelixWeis and fixed the functional test for the no-wallet case.
  58. fjahr force-pushed on Jan 17, 2021
  59. fjahr force-pushed on Jan 21, 2021
  60. in src/crypto/muhash.cpp:278 in 6995fbd687 outdated
    274@@ -276,18 +275,33 @@ void Num3072::Divide(const Num3072& a)
    275     if (this->IsOverflow()) this->FullReduce();
    276 }
    277 
    278-Num3072 MuHash3072::ToNum3072(Span<const unsigned char> in) {
    279-    Num3072 out{};
    280-    uint256 hashed_in = (CHashWriter(SER_DISK, 0) << in).GetSHA256();
    281-    unsigned char tmp[BYTE_SIZE];
    282-    ChaCha20(hashed_in.data(), hashed_in.size()).Keystream(tmp, BYTE_SIZE);
    283+Num3072::Num3072(const unsigned char data[BYTE_SIZE]) {
    


    sipa commented at 9:49 pm on January 21, 2021:

    In commit " refactor: Improve encapsulation between MuHash3072 and Num3072"

    I recommend against passing C-style arrays, as despite its notation, it’s not passing an array but a pointer. For example, this works with no warning: https://godbolt.org/z/YrMvfE

    An alternative is using a C++ reference-to-array:

    0 Num3072::Num3072(const unsigned char (&data)[BYTE_SIZE]) { 
    

    fjahr commented at 0:06 am on January 22, 2021:
    Done
  61. in src/node/coinstats.cpp:55 in d20438fd83 outdated
    50+
    51+    CDataStream ss(SER_DISK, PROTOCOL_VERSION);
    52+    ss << outpoint;
    53+    ss << (uint32_t)(coin.nHeight * 2 + coin.fCoinBase);
    54+    ss << coin.out;
    55+    muhash.Insert(MakeUCharSpan(ss.str()));
    


    sipa commented at 10:09 pm on January 21, 2021:

    In commit “rpc: Add hash_type MUHASH to gettxoutsetinfo”

    I think extracting an std::string here is unnecessary.

    0muhash.Insert(MakeUCharSpan(ss));
    

    should work just as well, and avoid the copy.


    fjahr commented at 0:06 am on January 22, 2021:
    Fixed
  62. in src/node/coinstats.cpp:27 in 9470df12ec outdated
    23@@ -24,31 +24,35 @@ static uint64_t GetBogoSize(const CScript& scriptPubKey)
    24            scriptPubKey.size() /* scriptPubKey */;
    25 }
    26 
    27-static void ApplyStats(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
    28+static void ApplyHash(CCoinsStats &stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs, std::map<uint32_t, Coin>::const_iterator it)
    


    sipa commented at 10:11 pm on January 21, 2021:

    In commit “refactor: Seperate hash and stats calculation in coinstats”

    Nit: seperate -> separate in commit title


    fjahr commented at 0:06 am on January 22, 2021:
    Fixed
  63. fjahr force-pushed on Jan 22, 2021
  64. in src/crypto/muhash.h:94 in 7a96f77c00 outdated
    93@@ -91,8 +94,6 @@ class Num3072
    94 class MuHash3072
    


    jonatack commented at 5:22 pm on January 23, 2021:

    f8e4e369 nit, this is the only occurrence in the codebase documentation where MuHash is written Muhash

    0  * efficiently combined later.
    1  *
    2- * Muhash does not support checking if an element is already part of the
    3+ * MuHash does not support checking if an element is already part of the
    4  * set. That is why this class does not enforce the use of a set as the
    5  * data it represents because there is no efficient way to do so.
    

    (the other place is in the python code)

    0test/functional/test_framework/muhash.py:95:class TestFrameworkMuhash(unittest.TestCase):
    

    fjahr commented at 5:39 pm on January 24, 2021:
    Fixed
  65. in test/functional/rpc_blockchain.py:270 in 7a96f77c00 outdated
    266@@ -267,6 +267,13 @@ def _test_gettxoutsetinfo(self):
    267         # hash_type none should not return a UTXO set hash.
    268         res5 = node.gettxoutsetinfo(hash_type='none')
    269         assert 'hash_serialized_2' not in res5
    270+        assert 'muhash' not in res5
    


    jonatack commented at 6:03 pm on January 23, 2021:
    bde3feba should assert 'muhash' not in be added to any of res-res4

    fjahr commented at 5:39 pm on January 24, 2021:
    Done.
  66. in src/rpc/util.cpp:131 in 7a96f77c00 outdated
    135+    } else if (hash_type_input == "none") {
    136+        return CoinStatsHashType::NONE;
    137+    } else if (hash_type_input == "muhash") {
    138+        return CoinStatsHashType::MUHASH;
    139+    } else {
    140+        throw JSONRPCError(RPC_INVALID_PARAMETER, strprintf("%d is not a valid hash_type", hash_type_input));
    


    jonatack commented at 6:23 pm on January 23, 2021:

    bde3febaf this could perhaps be a bit simpler

     0diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
     1index 576dabc349..3a07e5620d 100644
     2--- a/src/rpc/blockchain.cpp
     3+++ b/src/rpc/blockchain.cpp
     4@@ -1079,7 +1079,7 @@ static RPCHelpMan gettxoutsetinfo()
     5     CCoinsStats stats;
     6     ::ChainstateActive().ForceFlushStateToDisk();
     7 
     8-    const CoinStatsHashType hash_type = ParseHashType(request.params[0], CoinStatsHashType::HASH_SERIALIZED);
     9+    const CoinStatsHashType hash_type{request.params[0].isNull() ? CoinStatsHashType::HASH_SERIALIZED : ParseHashType(request.params[0].get_str())};
    10 
    11     CCoinsView* coins_view = WITH_LOCK(cs_main, return &ChainstateActive().CoinsDB());
    12     NodeContext& node = EnsureNodeContext(request.context);
    13diff --git a/src/rpc/util.cpp b/src/rpc/util.cpp
    14index 30e09a9aac..d8dadf41f7 100644
    15--- a/src/rpc/util.cpp
    16+++ b/src/rpc/util.cpp
    17@@ -113,14 +113,8 @@ std::vector<unsigned char> ParseHexO(const UniValue& o, std::string strKey)
    18     return ParseHexV(find_value(o, strKey), strKey);
    19 }
    20 
    21-CoinStatsHashType ParseHashType(const UniValue& param, const CoinStatsHashType default_type)
    22+CoinStatsHashType ParseHashType(const std::string& hash_type_input)
    23 {
    24-    if (param.isNull()) {
    25-        return default_type;
    26-    }
    27-
    28-    std::string hash_type_input = param.get_str();
    29-
    30     if (hash_type_input == "hash_serialized_2") {
    31         return CoinStatsHashType::HASH_SERIALIZED;
    32     } else if (hash_type_input == "none") {
    33diff --git a/src/rpc/util.h b/src/rpc/util.h
    34index 942c243718..611d281014 100644
    35--- a/src/rpc/util.h
    36+++ b/src/rpc/util.h
    37@@ -77,7 +77,7 @@ extern uint256 ParseHashO(const UniValue& o, std::string strKey);
    38 extern std::vector<unsigned char> ParseHexV(const UniValue& v, std::string strName);
    39 extern std::vector<unsigned char> ParseHexO(const UniValue& o, std::string strKey);
    40 
    41-CoinStatsHashType ParseHashType(const UniValue& param, const CoinStatsHashType default_type);
    42+CoinStatsHashType ParseHashType(const std::string& hash_type_input);
    43 
    44 extern CAmount AmountFromValue(const UniValue& value);
    45 extern std::string HelpExampleCli(const std::string& methodname, const std::string& args);
    

    jonatack commented at 6:53 am on January 24, 2021:
    Could also drop the header file declaration and move the function into the same file as the caller, unless you expect it to have callers in other files.

    fjahr commented at 5:38 pm on January 24, 2021:
    You’re right, I don’t foresee any other callers at the moment so I moved the function into rpc/blockchain.cpp.
  67. in src/rpc/blockchain.cpp:1067 in 7a96f77c00 outdated
    1062@@ -1063,6 +1063,7 @@ static RPCHelpMan gettxoutsetinfo()
    1063                         {RPCResult::Type::NUM, "transactions", "The number of transactions with unspent outputs"},
    1064                         {RPCResult::Type::NUM, "txouts", "The number of unspent transaction outputs"},
    1065                         {RPCResult::Type::NUM, "bogosize", "A meaningless metric for UTXO set size"},
    1066+                        {RPCResult::Type::STR_HEX, "muhash", "The serialized hash (only present if 'muhash' hash_type is chosen)"},
    1067                         {RPCResult::Type::STR_HEX, "hash_serialized_2", "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
    


    jonatack commented at 6:32 pm on January 23, 2021:

    bde3febaf pass as optional results

    0-                        {RPCResult::Type::STR_HEX, "muhash", "The serialized hash (only present if 'muhash' hash_type is chosen)"},
    1-                        {RPCResult::Type::STR_HEX, "hash_serialized_2", "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
    2+                        {RPCResult::Type::STR_HEX, "muhash", /* optional */ true, "The serialized hash (only present if 'muhash' hash_type is chosen)"},
    3+                        {RPCResult::Type::STR_HEX, "hash_serialized_2", /* optional */ true, "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
    

    fjahr commented at 5:39 pm on January 24, 2021:
    Done.
  68. in src/node/coinstats.cpp:53 in 7a96f77c00 outdated
    62+    COutPoint outpoint = COutPoint(hash, it->first);
    63+    Coin coin = it->second;
    64+
    65+    CDataStream ss(SER_DISK, PROTOCOL_VERSION);
    66+    ss << outpoint;
    67+    ss << (uint32_t)(coin.nHeight * 2 + coin.fCoinBase);
    


    jonatack commented at 6:36 pm on January 23, 2021:

    bde3feba named cast

    0    ss << static_cast<uint32_t>(coin.nHeight * 2 + coin.fCoinBase);
    

    fjahr commented at 5:39 pm on January 24, 2021:
    Done.
  69. in test/functional/feature_utxo_set_hash.py:14 in 7a96f77c00 outdated
     9+
    10+from test_framework.blocktools import create_transaction
    11+from test_framework.messages import (
    12+    COutPoint,
    13+    FromHex,
    14+    CBlock,
    


    jonatack commented at 6:41 pm on January 23, 2021:
    7a96f77c nit, sort

    fjahr commented at 5:39 pm on January 24, 2021:
    Done.
  70. in test/functional/feature_utxo_set_hash.py:20 in 7a96f77c00 outdated
    15+)
    16+from test_framework.muhash import MuHash3072
    17+from test_framework.test_framework import BitcoinTestFramework
    18+from test_framework.util import assert_equal
    19+
    20+import struct
    


    jonatack commented at 6:42 pm on January 23, 2021:
    7a96f77c nit, place standard library imports before application/library specific imports per https://www.python.org/dev/peps/pep-0008/#imports

    fjahr commented at 5:39 pm on January 24, 2021:
    Done.
  71. jonatack commented at 6:48 pm on January 23, 2021: member
    Built and tested at each commit. This seems close. A few suggestions below.
  72. refactor: Improve encapsulation between MuHash3072 and Num3072
    Also fixes a typo.
    a1fcceac69
  73. fjahr force-pushed on Jan 24, 2021
  74. fjahr commented at 5:39 pm on January 24, 2021: member
    Addressed feedback by @jonatack
  75. fjahr force-pushed on Jan 24, 2021
  76. in src/rpc/blockchain.cpp:1069 in 2e5617f7f4 outdated
    1065     return RPCHelpMan{"gettxoutsetinfo",
    1066                 "\nReturns statistics about the unspent transaction output set.\n"
    1067                 "Note this call may take some time.\n",
    1068                 {
    1069-                    {"hash_type", RPCArg::Type::STR, /* default */ "hash_serialized_2", "Which UTXO set hash should be calculated. Options: 'hash_serialized_2' (the legacy algorithm), 'none'."},
    1070+                    {"hash_type", RPCArg::Type::STR, /* default */ "hash_serialized_2", "Which UTXO set hash should be calculated. Options: 'hash_serialized_2' (the legacy algorithm), 'muhash', 'none'."},
    


    jonatack commented at 7:06 pm on January 24, 2021:

    good catch in the latest push

    0- Options: 'hash_serialized_2' (the legacy algorithm), 'none'."},
    1+ Options: 'hash_serialized_2' (the legacy algorithm), 'muhash', 'none'."},
    

    fjahr commented at 10:22 pm on January 24, 2021:
    Yeah, I feel like I fixed this before but I guess I messed up a rebase somewhere ;)
  77. in src/rpc/blockchain.cpp:1059 in 2e5617f7f4 outdated
    1054+    } else if (hash_type_input == "none") {
    1055+        return CoinStatsHashType::NONE;
    1056+    } else if (hash_type_input == "muhash") {
    1057+        return CoinStatsHashType::MUHASH;
    1058+    } else {
    1059+        throw JSONRPCError(RPC_INVALID_PARAMETER, strprintf("%d is not a valid hash_type", hash_type_input));
    


    jonatack commented at 7:11 pm on January 24, 2021:

    726c9504 I overlooked this in my first review, perhaps add a functional test assertion on this output

    0        throw JSONRPCError(RPC_INVALID_PARAMETER, strprintf("%s is not a valid hash_type", hash_type_input));
    

    jonatack commented at 7:29 pm on January 24, 2021:

    Oh, maybe it works anyway

    0./src/bitcoin-cli -signet gettxoutsetinfo MuHash
    1error code: -8
    2error message:
    3MuHash is not a valid hash_type
    

    fjahr commented at 10:22 pm on January 24, 2021:
    Done
  78. in src/node/coinstats.cpp:141 in 2e5617f7f4 outdated
    137@@ -117,9 +138,17 @@ static void PrepareHash(CHashWriter& ss, const CCoinsStats& stats)
    138     ss << stats.hashBlock;
    139 }
    140 static void PrepareHash(std::nullptr_t, CCoinsStats& stats) {}
    141+// Muhash does not need the prepare step
    


    jonatack commented at 7:18 pm on January 24, 2021:
    726c9504 pico-nit, feel free to ignore, s/Muhash/MuHash/

    fjahr commented at 10:22 pm on January 24, 2021:
    Done
  79. jonatack commented at 7:41 pm on January 24, 2021: member

    ACK 2e5617f7f4bcece4834852cc925068e47ceeffa3 code review, reviewed diff last review per git diff 7a96f77 2e5617f, tested rebased on current master

    Happy to re-ACK for the changes above.

     0# bsi is an alias for ./src/bitcoin-cli -signet
     1
     2((HEAD detached at origin/pr/19145))$ bsi help gettxoutsetinfo
     3gettxoutsetinfo ( "hash_type" )
     4
     5Returns statistics about the unspent transaction output set.
     6Note this call may take some time.
     7
     8Arguments:
     91. hash_type    (string, optional, default=hash_serialized_2) Which UTXO set hash should be calculated. Options: 'hash_serialized_2' (the legacy algorithm), 'muhash', 'none'.
    10
    11Result:
    12{                                 (json object)
    13  "height" : n,                   (numeric) The current block height (index)
    14  "bestblock" : "hex",            (string) The hash of the block at the tip of the chain
    15  "transactions" : n,             (numeric) The number of transactions with unspent outputs
    16  "txouts" : n,                   (numeric) The number of unspent transaction outputs
    17  "bogosize" : n,                 (numeric) A meaningless metric for UTXO set size
    18  "muhash" : "hex",               (string, optional) The serialized hash (only present if 'muhash' hash_type is chosen)
    19  "hash_serialized_2" : "hex",    (string, optional) The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)
    20  "disk_size" : n,                (numeric) The estimated size of the chainstate on disk
    21  "total_amount" : n              (numeric) The total amount
    22}
    23
    24Examples:
    25> bitcoin-cli gettxoutsetinfo 
    26> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "gettxoutsetinfo", "params": []}' -H 'content-type: text/plain;' http://127.0.0.1:8332/
    27
    28((HEAD detached at origin/pr/19145))$ bsi gettxoutsetinfo
    29{
    30  "height": 21927,
    31  "bestblock": "0000000d69e0c1800fd7103a7f1aaf24ec7b8c1db92257bda1035dcfd8e71c30",
    32  "transactions": 17430,
    33  "txouts": 17550,
    34  "bogosize": 1263989,
    35  "hash_serialized_2": "c8dbf6ab5094c36935ca89fc8e347b65ab17320f0937c5ac942274a1a0027892",
    36  "disk_size": 1264123,
    37  "total_amount": 1096350.00000000
    38}
    39
    40((HEAD detached at origin/pr/19145))$ bsi gettxoutsetinfo hash_serialized_2
    41{
    42  "height": 21928,
    43  "bestblock": "000000c3c7e614717b8cdf6c33de7a8a7d97a856eba6640b481a0029fc774249",
    44  "transactions": 17431,
    45  "txouts": 17551,
    46  "bogosize": 1264061,
    47  "hash_serialized_2": "3e31af57df9dac6bf8924560b647fc81f01bdac1c0fe4af2bc12b61c81d87d61",
    48  "disk_size": 1264123,
    49  "total_amount": 1096400.00000000
    50}
    51
    52((HEAD detached at origin/pr/19145))$ bsi gettxoutsetinfo muhash
    53{
    54  "height": 21927,
    55  "bestblock": "0000000d69e0c1800fd7103a7f1aaf24ec7b8c1db92257bda1035dcfd8e71c30",
    56  "transactions": 17430,
    57  "txouts": 17550,
    58  "bogosize": 1263989,
    59  "muhash": "cd74b9cfe35dfee561ed730953ed15085e83c096db30d5c03a8ab1a6a137077a",
    60  "disk_size": 1264123,
    61  "total_amount": 1096350.00000000
    62}
    63
    64((HEAD detached at origin/pr/19145))$ bsi gettxoutsetinfo none
    65{
    66  "height": 21927,
    67  "bestblock": "0000000d69e0c1800fd7103a7f1aaf24ec7b8c1db92257bda1035dcfd8e71c30",
    68  "transactions": 17430,
    69  "txouts": 17550,
    70  "bogosize": 1263989,
    71  "disk_size": 1264123,
    72  "total_amount": 1096350.00000000
    73}
    74
    75((HEAD detached at origin/pr/19145))$ bsi gettxoutsetinfo non
    76error code: -8
    77error message:
    78non is not a valid hash_type
    79
    80((HEAD detached at origin/pr/19145))$ bsi gettxoutsetinfo MuHash
    81error code: -8
    82error message:
    83MuHash is not a valid hash_type
    
  80. fjahr force-pushed on Jan 24, 2021
  81. fjahr commented at 10:23 pm on January 24, 2021: member
    Addressed further feedback by @jonatack . Thanks for reviewing!
  82. jonatack commented at 10:53 am on January 25, 2021: member

    re-ACK 3506d9080166bc250e941d791e3e8a2b85d5bd6b per git diff 2e5617f 3506d90

    Thanks for updating.

  83. jonatack commented at 1:04 pm on January 25, 2021: member
    This might need a release note (even if the note will be updated in planned follow-ups to this).
  84. Sjors commented at 1:46 pm on January 28, 2021: member
    Muhash at height 668,047: c263d20036cc983510e9b347cb88e947168dd5665ed00b0e6118d07a7704f810
  85. in src/node/coinstats.cpp:28 in e6856a3809 outdated
    24@@ -24,7 +25,7 @@ static uint64_t GetBogoSize(const CScript& scriptPubKey)
    25            scriptPubKey.size() /* scriptPubKey */;
    26 }
    27 
    28-static void ApplyHash(CCoinsStats &stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs, std::map<uint32_t, Coin>::const_iterator it)
    29+static void ApplyHash(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs, std::map<uint32_t, Coin>::const_iterator it)
    


    Sjors commented at 3:17 pm on January 29, 2021:
    e6856a3809e54dc4de03cb4cd4f9d78495cc0625: nit this undoes the move of & in the previous commit

    fjahr commented at 0:00 am on January 31, 2021:
    Fixed.
  86. in src/node/coinstats.cpp:124 in e6856a3809 outdated
    123@@ -111,6 +124,10 @@ bool GetUTXOStats(CCoinsView* view, CCoinsStats& stats, CoinStatsHashType hash_t
    124     case(CoinStatsHashType::NONE): {
    125         return GetUTXOStats(view, stats, nullptr, interruption_point);
    126     }
    127+    case(CoinStatsHashType::MUHASH): {
    


    Sjors commented at 3:21 pm on January 29, 2021:
    e6856a3809e54dc4de03cb4cd4f9d78495cc0625 nit: handle MUHASH before NONE? (same with PrepareHash and ParseHashType)

    fjahr commented at 0:00 am on January 31, 2021:
    Done.
  87. in src/rpc/blockchain.cpp:1079 in e6856a3809 outdated
    1075@@ -1063,7 +1076,8 @@ static RPCHelpMan gettxoutsetinfo()
    1076                         {RPCResult::Type::NUM, "transactions", "The number of transactions with unspent outputs"},
    1077                         {RPCResult::Type::NUM, "txouts", "The number of unspent transaction outputs"},
    1078                         {RPCResult::Type::NUM, "bogosize", "A meaningless metric for UTXO set size"},
    1079-                        {RPCResult::Type::STR_HEX, "hash_serialized_2", "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
    1080+                        {RPCResult::Type::STR_HEX, "muhash", /* optional */ true, "The serialized hash (only present if 'muhash' hash_type is chosen)"},
    


    Sjors commented at 3:25 pm on January 29, 2021:
    e6856a3809e54dc4de03cb4cd4f9d78495cc0625 nit: move down?

    fjahr commented at 0:00 am on January 31, 2021:
    Done.
  88. in src/node/coinstats.cpp:146 in 3506d90801 outdated
    142+static void PrepareHash(MuHash3072& muhash, CCoinsStats& stats) {}
    143 
    144 static void FinalizeHash(CHashWriter& ss, CCoinsStats& stats)
    145 {
    146-    stats.hashSerialized = ss.GetHash();
    147+    stats.hashSerialized = ss.GetSHA256();
    


    Sjors commented at 3:46 pm on January 29, 2021:
    This (?) changes the result of gettxoutsetinfo hash_serialized_2 (probably something else)

    fjahr commented at 0:00 am on January 31, 2021:
    Huh, no idea how that happened, makes no sense at all, especially inside that commit. Fixed.
  89. Sjors changes_requested
  90. Sjors commented at 3:55 pm on January 29, 2021: member

    Reviewed 3506d9080166bc250e941d791e3e8a2b85d5bd6b, almost happy…

    Muhash at height 668,212: d0192e6604c3dc9a78f47bfc5ae84b8c0f6baf960d7ba9305664bbb9f65ef092.

    hash_serialized_2 is 63fa057a656168ed832d8abbec8008948fc17aecde9f346d69043871a746941f on master, but 15d3629db2fcbed27a1fd9d35c15b5ac6f0a70f5723ebf9cf87e3df0b0feb061 here (efe73b1f89e82d65100c73c06d9a01ec19cc7e029d93caaa8bfded4d1723d484 if I revert GetHash())

    You may want to hardcode a hash_serialized_2 example in the test suite.

  91. refactor: Separate hash and stats calculation in coinstats 2474645f3b
  92. rpc: Add hash_type MUHASH to gettxoutsetinfo
    Also small style fix in rpc/util.cpp
    0d3b2f643d
  93. test: Add test for gettxoutsetinfo RPC with MuHash 6ccc8fc067
  94. test: Add test for deterministic UTXO set hash results e987ae5a55
  95. fjahr force-pushed on Jan 31, 2021
  96. fjahr commented at 0:14 am on January 31, 2021: member

    Reviewed 3506d90, almost happy…

    Muhash at height 668,212: d0192e6604c3dc9a78f47bfc5ae84b8c0f6baf960d7ba9305664bbb9f65ef092.

    hash_serialized_2 is 63fa057a656168ed832d8abbec8008948fc17aecde9f346d69043871a746941f on master, but 15d3629db2fcbed27a1fd9d35c15b5ac6f0a70f5723ebf9cf87e3df0b0feb061 here (efe73b1f89e82d65100c73c06d9a01ec19cc7e029d93caaa8bfded4d1723d484 if I revert GetHash())

    You may want to hardcode a hash_serialized_2 example in the test suite.

    Thanks a lot for testing @Sjors ! I should have addressed all your comments.

    The difference in that hash was caused by using outputs.end() where I should have used std::prev(outputs.end()) in ApplyHash(). I was actually sure we had some deterministic tests in place for that hash already but obviously, we didn’t. So I added a new commit with a small test calculating the hash based on the cached chain to add such coverage.

  97. in src/rpc/blockchain.cpp:1080 in e987ae5a55
    1075@@ -1063,7 +1076,8 @@ static RPCHelpMan gettxoutsetinfo()
    1076                         {RPCResult::Type::NUM, "transactions", "The number of transactions with unspent outputs"},
    1077                         {RPCResult::Type::NUM, "txouts", "The number of unspent transaction outputs"},
    1078                         {RPCResult::Type::NUM, "bogosize", "A meaningless metric for UTXO set size"},
    1079-                        {RPCResult::Type::STR_HEX, "hash_serialized_2", "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
    1080+                        {RPCResult::Type::STR_HEX, "hash_serialized_2", /* optional */ true, "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
    1081+                        {RPCResult::Type::STR_HEX, "muhash", /* optional */ true, "The serialized hash (only present if 'muhash' hash_type is chosen)"},
    


    jonatack commented at 8:21 pm on February 1, 2021:
    nit, in these two lines, could replace the single quotes with double ones for consistency, e.g. s/'/\"/

    fjahr commented at 9:30 pm on February 1, 2021:
    Yes, will do if I have to retouch! Thanks for re-reviewing!
  98. jonatack commented at 8:53 pm on February 1, 2021: member

    Good catch and nice improvements.

    Tested re-ACK e987ae5a554c9952812746c29f2766bacea4b727 per git diff 3506d90 e987ae5, reviewed diff, debug built, ran gettxoutsetinfo -signet and help on this branch vs master, at height 23127 both returned hash_serialized_2 of 2b72d65f3b6efb2311f58374ea2b939abf49684d44f4bafda45faa3b5452a454 and this branch returned muhash of c9f1ff12d345ccf9939c6bbf087e6f7399b6115adee1569287e9c5c43dbb475c

    Only if you have to retouch and if you agree, one nit below (and maybe use single/double quotes consistently in the functional tests).

  99. Sjors commented at 10:19 am on February 3, 2021: member

    tACK e987ae5

    I get the same MuHash as before for block 668,212 and hash_serialized_2 now matches what I found on master.

  100. achow101 commented at 10:18 pm on February 8, 2021: member

    ACK e987ae5a554c9952812746c29f2766bacea4b727

    It would be nice if the test used MiniWallet or otherwise didn’t rely on the wallet, but it’s fine for now.

  101. in src/crypto/muhash.cpp:278 in a1fcceac69 outdated
    274@@ -276,18 +275,33 @@ void Num3072::Divide(const Num3072& a)
    275     if (this->IsOverflow()) this->FullReduce();
    276 }
    277 
    278-Num3072 MuHash3072::ToNum3072(Span<const unsigned char> in) {
    279-    Num3072 out{};
    280-    uint256 hashed_in = (CHashWriter(SER_DISK, 0) << in).GetSHA256();
    281-    unsigned char tmp[BYTE_SIZE];
    282-    ChaCha20(hashed_in.data(), hashed_in.size()).Keystream(tmp, BYTE_SIZE);
    283+Num3072::Num3072(const unsigned char (&data)[BYTE_SIZE]) {
    


    ryanofsky commented at 5:59 pm on February 10, 2021:

    In commit “refactor: Improve encapsulation between MuHash3072 and Num3072” (a1fcceac69097a8e6540a6fd8121a5d53022528f)

    Note in case it helps other reviewers: I found this commit easier to review rearranging the method order to ToNum3072 Num3072 ToBytes instead of Num3072 ToBytes ToNum3072 so diff was smaller with code moving around less.

  102. in src/node/coinstats.cpp:27 in 2474645f3b outdated
    23@@ -24,31 +24,35 @@ static uint64_t GetBogoSize(const CScript& scriptPubKey)
    24            scriptPubKey.size() /* scriptPubKey */;
    25 }
    26 
    27-static void ApplyStats(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
    28+static void ApplyHash(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs, std::map<uint32_t, Coin>::const_iterator it)
    


    ryanofsky commented at 6:32 pm on February 10, 2021:

    In commit “refactor: Separate hash and stats calculation in coinstats” (2474645f3b15687e7f196b89eb935d6e6a98a9da)

    stats arguments in unused in all three overloads and could be dropped.

    Also, passing iterators around like this seems unnecessarily elaborate. It makes sense to separate stats and hash computations, but is there a reason both computations need to share the a single for loop over the output map? I would think you could make all the functions separate like:

    static void ApplyHash(CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs); static void ApplyHash(MuHash3072& muhash, const uint256& hash, const std::map<uint32_t, Coin>& outputs); static void ApplyHash(std::nullptr_t, const uint256& hash, const std::map<uint32_t, Coin>& outputs); static void ApplyStats(CCoinsStats& stats, const uint256& hash, const std::map<uint32_t, Coin>& outputs);

  103. ryanofsky approved
  104. ryanofsky commented at 6:46 pm on February 10, 2021: member
    Code review ACK e987ae5a554c9952812746c29f2766bacea4b727. Looks very good. I left one suggestion to simplify code, but feel free to ignore it here and maybe consider it for later since PR has already had a lot of review.
  105. fjahr commented at 9:43 pm on February 10, 2021: member
    Thanks a lot for the reviews @achow101 and @ryanofsky ! I will add your suggestions if I have to push again, otherwise I will put them into the next PR in this series.
  106. laanwj merged this on Feb 12, 2021
  107. laanwj closed this on Feb 12, 2021

  108. laanwj removed this from the "Blockers" column in a project

  109. sidhujag referenced this in commit 266fcc8b6d on Feb 12, 2021
  110. in test/functional/feature_utxo_set_hash.py:30 in e987ae5a55
    25+        self.skip_if_no_wallet()
    26+
    27+    def test_deterministic_hash_results(self):
    28+        self.log.info("Test deterministic UTXO set hash results")
    29+
    30+        # These depend on the setup_clean_chain option, the chain loaded from the cache
    


    MarcoFalke commented at 4:12 pm on February 12, 2021:

    there is no loaded chain and the utxo set is empty. Which is also useful to check, but doesn’t check if the hash is deterministic?

    Did you forget to load the chain from ./test/functional/data/rpc_getblockstats.json?

  111. MarcoFalke referenced this in commit 9b48b3ac42 on Mar 26, 2021
  112. kittywhiskers referenced this in commit c46d0b951b on Feb 26, 2022
  113. kittywhiskers referenced this in commit 47e6598166 on Feb 26, 2022
  114. Fabcien referenced this in commit 32714970e1 on Mar 14, 2022
  115. Fabcien referenced this in commit e3aff75e2f on Mar 14, 2022
  116. Fabcien referenced this in commit ef48cb3edd on Mar 14, 2022
  117. kittywhiskers referenced this in commit 88381c4bef on Apr 3, 2022
  118. kittywhiskers referenced this in commit f559701b86 on Apr 20, 2022
  119. kittywhiskers referenced this in commit 6504cafa01 on Apr 24, 2022
  120. kittywhiskers referenced this in commit ef18d0858d on Apr 27, 2022
  121. DrahtBot locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me