test: generalize MuHash calculation in `feature_utxo_set

theStack commented at 12:53 AM on November 23, 2022: contributor

Right now the MuHash calculation in the functional test feature_utxo_set_hash.py is dependent on the specific chain we generate, e.g. it only works as long as the only spent coin is created in block number 1 (and no other UTXO is created in that same block) and uses that quirk to avoid the need to ever remove anything from the MuHash object.

This PR generalizes the MuHash calculation by systematically going through all blocks and taking all spent/created coins into account (reflecting the actual behaviour on the coinstatsindex), which allows to potentially create more complex test-cases for UTXO set hashes in the future.

DrahtBot commented at 12:53 AM on November 23, 2022: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

DrahtBot added the label Tests on Nov 23, 2022

theStack force-pushed on Nov 23, 2022

in test/functional/feature_utxo_set_hash.py:47 in 3accc773c9 outdated

  71 | -                    data += struct.pack("<i", height * 2 + coinbase)
  72 | -                    data += tx_out.serialize()
  73 | -
  74 | -                    muhash.insert(data)
  75 | +        for height in range(1, node.getblockcount() + 1):
  76 | +            for tx in node.getblock(blockhash=node.getblockhash(height), verbose=3)['tx']:

stickies-v commented at 11:35 AM on November 24, 2022:

nit: both are accepted but verbosity is the documented kwargs since it became an integer type so I'd go with that?

            for tx in node.getblock(blockhash=node.getblockhash(height), verbosity=3)['tx']:

stickies-v commented at 3:31 PM on November 24, 2022:

tl;dr - could be improved with batch requests but shouldn't be a blocker for this pull, current approach is very reasonable.

This would benefit from a batch request approach, albeit quite modestly. On my machine, test runtime decreased from ~0.35s to ~0.33s. Overall runtime is very short so I don't think we need to go out of our way to optimize here, but would be a best practice. I'm thinking we should add streamlined batch request handling (~send_batch_request as implemented here) on the AuthServiceProxy level so tests can use it with minimal overhead. But I guess I should just make a separate pull for that - there's probably room for improvement in other tests that iterate over single requests.

diff --git a/test/functional/feature_utxo_set_hash.py b/test/functional/feature_utxo_set_hash.py
index 46a1813d4..a328d6805 100755
--- a/test/functional/feature_utxo_set_hash.py
+++ b/test/functional/feature_utxo_set_hash.py
@@ -9,13 +9,16 @@ from test_framework.messages import (
     CTxOut,
 )
 from test_framework.muhash import MuHash3072
-from test_framework.test_framework import BitcoinTestFramework
+from test_framework.test_framework import BitcoinTestFramework, TestNode
 from test_framework.util import assert_equal
 from test_framework.wallet import MiniWallet
 
+from typing import Dict, List, Any
 
-def txout_serialize(txid_hex, n, height, is_coinbase, value, scriptPubKey_hex):
-    data = COutPoint(int(txid_hex, 16), n).serialize()
+
+def txout_serialize(txid_hex: str, output_index: int, height: int, is_coinbase: bool, value: float,
+                    scriptPubKey_hex: str) -> bytes:
+    data = COutPoint(int(txid_hex, 16), output_index).serialize()
     data += (height * 2 + is_coinbase).to_bytes(4, 'little')
     data += CTxOut(int(value * COIN), bytes.fromhex(scriptPubKey_hex)).serialize()
     return data
@@ -43,20 +46,23 @@ class UTXOSetHashTest(BitcoinTestFramework):
         # apply (add/remove) them to the MuHash object accordingly
         muhash = MuHash3072()
 
-        for height in range(1, node.getblockcount() + 1):
-            for tx in node.getblock(blockhash=node.getblockhash(height), verbose=3)['tx']:
+        hashes = send_batch_request(node, "getblockhash", [[i] for i in range(1, node.getblockcount() + 1)])
+        blocks = send_batch_request(node, "getblock", [{"blockhash": h, "verbosity": 3} for h in hashes])
+        for block in blocks:
+            height = block["height"]
+            for tx in block["tx"]:
                 # add created coins
                 is_coinbase_tx = 'coinbase' in tx['vin'][0]
                 for o in tx['vout']:
                     if o['scriptPubKey']['type'] != 'nulldata':
                         muhash.insert(txout_serialize(tx['txid'], o['n'], height, is_coinbase_tx,
-                                                      o['value'], o['scriptPubKey']['hex']))
+                                                        o['value'], o['scriptPubKey']['hex']))
                 # remove spent coins
                 if not is_coinbase_tx:
                     for i in tx['vin']:
                         prev = i['prevout']
                         muhash.remove(txout_serialize(i['txid'], i['vout'], prev['height'], prev['generated'],
-                                                      prev['value'], prev['scriptPubKey']['hex']))
+                                                        prev['value'], prev['scriptPubKey']['hex']))
 
         finalized = muhash.digest()
         node_muhash = node.gettxoutsetinfo("muhash")['muhash']
@@ -71,5 +77,17 @@ class UTXOSetHashTest(BitcoinTestFramework):
         self.test_muhash_implementation()
 
 
+def send_batch_request(node: TestNode, method: str, params: List[Any]) -> List[Any]:
+    """Send batch request and parse all results"""
+    data = [{"method": method, "params": p} for p in params]
+    response = node.batch(data)
+    result = []
+    for item in response:
+        assert item["error"] is None, item["error"]
+        result.append(item["result"])
+
+    return result
+
+
 if __name__ == '__main__':
     UTXOSetHashTest().main()

</details>

<details> <summary>benchmark results</summary>

unpatched (3accc773c)
------------------------
% repeat 5 time ./test/functional/feature_utxo_set_hash.py > /dev/null
./test/functional/feature_utxo_set_hash.py > /dev/null  0.34s user 0.10s system 35% cpu 1.235 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.35s user 0.09s system 33% cpu 1.315 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.35s user 0.09s system 35% cpu 1.231 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.35s user 0.09s system 37% cpu 1.183 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.35s user 0.09s system 35% cpu 1.247 total

patch (diff above)
------------------
% repeat 5 time ./test/functional/feature_utxo_set_hash.py > /dev/null
./test/functional/feature_utxo_set_hash.py > /dev/null  0.32s user 0.08s system 28% cpu 1.406 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.33s user 0.08s system 27% cpu 1.508 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.33s user 0.08s system 34% cpu 1.207 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.32s user 0.09s system 27% cpu 1.490 total
./test/functional/feature_utxo_set_hash.py > /dev/null  0.32s user 0.08s system 33% cpu 1.211 total

</details>

theStack commented at 12:35 AM on November 27, 2022:

Thanks, done.

theStack commented at 12:48 AM on November 27, 2022:

Since the run-time difference is insignificant, I think that would be a nice candidate for a follow-up. Even if this PR doesn't get merged, in general it would be nice to investigate which tests could be sped up by using this batch request approach.

stickies-v commented at 9:30 PM on November 28, 2022:

FYI: I looked at quite a few/most of the long-running tests and couldn't identify any significant enough improvements because of batching so I'm not going to look further into it for now myself - comment can be marked resolved.

in test/functional/feature_utxo_set_hash.py:60 in 3accc773c9 outdated

  80 | +                    if o['scriptPubKey']['type'] != 'nulldata':
  81 | +                        muhash.insert(txout_serialize(tx['txid'], o['n'], height, is_coinbase_tx,
  82 | +                                                      o['value'], o['scriptPubKey']['hex']))
  83 | +                # remove spent coins
  84 | +                if not is_coinbase_tx:
  85 | +                    for i in tx['vin']:

stickies-v commented at 11:44 AM on November 24, 2022:

                    for vin in tx['vin']:

theStack commented at 12:40 AM on November 27, 2022:

This doesn't seem appropriate, since vin (and vout) doesn't denote a single input/output, but rather a collection of them. I assume the v prefix stems from vector, see e.g. https://github.com/bitcoin/bitcoin/blob/9c47eb450346937b3d7ee21b9e669b5c91a7ed6c/src/primitives/transaction.h#L298-L299 I was thinking only using in/out, but in is a reserved word in Python (for in in ... would look weird), input/output is IMHO too long, so I decided to just stick with i/o.

stickies-v commented at 11:07 PM on November 27, 2022:

You're absolutely right about vin/vout being inappropriate for a single input/output, my bad. I don't agree with input/output being too long though, especially here when we have multiple layers of nesting. However, it's also not important enough and probably too much personal preference to bikeshed over so I'll leave it up to you.

in test/functional/feature_utxo_set_hash.py:54 in 3accc773c9 outdated

  74 | -                    muhash.insert(data)
  75 | +        for height in range(1, node.getblockcount() + 1):
  76 | +            for tx in node.getblock(blockhash=node.getblockhash(height), verbose=3)['tx']:
  77 | +                # add created coins
  78 | +                is_coinbase_tx = 'coinbase' in tx['vin'][0]
  79 | +                for o in tx['vout']:

stickies-v commented at 11:44 AM on November 24, 2022:

                for vout in tx['vout']:

theStack commented at 12:41 AM on November 27, 2022:

See comment above (https://github.com/bitcoin/bitcoin/pull/26555#discussion_r1031417827).

in test/functional/feature_utxo_set_hash.py:17 in 3accc773c9 outdated

  17 |  from test_framework.test_framework import BitcoinTestFramework
  18 |  from test_framework.util import assert_equal
  19 |  from test_framework.wallet import MiniWallet
  20 |  
  21 | +
  22 | +def txout_serialize(txid_hex, n, height, is_coinbase, value, scriptPubKey_hex):

stickies-v commented at 11:53 AM on November 24, 2022:

Sorry, here I am again with the type hint nits. Also renamed n -> output_index

nit: serialize_txout seems like a more natural sounding name but bike shedding

def txout_serialize(txid_hex: str, output_index: int, height: int, is_coinbase: bool, value: float,
                    scriptPubKey_hex: str) -> bytes:

stickies-v commented at 4:19 PM on November 24, 2022:

Okay, I guess you just used snake case of the TxOutSer function name. On that note, I think it would be good to explicitly reference TxOutSer (name is fine imo) here so it's easier for people where this logic is coming from?

theStack commented at 12:43 AM on November 27, 2022:

Good ideas, applied both the type hints and the s/n/output_index rename. Also add a small comment referring to the original serialization function TxOutSer.

in test/functional/feature_utxo_set_hash.py:42 in 3accc773c9 outdated

  48 | -        blocks.append(from_hex(CBlock(), node.getblock(tx_block['hash'], False)))
  49 | +        self.generateblock(node, output=wallet.get_address(), transactions=[txid])
  50 |  
  51 | -        # Serialize the outputs that should be in the UTXO set and add them to
  52 | -        # a MuHash object
  53 | +        # Serialize created/spent transaction outputs in each block and

stickies-v commented at 4:58 PM on November 24, 2022:

nit to make the link with the next sentence a bit more clear

        # Serialize (created/spent) transaction outputs in each block and

theStack commented at 12:44 AM on November 27, 2022:

Thanks, done.

stickies-v commented at 4:59 PM on November 24, 2022: contributor

Concept ACK

The new code is easier to read imo, and makes it easier to extend in the future. I think the comment and PR description could be improved a bit by highlighting that the generalization introduced here is just the fact that we're now also removing consumed outputs from the muhash. It is mentioned, but wasn't immediately clear to me that this should be the only functional change.

fjahr commented at 7:55 PM on November 24, 2022: contributor

Right now the MuHash calculation in the functional test feature_utxo_set_hash.py is dependent on the specific chain we generate, e.g. it only works as long as the only spent coin is created in block number 1 (and no other UTXO is created in that same block) and uses that quirk to avoid the need to ever remove anything from the MuHash object.

The test still depends on the specifc chain because the resulting muhash is tested against a static value at the end. The whole point of this test is have a test against a static value to make sure the muhash results are consistent across refactors.

This PR generalizes the MuHash calculation by systematically going through all blocks and taking all spent/created coins into account (reflecting the actual behaviour on the coinstatsindex), which allows to potentially create more complex test-cases for UTXO set hashes in the future.

Can you give examples what test cases you want to build?

theStack commented at 4:04 PM on November 25, 2022: contributor

@stickies-v: Thanks for the thorough review, will tackle the comments within the next days. Most suggestions look good at a first glance. The batching idea is very nice (I wasn't even aware that this is possible), but maybe a candidate for a follow-up PR, considering that it doesn't change runtime significantly with this small block chain.

Right now the MuHash calculation in the functional test feature_utxo_set_hash.py is dependent on the specific chain we generate, e.g. it only works as long as the only spent coin is created in block number 1 (and no other UTXO is created in that same block) and uses that quirk to avoid the need to ever remove anything from the MuHash object.

The test still depends on the specifc chain because the resulting muhash is tested against a static value at the end. The whole point of this test is have a test against a static value to make sure the muhash results are consistent across refactors. @fjahr: Yes, I was specifically referring to the MuHash calculation part (i.e. the creation of the muhash/finalized objects in the test), not the test as a whole. If the test is extended in the future leading to a different UTXO set, then the hard-coded hash value has to be adapted, obviously.

This PR generalizes the MuHash calculation by systematically going through all blocks and taking all spent/created coins into account (reflecting the actual behaviour on the coinstatsindex), which allows to potentially create more complex test-cases for UTXO set hashes in the future.

Can you give examples what test cases you want to build?

Some ideas:

test that the MuHash calculation for invalidating blocks / reorgs is correct (CoinStatsIndex::ReverseBlock doesn't seem to have test coverage regarding MuHash)
creating/spending UTXOs with different outpoint index (right now they all have n=0)
sending to OP_RETURN outputs (i.e. verify that nulldata outputs are not taken into account for the UTXO set hash calculation)

To make a point for the latter two points -- currently, the following modifications would still pass the test:

diff --git a/src/index/coinstatsindex.cpp b/src/index/coinstatsindex.cpp
index d3559b1b75..7ff144a8d2 100644
--- a/src/index/coinstatsindex.cpp
+++ b/src/index/coinstatsindex.cpp
@@ -166,7 +166,7 @@ bool CoinStatsIndex::CustomAppend(const interfaces::BlockInfo& block)
                 COutPoint outpoint{tx->GetHash(), j};

                 // Skip unspendable coins
-                if (coin.out.scriptPubKey.IsUnspendable()) {
+                if (tx->IsCoinBase() && coin.out.scriptPubKey.IsUnspendable()) {
                     m_total_unspendable_amount += coin.out.nValue;
                     m_total_unspendables_scripts += coin.out.nValue;
                     continue;
diff --git a/src/kernel/coinstats.cpp b/src/kernel/coinstats.cpp
index 06a4b8c974..7ab58b11c9 100644
--- a/src/kernel/coinstats.cpp
+++ b/src/kernel/coinstats.cpp
@@ -50,7 +50,7 @@ uint64_t GetBogoSize(const CScript& script_pub_key)

 CDataStream TxOutSer(const COutPoint& outpoint, const Coin& coin) {
     CDataStream ss(SER_DISK, PROTOCOL_VERSION);
-    ss << outpoint;
+    ss << COutPoint(outpoint.hash, 0);
     ss << static_cast<uint32_t>(coin.nHeight * 2 + coin.fCoinBase);
     ss << coin.out;
     return ss;

test: generalize MuHash calculation in `feature_utxo_set_hash.py`

Right now the MuHash calculation in the functional test
`feature_utxo_set_hash.py` is dependent on the specific chain we
generate, e.g. it only works as long as the only spent coin is created
in block number 1 (and no other UTXO is created in that same block) and
uses that quirk to avoid the need to ever remove anything from the
MuHash object.

This PR generalizes the MuHash calculation by systematically going
through all blocks and taking all spent/created coins into account
(reflecting the actual behaviour on the coinstatsindex), which allows to
potentially create more complex test-cases for UTXO set hashes in the
future.

6553331efa

theStack force-pushed on Nov 27, 2022

theStack commented at 12:49 AM on November 27, 2022: contributor

Force-pushed with some suggestions taken from @stickies-v.

fjahr commented at 1:00 AM on November 27, 2022: contributor

@fjahr: Yes, I was specifically referring to the MuHash calculation part (i.e. the creation of the muhash/finalized objects in the test), not the test as a whole. If the test is extended in the future leading to a different UTXO set, then the hard-coded hash value has to be adapted, obviously.

We need to have some tests that make sure that the results stay consistent across refactors for deterministic cases. So I would be against removing all checks against static values if that is what you mean by adapting.

Some ideas:

test that the MuHash calculation for invalidating blocks / reorgs is correct (CoinStatsIndex::ReverseBlock doesn't seem to have test coverage regarding MuHash)

creating/spending UTXOs with different outpoint index (right now they all have n=0)

sending to OP_RETURN outputs (i.e. verify that nulldata outputs are not taken into account for the UTXO set hash calculation)

I think some, if not all, of these code path are covered in feature_coinstatsindex.py. There the Muhash result is dynamically checked for internal consistency. I would suggest to make further checks against a python reimplementation in there as well and not reimplement them here. It will save a lot of work and prevents duplication plus the functional test suite will run faster as a whole (fewer nodes running, fewer blocks mined).

theStack commented at 11:48 AM on November 27, 2022: contributor

@fjahr: Yes, I was specifically referring to the MuHash calculation part (i.e. the creation of the muhash/finalized objects in the test), not the test as a whole. If the test is extended in the future leading to a different UTXO set, then the hard-coded hash value has to be adapted, obviously.

We need to have some tests that make sure that the results stay consistent across refactors for deterministic cases. So I would be against removing all checks against static values if that is what you mean by adapting.

No plan of removing the static check, by "adapting" I merely mean changing the hardcoded hash if needed in the future -- it's not that this hash is set in stone forever, there have been three previous adaptions before: 041abfebe49ae5e3e882c00cc5caea1365a27a49, f30041c9143d0added18105c9f0c4ae3f340efbc and fa38b1c8bd29e2c792737f6481ab928e46396b7e. I agree though that there should be a very good reason for changes that cause an adaption of the hash. Note that this PR does not change or remove the static hash check, it's for now just a refactor for a more systematic Python Muhash calculation approach and (IMHO) slightly improved readability.

Some ideas:

test that the MuHash calculation for invalidating blocks / reorgs is correct (CoinStatsIndex::ReverseBlock doesn't seem to have test coverage regarding MuHash)

creating/spending UTXOs with different outpoint index (right now they all have n=0)

sending to OP_RETURN outputs (i.e. verify that nulldata outputs are not taken into account for the UTXO set hash calculation)

I think some, if not all, of these code path are covered in feature_coinstatsindex.py. There the Muhash result is dynamically checked for internal consistency. I would suggest to make further checks against a python reimplementation in there as well and not reimplement them here. It will save a lot of work and prevents duplication plus the functional test suite will run faster as a whole (fewer nodes running, fewer blocks mined).

Good point, will take a closer look there. Not sure about the "fewer nodes running" argument, I don't see why this feature_utxo_set_hash.py test would ever need more than a single node.

fanquake commented at 10:13 AM on February 8, 2023: member

@theStack @fjahr what are the next steps here? Is this blocked on other changes?

fjahr commented at 7:36 PM on February 8, 2023: contributor

@theStack @fjahr what are the next steps here? Is this blocked on other changes?

My comment from above stands, I think this change as is doesn't hold much value if the intention is to implement additional test cases that would be a better fit for feature_coinstatsindex.py. @theStack wanted to take a look into that and that seems to be still pending.

fanquake marked this as a draft on Feb 9, 2023

fanquake commented at 3:32 PM on February 9, 2023: member

Moved to draft for the moment.

DrahtBot commented at 12:50 AM on August 8, 2023: contributor

There hasn't been much activity lately. What is the status here?

Finding reviewers may take time. However, if the patch is no longer relevant, please close this pull request. If the author lost interest or time to work on this, please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

theStack commented at 11:20 PM on August 18, 2023: contributor

After looking closer at feature_coinstatsindex.py, I largely agree that this would be a better fit for additional tests. I think this PR would still have a certain value from a pure refactoring / "easier to read" perspective though, but closing now after months of inactivity (sorry for the late reply).

theStack closed this on Aug 18, 2023

bitcoin locked this on Aug 17, 2024

test: generalize MuHash calculation in `feature_utxo_set_hash.py` #26555

Reviews