test: add BIP158 false-positive element check in rpc_scanblocks.py #26341

pull theStack wants to merge 3 commits into bitcoin:master from theStack:202210-test-check_for_false_positives_in_scanblocks changing 3 files +104 −27
  1. theStack commented at 5:40 PM on October 19, 2022: contributor

    This PR adds a fixed false-positive element check to the functional test rpc_scanblocks.py by using a pre-calculated scriptPubKey that collides with the regtest genesis block's coinbase output. Note that determining a BIP158 false-positive at runtime would also be possible, but take too long (we'd need to create and check ~800k output scripts on average, which took at least 2 minutes on average on my machine).

    The introduced check is related to issue #26322 and more concretely inspired by PR #26325 which introduces an "accurate" mode that filters out these false-positives. The introduced cryptography routines (siphash for generic data) and helpers (BIP158 ranged hash calculation, relevant scriptPubKey per block determination) could potentially also be useful for more tests in the future that involve compact block filters.

  2. theStack force-pushed on Oct 19, 2022
  3. theStack commented at 5:52 PM on October 19, 2022: contributor

    @sipa: Since you authored the SipHash implementation optimized for 256-bit ints in Python (9c8593d2b4e25ef628172ceadbedf0ef078d01ef) and also the other ones in our main codebase, you could might want to take a look at the first commit. I'm pretty confident that it works as intended, but maybe there are some subtleties or corner-cases that I overlooked.

  4. DrahtBot added the label Tests on Oct 19, 2022
  5. in test/functional/test_framework/siphash.py:63 in 5555e72d10 outdated
      58 | +    v0, v1, v2, v3 = siphash_round(v0, v1, v2, v3)
      59 | +    v0, v1, v2, v3 = siphash_round(v0, v1, v2, v3)
      60 | +    return v0 ^ v1 ^ v2 ^ v3
      61 | +
      62 | +
      63 |  def siphash256(k0, k1, h):
    


    sipa commented at 6:14 PM on October 19, 2022:

    I don't think we need a specialized siphash256 anymore if you've written a generic one; there won't be a noticeable performance difference between the two.

    So I think you can change siphash256 to be implemented by just calling siphash.


    theStack commented at 11:37 PM on October 19, 2022:

    I agree! Thanks, done.

  6. jamesob commented at 8:30 PM on October 19, 2022: member

    Concept ACK, good idea.

  7. test: add SipHash implementation for generic data in Python
    We will need this in the next commit to calculate ranged hashes
    of scriptPubKeys as defined in BIP158.
    25ee74dd11
  8. test: add compact block filter (BIP158) helper routines
    By now, we add one helper for calculating ranged hashes and another one
    for finding relevant scriptPubKeys given a block.
    3bca6cd61a
  9. test: check for false-positives in rpc_scanblocks.py fa54d3011e
  10. theStack force-pushed on Oct 19, 2022
  11. achow101 commented at 8:03 PM on October 24, 2022: member

    ISTM a simpler test would be to just have a fixed collision with the fixed addresses that we generate blocks to for each node. Then we would not need the siphash or bip158 specific things and could just do normal scanblocks and see that there are false positives for blocks mined by e.g. node 0 for some scriptPubKey that is clearly not used anywhere else.

    A more meta-comment: should we even be scanning block 0 since it's coinbase UTXO is unspendable?

  12. theStack commented at 1:44 AM on October 25, 2022: contributor

    ISTM a simpler test would be to just have a fixed collision with the fixed addresses that we generate blocks to for each node. Then we would not need the siphash or bip158 specific things and could just do normal scanblocks and see that there are false positives for blocks mined by e.g. node 0 for some scriptPubKey that is clearly not used anywhere else.

    Whether two scriptPubKeys BIP158-collide or not is not only dependent on the scripts, but also on the block hash of the block they'd appear in (that's the k used as key passed to SipHash). The only block hash that we know will basically always be constant is the one from genesis block, hence this choice. Though I assume right now the pre-generated functional test chain generates the exact same blocks and thus block hashes between runs (at least per node), the precalculated false-positives wouldn't work anymore if for whatever reason any minor detail of created blocks (e.g. nVersion, timestamp, included tx) ever changes in the future, which could be very annoying. If I didn't miss something, I think there is no easier way than to use the genesis block. The ranged hash BIP158 calculation routines (involving SipHash) are just there for a verification of the false-positive, which IMHO makes sense.

    A more meta-comment: should we even be scanning block 0 since it's coinbase UTXO is unspendable?

    Good question. I'd say yes, as I see scanblocks as a general tool for "show me all blocks containing txs that involve scriptPubKey xyz" rather than taking details like spendability into account (there are a lots of UTXOs that are unspendable, some even provably unspendable, like e.g. P2TR outputs with invalid x-only-pubkeys).

  13. achow101 commented at 7:34 PM on October 25, 2022: member

    ACK fa54d3011ed0cbb7bcdc76548423ba41f0042832

    Whether two scriptPubKeys BIP158-collide or not is not only dependent on the scripts, but also on the block hash of the block they'd appear in (that's the k used as key passed to SipHash). The only block hash that we know will basically always be constant is the one from genesis block, hence this choice

    Ah, I forgot about that. Although the test could make a fixed block manually as well, and that would sidestep any determinism issues.

  14. achow101 merged this on Oct 26, 2022
  15. achow101 closed this on Oct 26, 2022

  16. theStack deleted the branch on Oct 26, 2022
  17. sidhujag referenced this in commit 7faede1033 on Oct 27, 2022
  18. bitcoin locked this on Oct 26, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me