qa: clarify and document assumeutxo tests #31907

pull darosior wants to merge 2 commits into bitcoin:master from darosior:2502_clarify_document_assumeutxo_test changing 1 files +9 −2
  1. darosior commented at 3:11 pm on February 19, 2025: member

    The feature_assumeutxo.py functional test checks various errors with malleated snapshots. Some of these cases are brittle or use confusing and undocumented values. Fix those by making them less brittle and using clear, documented values.

    I ran across those when working on an unrelated changeset which affected the snapshot. It took me a while to understand where the seemingly magic values were coming from, so i figured it was worth proposing this patch on its own for the sake of making the test more maintainable.

    See commit messages for details.

  2. DrahtBot commented at 3:11 pm on February 19, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31907.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK janb84
    Concept ACK theStack, rkrux

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Feb 19, 2025
  4. TheCharlatan commented at 9:15 pm on February 19, 2025: contributor
    The third paragraph in the first commit messages mentions commits that don’t exist here. I’m guessing that is a reference to the original patchset you developed this with?
  5. darosior commented at 9:33 pm on February 19, 2025: member
    Ugh yes it is indeed. Will fix.
  6. in test/functional/feature_assumeutxo.py:144 in f43b6ba276 outdated
    138@@ -139,7 +139,14 @@ def expected_error(msg):
    139             [b"\x81", 34, "3da966ba9826fb6d2604260e01607b55ba44e1a5de298606b08704bc62570ea8", None],  # wrong coin code VARINT
    140             [b"\x80", 34, "091e893b3ccb4334378709578025356c8bcb0a623f37c7c4e493133c988648e5", None],  # another wrong coin code
    141             [b"\x84\x58", 34, None, "Bad snapshot data after deserializing 0 coins"],  # wrong coin case with height 364 and coinbase 0
    142-            [b"\xCA\xD2\x8F\x5A", 39, None, "Bad snapshot data after deserializing 0 coins - bad tx out value"],  # Amount exceeds MAX_MONEY
    143+            [
    144+                # VARINT(Compressed(MAX_MONEY + 1)) + VARINT(0)
    145+                b"\xa0\xc8\xad\xb1\xd1\x83\xff\x01" + b"\x00",
    


    theStack commented at 0:45 am on February 20, 2025:
    Could even go one step further and eliminate the magic bytes by implementing VARINT serialization and amount compression routines in the test framework. If you want to pull in, feel free: https://github.com/theStack/bitcoin/commits/202502-test-introduce_varint_and_amount_compression_routines/

    darosior commented at 2:25 pm on February 20, 2025:
    I’m on the fence. I agree it’s nicer but also it would 10x the size of the change with harder to review code. Thanks for the commit but i’d rather keep it this way for now, unless/until more reviewers believe it’s worth pulling in a full varint implementation.

    theStack commented at 2:49 pm on February 20, 2025:
    Fair enough, doesn’t have to happen in this PR (if at all), happy to ACK without (once #31907 (comment) is fixed). Still, I think it would be nice long-term if varints would be easy to change/create, without having to fiddle around in the C++ codebase to get out the serialization.

    rkrux commented at 10:45 am on February 21, 2025:
    The current implementation certainly makes it easier to parse but for someone new to this piece of code it still feels like magic value as the amount compression logic and the varint serialisation logic are quite custom. To eliminate the magical-ness completely, I would support pulling in the said routines in the test framework. Some of them are already present in the utxo-to-sqlite.py tool.

    fjahr commented at 8:28 pm on February 21, 2025:
    I’m in favor of adding varint and compression implementation here since it’s a more meaningful improvement and the code is already there. Also not sure why we need to add VARINT(0)?
  7. theStack commented at 0:50 am on February 20, 2025: contributor
    Concept ACK, I agree that the snapshot modification tests are in a bit of a messy state currently.
  8. in test/functional/feature_assumeutxo.py:136 in f43b6ba276 outdated
    132@@ -133,13 +133,20 @@ def expected_error(msg):
    133         cases = [
    134             # (content, offset, wrong_hash, custom_message)
    135             [b"\xff" * 32, 0, "7d52155c9a9fdc4525b637ef6170568e5dad6fabd0b1fdbb9432010b8453095b", None],  # wrong outpoint hash
    136-            [(2).to_bytes(1, "little"), 32, None, "Bad snapshot data after deserializing 1 coins."],  # wrong txid coins count
    137+            [(2).to_bytes(1, "little"), 32, None, "Bad snapshot data after deserializing 1 coins"],  # wrong txid coins count
    


    yancyribbens commented at 1:31 am on February 20, 2025:
    nit 1 coin?

    rkrux commented at 3:05 pm on February 20, 2025:
    This is an expected error from the underlying RPC call. Although using coin here in the assertion won’t fail the functional test but I don’t think this makes any meaningful difference.

    yancyribbens commented at 1:45 am on February 21, 2025:
    Ah, I see thanks. I guess the underlying call would be more clear to return coin(s) although that’s besides the point.
  9. janb84 commented at 12:04 pm on February 20, 2025: none

    tACK f43b6ba

    • Ran unit and functional tests related to assumeutxo (test/functional/feature_assumeutxo.py).
    • Ran all the functional tests
    • Validated the code change.

    I also wrote a review note on the PR

  10. DrahtBot requested review from theStack on Feb 20, 2025
  11. qa: make feature_assumeutxo.py test more robust
    The second test case with alternated UTxO data tests that if the number of coins per txid was
    replaced by a higher value we would continue iterating and fail to deserialize the next coin.
    
    (It's not necessary as if it was properly malleated we would successfully deserialize the excess
    coin(s) but then we'd fail the hash check later on.)
    
    The deserialization failure depends on the following data, which was generated by a valid UTxO set
    dump at the beginning of the test. With the current UTxO set it would always result in deserializing
    a block height higher than the base height and fail the check at line 5953 of validation.cpp. The
    test would be equally valid should the deserialization failure occur not with the coin height but
    with the coin amount (checked right after).
    
    Therefore update the error message to catch either one of those two errors. The test is still
    brittle, just less so, as the snapshot data may one day be so that a coin is correctly deserialized
    and none of these errors is hit.
    0a0c803785
  12. qa: use a clearer and documented amount error in malleated snapshot
    The final case with alternated UTxO data tests the error raised when deserializing a snapshot that
    contains a coin with an amount not in range (<0 or >MAX_MONEY).
    
    The current inserted error uses an undocumented byte string and offset which make it hard to
    maintain. In addition, the undocumented offset is set surprisingly high (39 bytes is well into the
    serialization of the amount which starts at offset 36. Similarly the value is surprisingly small,
    presumably one was adjusted for the other. But we don't know how they were chosen, why not in a
    clearer manner and what they are supposed to represent.
    
    Instead replace this seemingly magic value with a clear one, MAX_MONEY + 1, serialize the whole
    value for the amount field at the correct offset, and document the whole thing for the next person
    around.
    dedc2d9cec
  13. darosior force-pushed on Feb 20, 2025
  14. darosior commented at 5:18 pm on February 20, 2025: member
    Updated the commit message to remove the reference to later modification i had in my branch.
  15. janb84 commented at 10:08 am on February 21, 2025: none
    reACK dedc2d9
  16. in test/functional/feature_assumeutxo.py:1 in 0a0c803785 outdated


    rkrux commented at 10:47 am on February 21, 2025:
    Comment about the commit message: Curious to know the reason behind using UTx0 instead of UTXO.

    fjahr commented at 7:04 pm on February 21, 2025:

    In 0a0c803785782e40f4fe8055383565e3d24b59cc : That is weird but also I seem to be unable to parse this commit message (and the next one). Can this be made more understandable?

    What I can gather is this seems to make the test less tight around the actually expected error (?) and I don’t understand how that would make the test better.

    “as the snapshot data may one day be so that a coin is correctly deserialized and none of these errors is hit”. I don’t understand how we are supposed to write tests for hypothetical future changes? If there are changes and these errors are not hit anymore then the test fails and it will need to be updated. I don’t see what’s the problem with that.

  17. rkrux commented at 10:48 am on February 21, 2025: contributor
    Concept ACK dedc2d9cecccb866230c07112d00288237d12eaa
  18. fanquake commented at 5:50 pm on February 21, 2025: member
    @fjahr you might want to have a look here?

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-02-22 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me