test: refactor: overhaul (w)txid determination for CTransaction objects #32421

pull theStack wants to merge 7 commits into bitcoin:master from theStack:202505-test-remove_txid_caching changing 45 files +333 −499
  1. theStack commented at 5:51 pm on May 5, 2025: contributor

    In the functional test framework, determining a (w)txid for a CTransaction instance is currently rather confusing and footgunny due to inconsistent naming/interfaces (see table below) and statefulness involved. This PR aims to improve that by:

    • removing the (w)txid caching mechanism, in order to avoid the need to call additional rehashing functions (.rehash()/.calculate_sha256(), see first two commits and #32050 (review)). This change in theory decreases the performance, as the involved serialization and hashing involved might be called more often than previously, but I couldn’t find a functional test where this leads to a measurable run-time increase on my machine.
    • introduce consistent naming that shows the type of the returned txid, i.e. hex string vs. test-framework-internal representation [currently integers] (see remaining commits)

    Summary table showing (w)txid determaination before/after this PR:

    Task master PR
    get TXID (hex string) .rehash() / .hash[1] .txid_hex
    get TXID (integer) .sha256[1] .txid_int
    get WTXID (hex string) .getwtxid() .wtxid_hex
    get WTXID (integer) .calc_sha256(True) .wtxid_int

    Unfortunately, most renames can’t be done with a scripted-diff, as the property names (.hash, .sha256) are also used for blocks and other message types. The PR is rather invasive and touches a lot of files, but I think it’s worth to do it, also to make life easier for new contributors. Future tasks like e.g. doing the same overhaul for block (header) objects or getting rid of the integer representation (see #32050) become easier should become easier after this one.

    [1] = returned value might be out-of-date, if rehashing function wasn’t called after modification

  2. DrahtBot commented at 5:51 pm on May 5, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32421.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, marcofleon, achow101
    Concept ACK instagibbs
    Stale ACK rkrux

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32468 (rpc: generateblock to allow multiple outputs by polespinasa)
    • #32379 (p2p: stop special-casing witness-stripped error for unconfirmed transactions by darosior)
    • #32317 (kernel: Separate UTXO set access from validation functions by TheCharlatan)
    • #31981 (Add checkBlock() to Mining interface by Sjors)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on May 5, 2025
  4. instagibbs commented at 6:17 pm on May 5, 2025: member

    determining a (w)txid for a CTransaction instance is currently rather confusing and footgunny

    I don’t disbelieve you exactly as I always have to double-check what I’m calling, but have we seen a series of footguns?

  5. theStack commented at 4:32 pm on May 6, 2025: contributor

    determining a (w)txid for a CTransaction instance is currently rather confusing and footgunny

    I don’t disbelieve you exactly as I always have to double-check what I’m calling, but have we seen a series of footguns?

    Not that I’m aware of any recently, but I vividly remember having wasted many hours hunting bugs in test code locally that were often caused by forgetting to rehash before accessing .hash or .sha256 in my earlier contribution days (yeah, a beginner’s mistake one could claim, but one that could have been easily avoided with a better interface). I assume long-term contributors intuitively already tend to avoid accessing these cached fields and use tx.rehash() and int(tx.rehash(), 16) instead for those reasons, even if the values are needed repeatedly.

  6. instagibbs commented at 12:12 pm on May 7, 2025: member
    concept ACK either way; we lose mental cycles every time we need to figure this out
  7. DrahtBot added the label Needs rebase on May 9, 2025
  8. ZKcash-IrishGov commented at 1:01 am on May 10, 2025: none
    .
  9. rkrux commented at 3:18 pm on May 12, 2025: contributor

    Preliminary ACK efc79f2c30f0029bc7ebc5574e37a59d822da06a

    I feel it’s a good cleanup. It became quite evident to me that the previous function names were not expressive enough to gauge their intention just by their name as I had to look into their implementations while verifying the mappings with the new property attributes introduced in the PR description table.

    As mentioned in the description, it seems that the onus of fetching the updated transaction hashes was on the test writer previously that seems a bit unnecessary to me. I don’t intuit that calculating the transaction hashes in the tests would increase the tests time considerably, to confirm which I ran the tests suite on my machine multiple times as shown below - half the time the latency was lower, higher in the other half.

    PR

     0ALL                                                    | ✓ Passed  | 3104 s (accumulated)
     1Runtime: 857s
     2
     3ALL                                                    | ✓ Passed  | 3132 s (accumulated) 
     4Runtime: 861 s
     5
     6ALL                                                    | ✓ Passed  | 3097 s (accumulated) 
     7Runtime: 853 s
     8
     9ALL                                                    | ✓ Passed  | 3061 s (accumulated) 
    10Runtime: 847 s
    

    Master

    0ALL                                                    | ✓ Passed  | 3101 s (accumulated) 
    1Runtime: 854 s
    

    I like the way the PR has been split into commits to aid review, using @property looks like a good way to add calculation while accessing just like a data member. I will take another look soon and confirm ack.

  10. DrahtBot requested review from instagibbs on May 12, 2025
  11. theStack force-pushed on May 12, 2025
  12. DrahtBot removed the label Needs rebase on May 12, 2025
  13. theStack commented at 9:36 pm on May 12, 2025: contributor
    Rebased on master (needed due to a few conflicts in functional tests after #32155 was merged).
  14. DrahtBot added the label Needs rebase on May 29, 2025
  15. test: remove txid caching in CTransaction class
    Rather than txids (represented by the fields `.sha256` and `.hash`)
    being stateful, simply compute them on-the-fly. This ensures that
    the correct values are always returned and takes the burden of
    rehashing from test writers, making the code shorter overall.
    In a first step, the fields are kept at the same name with @property
    functions as drop-in replacements, for a minimal diff. In later commits,
    the names are changed to be more descriptive and indicating the return
    type of the txid.
    a2724e3ea3
  16. test: remove bare CTransaction `.rehash()`/`.calc_sha256()` calls
    Since the previous commit, CTransaction object calls to the
    methods `.rehash()` and `.calc_sha256()` are effectively no-ops
    if the returned value is not used, so we can just remove them.
    9b3dce24a3
  17. test: introduce and use CTransaction `.wtxid_int` property
    This commits removes the `.calc_sha256` method from the CTransaction
    and introduces a property `.wtxid_int` property as replacement.
    e9cdaefb0a
  18. theStack force-pushed on Jun 9, 2025
  19. theStack commented at 3:44 pm on June 9, 2025: contributor
    Rebased on master.
  20. DrahtBot removed the label Needs rebase on Jun 9, 2025
  21. in test/functional/feature_assumeutxo.py:608 in a1b941d0b4 outdated
    604@@ -605,7 +605,7 @@ def check_tx_counts(final: bool) -> None:
    605         privkey = n0.get_deterministic_priv_key().key
    606         raw_tx = n1.createrawtransaction([prevout], {getnewdestination()[2]: 24.99})
    607         signed_tx = n1.signrawtransactionwithkey(raw_tx, [privkey], [prevout])['hex']
    608-        signed_txid = tx_from_hex(signed_tx).rehash()
    609+        signed_txid = tx_from_hex(signed_tx).hash
    


    maflcko commented at 3:01 pm on June 10, 2025:

    nit in a1b941d0b48b7927a45ff40cc20902f835240898: Not sure it makes sense to change the code only to change it again in the next commit (cd0bbb4bbadf0d19eaa321a326967cf7dbf1c2df)

    What about moving this after cd0bbb4bbadf0d19eaa321a326967cf7dbf1c2df?


    maflcko commented at 3:22 pm on June 10, 2025:
    (aborted my review after this comment, so it is only partial right now, but I’ll continue once there is a reply here)

    theStack commented at 11:40 pm on June 10, 2025:
    Good point, decided to squash the two mentioned commits to get rid of the repeated renaming. I.e. both .rehash() and .hash calls/accesses of CTransaction instances are replaced with .txid_hex now in a single commit (ce83924237127fe6d32b63c362b57789e4628954).
  22. maflcko commented at 3:03 pm on June 10, 2025: member

    this touches ~500 lines, but removes ~200 redundant lines, which makes them hard to read/write/maintain. So concept ack.

    lgtm ACK 04b91f865f3c2dd51df9d217afaaf88e1bcb0396 🐎

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: lgtm ACK 04b91f865f3c2dd51df9d217afaaf88e1bcb0396 🐎
    3cG6uA9f/3d06g/gpTSVpcje2ZClavGAWPisr2U11e1C9I2dSB047A7GObsbGJ1+gJYaZJxZGHwGApby2ZbXYAQ==
    
  23. fanquake commented at 3:20 pm on June 10, 2025: member
    @marcofleon you might be interested in reviewing here?
  24. test: rename CTransaction `.rehash()`/`.hash` -> `.txid_hex` for consistency
    Note that we unfortunately can't use a scripted diff here, as the same
    property and method name is also used for `CBlockHeader`/`CBlock` instances.
    ce83924237
  25. test: rename CTransaction `.sha256` -> `.txid_int` for consistency
    Note that we unfortunately can't use a scripted diff here, as the same
    property name is also used for `CBlockHeader`/`CBlock` instances.
    81af4334e8
  26. scripted-diff: test: rename CTransaction `.getwtxid()` -> `wtxid_hex` for consistency
    -BEGIN VERIFY SCRIPT-
    
    sed -i "s|def getwtxid|@property\n    def wtxid_hex|g" ./test/functional/test_framework/messages.py
    sed -i "s|getwtxid()|wtxid_hex|g" $(git grep -l getwtxid)
    
    -END VERIFY SCRIPT-
    472f3770ae
  27. test: avoid unneeded (w)txid hex -> integer conversions
    Rather than determining a CTransaction's (w)txid as an integer by
    converting it's hex value, it can be directly accessed via the
    introduced `.{w,}txid_int` property.
    4ef6253017
  28. theStack force-pushed on Jun 10, 2025
  29. maflcko commented at 6:42 am on June 11, 2025: member

    Only change since my last review is me actually finishing the review.

    re-ACK 4ef6253017672a74c584e6e9b449ffff79f67273 🏈

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: re-ACK 4ef6253017672a74c584e6e9b449ffff79f67273 🏈
    3crx1No35BWI10Tc13aDbejKSrWnXpvskCgdJtpmENZ1xpLIerge/OmQjEL8An9L6YtySO1BlOSleD2wzblgABQ==
    
  30. DrahtBot requested review from rkrux on Jun 11, 2025
  31. marcofleon commented at 1:18 pm on June 11, 2025: contributor

    code review ACK 4ef6253017672a74c584e6e9b449ffff79f67273

    Lgtm, nice refactor. Removing the cached hash makes the class cleaner and there’s less potential for misuse. The new naming is a clear improvement as well. I ran all the functional tests to confirm, all good.

  32. achow101 commented at 8:45 pm on June 11, 2025: member
    ACK 4ef6253017672a74c584e6e9b449ffff79f67273
  33. achow101 merged this on Jun 11, 2025
  34. achow101 closed this on Jun 11, 2025

  35. theStack deleted the branch on Jun 11, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-06-15 09:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me