test: refactor: overhaul (w)txid determination for CTransaction objects #32421

pull theStack wants to merge 8 commits into bitcoin:master from theStack:202505-test-remove_txid_caching changing 45 files +326 −491
  1. theStack commented at 5:51 pm on May 5, 2025: contributor

    In the functional test framework, determining a (w)txid for a CTransaction instance is currently rather confusing and footgunny due to inconsistent naming/interfaces (see table below) and statefulness involved. This PR aims to improve that by:

    • removing the (w)txid caching mechanism, in order to avoid the need to call additional rehashing functions (.rehash()/.calculate_sha256(), see first two commits and #32050 (review)). This change in theory decreases the performance, as the involved serialization and hashing involved might be called more often than previously, but I couldn’t find a functional test where this leads to a measurable run-time increase on my machine.
    • introduce consistent naming that shows the type of the returned txid, i.e. hex string vs. test-framework-internal representation [currently integers] (see remaining commits)

    Summary table showing (w)txid determaination before/after this PR:

    Task master PR
    get TXID (hex string) .rehash() / .hash[1] .txid_hex
    get TXID (integer) .sha256[1] .txid_int
    get WTXID (hex string) .getwtxid() .wtxid_hex
    get WTXID (integer) .calc_sha256(True) .wtxid_int

    Unfortunately, most renames can’t be done with a scripted-diff, as the property names (.hash, .sha256) are also used for blocks and other message types. The PR is rather invasive and touches a lot of files, but I think it’s worth to do it, also to make life easier for new contributors. Future tasks like e.g. doing the same overhaul for block (header) objects or getting rid of the integer representation (see #32050) become easier should become easier after this one.

    [1] = returned value might be out-of-date, if rehashing function wasn’t called after modification

  2. DrahtBot commented at 5:51 pm on May 5, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32421.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK instagibbs
    Stale ACK rkrux

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32468 (rpc: generatetomany by polespinasa)
    • #32406 (policy: uncap datacarrier by default by instagibbs)
    • #32379 (p2p: stop special-casing witness-stripped error for unconfirmed transactions by darosior)
    • #32304 (test: test MAX_SCRIPT_SIZE for block validity by instagibbs)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
    • #29032 (signet: omit commitment for some trivial challenges by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on May 5, 2025
  4. instagibbs commented at 6:17 pm on May 5, 2025: member

    determining a (w)txid for a CTransaction instance is currently rather confusing and footgunny

    I don’t disbelieve you exactly as I always have to double-check what I’m calling, but have we seen a series of footguns?

  5. theStack commented at 4:32 pm on May 6, 2025: contributor

    determining a (w)txid for a CTransaction instance is currently rather confusing and footgunny

    I don’t disbelieve you exactly as I always have to double-check what I’m calling, but have we seen a series of footguns?

    Not that I’m aware of any recently, but I vividly remember having wasted many hours hunting bugs in test code locally that were often caused by forgetting to rehash before accessing .hash or .sha256 in my earlier contribution days (yeah, a beginner’s mistake one could claim, but one that could have been easily avoided with a better interface). I assume long-term contributors intuitively already tend to avoid accessing these cached fields and use tx.rehash() and int(tx.rehash(), 16) instead for those reasons, even if the values are needed repeatedly.

  6. instagibbs commented at 12:12 pm on May 7, 2025: member
    concept ACK either way; we lose mental cycles every time we need to figure this out
  7. DrahtBot added the label Needs rebase on May 9, 2025
  8. ZKcash-IrishGov commented at 1:01 am on May 10, 2025: none
    .
  9. rkrux commented at 3:18 pm on May 12, 2025: contributor

    Preliminary ACK efc79f2c30f0029bc7ebc5574e37a59d822da06a

    I feel it’s a good cleanup. It became quite evident to me that the previous function names were not expressive enough to gauge their intention just by their name as I had to look into their implementations while verifying the mappings with the new property attributes introduced in the PR description table.

    As mentioned in the description, it seems that the onus of fetching the updated transaction hashes was on the test writer previously that seems a bit unnecessary to me. I don’t intuit that calculating the transaction hashes in the tests would increase the tests time considerably, to confirm which I ran the tests suite on my machine multiple times as shown below - half the time the latency was lower, higher in the other half.

    PR

     0ALL                                                    | ✓ Passed  | 3104 s (accumulated)
     1Runtime: 857s
     2
     3ALL                                                    | ✓ Passed  | 3132 s (accumulated) 
     4Runtime: 861 s
     5
     6ALL                                                    | ✓ Passed  | 3097 s (accumulated) 
     7Runtime: 853 s
     8
     9ALL                                                    | ✓ Passed  | 3061 s (accumulated) 
    10Runtime: 847 s
    

    Master

    0ALL                                                    | ✓ Passed  | 3101 s (accumulated) 
    1Runtime: 854 s
    

    I like the way the PR has been split into commits to aid review, using @property looks like a good way to add calculation while accessing just like a data member. I will take another look soon and confirm ack.

  10. DrahtBot requested review from instagibbs on May 12, 2025
  11. test: remove txid caching in CTransaction class
    Rather than txids (represented by the fields `.sha256` and `.hash`)
    being stateful, simply compute them on-the-fly. This ensures that
    the correct values are always returned and takes the burden of
    rehashing from test writers, making the code shorter overall.
    In a first step, the fields are kept at the same name with @property
    functions as drop-in replacements, for a minimal diff. In later commits,
    the names are changed to be more descriptive and indicating the return
    type of the txid.
    ffb89d9925
  12. test: remove bare CTransaction `.rehash()`/`.calc_sha256()` calls
    Since the previous commit, CTransaction object calls to the
    methods `.rehash()` and `.calc_sha256()` are effectively no-ops
    if the returned value is not used, so we can just remove them.
    6bb39b8ccc
  13. test: introduce and use CTransaction `.wtxid_int` property
    This commits removes the `.calc_sha256` method from the CTransaction
    and introduces a property `.wtxid_int` property as replacement.
    7858952363
  14. test: replace CTransaction `.rehash()` calls with `.hash` property access 2ce8fd8638
  15. test: rename CTransaction `.sha256` -> `.txid_int` for consistency
    Note that we unfortunately can't use a scripted diff here, as the same
    property name is also used for `CBlockHeader`/`CBlock` instances.
    14053c7ad0
  16. test: rename CTransaction `.hash` -> `.txid_hex` for consistency
    Note that we unfortunately can't use a scripted diff here, as the same
    property name is also used for `CBlockHeader`/`CBlock` instances, and
    a few other message classes.
    54a57c215c
  17. scripted-diff: test: rename CTransaction `.getwtxid()` -> `wtxid_hex` for consistency
    -BEGIN VERIFY SCRIPT-
    
    sed -i "s|def getwtxid|@property\n    def wtxid_hex|g" ./test/functional/test_framework/messages.py
    sed -i "s|getwtxid()|wtxid_hex|g" $(git grep -l getwtxid)
    
    -END VERIFY SCRIPT-
    5c1d75a0c1
  18. test: avoid unneeded (w)txid hex -> integer conversions
    Rather than determining a CTransaction's (w)txid as an integer by
    converting it's hex value, it can be directly accessed via the
    introduced `.{w,}txid_int` property.
    7b1b98f1fa
  19. theStack force-pushed on May 12, 2025
  20. DrahtBot removed the label Needs rebase on May 12, 2025
  21. theStack commented at 9:36 pm on May 12, 2025: contributor
    Rebased on master (needed due to a few conflicts in functional tests after #32155 was merged).

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-05-25 21:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me