Mining: Avoid copying template CBlocks #32547

pull luke-jr wants to merge 3 commits into bitcoin:master from luke-jr:mining_avoid_block_copy changing 6 files +54 −43
  1. luke-jr commented at 9:39 am on May 18, 2025: member

    Refactoring to avoid unnecessary copies/complexity, at least in the mainnet paths.

    Also abstracts out a new TemplateToJSON function to keep track of variable scoping better.

  2. interfaces/mining: Avoid copying data unnecessarily and make methods const 8d2c1047af
  3. RPC/Mining: Abstract TemplateToJSON out of getblocktemplate f897b86682
  4. DrahtBot commented at 9:39 am on May 18, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32547.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK Sjors

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32468 (rpc: generatetomany by polespinasa)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  5. DrahtBot added the label Mining on May 18, 2025
  6. Replace node::BlockAssembler::Options with an alias to node::BlockCreateOptions 39a3b5be3d
  7. luke-jr force-pushed on May 18, 2025
  8. DrahtBot added the label CI failed on May 18, 2025
  9. DrahtBot commented at 9:50 am on May 18, 2025: contributor

    🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/runs/42427675996 LLM reason (✨ experimental): The CI failure is caused by linting errors, including spelling mistakes and duplicate includes.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  10. Sjors commented at 7:57 am on May 19, 2025: member

    Concept ACK on adding a TemplateToJSON helper, as well as the new alias for BlockAssembler::Options.

    This breaks the multiprocess build somehow, cc @ryanofsky, e.g.

    0[06:00:12.877] /ci_container_base/src/interfaces/mining.h:54:34: note: unimplemented pure virtual method 'getCoinbaseMerklePath' in 'ProxyClient'
    1[06:00:12.877]    54 |     virtual std::vector<uint256> getCoinbaseMerklePath() const = 0;
    
  11. ryanofsky commented at 4:00 pm on May 19, 2025: contributor

    I think goals of this PR should be achievable without too much work, but unfortunately the implementation as of 39a3b5be3d138c8f3f7cc4d5bd22c1bd42f4d525 isn’t really compatible with IPC so a different approach needs to be taken. There are two problems currently:

    1. First, the pure virtual methods cannot be marked const. In general it doesn’t make sense to mark virtual methods const, because parent classes can’t know how subclasses will implement the virtual methods and what state they may access. The point of pure virtual methods is usually to declare abstract interfaces without leaking information about implementations or making assumptions about them.

    2. Methods called over IPC can’t return raw pointers or references by default, without additional custom code. It’s fine for IPC methods to return ordinary primitive or struct values, or shared_ptr or unique_ptr values, but generated IPC methods like getBlock and getCoinbaseCommitment can’t automatically return raw pointers or references, because after they receive the block data or coinbase commitment data over the socket, then need to return values that can be safely accessed and cleaned up by callers. They can’t just return references directly to the received socket data because that data will no longer be available after the methods return. So if they return references they need to copy the data somewhere where it won’t be freed right away, but will be freed eventually (to not leak memory), and that requires writing custom code and tracking additional state.

    I also don’t know motivation for TemplateToJSON and BlockCreateOptions changes in this PR:

    • I think the point of making the CBlockTemplate::Options struct distinct from the BlockCreateOptions struct, while inheritiing from it, was that BlockCreateOptions was supposed to contain high-level options that were exposed to IPC clients, while CBlockTemplate::Options was supposed to contain lower-level options, not exposed externally over IPC, but could be used for internal code and testing. Current PR seems to collapse that distinction, and it is unclear what the goal of the change is.

    • The TemplateToJSON function seems like a nice way to break up code but seems like it should just be taking input data directly instead of an interface pointer that needs to be actively queried. The could make it simpler internally and also allow it to be easily called from contexts where there is only local data and no IPC.

    All these things together suggest to me that the Mining interface should probably be exposing the CBlockTemplate struct directly and not hiding it behind so many accessor methods. This should allow dropping methods, simplifying the interface, and also making it more efficient by avoiding copies when it is called locally and no IPC is involved.

    The simplest way to do this would be to add new two accessor methods to interfaces::BlockTemplate like:

    0    virtual const CBlockTemplate* getTemplate() { return nullptr; }
    1    virtual CBlockTemplate getTemplateData() = 0;
    

    which could be implemented in node::BlockTemplateImpl as:

    0    const CBlockTemplate* getTemplate() override { return m_block_template.get(); }
    1    CBlockTemplate getTemplateData() override { return *m_block_template; }
    

    Then if the mining interface is being called from inside the local process, getTemplate() will return a valid pointer which can be used directly. But if the mining interface is being called from a remote process getTemplate() will return null, and getTemplateData() can be used to request and copy the data.

    This should work, but would require clients to take call getTemplate() before getTemplateData() to be as efficient as possible and avoid unnecessary copies.

    An alternate approach that would avoid unnecessary copies and be simpler for clients but require a little more complexity in the IPC implementation would be to expose an interfaces::BlockTemplate method like

    0    virtual const CBlockTemplate& getTemplate() = 0;
    

    with a custom implementation that caches the CBlockTemplate internally in the IPC client class so the references it returns will be valid and the data will be freed when the client is destroyed. There are a few examples of this pattern in #10102, but implementing this would requires using some more complicated c++ features like template specialization so would probably recommend trying the simpler approach first.

  12. Sjors commented at 5:27 pm on May 19, 2025: member

    All these things together suggest to me that the Mining interface should probably be exposing the CBlockTemplate struct directly and not hiding it behind so many accessor methods.

    The original reason for this was so that a Stratum v2 IPC sidecar can fetch the minimal info it needs for a NewTemplate message: basically the version field, some parts of the coinbase and its merkle path. It can do this without fetching the full block.

    Later on it turned out that this interface is also very useful for submitting a solution without (again) having to send a full block over the wire. Ditto for an ergonomic way to wait for fees to rise or a new tip to arrive (with multiple clients).

    If we assume that Stratum v2 sidecar is running on the same machine as the node, which seems pretty likely, the original optimisation might be overkill and it might as well get the full template. However it’s also easier for the sidecar implementation if it doesn’t have to hold on to these templates.

    An approach where the sidecar calls getTemplateData(), and we drop getBlockHeader(), getBlock(), getCoinbaseTx() and getCoinbaseCommitment() is fine by me. I’m not worried about making the non-sidecar version https://github.com/Sjors/bitcoin/pull/68 less efficient since that approach is deprecated.

    The only tricky bits are getWitnessCommitmentIndex() and getCoinbaseMerklePath(), which I’d rather not implement on the sidecar side. We could cache these values in the CBlockTemplate struct (as we do with the coinbase SegWit commitment).

    (since #31283 the sidecar doesn’t use getTxFees() and getTxSigops() anymore so those were already on the nomination list for deletion)

    cc @ismaelsadeeq

  13. ryanofsky commented at 6:08 pm on May 19, 2025: contributor

    re: #32547 (comment)

    All these things together suggest to me that the Mining interface should probably be exposing the CBlockTemplate struct directly and not hiding it behind so many accessor methods.

    Sorry, I did not write that very clearly. I am not suggesting dropping the interfaces::BlockTemplate class and only exposing the CBlockTemplate struct. I am just suggesting adding a const CBlockTemplate* accessor to the interface which could return a pointer to the internal struct and avoid copying. The new accessor could replace some of the other accessors, or just augment them.

    This would just be a minor change to avoid unnecessary copies, not a broader rethinking of the interface.

  14. Sjors commented at 6:39 pm on May 19, 2025: member
    I agree it could just be an addition to the interface, though it also fits well if we want to prune it a bit.
  15. luke-jr commented at 2:22 am on May 20, 2025: member

    In general it doesn’t make sense to mark virtual methods const, because parent classes can’t know how subclasses will implement the virtual methods and what state they may access. The point of pure virtual methods is usually to declare abstract interfaces without leaking information about implementations or making assumptions about them.

    The constness of methods is a property of the interface. It is a way to say calling it will not change the object, and thereby allows using const references.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-05-25 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me