rpc: Extend scope of validation mutex in generateblock #31563

pull maflcko wants to merge 2 commits into bitcoin:master from maflcko:2412-generatetoaddress-missing-lock changing 2 files +22 βˆ’13
  1. maflcko commented at 2:48 pm on December 24, 2024: member

    The mutex (required by TestBlockValidity) must be held after creating the block, until TestBlockValidity is called. Otherwise, it is possible that the chain advances in the meantime and leads to a crash in TestBlockValidity: Assertion failed: pindexPrev && pindexPrev == chainstate.m_chain.Tip() (validation.cpp: TestBlockValidity: 4338)

    Fixes #31562

  2. DrahtBot commented at 2:48 pm on December 24, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31563.

    Reviews

    See the guideline for information on the review process.

    If your review is incorrectly listed, please react with πŸ‘Ž to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31564 (Add checkblock RPC and checkBlock() to Mining interface by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label RPC/REST/ZMQ on Dec 24, 2024
  4. maflcko renamed this:
    rpc: Extend scope of validation mutex in generatetoaddress
    rpc: Extend scope of validation mutex in generateblock
    on Dec 24, 2024
  5. rpc: Extend scope of validation mutex in generateblock
    The mutex (required by TestBlockValidity) must be held after creating
    the block, until TestBlockValidity is called. Otherwise, it is possible
    that the chain advances in the meantime and leads to a crash in
    TestBlockValidity:
    
     Assertion failed: pindexPrev && pindexPrev == chainstate.m_chain.Tip() (validation.cpp: TestBlockValidity: 4338)
    
    The diff can be reviewed with the git options
    --ignore-all-space --function-context
    fa62c8b1f0
  6. maflcko force-pushed on Dec 24, 2024
  7. ismaelsadeeq approved
  8. ismaelsadeeq commented at 6:14 pm on December 24, 2024: member

    Code review and tested ACK fa62c8b1f04a5386ffa171aeff713d55bd874cbe

    I was able to recreate the failure by calling generateblock concurrently from multiple threads using https://gist.github.com/ismaelsadeeq/a5e3edb3fbf2d7f3c6698c1610622d22

    02024-12-24T17:10:44Z Saw new header hash=34f195658f52e6cf693a2cb3dbec8d421ef7523168ea18ba5d4fb89f4d631596 height=1301  
    12024-12-24T17:10:44Z [validation] NewPoWValidBlock: block hash=34f195658f52e6cf693a2cb3dbec8d421ef7523168ea18ba5d4fb89f4d631596  
    2Assertion failed: (pindexPrev && pindexPrev == chainstate.m_chain.Tip()), function TestBlockValidity, file validation.cpp, line 4658.  
    3fish: Job 1, 'build/src/bitcoind -regtest -de…' terminated by signal SIGABRT (Abort)  
    

    But after fa62c8b1f04a5386ffa171aeff713d55bd874cbe, the generating block concurrently passes without issue.

  9. davidgumberg commented at 4:40 am on December 25, 2024: contributor

    ACK https://github.com/bitcoin/bitcoin/pull/31563/commits/fa62c8b1f04a5386ffa171aeff713d55bd874cbe

    I checked the two other users of TestBlockValidity() and both seem to not have this issue.

    Given that TestBlockValidity() is only used in the RPC interface, and can’t guarantee that it’s caller won’t give it a bad pindexPrev argument (https://github.com/bitcoin/bitcoin/pull/6123 also) , would it make sense to demote the assertion to an Assume() or to return false?


    (https://github.com/davidgumberg/bitcoin/commit/969194ddf534667cd6d5bd2e9a957cd3366270f9) Added the following test to rpc_generate.py:

     0class RPCGenerateTest(BitcoinTestFramework):
     1     def test_generateblock(self):
     2        # [...]
     3        
     4        # Ensure that generateblock can be called concurrently by many threads.
     5        # See [#31562](/bitcoin-bitcoin/31562/) (https://github.com/bitcoin/bitcoin/issues/31562)
     6        self.log.info('Generate blocks in parallel')
     7        generate_50_blocks = lambda n: [ n.generateblock(output=address, transactions=[]) for _ in range(50) ]
     8        threads = []
     9        for _ in range(6):
    10            n = get_rpc_proxy(node.url, 1, timeout=600, coveragedir=node.coverage_dir)
    11            thread = Thread(target=generate_50_blocks, args=(n,))
    12            threads.append(thread)
    13            thread.start()
    14
    15        for t in threads:
    16            t.join()
    

    which fails on master (fc7b21484703da606c5c69b23daee8c39506d90c) and succeeds on this branch.

  10. test: generateblocks called by multiple threads
    Co-Authored-By: David Gumberg <davidzgumberg@gmail.com>
    fa63b8232f
  11. maflcko commented at 9:31 am on December 25, 2024: member

    Given that TestBlockValidity()’s is only used in the RPC interface, and can’t guarantee that it’s caller won’t give it a bad pindexPrev argument (#6123 also) , would it make sense to demote the assertion to an Assume() or to return false?

    No, it is not possible to replace Assert with Assume here. After a “failed” assume after an internal runtime error (coding error), in non-debug builds, the program execution will continue and then fail later-on with another error (possibly giving a completely different error reason, or even undefined behavior). It seems better to explicitly denote the type of error that was encountered and not mix internal errors with external ones.

    So an alternative to Assert in this context could be CHECK_NONFATAL. However, this can be done in an independent pull request and seems orthogonal to the changes here.

  12. maflcko commented at 9:32 am on December 25, 2024: member

    Added the following test to rpc_generate.py:

    Nice, pushed something based on your idea!

  13. ismaelsadeeq commented at 4:30 pm on December 25, 2024: member

    re-ACK fa63b8232f38e78d3c6413fa7d51809f376de75c

    nice test!

  14. DrahtBot requested review from davidgumberg on Dec 25, 2024
  15. davidgumberg commented at 2:25 am on December 26, 2024: contributor

    reACK https://github.com/bitcoin/bitcoin/commit/fa63b8232f38e78d3c6413fa7d51809f376de75c

    Verified that the functional test added here usually fails on master1 (17/20 runs failed for me)

    0$ git clean -dfx && git switch --detach fc7b2148 && git cherry-pick fa63b823
    1$ CC=clang CXX=clang++ cmake -B build && cmake --build build -j $(nproc)
    2$ ./build/test/functional/rpc_generate.py
    3[...]
    42024-12-26T01:19:59.394000Z TestFramework (INFO): Generate blocks in parallel
    52024-12-26T01:19:59.752000Z TestFramework (ERROR): Called Process failed with 'error: timeout on transient error: Could not connect to the server 127.0.0.1:18943 (error code 1 - "EOF reached")
    6
    7Make sure the bitcoind server is running and that you are connecting to the correct RPC port.
    8Use "bitcoin-cli -help" for more info.
    9[...]
    

    1. fc7b2148 ↩︎

  16. mzumsande commented at 6:38 pm on December 26, 2024: contributor

    utACK fa63b8232f38e78d3c6413fa7d51809f376de75c

    I think in theory it would be sufficient to lock m_chainstate_mutex for the active chainstate instead of cs_main during createNewBlock, but I don’t think it matters.

  17. achow101 commented at 7:26 pm on December 30, 2024: member
    ACK fa63b8232f38e78d3c6413fa7d51809f376de75c
  18. achow101 merged this on Dec 30, 2024
  19. achow101 closed this on Dec 30, 2024

  20. Sjors commented at 8:07 pm on December 30, 2024: member
    Post merge ACK
  21. maflcko deleted the branch on Jan 2, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me