consensus: Avoid data race in CBlock class #14803

pull hebasto wants to merge 2 commits into bitcoin:master from hebasto:20181125-cblock-threadsafe changing 4 files +39 −15
  1. hebasto commented at 11:25 PM on November 25, 2018: member

    Ref: #14058, #14072

    CheckBlock can be invoked from multiple threads in parallel. To avoid a data race std::atomic<bool> is used. All comments from #14072 are addressed.

    As this PR introduces copy constructors for both CBlockHeader and CBlock classes, additional refactoring proposed for CBlock::GetBlockHeader().

  2. fanquake added the label Consensus on Nov 25, 2018
  3. fanquake commented at 11:28 PM on November 25, 2018: member

    @hebasto Can you also link to #14058 from the PR description, as that is the issue #14072 was solving.

  4. hebasto commented at 11:32 PM on November 25, 2018: member

    @fanquake Done.

  5. hebasto renamed this:
    consensus: Avoid data race in CBlock class
    [WIP] consensus: Avoid data race in CBlock class
    on Nov 25, 2018
  6. Avoid data race in CBlock class a57873f0f9
  7. Refactoring with CBlockHeader copy constructor 945a642ff4
  8. hebasto force-pushed on Nov 26, 2018
  9. hebasto renamed this:
    [WIP] consensus: Avoid data race in CBlock class
    consensus: Avoid data race in CBlock class
    on Nov 26, 2018
  10. gmaxwell commented at 2:01 AM on November 26, 2018: contributor

    Thanks for working on this.

    Simply making the check flag atomic doesn't really seem correct to me-- why would we be checking the same block object twice concurrently? Just making the test atomic prevents the that particular bit of UB but it doesn't mean that there isn't wasted work, or even the potential of a deadlock.

    Can someone please summarize how the same block can get checked in parallel and why it makes sense for that to happen? 14058 simply states the test fail with thread sanitizing enabled, but don't explain the sequence of events that lead to the race. I want to make sure that we're not bandaging over a structural flaw that should be fixed with a lock someplace or some other change.

  11. DrahtBot commented at 2:44 AM on November 26, 2018: member

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #14829 (travis: Enable functional tests in the ThreadSanitizer (TSan) build job by practicalswift)
    • #10785 (Serialization improvements by sipa)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  12. promag commented at 11:49 AM on November 26, 2018: member

    I agree with @gmaxwell, why make data structure thread safe?

    Beside, CBlock::operator= doesn't look thread safe to me, can't m_checked change between load and store?

  13. MarcoFalke commented at 5:28 PM on November 26, 2018: member

    The race happens because we run 10 threads in parallel that check 1000 random blocks in sequence. With only about 100 blocks in total, it is unlikely that every thread picks blocks that have not yet been picked by another thread. So I think there are at least two alternative solutions:

    • Assume that we only ever call ProcessNewBlock for each block once, so we'd rewrite the unit test to not pick a random block, but pop (remove and return) a random block from the list. But this assumption seems unlikely, since a block could arrive on the p2p and rpc interface at the same time, which then both call ProcessNewBlock.
    • Assume that ProcessNewBlock avoids checking the same block twice, which could be fixed by guarding the read-write of fChecked on a larger scope, instead of just fixing the symptom that the member variable is racy.
    • something else?
  14. promag commented at 5:37 PM on November 26, 2018: member

    I think 2nd option is the most reasonable.

  15. sipa commented at 6:09 PM on November 26, 2018: member

    My preference is avoid adding synchronization primitives to structures that don't need it. Here it seems the case that the validation logic doesn't actually supported multi-threaded block validation, but the unit tests verify (a small) part of it regardless. That seems overkill.

  16. MarcoFalke commented at 8:27 PM on November 26, 2018: member

    If it doesn't support multi-threaded block validation, it should still be fixed to avoid multiple threads calling it at the same time, because that is a thing that can happen in practice with the rpc thread running along the "main thread".

  17. gmaxwell commented at 9:58 PM on November 27, 2018: contributor

    It doesn't support multithreaded validation and there are lot of things that prevent that, which is why I was concerned. Why doesn't the lock on the block index or even cs main prevent concurrency here?

    I'd missed that the original case was a unit test.

  18. MarcoFalke commented at 10:12 PM on November 27, 2018: member

    Why doesn't the lock on the block index or even cs main prevent concurrency here?

    The cs main lock was moved down in this change "Harden against mistakes handling invalid blocks" #9765 by @sdaftuar

    So unless there was a reason to not put it before CheckBlock, I suggest to move it up by one line.

  19. MarcoFalke closed this on Nov 29, 2018

  20. laanwj referenced this in commit 5ab5341d13 on Dec 1, 2018
  21. hebasto deleted the branch on Dec 1, 2018
  22. Munkybooty referenced this in commit f60f1954de on Aug 17, 2021
  23. vijaydasmp referenced this in commit 845b6df19e on Sep 5, 2021
  24. vijaydasmp referenced this in commit 67893ff277 on Sep 6, 2021
  25. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-16 21:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me