refactor: Replace sets of txiter with CTxMemPoolEntryRefs #28886

pull TheCharlatan wants to merge 4 commits into bitcoin:master from TheCharlatan:setEntryRefs changing 18 files +384 −382
  1. TheCharlatan commented at 10:07 pm on November 15, 2023: contributor

    Currently the mempool returns and consumes sets of multiindex iterators in its public API. A likely motivation for this over working with references to the actual values is that the multi index interface works with these iterators and not with pointers or references to the actual values.

    However, using the iterator type in the setEntries set provides little benefit in practice as applied currently. Its purpose, ownership, and safety semantics often remain ambiguous, and it is hardly used for actually iterating through the data structures. So replace it where possible with CTxMemPoolEntryRefs.

    Since CTxMemPoolEntry itself refers to its Parents and Children by CTxMemPoolEntryRef and not txiter, this allowed for an overall reduction of calls to iterator_to. See the docs on iterator_to for more guidance.

    No change in the performance of the mempool code was observed in my benchmarks.

    This also makes the goal of eliminating boost types from the headers as done in #28335 more feasible.

  2. DrahtBot commented at 10:07 pm on November 15, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK theuni

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29325 (consensus: Store transaction nVersion as uint32_t by achow101)
    • #29306 (policy: enable sibling eviction for v3 transactions by glozow)
    • #26593 (tracing: Only prepare tracepoint arguments when actually tracing by 0xB10C)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Refactoring on Nov 15, 2023
  4. in src/validation.cpp:343 in fee19f9709 outdated
    336+    const auto filter_final_and_mature = [this](const CTxMemPoolEntry& entry)
    337         EXCLUSIVE_LOCKS_REQUIRED(m_mempool->cs, ::cs_main) {
    338         AssertLockHeld(m_mempool->cs);
    339         AssertLockHeld(::cs_main);
    340-        const CTransaction& tx = it->GetTx();
    341+        const CTransaction& tx = entry.GetTx();
    


    mmelotti commented at 10:09 pm on November 15, 2023:
    thanks! really think entry is much better.
  5. TheCharlatan force-pushed on Nov 15, 2023
  6. DrahtBot added the label CI failed on Nov 15, 2023
  7. TheCharlatan force-pushed on Nov 15, 2023
  8. maflcko commented at 7:07 am on November 16, 2023: member
     0 test  2023-11-15T23:32:10.957000Z TestFramework (ERROR): Assertion failed 
     1                                   Traceback (most recent call last):
     2                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main
     3                                       self.run_test()
     4                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 320, in run_test
     5                                       self.replaced_test()
     6                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 263, in replaced_test
     7                                       assert_equal(event.replacement_fee, replacement_fee)
     8                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 56, in assert_equal
     9                                       raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
    10                                   AssertionError: not(40000 == 45000)
    
  9. TheCharlatan marked this as a draft on Nov 16, 2023
  10. TheCharlatan force-pushed on Nov 16, 2023
  11. TheCharlatan marked this as ready for review on Nov 16, 2023
  12. in src/txmempool.cpp:461 in 7248c0a078 outdated
    464@@ -468,10 +465,10 @@ void CTxMemPool::addUnchecked(const CTxMemPoolEntry &entry, setEntries &setAnces
    465     // to clean up the mess we're leaving here.
    466 
    467     // Update ancestors with information about this tx
    468-    for (const auto& pit : GetIterSet(setParentTransactions)) {
    469-            UpdateParent(newit, pit, true);
    470+    for (const CTxMemPoolEntry& pentry : GetEntrySet(setParentTransactions)) {
    


    maflcko commented at 1:06 pm on November 16, 2023:
    0    for (const CTxMemPoolEntry& entry : GetEntrySet(setParentTransactions)) {
    

    nit: Not a pointer?


    TheCharlatan commented at 1:11 pm on November 16, 2023:
    The p is for parent? Just like before in pit?
  13. maflcko requested review from maflcko on Nov 16, 2023
  14. DrahtBot removed the label CI failed on Nov 16, 2023
  15. maflcko commented at 4:26 pm on November 16, 2023: member

    I wonder if it is possible to avoid copies of the entry. Currently one one copy is created when inserting into mapTx.

    However, now that things are moving toward references, which are easy to copy accidentally. It may lead to bugs such as the one mentioned in the boost docs:

    0// The following, though similar to the previous code,
    1// does not work: iterator_to accepts a reference to
    2// the element in the container, not a copy.
    3int x=c.back();
    4c.erase(c.iterator_to(x)); // run-time failure ensues
    

    So disallowing copies would be a good belt-and-suspenders?

  16. TheCharlatan commented at 10:13 pm on November 16, 2023: contributor

    Re #28886 (comment)

    However, now that things are moving toward references, which are easy to copy accidentally. It may lead to bugs such as the one mentioned in the boost docs:

    Thanks for highlighting this.

    So disallowing copies would be a good belt-and-suspenders?

    I did a quick POC where mapTx.insert is replaced with mapTx.emplace_back and the CTxMemPoolEntry constructor arguments are passed to emplace_back. Then I could delete the CTxMemPoolEntry copy constructor. It takes a bit of shuffling around in the fuzzing and test code, as well as the addUnchecked method to forward the required arguments, so not sure if it should go into this PR or a follow up. Let me know what you think.

  17. maflcko commented at 8:27 am on November 17, 2023: member
    There shouldn’t be any conflicts, so maybe a separate/independent/parallel pull?
  18. theuni commented at 10:18 pm on November 22, 2023: member

    Concept ACK on avoiding exposing iterators externally.

    Exposing references instead seems like a bit of a lateral move though, and I think it’s going to take a good bit of convincing to prove that this is at least as safe than what we have now. Are references guaranteed by Boost to stay valid the same as iters are?

  19. TheCharlatan commented at 10:48 pm on November 22, 2023: contributor

    Re #28886 (comment)

    and I think it’s going to take a good bit of convincing to prove that this is at least as safe than what we have now.

    We’re using the references now to keep track of the parents and children. I don’t see what new assumption this patch is introducing compared to this existing practice.

    Are references guaranteed by Boost to stay valid the same as iters are?

    I’d be interested in getting an answer to this though. I’m guessing, since we can pass objects without copy or move ctors to the multiindex map, they can’t be moved in memory, so the references should be safe, right?

  20. DrahtBot added the label Needs rebase on Nov 24, 2023
  21. TheCharlatan force-pushed on Nov 24, 2023
  22. TheCharlatan commented at 6:01 pm on November 24, 2023: contributor

    Rebased 7248c0a078370fc52b9b36d2ed68b3257fe99818 -> fe8c85e2eb93145d5b4db809918f0948c0cfc948 (setEntryRefs_0 -> setEntryRefs_1, compare)

    • Resolved conflict with #28922
  23. DrahtBot removed the label Needs rebase on Nov 24, 2023
  24. sdaftuar commented at 1:58 pm on November 28, 2023: member
    Though #28676 is still in draft (and likely will remain that way for a while), my plan is to completely rewrite most of the mempool logic. To minimize rebase effort, I wonder if it would be reasonable to defer mempool refactors for the time being, and just try to ensure that these goals are reflected in that work when it becomes ready for full review?
  25. TheCharlatan commented at 2:53 pm on November 28, 2023: contributor

    Re #28886 (comment)

    To minimize rebase effort

    The rebase effort doesn’t seem too bad given that all this change does is go from a complex wrapper type to its simpler underlying type, so it should be very mechanical. From reading through your patch set I also don’t see new complexities arising from novel ways of using the multi index, on the contrary I see places where new calls to mapTx.iterator_to fall away. I should go through the exercise myself though to make sure this PR doesn’t complicate yours.

    I get though that there are more fun things to do than rebasing a 64 commit PR because of a silly type change, so if you want to adapt your patch set towards facilitating removing the boost headers from the mempool header files, as well as reducing the number of iterator_to type casts, that sounds good to me too.

  26. achow101 referenced this in commit 535424a10b on Nov 28, 2023
  27. TheCharlatan force-pushed on Nov 28, 2023
  28. TheCharlatan commented at 9:22 pm on November 28, 2023: contributor

    Rebased fe8c85e2eb93145d5b4db809918f0948c0cfc948 -> 2685c9a228c0b2af9bc9bf6bc902bdf816263a22 (setEntryRefs_1 -> setEntryRefs_2, compare)

    • With #28903 merged, reviewers can work under the assumption that no spurious copies of CTxMemPoolEntry occur.
    • Added a commit detailing the reason for making explicit/deleting the CTxMemPoolEntry copy and move constructors.
  29. DrahtBot added the label Needs rebase on Dec 1, 2023
  30. TheCharlatan force-pushed on Dec 1, 2023
  31. TheCharlatan commented at 10:40 pm on December 1, 2023: contributor

    Rebased 2685c9a228c0b2af9bc9bf6bc902bdf816263a22 -> 54f0dc5adc17ca976d9c6e997f317dc22318768f (setEntryRefs_2 -> setEntryRefs_3, compare)

  32. DrahtBot removed the label Needs rebase on Dec 1, 2023
  33. sdaftuar commented at 3:31 pm on December 7, 2023: member

    However, using the iterator type in the setEntries set provides little benefit in practice as applied currently. Its purpose, ownership, and safety semantics often remain ambiguous, and it is hardly used for actually iterating through the data structures. So replace it where possible with CTxMemPoolEntryRefs.

    I was thinking about this issue of ownership and safety semantics some more – it seems like that most of the issues around iterators exist with CTxMemPoolEntryRefs as well, which is that they are only valid to hold onto as long as the mempool is locked. Once the lock is released, it’s no longer safe to use mapTx iterators or CTxMemPoolEntryRef’s, because they can be deleted out from under you.

    So if we really want to address ownership and safety semantics, I think we need to do something a bit different? (Unfortunately I don’t have any great suggestions.)

  34. DrahtBot added the label CI failed on Jan 16, 2024
  35. DrahtBot added the label Needs rebase on Feb 10, 2024
  36. TheCharlatan force-pushed on Feb 10, 2024
  37. TheCharlatan commented at 9:04 am on February 10, 2024: contributor

    Rebased 54f0dc5adc17ca976d9c6e997f317dc22318768f -> c34dcea019ea3b639b9f387df18777bc9a2d1534 (setEntryRefs_3 -> setEntryRefs_4, compare)

  38. DrahtBot removed the label CI failed on Feb 10, 2024
  39. DrahtBot removed the label Needs rebase on Feb 10, 2024
  40. DrahtBot added the label Needs rebase on Feb 13, 2024
  41. mempool: Replace sets of txiter with CTxMemPoolEntryRefs
    Currently the mempool returns and consumes sets of multiindex iterators
    in its public API. A likely motivation for this over working with
    references to the actual values is that the multi index interface works
    with these iterators and not with pointers or references to the actual
    values.
    
    However, the iterator type provides little benefit in practice as
    applied currently. Its purpose, ownership, and safety semantics often
    remain ambiguous, and it is hardly used for actually iterating through
    the data structures. So replace it where possible with
    CTxMemPoolEntryRefs.
    
    Since CTxMemPoolEntry itself refers to its Parents and Children by
    CTxMemPoolEntryRef and not txiter, this allowed for an overall
    reduction of calls to iterator_to.
    
    This also makes the long term goal of eliminating boost types from the
    headers more feasible.
    059fc2a89b
  42. mempool: Remove dead code 5e4cf6c1eb
  43. mempool: Remove txiter/setEntries from public interface 755b7d86b6
  44. doc: Document CTxMemPoolEntry copy and move constructors 7882748c1c
  45. TheCharlatan force-pushed on Feb 15, 2024
  46. TheCharlatan commented at 3:46 pm on February 15, 2024: contributor

    Rebased c34dcea019ea3b639b9f387df18777bc9a2d1534 -> 7882748c1c260dd77c22ac3a85bad18768c08f0d (setEntryRefs_4 -> setEntryRefs_5, compare)

  47. DrahtBot removed the label Needs rebase on Feb 15, 2024
  48. TheCharlatan commented at 7:58 am on March 10, 2024: contributor
    Going to close this while work on cluster mempool is happening. Hopefully a solution for leaking the boost types can be found there. Will keep the parent #28335 open.
  49. TheCharlatan closed this on Mar 10, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 07:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me