Improve UpdateTransactionsFromBlock with Epochs #17925

pull JeremyRubin wants to merge 2 commits into bitcoin:master from JeremyRubin:epoch-mempool-clean-split changing 2 files +87 −14
  1. JeremyRubin commented at 9:14 pm on January 14, 2020: contributor

    UpdateTransactionsFromBlock is called during a re-org. When a re-org occurs, all of the transactions in the mempool may be descendants from a transaction which is in the pre-reorg block. This can cause us to propagate updates, worst case, to every transaction in the mempool.

    Because we construct a setEntries setChildren, which is backed by a std::set, it is possible that this algorithm is O(N log N).

    By using an Epoch visitor pattern, we can limit this to O(N) worst case behavior.

    Epochs are also less resource intensive than almost any set option (e.g., hash set) because they are allocation free.

    This PR is related to #17268, it is a small subset of the changes which have been refactored slightly to ease review. If this PR gets review & merge, I will follow up with more PRs (similar to #17268) to improve the mempool

  2. fanquake added the label Mempool on Jan 14, 2020
  3. in src/txmempool.h:458 in d63d1f3714 outdated
    453@@ -453,6 +454,8 @@ class CTxMemPool
    454     mutable int64_t lastRollingFeeUpdate;
    455     mutable bool blockSinceLastRollingFeeBump;
    456     mutable double rollingMinimumFeeRate; //!< minimum fee to get into the pool, decreases exponentially
    457+    mutable uint64_t m_epoch;
    458+    mutable bool has_epoch_guard;
    


    sdaftuar commented at 9:40 pm on January 14, 2020:
    I don’t see initializations for these variables anywhere?
  4. in src/txmempool.h:743 in d63d1f3714 outdated
    738@@ -736,6 +739,34 @@ class CTxMemPool
    739      *  removal.
    740      */
    741     void removeUnchecked(txiter entry, MemPoolRemovalReason reason) EXCLUSIVE_LOCKS_REQUIRED(cs);
    742+public:
    743+    // This function mutates mutable state!
    


    sdaftuar commented at 9:52 pm on January 14, 2020:

    A bit confusing to have this comment (about a function) right before a class definition, I’m guessing it is leftover from something else (or should be moved a few lines down).

    Actually I think it’d make sense to explain here a little bit about what these epochs are and why/how to use them? (Also, as this would be the first introduction of epochs in the code, it’d be helpful for reviewers of the future PRs that will use it in more places to be able to remember/understand what this construction is.) Perhaps something like:

    0EpochGuard: RAII-style guard for using epoch-based graph traversal algorithms
    1When walking ancestors or descendants, we generally want to avoid visiting the 
    2same transactions twice. In some places we use std::set (or setEntries) to deduplicate
    3 what we visit, but we can do (algorithmically) better by using a counter ("epoch") that 
    4we set on each transaction and comparing against a global epoch to track whether 
    5we've visited something already during a calculation.
    

    Or something like that…


    JeremyRubin commented at 3:31 am on January 15, 2020:
    great comment; rewrote it slightly to a format I think is more clear & also to emphasize that replacement with epochs is ongoing work :)
  5. sdaftuar commented at 9:58 pm on January 14, 2020: member
    Looks like a great application of this traversal algorithm – just a couple comments that I think we should address, otherwise code review ACK. Will test a bit.
  6. DrahtBot commented at 10:27 pm on January 14, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #17786 (refactor: Nuke policy/fees->mempool circular dependencies by hebasto)
    • #16910 (wallet: reduce loading time by using unordered maps by achow101)
    • #10443 (Add fee_est tool for debugging fee estimation code by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. Add Epoch Guards to CTXMemPoolEntry and CTxMemPool 2ccb7cca4a
  8. Make UpdateTransactionsFromBlock use Epochs bd5a026928
  9. JeremyRubin force-pushed on Jan 15, 2020
  10. sdaftuar commented at 3:32 pm on January 16, 2020: member

    I don’t know what’s up with the appveyor build but I think this code is correct, it passes all tests for me locally.

    As a sanity check, I verified that this code does give a speedup in a simple scenario where it would be expected to improve things: create a transaction with 2000 outputs that is confirmed in a block; then spend each of those outputs in a separate transaction in the mempool; then disconnect the block and measure the time spent in UpdateTransactionsFromBlock (which must visit each of those 2000 children). I observed a roughly 12% improvement in runtime in this simple test.

    ACK bd5a02692853f7240a4fdc593d7d0123d7916e45

  11. JeremyRubin added this to the "Epoch MemPool" column in a project

  12. in src/txmempool.h:763 in bd5a026928
    758+     * traversal should be viewed as a TODO for replacement with an epoch based
    759+     * traversal, rather than a preference for std::set over epochs in that
    760+     * algorithm.
    761+     */
    762+    class EpochGuard {
    763+        const CTxMemPool& pool;
    


    ajtowns commented at 3:06 am on January 23, 2020:

    I think this would be cleaner with the ’epoch’ concept split out. I’m thinking:

     0class Epoch {
     1public:
     2    uint64_t m_epoch;
     3    bool m_guarded;
     4
     5    Epoch() : m_epoch{0}, m_guarded{false} {}
     6    Epoch(const Epoch&) = delete; // no copy constructor
     7    Epoch& operator=(const Epoch&) = delete; // not assignable
     8
     9    class SCOPED_LOCKABLE Guard {
    10        Epoch& m_epoch;
    11    public:
    12        Guard(Epoch& epoch) EXCLUSIVE_LOCK_FUNCTION(epoch) LOCKS_EXCLUDED(epoch);
    13        ~Guard() UNLOCK_FUNCTION();
    14    };
    15};
    16#define WITH_EPOCH(epoch) const Epoch::Guard PASTE2(epoch_guard_, __COUNTER__)(epoch)
    

    and

    0class CTxMemPool {
    1    void UpdateTransactionsFromBlock(const std::vector<uint256>& vHashesToUpdate) EXCLUSIVE_LOCKS_REQUIRED(cs, cs_main) LOCKS_EXCLUDED(m_epoch);
    2.
    3    mutable Epoch m_epoch GUARDED_BY(cs);
    4    bool visited(txiter it) const EXCLUSIVE_LOCKS_REQUIRED(cs, m_epoch) { .. }
    5};
    

    and

    0    WITH_EPOCH(m_epoch); // instead of const auto epoch = GetFreshEpoch()
    

    This lets clang check visited() is only invoked within an epoch at compile time, which seems nice.

    Using LOCKS_EXCLUDED gets part of the way to ensuring you’re not doing WITH_EPOCH recursively at compile time (eg, holding it from UpdateTxsFromBlock while calling UpdateForDescendants once UpdForDesc has been epoch-ised), but only works as long as you don’t have indirect invocations – ie if A() calls B() and B() is marked with LOCKS_EXCLUDED, that doesn’t cause A() to also be marked with LOCKS_EXCLUDED.

    This also makes the mutability/constness stuff clearer – in my opinion, anyway.


    JeremyRubin commented at 0:05 am on January 24, 2020:

    Hmm… I think that you’re right that this code is somehow better, but TBH I don’t really understand how the clang lock analyzing stuff works & don’t know if I could personally maintain this code… Is there anything better to look at than http://releases.llvm.org/3.5.0/tools/clang/docs/ThreadSafetyAnalysis.html on how to understand this clang extension?

    Any objection to this being a follow on PR?

    I also have concern that this isn’t compatible with future changes that may recursively use the epoch guard (there are algorithms that are recursive reuse safe, if you want to express a case where A has an epoch which calls B, and B creates a new Epoch which may re-traverse all of the mempool entries but A will not re-traverse anything traversed in B). You could imagine a use for this if I wanted to walk the whole mempool and traverse every connected component once, but at the same time.

    This requires some modified behavior (e.g., making epoch guards aware of the epoch they are guarding), but can still work.

    0const auto epoch = GetFreshEpoch();
    1for (auto& elt : mapTx) {
    2    if (epoch.visited(elt)) continue;
    3    const auto epoch = epoch.SubEpoch();
    4    vector<txiter> component;
    5    GetAllAncestorsEpochAlreadyHeld(elt, component);
    6    GetAllDescendentsEpochAlreadyHeld(elt, component);
    7    DoSomethingWithComponent(component);
    8}
    

    ajtowns commented at 2:55 am on January 28, 2020:

    If you’re doing an “already held” thing, as far as clang goes that’s just marking up the function with “EXCLUSIVE_LOCKS_REQUIRED(epoch)”. I think something like:

    0class SubEpoch { uint64_t subepoch; }
    1class TxMemPool {
    2  ...
    3  GetAllAncestorsEpochAlreadyHeld(elt, vector<txiter>& component, const SubEpoch& subepoch) EXCLUSIVE_LOCKS_REQUIRED(epoch);
    4};
    

    would make things pretty typesafe.

    In the example you give, if you had a tx graph:

    • A outputs A.0, A.1
    • B spends A.0, output B.0
    • C spends A.1, output C.0
    • D spends B.0, C.0

    then if you traverse mapTx ordered as B,A,C,D you’ll do {B,A,D} first, then {C,A,D} second; while if you’d traversed A or D first, you’d do {A,B,C,D} or {D,A,B,C} all in one go. I can’t see why you’d end up with an algorithm like that.

    Yeah, doing the refactor as a separate PR sounds fine


    JeremyRubin commented at 6:37 pm on January 28, 2020:

    @ajtowns so I think you’re missing the algorithm that I suggested slightly, you’d actually in this case traverse all of {A, B, C, D} in the first sub-epoch, because you would be traversing all parents and children recursively.

    So if you had another transaction E, with only confirmed inputs, and F which spends E.0, then you would output:

    {{A, B, C, D}, {E, F}}

    Hence being able to traverse the mempool and pick out connected components.

    However, on further thought it occurs to me that this sort of algorithm would not require sub-epochs at all.


    ajtowns commented at 4:41 am on January 29, 2020:
    Depends how you traversed mapTx – if you picked elt=B first, then “C” would not be a parent or child of “B” so you’d only do {A,B,D} in the first subepoch. You’d need to be ordered by feerate or something weird for this to happen, though. But yeah, I can’t think of something where sub-epochs is actually useful either. But whatever, >= vs == doesn’t make much difference.

    ajtowns commented at 4:46 am on January 29, 2020:
    follow-on PR is #18017

    JeremyRubin commented at 4:57 am on January 29, 2020:

    Ah I think here’s the miscommunication of what the algorithm “GetComponents” does. It’s something like:

     0vector<vector<txiter>> components;
     1for (auto elt: mapTx) {
     2   vector<txiter> to_process;
     3   vector<txiter> processed;
     4   if (visited(elt)) continue;
     5   to_process.push_back(elt);
     6   while (to_process.size()) {
     7    auto todo = to_process.back();
     8    to_process.pop_back();
     9    processed.push_back(todo);
    10   for(auto y : GetMemPoolChildren(todo)) {
    11        if (visited(y)) continue;
    12        to_process.push_back(y);
    13   }
    14   for(auto x : GetMemPoolParents(todo)) {
    15       if (visited(x)) continue;
    16        to_process.push_back(x);
    17    }
    18
    19}
    20    component.push_back(processed);
    21}
    

    This lets you find all connected components I think, and passes the test of B-first. A is a parent of B, so then A goes into the to_process stack, then C is a child of A, so C goes onto the to_process stack.

    Without epochs this would be much messier, but with epochs we can see that things only go onto the stack one time ever because of the visited guards.

  13. in src/txmempool.h:787 in bd5a026928
    782+        bool ret = it->m_epoch >= m_epoch;
    783+        it->m_epoch = std::max(it->m_epoch, m_epoch);
    784+        return ret;
    785+    }
    786+
    787+    bool visited(Optional<txiter> it) const EXCLUSIVE_LOCKS_REQUIRED(cs) {
    


    ajtowns commented at 3:11 am on January 23, 2020:
    This function doesn’t seem to be needed yet; suggest deferring it until it is.

    JeremyRubin commented at 0:19 am on January 24, 2020:

    Correct; it’s not needed yet. I’m indifferent on if it stays or goes. I figured it’s better for it to live in the same commit as the introduction of it’s non-optioned version, but unused functions aren’t great to introduce.

    Given that:

    1. It will be introduced in later work, relatively soon
    2. There’s already an ack for this specific revision – meaning both that this function has already been reviewed and doesn’t require re-review in the future, but also that we won’t make suhas have to re-ack this change.
    3. There’s nothing detectable wrong with it

    I’d rather just keep it for now and focus on getting out the PR that depends on it sooner.


    hebasto commented at 8:58 pm on January 28, 2020:
    If Optional<txiter> it has no value, returned value visited() == true seems semantically a bit weird. Could function naming be improved?

    JeremyRubin commented at 9:30 pm on January 28, 2020:

    It’s not that weird of a decision once you look at the code that depends on it.

    Essentially, if I code something like:

    0for (auto& it : mapTx) {
    1    auto it2 = FindIterByPrevout(it->GetTransaction()->vin[0].prevout);
    2    if (visited(it2)) continue;
    3}
    

    this is kind of semantic, because if the FindByPrevout returns nullopt then we want to skip it because we’ve (in theory) already traversed everything we could want to traverse. @sdaftuar and I spent like at least an hour trying to come up with a better semantic name than the predecessor to visited, and decided that visited was sufficiently semantic. But if you have a better suggestion, would be happy to consider it, we just couldn’t come up with something that wasn’t either awkward sounding or less semantic in other cases.


    ajtowns commented at 3:14 am on January 29, 2020:
    needs_processing(it2) or similar could work, but visited seems pretty fine to me. (I would prefer it if it more clearly indicated that it’s not idempotent/const; ie bool a = visited(it); bool b = visited(it); could result in !a && b, so maybe first_visit() or similar would be better? But still, it’s already fine)

    JeremyRubin commented at 3:59 am on January 29, 2020:

    Things like needs_processing aren’t great because there can be some sort of implication that it2 needs processing after it’s been marked, and doesn’t convey the action. What it is a bit better for is conveying the optionality aspect (something that’s an nullopt doesn’t need processing).

    The reason we liked visited as a name is that there is a hint that it’s not idempotent, that visited suggests both a query (was it2 visited already?) and a statement (it2 has been visited now).

    Ultimately a names just a name, and I don’t think visited is horribly unclear. If visited isn’t sufficient of a name, I’d opt for removing the optional version of this function (as previously noted it’s unused) and just inlining the optional handling where it’s used (although it wouldn’t quite be the same because of the guard assertion checks).

    (One of the other reasons for introducing this PR now rather than later by the way is that there are two or more separate places in the code presently that use Optional txiters, introducing this in a common PR means that those changes can proceed independently of one another.)

  14. in src/txmempool.h:783 in bd5a026928
    778+     *
    779+     */
    780+    bool visited(txiter it) const EXCLUSIVE_LOCKS_REQUIRED(cs) {
    781+        assert(m_has_epoch_guard);
    782+        bool ret = it->m_epoch >= m_epoch;
    783+        it->m_epoch = std::max(it->m_epoch, m_epoch);
    


    ajtowns commented at 3:15 am on January 23, 2020:

    I think:

    0bool visited(txiter it) const {
    1    assert(m_has_epoch_guard);
    2    bool visited = it->m_epoch == m_epoch; // only visits this epoch count
    3    if (!visited) it->m_epoch = m_epoch;
    4    return visited;
    5}
    

    would be clearer. (Also, this could be const txiter it since you’re only changing what the iter points at)


    JeremyRubin commented at 0:09 am on January 24, 2020:

    See above for note on future recursive locking – given that this is a future concern, I agree it could be postponed, but I like that the current version of the code wouldn’t have to change if modified.

    In your version it’s also not clear that if it->m_epoch != m_epoch, that !(it->m_epoch > m_epoch), which violates the contract that it->m_epoch should be monotonic.

    It’s easier IMO to validate this by using std::max, then it->m_epoch is guaranteed to weakly increase.


    hebasto commented at 8:50 pm on January 28, 2020:
    Agree with @ajtowns that replacing std::max() with conditional statement makes things clearer.

    JeremyRubin commented at 9:34 pm on January 28, 2020:

    if i were to update it, I would prefer:

    0if (!ret) it->m_epoch = m_epoch
    

    to preserve the aforementioned invariant of monotonicity. This is equivalent to std::max, so I don’t have any preference on the style, with a slight preference to what’s already there and reviewed.

  15. ajtowns commented at 4:12 am on January 23, 2020: member
    Looks good
  16. in src/txmempool.h:132 in 2ccb7cca4a outdated
    128@@ -129,6 +129,7 @@ class CTxMemPoolEntry
    129     int64_t GetSigOpCostWithAncestors() const { return nSigOpCostWithAncestors; }
    130 
    131     mutable size_t vTxHashesIdx; //!< Index in mempool's vTxHashes
    132+    mutable uint64_t m_epoch; //!< epoch when last touched, useful for graph algorithms
    


    ariard commented at 10:19 pm on January 27, 2020:

    2ccb7cc

    I think you considered it but why not m_entry_epoch, to avoid names collision for reviewers?


    JeremyRubin commented at 6:50 pm on January 28, 2020:

    fair point – I felt that prefixing it with the class name was a bit gauche compared to just naming it the same thing. It’s also an internal detail that isn’t going to be referenced or inspected by any callers, so it isn’t going to yield much name colliding code.

    I suggest that when @ajtowns refactors a lock analyzer friendly interface he can pick whatever variable names he thinks appropriate (except maybe m_aj_is_number_one).

  17. in src/txmempool.cpp:1123 in 2ccb7cca4a outdated
    1118+}
    1119+
    1120+CTxMemPool::EpochGuard::~EpochGuard()
    1121+{
    1122+    // prevents stale results being used
    1123+    ++pool.m_epoch;
    


    ariard commented at 10:40 pm on January 27, 2020:

    2ccb7cc

    What kind of stale scenario do you envision here ?

    If taking twice the epoch is attempted, the assert will be hit and should be able to spot this kind of programming error.

    Otherwise, if you prevent visited returning true for past value, I think that’s bettter than enforcing lock taking on epoch (because worst-case you reprocess an entry, which is better than taking a lock on the average-case, this maybe contrary to AJ opinion).


    JeremyRubin commented at 6:43 pm on January 28, 2020:

    It’s not required for correctness, but I like that it enforces the invariant that if an epoch guard is not held that all mempool entries are “younger” than the mempool’s m_epoch.

    By “prevents stale results being used”, I meant that if some nefarious caller wanted to manually check epochs outside of the context of an epoch guard, they wouldn’t “gain” information about the previous traversal. Of course, this is false because they could just check if it->m_epoch == m_epoch-1, but I personally felt it was a nice way to communicate the intent.

    Also uint64_t is big, so using an extra epochs count per interval isn’t going to cause any sort of issue.

  18. ariard approved
  19. ariard commented at 10:59 pm on January 27, 2020: member

    Code review ACK bd5a026

    Failure case we want to prevent is false positive of visited, returning we already updated this entry when in fact we haven’t done so. This could happen if epoch counter isn’t increased before every update, assert in visited create this strict ordering, though epoch API may be revisited in further commits if we don’t safe enough (specially in case of recursive epoch uses)

  20. hebasto commented at 7:29 pm on January 28, 2020: member
    Concept ACK.
  21. in src/txmempool.cpp:26 in bd5a026928
    22@@ -23,7 +23,7 @@ CTxMemPoolEntry::CTxMemPoolEntry(const CTransactionRef& _tx, const CAmount& _nFe
    23                                  int64_t _nTime, unsigned int _entryHeight,
    24                                  bool _spendsCoinbase, int64_t _sigOpsCost, LockPoints lp)
    25     : tx(_tx), nFee(_nFee), nTxWeight(GetTransactionWeight(*tx)), nUsageSize(RecursiveDynamicUsage(tx)), nTime(_nTime), entryHeight(_entryHeight),
    26-    spendsCoinbase(_spendsCoinbase), sigOpCost(_sigOpsCost), lockPoints(lp)
    27+    spendsCoinbase(_spendsCoinbase), sigOpCost(_sigOpsCost), lockPoints(lp), m_epoch(0)
    


    hebasto commented at 8:31 pm on January 28, 2020:
    nit: might default member initializer be used instead?

    JeremyRubin commented at 9:07 pm on January 28, 2020:
    I think it’s best to be style consistent with the other fields which are also set by the initializer list.
  22. in src/txmempool.cpp:330 in bd5a026928
    326@@ -325,7 +327,7 @@ void CTxMemPoolEntry::UpdateAncestorState(int64_t modifySize, CAmount modifyFee,
    327 }
    328 
    329 CTxMemPool::CTxMemPool(CBlockPolicyEstimator* estimator)
    330-    : nTransactionsUpdated(0), minerPolicyEstimator(estimator)
    331+    : nTransactionsUpdated(0), minerPolicyEstimator(estimator), m_epoch(0), m_has_epoch_guard(false)
    


    hebasto commented at 8:32 pm on January 28, 2020:
    nit: might default member initializers be used instead?

    JeremyRubin commented at 9:07 pm on January 28, 2020:
    I think it’s best to be style consistent with the other fields which are also set by the initializer list.
  23. in src/txmempool.cpp:133 in bd5a026928
    137-            // We can skip updating entries we've encountered before or that
    138-            // are in the block (which are already accounted for).
    139-            if (setChildren.insert(childIter).second && !setAlreadyIncluded.count(childHash)) {
    140-                UpdateChild(it, childIter, true);
    141-                UpdateParent(childIter, it, true);
    142+        // we cache the in-mempool children to avoid duplicate updates
    


    hebasto commented at 8:34 pm on January 28, 2020:
    Is this comment still relevant?

    JeremyRubin commented at 9:18 pm on January 28, 2020:
    Hmm. The comment is still relevant insofar as it explains what the EpochGuard is doing, but specifically the word “cache” is perhaps a bit vague.
  24. in src/txmempool.h:756 in bd5a026928
    751+     * counter to track the time (or, "epoch") that we began a traversal, and
    752+     * check + update a per-transaction epoch for each transaction we look at to
    753+     * determine if that transaction has not yet been visited during the current
    754+     * traversal's epoch.
    755+     *     Algorithms using std::set can be replaced on a one by one basis.
    756+     * Both techniques are not fundamentally incomaptible across the codebase.
    


    hebasto commented at 8:43 pm on January 28, 2020:
    typo: incomaptible –> incompatible

    JeremyRubin commented at 9:18 pm on January 28, 2020:
    good catch
  25. in src/txmempool.h:770 in bd5a026928
    765+        EpochGuard(const CTxMemPool& in);
    766+        ~EpochGuard();
    767+    };
    768+    // N.B. GetFreshEpoch modifies mutable state via the EpochGuard construction
    769+    // (and later destruction)
    770+    EpochGuard GetFreshEpoch() const EXCLUSIVE_LOCKS_REQUIRED(cs);
    


    hebasto commented at 8:45 pm on January 28, 2020:
    nit: since epoch is monotonously increasing, might GetNextEpoch() be a more appropriate name?

    JeremyRubin commented at 9:24 pm on January 28, 2020:

    I think Fresh better encapsulates that it’s an epoch that is unused by any element. Fresh, New, Unused are better to me than Next. All else equal, Fresh is already written.

    I don’t care about the names that much though, other than it causes negligible more rebase work for me – @ajtowns will refactor some of this in a follow up with a new interface, I think if we want to bikeshed names more we can update it then.

  26. in src/txmempool.h:782 in bd5a026928
    777+     * triggered.
    778+     *
    779+     */
    780+    bool visited(txiter it) const EXCLUSIVE_LOCKS_REQUIRED(cs) {
    781+        assert(m_has_epoch_guard);
    782+        bool ret = it->m_epoch >= m_epoch;
    


    hebasto commented at 8:52 pm on January 28, 2020:
    nit: mind adding a comment about “recursive locking” you mention in https://github.com/bitcoin/bitcoin/pull/17925/files#r370419851?

    JeremyRubin commented at 9:37 pm on January 28, 2020:
    Wouldn’t mind, but don’t want to confuse anyone with a currently unsupported use case.
  27. hebasto commented at 8:59 pm on January 28, 2020: member
    ACK bd5a02692853f7240a4fdc593d7d0123d7916e45, modulo some nits and a typo.
  28. hebasto commented at 8:14 am on January 29, 2020: member

    @JeremyRubin

    @sdaftuar and I spent like at least an hour trying to come up with a better semantic name than the predecessor to visited, and decided that visited was sufficiently semantic.

    Good to know it. Let @ajtowns and me join your discussion ;)

    I don’t care about the names that much though, other than it causes negligible more rebase work for me – @ajtowns will refactor some of this in a follow up with a new interface, I think if we want to bikeshed names more we can update it then.

    Good names cause less cognitive work for other contributors, no? @ajtowns

    needs_processing(it2) or similar could work, but visited seems pretty fine to me. (I would prefer it if it more clearly indicated that it’s not idempotent/const; ie bool a = visited(it); bool b = visited(it); could result in !a && b, so maybe first_visit() or similar would be better? But still, it’s already fine)

    May I suggest first_seen() (or FirstSeen() to be in line with our naming convention) with first_seen() == !visited() ?

  29. JeremyRubin commented at 9:06 am on January 29, 2020: contributor

    Apologies, that was an offline discussion!

    first_seen or FirstSeen is no good for this use case. If you negate visited then you impose that callers of visited has to check the negation of the call or to nest the code a level deeper inside an if, which adds more cognitive overhead IMO. Most of the code that I’ve written uses an if (visited(it)) continue; guard to check the next element, this code becomes less readable with if (!first_seen(x)) continue; or if (first_seen(it)) { x }`

    I recommend reading this issue on style nits and whatnot: #15465; I don’t think it’s unreasonable to try to figure out better tweaks to names or whatnot, but I think as long as the code is clear & we can convince ourselves it’s correct in review, that’s more or less sufficient for moving on.

  30. adamjonas commented at 8:33 pm on January 29, 2020: member

    Just to summarize for those looking to review - as of bd5a026 there are 3 ACKs (@sdaftuar, @ariard, and @hebasto) and one “looks good” from @ajtowns with no NACKs or any show-stopping concerns raised.

    • In review, there have been some style nits and naming convention disagreements along with a suggested follow on PR from @ajtowns which uses clang’s thread safety annotations and encapsulates the data more strongly to reduce chances of bugs from API misuse.

    • @sdaftuar has verified a speed-up for the case this PR was designed for.

  31. ajtowns commented at 2:12 am on January 30, 2020: member
    ACK bd5a02692853f7240a4fdc593d7d0123d7916e45 (code review)
  32. laanwj added this to the "Blockers" column in a project

  33. laanwj referenced this in commit b2df21b32c on Feb 3, 2020
  34. laanwj merged this on Feb 3, 2020
  35. laanwj closed this on Feb 3, 2020

  36. laanwj removed this from the "Blockers" column in a project

  37. in src/txmempool.h:748 in bd5a026928
    743+    /** EpochGuard: RAII-style guard for using epoch-based graph traversal algorithms.
    744+     *     When walking ancestors or descendants, we generally want to avoid
    745+     * visiting the same transactions twice. Some traversal algorithms use
    746+     * std::set (or setEntries) to deduplicate the transaction we visit.
    747+     * However, use of std::set is algorithmically undesirable because it both
    748+     * adds an asymptotic factor of O(log n) to traverals cost and triggers O(n)
    


    jonatack commented at 11:30 am on February 3, 2020:
    typo for follow-ups: s/traverals/traversals/
  38. jonatack commented at 11:32 am on February 3, 2020: member
    Code review ACK bd5a02692853f7240a4fdc593d7d0123d7916e45
  39. sidhujag referenced this in commit 2e207dabf6 on Feb 3, 2020
  40. MarcoFalke referenced this in commit 978c5a2122 on Apr 29, 2020
  41. sidhujag referenced this in commit b991503208 on Apr 29, 2020
  42. sidhujag referenced this in commit e2835e9b8c on Nov 10, 2020
  43. Fabcien referenced this in commit da023dfd11 on Dec 22, 2020
  44. laanwj referenced this in commit b59f2787e5 on Feb 24, 2021
  45. kittywhiskers referenced this in commit 20858dcb8a on Jun 16, 2021
  46. kittywhiskers referenced this in commit 8d8bc9bf20 on Jun 16, 2021
  47. kittywhiskers referenced this in commit f3e5ca703c on Jun 16, 2021
  48. PastaPastaPasta referenced this in commit 14c631691f on Jun 21, 2021
  49. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me