refactor: Drop g_orphan_list global #19374

pull hebasto wants to merge 2 commits into bitcoin:master from hebasto:200624-orphan changing 2 files +31 −56
  1. hebasto commented at 5:07 pm on June 24, 2020: member
    This PR partially reverts changes from 7257353b93e9f45e67071b0b86a0313e7a70aaaa, but does not change the uniformly eviction behavior introduced in #14626.
  2. hebasto commented at 5:08 pm on June 24, 2020: member
    cc @sipa
  3. DrahtBot added the label P2P on Jun 24, 2020
  4. DrahtBot added the label Refactoring on Jun 24, 2020
  5. in src/net_processing.cpp:1000 in 1263a64792 outdated
     999+    while (mapOrphanTransactions.size() > nMaxOrphans) {
    1000         // Evict a random orphan:
    1001-        size_t randompos = rng.randrange(g_orphan_list.size());
    1002-        EraseOrphanTx(g_orphan_list[randompos]->first);
    1003+        auto iter = mapOrphanTransactions.begin();
    1004+        std::advance(iter, rng.randrange(mapOrphanTransactions.size()));
    


    sipa commented at 6:14 pm on June 24, 2020:
    That’s O(n) in the number of orphan transactions. I don’t think that’s acceptable.

    hebasto commented at 6:34 pm on June 24, 2020:
    Even with given DEFAULT_MAX_ORPHAN_TRANSACTIONS = 100 ?

    hebasto commented at 8:24 pm on June 24, 2020:

    Benchmarked on the custom branch:

    • master
    0$ ./src/bench/bench_bitcoin -filter="OrphanTxPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2OrphanTxPool, 5, 10000, 2.42326, 4.83155e-05, 4.87947e-05, 4.83993e-0
    
    • this PR
    0$ ./src/bench/bench_bitcoin -filter="OrphanTxPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2OrphanTxPool, 5, 10000, 2.95478, 5.87594e-05, 5.95669e-05, 5.9013e-05
    

    sipa commented at 10:15 pm on June 24, 2020:
    Thanks for providing actual numbers. I agree it’s not so much of a concern in light of those - though I’m also not convinced it’s worth the code simplification, especially if we’d ever want/need to increase the maximum number of orphans.

    hebasto commented at 1:20 pm on June 25, 2020:
  6. DrahtBot commented at 11:50 pm on June 24, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19364 (net processing: Move orphan reprocessing to a global by jnewbery)
    • #18044 (Use wtxid for transaction relay by sdaftuar)
    • #16878 (Fix non-deterministic coverage of test DoS_mapOrphans by davereikher)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. fanquake requested review from ajtowns on Jun 25, 2020
  8. fanquake requested review from amitiuttarwar on Jun 25, 2020
  9. fanquake requested review from naumenkogs on Jun 25, 2020
  10. naumenkogs commented at 8:23 am on June 25, 2020: member

    So, am I correct that we have this redundant g_orphan_list, and we can use the map instead? But then the issue is that taking a random element from a map is expensive (asymptotically)?

    If that’s the case, I don’t have strong opinion here. The code seems correct, but the trade-offs I don’t know.

  11. hebasto force-pushed on Jun 25, 2020
  12. hebasto commented at 1:19 pm on June 25, 2020: member

    Updated 1263a647921a32d8675fc4f18d9c7a4bef12c6c9 -> 4286e3907d79199bc44e4cf6f1c45635339614e2 (pr19374.01 -> pr19374.02, diff):

    If mapOrphanTransactions was say an unordered_map, and was using salted hash for its internal hashing (to convert txids to bucket positions) like is used in some places, then this would perhaps not be an issue.


    Benchmarking results (using #19377):

    • master
    0$ ./src/bench/bench_bitcoin -filter="OrphanTxPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2OrphanTxPool, 5, 10000, 2.42326, 4.83155e-05, 4.87947e-05, 4.83993e-0
    
    • this PR partially – only std::map replaced with std::unordered_map (4c2f2b0e6df359208031f059afd3ef395367ec7f)
    0$ ./src/bench/bench_bitcoin -filter="OrphanTxPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2OrphanTxPool, 5, 10000, 2.15019, 4.28047e-05, 4.33479e-05, 4.29747e-05
    
    • this PR (additionally dropped g_orphan_list)
    0$ ./src/bench/bench_bitcoin -filter="OrphanTxPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2OrphanTxPool, 5, 10000, 1.87376, 3.73627e-05, 3.78177e-05, 3.7388e-05
    
  13. in src/util/uint256hash.h:13 in 4286e3907d outdated
     8+#include <uint256.h>
     9+
    10+#include <cstddef>
    11+#include <cstdint>
    12+
    13+class uint256Hash
    


    vasild commented at 3:21 pm on June 25, 2020:
    nit: Uint256Hash to abide to the naming convention.

    jnewbery commented at 3:36 pm on June 25, 2020:
    We already have one of these. It’s called SaltedTxidHasher in txmempool.cpp. Feel free to take this commit that moves it to validation: https://github.com/jnewbery/bitcoin/commit/0ccd20536350ce0cbe9e453d50873a5bb6ca4d03 (not sure if there’s a better home for it).

    hebasto commented at 4:31 pm on June 25, 2020:
    @jnewbery Thanks!

    MarcoFalke commented at 4:47 pm on June 25, 2020:
    Probably best to move it to a new module to avoid re-introducing the net_processing<->validation circular dependency?

    jnewbery commented at 5:55 pm on June 25, 2020:
    Net processing already includes validation.h

    hebasto commented at 12:59 pm on June 27, 2020:
    Switched to SaltedTxidHasher as suggested by @jnewbery.

    hebasto commented at 1:00 pm on June 27, 2020:
  14. in src/net_processing.cpp:235 in 4286e3907d outdated
    231@@ -230,9 +232,7 @@ namespace {
    232             return &(*a) < &(*b);
    233         }
    234     };
    235-    std::map<COutPoint, std::set<std::map<uint256, COrphanTx>::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(g_cs_orphans);
    236-
    237-    std::vector<std::map<uint256, COrphanTx>::iterator> g_orphan_list GUARDED_BY(g_cs_orphans); //! For random eviction
    238+    std::map<COutPoint, std::set<OrphanTxPool::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(g_cs_orphans);
    


    vasild commented at 3:29 pm on June 25, 2020:
    IteratorComparator compares the addresses of the elements in the container! With std::unordered_map, I guess, if rehashing occurs, those could change in an unpredictable way, confusing the std::set container. Maybe at some point the std::set could end up having element A before element B (two iterators to the std::unordered_map) but they now compare A > B (after a rehash).

    vasild commented at 11:14 am on June 26, 2020:

    This can be overcome by changing the std::set to std::unordered_set, removing the bizarre “order by address” IteratorComparator and hashing by the tx id. I checked that nowhere in the code we rely on the order of the elements in the std::set (as expected, because they are ordered by their address, which is “random” or unpredictable to the application). Something like this:

     0diff --git i/src/net_processing.cpp w/src/net_processing.cpp
     1index d72edae6d..a02945d7e 100644
     2--- i/src/net_processing.cpp
     3+++ w/src/net_processing.cpp
     4@@ -221,21 +221,25 @@ namespace {
     5     /** Relay map */
     6     typedef std::map<uint256, CTransactionRef> MapRelay;
     7     MapRelay mapRelay GUARDED_BY(cs_main);
     8     /** Expiration-time ordered list of (expire time, relay map entry) pairs. */
     9     std::deque<std::pair<int64_t, MapRelay::iterator>> vRelayExpiration GUARDED_BY(cs_main);
    10 
    11-    struct IteratorComparator
    12+    class IteratorHash
    13     {
    14-        template<typename I>
    15-        bool operator()(const I& a, const I& b) const
    16+    public:
    17+        size_t operator()(const OrphanTxPool::iterator& it) const
    18         {
    19-            return &(*a) < &(*b);
    20+            return m_hasher(it->first);
    21         }
    22+
    23+    private:
    24+        const uint256Hash m_hasher;
    25     };
    26-    std::map<COutPoint, std::set<OrphanTxPool::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(g_cs_orphans);
    27+
    28+    std::map<COutPoint, std::unordered_set<OrphanTxPool::iterator, IteratorHash>> mapOrphanTransactionsByPrev GUARDED_BY(g_cs_orphans);
    29 
    30     static size_t vExtraTxnForCompactIt GUARDED_BY(g_cs_orphans) = 0;
    31     static std::vector<std::pair<uint256, CTransactionRef>> vExtraTxnForCompact GUARDED_BY(g_cs_orphans);
    32 } // namespace
    33 
    34 namespace {
    

    vasild commented at 12:05 pm on June 26, 2020:

    Actually, there is a bigger problem which the above patch would not solve - all iterators to mapOrphanTransactions might be invalidated by an insertion to it:

    https://github.com/bitcoin/bitcoin/blob/4286e3907d79199bc44e4cf6f1c45635339614e2/src/net_processing.cpp#L924

    I.e. the above line might invalidate all iterators to mapOrphanTransactions that are stored in mapOrphanTransactionsByPrev.


    hebasto commented at 12:59 pm on June 27, 2020:
  15. in src/net_processing.cpp:956 in 4286e3907d outdated
    963     {
    964-        std::map<uint256, COrphanTx>::iterator maybeErase = iter++; // increment to avoid iterator becoming invalid
    965+        const auto maybeErase = iter++;
    966         if (maybeErase->second.fromPeer == peer)
    967         {
    968             nErased += EraseOrphanTx(maybeErase->second.tx->GetHash());
    


    vasild commented at 3:40 pm on June 25, 2020:

    There is a subtle difference between deleting elements from std::map and std::unordered_map while iterating on the container - with std::unordered_map the order of the elements may change. This is fixed in C++14: https://en.cppreference.com/w/cpp/container/unordered_map/erase

    The order of the elements that are not erased is preserved. (This makes it possible to erase individual elements while iterating through the container.) (since C++14)

    For example, in C++11 the following may happen:

    • the std::unordered_map contains A, B, C, D
    • we have an iterator pointing to B, we copy it, advance it to C and erase B via the saved copy
    • now the advanced iterator is not invalidated and still points to C
    • however the order of the elements in the container is now D, C, A, so advancing the iterator until end() will work, but overall we would have visited A twice and not visited D.

    A possible workaround is to pile the iterators in a std::vector and only delete them after finishing of the iteration over the std::unordered_map.


    hebasto commented at 1:09 pm on June 27, 2020:

    @vasild and me have discussed this concern on IRC. I think as “Other iterators … are not invalidated.” it does not matter whether the internal element order is changed while iterating over std::unordered_map.

    Maybe our C++ connoisseurs could add their opinions here?


    ajtowns commented at 7:09 am on June 29, 2020:

    If you have an valid iterator without order being preserved, I think you could have:

    1. implicit ordering = A, B, C, D, E ; iterator X points at A
    2. X++; X points at B
    3. X++; X points at C
    4. delete B
    5. new implicit ordering = E D C A ; iterator X still points at C
    6. X++; now points at A, which was already seen
    7. X++; now points at end() and we never saw E

    https://stackoverflow.com/questions/38468844/erasing-elements-from-unordered-map-in-a-loop/38469800 suggests that all C++11 compilers probably already did the C++14 behaviour (I imagine it’s hard to do efficient iterators that remain valid on erasure when you’re also reordering elements) so probably hard to find a counterexample, though.

    (edit: fix example, geez)


    hebasto commented at 8:28 am on June 29, 2020:
  16. refactor: Replace iterators w/ references in mapOrphanTransactionsByPrev
    This change is required to replace std::map with std::unordered_map for
    mapOrphanTransactions as iterators could be invalidated on rehashing.
    072066eb03
  17. hebasto force-pushed on Jun 27, 2020
  18. hebasto commented at 12:57 pm on June 27, 2020: member

    Updated 4286e3907d79199bc44e4cf6f1c45635339614e2 -> 538a57f2fbf573d7a2ab3e7fb1a05650fd64d29b (pr19374.02 -> pr19374.03, diff):

    Actually, there is a bigger problem which the above patch would not solve - all iterators to mapOrphanTransactions might be invalidated by an insertion to it

    We already have one of these. It’s called SaltedTxidHasher in txmempool.cpp.

  19. refactor: Make unordered_map type for mapOrphanTransactions
    Dropped g_orphan_list global.
    To get the random element from an std::unordered_map the fact is used
    that internal container hashes are uniformly distributed.
    d02fcd0d45
  20. hebasto force-pushed on Jun 29, 2020
  21. hebasto commented at 8:28 am on June 29, 2020: member

    Updated 538a57f2fbf573d7a2ab3e7fb1a05650fd64d29b -> d02fcd0d45f428f73bd1de8a7a411fd34216d2fc (pr19374.03 -> pr19374.04, diff):

    • implemented uncontroversial and robust erase from std::unordered_map
  22. in src/net_processing.cpp:918 in d02fcd0d45
    916+    auto ret = mapOrphanTransactions.emplace(hash, COrphanTx{tx, peer, GetTime() + ORPHAN_TX_EXPIRE_TIME});
    917     assert(ret.second);
    918-    g_orphan_list.push_back(ret.first);
    919     for (const CTxIn& txin : tx->vin) {
    920-        mapOrphanTransactionsByPrev[txin.prevout].insert(ret.first);
    921+        mapOrphanTransactionsByPrev[txin.prevout].insert(hash);
    


    jnewbery commented at 7:22 pm on June 29, 2020:
    It feels a little scary that we’re adding references to a uint256 where the lifetime of that object is only as long as the transaction exists in the mapOrphanTransactions. The logic looks fine (and is the same as the existing logic where we’re storing iterators which are only valid as long as the tx exists in mapOrphanTransactions), but it feels slightly dangerous.

    vasild commented at 10:41 am on June 30, 2020:

    the lifetime of that object is only as long as the transaction exists in the mapOrphanTransactions

    I think that is incorrect. The hash reference that we insert into mapOrphanTransactionsByPrev is created here:

    https://github.com/bitcoin/bitcoin/blob/d02fcd0d45f428f73bd1de8a7a411fd34216d2fc/src/net_processing.cpp#L897

    and the GetHash() method returns a reference to its CTransaction::hash member:

    https://github.com/bitcoin/bitcoin/blob/d02fcd0d45f428f73bd1de8a7a411fd34216d2fc/src/primitives/transaction.h#L303

    So, the reference is valid as long as the CTransaction object is valid (pointed to by the first argument of AddOrphanTx()).

    Verifying whether the CTransaction object outlives the reference saved in mapOrphanTransactionsByPrev is a bit obscure.

    Saving uint256 instead of uint256& looks more robust, also considering possible future changes. It would require 32 bytes instead of 8 (+24 bytes per transaction).


    jnewbery commented at 4:38 pm on June 30, 2020:

    the reference is valid as long as the CTransaction object is valid

    Right, and I believe that for an orphan transaction, the only shared_ptr to the CTransaction is held in mapOrphanTransactions. As soon as it gets removed from that map, then the CTransaction will be destructed and the uint256& will be dangling.

    Saving uint256 instead of uint256& looks more robust, also considering possible future changes. It would require 32 bytes instead of 8 (+24 bytes per transaction).

    This would require an additional 24 bytes per transaction input, not per transaction.


    hebasto commented at 4:18 pm on July 2, 2020:

    As soon as it gets removed from that map, then the CTransaction will be destructed and the uint256& will be dangling.

    In EraseOrphanTx() a transaction is erased from mapOrphanTransactions only after its inputs have been erased from mapOrphanTransactionsByPrev().

    So, no dangling uint256& are expected.

  23. in src/net_processing.cpp:951 in d02fcd0d45
    967-        std::map<uint256, COrphanTx>::iterator maybeErase = iter++; // increment to avoid iterator becoming invalid
    968-        if (maybeErase->second.fromPeer == peer)
    969-        {
    970-            nErased += EraseOrphanTx(maybeErase->second.tx->GetHash());
    971+
    972+    std::vector<OrphanTxPool::iterator> erase_candidates;
    


    jnewbery commented at 7:46 pm on June 29, 2020:
    why not make this a vector of uint256s?

    hebasto commented at 4:49 pm on July 2, 2020:

    why not make this a vector of uint256s?

    std::vector<OrphanTxPool::iterator> is more effective (both time and memory), I think.

  24. in src/net_processing.cpp:145 in d02fcd0d45
    139@@ -139,10 +140,10 @@ struct COrphanTx {
    140     CTransactionRef tx;
    141     NodeId fromPeer;
    142     int64_t nTimeExpire;
    143-    size_t list_pos;
    144 };
    145 RecursiveMutex g_cs_orphans;
    146-std::map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(g_cs_orphans);
    147+using OrphanTxPool = std::unordered_map<uint256, COrphanTx, SaltedTxidHasher>;
    


    jnewbery commented at 7:55 pm on June 29, 2020:
    Since SaltedTxidHasher is no longer just used by the mempool, it should be moved out of txmempool.cpp. Maybe validation.cpp or primitives/transaction.cpp?

    vasild commented at 10:44 am on June 30, 2020:
    … and maybe renamed because it is hashing uint256 which is broader than Txid?

    hebasto commented at 4:09 pm on June 30, 2020:
    Something like 2ba96b0438fe734b788e56ba881303c84a3f249d from (#16910) ?

    jnewbery commented at 4:39 pm on June 30, 2020:
    Indeed. Just like that!
  25. in src/net_processing.cpp:995 in d02fcd0d45
    993     while (mapOrphanTransactions.size() > nMaxOrphans)
    994     {
    995         // Evict a random orphan:
    996-        size_t randompos = rng.randrange(g_orphan_list.size());
    997-        EraseOrphanTx(g_orphan_list[randompos]->first);
    998+        EraseOrphanTx(mapOrphanTransactions.begin()->first);
    


    jnewbery commented at 8:01 pm on June 29, 2020:
    I think this will exhibit different qualities from the current eviction logic. Currently, each eviction is an independent event, where every element has a 1/101 chance of being evicted. The new logic has dependent events. If a tx hashes to a bucket towards the end of the unordered_map, then it’s unlikely to be evicted in any of these events, or put another way, if it doesn’t get evicted in one round, the probability of it being evicted in a subsequent round is <1/101. I’m not sure if that’s a problem.
  26. jnewbery commented at 8:08 pm on June 29, 2020: member

    I don’t think the new random eviction logic leads to uniform independent events. See comment #19374 (review).

    An alternative way to do this, which I think would be closer to uniform, independent events would be to go back to using an ordered map, but key by salted-txid. When the 101st transaction needs to be added to the map, then evict the tx with upper_bound of the salted-txid of the tx-to-be-inserted. We only ever evict an orphan transaction immediately after adding the 101st, and it should be fine to reverse that order (evict-then-insert rather than insert-then-evict).

  27. vasild commented at 7:46 am on July 2, 2020: member

    @jnewbery that is an interesting observation!

    Background

    Quickly I thought that in the vector (before this PR) we insert at the end and remove a random element, whereas with the hash table (unordered_map, this PR) we insert at a random position and remove from the front, so it shouldn’t make a difference. However, there is a catch!

    With the hash table we insert at a random position at the hash table array, which is not the same as “at a random position within the existent elements” (thanks for the elaboration over IM!). If the existent elements are grouped near the end of the hash table array, then inserting at a random position at that hash table array will likely make the new element the leftmost one (and so it will be evicted soon). And this is exactly what happens when we keep removing the leftmost elements - grouping near the end and very often removing the element we just added.

    Stats

    I gathered some statistics to verify the above - I did 100000x (AddOrphanTx() + LimitOrphanTxSize()) and at each iteration I recorded:

    • the age of the transaction that is being evicted (a transaction enters with age 0 and its age is incremented with 1 on each of the 100000 iterations)
    • the age of the oldest transaction in the container

    master @ 205b87d2f: master

    pr @ d02fcd0d: pr

    pr @ d02fcd0d + improvement suggested by @jnewbery: pr_improved

    The improvement consists of using this container:

     0class Cmp
     1{
     2    public:
     3        bool operator()(const uint256& lhs, const uint256& rhs) const {
     4            return m_hasher(lhs) < m_hasher(rhs);
     5        }
     6
     7    private:
     8        const SaltedTxidHasher m_hasher;
     9};
    10using OrphanTxPool = std::map<uint256, COrphanTx, Cmp>;
    

    swapping the order of the calls to AddOrphanTx() and LimitOrphanTxSize(), passing the newcomer hash to LimitOrphanTxSize() and evicting mapOrphanTransactions.upper_bound(newcomer_hash) (or mapOrphanTransactions.begin() if upper_bound() returns end()).

    Analysis

    • In the case of “pr @ d02fcd0d” in 99494 of 100000 cases we evict the transaction that has just been added.
    • In “master @ 205b87d2f” and in “pr @ d02fcd0d + improvement” we most often evict the transaction that has just been added - about 1000 times of the 100000 iterations. Why? Both graphs look strikingly similar. Why?

    Verdict

    Let’s go with @jnewbery’s improvement suggestion!

  28. vasild commented at 8:46 am on July 2, 2020: member

    we most often evict the transaction that has just been added … Why?

    Here is some explanation - the chance of a transaction surviving

    • one eviction is 99/100 = 0.99
    • two evictions is 99/100 * 99/100 = (99/100)^2 = 0.98
    • 50 evictions is (99/100)^50 = 0.6
    • 400 evictions is (99/100)^400 = 0.018
  29. jnewbery commented at 2:45 pm on July 2, 2020: member

    Thanks for the very detailed investigation @vasild !

    we most often evict the transaction that has just been added … Why?

    Here is some explanation - the chance of a transaction surviving

    Right. In current master, eviction is a negative binomial distribution with r=1 and p=1/101. Each eviction is an independent Bernoulli trial, and as you point out, the chance of survival decreases geometrically over time. If you go to https://homepage.divms.uiowa.edu/~mbognar/applets/nb1.html and plug in r=1 and p=0.01, you’ll see the expected graph.

    Evictions in the current branch of this PR are not independent events. As you insert random elements into the set and remove from the left, the set skews more and more to the right, so in future insert-and-remove-one events, the just-inserted element is more and more likely to be the leftmost.

    Eviction in my suggested change are also not perfectly independent events, but I expect they’re ‘good enough’. Adding an element to the set and removing the next along should keep the set randomly distributed across the entire range, so every element has ~equal chance of being removed in the next insert-and-remove-one event.

  30. ajtowns commented at 1:55 am on July 3, 2020: member
    What’s the motivation for this change? It seems like an awful lot of work to save what seems to be a single pointer per orphan transaction…
  31. hebasto commented at 5:11 am on July 3, 2020: member

    What’s the motivation for this change? It seems like an awful lot of work to save what seems to be a single pointer per orphan transaction…

    To achieve uniformity of transaction selection for its eviction from the orphan pool #14626 added the g_orphan_list global with supported code. The motivation for this PR is to achieve the same uniformity without the added in #14626 stuff. The initial approach was quite simple and straightforward but not effective.

    The recent changes are really unjustified. So closing.

  32. hebasto closed this on Jul 3, 2020

  33. vasild commented at 9:16 am on July 3, 2020: member

    Just out of curiosity, here is the eviction distribution without #14626

    master_revert14626

      0commit 944e7a8a0 (HEAD -> master)
      1Parent: 5d1e3b803
      2Author:     Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
      3AuthorDate: Fri Jun 26 18:09:27 2020 +0300
      4Commit:     Vasil Dimov <vd@FreeBSD.org>
      5CommitDate: Fri Jul 3 11:02:51 2020 +0200
      6gpg: Signature made Fri Jul  3 11:02:58 2020 CEST
      7gpg:                using RSA key E64D8D45614DB07545D9CCC154DF06F64B55CBBF
      8gpg: Good signature from "Vasil Dimov <vd@myforest.net>" [ultimate]
      9gpg:                 aka "Vasil Dimov <vd@FreeBSD.org>" [ultimate]
     10gpg:                 aka "Vasil Dimov <vasild@gmail.com>" [ultimate]
     11
     12
     13    collect stats + test
     14
     15diff --git a/src/net_processing.cpp b/src/net_processing.cpp
     16index a9f5ce429..29a53be2e 100644
     17--- a/src/net_processing.cpp
     18+++ b/src/net_processing.cpp
     19@@ -141,12 +141,20 @@ struct COrphanTx {
     20     int64_t nTimeExpire;
     21     size_t list_pos;
     22 };
     23 RecursiveMutex g_cs_orphans;
     24 std::map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(g_cs_orphans);
     25 
     26+// keys: transaction age at the time of its eviction from mapOrphanTransactions
     27+// values: number of transactions evicted that were that old
     28+std::map<uint32_t, uint32_t> g_eviction_ages;
     29+
     30+// keys: the age of the oldest transaction in mapOrphanTransactions when some tx is evicted
     31+// values: number of times this age was the maximum age after some tx is evicted
     32+std::map<uint32_t, uint32_t> g_max_ages;
     33+
     34 void EraseOrphansFor(NodeId peer);
     35 
     36 /** Increase a node's misbehavior score. */
     37 void Misbehaving(NodeId nodeid, int howmuch, const std::string& message="") EXCLUSIVE_LOCKS_REQUIRED(cs_main);
     38 
     39 // Internal stuff
     40@@ -932,12 +940,43 @@ bool AddOrphanTx(const CTransactionRef& tx, NodeId peer) EXCLUSIVE_LOCKS_REQUIRE
     41 
     42     LogPrint(BCLog::MEMPOOL, "stored orphan tx %s (mapsz %u outsz %u)\n", hash.ToString(),
     43              mapOrphanTransactions.size(), mapOrphanTransactionsByPrev.size());
     44     return true;
     45 }
     46 
     47+void PrintOrphanStats() EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans)
     48+{
     49+    std::cout << "eviction_ages = np.array([";
     50+    for (const auto& e : g_eviction_ages) {
     51+        const auto& age = e.first;
     52+        std::cout << age << ",";
     53+    }
     54+    std::cout << "])" << std::endl;
     55+
     56+    std::cout << "eviction_counts = [";
     57+    for (const auto& e : g_eviction_ages) {
     58+        const auto& count = e.second;
     59+        std::cout << count << ",";
     60+    }
     61+    std::cout << "]" << std::endl;
     62+
     63+    std::cout << "max_ages = np.array([";
     64+    for (const auto& e : g_max_ages) {
     65+        const auto& age = e.first;
     66+        std::cout << age << ",";
     67+    }
     68+    std::cout << "])" << std::endl;
     69+
     70+    std::cout << "max_counts = [";
     71+    for (const auto& e : g_max_ages) {
     72+        const auto& count = e.second;
     73+        std::cout << count << ",";
     74+    }
     75+    std::cout << "]" << std::endl;
     76+}
     77+
     78 int static EraseOrphanTx(uint256 hash) EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans)
     79 {
     80     std::map<uint256, COrphanTx>::iterator it = mapOrphanTransactions.find(hash);
     81     if (it == mapOrphanTransactions.end())
     82         return 0;
     83     for (const CTxIn& txin : it->second.tx->vin)
     84@@ -979,13 +1018,13 @@ void EraseOrphansFor(NodeId peer)
     85         }
     86     }
     87     if (nErased > 0) LogPrint(BCLog::MEMPOOL, "Erased %d orphan tx from peer=%d\n", nErased, peer);
     88 }
     89 
     90 
     91-unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans)
     92+unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans, uint32_t i)
     93 {
     94     LOCK(g_cs_orphans);
     95 
     96     unsigned int nEvicted = 0;
     97     static int64_t nNextSweep;
     98     int64_t nNow = GetTime();
     99@@ -1012,12 +1051,24 @@ unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans)
    100     {
    101         // Evict a random orphan:
    102         uint256 randomhash = rng.rand256();
    103         std::map<uint256, COrphanTx>::iterator it = mapOrphanTransactions.lower_bound(randomhash);
    104         if (it == mapOrphanTransactions.end())
    105             it = mapOrphanTransactions.begin();
    106+
    107+        const uint32_t age_of_evicted = i - it->second.tx->nLockTime;
    108+        ++g_eviction_ages[age_of_evicted];
    109+        uint32_t oldest_tx_age = 0;
    110+        for (const auto& t : mapOrphanTransactions) {
    111+            const uint32_t current_tx_age = i - t.second.tx->nLockTime;
    112+            if (current_tx_age > oldest_tx_age) {
    113+                oldest_tx_age = current_tx_age;
    114+            }
    115+        }
    116+        ++g_max_ages[oldest_tx_age];
    117+
    118         EraseOrphanTx(it->first);
    119         ++nEvicted;
    120     }
    121     return nEvicted;
    122 }
    123 
    124@@ -2869,13 +2920,13 @@ void ProcessMessage(
    125                     if (!AlreadyHave(_inv, mempool)) RequestTx(State(pfrom.GetId()), _inv.hash, current_time);
    126                 }
    127                 AddOrphanTx(ptx, pfrom.GetId());
    128 
    129                 // DoS prevention: do not allow mapOrphanTransactions to grow unbounded (see CVE-2012-3789)
    130                 unsigned int nMaxOrphanTx = (unsigned int)std::max((int64_t)0, gArgs.GetArg("-maxorphantx", DEFAULT_MAX_ORPHAN_TRANSACTIONS));
    131-                unsigned int nEvicted = LimitOrphanTxSize(nMaxOrphanTx);
    132+                unsigned int nEvicted = LimitOrphanTxSize(nMaxOrphanTx, 0);
    133                 if (nEvicted > 0) {
    134                     LogPrint(BCLog::MEMPOOL, "mapOrphan overflow, removed %u tx\n", nEvicted);
    135                 }
    136             } else {
    137                 LogPrint(BCLog::MEMPOOL, "not keeping orphan with rejected parents %s\n",tx.GetHash().ToString());
    138                 // We will continue to reject this tx since it has rejected
    139diff --git a/src/test/denialofservice_tests.cpp b/src/test/denialofservice_tests.cpp
    140index 348b17053..75b59871c 100644
    141--- a/src/test/denialofservice_tests.cpp
    142+++ b/src/test/denialofservice_tests.cpp
    143@@ -40,14 +40,15 @@ struct CConnmanTest : public CConnman {
    144         vNodes.clear();
    145     }
    146 };
    147 
    148 // Tests these internal-to-net_processing.cpp methods:
    149 extern bool AddOrphanTx(const CTransactionRef& tx, NodeId peer);
    150+extern void PrintOrphanStats();
    151 extern void EraseOrphansFor(NodeId peer);
    152-extern unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans);
    153+extern unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans, uint32_t i);
    154 extern void Misbehaving(NodeId nodeid, int howmuch, const std::string& message="");
    155 
    156 struct COrphanTx {
    157     CTransactionRef tx;
    158     NodeId fromPeer;
    159     int64_t nTimeExpire;
    160@@ -436,15 +437,40 @@ BOOST_AUTO_TEST_CASE(DoS_mapOrphans)
    161         size_t sizeBefore = mapOrphanTransactions.size();
    162         EraseOrphansFor(i);
    163         BOOST_CHECK(mapOrphanTransactions.size() < sizeBefore);
    164     }
    165 
    166     // Test LimitOrphanTxSize() function:
    167-    LimitOrphanTxSize(40);
    168+    LimitOrphanTxSize(40, 0);
    169     BOOST_CHECK(mapOrphanTransactions.size() <= 40);
    170-    LimitOrphanTxSize(10);
    171+    LimitOrphanTxSize(10, 0);
    172     BOOST_CHECK(mapOrphanTransactions.size() <= 10);
    173-    LimitOrphanTxSize(0);
    174+    LimitOrphanTxSize(0, 0);
    175     BOOST_CHECK(mapOrphanTransactions.empty());
    176 }
    177 
    178+BOOST_AUTO_TEST_CASE(lifetime)
    179+{
    180+    CKey key;
    181+    key.MakeNewKey(true);
    182+    FillableSigningProvider keystore;
    183+    BOOST_CHECK(keystore.AddKey(key));
    184+
    185+    for (uint32_t i = 0; i < 100000; i++) {
    186+        CMutableTransaction tx;
    187+        tx.vin.resize(1);
    188+        tx.vin[0].prevout.n = 0;
    189+        tx.vin[0].prevout.hash = InsecureRand256();
    190+        tx.vin[0].scriptSig << OP_1;
    191+        tx.vout.resize(1);
    192+        tx.vout[0].nValue = 1*CENT;
    193+        tx.vout[0].scriptPubKey = GetScriptForDestination(PKHash(key.GetPubKey()));
    194+        tx.nLockTime = i;
    195+
    196+        AddOrphanTx(MakeTransactionRef(tx), i);
    197+        LimitOrphanTxSize(100, i);
    198+    }
    199+
    200+    PrintOrphanStats();
    201+}
    202+
    203 BOOST_AUTO_TEST_SUITE_END()
    204
    205commit 5d1e3b803
    206Parent: 7d9008f43
    207Author:     Vasil Dimov <vd@FreeBSD.org>
    208AuthorDate: Fri Jul 3 10:51:39 2020 +0200
    209Commit:     Vasil Dimov <vd@FreeBSD.org>
    210CommitDate: Fri Jul 3 10:51:39 2020 +0200
    211gpg: Signature made Fri Jul  3 10:51:58 2020 CEST
    212gpg:                using RSA key E64D8D45614DB07545D9CCC154DF06F64B55CBBF
    213gpg: Good signature from "Vasil Dimov <vd@myforest.net>" [ultimate]
    214gpg:                 aka "Vasil Dimov <vd@FreeBSD.org>" [ultimate]
    215gpg:                 aka "Vasil Dimov <vasild@gmail.com>" [ultimate]
    216
    217
    218    revert part of [#14626](/bitcoin-bitcoin/14626/)
    219
    220diff --git a/src/net_processing.cpp b/src/net_processing.cpp
    221index 80e58a6db..a9f5ce429 100644
    222--- a/src/net_processing.cpp
    223+++ b/src/net_processing.cpp
    224@@ -1008,14 +1008,17 @@ unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans)
    225         if (nErased > 0) LogPrint(BCLog::MEMPOOL, "Erased %d orphan tx due to expiration\n", nErased);
    226     }
    227     FastRandomContext rng;
    228     while (mapOrphanTransactions.size() > nMaxOrphans)
    229     {
    230         // Evict a random orphan:
    231-        size_t randompos = rng.randrange(g_orphan_list.size());
    232-        EraseOrphanTx(g_orphan_list[randompos]->first);
    233+        uint256 randomhash = rng.rand256();
    234+        std::map<uint256, COrphanTx>::iterator it = mapOrphanTransactions.lower_bound(randomhash);
    235+        if (it == mapOrphanTransactions.end())
    236+            it = mapOrphanTransactions.begin();
    237+        EraseOrphanTx(it->first);
    238         ++nEvicted;
    239     }
    240     return nEvicted;
    241 }
    242 
    243 /**
    
  34. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-22 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me