Several performance and privacy improvements to inv/mempool handling #7840

pull sipa wants to merge 5 commits into bitcoin:master from sipa:splitinvtxblock changing 6 files +189 −86
  1. sipa commented at 1:39 pm on April 8, 2016: member

    As we’re increasingly changing how block invs and tx invs are treated, bite the bullet and separate them into two separate queues. This makes sense, as they are subject to very different behaviour: we want blocks to relay as fast as possible, while transactions are subject to rate limiting, deduplication, and source obscurity.

    Once this is done, we can turn the tx queue into a set, and use it to prevent the mempool command from leaking unannounced transactions. The handling of such commands is also moved to the send loop, sorted by dependency order, and made subject to trickling.
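
    A minimal sketch of the split, using the member names that appear in the diffs further down (post-rename; the helper functions are illustrative, not the PR’s actual API):

        // Per-peer inventory state after the split (sketch; see the net.h diff below).
        // Transactions go into a set: deduplicated, announcement order decided at send time.
        // Blocks keep a vector so they can be relayed immediately and in order.
        std::set<uint256> setInventoryTxToSend;
        std::vector<uint256> vInventoryBlockToSend;

        void PushTxInventory(const uint256& hash) { setInventoryTxToSend.insert(hash); }        // illustrative helper
        void PushBlockInventory(const uint256& hash) { vInventoryBlockToSend.push_back(hash); } // illustrative helper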

    Based on top of #7805.

  2. sipa force-pushed on Apr 8, 2016
  3. sipa force-pushed on Apr 8, 2016
  4. sipa force-pushed on Apr 8, 2016
  5. in src/main.cpp: in 906e5c7fdb outdated
    5809+                        if (!pto->pfilter->IsRelevantAndUpdate(tx)) continue;
    5810+                    }
    5811+                    if (pto->minFeeFilter) {
    5812+                        CFeeRate feeRate;
    5813+                        mempool.lookupFeeRate(hash, feeRate);
    5814+                        LOCK(pto->cs_feeFilter);
    


    gmaxwell commented at 6:01 pm on April 8, 2016:
    Perhaps the fee filter access should be hoisted out of this inner loop on transactions, and a local copy of the filter-limit made.

    sipa commented at 1:38 pm on April 10, 2016:
    Done.
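
    A sketch of the hoisting suggested above (names taken from the surrounding diff; exact placement in the send loop is illustrative):

        // Take one snapshot of the peer's fee filter before iterating the candidate txs,
        // instead of locking cs_feeFilter once per transaction.
        CAmount filterrate = 0;
        {
            LOCK(pto->cs_feeFilter);
            filterrate = pto->minFeeFilter;
        }
        for (const uint256& hash : vInvTx) {          // vInvTx: candidate txids (illustrative name)
            CFeeRate feeRate;
            mempool.lookupFeeRate(hash, feeRate);
            if (filterrate && feeRate.GetFeePerK() < filterrate) continue;  // below the peer's filter
            // ... bloom filter check, filterInventoryKnown.insert(hash), vInv.push_back(...) ...
        }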
  6. gmaxwell commented at 6:06 pm on April 8, 2016: contributor

    This is a very good idea– much better than my earlier ones around avoiding mempool requests breaking privacy.

    Your commit message should explain that by eliminating queued entries from the mempool response and responding only at trickle time, this makes the mempool no longer leak transaction arrival order information (as the mempool itself is also sorted)– at least no more than relay itself leaks it.

  7. in src/main.cpp: in 906e5c7fdb outdated
    5824+                    }
    5825+                }
    5826             }
    5827+
    5828+            // Determine transactions to relay
    5829             if (fSendTrickle) {
    


    gmaxwell commented at 7:06 pm on April 8, 2016:
    “&& vInv.size() < INVENTORY_BROADCAST_MAX” perhaps? otherwise when a mempool request has prefilled the INV it will waste time heapifying for no reason.

    sipa commented at 1:39 pm on April 10, 2016:
    Done.
  8. gmaxwell commented at 0:32 am on April 9, 2016: contributor
    I have a test harness now that's suitable for testing this, a little 13-node network. Seems to work; it caught that the conversion to the STL heap broke the behavior: the STL heap is a max heap.
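
    For reference, the C++ heap helpers maintain a max-heap with respect to the supplied comparator, so a comparator written for "smallest first" has to be inverted before it behaves as intended (a generic illustration, not the PR's actual comparator):

        #include <algorithm>
        #include <functional>
        #include <vector>

        int main() {
            std::vector<int> v{3, 1, 2};
            std::make_heap(v.begin(), v.end());                      // max-heap: v.front() == 3
            std::pop_heap(v.begin(), v.end());                       // the *largest* element is popped first
            std::make_heap(v.begin(), v.end(), std::greater<int>()); // inverted comparator: v.front() == 1
        }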
  9. sipa force-pushed on Apr 9, 2016
  10. gmaxwell commented at 3:15 am on April 9, 2016: contributor

    Thanks for fixing, looking good so far.

    We still get a tiny number of orphans in a network that consists of nothing but this code. Here is one reason why: node A sends you Tx1. You getdata Tx1. Node B sends you Tx1 and Tx2, where Tx2 is a child of Tx1. You getdata Tx2 from node B. Node B responds faster. Tx2 is now an orphan.

  11. sipa force-pushed on Apr 10, 2016
  12. sipa renamed this:
    Split and optimize inv queues and improve mempool privacy
    Several performance and privacy improvements to inv/mempool handling
    on Apr 10, 2016
  13. sipa commented at 1:41 pm on April 10, 2016: member
    I added a commit to sort the response of mempool requests in dependency order, and renamed the PR to something more general.
  14. dcousens commented at 6:48 am on April 11, 2016: contributor
    concept ACK
  15. laanwj added the label P2P on Apr 11, 2016
  16. laanwj added the label Privacy on Apr 11, 2016
  17. in src/main.cpp: in 0cdc0fa69b outdated
    5852-                if (vInv.size() >= 1000)
    5853-                {
    5854-                    pto->PushMessage(NetMsgType::INV, vInv);
    5855-                    vInv.clear();
    5856+            // Determine transactions to relay
    5857+            if (fSendTrickle && vInv.size() < INVENTORY_BROADCAST_MAX) {
    


    sdaftuar commented at 7:07 pm on April 11, 2016:
    INVENTORY_BROADCAST_MAX is supposed to apply to rate-limited transaction relay, while vInv at this point can be holding blocks, and/or any leftovers from the mempool command. This seems like it’s not the right comparison.
  18. in src/net.h: in 0cdc0fa69b outdated
    395@@ -396,14 +396,17 @@ class CNode
    396 
    397     // inventory based relay
    398     CRollingBloomFilter filterInventoryKnown;
    399-    std::vector<CInv> vInventoryToSend;
    400+    std::set<uint256> vInventoryTxToSend;
    401+    std::vector<uint256> vInventoryBlockToSend;
    


    sdaftuar commented at 7:27 pm on April 11, 2016:
    nit: Perhaps add a comment explaining why the vector is needed for blocks?
  19. sdaftuar commented at 7:34 pm on April 11, 2016: member
    Concept ACK, will test.
  20. in src/main.h: in 0cdc0fa69b outdated
    103- *  Blocks, whitelisted receivers, and a random 25% of transactions bypass this. */
    104-static const unsigned int AVG_INVENTORY_BROADCAST_INTERVAL = 5;
    105+/** Average delay between trickled inventory transmissions in seconds.
    106+ *  Blocks and whitelisted receivers bypass this, outbound peers get half this delay. */
    107+static const unsigned int INVENTORY_BROADCAST_INTERVAL = 5;
    108+/** Maximum number of inventor items to send per transmission.
    


    sdaftuar commented at 7:36 pm on April 11, 2016:
    “inventor” -> “inventory”
  21. sipa commented at 2:40 pm on April 12, 2016: member

    Added two commits to address the comments.

    Mempool requests are now not filtered by vInventoryTxToSend, but instead every mempool inv sent as a response is removed from vInventoryTxToSend.
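
    Roughly (a sketch; the set still carried its pre-rename name vInventoryTxToSend at this point, and vtxid is an illustrative name for the mempool contents):

        // While servicing a "mempool" request in the send loop, anything we announce
        // is also dropped from the normal trickle queue so it is not announced twice.
        for (const uint256& hash : vtxid) {
            pto->vInventoryTxToSend.erase(hash);
            pto->filterInventoryKnown.insert(hash);
            vInv.push_back(CInv(MSG_TX, hash));
            // (flush vInv with PushMessage if it reaches MAX_INV_SZ, as in the diffs below)
        }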

  22. gmaxwell commented at 9:30 pm on April 12, 2016: contributor
    In a test topology over 48 hours this reduced the addition of orphan transactions from 436/hr to 2.3/hr.
  23. gmaxwell commented at 9:40 pm on April 12, 2016: contributor
    Also, utACK latest improvements.
  24. dcousens commented at 2:32 am on April 13, 2016: contributor
    concept ACK, utACK 98305df
  25. gmaxwell commented at 7:05 am on April 19, 2016: contributor
    @sipa: any squashing interest?
  26. sipa force-pushed on Apr 19, 2016
  27. sipa commented at 5:54 pm on April 19, 2016: member
    Rebased and squashed.
  28. sipa force-pushed on Apr 19, 2016
  29. sipa commented at 6:07 pm on April 19, 2016: member
    Also renamed the vInventoryTxToSend to setInventoryTxToSend, and restructured the tx send loop a bit to be less deeply nested.
  30. gmaxwell commented at 4:58 am on April 20, 2016: contributor
    Travis CI timeout, doesn’t look like a real failure.
  31. Eliminate TX trickle bypass, sort TX invs for privacy and priority.
    Previously Bitcoin would send 1/4 of transactions out to all peers
     instantly.  This causes high overhead because it makes >80% of
     INVs size 1.  Doing so harms privacy, because it limits the
     amount of source obscurity a transaction can receive.
    
    These randomized broadcasts also disobeyed transaction dependencies
     and required use of the orphan pool.  Because the orphan pool is
     so small this leads to poor propagation for dependent transactions.
    
    When the bypass wasn't in effect, transactions were sent in the
     order they were received.  This avoided creating orphans but
     undermines privacy fairly significantly.
    
    This commit:
     Eliminates the bypass. The bypass is replaced by halving the
      average delay for outbound peers.
    
     Sorts candidate transactions for INV by their topological
      depth then by their feerate (then hash); removing the
      information leakage and providing priority service to
      higher fee transactions.
    
     Limits the amount of transactions sent in a single INV to
      7tx/sec (and twice that for outbound); this limits the
      harm of low fee transaction floods, gives faster relay
      service to higher fee transactions. The 7 sounds lower
      than it really is because received advertisements need
      not be sent, and because the aggregate rate is multiplied
      by the number of peers.
    f2d3ba7386
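
    As a rough cross-check of these numbers against the constants in the main.h diff above (assuming the cap is defined as rate times average interval; the exact definition may differ):

        static const unsigned int INVENTORY_BROADCAST_INTERVAL = 5;                            // seconds, from the diff above
        static const unsigned int INVENTORY_BROADCAST_MAX = 7 * INVENTORY_BROADCAST_INTERVAL;  // 7 tx/sec => at most 35 invs per trickle
        // Outbound peers use half the delay, so they effectively get ~14 tx/sec.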
  32. sipa force-pushed on Apr 20, 2016
  33. sipa force-pushed on Apr 20, 2016
  34. sipa commented at 7:26 pm on April 20, 2016: member

    Added one more commit by @gmaxwell to move all inv filtering to the send stage (which avoids announcing things that are no longer in our mempool, and avoids bloom filter updates for those).

    I’m sorry for the increasingly expanding scope this PR has had, so I’m not going to touch functionality any further unless bugs in it are found.

  35. in src/main.cpp: in 66a5885a9e outdated
    5823+                        if (!fInMemPool) continue; // another thread removed since queryHashes, maybe...
    5824+                        if (!pto->pfilter->IsRelevantAndUpdate(tx)) continue;
    5825+                    }
    5826+                    pto->filterInventoryKnown.insert(hash);
    5827+                    vInv.push_back(inv);
    5828+                    if (vInv.size() == MAX_INV_SZ) {
    


    jonasnick commented at 8:26 pm on April 20, 2016:
    It seems that in general it can happen that vInv.size() > MAX_INV_SZ, namely when pto->vInventoryBlockToSend.size() >= MAX_INV_SZ. Are we assuming that this isn’t the case?

  36. in src/main.cpp: in 1a16f859e4 outdated
    5855+                        if (!fInMemPool) continue; // another thread removed since queryHashes, maybe...
    5856+                        if (!pto->pfilter->IsRelevantAndUpdate(tx)) continue;
    5857+                    }
    5858+                    pto->filterInventoryKnown.insert(hash);
    5859+                    vInv.push_back(inv);
    5860+                    if (vInv.size() == MAX_INV_SZ) {
    


    sipa commented at 8:29 pm on April 20, 2016:
    Entries are added one by one to vInv, and all call sites where that happens are followed by a max size test, so I don’t think vInv can exceed MAX_INV_SZ.

    sipa commented at 8:49 pm on April 20, 2016:
    Ah, good point. There may not be other mechanisms that prevent over 5000 blocks being inved at once.

    sipa commented at 10:36 pm on April 20, 2016:
    Fixed.
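
    A sketch of the fix being discussed: every single push into vInv, block announcements included, is followed by the same overflow check, so vInv can never grow past MAX_INV_SZ however many blocks are queued:

        for (const uint256& hash : pto->vInventoryBlockToSend) {
            vInv.push_back(CInv(MSG_BLOCK, hash));
            if (vInv.size() == MAX_INV_SZ) {
                pto->PushMessage(NetMsgType::INV, vInv);
                vInv.clear();
            }
        }
        pto->vInventoryBlockToSend.clear();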
  37. sipa force-pushed on Apr 20, 2016
  38. Split up and optimize transaction and block inv queues dc13dcd2be
  39. Handle mempool requests in send loop, subject to trickle
    By eliminating queued entries from the mempool response and responding only at
    trickle time, this makes the mempool no longer leak transaction arrival order
    information (as the mempool itself is also sorted)-- at least no more than
    relay itself leaks it.
    ed7068302c
  40. Return mempool queries in dependency order 4578215e7f
  41. Move bloom and feerate filtering to just prior to tx sending.
    This will avoid sending more pointless INVs around updates, and
     prevents using filter updates to timetag transactions.
    
    Also adds locking for fRelayTxes.
    b559914753
  42. sipa force-pushed on Apr 20, 2016
  43. dcousens commented at 4:08 am on April 21, 2016: contributor
    @sipa could you update your original PR description with the full scope of this PR now? (for others)
  44. gmaxwell commented at 7:04 pm on April 21, 2016: contributor
    ACK.
  45. jonasnick commented at 8:08 am on April 22, 2016: contributor
    Ack
  46. in src/txmempool.cpp: in b559914753
    773+    CTxMemPool *mp;
    774+public:
    775+    DepthAndScoreComparator(CTxMemPool *mempool) : mp(mempool) {}
    776+    bool operator()(const uint256& a, const uint256& b) { return mp->CompareDepthAndScore(a, b); }
    777+};
    778+}
    


    sdaftuar commented at 8:18 pm on April 22, 2016:
    nit: What do you think of exposing the DepthAndScoreComparator defined here, and eliminating the CompareInvMempoolOrder object declared in main.cpp by moving the comparator it defines (which reverses the comparison) into this class?

    sipa commented at 8:54 pm on April 22, 2016:

    The comparator over there sorts iterators to set items, so there is more difference.

    I do think the comparator (or at least a base comparator) should be exposed here, which grabs the mempool lock etc. Perhaps the other comparator can then take a ref to the first one as an argument to add the reversing and set iterator dereferencing.


    sdaftuar commented at 9:08 pm on April 22, 2016:
    I was thinking that because the main.cpp comparator acts on different objects, it’d be easy to just move that function into this class (and explain in comments what the two comparators were for and why they are subtly different!), but your suggestion seems reasonable to me as well. I suppose it’s not awesome for the implementation details in main.cpp to leak into txmempool…

    sipa commented at 9:36 pm on April 22, 2016:
    Even better would be for txmempool.h to expose an opaque SortedMempoolSet object, which takes a list of hashes as inputs, and maintains a sorted heap of iterators to the mempool entries themselves, in addition to the cs lock.
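
    A sketch consistent with the direction discussed here: main.cpp's comparator wraps the exposed mempool comparison, adding the set-iterator dereference and the reversal needed for std::make_heap's max-heap semantics:

        class CompareInvMempoolOrder
        {
            CTxMemPool* mp;
        public:
            explicit CompareInvMempoolOrder(CTxMemPool* mempool) : mp(mempool) {}
            bool operator()(std::set<uint256>::iterator a, std::set<uint256>::iterator b) const
            {
                // Reversed: std::make_heap is a max-heap, and we want the tx that
                // should be announced first to end up at the top of the heap.
                return mp->CompareDepthAndScore(*b, *a);
            }
        };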
  47. sdaftuar commented at 8:24 pm on April 22, 2016: member

    ACK, apart from my question about whether we should combine the two comparators that are being introduced.

    Couple notes:

    • RelayTransaction no longer requires a feerate, and AcceptToMemoryPool no longer needs to fill a feerate in. We can clean this up in a future PR.
    • I noticed while testing that CTxMemPool::queryHashes is what we use in the RPC call getrawmempool. It’s hard for me to imagine anyone would care (and in fact the new sort order on the results seems handy!), but perhaps we should flag the change in the release notes?
  48. gmaxwell commented at 6:00 am on April 26, 2016: contributor
    I’ve had this running in its various forms for two weeks on a couple of public nodes, as well as a separate test topology that consisted of nothing but nodes running this code. What do we need to move this forward? I’m concerned that if I move to testing other PRs this will be forgotten.
  49. laanwj merged this on May 5, 2016
  50. laanwj closed this on May 5, 2016

  51. laanwj referenced this in commit 3b9a0bf41f on May 5, 2016
  52. codablock referenced this in commit 994eb30518 on Sep 16, 2017
  53. codablock referenced this in commit d8a279004c on Sep 19, 2017
  54. codablock referenced this in commit 5d8e94a26f on Dec 21, 2017
  55. codablock referenced this in commit d4b8abf4ff on Dec 21, 2017
  56. MarkLTZ referenced this in commit a1e585bd60 on Apr 28, 2019
  57. furszy referenced this in commit 1f010c0969 on Jan 23, 2021
  58. zkbot referenced this in commit 1b5f17c900 on Apr 1, 2021
  59. zkbot referenced this in commit 80e66e7daa on Apr 2, 2021
  60. zkbot referenced this in commit 12af999d42 on Aug 10, 2021
  61. zkbot referenced this in commit 44f7d7e4ea on Aug 11, 2021
  62. zkbot referenced this in commit 7fa7d5700b on Aug 12, 2021
  63. zkbot referenced this in commit fd462fd8c4 on Aug 13, 2021
  64. DrahtBot locked this on Sep 8, 2021
