Replace setInventoryKnown with a rolling bloom filter. #7100

pull gmaxwell wants to merge 5 commits into bitcoin:master from gmaxwell:known_bloom changing 7 files +15 −165
  1. gmaxwell commented at 5:48 am on November 26, 2015: contributor

    Mruset setInventoryKnown was reduced to a remarkably small 1000 entries as a side effect of sendbuffer size reductions in 2012.

    This removes setInventoryKnown filtering from merkleBlock responses because false positives there are especially unattractive and also because I’m not sure if there aren’t race conditions around the relay pool that would cause some transactions there to be suppressed. (Also, ProcessGetData was accessing setInventoryKnown without taking the required lock.)

  2. Replace setInventoryKnown with a rolling bloom filter.
    Mruset setInventoryKnown was reduced to a remarkably small 1000
     entries as a side effect of sendbuffer size reductions in 2012.
    
    This removes setInventoryKnown filtering from merkleBlock responses
     because false positives there are especially unattractive and
     also because I'm not sure if there aren't race conditions around
     the relay pool that would cause some transactions there to
     be suppressed. (Also, ProcessGetData was accessing
     setInventoryKnown without taking the required lock.)
    3602ed8f66
  3. dcousens commented at 5:54 am on November 26, 2015: contributor
    @gmaxwell with this change, mruset can be completely removed.
  4. laanwj added the label P2P on Nov 26, 2015
  5. pstratem commented at 10:13 am on November 26, 2015: contributor
    @gmaxwell how large is the filter?
  6. sipa commented at 10:15 am on November 26, 2015: member
    1.37 MIB per peer…
  7. pstratem commented at 10:19 am on November 26, 2015: contributor
    @sipa in that case utACK
  8. gmaxwell commented at 11:07 am on November 26, 2015: contributor
    hm. Is it that big? My math must be wrong, I was thinking it was more like 350k. In any case, I didn’t want to make it smaller than the maximum INV size, since having a single INV blow it out would be unfortunate.
  9. sipa commented at 11:08 am on November 26, 2015: member
    Rolling bloom filters are 4 times as large as normal ones.
  10. sipa commented at 3:21 pm on November 26, 2015: member
    Eh, messed up my math. This should result in 700 KiB per peer.
  11. sipa commented at 3:33 pm on November 26, 2015: member
    Confirmed with a sneaky fprintf call: 35 KiB for the address tables, and 702 KiB for the inventory tables.
  12. sipa commented at 3:40 pm on November 26, 2015: member
    utACK. Also, mruset can go away indeed.
  13. dcousens commented at 3:14 am on November 27, 2015: contributor
    utACK
  14. laanwj commented at 10:06 am on November 27, 2015: member
    700kiB per peer is quite a lot for a P2P application in itself, but more relevant is how it compares to mruset overhead when keeping the functionality at the same level. I suppose it’s much better?
  15. sipa commented at 10:28 am on November 27, 2015: member
    An mruset for the same size (last 50000 entries) would be almost 4 MiB. Of course, they are only size 1000 currently…
  16. laanwj commented at 10:34 am on November 27, 2015: member
    Yes, that’s why I mentioned “functionally equivalent”, comparing against the current state isn’t fair as it’s broken.
  17. sipa commented at 10:39 am on November 27, 2015: member

    I do agree 700 KiB is pretty high per peer.

    Do we really need a 1/1000000 false positive rate? An FP will just cause us to not relay a particular transaction to one peer, but we’ll still relay it to others (they have other nonces).

    Also, a data structure exists that is 25% more efficient than the current CRollingBloomFilter.

  18. pstratem commented at 10:42 am on November 27, 2015: contributor
    @sipa Care to share? :)
  19. laanwj commented at 2:02 pm on November 27, 2015: member

    Do we really need a 1/1000000 false positive rate? An FP will just cause us to not relay a particular transaction to one peer, but we’ll still relay it to others (they have other nonces).

    Ideally what we’d want here is the opposite of a bloom filter. False positives are bad, as they cause peers to miss out on transactions, but the occasional false negative is ok, as they might get it INVed another time.

  20. sipa commented at 2:14 pm on November 27, 2015: member

    Agree… unfortunately I don’t know of any better way to do that than an mruset…

    Anyway, #7113 reduces this to 526 KiB per peer.

  21. in src/net.cpp: in 3602ed8f66 outdated
    2369@@ -2370,6 +2370,7 @@ CNode::CNode(SOCKET hSocketIn, const CAddress& addrIn, const std::string& addrNa
    2370     nSendOffset = 0;
    2371     hashContinue = uint256();
    2372     nStartingHeight = -1;
    2373+    setInventoryKnown.reset();
    


    jtimon commented at 4:06 pm on November 27, 2015:
    Won’t this undo the setInventoryKnown(50000, 0.000001) initialization?

    sipa commented at 4:10 pm on November 27, 2015:
    No, it only wipes the set, not the parameters.
  22. jtimon commented at 4:06 pm on November 27, 2015: contributor
    utACK
  23. gmaxwell commented at 6:11 pm on November 27, 2015: contributor

    @sipa I’m not really excited with having false positives at all here; mostly due to the case where you only have one peer, a FP means you may not relay your own transaction.

    In a universe where software complexity had no cost, I’d give MRUsets to (or even just the first few) outbound peers, and higher FP rolling bloom filters to all the others.

  24. Remove mruset as it is no longer used. 7fdf4fdc32
  25. sipa commented at 1:38 pm on November 28, 2015: member
    utACK
  26. pstratem commented at 9:07 am on November 29, 2015: contributor

    needs rename of setInventoryKnown to filterInventoryKnown

    won’t a false positive here break block relaying?

  27. sipa commented at 9:45 am on November 29, 2015: member
    Yes very good point, though I think that after #6494 we may be able to get rid of using inventory for blocks entirely.
  28. pstratem commented at 9:47 am on November 29, 2015: contributor

    there doesn’t seem to be a strong reason to filter block inv messages

    i say just send them without filtering

  29. Rename setInventoryKnown filterInventoryKnown 7dd88dd10a
  30. Only use filterInventoryKnown with MSG_TX inventory messages.
    Previously this logic could erroneously filter a MSG_BLOCK inventory message.
    4448ff695f
  31. pstratem commented at 10:23 am on November 29, 2015: contributor
    ACK
  32. sipa commented at 12:25 pm on November 29, 2015: member
    Updated #7125 to not use setInventoryKnown for blocks anymore.
  33. Actually only use filterInventoryKnown with MSG_TX inventory messages.
    Previously this logic could erroneously filter a MSG_BLOCK inventory message.
    8a0465b59d
  34. in src/net.h: in 4448ff695f outdated
    490@@ -492,8 +491,9 @@ class CNode
    491     {
    492         {
    493             LOCK(cs_inventory);
    494-            if (!setInventoryKnown.count(inv))
    495-                vInventoryToSend.push_back(inv);
    496+            if (inv.type == MSG_TX && filterInventoryKnown.contains(inv.hash))
    


    sipa commented at 2:55 pm on November 29, 2015:
    This test is not enough. SendMessages checks the filter again.
  35. sipa commented at 1:14 am on November 30, 2015: member
    Needs rebase on top of #7129.
  36. sipa commented at 3:23 pm on November 30, 2015: member
    See #7133.
  37. gmaxwell commented at 0:05 am on December 2, 2015: contributor
    replaced by #7133.
  38. gmaxwell closed this on Dec 2, 2015

  39. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me