Use recent-rejects, orphans, and recently-replaced txn for compact-block-reconstruction #9499

pull TheBlueMatt wants to merge 10 commits into bitcoin:master from TheBlueMatt:2016-12-recent-tx-cache-cmpct changing 11 files +105 −31
  1. TheBlueMatt commented at 7:44 pm on January 9, 2017: member

    This takes the rather obvious step of keeping a circular buffer of CTransactionRefs which is populated with orphan transactions, transactions replaced out of mempool, and rejected transactions and uses them to populate compact blocks.

    It defaults to 100 transactions in size, the same as the orphan map, which, in the worst case, would be about 40MB+memory overhead (this would be 100 400K-weight witness-only-ish transactions replaced, same as the orphan map).

    It limits rejected transactions to 100KB in memory usage before storing.

    In previous tests, @gmaxwell had indicated that if we had an infinite-sized buffer here we could save a ton of compact-block-round-trips, so in non-adversarial conditions even a small buffer should help at least sometimes.

  2. Move ORPHAN constants from validation.h to net_processing.h c735540428
  3. Make ATMP optionally return the CTransactionRefs it replaced edded808fc
  4. TheBlueMatt commented at 7:46 pm on January 9, 2017: member
    Most of the diff here is changing the signature of AcceptToMemoryPool trivially….
  5. TheBlueMatt commented at 7:51 pm on January 9, 2017: member
    To clarify: only rejected transactions are limited based on memory-usage (to 100KB), everything else relies on the standardness limits - 400K weight, which would be 400KB of witness data, max.
  6. TheBlueMatt force-pushed on Jan 9, 2017
  7. gmaxwell commented at 8:32 pm on January 9, 2017: contributor

    Concept ACK. nice implementation.

    You should probably consider adding an extra count “% from memory (% hits against extra),” to the successful reconstruction log entry.

    The count limit is unfortunate, merely 100 entries but 40MB max? you could easily keep count of the worst case memory usage (ignoring the deduplication) and then just target 4MB– lots less memory and a higher hitrate no doubt. Ultimately reject will need to be separately limited (too easy for a single trouble maker to blow it out, and not just an attacker: an older peer without a fee filter) somehow but this should be fine for now.

  8. TheBlueMatt commented at 9:17 pm on January 9, 2017: member

    Indeed, doing smarter limiting would be nice, but this should do for a bare-bones help-the-default-case win for 0.14 (if it turns out to do enough and the fast progress we’ve been making on review continues through the end of the week). If we prefer it’s not really any additional review burden to do a memory-based limit across the entire list.

    On January 9, 2017 3:33:01 PM EST, Gregory Maxwell notifications@github.com wrote:

    Concept ACK. nice implementation.

    You should probably consider adding an extra count “% from memory (% hits against extra),” to the successful reconstruction log entry.

    The count limit is unfortunate, merely 100 entries but 40MB max? you could easily keep count of the worst case memory usage (ignoring the deduplication) and then just target 4MB– lots less memory and a higher hitrate no doubt.

    – You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/bitcoin/bitcoin/pull/9499#issuecomment-271399263

  9. TheBlueMatt force-pushed on Jan 10, 2017
  10. in src/net_processing.cpp: in b75d967e3c outdated
    58@@ -59,6 +59,9 @@ map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(cs_main);
    59 map<COutPoint, set<map<uint256, COrphanTx>::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(cs_main);
    60 void EraseOrphansFor(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
    61 
    62+static size_t vOrphanAndRecentlyRemovedTxnIt = 0;
    


    instagibbs commented at 4:12 pm on January 10, 2017:
    technically this is for rejects as well too now… maybe vExtraTxnIt and same for the vector below
  11. instagibbs approved
  12. instagibbs commented at 5:24 pm on January 10, 2017: member

    very nice, and plenty of room for improvement on top later

    utACK, running now with an extra commit on top to see efficacy: https://github.com/instagibbs/bitcoin/commit/fcb8ace8245435289018a8171463f4743ea664a5

  13. gmaxwell commented at 6:54 pm on January 10, 2017: contributor

    It’s working for me. I added logging for transactions that come from the extrapool

    • LogPrint(“cmpctblock”, “Successfully reconstructed block %s with %lu txn prefilled, %lu txn from memory (%lu extra hits), and %lu txn requested\n”, hash.ToString(), prefilled_count, mempool_count, extra_count, vtx_missing.size());

    Please add that, I know extra can’t be counted completely precisely in the presence of collisions but it’s very useful information.

    Some stats:

    $ grep essfu ~/.bitcoin/debug.log | tail -n 72 | awk ‘{aa=int(substr($15,2))>0; bb=$19>0; if (aa && !bb) {saved+=1} else if (!aa && !bb) {noextra+=1} else {roundtrip+=1} } END {print saved/NR " " noextra/NR " " roundtrip/NR}’ 0.180556 0.527778 0.291667

    So the extra increased no-roundtrip from 52% to 68%… in the last 72 blocks here. Though I see there are at least a couple blocks where extra was just too small and would have been successful if it were larger, I can tell because <5 were missed and grepped the logs showed that I had previously received them. I think thats a big enough improvement to justify doing it, but the size should be increased eventually (presumably via better memory management).

  14. Keep shared_ptrs to recently-replaced txn for compact blocks 1531652e02
  15. Use replaced transactions in compact block reconstruction 93380c5247
  16. Consider all orphan txn for compact-block-extra-txn cache 7f8c8cab1e
  17. Consider all (<100k memusage) txn for compact-block-extra-txn cache 863edb45b9
  18. TheBlueMatt force-pushed on Jan 10, 2017
  19. TheBlueMatt commented at 7:50 pm on January 10, 2017: member
    Renamed vOrphanAndRecentlyRemovedTxn -> vExtraTxnForCompact, no other changes
  20. in src/net_processing.cpp: in 863edb45b9 outdated
    1740@@ -1741,7 +1741,10 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
    1741                 // See https://github.com/bitcoin/bitcoin/issues/8279 for details.
    1742                 assert(recentRejects);
    1743                 recentRejects->insert(tx.GetHash());
    1744-            }
    1745+                if (RecursiveDynamicUsage(*ptx) < 100000)
    


    morcos commented at 9:28 pm on January 10, 2017:

    A quick examination shows 80-90% of transactions not accepted to the mempool are due to rate limited free transaction. What do you think about also adding to this check and the one below && state.GetRejectCode() != REJECT_INSUFFICIENTFEE

    You could be throwing away some transactions that would help blocks reconstruct, but I get close to 100 “not accepted” transactions an hour and it seems like I’d rather have 5 hours of non-free transactions rather than 1 hour of all.


    gmaxwell commented at 10:10 pm on January 10, 2017:
    It might be more interesting to distinguish REJECT_INSUFFICIENTFEE from rejection because it didn’t even meet the static minimum minfee.

    TheBlueMatt commented at 10:26 pm on January 10, 2017:
    I was trying to avoid any complication here, but if you think something simple is worth it I’m ok with adding something.

    morcos commented at 10:30 pm on January 10, 2017:
    Yep understood, my guess was that this was simple and a decent improvement but @gmaxwell has a point in that failed replacements could also be REJECT_INSUFFICIENTFEE and those might not be things you want to throw away. So up to you if you can simply disambiguate.

    gmaxwell commented at 0:55 am on January 11, 2017:
    I propose leaving it alone for now, in the long run we’ll want some kind of smarter queue.
  21. sdaftuar commented at 10:25 am on January 11, 2017: member

    So the extra increased no-roundtrip from 52% to 68%… in the last 72 blocks here.

    [Edited: my results I posted were wrong… recalculating.]

  22. gmaxwell commented at 7:08 pm on January 11, 2017: contributor
    My last 72 blocks: 0.125 0.625 0.25 (so .62 to .75)
  23. instagibbs commented at 7:12 pm on January 11, 2017: member

    (piling on) with a 1000 transaction buffer: 0.0833333 0.597222 0.319444 (~8% absolute improvement)

    update: up to 20% RTT saved now.

  24. in src/blockencodings.cpp: in 93380c5247 outdated
    46@@ -47,7 +47,7 @@ uint64_t CBlockHeaderAndShortTxIDs::GetShortID(const uint256& txhash) const {
    47 
    48 
    49 
    50-ReadStatus PartiallyDownloadedBlock::InitData(const CBlockHeaderAndShortTxIDs& cmpctblock) {
    51+ReadStatus PartiallyDownloadedBlock::InitData(const CBlockHeaderAndShortTxIDs& cmpctblock, std::vector<std::pair<uint256, CTransactionRef>>& extra_txn) {
    


    theuni commented at 11:46 pm on January 11, 2017:
    const? Or was this intended to remove the matches?
  25. morcos commented at 1:51 am on January 12, 2017: member

    1000 tx buffer I got 0.19 0.65 0.16

    Definitely worth merging I say..

    ACK 863edb4 , but please also add the extra printing

  26. laanwj added this to the milestone 0.14.0 on Jan 12, 2017
  27. instagibbs commented at 7:54 pm on January 12, 2017: member
  28. Add extra_count lower bound to compact reconstruction debug print b55b416346
  29. Make PartiallyDownloadedBlock::InitData's second param const fac4c78028
  30. TheBlueMatt commented at 8:20 pm on January 12, 2017: member
    Made the second parameter to InitData const and added the requested debug print.
  31. in src/blockencodings.cpp: in fac4c78028 outdated
    159+        // Though ideally we'd continue scanning for the two-txn-match-shortid case,
    160+        // the performance win of an early exit here is too good to pass up and worth
    161+        // the extra risk.
    162+        if (mempool_count == shorttxids.size())
    163+            break;
    164+    }
    


    luke-jr commented at 7:06 pm on January 14, 2017:
    A lot of code duplication here that would probably be better shared, but maybe not critical for 0.14.

    TheBlueMatt commented at 7:58 pm on January 14, 2017:
    Yea, I thought about it, but to avoid having two vectors I decided against it. Will fix in a follow-up pr for post-0.14.
  32. luke-jr commented at 7:07 pm on January 14, 2017: member
    I wonder if it might be better to simply cache whether transactions we discard are valid or invalid, rather than the full transaction data. Then hypothetically we can judge the entire block as valid/invalid quickly, even though we still have more data to fetch. But this idea sounds like a lot of work…
  33. TheBlueMatt commented at 7:59 pm on January 14, 2017: member
    @luke-jr we tried that before…and would have broken consensus had it been shipped, so, yes, that is “a lot of work”.
  34. in src/net_processing.cpp: in fac4c78028 outdated
    588@@ -586,6 +589,17 @@ void UnregisterNodeSignals(CNodeSignals& nodeSignals)
    589 // mapOrphanTransactions
    590 //
    591 
    592+void AddToCompactExtraTransactions(const CTransactionRef& tx)
    593+{
    


    luke-jr commented at 8:35 pm on January 14, 2017:
    AssertLockHeld(cs_main);?

    TheBlueMatt commented at 7:59 pm on January 17, 2017:
    Done

    TheBlueMatt commented at 8:31 pm on January 17, 2017:
    Backed this out - some of our orphan map test stuff doesn’t take cs_main, so it would fail in test_bitcoin (though all of the actual-bitcoind runs succeeded). This should be as a part of a separate PR to add cs_main and AssertLockHelds to all of the orphan handling functions.
  35. in src/net_processing.cpp: in fac4c78028 outdated
    58@@ -59,6 +59,9 @@ map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(cs_main);
    59 map<COutPoint, set<map<uint256, COrphanTx>::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(cs_main);
    60 void EraseOrphansFor(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
    61 
    62+static size_t vExtraTxnForCompactIt = 0;
    


    luke-jr commented at 8:36 pm on January 14, 2017:
    I think this can/should be moved into AddToCompactExtraTransactions itself?

    TheBlueMatt commented at 7:54 pm on January 17, 2017:
    Leaving it outside makes it clear that you cannot simply insert into ExtraTxn, you need to do something special (and, technically, avoids a hidden atomic operation).
  36. in src/blockencodings.cpp: in 93380c5247 outdated
    140+            if (!have_txn[idit->second]) {
    141+                txn_available[idit->second] = extra_txn[i].second;
    142+                have_txn[idit->second]  = true;
    143+                mempool_count++;
    144+            } else {
    145+                // If we find two mempool txn that match the short id, just request it.
    


    sipa commented at 9:53 pm on January 14, 2017:
    Maybe code duplication is acceptable, but at least adapt the comments, or delete them (it’s not about mempool txn at this point).

    TheBlueMatt commented at 3:59 am on January 17, 2017:
    OK, added.

    TheBlueMatt commented at 4:00 am on January 17, 2017:
    Note that the comment had already been mostly adapted, with an extra few lines added at the end to clarify that if there is a conflict between extra and mempool, it is ignored only if the transactions are identical.
  37. gmaxwell commented at 6:56 am on January 16, 2017: contributor
    ACK.
  38. in src/blockencodings.cpp: in 93380c5247 outdated
    130@@ -130,6 +131,35 @@ ReadStatus PartiallyDownloadedBlock::InitData(const CBlockHeaderAndShortTxIDs& c
    131         if (mempool_count == shorttxids.size())
    132             break;
    133     }
    134+    }
    


    sipa commented at 3:39 am on January 17, 2017:
    Maybe add a comment explaining that if a short txid appears in both the mempool and in the extra_txn vector, we still assume the mempool one is the correct one, and don’t treat it as a collision. As the mempool has much better chance for matching with actual blocks, this is justified.

    TheBlueMatt commented at 3:56 am on January 17, 2017:
    This isnt true? This is only true if the witness hash (and thus, the txn) are identical.

    TheBlueMatt commented at 3:58 am on January 17, 2017:
    In fact, there’s already a comment indicating this.

    sipa commented at 4:08 am on January 17, 2017:
    Seems I was misreading the code, and interpreting the early-exit mempool_count == shorttxids.size() as the way the loop always successfully ends.
  39. in src/net_processing.cpp: in 863edb45b9 outdated
    1740@@ -1741,7 +1741,10 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
    1741                 // See https://github.com/bitcoin/bitcoin/issues/8279 for details.
    1742                 assert(recentRejects);
    1743                 recentRejects->insert(tx.GetHash());
    1744-            }
    1745+                if (RecursiveDynamicUsage(*ptx) < 100000)
    1746+                    AddToCompactExtraTransactions(ptx);
    1747+            } else if (tx.HasWitness() && RecursiveDynamicUsage(*ptx) < 100000)
    


    sipa commented at 3:41 am on January 17, 2017:
    This branch looks like it will never be taken, as every time the condition holds, the previous branch should have matches as well.

    sipa commented at 3:45 am on January 17, 2017:
    Nevermind. I’m confused by the the braces. Can you instead please add braces around all the indented sections?

    TheBlueMatt commented at 4:01 am on January 17, 2017:
    OK, braces added.
  40. Clarify comment about mempool/extra conflicts 1ccfe9b1c9
  41. Add braces around AddToCompactExtraTransactions c5945804ca
  42. luke-jr commented at 7:21 pm on January 17, 2017: member

    utACK c73554042886fb63fb48edf29cf827951edde341 edded808fc4eee94c178e1779b90d1c87a08c23a 1531652e02d856c1b8c3ba6f6e51e1983ac0540c 93380c5247526e2248248a7d539233ec48d11bdd 7f8c8cab1e537809848088d3bfaa13ecca7ac8cc b55b4163462756cbb2341294b839a881333d8f1e fac4c78028d132ee7becc1dc54a0c881ae1a0287

    Not so sure about 863edb45b9841c3058e883015c7576e6f69e3cc6

    Disclaimer: Reviewed across 3 sittings

  43. in src/net_processing.cpp: in c5945804ca
    1740@@ -1723,6 +1741,11 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
    1741                 // See https://github.com/bitcoin/bitcoin/issues/8279 for details.
    1742                 assert(recentRejects);
    1743                 recentRejects->insert(tx.GetHash());
    1744+                if (RecursiveDynamicUsage(*ptx) < 100000) {
    1745+                    AddToCompactExtraTransactions(ptx);
    1746+                }
    1747+            } else if (tx.HasWitness() && RecursiveDynamicUsage(*ptx) < 100000) {
    


    luke-jr commented at 7:24 pm on January 17, 2017:
    Shouldn’t we at least check that it’s valid?

    luke-jr commented at 7:24 pm on January 17, 2017:
    What stops someone from flooding peers with the same transaction triggering this case and making the extra-pool useless?

    TheBlueMatt commented at 7:46 pm on January 17, 2017:
    is valid how? The CValidationState was likely not ever set if there were mempool conflicts (which is exactly the case we want here)

    TheBlueMatt commented at 7:47 pm on January 17, 2017:
    There isnt actually much you can do to prevent this - the goal is to capture some number of conflicting transactions and keep them around, and without some complicated rate-limiting logic you’re not gonna win that fight.

    luke-jr commented at 7:51 pm on January 17, 2017:
    There are a few cases of invalid transactions we could detect; eg, script fails.

    TheBlueMatt commented at 7:53 pm on January 17, 2017:
    Yes, but we set state.Invalid for txn-mempool-conflict, and I super dont want a huge list of which cases of invalid we care about and which we dont here.

    luke-jr commented at 7:53 pm on January 17, 2017:
    Fair enough. I guess worst case it simply behaves the same as if this feature was never added.

    TheBlueMatt commented at 7:56 pm on January 17, 2017:
    Yea, this is no worse than before, and in many cases is a big win…its also written to be an easy review and merge, with the opportunity of cleaning it up and making it more clever in the future.

    luke-jr commented at 8:09 pm on January 17, 2017:
    Hmm, that sounds like something we probably shouldn’t be doing (although off-topic for this PR). How about using the ban score?

    TheBlueMatt commented at 8:29 pm on January 17, 2017:
    Lets revisit this after we fix DoS banning and have a more reasonable API to access :).
  44. luke-jr commented at 8:10 pm on January 17, 2017: member
    utACK 863edb45b9841c3058e883015c7576e6f69e3cc6 too (and the formatting commits after my review left off)
  45. TheBlueMatt force-pushed on Jan 17, 2017
  46. laanwj merged this on Jan 19, 2017
  47. laanwj closed this on Jan 19, 2017

  48. laanwj referenced this in commit 9c9af5ab2d on Jan 19, 2017
  49. in src/blockencodings.cpp: in c5945804ca
    158+            }
    159+        }
    160+        // Though ideally we'd continue scanning for the two-txn-match-shortid case,
    161+        // the performance win of an early exit here is too good to pass up and worth
    162+        // the extra risk.
    163+        if (mempool_count == shorttxids.size())
    


    jnewbery commented at 9:53 pm on January 19, 2017:

    I don’t think this early exit is worth it. The only way we could hit this is if we didn’t find all of the shortid transactions in the mempool and we found the remaining shortid transactions in vExtraTxnForCompact. If that’s the case, what do we gain/lose from early exit:

    • gain: we won’t have to scan through the < 100 transactions in vExtraTxnForCompact. That’s an almost negligable win.
    • lose: if there is a second transaction in vExtraTxnForCompact with the same shortid, we run the risk of a READ_STATUS_FAILED in FillBlock.

    TheBlueMatt commented at 10:16 pm on January 19, 2017:
    Looping through the GetShortID calls is nontrivial for large-ish mempools, and collisions should be ~1 in a million, so I think its still probably worth it.

    TheBlueMatt commented at 10:17 pm on January 19, 2017:
    Plus the intention would be to significantly increase the size of the extra pool in 0.15, assuming smarter memory management.

    jnewbery commented at 10:21 pm on January 19, 2017:
    Understood. When the extra pool size is 100 transactions this isn’t worth it, but if the extra pool is made much larger in 0.15, it makes sense.
  50. in src/blockencodings.cpp: in c5945804ca
    150+                // Note that we dont want duplication between extra_txn and mempool to
    151+                // trigger this case, so we compare witness hashes first
    152+                if (txn_available[idit->second] &&
    153+                        txn_available[idit->second]->GetWitnessHash() != extra_txn[i].second->GetWitnessHash()) {
    154+                    txn_available[idit->second].reset();
    155+                    mempool_count--;
    


    jnewbery commented at 10:00 pm on January 19, 2017:

    These counts are wrong if there’s a triple shortid collision. Same for the mempool count above.

    If there’s a shortid collision between a transaction in the mempool and in vExtraTxnForCompact, I believe extra_count will underflow.


    TheBlueMatt commented at 10:15 pm on January 19, 2017:
    Indeed, looks like the counting will provide spurious results sometimes…want to open a new pr to fix it?

    jnewbery commented at 10:22 pm on January 19, 2017:
    Sure. I’ll take a look tomorrow.
  51. jnewbery commented at 10:01 pm on January 19, 2017: member
    Apologies for the post-merge review!
  52. gladcow referenced this in commit 2a17b08e0a on Mar 8, 2018
  53. gladcow referenced this in commit b45003b356 on Mar 13, 2018
  54. gladcow referenced this in commit fb2e3178df on Mar 14, 2018
  55. gladcow referenced this in commit f569316904 on Mar 15, 2018
  56. gladcow referenced this in commit 1f6c948a6f on Mar 15, 2018
  57. gladcow referenced this in commit 83f685b356 on Mar 15, 2018
  58. gladcow referenced this in commit 09927f920f on Mar 15, 2018
  59. gladcow referenced this in commit 63887caf90 on Mar 24, 2018
  60. gladcow referenced this in commit 4ab5e6ec88 on Apr 4, 2018
  61. UdjinM6 referenced this in commit bc45a2f87a on Apr 11, 2018
  62. andvgal referenced this in commit fd5c50bc2b on Jan 6, 2019
  63. CryptoCentric referenced this in commit dd3fd51204 on Feb 28, 2019
  64. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me