Fixes for p2p-compactblocks.py test timeouts on travis (#8842) #9058

pull ryanofsky wants to merge 3 commits into bitcoin:master from ryanofsky:fix-8842-sync-timeouts changing 2 files +25 −5
  1. ryanofsky commented at 2:54 pm on November 1, 2016: member

    Fixes for #8842. There are three commits:

    • Commit [qa] Fix bug in compactblocks v2 merge fixes a p2p-compactblocks.py timeout related to, but not exactly the same as the timeout reported in #8842.
    • Commit [qa] Fix stale data bug in test_compactblocks_not_at_tip adds a missing variable initialization that isn’t currently a problem, but would lead to test failures after the next commit.
    • Commit Modify getblocktxn handler not to drop requests for old blocks is the actual fix for the timeout reported in #8842.
  2. fanquake added the label Tests on Nov 2, 2016
  3. [qa] Fix bug in compactblocks v2 merge
    Bug caused the wait_for_block_announcement to be called on the wrong node,
    leading to nondeterminism and occasional test failures. Bug was introduced in
    merge commit:
    
    d075479 Merge #8882: [qa] Fix race conditions in p2p-compactblocks.py and sendheaders.py
    
    Underlying commits which conflicted were:
    
    27acfc1 [qa] Update p2p-compactblocks.py for compactblocks v2
    6976db2 [qa] Another attempt to fix race condition in p2p-compactblocks.py
    
    The first commit changed the test_compactblock_construction function signature
    and second commit added code which wasn't updated during the merge to use the
    new arguments.
    
    Suhas Daftuar <sdaftuar@chaincode.com> noticed the bug and suggested the fix.
    47e9659ecf
  4. [qa] Fix stale data bug in test_compactblocks_not_at_tip
    Clear test_node.last_block before requesting blocks in the
    compactblocks_not_at_tip test so comparisons won't fail if a blocks were received
    before the test started.
    
    The bug doesn't currently cause any problems due to the order tests run, but
    this will change in the next commit.
    55bfddcabb
  5. laanwj added the label P2P on Nov 3, 2016
  6. laanwj removed the label Tests on Nov 3, 2016
  7. laanwj commented at 9:17 am on November 3, 2016: member

    I’m not sure I understand this: is the protocol change needed just to fix the test?

    Nit: this should not be labeled “tests” if it also involves a protocol change: if there is a protocol change that would be the primary focus, I was also confused by the title.

  8. ryanofsky commented at 10:20 am on November 3, 2016: member
    I’ll add more detail to the third commit message. Making nodes always respond to getblocktxn requests instead of silently ignoring requests for old blocks (depth > 10) fixes a race in the syncing protocol that only happens in the test environment, because the test environment generates many new blocks very quickly, which wouldn’t happen in the real world.
  9. TheBlueMatt commented at 12:25 pm on November 3, 2016: member

    Note that while this is incredibly unlikely on main net, due to the bursts there, this could conceivably happen on test net.

    On November 3, 2016 6:20:13 AM EDT, Russell Yanofsky notifications@github.com wrote:

    I’ll add more detail to the third commit message. Making nodes always respond to getblocktxn requests instead of silently ignoring requests for old blocks (depth > 10) fixes a race in the syncing protocol that only happens in the test environment, because the test environment generates new blocks very quickly, which wouldn’t happen in the real world.

    You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: #9058 (comment)

  10. TheBlueMatt commented at 12:26 pm on November 3, 2016: member

    Concept ACK

    On November 1, 2016 10:54:19 AM EDT, Russell Yanofsky notifications@github.com wrote:

    See commit messages for details. There are two test fixes and a protocol change. Corresponding BIP update is https://github.com/bitcoin/bips/pull/469. You can view, comment on, or merge this pull request online at:

    #9058

    – Commit Summary –

    • [qa] Fix bug in compactblocks v2 merge
    • [qa] Fix stale data bug in test_compactblocks_not_at_tip
    • Modify getblocktxn not to drop requests for old blocks

    – File Changes –

    M qa/rpc-tests/p2p-compactblocks.py (18) M src/main.cpp (12)

    – Patch Links –

    #9058.patch #9058.diff

    You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: #9058

  11. ryanofsky force-pushed on Nov 3, 2016
  12. ryanofsky commented at 2:24 pm on November 3, 2016: member
    Updated the commit message. For completeness, I’ll mention that I did also implement an alternate fix for for the timeouts at https://github.com/ryanofsky/bitcoin/commit/36692c9b426a23fb8ce9e32b5ffee7920b31c98b (Get rid of MAX_GETBLOCKTXN_DEPTH constant).
  13. Modify getblocktxn handler not to drop requests for old blocks
    The current getblocktxn implementation drops and ignores requests for old
    blocks, which causes occasional sync_block timeouts during the
    p2p-compactblocks.py test as reported in
    https://github.com/bitcoin/bitcoin/issues/8842.
    
    The p2p-compactblocks.py test setup creates many new blocks in a short
    period of time, which can lead to getblocktxn requests for blocks below the
    hardcoded depth limit of 10 blocks. This commit changes the getblocktxn
    handler not to ignore these requests, so the peer nodes in the test setup
    will reliably be able to sync.
    
    The protocol change is documented in BIP-152 update "Allow block responses
    to getblocktxn requests" at https://github.com/bitcoin/bips/pull/469.
    
    The protocol change is not expected to affect nodes running outside the test
    environment, because there shouldn't normally be lots of new blocks being
    rapidly added that need to be synced.
    dac53b58b5
  14. in src/main.cpp: in f6c7943782 outdated
    5459+            // actually receive all the data read from disk over the network.
    5460             LogPrint("net", "Peer %d sent us a getblocktxn for a block > %i deep", pfrom->id, MAX_BLOCKTXN_DEPTH);
    5461+            CInv vInv;
    5462+            vInv.type = State(pfrom->GetId())->fWantsCmpctWitness ? MSG_WITNESS_BLOCK : MSG_BLOCK;
    5463+            vInv.hash = req.blockhash;
    5464+            pfrom->vRecvGetData.push_front(vInv);
    


    sdaftuar commented at 7:22 pm on November 7, 2016:
    Perhaps this should be inserted at the end rather than the beginning of vRecvGetData, to maintain ordering with other block requests.

    ryanofsky commented at 7:36 pm on November 7, 2016:
    Thanks, switched to push_back.
  15. ryanofsky force-pushed on Nov 7, 2016
  16. sdaftuar commented at 2:08 am on November 9, 2016: member

    utACK dac53b58b555183ccc0d5e64c428528267cd98b3

    Note that the behavior this is changing could also be problematic for users on mainnet, say if you had an edge router whose network went down for a couple hours – on resumption it might relay the first of a batch of blocks as a compact block, but then ignore the subsequent GETBLOCKTXN if its tip rapidly advances in the meantime as it catches up.

    Also, I think this has been mentioned elsewhere, but this issue doesn’t just affect the p2p-compactblocks.py test, but many other of our python tests as well (as an example, I’ve been dealing with sporadic failures due to this issue in the pruning test this evening).

  17. laanwj commented at 10:02 am on November 11, 2016: member
    Going to merge this, as the random errors in the segwit test are starting to be really annoying.
  18. laanwj merged this on Nov 11, 2016
  19. laanwj closed this on Nov 11, 2016

  20. laanwj referenced this in commit 7977a1157a on Nov 11, 2016
  21. in src/main.cpp: in dac53b58b5
    5456+            // small blocktxn response is preferable in the case where a peer
    5457+            // might maliciously send lots of getblocktxn requests to trigger
    5458+            // expensive disk reads, because it will require the peer to
    5459+            // actually receive all the data read from disk over the network.
    5460             LogPrint("net", "Peer %d sent us a getblocktxn for a block > %i deep", pfrom->id, MAX_BLOCKTXN_DEPTH);
    5461+            CInv vInv;
    


    TheBlueMatt commented at 11:46 pm on November 11, 2016:
    Oops, usually v refers to the type being a vector, but this is just a single inv.

    ryanofsky commented at 7:03 pm on November 14, 2016:
    Thanks, corrected in #9160
  22. TheBlueMatt commented at 11:49 pm on November 11, 2016: member
    Postumous utACK
  23. ryanofsky referenced this in commit 1811547459 on Nov 14, 2016
  24. ryanofsky referenced this in commit ec34648766 on Nov 14, 2016
  25. sdaftuar commented at 9:50 pm on November 19, 2016: member
    @TheBlueMatt Should this be tagged for backport to 0.13?
  26. TheBlueMatt commented at 10:41 pm on November 19, 2016: member

    It’s largely a fix for QA instability…. What is our normal process for that?

    On November 19, 2016 1:50:27 PM PST, Suhas Daftuar notifications@github.com wrote:

    @TheBlueMatt Should this be tagged for backport to 0.13?

    You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: #9058 (comment)

  27. sdaftuar commented at 11:03 pm on November 19, 2016: member

    Well usually QA instability is addressed by changing the tests, and I think we tend to be pretty liberal about backporting test changes.

    Personally I think it makes sense to backport this behavior change, as I mentioned above I think this is basically a bugfix since it could be triggered in some mainnet configurations as well, and I think the patch here is small and low risk. Not sure if everyone would agree with my assessment though…?

  28. TheBlueMatt commented at 0:46 am on November 20, 2016: member
    Yea, I’m OK with this being backported.
  29. MarcoFalke referenced this in commit 2ba5d78427 on Nov 20, 2016
  30. MarcoFalke referenced this in commit 286e548d87 on Nov 20, 2016
  31. MarcoFalke referenced this in commit 573b575174 on Nov 20, 2016
  32. MarcoFalke referenced this in commit c65fb7c3ff on Nov 20, 2016
  33. MarcoFalke referenced this in commit e8461666ec on Nov 20, 2016
  34. gladcow referenced this in commit 751d88e974 on Mar 5, 2018
  35. gladcow referenced this in commit 7863436edf on Mar 13, 2018
  36. gladcow referenced this in commit 5a700a70d2 on Mar 14, 2018
  37. gladcow referenced this in commit 32ac7600d1 on Mar 15, 2018
  38. gladcow referenced this in commit 921f64f0e1 on Mar 15, 2018
  39. gladcow referenced this in commit 1b5bbf0181 on Mar 15, 2018
  40. gladcow referenced this in commit 90d822bb04 on Mar 15, 2018
  41. gladcow referenced this in commit 30aff7f3c1 on Mar 24, 2018
  42. gladcow referenced this in commit ac2cd34f51 on Apr 4, 2018
  43. UdjinM6 referenced this in commit bc45a2f87a on Apr 11, 2018
  44. andvgal referenced this in commit fd5c50bc2b on Jan 6, 2019
  45. CryptoCentric referenced this in commit dd3fd51204 on Feb 28, 2019
  46. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-05 04:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me