test: Fix race condition in index prune test #25123

pull fjahr wants to merge 1 commits into bitcoin:master from fjahr:202205-index-prune-fix changing 1 files +3 −4
  1. fjahr commented at 11:04 pm on May 12, 2022: member

    Fixes #25031

    The feature_index_prune.py test seems to be racy because connections are reestablished after restarts and the blocks are synced via the sync_blocks function. The sync_blocks function has a sanity check at the beginning to check that all nodes in the set have at least one established connection and that is not always the case.

    As a solution nodes are not connected via the -connect parameter on start but instead via the connect_nodes helper.

  2. theStack commented at 11:11 pm on May 12, 2022: member

    Concept ACK

    Is there an explanation why this race is only (or at least, much more likely, it seems) happening on Windows? If someone knows how to reproduce the fixed failure behaviour on a Linux/BSD system, would be nice to know.

  3. fjahr commented at 11:29 pm on May 12, 2022: member

    Concept ACK

    Is there an explanation why this race is only (or at least, much more likely, it seems) happening on Windows? If someone knows how to reproduce the fixed failure behaviour on a Linux/BSD system, would be nice to know.

    Agree, it would be nice to be sure :) My assumption is that the Windows build is just slower than everything else and that’s why we have only seen it there so far. Looking at https://cirrus-ci.com/github/bitcoin/bitcoin it always seems the be the job that takes the longest time.

  4. DrahtBot added the label Tests on May 13, 2022
  5. MarcoFalke commented at 5:27 am on May 13, 2022: member

    why should addconnection be less racy than connect?

    Why not use connect_nodes which has the wait_until built into? #25031 (comment)

  6. MarcoFalke commented at 5:28 am on May 13, 2022: member
    It should be possible to reproduce by adding a sleep in the -connect thread?
  7. fjahr commented at 3:29 pm on May 14, 2022: member

    I was able to reproduce the issue as well with this, thanks @MarcoFalke !

    why should addconnection be less racy than connect?

    From my understanding of the code addconnection only returns after the connection has been established (or failed). It doesn’t seem to use threading and since it’s only available for testing I thought something like this should be it’s primary use-case. I confirmed it works with your hint mentioned above.

    Why not use connect_nodes which has the wait_until built into? #25031 (comment)

    Right, I can replace addconnection here with connect_nodes, I didn’t remember connect_nodes has the wait built into it. It looks a little less efficient (calling more RPCs under the hood) but saves us another line for the ip+port in the test, so I am indifferent between the two. I just didn’t like the one-liner in that link because it’s harder to read and would require a comment to explain that this is a clunky way to protect against a potential race condition.

  8. fjahr force-pushed on May 14, 2022
  9. fjahr renamed this:
    test: Use addconnection for robustness in index prune test
    test: Fix race condition in index prune test
    on May 14, 2022
  10. test: Fix race condition in index pruning test
    Nodes are restarted and reconnected as part of the test. Afterwards
    `sync_blocks` is called immediately on the nodes. `sync_blocks`
    first checks that all the included nodes have at least one
    connection. Since adding a connection is usually happening in a
    thread, sometimes nodes could run into this check before the
    connection was fully established so that it would fail the entire
    test.
    
    This fix uses the `connect_nodes` helper to make the connection the
    nodes. `connect_nodes` has a wait for the connection built into it.
    4faa550072
  11. fjahr force-pushed on May 14, 2022
  12. fjahr commented at 3:38 pm on May 14, 2022: member
    Updated to use connect_nodes helper instead of addconnection RPC.
  13. MarcoFalke merged this on May 15, 2022
  14. MarcoFalke closed this on May 15, 2022

  15. MarcoFalke commented at 7:32 am on May 15, 2022: member
    If addconnection is somehow better than the stuff in connect_nodes, it could make sense to use addconnection in the connect_nodes impl?
  16. fjahr commented at 12:32 pm on May 15, 2022: member

    If addconnection is somehow better than the stuff in connect_nodes, it could make sense to use addconnection in the connect_nodes impl?

    Yeah, I was thinking about that as well. I will give it a try and benchmark.

  17. sidhujag referenced this in commit d670f9782d on May 28, 2022
  18. DrahtBot locked this on May 15, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-05 04:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me