Intermittent issue in p2p_addr_relay (AssertionError: not(16 == 20)) #22243

issue MarcoFalke openend this issue on June 14, 2021
  1. MarcoFalke commented at 6:26 pm on June 14, 2021: member

    https://cirrus-ci.com/task/4899698315100160?logs=ci#L5292

    0                                     File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/p2p_addr_relay.py", line 123, in relay_tests
    1                                       assert_equal(total_ipv4_received, num_ipv4_addrs * ipv4_branching_factor)
    2                                     File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 51, in assert_equal
    3                                       raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
    4                                   AssertionError: not(16 == 20)
    
  2. MarcoFalke added the label Bug on Jun 14, 2021
  3. MarcoFalke added the label Tests on Jun 14, 2021
  4. amitiuttarwar commented at 4:49 pm on June 15, 2021: contributor

    I suspect that this issue is caused because the test thread is accessing num_ipv4_received on the P2PInterface object without first acquiring the p2p_lock (link). So there might be a rare-but-possible race between the asyncio threads receiving messages / updating the value & the test logic reading from the value.

    ~I’ll open a PR with the fix. I currently have a commit in #21528 to refactor this test, I’m going to pull it out into a standalone PR with the additional locking.~ (No longer going to do this based on Marco’s comment below)

  5. MarcoFalke commented at 6:55 pm on June 15, 2021: member
    I think two thread accessing the same python symbol is impossible in the “normal” python due to the GIL (python global interpreter lock), so I think the p2p_lock exists only to look nice, but is practically useless. If there is a racy test it should be fixed by adding a synchronisation primitive (like a notification, flush, polling, …) potentially in combination with mocktime.
  6. amitiuttarwar referenced this in commit 94a0328f85 on Jun 21, 2021
  7. amitiuttarwar referenced this in commit 22957432f7 on Jun 22, 2021
  8. amitiuttarwar referenced this in commit 6168eb06b2 on Jun 22, 2021
  9. amitiuttarwar commented at 1:26 am on June 22, 2021: contributor
    I’m pretty convinced the issue is because of the poisson distribution of the m_next_addr_send timer. I opened #22306 with the mocktime bump + some other refactoring to the test file.
  10. MarcoFalke closed this on Jun 24, 2021

  11. MarcoFalke referenced this in commit bfa885898a on Jun 24, 2021
  12. josibake referenced this in commit 081aea34ca on Jul 21, 2021
  13. DrahtBot locked this on Aug 18, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-17 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me