test: expect that files may disappear from /proc/PID/fd/ #31614

pull vasild wants to merge 1 commits into bitcoin:master from vasild:expect_file_not_found_in_get_socket_inodes changing 1 files +6 −3
  1. vasild commented at 10:45 am on January 7, 2025: contributor

    get_socket_inodes() calls os.listdir() and then iterates on the results using os.readlink(). However a file may disappear from the directory after os.listdir() and before os.readlink() resulting in a FileNotFoundError exception.

    It is expected that this may happen for bitcoind which is running and could open or close files or sockets at any time. Thus ignore the FileNotFoundError exception.

  2. test: expect that files may disappear from /proc/PID/fd/
    `get_socket_inodes()` calls `os.listdir()` and then iterates on the
    results using `os.readlink()`. However a file may disappear from the
    directory after `os.listdir()` and before `os.readlink()` resulting in a
    `FileNotFoundError` exception.
    
    It is expected that this may happen for `bitcoind` which is running and
    could open or close files or sockets at any time. Thus ignore the
    `FileNotFoundError` exception.
    b2e9fdc00f
  3. DrahtBot commented at 10:45 am on January 7, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31614.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK theuni, arejula27, hodlinator, sipa, achow101
    Concept ACK luke-jr

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  4. DrahtBot added the label Tests on Jan 7, 2025
  5. vasild commented at 10:47 am on January 7, 2025: contributor

    A failure like this observed in #31349 in CI job https://github.com/bitcoin/bitcoin/actions/runs/12633511389/job/35241370470?pr=31349:

     02025-01-07T09:20:07.629000Z TestFramework (INFO): Bind test for ['[::1]']
     12025-01-07T09:20:08.155000Z TestFramework (ERROR): Unexpected exception caught during testing
     2Traceback (most recent call last):
     3  File "/home/runner/work/_temp/test/functional/test_framework/test_framework.py", line 135, in main
     4    self.run_test()
     5  File "/home/runner/work/_temp/ci/scratch/build-x86_64-pc-linux-gnu/test/functional/rpc_bind.py", line 99, in run_test
     6    self._run_loopback_tests()
     7  File "/home/runner/work/_temp/ci/scratch/build-x86_64-pc-linux-gnu/test/functional/rpc_bind.py", line 126, in _run_loopback_tests
     8    self.run_bind_test(['[::1]'], '[::1]', ['[::1]'],
     9  File "/home/runner/work/_temp/ci/scratch/build-x86_64-pc-linux-gnu/test/functional/rpc_bind.py", line 45, in run_bind_test
    10    assert_equal(set(get_bind_addrs(pid)), set(expected))
    11                     ^^^^^^^^^^^^^^^^^^^
    12  File "/home/runner/work/_temp/test/functional/test_framework/netutil.py", line 84, in get_bind_addrs
    13    inodes = get_socket_inodes(pid)
    14             ^^^^^^^^^^^^^^^^^^^^^^
    15  File "/home/runner/work/_temp/test/functional/test_framework/netutil.py", line 40, in get_socket_inodes
    16    target = os.readlink(os.path.join(base, item))
    17             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    18FileNotFoundError: [Errno 2] No such file or directory: '/proc/18148/fd/19'
    
  6. luke-jr approved
  7. luke-jr commented at 4:20 am on January 8, 2025: member
    crACK
  8. theuni approved
  9. theuni commented at 4:40 pm on January 8, 2025: member

    utACK b2e9fdc00f5c40c241a37739f7b73b74c2181e39

    I looked around for a way to iterate on a cached list using os.scandir, but one doesn’t seems to exist. So this makes sense to me.

  10. DrahtBot requested review from luke-jr on Jan 8, 2025
  11. arejula27 commented at 2:04 pm on February 6, 2025: none

    ACK b2e9fdc

    The issue is that the process of getting the list of files in the directory and then reading each one is not atomic. This can lead to race conditions. The proposed solution doesn’t fully fix the problem but avoids its consequences, preventing the test from failing when it shouldn’t.

    Following @theuni’s idea, I tried exploring other solutions using scandir:

    0for item in os.scandir(base):
    1        if item.is_symlink():
    2            target = os.readlink(os.path.join(base, item))
    3            if target.startswith('socket:'):
    4                inodes.append(int(target[8:-1]))
    5    return inodes
    

    This approach also works and and satisfy the tests test/functional/feature_bind_extra.py where the suggested change is called, but since is_symlink and readlink do not happen atomically, the issue still exists. That said, the proposed solution is simple and effective.

  12. hodlinator approved
  13. hodlinator commented at 9:06 pm on February 7, 2025: contributor

    ACK b2e9fdc00f5c40c241a37739f7b73b74c2181e39

    Concept

    Since there is no trivial platform-independent way in Python to take a snapshot of the filesystem to my knowledge, suppressing exceptions stemming from file descriptors being closed right under our noses seems like the best solution.

    Testing

    Modified test reported to have failed on CI to provoke the issue.

     0diff --git a/test/functional/rpc_bind.py b/test/functional/rpc_bind.py
     1index 69afd45b9a..033612a9eb 100755
     2--- a/test/functional/rpc_bind.py
     3+++ b/test/functional/rpc_bind.py
     4@@ -40,10 +40,14 @@ class RPCBindTest(BitcoinTestFramework):
     5             base_args += ['-rpcallowip=' + x for x in allow_ips]
     6         binds = ['-rpcbind='+addr for addr in addresses]
     7         self.nodes[0].rpchost = connect_to
     8-        self.start_node(0, base_args + binds)
     9-        pid = self.nodes[0].process.pid
    10-        assert_equal(set(get_bind_addrs(pid)), set(expected))
    11-        self.stop_nodes()
    12+        for i in range(1, 100):
    13+            print(f"\nOuter {i}")
    14+            self.nodes[0].start(base_args + binds)
    15+            pid = self.nodes[0].process.pid
    16+            while set(get_bind_addrs(pid)) != set(expected):
    17+                print(".", end='')
    18+            self.nodes[0].wait_for_rpc_connection()
    19+            self.stop_nodes()
    20 
    21     def run_invalid_bind_test(self, allow_ips, addresses):
    22         '''
    

    Code very often succeeds, but fails within a minute when running without the change from this PR.

     0₿ build/test/functional/rpc_bind.py 
     12025-02-07T21:00:40.605000Z TestFramework (INFO): PRNG seed is: 6821738029737765731
     22025-02-07T21:00:40.606000Z TestFramework (INFO): Initializing test directory /tmp/nix-shell-92894-0/nix-shell-159962-0/bitcoin_func_test_jdmurpof
     32025-02-07T21:00:40.606000Z TestFramework (INFO): Check for ipv6
     42025-02-07T21:00:40.606000Z TestFramework (INFO): Check for non-loopback interface
     52025-02-07T21:00:40.606000Z TestFramework (INFO): Bind test for []
     6
     7Outer 1
     8.............................
     9Outer 2
    10..............................
    11Outer 3
    12.............................
    13Outer 4
    14.............................
    15Outer 5
    16.........................
    17Outer 6
    18.............................
    19Outer 7
    20............................2025-02-07T21:00:43.678000Z TestFramework (ERROR): Unexpected exception caught during testing
    21Traceback (most recent call last):
    22  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 135, in main
    23    self.run_test()
    24  File "/home/hodlinator/b2/build/test/functional/rpc_bind.py", line 103, in run_test
    25    self._run_loopback_tests()
    26  File "/home/hodlinator/b2/build/test/functional/rpc_bind.py", line 124, in _run_loopback_tests
    27    self.run_bind_test(None, '127.0.0.1', [],
    28  File "/home/hodlinator/b2/build/test/functional/rpc_bind.py", line 47, in run_bind_test
    29    while set(get_bind_addrs(pid)) != set(expected):
    30              ^^^^^^^^^^^^^^^^^^^
    31  File "/home/hodlinator/b2/test/functional/test_framework/netutil.py", line 84, in get_bind_addrs
    32    inodes = get_socket_inodes(pid)
    33             ^^^^^^^^^^^^^^^^^^^^^^
    34  File "/home/hodlinator/b2/test/functional/test_framework/netutil.py", line 40, in get_socket_inodes
    35    target = os.readlink(os.path.join(base, item))
    36             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    37FileNotFoundError: [Errno 2] No such file or directory: '/proc/290417/fd/13'
    
  14. sipa commented at 9:19 pm on February 7, 2025: member
    utACK b2e9fdc00f5c40c241a37739f7b73b74c2181e39
  15. maflcko added this to the milestone 29.0 on Feb 8, 2025
  16. achow101 commented at 11:32 pm on February 10, 2025: member
    ACK b2e9fdc00f5c40c241a37739f7b73b74c2181e39
  17. achow101 merged this on Feb 10, 2025
  18. achow101 closed this on Feb 10, 2025

  19. vasild deleted the branch on Feb 11, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-02-22 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me