Fix startup failure with RLIM_INFINITY fd limits #34937

pull Sjors wants to merge 3 commits into bitcoin:master from Sjors:2026/03/file-descriptor-limit changing 5 files +89 −14
  1. Sjors commented at 12:49 PM on March 27, 2026: member

    When setting the fd limit to unlimited, the node fails to start:

    ulimit -n unlimited
    build/bin/bitcoind
    Error: Not enough file descriptors available. -1 available, 160 required.
    

    This was caused by RaiseFileDescriptorLimit() (introduced in #2568) casting limitFD.rlim_cur to int, which for RLIM_INFINITY overflows to -1. Fix it by returning std::numeric_limits<int>::max() instead.

    Some platforms implement RLIM_INFINITY as the maximum uint64, others as int64 (-1). So simply changing the return type to uint64_t wouldn't work.

    Similarly, though unlikely to actually happen:

    ulimit -n 214748364
    build/bin/bitcoind
    Error: Not enough file descriptors available. -2147483648 available, 160 required.
    

    The second commit expands the fix by clamping all values above std::numeric_limits<int>::max() instead of letting them overflow.

    This PR also expands test/functional/feature_init.py to cover these, using resource.setrlimit. The check is skipped on environments with a hard limit below infinity (or that don't have the Python Resource module).

    macOS by default has a hard limit of infinity, but on e.g. Ubuntu the default hard limit is 524288.

    The third commit applies a similar fix to PosixLockedPageAllocator::GetLimit() for 32-bit systems, but without a test.

  2. DrahtBot commented at 12:50 PM on March 27, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK winterrdog, sedited

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34970 (init: add -test=pause_load_mempool, test mempool save before loaded by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  3. Sjors force-pushed on Mar 27, 2026
  4. DrahtBot added the label CI failed on Mar 27, 2026
  5. Sjors commented at 1:00 PM on March 27, 2026: member

    In case you wonder how I ran into this...

    While playing with the OpenCode and Claude terminal applications (on macOS), I noticed they started setting ulimit -n ... to get the functional tests running.

    It turns out these both use the Bun JavaScript runtime which for some reason sets RLIM_INFINITY.

    To see for yourself, create ulimit-demo.ts with:

    const proc = Bun.spawn(["sh", "-c", "ulimit -n"], {
        stdout: "pipe",
        stderr: "pipe",
    });
    
    const output = await new Response(proc.stdout).text();
    await proc.exited;
    
    console.log("ulimit -n:", output.trim());
    
    % ulimit -n
    256
    $ % bun ulimit-demo.ts
    ulimit -n: unlimited
    

    Codex terminal is written in Rust and just keeps the system values.

    Codex UI on the other hand returns 1048575, just like VSCode, which is Electron behavior.

  6. in src/init.cpp:1047 in 49a2ce1dcb
    1041 | @@ -1041,7 +1042,9 @@ bool AppInitParameterInteraction(const ArgsManager& args)
    1042 |      int min_required_fds = MIN_CORE_FDS + MAX_ADDNODE_CONNECTIONS + nBind;
    1043 |  
    1044 |      // Try raising the FD limit to what we need (available_fds may be smaller than the requested amount if this fails)
    1045 | -    available_fds = RaiseFileDescriptorLimit(user_max_connection + max_private + min_required_fds);
    1046 | +    available_fds = std::min<uint64_t>(
    1047 | +        RaiseFileDescriptorLimit(user_max_connection + max_private + min_required_fds),
    1048 | +        std::numeric_limits<int>::max());
    


    Sjors commented at 1:23 PM on March 27, 2026:

    A lot of downstream code uses int. Changing all that would be a massive diff.

    An alternative approach is to keep RaiseFileDescriptorLimit use int and then have it special case RLIM_INFINITY.

    Not sure where the right boundary is.

    Note that PosixLockedPageAllocator::GetLimit() uses size_t.

  7. DrahtBot removed the label CI failed on Mar 27, 2026
  8. in src/util/fs_helpers.cpp:161 in 49a2ce1dcb outdated
     163 |  #else
     164 |      struct rlimit limitFD;
     165 |      if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
     166 | -        if (limitFD.rlim_cur < (rlim_t)nMinFD) {
     167 | -            limitFD.rlim_cur = nMinFD;
     168 | +        if (limitFD.rlim_cur < static_cast<rlim_t>(min_fd)) {
    


    luke-jr commented at 2:34 PM on March 27, 2026:

    Won't this still attempt to "raise" RLIM_INFINITY?


    Sjors commented at 2:49 PM on March 27, 2026:

    I don't think it would:

    rlim_cur < static_cast<rlim_t>(min_fd))
      ^                               ^----- number we requested
      | 
      | RLIM_INFINITY
    

    So if(...) is false.


    luke-jr commented at 3:41 PM on March 27, 2026:

    RLIM_INFINITY is typically (but not guaranteed to be) -1, which is less than any requested limit...


    Sjors commented at 3:46 PM on March 27, 2026:

    On macOS it's:

    #define RLIM_INFINITY   (((__uint64_t)1 << 63) - 1)     /* no limit */
    

    But if other platforms define it as -1 (signed integer) then that's indeed a problem.


    Sjors commented at 4:01 PM on March 27, 2026:

    In any case, we shouldn't make assumptions about how rlim_t is implemented. I'll try a different approach.

  9. Sjors marked this as a draft on Mar 27, 2026
  10. Sjors force-pushed on Mar 27, 2026
  11. Sjors marked this as ready for review on Mar 27, 2026
  12. Sjors commented at 4:33 PM on March 27, 2026: member

    I switched the return type back to int and instead added an early return std::numeric_limits<int>::max() for RLIM_INFINITY. See #34937 (review)

  13. Sjors force-pushed on Mar 27, 2026
  14. Sjors commented at 5:30 PM on March 27, 2026: member

    Added a second commit for the (very unlikely in practice) int overflow case.

  15. Sjors commented at 6:32 PM on March 27, 2026: member

    Applied a similar fix to PosixLockedPageAllocator::GetLimit(), in this case because _FILE_OFFSET_BITS=64 makes rlim_t 64-bit whereas size_t might only be 32-bit. I did not bother with a test for this one, nor do I have recipe to hit the condition.

  16. in src/util/fs_helpers.cpp:162 in 91b7198573 outdated
     164 |      struct rlimit limitFD;
     165 |      if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
     166 | -        if (limitFD.rlim_cur < (rlim_t)nMinFD) {
     167 | -            limitFD.rlim_cur = nMinFD;
     168 | +        if (limitFD.rlim_cur == RLIM_INFINITY ||
     169 | +            limitFD.rlim_cur >= static_cast<rlim_t>(std::numeric_limits<int>::max())) {
    


    luke-jr commented at 6:38 PM on March 27, 2026:

    The cast could mess things up if sizeof(int) > sizeof(rlim_t)


    Sjors commented at 6:52 PM on March 27, 2026:

    Is that possible? IIUC sizeof(int) is at least 32 bit on 32-bit systems ~and at least 64 bit on 64-bit systems.~

    Whereas rlim_t is either 32 or 64 bit: https://sourceware.org/git/?p=glibc.git;a=blob;f=bits/resource.h;h=c2d25794d67a4eb574776c2ec77e3a92342851c0;hb=refs/heads/release/2.31/master#l94 and it's 64 bit for us because of _FILE_OFFSET_BITS=64.

    I ~could~ added a static_assert to check.


    Sjors commented at 7:18 PM on March 27, 2026:

    There's also some constraints enforced by compat/assumptions.h, but not this specific check.


    luke-jr commented at 8:00 PM on March 27, 2026:

    sizeof(int) is only guaranteed to be 15-bit, but typically 31-bit.

    When sizeof(rlim_t) is 64-bit (also normal), we already hit a casting issue...


    luke-jr commented at 8:16 PM on March 27, 2026:

    std::cmp_greater_equal feels like the right tool for this job

  17. Sjors force-pushed on Mar 27, 2026
  18. DrahtBot added the label CI failed on Mar 27, 2026
  19. Sjors commented at 7:02 PM on March 27, 2026: member

    Added static_assert(std::numeric_limits<rlim_t>::digits >= std::numeric_limits<int>::digits); to see if that assumption at least holds for all our CI platforms, see #34937 (review)

  20. DrahtBot removed the label CI failed on Mar 27, 2026
  21. luke-jr commented at 9:15 PM on March 27, 2026: member

    Dotting all the 'i's, I ended up with this:

    diff --git a/src/support/lockedpool.cpp b/src/support/lockedpool.cpp
    index 0841802e753..4b683e35e0b 100644
    --- a/src/support/lockedpool.cpp
    +++ b/src/support/lockedpool.cpp
    @@ -261,14 +261,12 @@ void PosixLockedPageAllocator::FreeLocked(void* addr, size_t len)
     size_t PosixLockedPageAllocator::GetLimit()
     {
     #ifdef RLIMIT_MEMLOCK
    -    static_assert(std::numeric_limits<rlim_t>::digits >= std::numeric_limits<size_t>::digits);
         struct rlimit rlim;
         if (getrlimit(RLIMIT_MEMLOCK, &rlim) == 0) {
    -        if (rlim.rlim_cur == RLIM_INFINITY ||
    -            rlim.rlim_cur >= static_cast<rlim_t>(std::numeric_limits<size_t>::max())) {
    -            return std::numeric_limits<size_t>::max();
    +        if (rlim.rlim_cur != RLIM_INFINITY &&
    +            std::cmp_less_equal(rlim.rlim_cur, static_cast<rlim_t>(std::numeric_limits<size_t>::max()))) {
    +            return rlim.rlim_cur;
             }
    -        return rlim.rlim_cur;
         }
     #endif
         return std::numeric_limits<size_t>::max();
    diff --git a/src/util/fs_helpers.cpp b/src/util/fs_helpers.cpp
    index 1cf85e6ad13..10c7cc61266 100644
    --- a/src/util/fs_helpers.cpp
    +++ b/src/util/fs_helpers.cpp
    @@ -164,25 +164,24 @@ int RaiseFileDescriptorLimit(int min_fd)
     #if defined(WIN32)
         return 2048;
     #else
    -    static_assert(std::numeric_limits<rlim_t>::digits >= std::numeric_limits<int>::digits);
         struct rlimit limitFD;
         if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
    +        if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
    +            const auto current_limit{limitFD.rlim_cur};
    +            limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;
    +            if (limitFD.rlim_max != RLIM_INFINITY && (limitFD.rlim_cur == RLIM_INFINITY || limitFD.rlim_cur > limitFD.rlim_max)) {
    +                limitFD.rlim_cur = limitFD.rlim_max;
    +            }
    +            if (current_limit != limitFD.rlim_cur) {
    +                setrlimit(RLIMIT_NOFILE, &limitFD);
    +                getrlimit(RLIMIT_NOFILE, &limitFD);
    +            }
    +        }
             if (limitFD.rlim_cur == RLIM_INFINITY ||
    -            limitFD.rlim_cur >= static_cast<rlim_t>(std::numeric_limits<int>::max())) {
    -            // Some platforms implement RLIM_INFINITY as the maximum uint64,
    -            // others as int64 (-1). Avoid casting even if the return type
    -            // is changed to uint64_t. We also cap unlikely but possible values
    -            // that would overflow int.
    +            std::cmp_greater_equal(limitFD.rlim_cur, std::numeric_limits<int>::max())) {
                 return std::numeric_limits<int>::max();
             }
    -        if (limitFD.rlim_cur < static_cast<rlim_t>(min_fd)) {
    -            limitFD.rlim_cur = static_cast<rlim_t>(min_fd);
    -            if (limitFD.rlim_cur > limitFD.rlim_max)
    -                limitFD.rlim_cur = limitFD.rlim_max;
    -            setrlimit(RLIMIT_NOFILE, &limitFD);
    -            getrlimit(RLIMIT_NOFILE, &limitFD);
    -        }
    -        return limitFD.rlim_cur;
    +        return static_cast<int>(limitFD.rlim_cur);
         }
         return min_fd; // getrlimit failed, assume it's fine
     #endif
    
  22. winterrdog commented at 9:35 AM on March 28, 2026: none

    Really solid work with @luke-jr 's patch.

    I was looking at the RaiseFileDescriptorLimit logic from the outside and had one small nit to toss in for review. Specifically on this line in the patch:

    limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;

    I noticed that if min_fd is out of range for the platform's rlim_t, the local state is momentarily set to RLIM_INFINITY before the subsequent if block pulls it back down to rlim_max.

    While i realize this state isn't committed via setrlimit until later, it might be cleaner to fallback directly to limitFD.rlim_max inside the ternary. it keeps the local rlim_cur in a consistently valid state throughout the function and makes the logic more linear for the reader: we go from 'Current' to 'Target/Max' in one step rather than 'Current' -> 'Infinity' -> 'Max'.

    int RaiseFileDescriptorLimit(int min_fd) {
    #if defined(WIN32)
         return 2048;
    #else
         struct rlimit limitFD;
         if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
              if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
                   const auto current_limit{limitFD.rlim_cur};
                   
    -              limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;
    +              limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
    -              if (limitFD.rlim_max != RLIM_INFINITY && (limitFD.rlim_cur == RLIM_INFINITY || limitFD.rlim_cur > limitFD.rlim_max)) {
    +              if (limitFD.rlim_max != RLIM_INFINITY && limitFD.rlim_cur > limitFD.rlim_max) {
                        limitFD.rlim_cur = limitFD.rlim_max;
                   }
                   if (current_limit != limitFD.rlim_cur) {
                        setrlimit(RLIMIT_NOFILE, &limitFD);
                        getrlimit(RLIMIT_NOFILE, &limitFD);
                   }
              }
              if (limitFD.rlim_cur == RLIM_INFINITY ||
                   std::cmp_greater_equal(limitFD.rlim_cur, std::numeric_limits<int>::max())) {
                   return std::numeric_limits<int>::max();
              }
              return static_cast<int>(limitFD.rlim_cur);
         }
         return min_fd; // getrlimit failed, assume it's fine
    }
    

    what do you think about flattening that logic?

  23. Sjors commented at 7:49 AM on March 30, 2026: member

    @luke-jr thanks for that patch, I'll look into incorporating that and then I'll also respond to @winterrdog.

  24. Sjors marked this as a draft on Mar 30, 2026
  25. Sjors force-pushed on Mar 31, 2026
  26. Sjors commented at 11:27 AM on March 31, 2026: member

    I took @luke-jr's patch, with small adjustments.

    The patch has two advantages:

    1. It uses std::cmp_less_equal (c++20) to safely deal with both signed and unsigned implementations of rlim_cur, which allows using it directly in comparisons. In the previous 3cd0cbdc9e862d88ec825a121459a342d9e2b7e1 I used an early return instead.

    2. It uses std::in_range so that on systems where sizeof(int) > sizeof(rlim_t), out of range values are clamped to RLIM_INFINITY. This should not be confused with the overflow fix in the second commit.

    The adjustments I made:

    1. start with Assert(min_fd >= 0): although std::cmp_less makes it safe, a negative number of file descriptors is impossible.

    2. wrap the std::in_range from (2) in Assume, since requesting that many file descriptors is a bug.

    I also added comments to above checks that were counter intuitive / not obvious to me but do seem correct.

    From the discussion against an earlier version:

    The cast could mess things up if sizeof(int) > sizeof(rlim_t)

    Such architecture does not exist on our CI, as demonstrated by the static_assert in af56e801fc0885dca4cfd626dc2b419c97830914), but ideally should still behave.

    If I understand correctly, with std::cmp_less taking care of handling signed vs unsigned, comparisons are safe on such architecture. The underlying < / > widens the smallest value. What does need special care, is casting, which std::in_range performs in a safe way.

    So my static_assert checks are no longer needed (and dropped). @winterrdog I did not take your suggestion, but hopefully the newly sprinkled comments make it clear enough to follow along with the jump to RLIM_INFINITY and back.

  27. Sjors marked this as ready for review on Mar 31, 2026
  28. Sjors commented at 11:44 AM on March 31, 2026: member

    On second thought this would be a nice simplification:

    --- a/src/util/fs_helpers.cpp
    +++ b/src/util/fs_helpers.cpp
    @@ -164,10 +164,8 @@ int RaiseFileDescriptorLimit(int min_fd)
             if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
                 const auto current_limit{limitFD.rlim_cur};
    -            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;
    +            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
                 // Don't raise soft limit beyond hard limit
                 if (limitFD.rlim_max != RLIM_INFINITY && (
    -                limitFD.rlim_cur > limitFD.rlim_max ||
    -                // Value outside rlim_t range was mapped to infinity
    -                limitFD.rlim_cur == RLIM_INFINITY
    +                limitFD.rlim_cur > limitFD.rlim_max
                     )
                 ) {
    

    Doesn't break any test, but let me ponder it a bit more...

  29. Sjors force-pushed on Mar 31, 2026
  30. Sjors force-pushed on Mar 31, 2026
  31. Sjors commented at 12:51 PM on March 31, 2026: member

    Ended up taking the patch.

  32. DrahtBot added the label CI failed on Mar 31, 2026
  33. DrahtBot removed the label CI failed on Mar 31, 2026
  34. in test/functional/feature_init.py:337 in bc10c5b2d1
     332 | +        except (ValueError, OSError):
     333 | +            self.log.info(f"Skipping rlimit test: cannot set soft limit (hard={hard})")
     334 | +            return
     335 | +        try:
     336 | +            self.restart_node(1)
     337 | +            self.log.debug("Node started successfully with RLIM_INFINITY limit (soft={limit})")
    


    winterrdog commented at 1:03 AM on April 4, 2026:

    small catch: this debug log is missing the f-string prefix, so {limit} is not being interpolated.

                self.log.debug(f"Node started successfully with RLIM_INFINITY limit (soft={limit})")
    

    caught it when i spotted this in the test's temporary logs, after running the tests locally:

    2026-04-04T00:38:24.154835Z TestFramework (DEBUG): Node started successfully with RLIM_INFINITY limit (soft={limit})
    

    Sjors commented at 2:59 PM on April 7, 2026:

    Thanks, fixed.

  35. winterrdog commented at 7:30 PM on April 6, 2026: none

    I noticed that the existing init_rlimit tests often skip on local environments because they request values (RLIM_INFINITY or 1 << 31) that exceed the system's hard limit. While these are excellent for testing integer safety and constants, they don't always exercise the actual setrlimit syscall in practice.

    I've drafted a "boundary" test to complement the existing ones. It queries the current rlim_max and requests that exact value, making sure we have a functional happy path that is guaranteed to run without skipping (provided the resource python module is present).

    diff --git a/test/functional/feature_init.py b/test/functional/feature_init.py
    index 844da8b019c..4a2d5dfbc82 100755
    --- a/test/functional/feature_init.py
    +++ b/test/functional/feature_init.py
    @@ -348,6 +348,25 @@ class InitTest(BitcoinTestFramework):
             self.log.info("Testing node startup with RLIM_INFINITY fd limit")
             self.restart_node_with_fd_limit(self.RLIM_INFINITY)
    
    +    def init_rlimit_boundary_test(self):
    +        """Test behavior at the actual system boundaries (hard limit and mid-point)."""
    +        if self.RLIM_INFINITY is None:
    +            self.log.info("Skipping: resource module not available")
    +            return
    +
    +        import resource
    +        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    +
    +        # Test exactly at the hard limit. This is the 'maximum' safe value the system can handle.
    +        self.log.info(f"Testing node startup with fd limit at hard limit ({hard})")
    +        self.restart_node_with_fd_limit(hard)
    +
    +        # Test a sane increment (mid-point between soft and hard). We only do this if there is actually room to grow.
    +        if soft < hard:
    +            target = soft + (hard - soft) // 2
    +            self.log.info(f"Testing node startup with partial increase to {target}")
    +            self.restart_node_with_fd_limit(target)
    +
         def init_rlimit_large_test(self):
             """Test that bitcoind starts correctly when the soft RLIMIT_NOFILE limit is above INT_MAX."""
             if self.RLIM_INFINITY is None:
    @@ -365,6 +384,7 @@ class InitTest(BitcoinTestFramework):
             self.init_empty_test()
             self.init_rlimit_test()
             self.init_rlimit_large_test()
    +        self.init_rlimit_boundary_test()
    
    
    if __name__ == '__main__':
    

    after i made the changes above and ran the tests locally with:

    ./build/test/functional/test_runner.py ./build/test/functional/feature_init.py --loglevel=debug --nocleanup
    

    <details> <summary>these are logs (an excerpt) i got</summary>

    ...
    2026-04-06T19:07:56.187614Z TestFramework.node1 (DEBUG): RPC successfully started
    2026-04-06T19:07:56.189509Z TestFramework (INFO): Testing node startup with RLIM_INFINITY fd limit
    2026-04-06T19:07:56.189673Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
    2026-04-06T19:07:56.189778Z TestFramework (INFO): Testing node startup with fd limit above INT_MAX
    2026-04-06T19:07:56.189870Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
    2026-04-06T19:07:56.189959Z TestFramework (INFO): Testing node startup with fd limit at hard limit (1048576)
    2026-04-06T19:07:56.190048Z TestFramework.node1 (DEBUG): Stopping node
    2026-04-06T19:07:56.696453Z TestFramework.node1 (DEBUG): Node stopped
    2026-04-06T19:07:56.699031Z TestFramework.node1 (DEBUG): bitcoind started, waiting for RPC to come up
    2026-04-06T19:07:57.209972Z TestFramework.node1 (DEBUG): RPC successfully started
    2026-04-06T19:07:57.210359Z TestFramework (DEBUG): Node started successfully with RLIM_INFINITY limit (soft=1048576)
    2026-04-06T19:07:57.210566Z TestFramework (DEBUG): Restored previous RLIMIT_NOFILE limits (soft=2048, hard=1048576)
    2026-04-06T19:07:57.210738Z TestFramework (INFO): Testing node startup with partial increase to 525312
    2026-04-06T19:07:57.210915Z TestFramework.node1 (DEBUG): Stopping node
    2026-04-06T19:07:57.617166Z TestFramework.node1 (DEBUG): Node stopped
    2026-04-06T19:07:57.619702Z TestFramework.node1 (DEBUG): bitcoind started, waiting for RPC to come up
    2026-04-06T19:07:58.385996Z TestFramework.node1 (DEBUG): RPC successfully started
    2026-04-06T19:07:58.386381Z TestFramework (DEBUG): Node started successfully with RLIM_INFINITY limit (soft=525312)
    2026-04-06T19:07:58.386591Z TestFramework (DEBUG): Restored previous RLIMIT_NOFILE limits (soft=2048, hard=1048576)
    2026-04-06T19:07:58.386772Z TestFramework (DEBUG): Closing down network thread
    2026-04-06T19:07:58.437955Z TestFramework (INFO): Stopping nodes
    2026-04-06T19:07:58.438363Z TestFramework.node1 (DEBUG): Stopping node
    2026-04-06T19:07:58.794059Z TestFramework.node1 (DEBUG): Node stopped
    2026-04-06T19:07:58.794547Z TestFramework (WARNING): Not cleaning up dir /tmp/test_runner_₿_🏃_20260406_220701/feature_init_0
    2026-04-06T19:07:58.794818Z TestFramework (INFO): Tests successful
    

    </details>

    Motivation:

    • coverage: makes sure the setrlimit call is actually executed rather than bypassed by a skip.
    • verification: confirms the node remains stable when the soft limit is pushed to the absolute maximum allowed by the OS.

    happy to leave this if you feel the current edge cases are sufficient, but I thought it might be a nice addition for stronger local testing.

  36. Sjors force-pushed on Apr 7, 2026
  37. Sjors commented at 2:59 PM on April 7, 2026: member

    @winterrdog I think it's enough to just test the original issue on a subset of CI machines, so I won't include the extra test. I did take the fix here: #34937 (review)

  38. luke-jr referenced this in commit 0f92fc907f on Apr 13, 2026
  39. luke-jr referenced this in commit 003324bfb8 on Apr 13, 2026
  40. luke-jr referenced this in commit d94b2ee5c0 on Apr 13, 2026
  41. luke-jr referenced this in commit 1953393f48 on Apr 14, 2026
  42. luke-jr referenced this in commit 675e2ba3d9 on Apr 14, 2026
  43. in src/util/fs_helpers.cpp:157 in 101de678a8 outdated
     157 | - * It returns the actual file descriptor limit (which may be more or less than nMinFD)
     158 | - */
     159 | -int RaiseFileDescriptorLimit(int nMinFD)
     160 | +int RaiseFileDescriptorLimit(int min_fd)
     161 |  {
     162 | +    Assert(min_fd >= 0);
    


    winterrdog commented at 10:17 PM on April 16, 2026:

    while auditing this line, i began to wonder: is there a practical "floor" we should be enforcing rather than allowing the process to attempt a start with values that guarantee a crash on non-Windows systems?

    while manually incrementing limits during local functional tests, i realised that setting min_fd to values between 0 and 10 consistently triggers a failure. specifically, if the environment provides fewer than 5 descriptors, the node cannot even initialise its basic logging or database handles before the OS kernel denies further requests.

    i observed that the functional test framework itself requires a slightly higher overhead to capture stdout, but the core issue remains: a limit that is technically "non-negative" can still be physically impossible for a bitcoin node to survive.

    through manual isolation, i've identified two distinct failure thresholds:

    • test framework's floor (11 fds): required for the python test runner to successfully pipe stdout and manage the node process.
    • binary floor (≤ 4 fds): below this value, the bitcoind binary fails to start entirely, as it cannot satisfy basic requirements for standard i/o and its primary log file/entropy source.

    proof of crash

    <details open> <summary>the test i used.</summary>

    --- a/test/functional/feature_init.py
    +++ b/test/functional/feature_init.py
    @@ -348,6 +349,15 @@ class InitTest(BitcoinTestFramework):
    +    def init_rlimit_too_low(self):
    +        """Test that bitcoind behaves predictably when the limit is too low to function."""
    +        if self.RLIM_INFINITY is None:
    +            return
    +
    +        self.log.info("Testing node startup with a very low fd limit")
    +        # current behavior: crashes with OSError 24
    +        self.restart_node_with_fd_limit(8)
    +
         def run_test(self):
             self.init_rlimit_test()
             self.init_rlimit_large_test()
    +        self.init_rlimit_too_low()
    

    </details>

    <details> <summary>my logs after the crash.</summary>

    ...
    2026-04-16T18:49:47.300353Z TestFramework.node1 (DEBUG): RPC successfully started
    2026-04-16T18:49:47.301798Z TestFramework (INFO): Testing node startup with RLIM_INFINITY fd limit
    2026-04-16T18:49:47.302125Z TestFramework (DEBUG): Current RLIMIT_NOFILE limits (soft=2048, hard=1048576), trying to set soft limit to -1
    2026-04-16T18:49:47.302377Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
    2026-04-16T18:49:47.302592Z TestFramework (INFO): Testing node startup with fd limit above INT_MAX
    2026-04-16T18:49:47.302776Z TestFramework (DEBUG): Current RLIMIT_NOFILE limits (soft=2048, hard=1048576), trying to set soft limit to 2147483648
    2026-04-16T18:49:47.302951Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
    2026-04-16T18:49:47.303120Z TestFramework (INFO): Testing node startup with very low fd limit
    2026-04-16T18:49:47.303285Z TestFramework (DEBUG): Current RLIMIT_NOFILE limits (soft=2048, hard=1048576), trying to set soft limit to 8
    2026-04-16T18:49:47.303530Z TestFramework.node1 (DEBUG): Stopping node
    2026-04-16T18:49:47.708177Z TestFramework.node1 (DEBUG): Node stopped
    2026-04-16T18:49:47.708435Z TestFramework (DEBUG): Restored previous RLIMIT_NOFILE limits (soft=2048, hard=1048576)
    2026-04-16T18:49:47.708521Z TestFramework (ERROR): Unexpected exception:
    Traceback (most recent call last):
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_framework.py", line 144, in main
        self.run_test()
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/build/test/functional/feature_init.py", line 379, in run_test
        self.init_rlimit_neg()
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/build/test/functional/feature_init.py", line 360, in init_rlimit_neg
        self.restart_node_with_fd_limit(8)
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/build/test/functional/feature_init.py", line 337, in restart_node_with_fd_limit
        self.restart_node(1)
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_framework.py", line 548, in restart_node
        self.start_node(i, extra_args)
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_framework.py", line 504, in start_node
        node.start(*args, **kwargs)
      File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_node.py", line 279, in start
        stdout = tempfile.NamedTemporaryFile(dir=self.stdout_dir, delete=False)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/tempfile.py", line 721, in NamedTemporaryFile
        file = _io.open(dir, mode, buffering=buffering,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/tempfile.py", line 718, in opener
        fd, name = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/tempfile.py", line 395, in _mkstemp_inner
        fd = _os.open(file, flags, 0o600)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    OSError: [Errno 24] Too many open files: '/tmp/test_runner_₿_🏃_20260416_214851/feature_init_0/node1/stdout/tmpaekl08vn'
    2026-04-16T18:49:47.710279Z TestFramework (DEBUG): Closing down network thread
    2026-04-16T18:49:47.760813Z TestFramework (INFO): Not stopping nodes as test failed. The dangling processes will be cleaned up later.
    2026-04-16T18:49:47.761167Z TestFramework (WARNING): Not cleaning up dir /tmp/test_runner_₿_🏃_20260416_214851/feature_init_0
    2026-04-16T18:49:47.761426Z TestFramework (ERROR): Test failed. Test logging available at /tmp/test_runner_₿_🏃_20260416_214851/feature_init_0/test_framework.log
    ...
    

    </details>

    suggestion

    rather than allowing the process to hit a OSError: [Errno 24] Too many open files mid-boot, we could perhaps treat this as a validation gate. it might be more robust to notify the user that the provided limit is insufficient for basic operation, providing a "fail-fast" mechanism rather than an ungraceful crash.

    given that the node fundamentally requires a small handful of descriptors just to "breathe" (logging, leveldb, entropy), do you think it is worth adjusting the c++ logic to cater for these very low-value edge cases, or is the current assertion sufficient for our purposes?


    Sjors commented at 1:56 PM on April 23, 2026:

    allowing the process to attempt a start with values that guarantee a crash on non-Windows systems?

    The point of an Assert is that we expect this to never crash. The caller needs to make sure no value < 0 is passed in. Since RaiseFileDescriptorLimit is only called in one place, we know that is the case. If we ever add new call sites, then this assert would need to be re-evaluated.

    The number requested by the init code should be sufficient for node operations. It might be possible to get by with even less, but afaik there's no realistic environment where that's the case.

    I guess the scenario you're describing is where we don't even get to RaiseFileDescriptorLimit? I don't think that's worth worry about, assuming my assumption above is correct.


    winterrdog commented at 9:29 PM on April 23, 2026:

    The number requested by the init code should be sufficient for node operations. It might be possible to get by with even less, but afaik there's no realistic environment where that's the case.

    agreed!

    prior to calling RaiseFileDescriptorLimit, the logic in init.cpp's AppInitParameterInteraction function already aggregates user_max_connections, max_private, and min_required_fds. this calculated sum (was 176 when i ran bitcoind through GDB on Linux) ensures the request is well above the physical floor required for basic stability.

    I guess the scenario you're describing is where we don't even get to RaiseFileDescriptorLimit? I don't think that's worth worry about, assuming my assumption above is correct.

    true!

    the failure happens during initialisation before the RaiseFileDescriptorLimit is even reached. I think I was perhaps being a bit too paranoid from an adversarial POV, imagining a scenario where a malicious actor gains enough access to suffocate the environment's ulimit to 8 or less, essentially DOSing the node's ability to even log its own failure. given how unlikely that environment is in practice, i agree it's not worth the extra complexity.

    thanks for the clarification!

  44. sedited added this to the milestone 32.0 on Apr 21, 2026
  45. in src/util/fs_helpers.cpp:166 in 4b25341fc3
     169 | -            limitFD.rlim_cur = nMinFD;
     170 | -            if (limitFD.rlim_cur > limitFD.rlim_max)
     171 | +        // If the current soft limit is already higher, don't raise it
     172 | +        if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
     173 | +            const auto current_limit{limitFD.rlim_cur};
     174 | +            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
    


    sedited commented at 1:31 PM on April 21, 2026:

    Do we really want to Assume here? What is the scenario this is guarding against?


    Sjors commented at 8:05 AM on April 22, 2026:

    This is mostly a leftover from earlier versions.

    I'll drop the ternary and add a static_assert to make sure the static_cast is safe.

  46. in src/util/fs_helpers.cpp:172 in 4b25341fc3
     175 | +            // Don't raise soft limit beyond hard limit
     176 | +            if (limitFD.rlim_max != RLIM_INFINITY && (
     177 | +                limitFD.rlim_cur > limitFD.rlim_max
     178 | +                )
     179 | +            ) {
     180 |                  limitFD.rlim_cur = limitFD.rlim_max;
    


    sedited commented at 1:33 PM on April 21, 2026:

    I'm a bit confused by this. Why wouldn't we want to set this to RLIM_INFINITY if that is what the maximum value is?


    Sjors commented at 8:05 AM on April 22, 2026:

    RLIM_INFINITY is a magic number:

    1. It can be negative.
    2. It's not guaranteed to be the actual largest number. IIUC on Darwin it's INT64_MAX, i.e. less than UINT64_MAX
  47. in src/util/fs_helpers.cpp:174 in 4b25341fc3 outdated
     179 | +            ) {
     180 |                  limitFD.rlim_cur = limitFD.rlim_max;
     181 | -            setrlimit(RLIMIT_NOFILE, &limitFD);
     182 | -            getrlimit(RLIMIT_NOFILE, &limitFD);
     183 | +            }
     184 | +            if (current_limit != limitFD.rlim_cur) {
    


    sedited commented at 1:34 PM on April 21, 2026:

    Does this change guard against anything new, or is it just there to not do redundant work?


    Sjors commented at 8:09 AM on April 22, 2026:

    I think the latter, just avoiding an unnecessary write and read call. The original code had the same intention.

  48. in src/util/fs_helpers.cpp:170 in 4b25341fc3
     173 | +            const auto current_limit{limitFD.rlim_cur};
     174 | +            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
     175 | +            // Don't raise soft limit beyond hard limit
     176 | +            if (limitFD.rlim_max != RLIM_INFINITY && (
     177 | +                limitFD.rlim_cur > limitFD.rlim_max
     178 | +                )
    


    sedited commented at 1:42 PM on April 21, 2026:

    This is a bit weird to read. Can you put it all on one line (also suggested by clang-format-diff) and add parentheses to the first condition too?


    Sjors commented at 8:10 AM on April 22, 2026:

    Done

  49. Sjors force-pushed on Apr 22, 2026
  50. sedited approved
  51. sedited commented at 9:03 AM on April 22, 2026: contributor

    ACK 95b884246cedb5f612bc39a727469050e4b14d6d

  52. winterrdog commented at 9:36 AM on April 22, 2026: none

    tACK 95b884246cedb5f612bc39a727469050e4b14d6d

    code builds correctly with zero issues and functional tests (present and new) pass successfully.

    ran tests on:

    • Debian Linux on x86_64 and,
    • FreeBSD 15 on x86_64
  53. DrahtBot added the label Needs rebase on Apr 23, 2026
  54. Fix startup failure with RLIM_INFINITY fd limits
    When setting the fd limit to unlimited, the node fails to start:
    
    ulimit -n unlimited
    build/bin/bitcoind
    Error: Not enough file descriptors available. -1 available, 160 required.
    
    This was caused by RaiseFileDescriptorLimit() casting limitFD.rlim_cur
    to int, which for RLIM_INFINITY overflows to -1. Fix it by returning
    std::numeric_limits<int>::max() instead.
    
    This commit also adds a functional test, which is skipped on environments
    with a hard limit below infinity.
    
    Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
    Co-authored-by: winterrdog
    4afbabdcef
  55. init: clamp fd limits to int
    When setting the fd limit to 1 >> 31, the node fails to start:
    
    ulimit -n 214748364
    build/bin/bitcoind
    Error: Not enough file descriptors available. -2147483648 available, 160 required.
    
    Similar to the previous commit, this is fixed by capping the limit
    to std::numeric_limits<int>::max().
    
    Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
    8ab4b9fc85
  56. support: clamp RLIMIT_MEMLOCK to size_t
    On 32-bit systems we build with _FILE_OFFSET_BITS=64 (see CMakeLists.txt),
    which makes rlim_t 64-bit when building against glibc (see bits/resource.h).
    Since size_t could be 32-bit, clamp RLIMIT_MEMLOCK to
    std::numeric_limits<size_t>::max() in PosixLockedPageAllocator::GetLimit().
    
    Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
    735b25519a
  57. Sjors commented at 2:03 PM on April 23, 2026: member

    Trivial rebase after #34435.

  58. Sjors force-pushed on Apr 23, 2026
  59. DrahtBot removed the label Needs rebase on Apr 23, 2026
  60. winterrdog commented at 10:54 PM on April 23, 2026: none

    Re-ACK 735b25519aad2d3c24345c2d6f69c30d1040f473

  61. DrahtBot requested review from sedited on Apr 23, 2026
  62. sedited approved
  63. sedited commented at 8:07 AM on April 24, 2026: contributor

    Re-ACK 735b25519aad2d3c24345c2d6f69c30d1040f473


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-19 16:50 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me