Fix startup failure with RLIM_INFINITY fd limits #34937

Sjors commented at 12:49 PM on March 27, 2026: member

When setting the fd limit to unlimited, the node fails to start:

ulimit -n unlimited
build/bin/bitcoind
Error: Not enough file descriptors available. -1 available, 160 required.

This was caused by RaiseFileDescriptorLimit() (introduced in #2568) casting limitFD.rlim_cur to int, which for RLIM_INFINITY overflows to -1. Fix it by returning std::numeric_limits<int>::max() instead.

Some platforms implement RLIM_INFINITY as the maximum uint64, others as int64 (-1). So simply changing the return type to uint64_t wouldn't work.

Similarly, though unlikely to actually happen:

ulimit -n 214748364
build/bin/bitcoind
Error: Not enough file descriptors available. -2147483648 available, 160 required.

The second commit expands the fix by clamping all values above std::numeric_limits<int>::max() instead of letting them overflow.

This PR also expands test/functional/feature_init.py to cover these, using resource.setrlimit. The check is skipped on environments with a hard limit below infinity (or that don't have the Python Resource module).

macOS by default has a hard limit of infinity, but on e.g. Ubuntu the default hard limit is 524288.

The third commit applies a similar fix to PosixLockedPageAllocator::GetLimit() for 32-bit systems, but without a test.

DrahtBot commented at 12:50 PM on March 27, 2026: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	winterrdog, sedited

If your review is incorrectly listed, please copy-paste <code></code> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#35467 (db: warn about legacy chainstate table format by l0rinc)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Sjors force-pushed on Mar 27, 2026

DrahtBot added the label CI failed on Mar 27, 2026

Sjors commented at 1:00 PM on March 27, 2026: member

In case you wonder how I ran into this...

While playing with the OpenCode and Claude terminal applications (on macOS), I noticed they started setting ulimit -n ... to get the functional tests running.

It turns out these both use the Bun JavaScript runtime which for some reason sets RLIM_INFINITY.

To see for yourself, create ulimit-demo.ts with:

const proc = Bun.spawn(["sh", "-c", "ulimit -n"], {
    stdout: "pipe",
    stderr: "pipe",
});

const output = await new Response(proc.stdout).text();
await proc.exited;

console.log("ulimit -n:", output.trim());

% ulimit -n
256
$ % bun ulimit-demo.ts
ulimit -n: unlimited

Codex terminal is written in Rust and just keeps the system values.

Codex UI on the other hand returns 1048575, just like VSCode, which is Electron behavior.

in src/init.cpp:1047 in 49a2ce1dcb

1041 | @@ -1041,7 +1042,9 @@ bool AppInitParameterInteraction(const ArgsManager& args)
1042 |      int min_required_fds = MIN_CORE_FDS + MAX_ADDNODE_CONNECTIONS + nBind;
1043 |  
1044 |      // Try raising the FD limit to what we need (available_fds may be smaller than the requested amount if this fails)
1045 | -    available_fds = RaiseFileDescriptorLimit(user_max_connection + max_private + min_required_fds);
1046 | +    available_fds = std::min<uint64_t>(
1047 | +        RaiseFileDescriptorLimit(user_max_connection + max_private + min_required_fds),
1048 | +        std::numeric_limits<int>::max());

Sjors commented at 1:23 PM on March 27, 2026:

A lot of downstream code uses int. Changing all that would be a massive diff.

An alternative approach is to keep RaiseFileDescriptorLimit use int and then have it special case RLIM_INFINITY.

Not sure where the right boundary is.

Note that PosixLockedPageAllocator::GetLimit() uses size_t.

DrahtBot removed the label CI failed on Mar 27, 2026

in src/util/fs_helpers.cpp:161 in 49a2ce1dcb outdated

 163 |  #else
 164 |      struct rlimit limitFD;
 165 |      if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
 166 | -        if (limitFD.rlim_cur < (rlim_t)nMinFD) {
 167 | -            limitFD.rlim_cur = nMinFD;
 168 | +        if (limitFD.rlim_cur < static_cast<rlim_t>(min_fd)) {

luke-jr commented at 2:34 PM on March 27, 2026:

Won't this still attempt to "raise" RLIM_INFINITY?

Sjors commented at 2:49 PM on March 27, 2026:

I don't think it would:

rlim_cur < static_cast<rlim_t>(min_fd))
  ^                               ^----- number we requested
  | 
  | RLIM_INFINITY

So if(...) is false.

luke-jr commented at 3:41 PM on March 27, 2026:

RLIM_INFINITY is typically (but not guaranteed to be) -1, which is less than any requested limit...

Sjors commented at 3:46 PM on March 27, 2026:

On macOS it's:

#define RLIM_INFINITY   (((__uint64_t)1 << 63) - 1)     /* no limit */

But if other platforms define it as -1 (signed integer) then that's indeed a problem.

Sjors commented at 4:01 PM on March 27, 2026:

In any case, we shouldn't make assumptions about how rlim_t is implemented. I'll try a different approach.

Sjors marked this as a draft on Mar 27, 2026

Sjors force-pushed on Mar 27, 2026

Sjors marked this as ready for review on Mar 27, 2026

Sjors commented at 4:33 PM on March 27, 2026: member

I switched the return type back to int and instead added an early return std::numeric_limits<int>::max() for RLIM_INFINITY. See #34937 (review)

Sjors force-pushed on Mar 27, 2026

Sjors commented at 5:30 PM on March 27, 2026: member

Added a second commit for the (very unlikely in practice) int overflow case.

Sjors commented at 6:32 PM on March 27, 2026: member

Applied a similar fix to PosixLockedPageAllocator::GetLimit(), in this case because _FILE_OFFSET_BITS=64 makes rlim_t 64-bit whereas size_t might only be 32-bit. I did not bother with a test for this one, nor do I have recipe to hit the condition.

in src/util/fs_helpers.cpp:162 in 91b7198573 outdated

 164 |      struct rlimit limitFD;
 165 |      if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
 166 | -        if (limitFD.rlim_cur < (rlim_t)nMinFD) {
 167 | -            limitFD.rlim_cur = nMinFD;
 168 | +        if (limitFD.rlim_cur == RLIM_INFINITY ||
 169 | +            limitFD.rlim_cur >= static_cast<rlim_t>(std::numeric_limits<int>::max())) {

luke-jr commented at 6:38 PM on March 27, 2026:

The cast could mess things up if sizeof(int) > sizeof(rlim_t)

Sjors commented at 6:52 PM on March 27, 2026:

Is that possible? IIUC sizeof(int) is at least 32 bit on 32-bit systems ~and at least 64 bit on 64-bit systems.~

Whereas rlim_t is either 32 or 64 bit: https://sourceware.org/git/?p=glibc.git;a=blob;f=bits/resource.h;h=c2d25794d67a4eb574776c2ec77e3a92342851c0;hb=refs/heads/release/2.31/master#l94 and it's 64 bit for us because of _FILE_OFFSET_BITS=64.

I ~could~ added a static_assert to check.

Sjors commented at 7:18 PM on March 27, 2026:

There's also some constraints enforced by compat/assumptions.h, but not this specific check.

luke-jr commented at 8:00 PM on March 27, 2026:

sizeof(int) is only guaranteed to be 15-bit, but typically 31-bit.

When sizeof(rlim_t) is 64-bit (also normal), we already hit a casting issue...

luke-jr commented at 8:16 PM on March 27, 2026:

std::cmp_greater_equal feels like the right tool for this job

Sjors force-pushed on Mar 27, 2026

DrahtBot added the label CI failed on Mar 27, 2026

Sjors commented at 7:02 PM on March 27, 2026: member

Added static_assert(std::numeric_limits<rlim_t>::digits >= std::numeric_limits<int>::digits); to see if that assumption at least holds for all our CI platforms, see #34937 (review)

DrahtBot removed the label CI failed on Mar 27, 2026

luke-jr commented at 9:15 PM on March 27, 2026: member

Dotting all the 'i's, I ended up with this:

diff --git a/src/support/lockedpool.cpp b/src/support/lockedpool.cpp
index 0841802e753..4b683e35e0b 100644
--- a/src/support/lockedpool.cpp
+++ b/src/support/lockedpool.cpp
@@ -261,14 +261,12 @@ void PosixLockedPageAllocator::FreeLocked(void* addr, size_t len)
 size_t PosixLockedPageAllocator::GetLimit()
 {
 #ifdef RLIMIT_MEMLOCK
-    static_assert(std::numeric_limits<rlim_t>::digits >= std::numeric_limits<size_t>::digits);
     struct rlimit rlim;
     if (getrlimit(RLIMIT_MEMLOCK, &rlim) == 0) {
-        if (rlim.rlim_cur == RLIM_INFINITY ||
-            rlim.rlim_cur >= static_cast<rlim_t>(std::numeric_limits<size_t>::max())) {
-            return std::numeric_limits<size_t>::max();
+        if (rlim.rlim_cur != RLIM_INFINITY &&
+            std::cmp_less_equal(rlim.rlim_cur, static_cast<rlim_t>(std::numeric_limits<size_t>::max()))) {
+            return rlim.rlim_cur;
         }
-        return rlim.rlim_cur;
     }
 #endif
     return std::numeric_limits<size_t>::max();
diff --git a/src/util/fs_helpers.cpp b/src/util/fs_helpers.cpp
index 1cf85e6ad13..10c7cc61266 100644
--- a/src/util/fs_helpers.cpp
+++ b/src/util/fs_helpers.cpp
@@ -164,25 +164,24 @@ int RaiseFileDescriptorLimit(int min_fd)
 #if defined(WIN32)
     return 2048;
 #else
-    static_assert(std::numeric_limits<rlim_t>::digits >= std::numeric_limits<int>::digits);
     struct rlimit limitFD;
     if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
+        if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
+            const auto current_limit{limitFD.rlim_cur};
+            limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;
+            if (limitFD.rlim_max != RLIM_INFINITY && (limitFD.rlim_cur == RLIM_INFINITY || limitFD.rlim_cur > limitFD.rlim_max)) {
+                limitFD.rlim_cur = limitFD.rlim_max;
+            }
+            if (current_limit != limitFD.rlim_cur) {
+                setrlimit(RLIMIT_NOFILE, &limitFD);
+                getrlimit(RLIMIT_NOFILE, &limitFD);
+            }
+        }
         if (limitFD.rlim_cur == RLIM_INFINITY ||
-            limitFD.rlim_cur >= static_cast<rlim_t>(std::numeric_limits<int>::max())) {
-            // Some platforms implement RLIM_INFINITY as the maximum uint64,
-            // others as int64 (-1). Avoid casting even if the return type
-            // is changed to uint64_t. We also cap unlikely but possible values
-            // that would overflow int.
+            std::cmp_greater_equal(limitFD.rlim_cur, std::numeric_limits<int>::max())) {
             return std::numeric_limits<int>::max();
         }
-        if (limitFD.rlim_cur < static_cast<rlim_t>(min_fd)) {
-            limitFD.rlim_cur = static_cast<rlim_t>(min_fd);
-            if (limitFD.rlim_cur > limitFD.rlim_max)
-                limitFD.rlim_cur = limitFD.rlim_max;
-            setrlimit(RLIMIT_NOFILE, &limitFD);
-            getrlimit(RLIMIT_NOFILE, &limitFD);
-        }
-        return limitFD.rlim_cur;
+        return static_cast<int>(limitFD.rlim_cur);
     }
     return min_fd; // getrlimit failed, assume it's fine
 #endif

winterrdog commented at 9:35 AM on March 28, 2026: none

Really solid work with @luke-jr 's patch.

I was looking at the RaiseFileDescriptorLimit logic from the outside and had one small nit to toss in for review. Specifically on this line in the patch:

limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;

I noticed that if min_fd is out of range for the platform's rlim_t, the local state is momentarily set to RLIM_INFINITY before the subsequent if block pulls it back down to rlim_max.

While i realize this state isn't committed via setrlimit until later, it might be cleaner to fallback directly to limitFD.rlim_max inside the ternary. it keeps the local rlim_cur in a consistently valid state throughout the function and makes the logic more linear for the reader: we go from 'Current' to 'Target/Max' in one step rather than 'Current' -> 'Infinity' -> 'Max'.

int RaiseFileDescriptorLimit(int min_fd) {
#if defined(WIN32)
     return 2048;
#else
     struct rlimit limitFD;
     if (getrlimit(RLIMIT_NOFILE, &limitFD) != -1) {
          if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
               const auto current_limit{limitFD.rlim_cur};
               
-              limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;
+              limitFD.rlim_cur = std::in_range<rlim_t>(min_fd) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
-              if (limitFD.rlim_max != RLIM_INFINITY && (limitFD.rlim_cur == RLIM_INFINITY || limitFD.rlim_cur > limitFD.rlim_max)) {
+              if (limitFD.rlim_max != RLIM_INFINITY && limitFD.rlim_cur > limitFD.rlim_max) {
                    limitFD.rlim_cur = limitFD.rlim_max;
               }
               if (current_limit != limitFD.rlim_cur) {
                    setrlimit(RLIMIT_NOFILE, &limitFD);
                    getrlimit(RLIMIT_NOFILE, &limitFD);
               }
          }
          if (limitFD.rlim_cur == RLIM_INFINITY ||
               std::cmp_greater_equal(limitFD.rlim_cur, std::numeric_limits<int>::max())) {
               return std::numeric_limits<int>::max();
          }
          return static_cast<int>(limitFD.rlim_cur);
     }
     return min_fd; // getrlimit failed, assume it's fine
}

what do you think about flattening that logic?

Sjors commented at 7:49 AM on March 30, 2026: member

@luke-jr thanks for that patch, I'll look into incorporating that and then I'll also respond to @winterrdog.

Sjors marked this as a draft on Mar 30, 2026

Sjors force-pushed on Mar 31, 2026

Sjors commented at 11:27 AM on March 31, 2026: member

I took @luke-jr's patch, with small adjustments.

The patch has two advantages:

It uses std::cmp_less_equal (c++20) to safely deal with both signed and unsigned implementations of rlim_cur, which allows using it directly in comparisons. In the previous 3cd0cbdc9e862d88ec825a121459a342d9e2b7e1 I used an early return instead.
It uses std::in_range so that on systems where sizeof(int) > sizeof(rlim_t), out of range values are clamped to RLIM_INFINITY. This should not be confused with the overflow fix in the second commit.

The adjustments I made:

start with Assert(min_fd >= 0): although std::cmp_less makes it safe, a negative number of file descriptors is impossible.
wrap the std::in_range from (2) in Assume, since requesting that many file descriptors is a bug.

I also added comments to above checks that were counter intuitive / not obvious to me but do seem correct.

From the discussion against an earlier version:

The cast could mess things up if sizeof(int) > sizeof(rlim_t)

Such architecture does not exist on our CI, as demonstrated by the static_assert in af56e801fc0885dca4cfd626dc2b419c97830914), but ideally should still behave.

If I understand correctly, with std::cmp_less taking care of handling signed vs unsigned, comparisons are safe on such architecture. The underlying < / > widens the smallest value. What does need special care, is casting, which std::in_range performs in a safe way.

So my static_assert checks are no longer needed (and dropped). @winterrdog I did not take your suggestion, but hopefully the newly sprinkled comments make it clear enough to follow along with the jump to RLIM_INFINITY and back.

Sjors marked this as ready for review on Mar 31, 2026

Sjors commented at 11:44 AM on March 31, 2026: member

On second thought this would be a nice simplification:

--- a/src/util/fs_helpers.cpp
+++ b/src/util/fs_helpers.cpp
@@ -164,10 +164,8 @@ int RaiseFileDescriptorLimit(int min_fd)
         if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
             const auto current_limit{limitFD.rlim_cur};
-            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : RLIM_INFINITY;
+            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
             // Don't raise soft limit beyond hard limit
             if (limitFD.rlim_max != RLIM_INFINITY && (
-                limitFD.rlim_cur > limitFD.rlim_max ||
-                // Value outside rlim_t range was mapped to infinity
-                limitFD.rlim_cur == RLIM_INFINITY
+                limitFD.rlim_cur > limitFD.rlim_max
                 )
             ) {

Doesn't break any test, but let me ponder it a bit more...

Sjors force-pushed on Mar 31, 2026

Sjors commented at 12:51 PM on March 31, 2026: member

Ended up taking the patch.

DrahtBot added the label CI failed on Mar 31, 2026

DrahtBot removed the label CI failed on Mar 31, 2026

in test/functional/feature_init.py:337 in bc10c5b2d1

 332 | +        except (ValueError, OSError):
 333 | +            self.log.info(f"Skipping rlimit test: cannot set soft limit (hard={hard})")
 334 | +            return
 335 | +        try:
 336 | +            self.restart_node(1)
 337 | +            self.log.debug("Node started successfully with RLIM_INFINITY limit (soft={limit})")

winterrdog commented at 1:03 AM on April 4, 2026:

small catch: this debug log is missing the f-string prefix, so {limit} is not being interpolated.

            self.log.debug(f"Node started successfully with RLIM_INFINITY limit (soft={limit})")

caught it when i spotted this in the test's temporary logs, after running the tests locally:

2026-04-04T00:38:24.154835Z TestFramework (DEBUG): Node started successfully with RLIM_INFINITY limit (soft={limit})

Sjors commented at 2:59 PM on April 7, 2026:

Thanks, fixed.

winterrdog commented at 7:30 PM on April 6, 2026: none

I noticed that the existing init_rlimit tests often skip on local environments because they request values (RLIM_INFINITY or 1 << 31) that exceed the system's hard limit. While these are excellent for testing integer safety and constants, they don't always exercise the actual setrlimit syscall in practice.

I've drafted a "boundary" test to complement the existing ones. It queries the current rlim_max and requests that exact value, making sure we have a functional happy path that is guaranteed to run without skipping (provided the resource python module is present).

diff --git a/test/functional/feature_init.py b/test/functional/feature_init.py
index 844da8b019c..4a2d5dfbc82 100755
--- a/test/functional/feature_init.py
+++ b/test/functional/feature_init.py
@@ -348,6 +348,25 @@ class InitTest(BitcoinTestFramework):
         self.log.info("Testing node startup with RLIM_INFINITY fd limit")
         self.restart_node_with_fd_limit(self.RLIM_INFINITY)

+    def init_rlimit_boundary_test(self):
+        """Test behavior at the actual system boundaries (hard limit and mid-point)."""
+        if self.RLIM_INFINITY is None:
+            self.log.info("Skipping: resource module not available")
+            return
+
+        import resource
+        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
+
+        # Test exactly at the hard limit. This is the 'maximum' safe value the system can handle.
+        self.log.info(f"Testing node startup with fd limit at hard limit ({hard})")
+        self.restart_node_with_fd_limit(hard)
+
+        # Test a sane increment (mid-point between soft and hard). We only do this if there is actually room to grow.
+        if soft < hard:
+            target = soft + (hard - soft) // 2
+            self.log.info(f"Testing node startup with partial increase to {target}")
+            self.restart_node_with_fd_limit(target)
+
     def init_rlimit_large_test(self):
         """Test that bitcoind starts correctly when the soft RLIMIT_NOFILE limit is above INT_MAX."""
         if self.RLIM_INFINITY is None:
@@ -365,6 +384,7 @@ class InitTest(BitcoinTestFramework):
         self.init_empty_test()
         self.init_rlimit_test()
         self.init_rlimit_large_test()
+        self.init_rlimit_boundary_test()


if __name__ == '__main__':

after i made the changes above and ran the tests locally with:

./build/test/functional/test_runner.py ./build/test/functional/feature_init.py --loglevel=debug --nocleanup

<details> <summary>these are logs (an excerpt) i got</summary>

...
2026-04-06T19:07:56.187614Z TestFramework.node1 (DEBUG): RPC successfully started
2026-04-06T19:07:56.189509Z TestFramework (INFO): Testing node startup with RLIM_INFINITY fd limit
2026-04-06T19:07:56.189673Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
2026-04-06T19:07:56.189778Z TestFramework (INFO): Testing node startup with fd limit above INT_MAX
2026-04-06T19:07:56.189870Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
2026-04-06T19:07:56.189959Z TestFramework (INFO): Testing node startup with fd limit at hard limit (1048576)
2026-04-06T19:07:56.190048Z TestFramework.node1 (DEBUG): Stopping node
2026-04-06T19:07:56.696453Z TestFramework.node1 (DEBUG): Node stopped
2026-04-06T19:07:56.699031Z TestFramework.node1 (DEBUG): bitcoind started, waiting for RPC to come up
2026-04-06T19:07:57.209972Z TestFramework.node1 (DEBUG): RPC successfully started
2026-04-06T19:07:57.210359Z TestFramework (DEBUG): Node started successfully with RLIM_INFINITY limit (soft=1048576)
2026-04-06T19:07:57.210566Z TestFramework (DEBUG): Restored previous RLIMIT_NOFILE limits (soft=2048, hard=1048576)
2026-04-06T19:07:57.210738Z TestFramework (INFO): Testing node startup with partial increase to 525312
2026-04-06T19:07:57.210915Z TestFramework.node1 (DEBUG): Stopping node
2026-04-06T19:07:57.617166Z TestFramework.node1 (DEBUG): Node stopped
2026-04-06T19:07:57.619702Z TestFramework.node1 (DEBUG): bitcoind started, waiting for RPC to come up
2026-04-06T19:07:58.385996Z TestFramework.node1 (DEBUG): RPC successfully started
2026-04-06T19:07:58.386381Z TestFramework (DEBUG): Node started successfully with RLIM_INFINITY limit (soft=525312)
2026-04-06T19:07:58.386591Z TestFramework (DEBUG): Restored previous RLIMIT_NOFILE limits (soft=2048, hard=1048576)
2026-04-06T19:07:58.386772Z TestFramework (DEBUG): Closing down network thread
2026-04-06T19:07:58.437955Z TestFramework (INFO): Stopping nodes
2026-04-06T19:07:58.438363Z TestFramework.node1 (DEBUG): Stopping node
2026-04-06T19:07:58.794059Z TestFramework.node1 (DEBUG): Node stopped
2026-04-06T19:07:58.794547Z TestFramework (WARNING): Not cleaning up dir /tmp/test_runner_₿_🏃_20260406_220701/feature_init_0
2026-04-06T19:07:58.794818Z TestFramework (INFO): Tests successful

</details>

Motivation:

coverage: makes sure the setrlimit call is actually executed rather than bypassed by a skip.
verification: confirms the node remains stable when the soft limit is pushed to the absolute maximum allowed by the OS.

happy to leave this if you feel the current edge cases are sufficient, but I thought it might be a nice addition for stronger local testing.

Sjors force-pushed on Apr 7, 2026

Sjors commented at 2:59 PM on April 7, 2026: member

@winterrdog I think it's enough to just test the original issue on a subset of CI machines, so I won't include the extra test. I did take the fix here: #34937 (review)

luke-jr referenced this in commit 0f92fc907f on Apr 13, 2026

luke-jr referenced this in commit 003324bfb8 on Apr 13, 2026

luke-jr referenced this in commit d94b2ee5c0 on Apr 13, 2026

luke-jr referenced this in commit 1953393f48 on Apr 14, 2026

luke-jr referenced this in commit 675e2ba3d9 on Apr 14, 2026

in src/util/fs_helpers.cpp:157 in 101de678a8 outdated

 157 | - * It returns the actual file descriptor limit (which may be more or less than nMinFD)
 158 | - */
 159 | -int RaiseFileDescriptorLimit(int nMinFD)
 160 | +int RaiseFileDescriptorLimit(int min_fd)
 161 |  {
 162 | +    Assert(min_fd >= 0);

winterrdog commented at 10:17 PM on April 16, 2026:

while auditing this line, i began to wonder: is there a practical "floor" we should be enforcing rather than allowing the process to attempt a start with values that guarantee a crash on non-Windows systems?

while manually incrementing limits during local functional tests, i realised that setting min_fd to values between 0 and 10 consistently triggers a failure. specifically, if the environment provides fewer than 5 descriptors, the node cannot even initialise its basic logging or database handles before the OS kernel denies further requests.

i observed that the functional test framework itself requires a slightly higher overhead to capture stdout, but the core issue remains: a limit that is technically "non-negative" can still be physically impossible for a bitcoin node to survive.

through manual isolation, i've identified two distinct failure thresholds:

test framework's floor (11 fds): required for the python test runner to successfully pipe stdout and manage the node process.
binary floor (≤ 4 fds): below this value, the bitcoind binary fails to start entirely, as it cannot satisfy basic requirements for standard i/o and its primary log file/entropy source.

proof of crash

--- a/test/functional/feature_init.py
+++ b/test/functional/feature_init.py
@@ -348,6 +349,15 @@ class InitTest(BitcoinTestFramework):
+    def init_rlimit_too_low(self):
+        """Test that bitcoind behaves predictably when the limit is too low to function."""
+        if self.RLIM_INFINITY is None:
+            return
+
+        self.log.info("Testing node startup with a very low fd limit")
+        # current behavior: crashes with OSError 24
+        self.restart_node_with_fd_limit(8)
+
     def run_test(self):
         self.init_rlimit_test()
         self.init_rlimit_large_test()
+        self.init_rlimit_too_low()

</details>

<details> <summary>my logs after the crash.</summary>

...
2026-04-16T18:49:47.300353Z TestFramework.node1 (DEBUG): RPC successfully started
2026-04-16T18:49:47.301798Z TestFramework (INFO): Testing node startup with RLIM_INFINITY fd limit
2026-04-16T18:49:47.302125Z TestFramework (DEBUG): Current RLIMIT_NOFILE limits (soft=2048, hard=1048576), trying to set soft limit to -1
2026-04-16T18:49:47.302377Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
2026-04-16T18:49:47.302592Z TestFramework (INFO): Testing node startup with fd limit above INT_MAX
2026-04-16T18:49:47.302776Z TestFramework (DEBUG): Current RLIMIT_NOFILE limits (soft=2048, hard=1048576), trying to set soft limit to 2147483648
2026-04-16T18:49:47.302951Z TestFramework (INFO): Skipping rlimit test: cannot set soft limit (hard=1048576)
2026-04-16T18:49:47.303120Z TestFramework (INFO): Testing node startup with very low fd limit
2026-04-16T18:49:47.303285Z TestFramework (DEBUG): Current RLIMIT_NOFILE limits (soft=2048, hard=1048576), trying to set soft limit to 8
2026-04-16T18:49:47.303530Z TestFramework.node1 (DEBUG): Stopping node
2026-04-16T18:49:47.708177Z TestFramework.node1 (DEBUG): Node stopped
2026-04-16T18:49:47.708435Z TestFramework (DEBUG): Restored previous RLIMIT_NOFILE limits (soft=2048, hard=1048576)
2026-04-16T18:49:47.708521Z TestFramework (ERROR): Unexpected exception:
Traceback (most recent call last):
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_framework.py", line 144, in main
    self.run_test()
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/build/test/functional/feature_init.py", line 379, in run_test
    self.init_rlimit_neg()
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/build/test/functional/feature_init.py", line 360, in init_rlimit_neg
    self.restart_node_with_fd_limit(8)
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/build/test/functional/feature_init.py", line 337, in restart_node_with_fd_limit
    self.restart_node(1)
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_framework.py", line 548, in restart_node
    self.start_node(i, extra_args)
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_framework.py", line 504, in start_node
    node.start(*args, **kwargs)
  File "/home/oldman/Documents/btc/btc-core/review-tests/bitcoin/test/functional/test_framework/test_node.py", line 279, in start
    stdout = tempfile.NamedTemporaryFile(dir=self.stdout_dir, delete=False)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/tempfile.py", line 721, in NamedTemporaryFile
    file = _io.open(dir, mode, buffering=buffering,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/tempfile.py", line 718, in opener
    fd, name = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/tempfile.py", line 395, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 24] Too many open files: '/tmp/test_runner_₿_🏃_20260416_214851/feature_init_0/node1/stdout/tmpaekl08vn'
2026-04-16T18:49:47.710279Z TestFramework (DEBUG): Closing down network thread
2026-04-16T18:49:47.760813Z TestFramework (INFO): Not stopping nodes as test failed. The dangling processes will be cleaned up later.
2026-04-16T18:49:47.761167Z TestFramework (WARNING): Not cleaning up dir /tmp/test_runner_₿_🏃_20260416_214851/feature_init_0
2026-04-16T18:49:47.761426Z TestFramework (ERROR): Test failed. Test logging available at /tmp/test_runner_₿_🏃_20260416_214851/feature_init_0/test_framework.log
...

</details>

suggestion

rather than allowing the process to hit a OSError: [Errno 24] Too many open files mid-boot, we could perhaps treat this as a validation gate. it might be more robust to notify the user that the provided limit is insufficient for basic operation, providing a "fail-fast" mechanism rather than an ungraceful crash.

given that the node fundamentally requires a small handful of descriptors just to "breathe" (logging, leveldb, entropy), do you think it is worth adjusting the c++ logic to cater for these very low-value edge cases, or is the current assertion sufficient for our purposes?

Sjors commented at 1:56 PM on April 23, 2026:

allowing the process to attempt a start with values that guarantee a crash on non-Windows systems?

The point of an Assert is that we expect this to never crash. The caller needs to make sure no value < 0 is passed in. Since RaiseFileDescriptorLimit is only called in one place, we know that is the case. If we ever add new call sites, then this assert would need to be re-evaluated.

The number requested by the init code should be sufficient for node operations. It might be possible to get by with even less, but afaik there's no realistic environment where that's the case.

I guess the scenario you're describing is where we don't even get to RaiseFileDescriptorLimit? I don't think that's worth worry about, assuming my assumption above is correct.

winterrdog commented at 9:29 PM on April 23, 2026:

The number requested by the init code should be sufficient for node operations. It might be possible to get by with even less, but afaik there's no realistic environment where that's the case.

agreed!

prior to calling RaiseFileDescriptorLimit, the logic in init.cpp's AppInitParameterInteraction function already aggregates user_max_connections, max_private, and min_required_fds. this calculated sum (was 176 when i ran bitcoind through GDB on Linux) ensures the request is well above the physical floor required for basic stability.

I guess the scenario you're describing is where we don't even get to RaiseFileDescriptorLimit? I don't think that's worth worry about, assuming my assumption above is correct.

true!

the failure happens during initialisation before the RaiseFileDescriptorLimit is even reached. I think I was perhaps being a bit too paranoid from an adversarial POV, imagining a scenario where a malicious actor gains enough access to suffocate the environment's ulimit to 8 or less, essentially DOSing the node's ability to even log its own failure. given how unlikely that environment is in practice, i agree it's not worth the extra complexity.

thanks for the clarification!

sedited added this to the milestone 32.0 on Apr 21, 2026

in src/util/fs_helpers.cpp:166 in 4b25341fc3

 169 | -            limitFD.rlim_cur = nMinFD;
 170 | -            if (limitFD.rlim_cur > limitFD.rlim_max)
 171 | +        // If the current soft limit is already higher, don't raise it
 172 | +        if (limitFD.rlim_cur != RLIM_INFINITY && std::cmp_less(limitFD.rlim_cur, min_fd)) {
 173 | +            const auto current_limit{limitFD.rlim_cur};
 174 | +            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;

sedited commented at 1:31 PM on April 21, 2026:

Do we really want to Assume here? What is the scenario this is guarding against?

Sjors commented at 8:05 AM on April 22, 2026:

This is mostly a leftover from earlier versions.

I'll drop the ternary and add a static_assert to make sure the static_cast is safe.

in src/util/fs_helpers.cpp:172 in 4b25341fc3

 175 | +            // Don't raise soft limit beyond hard limit
 176 | +            if (limitFD.rlim_max != RLIM_INFINITY && (
 177 | +                limitFD.rlim_cur > limitFD.rlim_max
 178 | +                )
 179 | +            ) {
 180 |                  limitFD.rlim_cur = limitFD.rlim_max;

sedited commented at 1:33 PM on April 21, 2026:

I'm a bit confused by this. Why wouldn't we want to set this to RLIM_INFINITY if that is what the maximum value is?

Sjors commented at 8:05 AM on April 22, 2026:

RLIM_INFINITY is a magic number:

It can be negative.
It's not guaranteed to be the actual largest number. IIUC on Darwin it's INT64_MAX, i.e. less than UINT64_MAX

in src/util/fs_helpers.cpp:174 in 4b25341fc3 outdated

 179 | +            ) {
 180 |                  limitFD.rlim_cur = limitFD.rlim_max;
 181 | -            setrlimit(RLIMIT_NOFILE, &limitFD);
 182 | -            getrlimit(RLIMIT_NOFILE, &limitFD);
 183 | +            }
 184 | +            if (current_limit != limitFD.rlim_cur) {

sedited commented at 1:34 PM on April 21, 2026:

Does this change guard against anything new, or is it just there to not do redundant work?

Sjors commented at 8:09 AM on April 22, 2026:

I think the latter, just avoiding an unnecessary write and read call. The original code had the same intention.

in src/util/fs_helpers.cpp:170 in 4b25341fc3

 173 | +            const auto current_limit{limitFD.rlim_cur};
 174 | +            limitFD.rlim_cur = Assume(std::in_range<rlim_t>(min_fd)) ? static_cast<rlim_t>(min_fd) : limitFD.rlim_max;
 175 | +            // Don't raise soft limit beyond hard limit
 176 | +            if (limitFD.rlim_max != RLIM_INFINITY && (
 177 | +                limitFD.rlim_cur > limitFD.rlim_max
 178 | +                )

sedited commented at 1:42 PM on April 21, 2026:

This is a bit weird to read. Can you put it all on one line (also suggested by clang-format-diff) and add parentheses to the first condition too?

Sjors commented at 8:10 AM on April 22, 2026:

Done

Sjors force-pushed on Apr 22, 2026

sedited approved

sedited commented at 9:03 AM on April 22, 2026: contributor

ACK 95b884246cedb5f612bc39a727469050e4b14d6d

winterrdog commented at 9:36 AM on April 22, 2026: none

tACK 95b884246cedb5f612bc39a727469050e4b14d6d

code builds correctly with zero issues and functional tests (present and new) pass successfully.

ran tests on:

Debian Linux on x86_64 and,
FreeBSD 15 on x86_64

DrahtBot added the label Needs rebase on Apr 23, 2026

Fix startup failure with RLIM_INFINITY fd limits

When setting the fd limit to unlimited, the node fails to start:

ulimit -n unlimited
build/bin/bitcoind
Error: Not enough file descriptors available. -1 available, 160 required.

This was caused by RaiseFileDescriptorLimit() casting limitFD.rlim_cur
to int, which for RLIM_INFINITY overflows to -1. Fix it by returning
std::numeric_limits<int>::max() instead.

This commit also adds a functional test, which is skipped on environments
with a hard limit below infinity.

Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
Co-authored-by: winterrdog

4afbabdcef

init: clamp fd limits to int

When setting the fd limit to 1 >> 31, the node fails to start:

ulimit -n 214748364
build/bin/bitcoind
Error: Not enough file descriptors available. -2147483648 available, 160 required.

Similar to the previous commit, this is fixed by capping the limit
to std::numeric_limits<int>::max().

Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>

8ab4b9fc85

support: clamp RLIMIT_MEMLOCK to size_t

On 32-bit systems we build with _FILE_OFFSET_BITS=64 (see CMakeLists.txt),
which makes rlim_t 64-bit when building against glibc (see bits/resource.h).
Since size_t could be 32-bit, clamp RLIMIT_MEMLOCK to
std::numeric_limits<size_t>::max() in PosixLockedPageAllocator::GetLimit().

Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>

735b25519a

Sjors commented at 2:03 PM on April 23, 2026: member

Trivial rebase after #34435.

Sjors force-pushed on Apr 23, 2026

DrahtBot removed the label Needs rebase on Apr 23, 2026

winterrdog commented at 10:54 PM on April 23, 2026: none

Re-ACK 735b25519aad2d3c24345c2d6f69c30d1040f473

DrahtBot requested review from sedited on Apr 23, 2026

sedited approved

sedited commented at 8:07 AM on April 24, 2026: contributor

Re-ACK 735b25519aad2d3c24345c2d6f69c30d1040f473

fanquake commented at 1:12 PM on April 30, 2026: member

Someone have run into something like this downstream: https://github.com/idealgroup/bridge/commit/29131c080a7a09bb62162df4f9135bef9463b8d0#diff-1889898b27bd517612562d22c54b08c9cffe0999ffc47cd9c2fdaa273697e090.

Sjors referenced this in commit 29131c080a on Apr 30, 2026

sedited requested review from vasild on May 21, 2026