Fixes #11368
This requires libevent 2.2.1 which so far was only released in alpha.
For more context see specifically #11368 (comment), this uses the feature mentioned there as libevent#592.
@@ -419,6 +419,15 @@ bool InitHTTPServer()
     int workQueueDepth = std::max((long)gArgs.GetIntArg("-rpcworkqueue", DEFAULT_HTTP_WORKQUEUE), 1L);
     LogPrintfCategory(BCLog::HTTP, "creating work queue of depth %d\n", workQueueDepth);

+#if LIBEVENT_VERSION_NUMBER >= 0x02020001
+    if (event_get_version_number() >= 0x02020001) {
+        // Limit the maximum number of open connections to prevent exhausting
+        // the file descriptor limit. When the http server gets overwhelmed it
+        // will respond with 503 Service Unavailable.
+        evhttp_set_max_connections(http, workQueueDepth * 2);
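For context, here is a self-contained sketch of the same compile-time plus run-time version guard outside of Bitcoin Core. The setup around it (event base, work queue depth) is illustrative and not taken from this PR; only evhttp_set_max_connections and the 0x02020001 version check come from the diff above.

```cpp
// Minimal sketch, assuming libevent 2.2.1+ headers; not the PR's actual code.
#include <event2/event.h>
#include <event2/http.h>
#include <cstdio>

int main()
{
    struct event_base* base = event_base_new();
    struct evhttp* http = evhttp_new(base);
    const int work_queue_depth = 16; // stand-in for -rpcworkqueue

#if LIBEVENT_VERSION_NUMBER >= 0x02020001
    // Built against 2.2.1+, but the shared library loaded at run time may be
    // older, hence the second check.
    if (event_get_version_number() >= 0x02020001) {
        // Cap open connections; clients beyond the cap get 503 Service Unavailable.
        evhttp_set_max_connections(http, work_queue_depth * 2);
    } else {
        std::printf("runtime libevent %s has no evhttp_set_max_connections\n",
                    event_get_version());
    }
#else
    std::printf("built against libevent < 2.2.1, connection cap not available\n");
#endif

    evhttp_free(http);
    event_base_free(base);
    return 0;
}
```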
I am not sure this actually limits us within the file descriptor limit. For example if you run:
$ ulimit -n 175
$ src/bitcoind
Warning: Reducing -maxconnections from 125 to 15, because of system limitations.
Bitcoin Core starting
I think (but have not tested) that if I then receive 15 inbound connections I could still run out of file descriptors for the http server?
Perhaps if we also modify this line:
https://github.com/bitcoin/bitcoin/blob/51c050787fd6bcd016969dd7e245818ebd110b67/src/init.cpp#L943
…to consider the workqueue depth too, then it could work?
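To make that suggestion concrete, here is a standalone sketch of the idea. This is not code from this PR or from init.cpp; the constants and helper name are illustrative, chosen to roughly match the numbers behind the warning above:

```cpp
// Illustrative only: fold an HTTP/RPC reservation into the descriptor budget
// before deciding how many P2P connections fit. Constants approximate the
// numbers behind the "Reducing -maxconnections from 125 to 15" warning above.
#include <algorithm>
#include <cstdio>

constexpr int MIN_CORE_FILEDESCRIPTORS = 150; // assumed core reserve
constexpr int MAX_ADDNODE_CONNECTIONS = 8;    // assumed addnode reserve

int MaxP2PConnections(int requested, int available_fds, int n_bind, int rpc_work_queue_depth)
{
    // Reserve twice the work queue depth for the HTTP server, mirroring the
    // evhttp_set_max_connections(http, workQueueDepth * 2) cap in the diff.
    const int http_reserve = rpc_work_queue_depth * 2;
    const int budget = available_fds - n_bind - MIN_CORE_FILEDESCRIPTORS
                     - MAX_ADDNODE_CONNECTIONS - http_reserve;
    return std::max(std::min(requested, budget), 0);
}

int main()
{
    // With ulimit -n 175 (as in the example above) and a default work queue of 16,
    // nothing is left for P2P: the function clamps to 0.
    std::printf("%d\n", MaxP2PConnections(/*requested=*/125, /*available_fds=*/175,
                                          /*n_bind=*/2, /*rpc_work_queue_depth=*/16));
    return 0;
}
```

Which also shows the flip side discussed below: once an HTTP reservation is included, small ulimits can leave no room for P2P at all.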
Yeah, I think it will be hard to find a number for which there isn’t some specific limit and some specific circumstance that would still cause problems. If we set the limit dynamically based on what’s really available, there would be a range where we would effectively set evhttp_set_max_connections(http, 0); which would probably mean that we couldn’t even shut down the server via the RPC interface.
When maxconnections is reduced to 15 that doesn’t mean there are only 15 file descriptors left: more are reserved than are strictly needed at all times. That part of the code certainly could use improvement, see #27539 and #27730, but I think making changes to the maxconnections calculation is not necessarily linked to this change here, because it could reduce the risk of running out of FDs regardless. Also, note that the node will still start if you set ulimit -n 150
and maxconnections ends up at 0, so whether that even makes sense is another question to discuss. If we alter the calculation of maxconnections but still let the node run regardless of the result, we haven’t really prevented file descriptor exhaustion completely. On the other hand, it may be too limiting if we set the requirement for available file descriptors so high that we can be absolutely sure the node will never encounter an exhaustion under any circumstances.
So, I am not sure what the optimal number to use would be. I think we can still tweak this later so it covers the most common use cases best. What I am currently concerned about is whether this actually improves the situation in #11368 for the default file descriptor limits of the most commonly used OSes. macOS was among the lowest as far as I remember, usually 256 I think, so I am currently aiming at showing that it’s an improvement for that. But I do struggle to reproduce the original issue right now, so I am not 100% sure anymore that this actually needs fixing, though I am on a different machine now than I was back then.
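In case it helps with reproducing #11368, here is a crude stress sketch using plain POSIX sockets. The port, address, and counts are assumptions, and this is not something used in this thread; it simply opens connections to the RPC port and holds them open so the descriptor pressure on the server can be observed:

```cpp
// Crude reproduction sketch for descriptor pressure on the RPC port.
// Assumes bitcoind listening on 127.0.0.1:8332; numbers are arbitrary.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

int main()
{
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8332);
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    std::vector<int> held;
    for (int i = 0; i < 1000; ++i) {
        const int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0 || connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
            if (fd >= 0) close(fd);
            std::printf("stopped after %d connections\n", i);
            break;
        }
        held.push_back(fd); // keep the connection open on purpose
    }
    std::printf("holding %zu connections; check bitcoind's log and lsof now\n", held.size());
    sleep(60); // leave time to observe whether the server degrades or crashes
    for (const int fd : held) close(fd);
    return 0;
}
```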
@@ -419,6 +419,15 @@ bool InitHTTPServer()
     int workQueueDepth = std::max((long)gArgs.GetIntArg("-rpcworkqueue", DEFAULT_HTTP_WORKQUEUE), 1L);
     LogPrintfCategory(BCLog::HTTP, "creating work queue of depth %d\n", workQueueDepth);

+#if LIBEVENT_VERSION_NUMBER >= 0x02020001
+    if (event_get_version_number() >= 0x02020001) {
evhttp_set_max_connections
symbol?
-want to increase the number of maximum allowed file descriptors in your system
-and try to prevent opening too many connections to your JSON-RPC interface at the
-same time if this is under your control. It is hard to give general advice
-since this depends on your system but if you make several hundred requests at
-once you are definitely at risk of encountering this issue.
+When too many connections are opened quickly the interface will start to
🤔 There hasn’t been much activity lately and the CI seems to be failing.
If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn’t been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.
removing libevent
Nice. Is there a tracking issue?
Not yet, but should be open soon (TM).
I am not familiar with the intricacies of this PR, but is it possible for bitcoind to detect this impending crash before it happens and just return some sort of 500 error?
I ran into this issue (using v28.0) when running a straightforward script which was calling getblock
for each block, starting from genesis. After a couple thousand blocks bitcoind crashed with the file descriptor error.
After some trial and error I have found that putting a 100 millisecond delay between http requests at least has kept bitcoind from crashing (on my system with very standard commodity hardware), but of course this is very slow and will take days to run. lsof -u bitcoin | wc -l
seems to hover between 1100 and 1500 files open for me.
Reading through this issue, it seems we are either waiting for bitcoin core to upgrade to using libevent 2.2.1, or we are waiting for libevent to be removed entirely as a dependency.
However, in the interim it would be very helpful if, instead of bitcoind crashing, it at least stopped responding to the RPC calls (letting the client time out) or returned some 500 error. Then clients could at least know that they are overwhelming the server and back off.
Reading through this issue, it seems we are either waiting for bitcoin core to upgrade to using libevent 2.2.1, or we are waiting for libevent to be removed entirely as a dependency.
I would open this PR for review if libevent 2.2.1 were available, but it has been in alpha since May 2023 (https://github.com/libevent/libevent/releases). So upgrading libevent isn’t really an option yet, but the removal is in the works, see #31194.
However, in the interim it would be very helpful if, instead of bitcoind crashing, it at least stopped responding to the RPC calls (letting the client time out) or returned some 500 error. Then clients could at least know that they are overwhelming the server and back off.
I don’t think there is a simple fix. There have been a few attempts over the years to address this issue outside of libevent, but none have been merged, and I would guess there wouldn’t be considerable interest from reviewers given the parallel effort to remove libevent altogether. So I don’t believe an interim solution would have a high chance of getting merged given our limited resources for review, particularly in this area of the codebase (which is also a motivating factor for removing the dependency, btw). I don’t want to discourage anyone from trying to find a simple fix and opening a PR, though.
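And for anyone stuck on the client side in the meantime, the back-off idea above can be sketched in a few lines with libcurl. The endpoint, credentials, RPC method, and timing constants are placeholders, not part of this PR:

```cpp
// Client-side back-off sketch using libcurl; placeholders throughout, not part of the PR.
#include <curl/curl.h>
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <string>
#include <thread>

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    curl_easy_setopt(curl, CURLOPT_URL, "http://127.0.0.1:8332/");   // assumed endpoint
    curl_easy_setopt(curl, CURLOPT_USERPWD, "rpcuser:rpcpassword");  // assumed credentials

    int delay_ms = 100; // start with the 100 ms spacing that worked in practice above
    for (int height = 0; height < 1000; ) {
        const std::string body =
            "{\"jsonrpc\":\"2.0\",\"id\":\"scan\",\"method\":\"getblockhash\",\"params\":[" +
            std::to_string(height) + "]}";
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());

        const CURLcode rc = curl_easy_perform(curl); // response body prints to stdout (libcurl default)
        long status = 0;
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);

        if (rc != CURLE_OK || status == 503) {
            // Server overwhelmed or unreachable: back off exponentially and retry
            // the same request instead of opening ever more connections.
            delay_ms = std::min(delay_ms * 2, 5000);
        } else {
            // Success: advance and slowly relax the delay again.
            delay_ms = std::max(delay_ms / 2, 10);
            ++height;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms));
    }

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}
```

It obviously doesn’t fix the server; it just keeps a bulk script from piling up connections while the libevent question is unresolved.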