http: fix submission during shutdown race #34577

pull furszy wants to merge 3 commits into bitcoin:master from furszy:2026_http_submission_error changing 4 files +78 −41
  1. furszy commented at 7:52 pm on February 12, 2026: member

    Fixes #34573.

    As mentioned in #34573 (comment), the ThreadPool PR (#33689) revealed an existing issue.

    Before that PR, we were returning an incorrect error “Request rejected because http work queue depth exceeded” during shutdown for unhandled requests (we were not differentiating between “queue depth exceeded” and “server interrupted” errors). Now, with the ThreadPool inclusion, we return the proper error but we don’t handle it properly.

    This PR improves exactly that. Handling the missing error and properly returning it to the user.

    The race can be reproduced as follows:

    1. The server receives an http request.
    2. Processing of the request is delayed, and shutdown is triggered in the meantime.
    3. During shutdown, the libevent callback is unregistered and the threadpool interrupted.
    4. The delayed request (step 2) resumes and tries to submit a task to the now-interrupted server.

    Reproduction test can be found #34577 (comment).

    Also, to prevent this kind of issue from happening again, this PR changes task submission to return the error as part of the function’s return value using util::Expected instead of throwing the exception. Unlike exceptions, which require extra try-catch blocks and can be ignored, returning Expected forces callers to explicitly handle failures, and attributes like [[nodiscard]] allow us catch unhandled ones at compile time.

  2. DrahtBot added the label RPC/REST/ZMQ on Feb 12, 2026
  3. DrahtBot commented at 7:52 pm on February 12, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK hodlinator, ismaelsadeeq
    Stale ACK andrewtoth

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34562 (ThreadPool follow-ups, proactive shutdown and HasReason dependency cleanup by furszy)
    • #32061 (Replace libevent with our own HTTP and socket-handling implementation by pinheadmz)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. furszy force-pushed on Feb 12, 2026
  5. DrahtBot added the label CI failed on Feb 12, 2026
  6. DrahtBot commented at 8:03 pm on February 12, 2026: contributor

    🚧 At least one of the CI tasks failed. Task No wallet: https://github.com/bitcoin/bitcoin/actions/runs/21961944156/job/63441463470 LLM reason (✨ experimental): Compilation failed in fuzz/threadpool.cpp: Submit returns util::Expected<future, string> which cannot be assigned to std::future, causing build errors.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. in src/test/threadpool_tests.cpp:49 in 0b2d12f83c
    43@@ -44,6 +44,13 @@ BOOST_FIXTURE_TEST_SUITE(threadpool_tests, ThreadPoolFixture)
    44         }                                                                         \
    45     } while (0)
    46 
    47+// Helper to unwrap a valid pool submission
    48+template <typename F> [[nodiscard]]
    49+auto Submit(ThreadPool& pool, F&& fn)
    


    hodlinator commented at 8:09 pm on February 12, 2026:

    Nice helper! Was actually experimenting with changing ThreadPool::Submit() to return util::Expected yesterday (suffering from lack of knowledge of std::invoke_result_t<F> though). It was so cumbersome to adapt these test without realizing a helper would be useful. :)

    nit: The going style seems to be [[nodiscard]] on the same line as the return type:

    0template <typename F>
    1[[nodiscard]] auto Submit(ThreadPool& pool, F&& fn)
    

    Examples from master: https://github.com/bitcoin/bitcoin/blob/07b924775e4ff5dbfb9f2e7549fd13db8ec5c61b/src/util/overflow.h#L14-L15 https://github.com/bitcoin/bitcoin/blob/07b924775e4ff5dbfb9f2e7549fd13db8ec5c61b/src/test/fuzz/util.h#L84-L85

    Same style nit goes for ThreadPool::Submit().


    furszy commented at 11:01 pm on February 12, 2026:
    Done as suggested. [[nodiscard]] inlined. Also, credits to ryanofsky for the helper function idea <3. I was about to use a macro.
  8. in src/util/threadpool.h:169 in 5326821cbf
    169             }
    170             m_work_queue.emplace(std::move(task));
    171         }
    172         m_cv.notify_one();
    173-        return future;
    174+        return util::Expected<decltype(future), std::string>(std::move(future));
    


    hodlinator commented at 8:19 pm on February 12, 2026:

    nit: Can replace with:

    0        return {std::move(future)};
    

    furszy commented at 11:02 pm on February 12, 2026:
    Cool. Done as suggested. Thanks.
  9. in src/util/threadpool.h:164 in 5326821cbf
    163         auto future{task.get_future()};
    164         {
    165             LOCK(m_mutex);
    166             if (m_interrupt || m_workers.empty()) {
    167-                throw std::runtime_error("No active workers; cannot accept new tasks");
    168+                return util::Unexpected<std::string>("No active workers; cannot accept new tasks");
    


    hodlinator commented at 8:20 pm on February 12, 2026:

    nit: No need to specify Unexpected template arg:

    0                return util::Unexpected{"No active workers; cannot accept new tasks"};
    

    furszy commented at 11:02 pm on February 12, 2026:
    Cool. Done as suggested. Thanks.
  10. in src/httpserver.cpp:284 in 5326821cbf
    280+        if (auto res = g_threadpool_http.Submit(std::move(item)); !res.has_value()) {
    281+            LogWarning("Request rejected during server shutdown");
    282+            hreq->WriteReply(HTTP_SERVICE_UNAVAILABLE, "Request rejected during server shutdown");
    283+            return;
    284+        }
    285+
    


    hodlinator commented at 8:26 pm on February 12, 2026:

    meganit: Superfluous newline after if-block before end of outer block.


    furszy commented at 11:02 pm on February 12, 2026:
    Done.
  11. andrewtoth commented at 8:31 pm on February 12, 2026: contributor

    The race can be reproduced as follows:

    Can this be made into a functional regression test?

  12. in src/httpserver.cpp:214 in 5326821cbf outdated
    210@@ -211,7 +211,7 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
    211             }
    212         }
    213     }
    214-    auto hreq{std::make_unique<HTTPRequest>(req, *static_cast<const util::SignalInterrupt*>(arg))};
    215+    auto hreq{std::make_shared<HTTPRequest>(req, *static_cast<const util::SignalInterrupt*>(arg))};
    


    hodlinator commented at 8:32 pm on February 12, 2026:
    remark: Was experimenting with having ThreadPool::Submit() take a “task_creator” lambda that only gets called when no error condition in Submit occurs, it would then return the inner lambda which would take ownership of the unique_ptr as on master. Even though I have a distaste for shared_ptr, your version is more elegant.
  13. in src/test/fuzz/threadpool.cpp:1 in 241367d021 outdated


    hodlinator commented at 8:51 pm on February 12, 2026:

    Commit message suggestion for 241367d021d210f25e427e7bb35c676b293ff28a:

    0threadpool: make Submit return Expected instead of throwing
    1
    2Unlike exceptions, which can be ignored as they require extra try-catch
    3blocks, returning expected errors forces callers to always handle
    4submission failures.
    5
    6+ Not throwing an exception also happens to fix unclean shutdown bug
    7+ [#34573](/bitcoin-bitcoin/34573/) since we no longer throw when attempting to Submit() from http_request_cb().
    

    hodlinator commented at 8:54 pm on February 12, 2026:

    Commit message suggestion for 5326821cbffbef388421af082d43c7bc6bcc48f7:

    Given that I think the unclean shutdown is already fixed in the first commit by using util::Expected instead of throwing exceptions, I think this is somewhat more correct for the second:

     0http: properly respond to HTTP request during shutdown
     1
     2Makes sure we respond to the client as the HTTP request attempts to submit a task to
     3the threadpool during server shutdown.
     4
     5Roughly what happens:
     6
     71) The server receives an HTTP request and starts calling http_request_cb().
     82) Meanwhile on another thread, shutdown is triggered which calls InterruptHTTPServer()
     9   and unregisters http_request_cb() and interrupts the thread pool.
    103) The request (step 1) resumes and tries to submit a task
    11   to the now-interrupted server.
    12
    13This fix detects failed submissions immediately, and the server
    14responds with HTTP_SERVICE_UNAVAILABLE.
    

    I don’t think it’s necessary to classify this as a race condition. I guess it depends on how you define it.


    furszy commented at 11:02 pm on February 12, 2026:
    Sure. Done as suggested. Thanks.

    furszy commented at 11:06 pm on February 12, 2026:

    Taken, thanks!

    I don’t think it’s necessary to classify this as a race condition. I guess it depends on how you define it.

    There is a thread (shutdown - main thread) modifying a resource (thread pool) just before another thread (libevent http handler) accesses it. It is pretty much a race condition to me.


    hodlinator commented at 8:34 pm on February 13, 2026:

    There is a thread (shutdown - main thread) modifying a resource (thread pool) just before another thread (libevent http handler) accesses it. It is pretty much a race condition to me.

    Fair.

  14. furszy commented at 8:53 pm on February 12, 2026: member

    Can this be made into a functional regression test?

    The “processing request is delayed” step means manually placing a sleep() inside the libevent callback http_request_cb() to progress on the server shutdown concurrently. Then we need to sleep() during shutdown, just after InterruptHTTPServer(), to resume the request processing before shutdown occurs. So, the short answer is no. Sadly races aren’t easy to replicate deterministically in functional tests. We will surely be able to exercise more scenarios like this one, in unit tests, after the new http server inclusion (libevent removal path), which will bring better encapsulation.

    That being said, could provide a sleep()-based patch to replicate it locally.

  15. hodlinator commented at 9:01 pm on February 12, 2026: contributor
    Concept ACK 5326821cbffbef388421af082d43c7bc6bcc48f7
  16. DrahtBot removed the label CI failed on Feb 12, 2026
  17. furszy force-pushed on Feb 12, 2026
  18. furszy commented at 11:06 pm on February 12, 2026: member
    Updated per feedback, thanks @hodlinator!
  19. fanquake added this to the milestone 31.0 on Feb 13, 2026
  20. in src/util/threadpool.h:157 in 668eaa4ab4
    152@@ -151,20 +153,20 @@ class ThreadPool
    153      * Note: Ignoring the returned future requires guarding the task against
    154      * uncaught exceptions, as they would otherwise be silently discarded.
    155      */
    156-    template <class F> [[nodiscard]] EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    157-    auto Submit(F&& fn)
    158+    template <class F>
    159+    [[nodiscard]] util::Expected<std::future<std::invoke_result_t<F>>, std::string> Submit(F&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    


    andrewtoth commented at 2:25 pm on February 13, 2026:
    Can this be noexcept now?

    furszy commented at 3:38 pm on February 13, 2026:
    Sure, done as suggested.
  21. furszy force-pushed on Feb 13, 2026
  22. in src/httpserver.cpp:283 in b95e403595 outdated
    275@@ -276,7 +276,11 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
    276             req->WriteReply(HTTP_INTERNAL_SERVER_ERROR, err_msg);
    277         };
    278 
    279-        [[maybe_unused]] auto _{g_threadpool_http.Submit(std::move(item))};
    280+        if (auto res = g_threadpool_http.Submit(std::move(item)); !res.has_value()) {
    281+            LogWarning("Request rejected during server shutdown");
    282+            hreq->WriteReply(HTTP_SERVICE_UNAVAILABLE, "Request rejected during server shutdown");
    283+            return;
    284+        }
    


    hodlinator commented at 8:55 pm on February 13, 2026:
    1. Log the inner error string (currently it’s never read).
    2. Even though we know it in the context of this PR/issue - I think it’s a bit overly presumptuous to claim we are shutting down. All this outer logic knows is that Submit() failed. Would go with more neutral language.
    0        if (auto res = g_threadpool_http.Submit(std::move(item)); !res.has_value()) {
    1            LogWarning("Request rejected by thread pool: %s", res.error());
    2            hreq->WriteReply(HTTP_SERVICE_UNAVAILABLE, "Request rejected by thread pool");
    3            return;
    4        }
    

    (I was experimenting with changing ThreadPool::Submit() to enforce the queue depth by itself, which would be another potential reason for it to fail).


    furszy commented at 9:14 pm on February 13, 2026:

    The LogWarning change makes sense, but the WriteReply one isn’t really appropriate. Have in mind that WriteReply is an user-facing response, and referring to the “thread pool” there leaks internal details that do not help the user understand what happened nor what to do next.

    (I was experimenting with changing ThreadPool::Submit() to enforce the queue depth by itself, which would be another potential reason for it to fail).

    I briefly thought about this and ended up discarding the idea because there is no other known use-case that requires it for now. Capping the pool is required when the pool faces user’s requests, and all other applications we have for it are purely internal (parallel index sync, block inputs fetching, wallet rescan, kernel validation signals, etc). Ideally, we should only generalize features into the thread pool class only when we plan to use them in more than one place.


    furszy commented at 9:28 pm on February 13, 2026:
    We currently don’t have a simple way to detect shutdown from within the libevent request callback to be able to differentiate between states, and we shouldn’t pollute code just because of this. Submission can only fail because of a shutdown atm. We will be able to access more information from within this function after the libevent removal path, as the new server provides better encapsulation.

    hodlinator commented at 9:37 pm on February 13, 2026:

    Yeah, I can agree on not being to specific with the internals in user-facing messages and omitting thread pool.

    0        if (auto res = g_threadpool_http.Submit(std::move(item)); !res.has_value()) {
    1            LogWarning("Request rejected by thread pool: %s", res.error());
    2            hreq->WriteReply(HTTP_SERVICE_UNAVAILABLE, "Request rejected, probably due to server shutdown");
    3            return;
    4        }
    

    If you want to keep the current very assertive wording “Request rejected during server shutdown” I would suggest returning a one-value error enum instead of the std::string in util::Expected and having:

    0        if (auto res = g_threadpool_http.Submit(std::move(item)); !res.has_value()) {
    1            switch (res.error()) {
    2            case ThreadPool::SubmitError::Interrupted:
    3                LogWarning("Request rejected during server shutdown");
    4                hreq->WriteReply(HTTP_SERVICE_UNAVAILABLE, "Request rejected during server shutdown");
    5                return;
    6            } // no default case, so the compiler can warn about missing cases
    7        }
    

    hodlinator commented at 9:44 pm on February 13, 2026:

    I briefly thought about this and ended up discarding the idea because there is no other known use-case that requires it for now. Capping the pool is required when the pool faces user’s requests, and all other applications we have for it are purely internal (parallel index sync, block inputs fetching, wallet rescan, kernel validation signals, etc). Ideally, we should only generalize features into the thread pool class only when we plan to use them in more than one place.

    I was worried about multiple threads calling Submit() and doing the queue depth check on the outside in a racy way where they both pass and then both Submit() and exceed the depth by one. But libevent probably only calls http_request_cb() from one thread so we should be fine right now. Thanks for laying out your reasoning, makes sense.


    furszy commented at 9:51 pm on February 13, 2026:

    If you want to keep the current very assertive wording “Request rejected during server shutdown” I would suggest returning a one-value error enum instead of the std::string in util::Expected

    I’m not convinced we should add an error code enum when we’re not expecting to expand it anytime soon. It would force everyone using the pool to add a bunch of boilerplate code for no real gain. We could differentiate states now, without waiting for the post-libevent encapsulation, by checking whether the pool was interrupted too. Still, it feels like overthinking this for no real reason. I would focus on fixing the issue here and keep the code simple for now. We can always adapt/modify it when more pool use-cases get merged.


    hodlinator commented at 10:03 pm on February 13, 2026:
    True, I am probably somewhat biased by my earlier experiments. Maybe it would be cleaner for Submit() to return std::optional<std::future<std::invoke_result_t<F>>> instead of util::Expected<std::future<std::invoke_result_t<F>>, std::string> after all - as long as there is only one single cause of failure?

    furszy commented at 11:14 pm on February 13, 2026:

    Maybe it would be cleaner for Submit() to return std::optional<std::future<std::invoke_result_t<F>>> instead of util::Expected<std::future<std::invoke_result_t<F>>, std::string> after all - as long as there is only one single cause of failure?

    Hmm, good observation. Thinking further.. there are currently two possible errors during submission that are currently under the same error message: 1) a non-started pool and 2) an interrupted pool. The first one is actually a logical error rather than one during runtime, and we should probably assert/assume that cannot happen. Will think more about this.


    andrewtoth commented at 3:11 pm on February 15, 2026:
    I think Expected is the appropriate return value here. It is an unexpected failure to submit a task, rather than the result being “not found” or “not existing” as a nullopt would imply.

    hodlinator commented at 8:59 am on February 16, 2026:

    optional doesn’t strictly mean “found/not found”, IMO it’s closer to “may be set or not”. Submit() is not a query type function, we are submitting to a thread pool and should be getting a std::future back.

    The user-code and messages here immediately draw the conclusion that failure to submit means we are shutting down, so it is not that unexpected. Having a string as the error approaches stringly typed coding which should also be avoided (potentially returning different strings for different error conditions makes user code worse).

    As long as we only have one combined error condition (not started or interrupted), I think optional makes the most sense. If we separate the two I think util::Unexpected{enum SubmitError { NotStarted, Interrupted }} makes more sense.

    PS. I’m sorry if this feels nit-picky, I would probably let it slide if it was more of a leaf-type component.


    furszy commented at 3:36 pm on February 16, 2026:

    I went one step back and asked myself why added the submission exception in the first place (3.5 years ago). Should have remembered the reasoning sooner and not gotten obnubilated by our new shiny util::Expected toy.. sorry.

    Thread pools are a well-studied structure, and most languages and libraries handle submission errors in the same way:

    • Java’s ExecutorService throws RejectedExecutionException after shutdown().
    • Python’s ThreadPoolExecutor raises a RuntimeError after shutdown().
    • Boost.Asio treats submission after stop or shutdown as misuse.

    So, the principle is the same everywhere: a thread pool is a resource with a lifecycle. Once you cross the shutdown boundary, submitting work isn’t normal control flow anymore, and it should be treated differently.

    For us, that means:

    • “Pool not started”: that’s a bug in the calling code. We can either Assume it or throw a logic_error. Callers can’t recover from this anyway.
    • “Pool interrupted”: is a hard lifecycle boundary. Submitting after Interrupt() should throw a dedicated interrupted_exception. Not everyone has to worry about this.
    • Otherwise: strong guarantee. If you get a future back, your task will run.

    Will re-accommodate the code to follow this principles again. I’m inclined towards being consistent with existing libraries and not reinvent the wheel.

    Let me know if you guys have any reservation about this.


    hodlinator commented at 3:55 pm on February 16, 2026:
    I think the way Java and Python throw exceptions is dated. It’s what got us into the situation with this PR in the first place, returning an error by value is a safer pattern IMO. That’s why Rust opted to not support exceptions.

    furszy commented at 4:18 pm on February 16, 2026:

    I think the way Java and Python throw exceptions is dated. It’s what got us into the situation with this PR in the first place, returning an error by value is a safer pattern IMO. That’s why Rust opted to not support exceptions.

    It seems low-level Rust panics/aborts when submitting a task to an interrupted pool. What they have is a try_spawn wrapper (TrySubmit for us) that behaves in the way you describe. See rusty_pool or tokio-threadpool.

    It’s what got us into the situation with this PR in the first place

    Well, realistically speaking, the race was already there. We were returning the wrong error message before. The exception actually helped us catch it.


    andrewtoth commented at 4:46 pm on February 16, 2026:
    Agree with @hodlinator. Changing to exceptions is a step backwards. Modern error handling should use std::expected, which we can’t use yet. Also agree to use an enum for the different error cases, instead of using a std::optional. It’s ok if we don’t check the value yet in production code, but we can update the test cases to assert the proper error.

    andrewtoth commented at 5:10 pm on February 16, 2026:

    Well, realistically speaking, the race was already there. We were returning the wrong error message before. The exception actually helped us catch it.

    The previous race would return an error to the caller. The exception caused the node to crash. Using Expected prevents this kind of error from occurring because the caller is forced to handle the error instead of forgetting to wrap in try/catch.


    furszy commented at 0:50 am on February 17, 2026:

    The previous race would return an error to the caller. The exception caused the node to crash. Using Expected prevents this kind of error from occurring because the caller is forced to handle the error instead of forgetting to wrap in try/catch.

    Yes, that’s pretty much the PR description. What I meant above is that it’s great we’re here, because it means we’ve caught something new. That comment at least wasn’t a pro exceptions argument. It was a pro thread pool inclusion argument really.

  23. hodlinator commented at 8:59 pm on February 13, 2026: contributor
    Thanks for incorporating my earlier feedback btw!
  24. furszy force-pushed on Feb 13, 2026
  25. andrewtoth commented at 10:26 pm on February 14, 2026: contributor

    That being said, could provide a sleep()-based patch to replicate it locally.

    Here is a diff, along with a functional test rpc_http_shutdown_race.py. Applying this to the base commit 07b924775e4ff5dbfb9f2e7549fd13db8ec5c61b will fail with a remote disconnected error, and applying to this patch 542879da7b66e890d2c1bf668ab5680e939541f8 will pass.

     0diff --git a/src/httpserver.cpp b/src/httpserver.cpp
     1index 671e119642..d2de754183 100644
     2--- a/src/httpserver.cpp
     3+++ b/src/httpserver.cpp
     4@@ -18,6 +18,7 @@
     5 #include <util/strencodings.h>
     6 #include <util/threadnames.h>
     7 #include <util/threadpool.h>
     8+#include <util/time.h>
     9 #include <util/translation.h>
    10 
    11 #include <condition_variable>
    12@@ -257,6 +258,7 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
    13             hreq->WriteReply(HTTP_SERVICE_UNAVAILABLE, "Work queue depth exceeded");
    14             return;
    15         }
    16+        UninterruptibleSleep(2s);
    17 
    18         auto item = [req = std::move(hreq), in_path = std::move(path), fn = i->handler]() {
    19             std::string err_msg;
    20@@ -446,6 +448,7 @@ void InterruptHTTPServer()
    21     }
    22     // Interrupt pool after disabling requests
    23     g_threadpool_http.Interrupt();
    24+    UninterruptibleSleep(2s);
    25 }
    26 
    27 void StopHTTPServer()
    28diff --git a/test/functional/rpc_http_shutdown_race.py b/test/functional/rpc_http_shutdown_race.py
    29new file mode 100755
    30index 0000000000..f91cdee9f3
    31--- /dev/null
    32+++ b/test/functional/rpc_http_shutdown_race.py
    33@@ -0,0 +1,42 @@
    34+#!/usr/bin/env python3
    35+# Copyright (c) The Bitcoin Core developers
    36+# Distributed under the MIT software license, see the accompanying
    37+# file COPYING or http://www.opensource.org/licenses/mit-license.php.
    38+"""Test that an in-flight RPC during shutdown receives 503."""
    39+
    40+from test_framework.test_framework import BitcoinTestFramework
    41+from test_framework.util import assert_equal, str_to_b64str
    42+
    43+import http.client
    44+import urllib.parse
    45+
    46+class RpcHttpShutdownRaceTest(BitcoinTestFramework):
    47+    def set_test_params(self):
    48+        self.num_nodes = 1
    49+        self.setup_clean_chain = True
    50+
    51+    def run_test(self):
    52+        url = urllib.parse.urlparse(self.nodes[0].url)
    53+        authpair = f"{url.username}:{url.password}"
    54+        headers = {"Authorization": f"Basic {str_to_b64str(authpair)}"}
    55+
    56+        conn = http.client.HTTPConnection(url.hostname, url.port, timeout=10)
    57+        conn.connect()
    58+        conn.request("POST", "/", '{"method":"getblockchaininfo","id":1}', headers)
    59+
    60+        # Trigger shutdown outside RPC callback path so the in-flight request
    61+        # is interrupted before submitting to the pool.
    62+        self.nodes[0].process.terminate()
    63+
    64+        resp = conn.getresponse()
    65+        body = resp.read()
    66+        assert_equal(resp.status, http.client.SERVICE_UNAVAILABLE)
    67+        assert b"Request rejected during server shutdown" in body
    68+        node = self.nodes[0]
    69+        self.wait_until(lambda: node.process.poll() is not None, timeout=20)
    70+        stderr = node.stderr.read().decode("utf-8", errors="replace").strip()
    71+        node.is_node_stopped(expected_stderr=stderr, expected_ret_code=node.process.returncode)
    72+
    73+
    74+if __name__ == '__main__':
    75+    RpcHttpShutdownRaceTest(__file__).main()
    76diff --git a/test/functional/test_runner.py b/test/functional/test_runner.py
    77index cdfa8c1022..0d21907677 100755
    78--- a/test/functional/test_runner.py
    79+++ b/test/functional/test_runner.py
    80@@ -201,6 +201,7 @@ BASE_SCRIPTS = [
    81     'wallet_reindex.py',
    82     'wallet_reorgsrestore.py',
    83     'interface_http.py',
    84+    'rpc_http_shutdown_race.py',
    85     'interface_rpc.py',
    86     'interface_usdt_coinselection.py',
    87     'interface_usdt_mempool.py',
    
  26. andrewtoth approved
  27. andrewtoth commented at 3:11 pm on February 15, 2026: contributor

    ACK 542879da7b66e890d2c1bf668ab5680e939541f8

    This fixes the unclean shutdown race condition. It’s too bad we can’t have a regression test without adding sleeps.

  28. DrahtBot requested review from hodlinator on Feb 15, 2026
  29. furszy commented at 4:29 pm on February 15, 2026: member

    Thanks for the reproduction patch @andrewtoth! Added link to the PR description.

    This fixes the unclean shutdown race condition. It’s too bad we can’t have a regression test without adding sleeps.

    A small reflection about this.

    We can’t expect to be able to cover everything in functional tests. Simply because they live on a higher level and have no access to internals to exercise race conditions.

    These types of scenarios belong to the unit tests land. But for that to happen, the surrounding code has to be properly encapsulated. Which is not the case for the http server, as there is currently no class for it, we basically import an entire file with global variables and static functions (and up to a few days ago, we had no tests for the work queue at all and this race condition was already there, just hidden returning an incorrect error message instead).

    The only way to replicate scenarios like this one atm, on a unit test, would be to manually copy paste startup and shutdown. Which is not really scalable nor maintainable.

    Still, we will get better soon, the libevent removal path @pinheadmz is working on also brings better encapsulation that we can use to deeply cover the area.

  30. in src/util/threadpool.h:24 in 818e2edadc outdated
    17@@ -17,6 +18,7 @@
    18 #include <queue>
    19 #include <stdexcept>
    20 #include <thread>
    21+#include <type_traits>
    22 #include <utility>
    23 #include <vector>
    24 
    


    ismaelsadeeq commented at 3:42 pm on February 16, 2026:

    In “threadpool: make Submit return Expected instead of throwing” 818e2edadc11dca226539ddee945f0acf6e53814

    This introduces inconsistency. In Start, we throw a runtime error as a result of multiple calls to Start, or possibly an exception could occur when thread creation fails for some reason.

    I’m also a bit conflicted here. Why are we using util::Expected? Shouldn’t the client be patched to handle throws when interacting with the thread pool? It’s likely that we throw for some cases, and still back to an issue like the unclean shutdown we are fixing here?


    furszy commented at 3:47 pm on February 16, 2026:
    Yes. Please see #34577 (review). Same conclusion there. I’m changing the current PR approach now. I shouldn’t have gotten obnubilated by our shiny new util::Expected toy.

    hodlinator commented at 3:58 pm on February 16, 2026:
    Start() could probably Assume() or assert() instead of throw, it seems to be a bit of a logic error to Start() twice without stopping.

    andrewtoth commented at 4:48 pm on February 16, 2026:

    obnubilated

    lol

    Throwing exceptions should be deprecated IMO, and we should aim to use the more modern Expected approach for errors. If an error is unrecoverable, abort instead of throwing an exception.


    sedited commented at 5:45 pm on February 16, 2026:

    Throwing exceptions should be deprecated IMO

    I disagree with this as a general statement. For example, throwing an exception is the standard way to indicate an error in a constructor, as utilized all over the standard library. We also use the standard library very extensively, which makes extensive use of exceptions, and will require us to handle them for as long as we use it.


    andrewtoth commented at 10:15 pm on February 16, 2026:
    Right, we can’t return an Expected from a constructor like we can return a Result in Rust when creating an object. Other than that, though, I don’t think there’s a need to throw an exception. We still need to catch exceptions, of course. I didn’t suggest we stop doing that!
  31. ismaelsadeeq commented at 3:44 pm on February 16, 2026: member

    Concept ACK @furszy, it would be nice if you share a branch with a reproducer.

    edit: see #34577 (comment) for reproducer

  32. threadpool: make Submit return Expected instead of throwing
    Unlike exceptions, which can be ignored as they require extra try-catch
    blocks, returning expected errors forces callers to always handle
    submission failures.
    
    Not throwing an exception also fixes an unclean shutdown bug
    #34573 since we no longer throw when attempting to Submit()
    from the libevent callback http_request_cb().
    b576a24f36
  33. furszy force-pushed on Feb 17, 2026
  34. http: properly respond to HTTP request during shutdown
    Makes sure we respond to the client as the HTTP request attempts to submit a task to
    the thread pool during server shutdown.
    
    Roughly what happens:
    
    1) The server receives an HTTP request and starts calling http_request_cb().
    2) Meanwhile on another thread, shutdown is triggered which calls InterruptHTTPServer()
       and unregisters libevent http_request_cb() callback and interrupts the thread pool.
    3) The request (step 1) resumes and tries to submit a task to the now-interrupted server.
    
    This fix detects failed submissions immediately, and the server responds with
    HTTP_SERVICE_UNAVAILABLE.
    7fdc481599
  35. threadpool: Treat double-Start as logic error
    Calling Start() on an already-running pool is a programming error, not
    a recoverable runtime condition.
    881568b39b
  36. furszy force-pushed on Feb 17, 2026
  37. DrahtBot added the label CI failed on Feb 17, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-02-17 06:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me