Shutdown during reindex-chainstate can block forever #23234

luke-jr opened this issue on October 8, 2021
  1. luke-jr commented at 11:23 pm on October 8, 2021: member

    During Shutdown, we stop the scheduler before waiting on the load-block thread. But the load-block thread can call LimitValidationInterfaceQueue via ActivateBestChain. LimitValidationInterfaceQueue then schedules a dummy call and waits for it. But since the scheduler has stopped, it never gets there, and blocks forever. Shutdown remains joined to the thread, and also never exits.

    Can we just wait for the load-block thread before killing the scheduler?
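The ordering problem described above can be reproduced in a toy model. The sketch below is a minimal illustration, not Bitcoin Core code: `ToyScheduler`, `DummyCallCompletes`, `StopSchedulerFirst` and `JoinWorkerFirst` are hypothetical names, and a timeout stands in for the unbounded wait so the example terminates instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;

// Hypothetical stand-in for the scheduler: runs queued callbacks on one thread.
class ToyScheduler {
public:
    ToyScheduler() : m_thread([this] { Run(); }) {}
    ~ToyScheduler() { Stop(); }

    void Schedule(std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push_back(std::move(fn));
        }
        m_cv.notify_one();
    }

    // Stop the service thread. Callbacks still queued (or queued later) never run.
    void Stop() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_cv.notify_one();
        if (m_thread.joinable()) m_thread.join();
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(m_mutex);
        while (true) {
            m_cv.wait(lock, [this] { return m_stop || !m_queue.empty(); });
            if (m_stop) return;
            auto fn = std::move(m_queue.front());
            m_queue.pop_front();
            lock.unlock();
            fn();
            lock.lock();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<std::function<void()>> m_queue;
    bool m_stop{false};
    std::thread m_thread; // declared last so the other members exist before Run() starts
};

// Models the worker's "schedule a dummy call and wait for it" step.
// Returns true if the dummy call ran within the timeout.
bool DummyCallCompletes(ToyScheduler& scheduler, std::chrono::milliseconds timeout) {
    auto promise = std::make_shared<std::promise<void>>();
    std::future<void> done = promise->get_future();
    scheduler.Schedule([promise] { promise->set_value(); });
    return done.wait_for(timeout) == std::future_status::ready;
}

// Problematic order: the scheduler is stopped before the worker does its wait.
bool StopSchedulerFirst() {
    ToyScheduler scheduler;
    scheduler.Stop();
    return DummyCallCompletes(scheduler, 200ms);
}

// Proposed order: join the worker first, then stop the scheduler.
bool JoinWorkerFirst() {
    ToyScheduler scheduler;
    bool completed = false;
    std::thread worker([&] { completed = DummyCallCompletes(scheduler, 5s); });
    worker.join();
    scheduler.Stop();
    return completed;
}
```

In this model `StopSchedulerFirst()` returns false, because the dummy call is never serviced and an untimed wait would block forever. `JoinWorkerFirst()` returns true because the scheduler is still running while the worker waits.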

  2. luke-jr added the label Bug on Oct 8, 2021
  3. hebasto commented at 7:54 am on October 9, 2021: member

    I’ve observed such behavior but did not notice the reasons.

    Thanks @luke-jr!

  4. ajtowns commented at 5:20 am on November 3, 2022: member

    I had what I think was a similar issue (scriptcheck thread hanging, waiting on SyncWithValidationInterfaceQueue to complete). I found that changing SyncWithValidationInterfaceQueue to be:

    void SyncWithValidationInterfaceQueue()
    {
        AssertLockNotHeld(cs_main);
        // Block until the validation queue drains
        auto promise = std::make_shared<std::promise<void>>();
        // Retrieve the future once: a second get_future() call throws std::future_error
        auto future = promise->get_future();
        CallFunctionInValidationInterfaceQueue([promise] {
            promise->set_value();
        });
        std::future_status status;
        do {
            status = future.wait_for(10s);
        } while (status != std::future_status::ready); // && !ShutdownRequested());
    }
    

    fixed my problem. In particular, Shutdown() calls scheduler->stop() before StopScriptCheckWorkerThreads() which gives a deadlock: scheduler is stopped, but worker threads can’t complete because they’re waiting for the scheduler to finish off the promise. Just having the scriptcheck threads exit early isn’t enough, because after the scriptcheck threads have finished FlushBackgroundCallbacks() is called which does the promise->set_value() above, so promise still needs to exist, hence making it a shared_ptr.
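The lifetime point above can be shown in isolation. This is a minimal sketch under stated assumptions, not Bitcoin Core code: `PendingCallbacks`, `WaitWithTimeout` and `FlushPending` are hypothetical names, and a plain vector stands in for the validation-interface callback queue. The waiter gives up early, yet the callback that runs later can still call `set_value()` safely because the lambda holds its own `shared_ptr` to the promise.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <vector>

using namespace std::chrono_literals;

// Hypothetical stand-in for the queue of pending callbacks.
using PendingCallbacks = std::vector<std::function<void()>>;

// Queue a callback that fulfils a promise, then wait up to `timeout` for it.
// Nothing drains the queue here, so the wait times out and the waiter returns
// while the callback is still pending.
bool WaitWithTimeout(PendingCallbacks& pending, std::chrono::milliseconds timeout) {
    auto promise = std::make_shared<std::promise<void>>();
    std::future<void> done = promise->get_future();
    // The lambda copies the shared_ptr, so the promise outlives this function.
    pending.push_back([promise] { promise->set_value(); });
    return done.wait_for(timeout) == std::future_status::ready;
}

// Models a late flush: runs every pending callback after the waiter has gone.
std::size_t FlushPending(PendingCallbacks& pending) {
    std::size_t ran = 0;
    for (auto& fn : pending) {
        fn(); // safe: the promise is still alive through the captured shared_ptr
        ++ran;
    }
    pending.clear();
    return ran;
}
```

Had the promise been a local variable captured by reference, the late `set_value()` call would touch a destroyed object. With the `shared_ptr` capture, `WaitWithTimeout` can return false and a subsequent `FlushPending` still runs the callback without undefined behavior.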

  5. Crypt-iQ commented at 3:18 pm on June 18, 2023: contributor

    I also experienced this issue: my bitcoind hung, and when I dumped the running threads, it was waiting on LimitValidationInterfaceQueue but the scheduler thread had already exited. I did not have reindex-chainstate enabled. I think this can happen in any situation where the callbacks aren’t cleared quickly enough and SyncWithValidationInterfaceQueue is called.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 19:13 UTC