init: Signal-safe instant shutdown #20605

pull laanwj wants to merge 1 commits into bitcoin:master from laanwj:2020_12_instant_shutdown changing 4 files +112 −14
  1. laanwj commented at 9:14 pm on December 8, 2020: member

    Replace the 200ms polling loop with a faster and more efficient waiting operation. This should speed up short RPC tests.

    This change has been tried a few times before, but abandoned every time because solutions used a condition variable which is not safe for use in signals, as they need to be reentrant.

    On UNIX-ish OSes, use a safe way: a pipe. When shutdown is requested write a dummy byte to the pipe. Waiting for shutdown is a matter of a blocking read from the pipe.

    On Windows, there are no signals so using a condition variable is safe.

    This only affects bitcoind. The GUI is unaffected by this change, and keeps polling as before in BitcoinGUI::detectShutdown(). It might be possible to listen to a pipe there, too, but I’m not sure, and it’s complicated by the GUI-node abstraction.

  2. laanwj added the label Utils/log/libs on Dec 8, 2020
  3. laanwj force-pushed on Dec 8, 2020
  4. sipa commented at 9:19 pm on December 8, 2020: member
    That’s an amazing hack. I didn’t know it was allowed to call system library functions from a signal handler, but apparently POSIX mandates that some functions are callable - including read() and write().
  5. DrahtBot commented at 6:49 am on December 9, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #20487 (draft: Add syscall sandboxing using seccomp-bpf (Linux secure computing mode) by practicalswift)
    • #19461 (multiprocess: Add bitcoin-gui -ipcconnect option by ryanofsky)
    • #19460 (multiprocess: Add bitcoin-wallet -ipcconnect option by ryanofsky)
    • #19160 (multiprocess: Add basic spawn and IPC support by ryanofsky)
    • #10102 ([experimental] Multiprocess bitcoin by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  6. jonasschnelli commented at 8:09 am on December 9, 2020: contributor

    Nice! Tested ACK f8585d59f1b0040765361eebd8a4544670de2f04 - (on macOS only)

    Is there a documentation what system function one can call in signal handler (to confirm with POSIX)?

  7. laanwj commented at 9:26 am on December 9, 2020: member

    Is there a documentation what system function one can call in signal handler (to confirm with POSIX)?

    This is a good question, I wasn’t able to find any definitive reference. The “self-pipe trick” is an incredibly common pattern in UNIX daemons though. It certainly works across Linux, Macos, modern BSDs… Fairly sure everything that can run Bitcoin Core. If it would be broken in principle then a lot of software would break.

    For example Python uses this too, calls it a ‘wakeup fd’. See https://github.com/python/cpython/blob/master/Modules/signalmodule.c#L294

  8. in src/shutdown.cpp:64 in eae384f8fa outdated
    59+    // case of a reentrant signal.
    60+    if (!fRequestShutdown.exchange(true)) {
    61+        // Write an arbitrary byte to the write end of the shutdown pipe.
    62+        const char token = 'x';
    63+        while (true) {
    64+            int result = write(g_shutdown_pipe[1], &token, 1);
    


    promag commented at 11:36 am on December 9, 2020:
    I’ve tried locally to just close() and it also works (tested on macos only). Note that WaitForShutdown is called twice so just one read succeeds, see https://github.com/promag/bitcoin/commit/c44a33c5532247d74cdaaf06d43a104ae2e36aa1.

    laanwj commented at 12:06 pm on December 9, 2020:
    What do you mean with “WaitForShutdown is called twice”? It isn’t. It’s called at the end of AppInit which itself is called once.

    laanwj commented at 12:07 pm on December 9, 2020:
    Or if you mean the call in AbortShutdown, sure, yes but that’s on purpose to reset the shutdown condition so it can be used again. I’d really prefer to use read/write and not close a socket under someone waiting on it !

    promag commented at 12:13 pm on December 9, 2020:
    Ah right, the second call is In AbortShutdown. But that doesn’t invalidate the point of just close() instead of write(), which looks enough to make read() return.

    laanwj commented at 12:17 pm on December 9, 2020:

    “looks enough” Yes, but we’re dealing with a lot of possibilities in different OSes here, just because it works for you doesn’t mean it’s safe. This whole thing (try googling about signals and pipe tricks) is a controversial subject, so I prefer to mimic existing code instead of inventing our own. Also, close() makes it more work to re-arm the condition in AbortShutdown.

    Edit: To be fair I do think it’s a clever trick to use close, and it might be good in some circumstances, but probably not in this case.

  9. promag commented at 11:48 am on December 9, 2020: member
    Concept ACK.
  10. jonatack commented at 11:53 am on December 9, 2020: member
    Tested ACK eae384f8fa0215 on Debian 5.9.11-1 (2020-11-27) x86_64 GNU/Linux
  11. promag commented at 12:20 pm on December 9, 2020: member
    Tested ACK eae384f8fa0215cfd63235b8cb22077a0c356ee0 on macos 10.15.6, needs squash I guess.
  12. laanwj force-pushed on Dec 9, 2020
  13. laanwj commented at 12:25 pm on December 9, 2020: member
    Squashed f8585d59f1b0040765361eebd8a4544670de2f04..eae384f8fa0215cfd63235b8cb22077a0c356ee0 → f628398631bc2a5b7839aeab5b3d89cbbff1f3fa
  14. jonatack commented at 12:54 pm on December 9, 2020: member

    Tested re-ACK f628398631bc2a5b7839aeab5b3d89cbbff1f3fa on Debian 5.9.11-1 (2020-11-27) x86_64 GNU/Linux

    Timed runs of a short rpc tests compared to master, they appear faster, and manual bitcoind shutdowns seem snappier.

  15. decryp2kanon commented at 6:39 am on December 10, 2020: contributor
    Concept ACK ❤️
  16. practicalswift commented at 9:29 am on December 10, 2020: contributor
    Concept ACK: swift shutdown is more practical than non-swift shutdown
  17. luke-jr commented at 4:22 pm on December 14, 2020: member

    On Windows, there are no signals so using a condition variable is safe.

    But Windows does have signals…?

  18. laanwj commented at 4:59 pm on December 14, 2020: member

    But Windows does have signals…?

    But not asynchronous ones like in POSIX. Windows always uses event loops.

    Edit: okay this is apparently not true, some notifications launch in a separate thread. It seems they are not reentrant in the same sense as POSIX signals though.

  19. laanwj commented at 5:38 pm on December 14, 2020: member

    Closing this, might pick it up some day or someone else could.

    Edit: we only set a SetConsoleCtrlHandler on Windows and this launches in a separate thread so this is exactly the right appreach.

  20. laanwj closed this on Dec 14, 2020

  21. laanwj reopened this on Dec 14, 2020

  22. sipa commented at 5:48 pm on December 14, 2020: member
    I suggest reopening, the approach here looks correct for Windows: https://docs.microsoft.com/en-us/windows/console/console-control-handlers (the control handlers are run in a separate thread).
  23. in src/shutdown.cpp:29 in f628398631 outdated
    24+std::condition_variable g_shutdown_cv;
    25+#else
    26+/** On UNIX-like operating systems use the self-pipe trick.
    27+ * Index 0 will be the read end of the pipe, index 1 the write end.
    28+ */
    29+static int g_shutdown_pipe[2];
    


    ryanofsky commented at 9:53 pm on December 14, 2020:

    In commit “init: Signal-safe instant shutdown” (f628398631bc2a5b7839aeab5b3d89cbbff1f3fa)

    Could set some initial values:

    0static int g_shutdown_pipe[2] = {-1, -1};
    

    To avoid possibility of code reading and writing to random pipes if someone forgets to call InitShutdownState.


    laanwj commented at 3:57 pm on December 15, 2020:
    Good idea. (0 is STDIN_FILENO so it should be more or less safe, but still, it’d be surprising behavior, it’s better to raise an error)
  24. in src/shutdown.cpp:83 in f628398631 outdated
    78 void AbortShutdown()
    79 {
    80+    if (fRequestShutdown) {
    81+        // Cancel existing shutdown by waiting for it, this will reset condition flags and remove
    82+        // the shutdown token from the pipe.
    83+        WaitForShutdown();
    


    ryanofsky commented at 10:34 pm on December 14, 2020:

    In commit “init: Signal-safe instant shutdown” (f628398631bc2a5b7839aeab5b3d89cbbff1f3fa)

    I don’t understand what AbortShutdown is supposed to do. Why would a caller request and abort a shutdown instead of just not requesting a shutdown until it wanted the shutdown? What prevents AppInit from exiting between the StartShutdown and AbortShutdown calls and causing AbortShutdown to hang forever? I could be misunderstanding, but it seems like AbortShutdown functionality was racy before this PR, and now racy and deadlocky after this PR.


    laanwj commented at 3:56 pm on December 15, 2020:
    I’m happy to see it go but it’s a) used in the tests b) before WaitForShutdown() is called in init. Both uses are safe, and I mention this in the comment (in the header file).

    ryanofsky commented at 1:06 pm on December 17, 2020:

    re: #20605 (review)

    I’m happy to see it go but it’s a) used in the tests b) before WaitForShutdown() is called in init. Both uses are safe, and I mention this in the comment (in the header file).

    Thanks for pointing out the comment. I missed that this was meant to be called before WaitForShutdown, so this makes much more sense now!

  25. ryanofsky referenced this in commit 73be44da44 on Dec 14, 2020
  26. ryanofsky approved
  27. ryanofsky commented at 11:01 pm on December 14, 2020: member

    Code review ACK f628398631bc2a5b7839aeab5b3d89cbbff1f3fa. Seems like a good approach. Could be cleaned up various ways (could be wrapped in a util class, could close file descriptors before shutdown, could handle more runtime errors and avoid asserts) but none of this seems important. I have a question below about AbortShutdown that would be good to clear up, but at worst I think may be a minor bug there, and wouldn’t want to hold up the PR.

    re: #20605#issue-534729158

    This only affects bitcoind. The GUI is unaffected by this change, and keeps polling as before in BitcoinGUI::detectShutdown(). It might be possible to listen to a pipe there, too, but I’m not sure, and it’s complicated by the GUI-node abstraction.

    The GUI is already asynchronous so shouldn’t need any polling/piping/condition variable stuff. Just some signals boilerplate like 73be44da440a961ad3acff4b6b358f918758c684 (branch)

  28. laanwj commented at 4:00 pm on December 15, 2020: member

    The GUI is already asynchronous so shouldn’t need any polling/piping/condition variable stuff. Just some signals boilerplate like 73be44d (branch)

    Wouldn’t raising a Qt signal also, likely, be unsafe in a signal handler?

    could handle more runtime errors and avoid asserts

    How do you suggest handling them? I used asserts because the program is in an unknown state if any of the conditions happen, it should only happen due to a coding bug or memory corruption, and there’s not a sane way to continue, e.g. either we spend forever waiting for shutdown, or the shutdown sequence runs prematurely (thinking of it, the latter is safe? then again it would be unexpected, at least with an assert it’s easier to see where something went wrong).

  29. init: Signal-safe instant shutdown
    Replace the 200ms polling loop with a faster and more efficient waiting
    operation.
    
    This was tried a few times before, but given up every time because
    solutions use a condition variable which is not safe for use in signals
    as they need to be reentrant.
    
    On UNIX-ish OSes, use a safe way: a pipe. When shutdown is requested
    write a dummy byte to the pipe. Waiting for shutdown is a matter of a
    blocking read from the pipe.
    
    On Windows, there are no signals so using a condition variable is safe.
    cd03513dc2
  30. laanwj force-pushed on Dec 15, 2020
  31. laanwj commented at 4:22 pm on December 15, 2020: member
    Added initializer for g_shutdown_pipe. f628398631bc2a5b7839aeab5b3d89cbbff1f3facd03513dc2fcccaa142e9632a28b38efd0056436
  32. jonatack commented at 10:52 pm on December 15, 2020: member

    ACK cd03513dc2fcccaa142e9632a28b38efd0056436 tested on Debian 5.9.11-1 (2020-11-27) x86_64 GNU/Linux

    When I read while (true) for some reason I think of 20 GOTO 10 or JMP ($000A)

  33. laanwj commented at 9:54 am on December 16, 2020: member

    When I read while (true) for some reason I think of 20 GOTO 10 or JMP ($000A)

    Heh, it could definitely be done without that, but I tend to write retry loops (especially those with complicated exit conditions, though that’s not the case here) as infinite loops usually. Here it avoids having to define result outside the loop and think about what to initialize it with for the first iteration.

  34. laanwj merged this on Dec 16, 2020
  35. laanwj closed this on Dec 16, 2020

  36. sidhujag referenced this in commit 5e4bc5616c on Dec 17, 2020
  37. ryanofsky commented at 1:15 pm on December 17, 2020: member

    Post-merge code review ACK cd03513dc2fcccaa142e9632a28b38efd0056436. Only change since last review initializing descriptor values.

    re: #20605 (comment)

    Wouldn’t raising a Qt signal also, likely, be unsafe in a signal handler?

    Oh! You’re completely right according to https://doc.qt.io/qt-5/unix-signals.html. So do need a thread. This can be added to previous implementation with small tweak: b6b5674a751604462184f59f7c1a5b19928b1274 (pr/guishut.2)

    I think potentional good followups for this PR would be:

    • Removing polling from GUI. Sketched above.
    • Moving platform specific pipe / condition variable code into an RAII util class that closes file descriptors and can be unit tested
  38. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me