refactor: Replace boost::mutex,condition_var,chrono with std equivalents in scheduler #18234

pull ajtowns wants to merge 6 commits into bitcoin:master from ajtowns:202002-scheduler-deboost changing 14 files +151 −117
  1. ajtowns commented at 5:09 am on March 1, 2020: member

    Replacing boost functionality with C++11 stuff.

    Motivated by #18227, but should stand alone. Changing from boost::condition_var to std::condition_var means threadGroup.interrupt_all isn’t enough to interrupt serviceQueue anymore, so that means calling stop() before join_all() is needed. And the existing reverselock.h code doesn’t work with sync.h’s DebugLock code (because the reversed lock won’t be removed from g_lockstack which then leads to incorrect potential deadlock warnings), so I’ve replaced that with a dedicated class and macro that’s aware of our debug lock behaviour.

    Fixes #16027, Fixes #14200, Fixes #18227

  2. fanquake added the label Refactoring on Mar 1, 2020
  3. fanquake commented at 5:14 am on March 1, 2020: member
    cc @theuni. I know you have been working on similar refactors recently.
  4. ajtowns force-pushed on Mar 1, 2020
  5. ajtowns force-pushed on Mar 1, 2020
  6. DrahtBot commented at 10:03 am on March 1, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18271 (scheduler: Workaround negative nsecs bug in boost’s wait_until by luke-jr)
    • #18264 ([WIP] build: Remove Boost Chrono by fanquake)
    • #16117 (util: Replace boost sleep with std sleep by MarcoFalke)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. in src/sync.h:213 in 3cb2664971 outdated
    208+        const int line;
    209+     };
    210+     friend class reverse_lock;
    211 };
    212 
    213+#define REVERSE_LOCK(g) decltype(g)::reverse_lock PASTE2(revlock, __COUNTER__)(g, #g, __FILE__, __LINE__)
    


    elichai commented at 1:29 pm on March 1, 2020:
    Pending comments on my post here: #18088 (comment) Maybe replace it with PASTE2(revlock,PASTE2(__FILE__,__LINE__))

    ajtowns commented at 1:40 am on March 2, 2020:
    We already use __COUNTER__ for LOCK() in sync.h, so reusing it here seems fine. Using __FILE__ seems unlikely to generate a usable symbol name too :)
  8. in src/sync.cpp:63 in 42a5326a4d outdated
    59@@ -60,6 +60,11 @@ struct CLockLocation {
    60             mutexName, sourceFile, itostr(sourceLine), (fTry ? " (TRY)" : ""), m_thread_name);
    61     }
    62 
    63+    std::string Name() const
    


    elichai commented at 2:24 pm on March 1, 2020:
    Maybe change to const std::string& Name() so that it will only copy/allocate it if needed?

    MarcoFalke commented at 3:04 pm on March 1, 2020:

    elichai commented at 3:10 pm on March 1, 2020:
    oh god. thanks. (for some reason I thought/hoped C++ helps you making sure you’re not violating the lifetime when using references) I think I should stop using Rust lol
  9. practicalswift commented at 8:14 pm on March 1, 2020: contributor

    Concept ACK

    Thanks for working on de-boosting! :)

  10. in src/test/txindex_tests.cpp:75 in 42a5326a4d outdated
    69@@ -70,9 +70,6 @@ BOOST_FIXTURE_TEST_CASE(txindex_initial_sync, TestChain100Setup)
    70     // shutdown sequence (c.f. Shutdown() in init.cpp)
    71     txindex.Stop();
    72 
    73-    threadGroup.interrupt_all();
    74-    threadGroup.join_all();
    75-
    


    jonatack commented at 2:22 pm on March 2, 2020:

    In 69460c9 could these be also removed in src/test/transaction_tests.cpp L::457-458 if they are handled with TestingSetup::~TestingSetup(); the test compiles and runs without them.

    0--- a/src/test/transaction_tests.cpp
    1+++ b/src/test/transaction_tests.cpp
    2@@ -453,9 +453,6 @@ BOOST_AUTO_TEST_CASE(test_big_witness_transaction)
    3
    4     bool controlCheck = control.Wait();
    5     assert(controlCheck);
    6-
    7-    threadGroup.interrupt_all();
    8-    threadGroup.join_all();
    9 }
    

    MarcoFalke commented at 5:08 pm on March 2, 2020:
    Not sure why this should be safe to remove. The scheduler needs to be stopped before this test ends, so the interrupt_all should be replaced by a scheduler.stop(). I think that otherwise you reintroduce bug #15410

    jonatack commented at 5:19 pm on March 2, 2020:
    Thanks. I was just now looking at src/test/checkqueue_tests.cpp which has similar code.

    ajtowns commented at 3:36 am on March 4, 2020:
    @jonatack In transaction_tests, that’s a locally declared threadGroup, and it’s a BasicTestingSetup too, so the TestingSetup destructor wouldn’t execute, and even if it did, wouldn’t clean up that threadGroup. @MarcoFalke Ah, you’re quite right. Have undeleted the code and added a comment.

    jonatack commented at 2:57 pm on March 5, 2020:
    @ajtowns right, makes sense. Thanks for the explanation.

    MarcoFalke commented at 3:00 pm on March 17, 2020:

    in commit 306f71b4eb4a0fd8e64f47dc008bc235b80b13d9:

    After adding the scheduler.stop(), they are again redundant and confusing. I’d suggest to remove the threadGroup calls


    ajtowns commented at 6:32 pm on March 17, 2020:
    Calling stop() doesn’t actually cause the thread to stop, you need to wait for it to complete. Could do that by looping and waiting for AreThreadsServicingQueue() to be false, but that still leaves the possibility you continue on and destruct the object in between --nThreadsServicingQueue; and newTaskScheduled.notify_one(); at the end of serviceQueue. Could drop the interrupt_all but if some other thread in the group needed an interrupt before being joined, that’d cause a hang too… So until we change something else, this doesn’t seem redundant to me?

    MarcoFalke commented at 6:34 pm on March 17, 2020:
    Ups, correct.
  11. jonatack commented at 4:16 pm on March 2, 2020: member
    ACK 42a5326 I reviewed the changes and surrounding code in detail, built/ran tests for each commit, ran bitcoind for 2 days (debian), and shut down/restarted bitcoind several times. This looks good.
  12. theuni commented at 4:30 pm on March 2, 2020: member

    This doesn’t seem to take into account that boost::condition_variable is interruptible, and can’t simply be swapped out. Have I missed some prior work that makes this ok?

    Ok, I see that in the description. Re-reviewing.

  13. theuni commented at 5:03 pm on March 2, 2020: member
    I would’ve expected this to require more work, but it looks like the scheduler was already surprisingly interruptible. Concept ACK. @TheBlueMatt might be aware of more scheduling minefields.
  14. in src/scheduler.cpp:18 in 42a5326a4d outdated
    14@@ -16,22 +15,14 @@ CScheduler::CScheduler() : nThreadsServicingQueue(0), stopRequested(false), stop
    15 
    16 CScheduler::~CScheduler()
    17 {
    18+    LOCK(newTaskMutex);
    


    theuni commented at 5:50 pm on March 2, 2020:
    Adding a lock in a destructor just for an assertion is a shame. Maybe this should be behind a debug guard?

    ajtowns commented at 5:27 am on March 3, 2020:
    It’s probably not useful in the first place, the lock’s about to be destroyed, so if anyone else even has a reference to it there’s a bug, and threadsafety annotations don’t apply to constructors and destructors for the same reason.
  15. in src/init.cpp:209 in 42a5326a4d outdated
    206@@ -207,6 +207,7 @@ void Shutdown(NodeContext& node)
    207 
    208     // After everything has been shut down, but before things get flushed, stop the
    209     // CScheduler/checkqueue threadGroup
    210+    if (node.scheduler) node.scheduler->stop();
    


    theuni commented at 6:52 pm on March 2, 2020:
    Are these guaranteed to fully execute in-order? Can an interruption-point hit while the scheduler is still tearing down?

    ajtowns commented at 4:02 am on March 4, 2020:
    Not sure what you’re asking; node.scheduler->stop() just sets a boolean to say “exit the loop” and notifies everything waiting on the condition var to prevent it staying asleep, all the tear down happens in the thread. The interruption could certainly happen while it’s tearing down, stop(); interrupt_all() happen in one thread, and only afterwards does the other thread even wake up. I don’t think boost interrupt will have an effect anymore though, because serviceQueue doesn’t use any boost primitives and boost interrupt only sets a flag which is checked by other boost things.
  16. MarcoFalke renamed this:
    scheduler.cpp: Replace boost::mutex,condition_var,chrono with std equivalents
    refactor: Replace boost::mutex,condition_var,chrono with std equivalents in scheduler
    on Mar 3, 2020
  17. ajtowns force-pushed on Mar 4, 2020
  18. MarcoFalke added this to the "In progress" column in a project

  19. MarcoFalke added this to the milestone 0.20.0 on Mar 5, 2020
  20. MarcoFalke commented at 9:03 pm on March 5, 2020: member

    Should add

    0Fixes [#16027](/bitcoin-bitcoin/16027/), Fixes [#14200](/bitcoin-bitcoin/14200/), Fixes [#18227](/bitcoin-bitcoin/18227/)
    

    to OP?

  21. fanquake commented at 7:42 am on March 6, 2020: member
    @ajtowns Can you rebase on master now that #16117 is merged. Also, feel free to include a commit here to kill the last DHAVE_WORKING_BOOST_SLEEP_FOR.
  22. DrahtBot added the label Needs rebase on Mar 6, 2020
  23. scheduler: don't rely on boost interrupt on shutdown
    Calling interrupt_all() will immediately stop the scheduler, so it's
    safe to invoke stop() beforehand, and this removes the reliance on boost
    to interrupt serviceQueue().
    306f71b4eb
  24. sync.h: add REVERSE_LOCK b9c4260127
  25. scheduler: switch from boost to std
    Changes from boost::chrono to std::chrono, boost::condition_var to
    std::condition_var, boost::mutex to sync.h Mutex, and reverselock.h to
    sync.h REVERSE_LOCK. Also adds threadsafety annotations to CScheduler
    members.
    d0ebd93270
  26. Drop unused reverselock.h cea19f6859
  27. scheduler_tests: re-enable mockforward test 294937b39d
  28. MarcoFalke removed the label Needs rebase on Mar 6, 2020
  29. MarcoFalke added the label Needs rebase on Mar 6, 2020
  30. lint-cppcheck: Remove -DHAVE_WORKING_BOOST_SLEEP_FOR 70a6b529f3
  31. ajtowns force-pushed on Mar 6, 2020
  32. DrahtBot removed the label Needs rebase on Mar 6, 2020
  33. laanwj commented at 7:08 pm on March 6, 2020: member
    Great work! I’m really glad this is happening, thanks for working on this. ACK 70a6b529f306ff72ea1badf25e970a92b2b17ab3
  34. laanwj merged this on Mar 6, 2020
  35. laanwj closed this on Mar 6, 2020

  36. MarcoFalke moved this from the "In progress" to the "Done" column in a project

  37. in src/sync.cpp:163 in b9c4260127 outdated
    159@@ -155,6 +160,18 @@ void EnterCritical(const char* pszName, const char* pszFile, int nLine, void* cs
    160     push_lock(cs, CLockLocation(pszName, pszFile, nLine, fTry, util::ThreadGetInternalName()));
    161 }
    162 
    163+void CheckLastCritical(void* cs, std::string& lockname, const char* guardname, const char* file, int line)
    


    MarcoFalke commented at 3:05 pm on March 17, 2020:

    in commit b9c426012770d166e6ebfab27689be44e6e89aa5:

    Why is this not called GetLastLockName, which better describes what this function does, imo.

  38. in src/sync.h:206 in b9c4260127 outdated
    201+        reverse_lock(reverse_lock const&);
    202+        reverse_lock& operator=(reverse_lock const&);
    203+
    204+        UniqueLock& lock;
    205+        UniqueLock templock;
    206+        std::string lockname;
    


    MarcoFalke commented at 3:09 pm on March 17, 2020:

    in commit b9c426012770d166e6ebfab27689be44e6e89aa5:

    Should be m_lockname. Also for other members?

  39. in src/sync.h:205 in b9c4260127 outdated
    200+     private:
    201+        reverse_lock(reverse_lock const&);
    202+        reverse_lock& operator=(reverse_lock const&);
    203+
    204+        UniqueLock& lock;
    205+        UniqueLock templock;
    


    MarcoFalke commented at 4:40 pm on March 17, 2020:

    in commit b9c426012770d166e6ebfab27689be44e6e89aa5:

    I think it would help readers of the code to briefly explain why this is needed. Also, m_templock? ;)

  40. in src/scheduler.h:10 in d0ebd93270 outdated
     6@@ -7,11 +7,12 @@
     7 
     8 //
     9 // NOTE:
    10-// boost::thread / boost::chrono should be ported to std::thread / std::chrono
    11+// boost::thread should be ported to std::thread
    


    MarcoFalke commented at 4:45 pm on March 17, 2020:

    in commit d0ebd93270758ea97ea956b8821e17a2d001ea94:

    Actually boost::thread is no longer needed an can be replaced with std::thread in this file and the scheduler tests. I know that the thread group is still boost, but the scheduler doesn’t care about that.

  41. MarcoFalke commented at 4:48 pm on March 17, 2020: member

    ACK 70a6b529f306ff72ea1badf25e970a92b2b17ab3, just style-nits 🌩

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK 70a6b529f306ff72ea1badf25e970a92b2b17ab3, just style-nits 🌩
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUjbywwAhqXP0sWwiqNLLsI0h9fBTK3DQ7qRxX2cxV9RmUoiok2nKutVPGef2v4L
     8brJk/asu9hCJ5nb9/SMf6NRqnyhJ4lbTbt961Tu1au0LNfloCUfvfb8JxY42yDXD
     9zxzKYI/vQeUnaNBMINmZ9rOLkLMYqnxXb7fjejNL9l8txmE4jmRp53lvZj34IfDr
    100Mhu3rq+FwLFgvwMWPEbPR1jG/AN2XaW0sHMMe1Xpw/intRxJLh3BSpPHZ9+AjIF
    11kWF6Dqp4Onf2dO4jmfaB5ipQI9IYFFZh3+jD3wePBD9kZuL2iqsfXvfblm+rXGZ4
    12XMOSXjJoNreRp/hH14gcGuKQrOOiraemT15rH4XiC/DUI6lLMa1FWMNGn1Pob5A0
    13EwjIUwnHVNbCqpJIUcGXLEHTZ0pE/1cQ6nBKEFoPcTEgk7D1gH5mX4wP3UBdPw57
    14Vka9bjoTAOLqdy7B2FE8qg+dNiVqaDm9eRxixS1uqDY+lWHfNkeubg9SaE6LkC//
    155qT991xc
    16=GGMb
    17-----END PGP SIGNATURE-----
    

    Timestamp of file with hash 8b7a37edb1d87523e3df70995e0c51ee4dd4dafc50f78c0b12195a66e49c4ce2 -

  42. deadalnix referenced this in commit 014e5697a6 on Jun 11, 2020
  43. deadalnix referenced this in commit fba3cd6167 on Jun 11, 2020
  44. deadalnix referenced this in commit 1af5e7c22b on Jun 11, 2020
  45. deadalnix referenced this in commit 879fe2a322 on Jun 11, 2020
  46. MarcoFalke referenced this in commit 89a6bb9245 on Aug 28, 2020
  47. kittywhiskers referenced this in commit 54f16ee1a0 on Jul 13, 2021
  48. kittywhiskers referenced this in commit 6d6bc2abda on Jul 13, 2021
  49. Fabcien referenced this in commit cdc0cfb74c on Aug 27, 2021
  50. furszy referenced this in commit c8ad2c17ab on Aug 27, 2021
  51. PastaPastaPasta referenced this in commit 417099870a on Sep 28, 2021
  52. PastaPastaPasta referenced this in commit 9b8884b430 on Sep 28, 2021
  53. kittywhiskers referenced this in commit efb9cdad7a on Oct 12, 2021
  54. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 13:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me