Fix intermittent issue in scheduler_tests #6171

pull laanwj wants to merge 1 commits into bitcoin:master from laanwj:2015_05_scheduler_test_fix_2 changing 1 files +0 −2
  1. laanwj commented at 5:06 PM on May 21, 2015: member

    Don't clear stopRequested and stopWhenEmpty at the top of serviceQueue, as this results in a race condition: on systems under heavy load, some of the threads only get scheduled on the CPU when the other threads have already finished their work. This causes the flags to be cleared post-hoc and thus those threads to wait forever.

    The potential drawback of this change is that the scheduler cannot be restarted after being stopped (an explicit reset method would be needed), but we don't use this functionality anyway.

  2. laanwj added the label Tests on May 21, 2015
  3. Fix intermittent hang issue in scheduler_tests
    Don't clear `stopRequested` and `stopWhenEmpty` at the top of
    `serviceQueue`, as this results in a race condition: on systems under
    heavy load, some of the threads only get scheduled on the CPU when the
    other threads have already finished their work. This causes the flags to
    be cleared post-hoc and thus those threads to wait forever.
    
    The potential drawback of this change is that the scheduler cannot be
    restarted after being stopped (an explicit reset would be needed), but
    we don't use this functionality anyway.
    bdcf5dea7c
  4. laanwj force-pushed on May 21, 2015
  5. gavinandresen commented at 7:08 PM on May 21, 2015: contributor

    utACK. That's what I get for trying to be too clever....

  6. laanwj merged this on May 21, 2015
  7. laanwj closed this on May 21, 2015

  8. laanwj referenced this in commit 4bf83b27b2 on May 21, 2015
  9. posita commented at 8:57 PM on July 30, 2015: contributor

    FYI, I'm still seeing intermittent deadlocks in test_bitcoin, even in v0.11.0. I'll see if I can get a dump from gdb --pid "$( pidof test_bitcoin )" the next time I observe it.

  10. posita commented at 11:14 PM on August 8, 2015: contributor

    Hmmm, I'm only seeing occasional deadlocks in v0.11.0 with debugging symbols removed (e.g., via eu-strip .../bin/test_bitcoin). They may also happen with a binary whose debugging symbols are intact, but I haven't observed that (yet).

    Here's what I found:

    % eu-strip -f .../bin/test_bitcoin.debug .../bin/test_bitcoin
    % # Run test_bitcoin in another shell and wait for it to hang
    % gdb -s .../bin/test_bitcoin.debug .../bin/test_bitcoin <PID>
    ...
    Reading symbols from .../bin/test_bitcoin...Reading symbols from .../bin/test_bitcoin.debug...done.
    done.
    Attaching to program: .../bin/test_bitcoin, process <PID>
    ...
    (gdb) bt
    [#0](/bitcoin-bitcoin/0/)  0x00007fa0334e308f in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    [#1](/bitcoin-bitcoin/1/)  0x00007fa0359f8a9b in boost::condition_variable::wait (this=0x7fa036fde358, m=...) at /usr/include/boost/thread/pthread/condition_variable.hpp:73
    [#2](/bitcoin-bitcoin/2/)  0x00007fa034e455ac in boost::thread::join_noexcept (this=this@entry=0x7fa036fd9180) at libs/thread/src/pthread/thread.cpp:312
    [#3](/bitcoin-bitcoin/3/)  0x00007fa0359f93e6 in join (this=0x7fa036fd9180) at /usr/include/boost/thread/detail/thread.hpp:756
    [#4](/bitcoin-bitcoin/4/)  boost::thread_group::join_all (this=this@entry=0x7ffe608fbf90) at /usr/include/boost/thread/detail/thread_group.hpp:118
    [#5](/bitcoin-bitcoin/5/)  0x00007fa0359f6c7e in scheduler_tests::manythreads::test_method (this=this@entry=0x7ffe608fcc87) at test/scheduler_tests.cpp:109
    [#6](/bitcoin-bitcoin/6/)  0x00007fa0359f70ce in scheduler_tests::manythreads_invoker () at test/scheduler_tests.cpp:43
    [#7](/bitcoin-bitcoin/7/)  0x00007fa035970bf7 in invoke<void (*)()> (this=<optimized out>, f=<optimized out>) at /usr/include/boost/test/utils/callback.hpp:56
    [#8](/bitcoin-bitcoin/8/)  boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke (this=<optimized out>)
        at /usr/include/boost/test/utils/callback.hpp:89
    [#9](/bitcoin-bitcoin/9/)  0x00007fa0349ed8f1 in operator() (this=<optimized out>) at ./boost/test/utils/callback.hpp:118
    [#10](/bitcoin-bitcoin/10/) operator() (this=<optimized out>) at ./boost/test/impl/unit_test_monitor.ipp:41
    [#11](/bitcoin-bitcoin/11/) invoke<boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<boost::unit_test::callback0<> > > (this=<optimized out>, f=...)
        at ./boost/test/utils/callback.hpp:42
    [#12](/bitcoin-bitcoin/12/) boost::unit_test::ut_detail::callback0_impl_t<int, boost::unit_test::(anonymous namespace)::zero_return_wrapper_t<boost::unit_test::callback0<boost::unit_test::ut_detail::unused> > >::invoke (this=<optimized out>) at ./boost/test/utils/callback.hpp:89
    [#13](/bitcoin-bitcoin/13/) 0x00007fa0349c46de in operator() (this=0x7ffe608fd7d0) at ./boost/test/utils/callback.hpp:118
    [#14](/bitcoin-bitcoin/14/) do_invoke<boost::scoped_ptr<boost::detail::translate_exception_base>, boost::unit_test::callback0<int> > (F=..., tr=...)
        at ./boost/test/impl/execution_monitor.ipp:281
    [#15](/bitcoin-bitcoin/15/) boost::execution_monitor::catch_signals (
        this=this@entry=0x7fa034c2d9a0 <boost::unit_test::singleton<boost::unit_test::unit_test_monitor_t>::instance()::the_inst>, F=...)
        at ./boost/test/impl/execution_monitor.ipp:885
    [#16](/bitcoin-bitcoin/16/) 0x00007fa0349c4f33 in boost::execution_monitor::execute (
        this=this@entry=0x7fa034c2d9a0 <boost::unit_test::singleton<boost::unit_test::unit_test_monitor_t>::instance()::the_inst>, F=...)
        at ./boost/test/impl/execution_monitor.ipp:1211
    [#17](/bitcoin-bitcoin/17/) 0x00007fa0349eda05 in boost::unit_test::unit_test_monitor_t::execute_and_translate (
        this=0x7fa034c2d9a0 <boost::unit_test::singleton<boost::unit_test::unit_test_monitor_t>::instance()::the_inst>, tc=...)
        at ./boost/test/impl/unit_test_monitor.ipp:69
    [#18](/bitcoin-bitcoin/18/) 0x00007fa0349d44af in boost::unit_test::framework_impl::visit (this=0x7fa034c2d8c0 <boost::unit_test::(anonymous namespace)::s_frk_impl()::the_inst>,
        tc=...) at ./boost/test/impl/framework.ipp:156
    [#19](/bitcoin-bitcoin/19/) 0x00007fa034a08a33 in boost::unit_test::traverse_test_tree (suite=..., V=...) at ./boost/test/impl/unit_test_suite.ipp:207
    [#20](/bitcoin-bitcoin/20/) 0x00007fa034a08a33 in boost::unit_test::traverse_test_tree (suite=..., V=...) at ./boost/test/impl/unit_test_suite.ipp:207
    [#21](/bitcoin-bitcoin/21/) 0x00007fa0349cf9fa in boost::unit_test::framework::run (id=1, id@entry=4294967295, continue_test=continue_test@entry=true)
        at ./boost/test/impl/framework.ipp:442
    [#22](/bitcoin-bitcoin/22/) 0x00007fa0349eb697 in boost::unit_test::unit_test_main (init_func=<optimized out>, argc=<optimized out>, argv=<optimized out>)
        at ./boost/test/impl/unit_test_main.ipp:185
    [#23](/bitcoin-bitcoin/23/) 0x00007fa03314fb45 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
    [#24](/bitcoin-bitcoin/24/) 0x00007fa035957b7e in _start ()
    (gdb) generate-core-file
    warning: Memory read failed for corefile section, 8192 bytes at 0x7ffe6097a000.
    Saved corefile core.<PID>
    ...
    

    It looks like it's hanging on a join, but I don't know why. See test/scheduler_tests.cpp#L109. @laanwj, is this similar to what you saw before?

    Core file (dumped from gdb) is here, if it's worth anything. Please let me know if I should open a separate issue.

  11. laanwj commented at 4:00 PM on August 10, 2015: member

    Yes, could be that there is another deadlock. Haven't noticed one myself yet, though, not even on travis.

  12. posita commented at 5:21 PM on August 10, 2015: contributor

    @laanwj, are you running binaries with or without debugging symbols in your Travis builds? As I mentioned. I have not yet seen this deadlock happen with a binary with debugging symbols. FYI, I filed #6540 to track this further.

  13. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me