WIP test: make mockscheduler test more reliable #18174

pull amitiuttarwar wants to merge 1 commits into bitcoin:master from amitiuttarwar:2020-02-fix-scheduler-test changing 1 files +1 −2
  1. amitiuttarwar commented at 5:12 pm on February 18, 2020: contributor

    The scheduler_tests/mockforward test introduced in #18037 sometimes fails on the x86_64 Linux machine with no wallet.

    This is an attempt to fix it by reordering the test so that the scheduler thread is started before any events are queued.

    Unfortunately I’m unable to reproduce locally, so I’m opening this PR & would like to get a few green runs to see if this is actually the fix.

  2. start scheduler thread before queuing jobs df29a71f05
  3. DrahtBot added the label Tests on Feb 18, 2020
  4. theStack commented at 5:26 pm on February 18, 2020: member
    Just as a general hint for the approach of “get a few green runs”: it’s also possible to register at travis-ci.org with your GitHub account and add your forked bitcoin repository there. The CI will then be triggered on every push to the fork as well. This proved very helpful for me personally as a check before opening PRs, since for some reason I had wrongly assumed that CI runs could only be triggered here on the main bitcoin repository.
  5. amitiuttarwar commented at 5:30 pm on February 18, 2020: contributor

    thanks for the tip @theStack !

    so you were able to get travis passing on your forked repo? when I tried previously I wouldn’t get meaningful results, eg. they would frequently time out or error for unrelated reasons. But I didn’t investigate too closely, so I can take another look

  6. theStack commented at 5:52 pm on February 18, 2020: member

    so you were able to get travis passing on your forked repo? when I tried previously I wouldn’t get meaningful results, eg. they would frequently time out or error for unrelated reasons. But I didn’t investigate too closely, so I can take another look

    It could be that on a first build you have to restart the job manually from the Travis dashboard, if a message like this appears at the very bottom of the job log:

    Error! Initial build successful, but not enough time remains to run later build stages and tests.
    See https://docs.travis-ci.com/user/customizing-the-build#build-timeouts . Please manually re-run
    this job by using the travis restart button. The next run should not time out because the build cache
    has been saved.
    

    Other than that, after this initial hurdle I think a Travis run here should behave the same as a run from your bitcoin fork – at least that’s my experience :)

  7. in src/test/scheduler_tests.cpp:176 in df29a71f05
    @@ -171,8 +172,6 @@ BOOST_AUTO_TEST_CASE(mockforward)
         size_t num_tasks = scheduler.getQueueInfo(first, last);
         BOOST_CHECK_EQUAL(num_tasks, 3ul);
     
    -    std::thread scheduler_thread([&]() { scheduler.serviceQueue(); });
    -
         // bump the scheduler forward 5 minutes
         scheduler.MockForward(boost::chrono::seconds(5*60));
    


    MarcoFalke commented at 7:16 pm on February 18, 2020:

    It looks like this line (or any line after it, but before the next BOOST_CHECK) still fails. E.g.

    Running 3 test cases...
    Test cases order is shuffled using seed: 902263820
    Entering test module "Bitcoin Core Test Suite"
    test/scheduler_tests.cpp(11): Entering test suite "scheduler_tests"
    test/scheduler_tests.cpp(156): Entering test case "mockforward"
    terminate called after throwing an instance of 'boost::wrapexcept<boost::condition_error>'
      what():  boost::condition_variable::do_wait_until failed in pthread_cond_timedwait: Invalid argument
    unknown location(0): fatal error: in "scheduler_tests/mockforward": signal: SIGABRT (application abort requested)
    test/scheduler_tests.cpp(173): last checkpoint
    test/scheduler_tests.cpp(156): Leaving test case "mockforward"; testing time: 430us
    test/scheduler_tests.cpp(112): Entering test case "singlethreadedscheduler_ordered"
    test/scheduler_tests.cpp(112): Leaving test case "singlethreadedscheduler_ordered"; testing time: 7233us
    test/scheduler_tests.cpp(38): Entering test case "manythreads"
    test/scheduler_tests.cpp(38): Leaving test case "manythreads"; testing time: 8082us
    test/scheduler_tests.cpp(11): Leaving test suite "scheduler_tests"; testing time: 15837us
    Leaving test module "Bitcoin Core Test Suite"; testing time: 15956us
    
  8. MarcoFalke commented at 11:16 pm on February 18, 2020: member
    I tried on three machines and could not reproduce the issue locally
  9. MarcoFalke commented at 0:56 am on February 19, 2020: member
  10. MarcoFalke commented at 1:21 am on February 19, 2020: member
    @jonasschnelli Can the build be reproduced locally? Is it using depends or system boost?
  11. jonasschnelli commented at 6:09 pm on February 19, 2020: contributor

    I looked into it a bit. This PR still seems to fail on bitcoinbuilds.org: https://bitcoinbuilds.org/index.php?job=2f3c8572-cbaf-4cab-a2d2-20809d713084

    What I observed was that only the depends-build configuration fails (the system-libs configuration works). If I clear the ccache cache, the build runs successfully.

    Maybe that helps identifying the issue.

  12. MarcoFalke commented at 6:14 pm on February 19, 2020: member

    If I clear the ccache cache, the build runs successfully.

    I think the failure is intermittent, so it might be hard to draw conclusions early.

  13. jonasschnelli commented at 6:19 pm on February 19, 2020: contributor

    I think the failure is intermittent, so it might be hard to draw conclusions early.

    It seems to fail 100% of the time on bitcoinbuilds.org for “Linux x86_64 depends” and “Linux 32 depends” when using the ccache/dependency cache.

  14. MarcoFalke commented at 0:57 am on February 20, 2020: member

    I am finally able to reproduce this on a single-CPU GCE n1-standard-1 instance (1 vCPU, 3.75 GB memory).

    I am using depends with NO_QT=1 NO_WALLET=1 and ccache. The bug is only reproducible once after each reboot, with:

    cd bitcoin/ && while sudo bash -c 'rm -f ./src/test/test_bitcoin && (echo 1 > /proc/sys/vm/drop_caches) && make -j 2 check' ; do true ; done

    I could not reproduce it when attaching gdb, running in valgrind or when enabling core dumps (in combination with #18183 )

  15. practicalswift commented at 1:25 pm on February 20, 2020: contributor

    @jonasschnelli @MarcoFalke I’m unable to reproduce :(

    Which toolchain are you compiling with? Is there any theory as to why the use of ccache would matter?

  16. practicalswift commented at 1:34 pm on February 20, 2020: contributor
    Also, are you able to trigger it when doing src/test/test_bitcoin -t scheduler_tests?
  17. fanquake closed this on Feb 27, 2020

  18. fanquake closed this on Feb 27, 2020

  19. sidhujag referenced this in commit d2de31888b on Feb 27, 2020
  20. amitiuttarwar deleted the branch on Feb 28, 2020
  21. amitiuttarwar commented at 8:40 pm on February 28, 2020: contributor

    thank you all for your help with this bug !

    I opened issue #18227 to track findings & continue investigation

  22. sidhujag referenced this in commit 5e1f190a26 on Nov 10, 2020
  23. DrahtBot locked this on Feb 15, 2022
  24. PastaPastaPasta referenced this in commit dc36f495de on Apr 20, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-04 22:12 UTC
