test: disable comparison tool #6278

laanwj commented at 4:48 AM on June 13, 2015: member

Recently, many Travis builds have failed due to timeouts that occur in the comparison tool. This is just a test to see whether the timeouts can also be triggered without it. Not meant to be merged as-is.

test: disable comparison tool cf6b62bbda

laanwj added the label Tests on Jun 13, 2015

dexX7 commented at 2:54 PM on June 13, 2015: contributor

Hmmm.. that's interesting. Are you sure this is related to the comparison tool?

I'm asking because I've never seen timeouts in another project/fork, which is currently based on Bitcoin Core 0.10. There are probably fewer than 10 builds per day there, though. But here is what I did notice: after testing a migration to 0.11 (based on 053110d), the Boost tests timed out from time to time, but not always.

I assumed this was somehow a mistake on my part, or simply "bad luck". The number of 0.11 builds was very low, and right now 0.10 is still used, so this might not be related at all. On top of that, the Travis builds are routed through the container-based infrastructure.

However, when running ./src/test/test_bitcoin --log_level=test_suite in a loop locally (done a few minutes ago), it stops at some point here:

Entering test suite "scheduler_tests"
Entering test case "manythreads"

I think this rules out that it's related to Travis. I'm going to test a clean version of Core now.

dexX7 commented at 3:00 PM on June 13, 2015: contributor

The current master seems to have the same problem.

When running test_bitcoin in a loop, it stops at some point during the scheduler_tests.

Tested on Ubuntu 14.04 LTS x64 with:

Bitcoin version v0.11.99.0-ab0ec67 (2015-06-12 16:49:53 +0200)
Using OpenSSL version OpenSSL 1.0.1f 6 Jan 2014
Using BerkeleyDB version Berkeley DB 4.8.30: (April  9, 2010)

laanwj commented at 5:52 AM on June 15, 2015: member

The scheduler test is another possible source of hangs. It has been fixed a few times, but there may still be a race condition that makes it either never finish or finish really slowly.

Adding timestamps to the test output, as well as more verbose logging to test_bitcoin, may be a good idea.
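One way the timestamp idea could look, as a minimal sketch: a small shell filter that prefixes each output line with the wall-clock time, so a stalled log shows when the last line arrived. The function name `add_timestamps` is hypothetical and not part of the Bitcoin Core tree; it wraps the test run from the outside rather than changing test_bitcoin itself.

```shell
# Hypothetical helper (not part of the repository): prefix every line
# read from stdin with the current wall-clock time.
add_timestamps() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Usage, with the binary and flag quoted earlier in this thread:
#   ./src/test/test_bitcoin --log_level=test_suite 2>&1 | add_timestamps
```

Because the timestamp is taken when each line is read, a long gap between two consecutive lines points at the step that stalled.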

laanwj commented at 5:54 AM on June 15, 2015: member

On the other hand, I see a lot of false positives in Travis that end in the comparison tool, e.g. the end of the log usually looks like this:

11:06:51 15 BitcoindComparisonTool$1.onPreMessageReceived: Got empty header message from bitcoind
11:06:51 1 BitcoindComparisonTool.main: Block "b3" completed processing
11:06:51 1 BitcoindComparisonTool.main: Testing block b3 499b1ec0ece4c4ef3b123d7498e3a5cfc85685fc9998a3f07b0fc7c977433627
11:06:51 1 BitcoindComparisonTool.main: Sent inv with block 499b1ec0ece4c4ef3b123d7498e3a5cfc85685fc9998a3f07b0fc7c977433627
11:06:51 15 BitcoindComparisonTool$1.onPreMessageReceived: Got empty header message from bitcoind
Exception in thread "main" java.lang.NullPointerException
at com.google.bitcoin.core.BitcoindComparisonTool.main(BitcoindComparisonTool.java:311)
No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.

The build has been terminated

I think this is the NULL pointer that @theuni is looking for too.

theuni commented at 5:57 PM on June 15, 2015: member

@laanwj Yes, if you see the NPE, the tests will time out. In that case, it's 100% related to the comparison tool.

I'm still not sure how to proceed with troubleshooting that issue.

laanwj commented at 12:41 PM on June 16, 2015: member

@theuni I remember your plan was to replace the comparison tool with one built from known source code. That would at least make the line numbers reliable. Or did problems come up with that?

theuni commented at 3:10 AM on June 17, 2015: member

Ah right.

Nope, it just slipped my mind. Will do.

laanwj commented at 8:21 AM on June 17, 2015: member

Yippie. I found one build that random-errored on master that didn't involve the comparison tool.

make[3]: Entering directory `/home/travis/build/bitcoin/bitcoin/bitcoin-x86_64-unknown-linux-gnu/src'

Running 158 test cases...

No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.

The build has been terminated

Probably the scheduler-tests again, although without more diagnostics it's hard to say.

dexX7 commented at 1:37 PM on June 17, 2015: contributor

although without more diagnostics it's hard to say.

Locally I'm able to pin it down:

STATUS=0; while [ "$STATUS" -eq 0 ]; do ./src/test/test_bitcoin --run_test=scheduler_tests/manythreads --log_level=all; STATUS=$?; done

After a few rounds it always stops with:

Entering test case "manythreads"
test/scheduler_tests.cpp(70): info: check nTasks == 0 passed
test/scheduler_tests.cpp(82): info: check nTasks == 100 passed
test/scheduler_tests.cpp(83): info: check first < last passed
test/scheduler_tests.cpp(84): info: check last > now passed

I confirmed it via Travis in a similar manner, see here and here.
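For unattended runs, a loop like the one above can be given a time limit so that a hang is reported instead of blocking forever. The following is only a sketch: it assumes the `timeout` utility from GNU coreutils is available (it exits with status 124 when the limit is hit), and the function name `run_until_stuck` and its parameters are hypothetical.

```shell
# Hypothetical helper: run a command repeatedly until it fails, hangs
# past a time limit, or a maximum number of rounds is reached.
# Usage: run_until_stuck MAX_ROUNDS LIMIT_SECONDS COMMAND [ARGS...]
run_until_stuck() {
    max_rounds=$1
    limit=$2
    shift 2
    round=1
    while [ "$round" -le "$max_rounds" ]; do
        status=0
        # GNU timeout kills the command after $limit seconds and returns 124.
        timeout "$limit" "$@" || status=$?
        if [ "$status" -eq 124 ]; then
            echo "round $round: timed out after ${limit}s"
            return 124
        elif [ "$status" -ne 0 ]; then
            echo "round $round: exited with status $status"
            return "$status"
        fi
        round=$((round + 1))
    done
    echo "no hang or failure in $max_rounds rounds"
}

# Usage, with the test invocation quoted in this thread:
#   run_until_stuck 500 60 ./src/test/test_bitcoin \
#       --run_test=scheduler_tests/manythreads --log_level=all
```

The round count and the 60-second limit in the usage line are arbitrary illustrative values, not figures taken from the thread.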

With additional output, it looks like it doesn't make it past joining the threads in the test:

// Drain the task queue then exit threads
microTasks.stop(true);
microThreads.join_all(); // ... wait until all the threads are done | <---- not passed

There seem to be a few reports mentioning issues in this context:

https://www.google.com/search?q=thread_group+join_all+hang

I'm not sure how this ties in with the comparison tool, though.

gavinandresen commented at 1:49 PM on June 17, 2015: contributor

@dexX7 what OS and version of boost? (I cannot reproduce a hang using your while loop on OSX 10.10.3 and boost 1.58.0). @laanwj : if it turns out to be a bug in boost or the OS handling gazillions of threads, perhaps just removing the scheduler stress test would be the right thing to do. We aren't actually using gazillions of threads with the scheduler in Core code, just one.

dexX7 commented at 2:04 PM on June 17, 2015: contributor

@gavinandresen: locally I'm using Ubuntu 14.04.2 with Boost 1.54. Travis uses Ubuntu 12.04 with Boost 1.55.

Disabling the scheduler test may reduce the number of occurrences, but as this PR indicates, timeouts were also seen during the comparison tool tests. It hasn't yet been shown that Boost threads are the cause; that's only my guess.

laanwj commented at 4:39 PM on June 17, 2015: member

The comparison tool failures and the scheduler test failures are completely unrelated; the only thing they have in common is that they cause transient Travis failures. @dexX7 Thanks for the extensive report. I'll have a look at the scheduler tests again and see if I can find the issue. I doubt that it is an issue with Boost itself; more likely it is our usage of it (under load). Disabling the test is also a possibility, but I'd prefer to find out what is wrong.

laanwj commented at 3:40 PM on June 19, 2015: member

Closing this pull for now; let's see how it goes after #6305.

laanwj closed this on Jun 19, 2015

MarcoFalke locked this on Sep 8, 2021