Overview of slow unit tests

laanwj commented at 8:34 am on March 18, 2017: member

At last IRC meeting we’ve discussed trying to speed up the unit tests, as they’ve become much too slow. For this purpose I’ve run test_bitcoin --log_level=test_suite and picked out the slowest from the list.

The top 20 is (updated 2018-04-14 by @lucash-dev):

Test	Time (μs)
`test_big_witness_transaction`	39133014
`test_CheckQueue_Correct_Random`	14808019
`knapsack_solver_test`	9304171
`checkinputs_test`	6370348
`CreateNewBlock_validity`	6105558
`test_CheckQueue_Memory`	4230330
`versionbits_test`	3835090
`coins_cache_simulation_test`	3205869
`updatecoins_simulation_test`	2499840
`merkle_test`	2391688
`rescan`	1900315
`SelectCoins_test`	1731901
`passphrase`	1730167
`ListCoins`	1689793
`test_CheckQueue_Catches_Failure`	1563869
`test_CheckQueue_UniqueCheck`	1521864
`PrevectorTestInt`	1409636
`test_CheckQueue_Correct_Max`	1401687
`bnb_search_test`	1147974
`cuckoocache_generations`	978637

I think (open for discussion, of course) it is unreasonable for a unit test case to take more than ~1s (mind you, we have 242 of them, so each test is supposed to be quick) so we should take a look at each of these up to merkle_test.

It could be as simple in some cases as getting rid of (reasonably heavy) BOOST_* tests in inner loops, moving the checks up to a higher level.

Edit: I added the script to create this table to maintainer-tools: https://github.com/bitcoin-core/bitcoin-maintainer-tools/blob/master/unittest-statistics.py

laanwj added the label Tests on Mar 18, 2017

paveljanik commented at 9:19 am on March 18, 2017: contributor

The first one is

0/** Test that random numbers of checks are correct
1 */
2BOOST_AUTO_TEST_CASE(test_CheckQueue_Correct_Random)
3{
4    std::vector<size_t> range;
5    range.reserve(100000/1000);
6    for (size_t i = 2; i < 100000; i += std::max((size_t)1, (size_t)GetRand(std::min((size_t)1000, ((size_t)100000) - i))))
7        range.push_back(i);
8    Correct_Queue_range(range);
9}

By simply removing GetRand call, the time is reduced from 33s to 15s here. Ie. the test spends ~half of the time in generating random numbers.

paveljanik commented at 9:24 am on March 18, 2017: contributor

test_big_witness_transaction is signing 4500 inputs and it takes time.

NicolasDorier commented at 9:59 am on March 22, 2017: contributor

test_big_witness_transaction was a test I wrote for checking if the CachedHashes of segwit was thread safe. What takes time is the signing of inputs. I think a better idea is to hard code the transaction. But I think it would be kind of big.

JeremyRubin commented at 6:50 pm on March 22, 2017: contributor

Sorry for writing so many of the slow tests!

Getting rid of GetRand from the CheckQueue tests is 100% fine; and should speed things up ( @kallewoof brought that up in review, but I didn’t think it would be a huge time sink).

The cuckoocache_generations test is parametrized by:

0    const uint32_t BLOCK_SIZE = 10000;
1    // We expect window size 60 to perform reasonably given that each epoch
2    // stores 45% of the cache size (~472k).
3    const uint32_t WINDOW_SIZE = 60;
4    const uint32_t POP_AMOUNT = (BLOCK_SIZE / WINDOW_SIZE) / 2;
5    const double load = 10;
6    const size_t megabytes = 32;
7    const size_t bytes = megabytes * (1 << 20);
8    const uint32_t n_insert = static_cast<uint32_t>(load * (bytes / sizeof(uint256)));

picking smaller constants should be fine (although increases the potential variance, e.g., likelihood of deterministic failure). The rest of the cuckoocache_tests are also parametric (e.g., just a size in MB and a hit rate lower bound) so can be easily adjusted. I initially picked a size to match the default parameter, but we could scale down to, say 4mb or 2mb.

NicolasDorier commented at 2:36 am on March 23, 2017: contributor

would writing my transaction with 4500 input hard coded be a problem ? I can do it but unsure if it is the right way.

laanwj commented at 5:40 am on March 23, 2017: member

@NicolasDorier just how large would that data be? Very large (multi-MB) files of test data likely shouldn’t be part of the repository.

NicolasDorier commented at 5:53 am on March 23, 2017: contributor

@laanwj around 1 MB.

JeremyRubin commented at 7:24 pm on March 23, 2017: contributor

I would also note that checkqueue tests should improve a lot under #9938, I need to clean up that PR a bit though in accordance to Matt’s feedback.

JeremyRubin referenced this in commit 39e906da29 on Mar 27, 2017

JeremyRubin referenced this in commit 18b4830242 on Mar 29, 2017

JeremyRubin referenced this in commit 690acd1970 on Mar 29, 2017

laanwj commented at 1:45 pm on April 2, 2017: member

Updated for #10128

JeremyRubin referenced this in commit ee0c0bee31 on Aug 9, 2017

sipa referenced this in commit 424be03305 on Oct 12, 2017

CinchBlue commented at 9:28 pm on February 8, 2018: none

Do you guys think we could get an update on this issue? I think that the top unit test_CheckQueue_Correct_Random is ultimately using a FastRandomContext which seems to be doing a lot of extra sampling for the random values – the procedure retries every time it fails to generate a bit sequence in a certain bit range. Would there be a more efficient way to pre-set the range, customize the random number generator to generate uniformly-distributed values in this bit range so that no retries are not necessary. Or, could the slowdown be happening in the verification phase?

MarcoFalke added the label Up for grabs on Apr 11, 2018

MarcoFalke commented at 8:01 pm on April 11, 2018: member

Marking “up for grabs” to see what unit tests are still slow and master and figure out how to speed them up.

https://github.com/bitcoin-core/bitcoin-maintainer-tools/blob/master/unittest-statistics.py might come in hnady.

jamesob commented at 8:03 pm on April 11, 2018: member

Should we add a “good first issue” label to this?

MarcoFalke added the label good first issue on Apr 11, 2018

lucash-dev commented at 2:38 pm on April 13, 2018: contributor

Hi, I would like to work on this as my first issue.

MarcoFalke commented at 2:46 pm on April 13, 2018: member

Just go ahead. No need to ask for permission :)

On Fri, Apr 13, 2018, 10:39 lucash-dev notifications@github.com wrote:

Hi, I would like to work on this as my first issue.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/bitcoin/bitcoin/issues/10026#issuecomment-381156178, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGmv7DEw1coKfkr1_L-iIWAfQwmQW17ks5toLi3gaJpZM4MhVdy .

lucash-dev commented at 2:20 pm on April 14, 2018: contributor

Updated table (as of #11200) Can’t compare numbers directly with previous ones, since it’s a different setup, but the ordering should be consistent.

(pasted into OP)

test_big_witness_transaction and test_CheckQueue_Correct_Random are still the most time consuming, so I’ll focus on these.

laanwj commented at 3:49 pm on April 14, 2018: member

@lucash-dev Thanks! Moved your table into the OP.

lucash-dev commented at 2:28 am on April 16, 2018: contributor

The https://github.com/bitcoin-core/bitcoin-maintainer-tools/blob/master/unittest-statistics.py script was mixing up test suite and test case statistics (presumably due to some change in how test_bitcoinoutput is generated).

Created an issue and a PR in the maintainer tools repo.

lucash-dev commented at 7:32 pm on April 21, 2018: contributor

test_big_witness_transaction is spending most of time converting CMutableTransaction to CTransaction, which involves recalculating hashes. That’s because each call to SignSignature does the conversion again for the whole transaction.

This means the time complexity for providing witnesses for the 4500 inputs has a O(n2) component.

Changing the test to use lower level functions ProduceSignature and UpdateTransaction, and reusing the same CTransaction in all iterations, reduces run time in about 70%.

Created a WIP PR (will add improvements to other tests to this PR).

I’m just wondering if that is not rather a bug in SignSignature. Should it create a new CTransaction every time it’s called?

lucash-dev commented at 4:31 pm on April 22, 2018: contributor

Update: went down a rabbit hole about this. I don’t think there’s much can be done without a very major refactoring affecting CTransaction and CMutableTransaction.

Actually the problem lies deeper, in MutableTransactionSignatureCreator. It’s constructor just creates a new CTransaction out of the mutable one. This doesn’t seem right. Changing it might require a bit of refactoring though. I’ll try and see if I can come up with a solution.

lucash-dev commented at 6:09 am on May 11, 2018: contributor

Ignore message below. figured it out. @JeremyRubin Is there a particular reason test_CheckQueue_Correct_Random needs to check so many ranges? Would 10,000 instead of 100,000 represent too low a probability of catching errors? This test case is still one of the most time consuming, and maybe you could better analyze the cost-benefit trade-off.

lucash-dev commented at 3:35 pm on May 15, 2018: contributor

I think #13050, which addresses the top 5 test cases, is in good shape for a review and possible merge.

laanwj referenced this in commit 36fc8052f6 on May 31, 2018

MarcoFalke referenced this in commit 3d3d8ae3a0 on Jun 7, 2018

MarcoFalke removed the label Up for grabs on Jun 15, 2018

MarcoFalke removed the label good first issue on Jun 15, 2018

laanwj referenced this in commit e3fec3cfa8 on Sep 11, 2018

MarcoFalke commented at 9:03 pm on October 11, 2019: member

I think this has been fixed? Anything left to do here?

adamjonas commented at 9:08 pm on October 11, 2019: member

Latest run on my machine (2019 Macbook pro) still found a bunch of tests over a second:

Test	Time (μs)
`test_big_witness_transaction`	3453172
`CreateNewBlock_validity`	2167830
`knapsack_solver_test`	2074167
`test_CheckQueue_Correct_Random`	1675857
`updatecoins_simulation_test`	1314895
`checkinputs_test`	1229128
`SelectCoins_test`	1206660
`coins_cache_simulation_test`	1134890
`versionbits_test`	1001234

adamjonas commented at 11:57 pm on October 15, 2019: member

Though in the middle of the list above, my old laptop (2013 MacBook Pro) runs into a bunch of trouble with test_CheckQueue_Correct_Random. It consistently comes above 30 seconds. Here are the others that are over a second:

Test	Time (μs)
`test_CheckQueue_Correct_Random`	35218784
`test_big_witness_transaction`	6219762
`CreateNewBlock_validity`	3941990
`knapsack_solver_test`	3168931
`test_CheckQueue_Memory`	2229287
`updatecoins_simulation_test`	2214178
`checkinputs_test`	1905050
`coins_cache_simulation_test`	1775631
`versionbits_test`	1510278

MarcoFalke commented at 8:00 pm on October 16, 2019: member

I was running the tests on arm64 bionic and still got a bunch of tests that take longer than one second. (Including the test_CheckQueue_Correct_Random test case)

Test	Time (μs)
`test_big_witness_transaction`	29437948
`test_CheckQueue_Correct_Random`	22050372
`knapsack_solver_test`	14629919
`CreateNewBlock_validity`	12781948
`updatecoins_simulation_test`	11699386
`checkinputs_test`	8679608
`SelectCoins_test`	8266631
`coins_cache_simulation_test`	8176292
`util_ArgsMerge`	6467467
`versionbits_test`	6014103
`processnewblock_signals_ordering`	5476931
`psbt_updater_test`	3793910
`merkle_test`	3661272
`mempool_locks_reorg`	3524035
`passphrase`	3458565
`cuckoocache_generations`	3174970
`compress_amounts`	2708759
`skiplist_test`	2623142
`ListCoins`	2607134
`test_CheckQueue_Memory`	2349730
`sighash_test`	1611576
`BlockPolicyEstimates`	1611403
`getlocator_test`	1591312
`addrman_getaddr`	1524868
`test_CheckQueue_Catches_Failure`	1503379
`script_build`	1313951
`script_json_test`	1294559
`script_standard_push`	1265544
`cuckoocache_hit_rate_ok`	1190260
`PrevectorTestInt`	1181760
`varints`	987794

JeremyRubin commented at 8:27 pm on October 16, 2019: contributor

ducks

I can take a look at improving it. Generally the checkqueue tests are aggressive and slow for two reasons:

They were designed to be slow on the existing checkqueue, and fast on the lock free one (edit: that I was working on but couldn’t get sufficient review for)
They want to test the parallelism-safety of the checkqueue, so absent a parallelism permutation testing framework (like loom for rust) we’re stuck with something that is bigger (& probabilistic)

PastaPastaPasta referenced this in commit 67513d3df3 on Dec 22, 2019

PastaPastaPasta referenced this in commit b85ffe17e5 on Jan 2, 2020

PastaPastaPasta referenced this in commit 84a3603b87 on Jan 4, 2020

PastaPastaPasta referenced this in commit 84ab39d400 on Jan 12, 2020

PastaPastaPasta referenced this in commit 962becc9eb on Jan 12, 2020

PastaPastaPasta referenced this in commit 69991c3764 on Jan 12, 2020

PastaPastaPasta referenced this in commit 4d0475b592 on Jan 12, 2020

PastaPastaPasta referenced this in commit 920e3e709d on Jan 12, 2020

PastaPastaPasta referenced this in commit 40f559ebbb on Jan 12, 2020

PastaPastaPasta referenced this in commit 1e2ab7633f on Jan 16, 2020

MarcoFalke commented at 0:29 am on April 27, 2020: member

The feature request didn’t seem to attract much attention in the past. Also, the issue seems not important enough right now to keep it sitting around idle in the list of open issues.

Closing due to lack of interest. Pull requests with improvements are always welcome.

MarcoFalke closed this on Apr 27, 2020

ckti referenced this in commit 72d0e25cdd on Mar 28, 2021

UdjinM6 referenced this in commit 3d00f2c563 on Jun 19, 2021

UdjinM6 referenced this in commit 8c8b51e5f5 on Jun 19, 2021

UdjinM6 referenced this in commit 7a5b4f3aee on Jun 24, 2021

UdjinM6 referenced this in commit d02a223c97 on Jun 24, 2021

UdjinM6 referenced this in commit f7c2a3844d on Jun 26, 2021

UdjinM6 referenced this in commit 5bb511dd46 on Jun 26, 2021

UdjinM6 referenced this in commit 87c9b5c73e on Jun 26, 2021

UdjinM6 referenced this in commit a13dee3fa7 on Jun 26, 2021

PastaPastaPasta referenced this in commit 02b0442fda on Jun 27, 2021

PastaPastaPasta referenced this in commit 7642625575 on Jun 28, 2021

PastaPastaPasta referenced this in commit 9c23705d84 on Jun 28, 2021

UdjinM6 referenced this in commit 3d4017d30d on Jun 28, 2021

UdjinM6 referenced this in commit e52bf1371b on Jun 28, 2021

PastaPastaPasta referenced this in commit a023431fbc on Jun 29, 2021

Munkybooty referenced this in commit 5b5a76e4eb on Jul 7, 2021

Munkybooty referenced this in commit c2d9621ec7 on Jul 8, 2021

Munkybooty referenced this in commit e427d09b5e on Jul 9, 2021

Munkybooty referenced this in commit 3e388c6c91 on Jul 11, 2021

MarcoFalke referenced this in commit fd557ceb88 on Jul 24, 2021

sidhujag referenced this in commit 52375cf41a on Jul 28, 2021

DrahtBot locked this on Feb 15, 2022

Overview of slow unit tests #10026