test: Add test on skip heights in CBlockIndex

optout21 commented at 12:08 pm on October 20, 2025: contributor

The skip height values, used by CBlockIndex::BuildSkip() and GetSkipHeight() are not tested directly and not well documented. Indirect tests, through GetAncestor() exist in skiplist_tests.cpp. (This was noticed while reviewing #33515, recently merged.)

The motivation is to document the skip value computation through a test.

The first commit adds a test for the number of steps needed for an ancestor search, the number should be much lower than the linear distance.

The next commits add low-level tests to the GetSkipHeight() method. The previously internal method is exposed first.

DrahtBot added the label Tests on Oct 20, 2025

DrahtBot commented at 12:08 pm on October 20, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33661.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	l0rinc
Concept ACK	sedited
Approach ACK	sipa

If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

Conflicts

No conflicts as of last run.

optout21 force-pushed on Oct 20, 2025

l0rinc commented at 8:48 pm on October 20, 2025: contributor

Concept ACK, thanks for helping with the coverage and documentation. For reference https://maflcko.github.io/b-c-cov/test_bitcoin.coverage/src/chain.cpp.gcov.html is pretty well covered otherwise, but these tests can help with the specifics.

optout21 commented at 9:04 am on October 21, 2025: contributor

For reference https://maflcko.github.io/b-c-cov/test_bitcoin.coverage/src/chain.cpp.gcov.html is pretty well covered otherwise, but these tests can help with the specifics.

Thanks for the link! Indirectly it is tested extensively, as the skip mechanism is used while searching in the chain. However, there are no tests focusing on this specifically, and the why and how on the exact skip generation algorithm is not documented.

optout21 marked this as ready for review on Oct 30, 2025

in src/chain.cpp:79 in 6a22013f5a outdated

75@@ -76,7 +76,7 @@ CBlockIndex* CChain::FindEarliestAtLeast(int64_t nTime, int height) const
76 int static inline InvertLowestOne(int n) { return n & (n - 1); }
77 
78 /** Compute what height to jump back to with the CBlockIndex::pskip pointer. */
79-int static inline GetSkipHeight(int height) {
80+int GetSkipHeight(int height) {

l0rinc commented at 8:50 am on November 1, 2025:

👍 for the static removal, but why remove the inline keyword?

optout21 commented at 2:37 pm on November 3, 2025:

If it’s inlined, it can’t be used as extern from the test :(

l0rinc commented at 4:02 pm on November 3, 2025:

how so, what’s the failure you’re seeing? It was working for me locally

optout21 commented at 4:22 pm on November 3, 2025:

If GetSkipHeight is inline, I get the error:

0[ 75%] Built target bitcoin-node
1/usr/bin/ld: CMakeFiles/test_bitcoin.dir/chain_tests.cpp.o: in function `chain_tests::get_skip_height_test::test_method()':
2/home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:50:(.text+0x68b): undefined reference to `GetSkipHeight(int)'
3/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:51:(.text+0x7cf): undefined reference to `GetSkipHeight(int)'
4/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:52:(.text+0x90c): undefined reference to `GetSkipHeight(int)'
5/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:53:(.text+0xa49): undefined reference to `GetSkipHeight(int)'
6/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:54:(.text+0xb86): undefined reference to `GetSkipHeight(int)'
7/usr/bin/ld: CMakeFiles/test_bitcoin.dir/chain_tests.cpp.o:/home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:55: more undefined references to `GetSkipHeight(int)' follow
8collect2: error: ld returned 1 exit status
9gmake[2]: *** [src/test/CMakeFiles/test_bitcoin.dir/build.make:2463: bin/test_bitcoin] Error 1

Inlined method is not included in the obj file as a method. The solution would be to expose it through the header file, but I did not want to make that change just for the sake of the test.

l0rinc commented at 9:53 pm on November 3, 2025:

interesting, it does work for me, but inline might indeed be problematic in this case. Removing it should be fine, please resolve this comment.

optout21 commented at 9:39 am on November 4, 2025:

Indeed. Inline is only a hint, not crucial for modern compliers.

in src/test/chain_tests.cpp:51 in cdffd675d5 outdated

46+    // The test collects and analyses the distribution of the
47+    // skip difference numbers, and checks some properties:
48+    // non-uniform distribution with most values small but
49+    // with some large values as well.
50+
51+    FastRandomContext ctx;

l0rinc commented at 9:00 am on November 1, 2025:

This seems unused

optout21 commented at 2:58 pm on November 3, 2025:

Removed

in src/chain.cpp:80 in 6a22013f5a outdated

75@@ -76,7 +76,7 @@ CBlockIndex* CChain::FindEarliestAtLeast(int64_t nTime, int height) const
76 int static inline InvertLowestOne(int n) { return n & (n - 1); }
77 
78 /** Compute what height to jump back to with the CBlockIndex::pskip pointer. */
79-int static inline GetSkipHeight(int height) {
80+int GetSkipHeight(int height) {
81     if (height < 2)

l0rinc commented at 11:34 am on November 3, 2025:

nit: can this be if (height <= 2) as well? It’s likely outside the scope of the change though

optout21 commented at 1:01 pm on November 3, 2025:

Indeed, the input 2 also gives 0 as result. The intent may have been that got indices 0 and 1 it makes no sense to jump, for the rest follow the computation (but 2 accidentally also gives 0). In any case, I don’t want to change product behavior to the slightest in this PR.

in src/test/chain_tests.cpp:54 in 6a22013f5a

49+    BOOST_CHECK(GetSkipHeight(0b0001011101101001) == 0b0001011101000001);
50+    BOOST_CHECK(GetSkipHeight(0b0110101101011000) == 0b0110101101010000);
51+    BOOST_CHECK(GetSkipHeight(0b0110101101011001) == 0b0110101101000001);
52+    BOOST_CHECK(GetSkipHeight(0) == 0);
53+    BOOST_CHECK(GetSkipHeight(1) == 0);
54+    BOOST_CHECK(GetSkipHeight(2) == 0);

l0rinc commented at 11:38 am on November 3, 2025:

we could add comments explaining the magic values: # height <= 2 # powers of 2 # odd/even heights # absurdly big values or whatever

Otherwise it’s not obvious why 4 is 0 and 5 is 1 and there is no 6 but 7 is 1 again…

I need to understand why we chose these numbers, especially since the PR is meant to document the change

optout21 commented at 2:58 pm on November 3, 2025:

I’ve adjusted the numbers used and added comments

in src/test/chain_tests.cpp:48 in 6a22013f5a

42@@ -41,6 +43,78 @@ const CBlockIndex* NaiveLastCommonAncestor(const CBlockIndex* a, const CBlockInd
43 
44 } // namespace
45 
46+BOOST_AUTO_TEST_CASE(get_skip_height_test)
47+{
48+    BOOST_CHECK(GetSkipHeight(0b0001011101101000) == 0b0001011101100000);

l0rinc commented at 11:40 am on November 3, 2025:

I like the binary representation, but maybe we could use BOOST_CHECK_EQUAL insead (helps with the reason when they fail) and maybe we could align these to simplify finding the differences in their endings:

0    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101000),
1                                    0b0001011101100000);
2    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101001),
3                                    0b0001011101000001);
4    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011000),
5                                    0b0110101101010000);
6    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011001),
7                                    0b0110101101000001);

And a short comment could help in drawing our attention to the ending

optout21 commented at 2:59 pm on November 3, 2025:

Changed to BOOST_CHECK_EQUAL, added comments

l0rinc commented at 9:36 pm on November 3, 2025:

Formatting them to be on top of each other as the above example shows allows easier bit comparison, otherwise the binary representation doesn’t really help

optout21 commented at 9:40 am on November 4, 2025:

I did that for the longer values, for shorter values with only a few ones is not really need.

l0rinc commented at 2:13 pm on November 4, 2025:

Can we do it for the shorter ones as well? it’s easier to compare them and more consistent with the rest

optout21 commented at 10:15 am on February 4, 2026:

Done, in 2b94acee3dd3a70c17218d5599e4c4976afeba7d

in src/test/chain_tests.cpp:71 in 6a22013f5a outdated

66+
67+BOOST_AUTO_TEST_CASE(skip_height_properties_test)
68+{
69+    // The test collects and analyses the distribution of the
70+    // skip difference numbers, and checks some properties:
71+    // non-uniform distribution with most values small but

l0rinc commented at 11:41 am on November 3, 2025:

it seems to test more than non-uniformity. My understanding is that you want to show that most jumps are small, right? If so, maybe we don’t have to be that specific, can we just check that the median jump is tiny?

 0BOOST_AUTO_TEST_CASE(skip_height_properties_test)
 1{
 2    constexpr int n{30'000};
 3
 4    std::vector<std::unique_ptr<CBlockIndex>> block_index;
 5    block_index.push_back(std::make_unique<CBlockIndex>());
 6    for(int i{0}; i < n; ++i) {
 7        auto new_index{std::make_unique<CBlockIndex>()};
 8        new_index->pprev = block_index.back().get();
 9        new_index->nHeight = new_index->pprev->nHeight + 1;
10        new_index->BuildSkip();
11        block_index.push_back(std::move(new_index));
12    }
13
14    std::vector<int> skip_diffs;
15    skip_diffs.reserve(n);
16    for(int i{1}; i <= n; ++i) {
17        skip_diffs.push_back(i - block_index[i]->pskip->nHeight);
18    }
19
20    std::ranges::sort(skip_diffs);
21    const int median{skip_diffs[n / 2]};
22    
23    BOOST_CHECK_LT(median, 10); // Most jumps are small
24}

optout21 commented at 2:44 pm on November 3, 2025:

I check the avg, which is computationally easier to obtain. It would make sense to check the median as well, and maybe compare the median with the average. However, I prefer to avoid the O(N log N) complexity of the sorting.

l0rinc commented at 4:05 pm on November 3, 2025:

median is less sensitive to outliers - and nlogn is fine in tests

optout21 commented at 9:41 am on November 4, 2025:

Added median check as well. Much lower than the avg!

l0rinc commented at 2:15 pm on November 4, 2025:

The test became more complicated than it was before (which was my main critique of it, seems arbitrary). Could we simplify it? I think we’re testing too many things, a single median check should likely suffice in my opinion.

optout21 commented at 8:12 am on February 4, 2026:

Clarification: the checking part of the test was reworked, there is no longer checks on aggregate values (max, average). Only the iteration count is checked in each iteration. Thus this remark became obsolete. Resolving.

in src/test/chain_tests.cpp:93 in 6a22013f5a outdated

88+        new_index->nHeight = new_index->pprev->nHeight + 1;
89+        new_index->BuildSkip();
90+        block_index.push_back(std::move(new_index));
91+    }
92+
93+    // Analyze the diff (in height) of each element and its pskip

l0rinc commented at 11:43 am on November 3, 2025:

I need more context, how did we get to these numbers exactly? Do we really need to be this precise? Wouldn’t it suffice to check that their median is low enough?

optout21 commented at 9:41 am on November 4, 2025:

Added explanatory comments

in src/test/chain_tests.cpp:98 in 6a22013f5a

93+    // Analyze the diff (in height) of each element and its pskip
94+    int count_diff_smaller_16 = 0;
95+    int max_diff = 0;
96+    int total_diff = 0;
97+    for(int i = 1; i < n; ++i) {
98+        BOOST_CHECK(block_index[i]->nHeight == i);

l0rinc commented at 11:52 am on November 3, 2025:

this sounds like a BOOST_REQUIRE_EQUAL instead

optout21 commented at 8:17 am on February 4, 2026:

The check on nHeight has been removed, as it was essentially just validating the setup procedure, not the actual logic under test. This comment is thus outdated, resolving.

in src/test/chain_tests.cpp:80 in 6a22013f5a

75+    std::vector<std::unique_ptr<CBlockIndex>> block_index;
76+
77+    // Create genesis block
78+    auto genesis = std::make_unique<CBlockIndex>();
79+    genesis->nHeight = 0;
80+    block_index.push_back(std::move(genesis));

l0rinc commented at 11:55 am on November 3, 2025:

maybe this would also work

0    std::vector<std::unique_ptr<CBlockIndex>> block_index;
1    block_index.push_back(std::make_unique<CBlockIndex>());

optout21 commented at 3:00 pm on November 3, 2025:

Done

l0rinc commented at 9:49 pm on November 3, 2025:

I see you’ve moved it as well, clang complains with:

Clangd: Moving a temporary object prevents copy elision

optout21 commented at 9:41 am on November 4, 2025:

Oops, fixed!

in src/test/chain_tests.cpp:83 in 6a22013f5a

78+    auto genesis = std::make_unique<CBlockIndex>();
79+    genesis->nHeight = 0;
80+    block_index.push_back(std::move(genesis));
81+
82+    // Build a chain
83+    const auto n{30000};

l0rinc commented at 12:02 pm on November 3, 2025:

was wondering how slow this is, but seems acceptable:

0cmake -B build  -DCMAKE_BUILD_TYPE=Debug && cmake --build build && time ./build/bin/test_bitcoin --run_test=chain_tests/skip_height_properties_test
1...
2./build/bin/test_bitcoin --run_test=chain_tests/skip_height_properties_test  0.32s user 0.02s system 27% cpu 1.269 total

optout21 commented at 2:45 pm on November 3, 2025:

Yes, nontheless I reduce the count from 30000 to 10000, that’s also enough for the test.

l0rinc commented at 9:50 pm on November 3, 2025:

nit: consider formatting it as 10'000 for readability

optout21 commented at 9:41 am on November 4, 2025:

Done

in src/test/chain_tests.cpp:97 in 6a22013f5a outdated

92+
93+    // Analyze the diff (in height) of each element and its pskip
94+    int count_diff_smaller_16 = 0;
95+    int max_diff = 0;
96+    int total_diff = 0;
97+    for(int i = 1; i < n; ++i) {

l0rinc commented at 12:07 pm on November 3, 2025:

how come we’re iterating one less than before?

0    for(int i = 1; i <= n; ++i) {

optout21 commented at 2:34 pm on November 3, 2025:

It doesn’t make sense to check the skip for block index 0, that’s why 0 is excluded.

l0rinc commented at 4:01 pm on November 3, 2025:

yeah, but why is n skipped (see my suggested code)

optout21 commented at 4:25 pm on November 3, 2025:

Oh, I see. There are in fact n+1 elements. Will change.

optout21 commented at 9:42 am on November 4, 2025:

All cycles go from 1 to n-1; 0 is the genesis.

in src/chain.cpp:86 in 6a22013f5a

l0rinc commented at 12:09 pm on November 3, 2025:

Can we add a fuzz test that asserts for all h > 0: GetSkipHeight(h) < h and popcount(result) < popcount(h) (and just skip 0 since that’s already unit tested)?

optout21 commented at 9:43 am on November 4, 2025:

A fuzz test for this functionality (essentially a one-liner) seems a bit too much.

l0rinc commented at 2:18 pm on November 4, 2025:

k, please resolve the comment

l0rinc changes_requested

l0rinc commented at 12:33 pm on November 3, 2025: contributor

Concept ACK, during review I also understood this behavior better :) I have a few suggestions for simplification and am throwing you into the deep end by asking for a property based (fuzz) test as well to check certain invariants. Not a must-have, but seems simple enough. I also suggest simplifying the distribution test to just check for the median instead.

optout21 force-pushed on Nov 3, 2025

optout21 commented at 3:02 pm on November 3, 2025: contributor

Review comments addressed, mostly minor:

Reduce the number of iterations to 10000 (from 30000), for faster test execution.
Add comments on choice of test inputs.
Use BOOST_REQUIRE_EQUAL or BOOST_CHECK_GT where possible.

optout21 commented at 3:03 pm on November 3, 2025: contributor

@sipa, as this is triggered by #33515 of yours, I would kindly ask for a review.

sipa requested review from sipa on Nov 3, 2025

optout21 force-pushed on Nov 4, 2025

optout21 commented at 9:38 am on November 4, 2025: contributor

Changes applied:

Fix lint issue
Added check on the median as well
some minor formatting

optout21 force-pushed on Nov 4, 2025

optout21 commented at 12:26 pm on November 4, 2025: contributor

CI failed, limits of the 3rd for cycle were off; fixed

0     int count_diff_smaller_16 = 0;
1     int max_diff = 0;
2     int total_diff = 0;
3-    for(int i{1}; i < n; ++i) {
4+    for(size_t i{0}; i < skip_diffs.size(); ++i) {
5         auto diff = skip_diffs[i];
6         if (diff < 16) {
7             ++count_diff_smaller_16;

in src/test/chain_tests.cpp:49 in 6adafaa721

42@@ -41,6 +43,113 @@ const CBlockIndex* NaiveLastCommonAncestor(const CBlockIndex* a, const CBlockInd
43 
44 } // namespace
45 
46+BOOST_AUTO_TEST_CASE(get_skip_height_test)
47+{
48+    // Even values: the rightmost bit is zeroed
49+    // Test with variouse even values with at least 2 bits set

DrahtBot commented at 2:06 pm on November 4, 2025:

variouse -> various [spelling error in comment making the word invalid]

in src/test/chain_tests.cpp:73 in 6adafaa721

68+                                    0b0001011101000001);
69+    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011000),
70+                                    0b0110101101010000);
71+    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011001),
72+                                    0b0110101101000001);
73+    // All values 0--10

l0rinc commented at 2:15 pm on November 4, 2025:

what does -- mean here?

optout21 commented at 2:02 pm on November 6, 2025:

“0–10” meant “from 0 to 10 inclusive”. Changed to “0-10” to avoid parsing issues.

in src/test/chain_tests.cpp:143 in 6adafaa721

138+    // but general enough to be resilient to the exact test
139+    // details and algorithm
140+
141+    // The median skip diff is lower than the average
142+    BOOST_CHECK_LT(median, avg);
143+    // Most skip diffs are small (more than 67% are below 16):

l0rinc commented at 2:17 pm on November 4, 2025:

I don’t understand where 16 and the 67% come from. Can we just keep the very last check of the median instead?

optout21 commented at 2:03 pm on November 6, 2025:

The checks have been simplified, usage of “magic numbers” reduced.

optout21 force-pushed on Nov 6, 2025

optout21 commented at 2:01 pm on November 6, 2025: contributor

Pushed minor fixes after comments:

simplify test assertions
simplify computing of total/avg and max (no for loop)
formatting

in src/test/chain_tests.cpp:135 in ea998e18f6

130+        skip_diffs.push_back(i - skip_height);
131+    }
132+
133+    // Analyze the diffs
134+    // Compute the average
135+    const auto avg = double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size());

l0rinc commented at 3:09 pm on November 6, 2025:

nit: below we’re using brace init + dividing a double by a size_t is also double-division (untested):

0    const auto avg{double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / skip_diffs.size()};

in src/test/chain_tests.cpp:142 in ea998e18f6

137+    std::ranges::sort(skip_diffs);
138+    const auto median{skip_diffs[n / 2]};
139+    // Take the maximum
140+    const auto max{skip_diffs[skip_diffs.size() - 1]};
141+
142+    // Assert some parameters of the distribution

l0rinc commented at 3:10 pm on November 6, 2025:

nit: redundant comment (there a few more)

in src/test/chain_tests.cpp:139 in ea998e18f6

134+    // Compute the average
135+    const auto avg = double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size());
136+    // Compute the median
137+    std::ranges::sort(skip_diffs);
138+    const auto median{skip_diffs[n / 2]};
139+    // Take the maximum

l0rinc commented at 3:10 pm on November 6, 2025:

nit: redudnant comment, the code already expresses this clearly

in src/test/chain_tests.cpp:117 in ea998e18f6

112+    block_index.push_back(std::make_unique<CBlockIndex>());
113+    // Build a chain
114+    for(int i{1}; i < n; ++i) {
115+        auto new_index = std::make_unique<CBlockIndex>();
116+        new_index->pprev = block_index.back().get();
117+        BOOST_CHECK(new_index->pprev);

l0rinc commented at 3:12 pm on November 6, 2025:

nit: should likely be a BOOST_REQUIRE since this is still the setup phase, not yet the validation stage

l0rinc commented at 1:54 pm on February 3, 2026:

Doesn’t this trivially follow from the previous line since we never pushed a null unique_ptr? If you think we need a sanity check here, consider:

0        BOOST_REQUIRE_EQUAL(new_index->pprev, block_index[i - 1].get());

optout21 commented at 10:15 am on February 4, 2026:

True; check was removed (with the change to on-stack allocation).

l0rinc approved

l0rinc commented at 3:14 pm on November 6, 2025: contributor

ACK ea998e18f6d64f485d4e2d85a1028474fb35120a

Thanks for taking my suggestion, the code is very clean and easy to understand now. I left some nits I still saw, but they’re not blocking.

optout21 force-pushed on Nov 14, 2025

optout21 commented at 1:24 pm on November 14, 2025: contributor

Minor nits applied

 0@@ -114,7 +114,7 @@ BOOST_AUTO_TEST_CASE(skip_height_properties_test)
 1     for(int i{1}; i < n; ++i) {
 2         auto new_index = std::make_unique<CBlockIndex>();
 3         new_index->pprev = block_index.back().get();
 4-        BOOST_CHECK(new_index->pprev);
 5+        BOOST_REQUIRE(new_index->pprev);
 6         new_index->nHeight = new_index->pprev->nHeight + 1;
 7         new_index->BuildSkip();
 8         block_index.push_back(std::move(new_index));
 9@@ -131,16 +131,12 @@ BOOST_AUTO_TEST_CASE(skip_height_properties_test)
10     }
11 
12     // Analyze the diffs
13-    // Compute the average
14-    const auto avg = double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size());
15-    // Compute the median
16+    // Compute the average, median, maximum
17+    const auto avg{double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size())};
18     std::ranges::sort(skip_diffs);
19     const auto median{skip_diffs[n / 2]};
20-    // Take the maximum
21     const auto max{skip_diffs[skip_diffs.size() - 1]};
22 
23-    // Assert some parameters of the distribution
24-
25     // The median skip diff is lower than the average (non-uniform distribution)
26     BOOST_CHECK_LT(median, avg);
27     // There are some large skip diffs (more than half of n):

optout21 requested review from l0rinc on Nov 14, 2025

optout21 force-pushed on Nov 15, 2025

DrahtBot added the label CI failed on Dec 8, 2025

in src/chain.cpp:79 in e2e25f4947

75@@ -70,7 +76,7 @@ CBlockIndex* CChain::FindEarliestAtLeast(int64_t nTime, int height) const
76 int static inline InvertLowestOne(int n) { return n & (n - 1); }
77 
78 /** Compute what height to jump back to with the CBlockIndex::pskip pointer. */
79-int static inline GetSkipHeight(int height) {
80+int GetSkipHeight(int height) {

maflcko commented at 12:46 pm on December 9, 2025:

nit: clang-format this line, while touching. (leaving this nit, because the CI has to be re-run anyway)

0+int GetSkipHeight(int height)
1+{

optout21 commented at 7:09 am on December 11, 2025:

I kept the change minimal, but since the line is touched, I’ve adjusted the curly brace placement to be consistent, as suggested.

optout21 force-pushed on Dec 11, 2025

optout21 commented at 7:06 am on December 11, 2025: contributor

Very minor touch: moving of a curly nit-fix, rebase (noop); mostly to get CI to run again (had some apparently temporary issue).

(e2e25f4947e65869b014483be70b764a62bd0df7 -> 1a0423cf25bb60f98a29e06c39b7db2998860cd9)

DrahtBot removed the label CI failed on Dec 11, 2025

l0rinc commented at 6:39 pm on December 11, 2025: contributor

reACK 1a0423cf25bb60f98a29e06c39b7db2998860cd9

sipa commented at 7:06 pm on December 11, 2025: member

Concept ACK, but disagree with the approach of the first commit.

I don’t think it tests anything useful:

If it wanted to test the exact behavior of the skip-height pointer building, it could test that directly. No need for averages or medians here; it could test that block 171 points to block 161 and no other, for example.
If it wanted to test something related to the performance of the constructed index, without hardcoding the exact behavior, it would need to do something like pick random pairs of blocks in the chain, and see how many iterations CBlockIndex::GetAncestor (or a test-only reimplementation thereof that keeps track of that count) takes to get from one to the other. While this is related to properties of the differences, the structure really matters for this, and just distribution/median/average aren’t sufficient to guarantee this number is reasonable.

sipa removed review request from sipa on Dec 11, 2025

optout21 commented at 7:17 pm on December 11, 2025: contributor

I don’t think it tests anything useful:

* If it wanted to test the exact behavior of the skip-height pointer building, it could test that directly. No need for averages or medians here; it could test that block 171 points to block 161 and no other, for example.

* If it wanted to test something related to the performance of the constructed index, without hardcoding the exact behavior, it would need to do something like pick random pairs of blocks in the chain, and see how many iterations `CBlockIndex::GetAncestor` (or a test-only reimplementation thereof that keeps track of that count) takes to get from one to the other. While this is related to properties of the differences, the structure really matters for this, and just distribution/median/average aren't sufficient to guarantee this number is reasonable.

I’m more interested in the emergent behavior of the chain of skips, not in the exact values for some concrete values. Therefore I don’t think a unit test checking the result directly for some values has much added value. But testing the number of steps to get between two points on the chain is a good idea, that’s the important property of the skips. Thanks for the suggestion, I will reconsider the test.

sipa commented at 7:19 pm on December 11, 2025: member

My point is that these two properties, regardless of how they are accomplished, are the interesting ones to test.

Your test achieves the first, but only approximately, and is much more complicated than needed for that purpose.
Your test does not achieve the second at all, despite having complexity that seems intended to test for it.

Pick one of the two, and do it well.

optout21 marked this as a draft on Dec 12, 2025

optout21 force-pushed on Dec 12, 2025

DrahtBot added the label CI failed on Dec 12, 2025

optout21 force-pushed on Dec 12, 2025

optout21 marked this as ready for review on Dec 12, 2025

optout21 commented at 11:57 am on December 12, 2025: contributor

I redid the test, following @sipa’s remark.

The test on skip height properties – median, avg – is dropped
A new test is added, that tests the skip heights indirectly, by doing a bunch of ancestor searches, and counting the number of steps needed, and doing some asserts on the step count. In a 200k long chain a series 1000 runs are performed. In each run a start and a target heights are picked randomly: the start is in the upper half of the chain, and the target is below it, by at least 500. Ancestor search is performed, by duplicated the GetAncestor algorithm. (Deterministic randomness is used.)
The simple one-input-assert-the-output GetSkipHeight unit test is kept unmodified. This test has limited test harness value, it serves more as a documentation-by-test for this ill-documented method. I’m not very keen on this unit test, I can remove it if reviewers don’t see it useful.

fanquake commented at 1:24 pm on December 12, 2025: member

https://github.com/bitcoin/bitcoin/actions/runs/20165856545/job/57889037572#step:8:1220:

0 /home/runner/work/bitcoin/bitcoin/src/test/chain_tests.cpp:87:30: error: use of undeclared identifier 'GetSkipHeight'
1   87 |             int heightSkip = GetSkipHeight(heightWalk);
2      |                              ^
3/home/runner/work/bitcoin/bitcoin/src/test/chain_tests.cpp:88:34: error: use of undeclared identifier 'GetSkipHeight'
4   88 |             int heightSkipPrev = GetSkipHeight(heightWalk - 1);
5      |                                  ^
62 errors generated.

in src/test/chain_tests.cpp:152 in 2a1a0f9d89 outdated

147+                                        heightSkipPrev >= target_height)))) {
148+                // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
149+                pindexWalk = pindexWalk->pskip;
150+                heightWalk = heightSkip;
151+            } else {
152+                assert(pindexWalk->pprev);

l0rinc commented at 2:19 pm on December 12, 2025:

Could unify these to BOOST_REQUIRE instead of assert

optout21 commented at 6:09 pm on December 15, 2025:

The traversal code is copied CBlockIndex::GetAncestor(), hence the assert. Changed to BOOST_REQUIRE.

in src/test/chain_tests.cpp:154 in 2a1a0f9d89 outdated

149+                pindexWalk = pindexWalk->pskip;
150+                heightWalk = heightSkip;
151+            } else {
152+                assert(pindexWalk->pprev);
153+                pindexWalk = pindexWalk->pprev;
154+                heightWalk--;

l0rinc commented at 2:20 pm on December 12, 2025:

return value isn’t needed, could be simplified to:

0                --heightWalk;

optout21 commented at 6:09 pm on December 15, 2025:

As this code is copied, and not so important, I leave it as it is.

l0rinc commented at 6:15 pm on December 15, 2025:

We usually prefer new code be added according to the best practices instead of surrounding code - the history doesn’t matter, see: https://github.com/bitcoin/bitcoin/blob/fa4395dffd432b999002dfd24eb6f8d7384fbcbe/doc/developer-notes.md?plain=1#L7-L8

optout21 commented at 10:17 am on December 16, 2025:

Changed.

in src/test/chain_tests.cpp:162 in 2a1a0f9d89 outdated

157+        }
158+
159+        // Asserts on the number of iterations:
160+        // always less than 100 and less than 5% of the distance
161+        BOOST_CHECK_LT(iteration_count, 100);
162+        BOOST_REQUIRE(delta > 0);

l0rinc commented at 2:20 pm on December 12, 2025:

0        BOOST_REQUIRE_GT(delta, 0);

And if we keep this, it could go closer to the delta declaration

optout21 commented at 6:12 pm on December 15, 2025:

This check protects the next line, which divides by delta. I keep the check just before it is needed. Changed BOOST_REQUIRE to BOOST_REQUIRE_GT

in src/test/chain_tests.cpp:129 in 2a1a0f9d89 outdated

124+    const auto runs{1000};
125+    const auto min_distance{500};
126+    FastRandomContext ctx(/*fDeterministic =*/true);
127+    for (auto i{0}; i < runs; ++i) {
128+        // pick a height in the second half of the chain
129+        auto start_height = chain_size / 2 + ctx.randrange(chain_size / 2);

l0rinc commented at 2:23 pm on December 12, 2025:

To make sure we don’t have signed/unsigned problems here (since we’re subtracting from this later), we might as well make it explicitly int:

0        int start_height{chain_size / 2 + ctx.randrange(chain_size / 2)};

Actually we could reduce duplication if we subtract from the end instead:

0        const int start_height{(chain_size - 1) - ctx.randrange(chain_size / 2)};       // pick a height in the second half of the chain

optout21 commented at 6:12 pm on December 15, 2025:

Changed.

in src/test/chain_tests.cpp:131 in 2a1a0f9d89 outdated

126+    FastRandomContext ctx(/*fDeterministic =*/true);
127+    for (auto i{0}; i < runs; ++i) {
128+        // pick a height in the second half of the chain
129+        auto start_height = chain_size / 2 + ctx.randrange(chain_size / 2);
130+        BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
131+        auto& start_p = block_index[start_height];

l0rinc commented at 2:25 pm on December 12, 2025:

can you use brace init consistently to avoid narrowing? And corecheck suggests const reference - though we can likely inline it instead…

optout21 commented at 10:19 am on December 16, 2025:

Changed.

in src/test/chain_tests.cpp:62 in 2a1a0f9d89 outdated

57+                                    0b00010000);
58+    BOOST_CHECK_EQUAL(GetSkipHeight(0b00011000),
59+                                    0b00010000);
60+    BOOST_CHECK_EQUAL(GetSkipHeight(0b10101010),
61+                                    0b10101000);
62+    // Odd values: the 2nd and 3rd bits are zeroed

l0rinc commented at 2:31 pm on December 12, 2025:

I think you mean that the second and third set bits are zerod, right?

0    // Even heights: clear the lowest set bit.
1...
2    // Odd heights: clear the 2nd and 3rd set bits.

optout21 commented at 6:14 pm on December 15, 2025:

Yes, that’s how it was meant, I added the clarification!

in src/test/chain_tests.cpp:130 in 2a1a0f9d89 outdated

125+    const auto min_distance{500};
126+    FastRandomContext ctx(/*fDeterministic =*/true);
127+    for (auto i{0}; i < runs; ++i) {
128+        // pick a height in the second half of the chain
129+        auto start_height = chain_size / 2 + ctx.randrange(chain_size / 2);
130+        BOOST_REQUIRE(start_height > 0 && start_height < chain_size);

l0rinc commented at 2:37 pm on December 12, 2025:

This may be useful during development, but it’s basically testing the testing logic itself, isn’t it? If you keep them, can you add them after the initializations to have separate init and check sections, something like:

0const int start_height{(chain_size - 1) - ctx.randrange(chain_size / 2)};       // pick a height in the second half of the chain
1const int delta{min_distance + ctx.randrange(start_height - min_distance + 1)}; // pick a target height, earlier by at least min_distance
2const int target_height{start_height - delta};
3
4BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
5BOOST_REQUIRE(target_height > 0 && target_height < chain_size && target_height < start_height);
6BOOST_REQUIRE(delta > 0);

optout21 commented at 10:20 am on December 16, 2025:

Done; delta check no longer needed.

sipa commented at 2:37 pm on December 12, 2025: member

@optout21 Cool, approach ACK. You’ll need to make sure both commits compile, though.

in src/test/chain_tests.cpp:138 in 2a1a0f9d89 outdated

133+        auto delta{min_distance + ctx.randrange(start_height - min_distance)};
134+        auto target_height{start_height - delta};
135+        BOOST_REQUIRE(target_height > 0 && target_height < chain_size && target_height < start_height);
136+
137+        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
138+        const CBlockIndex* pindexWalk = start_p.get();

l0rinc commented at 2:42 pm on December 12, 2025:

it seems to me we can inline start_p here:

0        const auto* pindexWalk{block_index[start_height].get()};

optout21 commented at 6:16 pm on December 15, 2025:

It is saved in a local variable because it is also used at the end of the test method (auto p_ancestor{start_p->GetAncestor(target_height)};)

in src/test/chain_tests.cpp:141 in 2a1a0f9d89 outdated

136+
137+        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
138+        const CBlockIndex* pindexWalk = start_p.get();
139+        auto heightWalk = start_height;
140+        auto iteration_count{0};
141+        while (heightWalk > target_height) {

l0rinc commented at 2:43 pm on December 12, 2025:

nit: we could reduce the scope of the iteration-only variables:

0int iteration_count{0};
1for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {

nit: heightWalk -> height_walk

optout21 commented at 6:16 pm on December 15, 2025:

I rather keep it to match the original code.

in src/test/chain_tests.cpp:160 in 2a1a0f9d89 outdated

155+            }
156+            ++iteration_count;
157+        }
158+
159+        // Asserts on the number of iterations:
160+        // always less than 100 and less than 5% of the distance

l0rinc commented at 2:45 pm on December 12, 2025:

why 100 and why 5%? The numbers are already visible in the code, but not sure why - can we have that in the comment instead?

The 5% doesn’t seem to hold for non-deterministic random - seems arbitrary. Neither does the 100 - so my suspicion seems to have been accurate that these aren’t real properties of the system, just accidental ones of the sample.

optout21 commented at 10:21 am on December 16, 2025:

These were empirically derived values. I’ve adjusted them, see #33661 (comment) .

in src/test/chain_tests.cpp:143 in 2a1a0f9d89 outdated

138+        const CBlockIndex* pindexWalk = start_p.get();
139+        auto heightWalk = start_height;
140+        auto iteration_count{0};
141+        while (heightWalk > target_height) {
142+            int heightSkip = GetSkipHeight(heightWalk);
143+            int heightSkipPrev = GetSkipHeight(heightWalk - 1);

l0rinc commented at 2:51 pm on December 12, 2025:

Given that we’ve already called BuildSkip, do we need to test this directly or can we do it indirectly as well?

0            const int height_skip{pindexWalk->pskip->nHeight};
1            const int height_skip_prev{pindexWalk->pprev->pskip->nHeight};

optout21 commented at 6:17 pm on December 15, 2025:

Same here, I’d rather keep it to match the original code.

in src/test/chain_tests.cpp:167 in 2a1a0f9d89 outdated

162+        BOOST_REQUIRE(delta > 0);
163+        auto rel_iter_count = double(iteration_count) / double(delta);
164+        BOOST_CHECK_LT(rel_iter_count, 0.05);
165+
166+        // Since we are at it, we can also check that `GetAncestor` works
167+        // and gives the same result

l0rinc commented at 2:52 pm on December 12, 2025:

If you insist on the comment - though it duplicates what the code already does

0        // Check that `GetAncestor` works and gives the same result

optout21 commented at 6:17 pm on December 15, 2025:

Changed

in src/test/chain_tests.cpp:126 in 2a1a0f9d89 outdated

121+        block_index.push_back(std::move(new_index));
122+    }
123+
124+    const auto runs{1000};
125+    const auto min_distance{500};
126+    FastRandomContext ctx(/*fDeterministic =*/true);

l0rinc commented at 2:54 pm on December 12, 2025:

making it deterministic kinda’ defeats the purpose in my opinion (it’s a low bar to construct something that accidentally works, a lot more difficult to find the exact properties of a system that work less predictable values)

nit: extra space before =

optout21 commented at 10:25 am on December 16, 2025:

Non-deterministic unit tests are a bad practice, they can result in tests passing locally, but fail occasionally in CI, which is a pain. Deterministic random is a good way to make an unbiased selection out of a huge number of possibilities (which would be too slow to test). I disagree that deterministic selection defeats the purpose. (Glancing at the chart attached shows that it covers the problem space well.)

l0rinc commented at 10:33 am on December 16, 2025:

when those tests fail they should uncover an actual bug, so I’d argue that’s very useful. If they don’t, the test wasn’t really exercising the algorithm but was testing a special case only, so it was deceptive.

optout21 commented at 11:14 am on December 16, 2025:

In my opinion (unit) tests executed in a gating CI should be 100% reproducible. We may disagree on this, which is fine :) There is not much difference between a single run of a randomized test using a fixed seed vs. a variable (e.g. time-based). Multiple runs with the non-fixed seed will cover different input, that’s true, but they will be hard to reproduce. Imagine a random test that has a chance to fail once in every 1000 runs. It can be annoying if it fails suddenly in someone’s PR. Also it’s very difficult to reproduce to analyze it (unless the seed is recorded, which is usually not the case). The failure may be due to a test issue, but it may also have caught a long-hidden bug. A robust and reliable test harness is very valuable from software engineering point of view.

in src/test/chain_tests.cpp:119 in 2a1a0f9d89 outdated

114+    block_index.push_back(std::make_unique<CBlockIndex>());
115+    for(auto i{1}; i < chain_size; ++i) {
116+        auto new_index{std::make_unique<CBlockIndex>()};
117+        new_index->pprev = block_index.back().get();
118+        BOOST_CHECK(new_index->pprev);
119+        new_index->nHeight = new_index->pprev->nHeight + 1;

l0rinc commented at 3:00 pm on December 12, 2025:

0        new_index->nHeight = i;

optout21 commented at 6:19 pm on December 15, 2025:

Makes sense, but I keep it as it is. This way it shows better, the for each object nHeight = pprev->nHieght + 1 holds. Plus this is done the same way in the original chain_test test.

in src/test/chain_tests.cpp:164 in 2a1a0f9d89 outdated

159+        // Asserts on the number of iterations:
160+        // always less than 100 and less than 5% of the distance
161+        BOOST_CHECK_LT(iteration_count, 100);
162+        BOOST_REQUIRE(delta > 0);
163+        auto rel_iter_count = double(iteration_count) / double(delta);
164+        BOOST_CHECK_LT(rel_iter_count, 0.05);

l0rinc commented at 3:05 pm on December 12, 2025:

can we avoid floating point arithmetic here?

0        BOOST_CHECK_LT(iteration_count * 20, delta);

optout21 commented at 6:20 pm on December 15, 2025:

I think this makes the intent a bit less clear, and performance is not crucial in this line.

in src/test/chain_tests.cpp:114 in 2a1a0f9d89 outdated

109+    // Build a chain
110+    const auto chain_size{200'000};
111+    std::vector<std::unique_ptr<CBlockIndex>> block_index;
112+    block_index.reserve(chain_size);
113+    // Create genesis block
114+    block_index.push_back(std::make_unique<CBlockIndex>());

l0rinc commented at 4:05 pm on December 12, 2025:

genesis is only special that it’s the head of the linked list, right? We could include that in the loop:

0    std::vector<std::unique_ptr<CBlockIndex>> block_index;
1    block_index.reserve(chain_size);
2    for (int i{0}; i < chain_size; ++i) {
3        auto index{std::make_unique<CBlockIndex>()};
4        index->pprev = i ? block_index.back().get() : nullptr;
5        index->nHeight = i;
6        index->BuildSkip();
7        block_index.push_back(std::move(index));
8    }

optout21 commented at 6:21 pm on December 15, 2025:

Makes sense, but it make sense to keep the genesis special in this test.

in src/test/chain_tests.cpp:171 in 2a1a0f9d89 outdated

166+        // Since we are at it, we can also check that `GetAncestor` works
167+        // and gives the same result
168+        auto p_ancestor = start_p->GetAncestor(target_height);
169+        BOOST_CHECK_EQUAL(p_ancestor, block_index[target_height].get());
170+    }
171+}

l0rinc commented at 4:42 pm on December 12, 2025:

we have some checks in the while, some after, but we don’t have a check for all runs - we could add whether we saw both skips and prevs, something like:

 0    bool saw_skip{false}, saw_prev{false};
 1...
 2    for (int run{0}; run < 1'000; ++run) {
 3...
 4        for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
 5...
 6            if (skip && ... {
 7                saw_skip |= true;
 8            } else {
 9                saw_prev |= true;
10            }
11...
12    BOOST_CHECK(saw_skip && saw_prev);

optout21 commented at 6:22 pm on December 15, 2025:

It’s sufficient if all test runs are OK, I don’t see what could be checked across all the runs.

in src/test/chain_tests.cpp:148 in 2a1a0f9d89 outdated

143+            int heightSkipPrev = GetSkipHeight(heightWalk - 1);
144+            if (pindexWalk->pskip != nullptr &&
145+                (heightSkip == target_height ||
146+                (heightSkip > target_height && !(heightSkipPrev < heightSkip - 2 &&
147+                                        heightSkipPrev >= target_height)))) {
148+                // Only follow pskip if pprev->pskip isn't better than pskip->pprev.

l0rinc commented at 4:50 pm on December 12, 2025:

Instead of commenting, can we extract the logic to variables that explain what happens here? After a lot of fiddling with this part, it seems to me this may explain it a bit better:

 0// Perform traversal from start to target, skipping, as implemented in `GetAncestor`
 1const auto* pindex_walk{block_index[start_height].get()};
 2int iteration_count{0};
 3for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
 4    BOOST_REQUIRE(pindex_walk);
 5    const auto* const prev{Assert(pindex_walk->pprev)};
 6    const auto* const skip{Assert(pindex_walk->pskip)};
 7    BOOST_REQUIRE(prev->pskip);
 8
 9    const bool prev_skip_better{prev->pskip->nHeight >= target_height && prev->pskip->nHeight < skip->nHeight - 2};
10    if (skip->nHeight == target_height || (skip->nHeight > target_height && !prev_skip_better)) {
11        saw_skip = true;
12        pindex_walk = skip;
13        BOOST_REQUIRE(skip->nHeight < height_walk);
14        height_walk = skip->nHeight;
15    } else {
16        saw_prev = true;
17        pindex_walk = prev;
18        --height_walk;
19    }
20    BOOST_REQUIRE(height_walk >= target_height && height_walk == pindex_walk->nHeight);
21}

optout21 commented at 6:22 pm on December 15, 2025:

Again, not changed, copied code.

in src/test/chain_tests.cpp:125 in 2a1a0f9d89 outdated

120+        new_index->BuildSkip();
121+        block_index.push_back(std::move(new_index));
122+    }
123+
124+    const auto runs{1000};
125+    const auto min_distance{500};

l0rinc commented at 4:51 pm on December 12, 2025:

This will make boundary conditions less likely, running it with constexpr int min_distance{100}; revealed new failures for the randomized case for me locally.

optout21 commented at 6:24 pm on December 15, 2025:

The shorter the runs, the more variation there is. The ratio depends on the length (probably a log relationship). Performance-wise the long traversals are interested, that’s why I put a min distance limit.

sipa commented at 6:27 pm on December 15, 2025:

On average, when considering all pairs of heights (x,y), with $0 \leq x \leq y \leq N$, I believe the average number of steps scales roughly as $\log^2(N)$.

optout21 commented at 10:27 am on December 16, 2025:

Details in #33661 (comment)

optout21 commented at 10:29 am on December 16, 2025:

This will make boundary conditions less likely, running it with constexpr int min_distance{100}; revealed new failures for the randomized case for me locally.

The “5%” rule was very rough, and broke with short distances, you are right. I added it as the global maximum check does not enforce much for short distances. But now the formula-based boundary is stricter over a wide range of distance values.

in src/test/chain_tests.cpp:124 in 2a1a0f9d89 outdated

119+        new_index->nHeight = new_index->pprev->nHeight + 1;
120+        new_index->BuildSkip();
121+        block_index.push_back(std::move(new_index));
122+    }
123+
124+    const auto runs{1000};

l0rinc commented at 4:52 pm on December 12, 2025:

I think we can safely inline this if we call the loop iteration run:

0    for (int run{0}; run < 1'000; ++run) {

optout21 commented at 6:24 pm on December 15, 2025:

Changed.

in src/test/chain_tests.cpp:150 in 2a1a0f9d89 outdated

145+                (heightSkip == target_height ||
146+                (heightSkip > target_height && !(heightSkipPrev < heightSkip - 2 &&
147+                                        heightSkipPrev >= target_height)))) {
148+                // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
149+                pindexWalk = pindexWalk->pskip;
150+                heightWalk = heightSkip;

l0rinc commented at 5:08 pm on December 12, 2025:

we should assert that this will always converge towards genesis (i.e. it’s decreasing):

0                BOOST_REQUIRE(skip->nHeight < height_walk);

optout21 commented at 6:31 pm on December 15, 2025:

It should hold, but validating the traversal algorithm is not the primary aim, and I prefer to keep the duplicated code as close as possible. For this test it is relevant that the traversal completes in a finite number of steps.

optout21 commented at 10:30 am on December 16, 2025:

Added now.

in src/test/chain_tests.cpp:107 in 2a1a0f9d89 outdated

102+    // and asserts that its properties are suitable for performant
103+    // ancestor search.
104+    // It does so by imitating the `GetAncestor` function to
105+    // look for an ancestor, and ensuring that an arbitrary
106+    // ancestor is found in a number of steps that is much lower
107+    // than simple linear traversal.

l0rinc commented at 5:11 pm on December 12, 2025:

“much lower” sounds very subjective and I couldn’t find an obvious asymptote - can we just assert that it’s always lower? Maybe we can use the 110 from GetSkipHeight as an upper limit instead

optout21 commented at 10:31 am on December 16, 2025:

Now there is a upper-bound formula that quantifies how much lower.

in src/test/chain_tests.cpp:99 in 2a1a0f9d89 outdated

94+    BOOST_CHECK_EQUAL(GetSkipHeight(8), 0);
95+    BOOST_CHECK_EQUAL(GetSkipHeight(9), 1);
96+    BOOST_CHECK_EQUAL(GetSkipHeight(10), 8);
97+}
98+
99+BOOST_AUTO_TEST_CASE(skip_height_analysis)

l0rinc commented at 5:39 pm on December 12, 2025:

Seems fast enough:

./build/bin/test_bitcoin –run_test=chain_tests/skip_height_analysis 0.53s user 0.02s system 32% cpu 1.714 total

l0rinc changes_requested

l0rinc commented at 5:43 pm on December 12, 2025: contributor

I might have gone a bit overboard, but I have tried explaining every change that I suggested in context.

For convenience, here’s the full local diff I did during review (this is the latest state, some of the suggestions may reflect a previous one):

  0BOOST_AUTO_TEST_CASE(get_skip_height_test)
  1{
  2    // Even heights: clear the lowest set bit.
  3    BOOST_CHECK_EQUAL(GetSkipHeight(0b00010010),
  4                                    0b00010000);
  5    BOOST_CHECK_EQUAL(GetSkipHeight(0b00100010),
  6                                    0b00100000);
  7    BOOST_CHECK_EQUAL(GetSkipHeight(0b01000010),
  8                                    0b01000000);
  9    BOOST_CHECK_EQUAL(GetSkipHeight(0b00010100),
 10                                    0b00010000);
 11    BOOST_CHECK_EQUAL(GetSkipHeight(0b00011000),
 12                                    0b00010000);
 13    BOOST_CHECK_EQUAL(GetSkipHeight(0b10101010),
 14                                    0b10101000);
 15    // Odd heights: clear the 2nd and 3rd set bits.
 16    BOOST_CHECK_EQUAL(GetSkipHeight(0b10010011),
 17                                    0b10000001);
 18    BOOST_CHECK_EQUAL(GetSkipHeight(0b10100011),
 19                                    0b10000001);
 20    BOOST_CHECK_EQUAL(GetSkipHeight(0b11000011),
 21                                    0b10000001);
 22    BOOST_CHECK_EQUAL(GetSkipHeight(0b10010101),
 23                                    0b10000001);
 24    BOOST_CHECK_EQUAL(GetSkipHeight(0b10011001),
 25                                    0b10000001);
 26    BOOST_CHECK_EQUAL(GetSkipHeight(0b10101011),
 27                                    0b10100001);
 28    // Some longer random values (even and odd)
 29    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101000),
 30                                    0b0001011101100000);
 31    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101001),
 32                                    0b0001011101000001);
 33    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011000),
 34                                    0b0110101101010000);
 35    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011001),
 36                                    0b0110101101000001);
 37    // All values 0-10
 38    BOOST_CHECK_EQUAL(GetSkipHeight(0), 0);
 39    BOOST_CHECK_EQUAL(GetSkipHeight(1), 0);
 40    BOOST_CHECK_EQUAL(GetSkipHeight(2), 0);
 41    BOOST_CHECK_EQUAL(GetSkipHeight(3), 1);
 42    BOOST_CHECK_EQUAL(GetSkipHeight(4), 0);
 43    BOOST_CHECK_EQUAL(GetSkipHeight(5), 1);
 44    BOOST_CHECK_EQUAL(GetSkipHeight(6), 4);
 45    BOOST_CHECK_EQUAL(GetSkipHeight(7), 1);
 46    BOOST_CHECK_EQUAL(GetSkipHeight(8), 0);
 47    BOOST_CHECK_EQUAL(GetSkipHeight(9), 1);
 48    BOOST_CHECK_EQUAL(GetSkipHeight(10), 8);
 49}
 50
 51BOOST_AUTO_TEST_CASE(skip_height_analysis)
 52{
 53    // This test case indirectly tests the `GetSkipHeight` function,
 54    // and asserts that its properties are suitable for performant
 55    // ancestor search.
 56    // It does so by imitating the `GetAncestor` function to
 57    // look for an ancestor, and ensuring that an arbitrary
 58    // ancestor is found in no more steps than linear traversal.
 59
 60    constexpr int chain_size{200'000};
 61    std::vector<std::unique_ptr<CBlockIndex>> block_index;
 62    block_index.reserve(chain_size);
 63    for (int i{0}; i < chain_size; ++i) {
 64        auto index{std::make_unique<CBlockIndex>()};
 65        index->pprev = i ? block_index.back().get() : nullptr;
 66        index->nHeight = i;
 67        index->BuildSkip();
 68        block_index.push_back(std::move(index));
 69    }
 70
 71    bool saw_skip{false}, saw_prev{false};
 72
 73    FastRandomContext rng(/*fDeterministic=*/false);
 74    for (int run{0}; run < 1'000; ++run) {
 75        const int start_height{(chain_size - 1) - rng.randrange(chain_size / 2)};   // pick a height in the second half of the chain
 76        constexpr int min_distance{100};
 77        const int delta{min_distance + rng.randrange(start_height - min_distance)}; // pick a target height, earlier by at least min_distance
 78        const int target_height{start_height - delta};
 79
 80        BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
 81        BOOST_REQUIRE(target_height > 0 && target_height < chain_size && target_height < start_height);
 82        BOOST_REQUIRE(delta > 0);
 83
 84        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
 85        const auto* pindex_walk{block_index[start_height].get()};
 86        int iteration_count{0};
 87        for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
 88            BOOST_REQUIRE(pindex_walk);
 89            const auto* const prev{Assert(pindex_walk->pprev)};
 90            const auto* const skip{Assert(pindex_walk->pskip)};
 91            BOOST_REQUIRE(prev->pskip);
 92
 93            const bool prev_skip_better{prev->pskip->nHeight >= target_height && prev->pskip->nHeight < skip->nHeight - 2};
 94            if (skip->nHeight == target_height || (skip->nHeight > target_height && !prev_skip_better)) {
 95                saw_skip = true;
 96                pindex_walk = skip;
 97                BOOST_REQUIRE(skip->nHeight < height_walk);
 98                height_walk = skip->nHeight;
 99            } else {
100                saw_prev = true;
101                pindex_walk = prev;
102                --height_walk;
103            }
104            BOOST_REQUIRE(height_walk >= target_height && height_walk == pindex_walk->nHeight);
105        }
106
107        // End-of-walk invariants.
108        BOOST_CHECK_EQUAL(pindex_walk->nHeight, target_height);
109        BOOST_CHECK_LE(iteration_count, 110); // See GetSkipHeight
110
111        // Cross-check with the real implementation.
112        const auto* ancestor{block_index[start_height]->GetAncestor(target_height)};
113        BOOST_CHECK_EQUAL(ancestor, block_index[target_height].get());
114    }
115
116    BOOST_CHECK(saw_skip && saw_prev);
117}

optout21 force-pushed on Dec 15, 2025

optout21 commented at 1:48 pm on December 15, 2025: contributor

Fixed build error in 1st commit (the extern, etc, was added to the 2nd commit only, was missing from 1st).

optout21 commented at 1:48 pm on December 15, 2025: contributor

2 errors generated.

Fixed in 464ce57745894d4038c970c5baac3c9da5933c19

optout21 force-pushed on Dec 15, 2025

optout21 commented at 5:01 pm on December 15, 2025: contributor

Applied review comment changes to chain_tests.cpp. -> 2f77d8dd91a87071bc01cfb5b68f01ab176f06f7

optout21 force-pushed on Dec 15, 2025

optout21 commented at 6:36 pm on December 15, 2025: contributor

I might have gone a bit overboard, but I have tried explaining every change that I suggested in context.

Thanks for the details comments; addressed case-by-case.

sipa commented at 6:39 pm on December 15, 2025: member

There is an annoying conflict in design goals here.

On the one hand, it’s copying code to mimick the traversal algorithm. When copying code, you should not assume that the two remain in sync, and reviewers might rightfully be confused about whether that might be a problem. From this perspective, it’d be better if instead, CChain::GetAncestor itself was modified to count its number of steps performed, and use that in the test rather than a copy/reimplementation.

On the other hand, the goal isn’t to validate the traversal algorithm itself, but to validate a property of the skiplist it uses. The two are linked, but are not identical. In fact, there is no real reason why the two need to be identical in behavior. The added test here could use a simpler traversal algorithm (e.g. just always pick following pskip whenever that doesn’t jump past the target, and pprev otherwise) - the rest can be seen as a local optimization in the traversal code, which isn’t essential for verifying the skiplist itself.

So I think it’s worth considering one of those two approaches (annotating the original, or using a simplified copy), but I’m also ok with the current approach - perhaps if some comments are added that explain it’s copied, but doesn’t require keeping the two in sync.

DrahtBot removed the label CI failed on Dec 15, 2025

optout21 marked this as a draft on Dec 16, 2025

optout21 commented at 7:55 am on December 16, 2025: contributor

So I think it’s worth considering one of those two approaches

I agree, point taken. I have considered adding the counting to the product code, but it’s not a good idea to modify/extend the production code for testing purpose only, even just slightly. You are right (both of you) that duplicated likely will drift away in the futures, so there is no point in keeping it verbatim copy. I was just thinking that it’s easier to find the original code if it’s a verbatim copy, but that’s a minor consideration. So I give up on the verbatim copy, and I can do @l0rinc ’s suggestions (code style).

optout21 commented at 9:52 am on December 16, 2025: contributor

Regarding model/approximation (https://github.com/bitcoin/bitcoin/pull/33661#discussion_r2620420093):

If there was only the 1-bit clearing (applied for even numbers), the expected average for the skip is $log(y) - 1$. For the number of iterations, the approximation is be $(y-x)/log(y)$.
The two-bit clearing is a more complex, but it approximates to $(log(y) - 1)(log(y) - 1)/2$. Combining the two (for 50% even numbers 1-bit, for 50% odd numbers 2-bit) yields $(log(y) - 1)(log(y) + 1)/2$. For iteration count estimate: $2*(y-x)/(log(y) - 1)(log(y) + 1)$ (having the order of $(y-x)/log^2(y)$ ).
However, we are interested of the sum of skips in a particular traversal, not random. There is the effect that as bits are cleared, they tend to progress towards numbers with lesser low-order bits set, which mean higher skips. Modeling this is beyond me (and not warranted by this simple test)
I looked empirically at the number of iterations vs. the log of the distance ($y-x$). Apparently, the function $8*log(y-x)-40$ gives a good-enough fit for the upper bound (visually derived; log is 2-based). See chart:

I will use this formula as an upper bound in the test (together with the fixed overall-max of 115, also empirically measured).

optout21 force-pushed on Dec 16, 2025

optout21 commented at 10:39 am on December 16, 2025: contributor

Reworked based on the comments (@l0rinc @sipa, thx):

I switched to chain of len 1,000,000, for better fit with actual main chain.
The start I chose to be in the upper 90% of the chain (that is, over 100,000).
The target is below start, with the only constraint that the distance is at least 100 (reduced from 1,000).
The absolute value bound check is kept, 115 (higher than the previous 110, as chain length was increased).
The liner ratio to distance check (5%) was removed, it was too rough, and failed for low distances.
Added boundary check with the $8*log(y−x) - 40$ formula.
The copied traversal code was reworked for consistent style, and some added checks.

I have kept commits separately, but I plan to squash the commits before merge (the first 3 commits touching the same test method)

Ready for new review.

optout21 commented at 10:42 am on December 16, 2025: contributor

An observation regarding the original code in CBlockIndex::GetAncestor:

These two lines:

0        int heightSkip = GetSkipHeight(heightWalk);
1        int heightSkipPrev = GetSkipHeight(heightWalk - 1);

could be changed to

0            const int skip_height{walk_p->pskip->nHeight};
1            const int prev_skip_height{walk_p->pprev->pskip->nHeight};

resulting in a slight performance optimization (less invocations to GetSkipHeight). I may explore in the future the option to change if (possible, worth it).

optout21 marked this as ready for review on Dec 16, 2025

in src/test/chain_tests.cpp:112 in 157caa98da outdated

108@@ -108,11 +109,11 @@ BOOST_AUTO_TEST_CASE(skip_height_analysis)
109         // End of copied traversal algorithm
110 
111         // Asserts on the number of iterations:
112-        // always less than 100 and less than 5% of the distance
113-        BOOST_CHECK_LT(iteration_count, 100);
114-        BOOST_REQUIRE_GT(delta, 0);
115-        const auto rel_iter_count{double(iteration_count) / double(delta)};
116-        BOOST_CHECK_LT(rel_iter_count, 0.05);
117+        // Absolute value maximum (derived empirically)

l0rinc commented at 2:38 pm on February 2, 2026:

these were introduced in the previous commit, could you merge them or separate the commits differently? Or if you think there’s an advantage in introducing something that gets changed immediately, can you please provide the reasoning for it?

sipa commented at 5:27 pm on February 2, 2026:

Both of the tests here seem to fail regularly when I increase the iteration count.

optout21 commented at 7:39 am on February 3, 2026:

You’re correct, the first 3 commits should be merged, they just reflect the evolution over the review, and there’s no need for that. I will do that.

optout21 commented at 8:33 am on February 3, 2026:

Indeed. I adjust the parameter to work with huge number of iterations as well (the formula is empirically derived, as the note says). Thanks for trying out! BTW, the test is deterministic, so sporadic failing is not an issue (if it passes with a given number of iteration, it does so consistently).

optout21 commented at 11:06 am on February 3, 2026:

Done

in src/test/chain_tests.cpp:12 in ca2c9bc63e

 8@@ -9,6 +9,8 @@
 9 
10 #include <memory>
11 
12+extern int GetSkipHeight(int height);

sipa commented at 5:17 pm on February 2, 2026:

In commit “Add indirect test for skip height through ancestor search”

I think it’s cleaner to put this declaration in chain.h, even if it’s just a test-only function. As-is, if the function declaration is changed in just one of the two places, it won’t be detected until link time.

optout21 commented at 8:02 am on February 3, 2026:

Correct, that’s a cleaner, slightly-more-impact solution. I was shy to touch non-test code. I’ll do as suggested.

in src/test/chain_tests.cpp:78 in 157caa98da outdated

71@@ -71,11 +72,11 @@ BOOST_AUTO_TEST_CASE(skip_height_analysis)
72         block_index.push_back(std::move(new_index));
73     }
74 
75-    const auto min_distance{500};
76+    const auto min_distance{100};
77     FastRandomContext ctx(/*fDeterministic =*/true);
78     for (auto i{0}; i < 1'000; ++i) {
79         // pick a height in the second half of the chain

sipa commented at 5:23 pm on February 2, 2026:

In commit “Adjust test selection parameters and checks, use log(distance)-derived formula”

This comment seems outdated.

optout21 commented at 8:07 am on February 3, 2026:

In the check a “log(distance)-derived” value is used, so I think the message is correct. It’s in lines 115-116:

0        const auto bound_log_distance{std::ceil(8*std::log(delta)/std::log(2) - 40)};
1        BOOST_CHECK_LE(double(iteration_count), bound_log_distance);

The min_distance is constant, but that is not a boundary check parameter but a test setup parameter. BTW, I will merge the first 3 commits (on the same test), so commit message will be adjusted as well.

l0rinc commented at 2:23 pm on February 3, 2026:

I think he mean “second half” is an understatement for the last 90% of the chain

optout21 commented at 10:14 am on February 4, 2026:

Comment updated (to “top part”)

l0rinc commented at 2:16 pm on February 4, 2026:

Was reverted since

optout21 commented at 7:48 pm on February 4, 2026:

Fixed

optout21 force-pushed on Feb 3, 2026

optout21 commented at 11:08 am on February 3, 2026: contributor

Changes applied (in response to recent review comments):

Adjust test boundary parameters, to ensure the test works with much higher iteration counts as well (tested with 10M, the test uses 1000).
The GetSkipHeight() method is exposed in chain.h header, so the test can use it directly (not through extern, that’s prone to drifting). This is the only change to non-test code, done in separate first commit.

Changes not affecting content:

Merged the first 3 commits; they all relate to the same test.
Added reviewers as co-authors.
Rebased to current master (no strict reason, no change)

Please consider (re)review.

in src/test/chain_tests.cpp:108 in 9665afdcba

103+    // It does so by imitating the `CBlockIndex::GetAncestor()`
104+    // function to look for an ancestor, and ensuring that an arbitrary
105+    // ancestor is found in a number of steps that is much lower
106+    // than simple linear traversal.
107+    // This could be tested by a performance test as well, but here the
108+    // number of iterations can be measured, that gives a more precice

l0rinc commented at 1:38 pm on February 3, 2026:

nit, LLM is right:

Possible typos and grammar issues:

precice -> precise [misspelling makes the sentence harder to understand]

optout21 commented at 10:12 am on February 4, 2026:

Done

l0rinc commented at 2:16 pm on February 4, 2026:

This was also reverted accidentally

optout21 commented at 7:48 pm on February 4, 2026:

Fixed

in src/test/chain_tests.cpp:112 in 9665afdcba

107+    // This could be tested by a performance test as well, but here the
108+    // number of iterations can be measured, that gives a more precice
109+    // metric than CPU time.
110+
111+    // Build a chain
112+    const int chain_size{1'000'000};

l0rinc commented at 1:39 pm on February 3, 2026:

That’s a lot of allocations, I thought it would be problematic with our sanitizers, but the CI indicates it wasn’t. Running it locally from my IDE (with logs enabled) this takes 3.5 minutes to run. Command line indicates it’s only 1.5 seconds, so it should likely be fine. nit:

0    constexpr int chain_size{1'000'000};

optout21 commented at 10:12 am on February 4, 2026:

Done; switch to stack-allocation as well as this nit.

in src/test/chain_tests.cpp:117 in 9665afdcba

112+    const int chain_size{1'000'000};
113+    std::vector<std::unique_ptr<CBlockIndex>> block_index;
114+    block_index.reserve(chain_size);
115+    // Create genesis block
116+    block_index.push_back(std::make_unique<CBlockIndex>());
117+    for(auto i{1}; i < chain_size; ++i) {

l0rinc commented at 1:49 pm on February 3, 2026:

please reformat new code with clang-format

l0rinc commented at 2:12 pm on February 3, 2026:

not sure we need this unique pointer an individual heap allocations here, wouldn’t a contiguous CBlockIndex vector work better?

0    constexpr int chain_size{1'000'000};
1    std::vector<CBlockIndex> block_index(chain_size);
2    for (auto i{0}; i < chain_size; ++i) {
3        block_index[i].pprev = i == 0 ? nullptr : &block_index[i - 1];
4        block_index[i].nHeight = i;
5        block_index[i].BuildSkip();
6    }

This would also speed up the test from 1.5 seconds (from terminal, without logs) to 0.5s.

Also note that skiplist_tests.cpp seems more appropriate for this change (moved it locally and it seems to pass there).

optout21 commented at 10:14 am on February 4, 2026:

Allocation changed (to on-stack). Move is yet TODO.

optout21 commented at 10:16 am on February 4, 2026:

Done (91fce2c893cfe88180feae4d22c102813352548f)

l0rinc commented at 11:18 am on February 4, 2026:

Thanks, though it’s usually easier to review if it’s pushed atomically (all changes addressed), otherwise we’re just getting notifications and there’s no way to act on it yet.

Also note that most reviewers prefer having a clear commit history that doesn’t reflect its evolution inside the same PR - instead of adding new commits on top to modify code that we’ve just introduced, please do an interactive rebase on the existing commits.

optout21 commented at 1:16 pm on February 4, 2026:

The move is also done.

optout21 commented at 1:22 pm on February 4, 2026:

instead of adding new commits on top to modify code that we’ve just introduced, please do an interactive rebase on the existing commits.

Thanks, noted.

l0rinc changes_requested

l0rinc commented at 2:33 pm on February 3, 2026: contributor

I think we can get rid of a few unique pointers to speed up and simplify the test.

The PR description needs to be updated now since the added test doesn’t test a distribution anymore, the low-level test is commit 3, and the non-test change (making GetSkipHeight linkable) is commit 1. The last commit still claims: The method is changed to be not static and inline, so that the test can link with it. which is done in a different commit now.

And are not tested and not well documented is bit overstated since there is existing skiplist coverage (src/test/skiplist_tests.cpp) that exercises BuildSkip() + GetAncestor(). I think we should move the new changes there.

Lastly, a few of my comments were resolved but not addressed, which is a bit frustrating, could you please recheck them and either explain why my assessment was incorrect or apply it to the change?

optout21 commented at 8:07 am on February 4, 2026: contributor

I think we can get rid of a few unique pointers to speed up and simplify the test. […] And are not tested and not well documented is bit overstated since there is existing skiplist coverage (src/test/skiplist_tests.cpp) that exercises BuildSkip() + GetAncestor(). I think we should move the new changes there. […] Lastly, a few of my comments were resolved but not addressed, which is a bit frustrating, could you please recheck them and either explain why my assessment was incorrect or apply it to the change?

Thx. I will address the separate review comments. The allocation remark is very useful. Thx for mentioning skiplist_tests.cpp, it was off my radar (I initially searched on GetSkipHeight in tests). I have re-checked Resolved comments (~42 of them), and indeed, I found 4 where the closing reason was not very clear. Sorry for that. I unresolved them and I will address or explain each of them.

optout21 marked this as a draft on Feb 4, 2026

optout21 commented at 10:12 am on February 4, 2026: contributor

WIP status: suggestions addressed, TODO:

fixup commits
review/update commit messages and PR description
move tests from chain_tests.cpp to skiplist_tests.cpp

in src/test/chain_tests.cpp:49 in 2b94acee3d

45@@ -46,6 +46,7 @@ BOOST_AUTO_TEST_CASE(get_skip_height_test)
46 {
47     // Even values: the rightmost set bit is zeroed
48     // Test with various even values with at least 2 bits set
49+    // clang-format off

l0rinc commented at 11:15 am on February 4, 2026:

we can do that, but it’s obvious that this done is on purpose and we don’t have a format linter that we have to satisfy

optout21 commented at 1:19 pm on February 4, 2026:

OK, removed.

optout21 force-pushed on Feb 4, 2026

DrahtBot added the label CI failed on Feb 4, 2026

optout21 force-pushed on Feb 4, 2026

optout21 commented at 1:15 pm on February 4, 2026: contributor

Ready for review. Changes (since 9665afdcbabebcfc3db91e25f42aa1c5dcb7f34e):

Move new tests from chain_tests.cpp to skiplist_tests.cpp
Simplified stack allocation, get rid of unique_ptr’s (eb00ab8212a4a07f6c67e93834216c4d6ec183dd)
Minor comment fix, a typo, formatting (clang) (779b539930ff1bd8712c998dd2652b3331f28103)
commit messages and PR description updated

optout21 force-pushed on Feb 4, 2026

optout21 marked this as ready for review on Feb 4, 2026

in src/test/skiplist_tests.cpp:266 in 8dc3e3ede9

261+    // It does so by imitating the `CBlockIndex::GetAncestor()`
262+    // function to look for an ancestor, and ensuring that an arbitrary
263+    // ancestor is found in a number of steps that is much lower
264+    // than simple linear traversal.
265+    // This could be tested by a performance test as well, but here the
266+    // number of iterations can be measured, that gives a more precice

l0rinc commented at 2:12 pm on February 4, 2026:

0    // number of iterations can be measured, that gives a more precise

optout21 commented at 7:41 pm on February 4, 2026:

Good catch, corrected. One fix commit got lost accidentally.

in src/test/skiplist_tests.cpp:283 in 8dc3e3ede9

278+    constexpr auto min_distance{100};
279+    FastRandomContext ctx(/*fDeterministic =*/true);
280+    // Repeat iterations. The test should work with much higher numbers as well,
281+    // kept reasonable to keep execution time low
282+    for (auto i{0}; i < 1'000; ++i) {
283+        // pick a height in the second half of the chain

l0rinc commented at 2:13 pm on February 4, 2026:

seems this was reverted in-between rebases

optout21 commented at 7:42 pm on February 4, 2026:

Re-done, corrected.

DrahtBot removed the label CI failed on Feb 4, 2026

in src/test/skiplist_tests.cpp:279 in 8dc3e3ede9

274+        block_index[i].nHeight = i;
275+        block_index[i].BuildSkip();
276+    }
277+
278+    constexpr auto min_distance{100};
279+    FastRandomContext ctx(/*fDeterministic =*/true);

l0rinc commented at 2:17 pm on February 4, 2026:

to make sure the linter actually validates the name:

0    FastRandomContext ctx(/*fDeterministic=*/true);

optout21 commented at 7:40 pm on February 4, 2026:

Done.

in src/test/skiplist_tests.cpp:295 in 8dc3e3ede9

290+
291+        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
292+        const auto& start_p{block_index[start_height]};
293+        const CBlockIndex* walk_p{&start_p};
294+        auto walk_height{start_height};
295+        auto iteration_count{0};

l0rinc commented at 2:19 pm on February 4, 2026:

nit: auto numeric declarations can be tricky

optout21 commented at 7:39 pm on February 4, 2026:

Changed to int.

l0rinc approved

l0rinc commented at 2:19 pm on February 4, 2026: contributor

I left a few nits that I would prefer if we still addressed, but the current version is quite clean.

Some changes were lost during the rebase, if you could go over the comments again and make sure they are part of the final state, I’d appreciate it. I tried doing the same, left some notes.

ACK 8dc3e3ede97e3deea277f1d88583cc4be2a2b208

DrahtBot requested review from sipa on Feb 4, 2026

optout21 force-pushed on Feb 4, 2026

optout21 commented at 7:52 pm on February 4, 2026: contributor

As noticed, one fix commit was accidentally lost (779b539930ff1bd8712c998dd2652b3331f28103). Corrected, together with a few minor nits.

l0rinc commented at 8:02 pm on February 4, 2026: contributor

ACK 45a5c5f3d63da22aca7f14b9eb0203ffa827a1a7

Thanks for the quick turnaround, new tests look good to me. super-nit: to avoid duplication in PR title consider something like: test: validate skip heights...

in src/test/skiplist_tests.cpp:256 in f2418d8706

251+
252+        // Asserts on the number of iterations:
253+        // Absolute value maximum (derived empirically)
254+        BOOST_CHECK_LE(iteration_count, 130);
255+        // Dynamic bound, function of distance. Formula derived empirically.
256+        const auto bound_log_distance{std::ceil(8.3 * std::log(delta) / std::log(2) - 35)};

sipa commented at 3:33 pm on February 9, 2026:

I still hit some exceptions here when I increase the number of iterations here. Of course, the RNG is deterministic, so this won’t be hit in CI, but it begs the question what it’s actually testing if this is not an upper bound.

I believe I determined before that the worst case actually scales with log^2(N) (where N is the total chain length, not just the delta). Maybe it’s worth finding an actual upper bound?

l0rinc commented at 11:37 am on February 10, 2026:

If I understand this correctly, GetSkipHeight’s complexity should come from its bit-width, not distance, so maybe we can just estimate it via triangular numbers:

0B = ⌊log₂(N − 1)⌋ + 1

0bound(N) = B(B + 1) / 2

This isn’t as tight as the previous one, but is very simple and seems to pass.

The patch would be something like:

 0diff --git a/src/test/skiplist_tests.cpp b/src/test/skiplist_tests.cpp
 1index d7d18cf76e..fd7b409c88 100644
 2--- a/src/test/skiplist_tests.cpp
 3+++ b/src/test/skiplist_tests.cpp
 4@@ -6,7 +6,7 @@
 5 #include <test/util/random.h>
 6 #include <test/util/setup_common.h>
 7
 8-#include <cmath>
 9+#include <bit>
10 #include <vector>
11
12 #include <boost/test/unit_test.hpp>
13@@ -268,6 +268,8 @@ BOOST_AUTO_TEST_CASE(skip_height_analysis)
14
15     // Build a chain
16     constexpr int chain_size{1'000'000};
17+    constexpr int chain_bits{std::bit_width<unsigned>(chain_size - 1)}; // Bit width of the maximum height.
18+    constexpr int iteration_bound{chain_bits * (chain_bits + 1) / 2}; // Conservative O(log^2(N)) bound.
19     std::vector<CBlockIndex> block_index(chain_size);
20     for (auto i{0}; i < chain_size; ++i) {
21         block_index[i].pprev = i == 0 ? nullptr : &block_index[i - 1];
22@@ -313,12 +315,7 @@ BOOST_AUTO_TEST_CASE(skip_height_analysis)
23             ++iteration_count;
24         }
25
26-        // Asserts on the number of iterations:
27-        // Absolute value maximum (derived empirically)
28-        BOOST_CHECK_LE(iteration_count, 130);
29-        // Dynamic bound, function of distance. Formula derived empirically.
30-        const auto bound_log_distance{std::ceil(8.3 * std::log(delta) / std::log(2) - 35)};
31-        BOOST_CHECK_LE(double(iteration_count), bound_log_distance);
32+        BOOST_CHECK_LE(iteration_count, iteration_bound);
33
34         // Also check that `GetAncestor` gives the same result
35         auto p_ancestor{start_p.GetAncestor(target_height)};

optout21 commented at 12:07 pm on February 10, 2026:

@sipa, I try to get the actual upper bound by running all combinations of start and end. It takes a while because it is O(n^2) with n=1_000_000.

Generally the longer the distance to traverse, the lower the number of iterations are, that’s why I think it makes sense to take into account the distance (delta), and not only the chain size. @l0rinc , your check is nice and simple, but it effectively replaces the check against 130 with check against 210 (20*21/2), which is much less strict. I’m fine with it, but I’m curious of the search result first. BTW, the max number found in my exhaustive search is exactly 130 (so far, it is still running, but it has started with the very long subchains).

l0rinc commented at 12:19 pm on February 10, 2026:

but it effectively replaces the check against 130 with check against 210 (20*21/2), which is much less strict

yeah, but is that important?

the max number found in my exhaustive search is exactly 130

~~mine was 132~~ apparently I misremembered

optout21 commented at 7:28 am on February 13, 2026:

See #33661 (comment)

DrahtBot requested review from sipa on Feb 9, 2026

optout21 force-pushed on Feb 13, 2026

optout21 commented at 7:24 am on February 13, 2026: contributor

Thanks for the thorough checking! With adjusted boundary check I’m confident now that this holds for ALL possible subchains (with the given parameters: chain length of 1'000'000 and minimum subchain length of 100). If you prefer, I’m open to reduce the test to a much simpler and relaxed check (e.g. "< 200"), but I think the current version is more informative.

Change:

Boundary check values have been adjusted, to be more resilient for all possible subchains, not just the ones used in the test. The new formula is 8 * std::bit_width<unsigned>(delta) - 29 (a step-like function).

Additional info:

A more thorough measurement was done. From the charts below it is visible that the formula used was not sufficient, it was not an upper bound in all cases (esp. for longer subchain).
The chart confirms that the max iteration count has strong correlation with the delta.
The absolute upper limit of 130 still holds
The test passed with a test iteration count of 2'000'000'000, instead of the 1000 used in the test.
I reproduce some charts, plotting the measured maximum iteration count against the length of the subchain (‘delta’). Several percentiles are also plotted for the distribution of itercounts for a given delta: 10th, 50th, 90th. The second chart is log-scale x axis, the other two are closeups. The data is not exhaustive.
I also tried the test with non-deterministic RNG, ran 5000 times successfully.
The bit_width used is a nice shortcut for ceil(log(x)/log(2)), without using floating point arithmetics (thanks for the hint, @l0rinc )

 0diff --git a/src/test/skiplist_tests.cpp b/src/test/skiplist_tests.cpp
 1index d7d18cf76e..56a15d5203 100644
 2--- a/src/test/skiplist_tests.cpp
 3+++ b/src/test/skiplist_tests.cpp
 4@@ -6,6 +6,7 @@
 5 #include <test/util/setup_common.h>
 6 
 7+#include <bit>
 8 #include <cmath>
 9 #include <vector>
10@@ -317,8 +318,8 @@ BOOST_AUTO_TEST_CASE(skip_height_analysis)
11         BOOST_CHECK_LE(iteration_count, 130);
12         // Dynamic bound, function of distance. Formula derived empirically.
13-        const auto bound_log_distance{std::ceil(8.3 * std::log(delta) / std::log(2) - 35)};
14-        BOOST_CHECK_LE(double(iteration_count), bound_log_distance);
15+        const auto bound_log_distance{8 * std::bit_width<unsigned>(delta) - 29};
16+        BOOST_CHECK_LE(iteration_count, bound_log_distance);
17 
18         // Also check that `GetAncestor` gives the same result

DrahtBot added the label CI failed on Feb 13, 2026

optout21 marked this as a draft on Feb 14, 2026

optout21 marked this as ready for review on Feb 14, 2026

optout21 closed this on Feb 14, 2026

optout21 reopened this on Feb 14, 2026

l0rinc commented at 1:27 pm on February 14, 2026: contributor

ACK 527b186bb7908bdd19da7a8277727dd1079f7c95

Since my last ack the test formula was updated to be more robust. Very cool plots, shows precisely what’s happening.

Rebased and tested locally, the previous CI failure (vcpkg couldn’t download the Boost Bind library) was unrelated to the change - close/open dance should fix it.

nit: #include <cmath> in src/test/skiplist_tests.cpp is unused now, if you need to touch again, please consider removing it. Not worth a new push otherwise.

The absolute upper limit of 130 still holds

It seems I misremembered, sorry for the confusion.

DrahtBot removed the label CI failed on Feb 14, 2026

sipa commented at 4:03 pm on February 14, 2026: member

I wrote an exhaustive search to find all the worst possible cases:

 0#include <stdint.h>
 1#include <vector>
 2#include <iostream>
 3
 4namespace {
 5
 6int constexpr inline InvertLowestOne(int n) { return n & (n - 1); }
 7int constexpr static inline GetSkipHeight(int height) {
 8    if (height < 2) return 0;
 9    return (height & 1) ? InvertLowestOne(InvertLowestOne(height - 1)) + 1 : InvertLowestOne(height);
10}
11
12void Analyse(int target_height, std::vector<std::pair<int, int>>& max_by_delta) noexcept {
13    std::vector<int> steps_by_delta(1000000 - target_height);
14    steps_by_delta[0] = 0;
15    for (int delta = 1; delta <= 999999 - target_height; ++delta) {
16        int starting_height = target_height + delta;
17        int walk = starting_height;
18        int skip = GetSkipHeight(walk);
19        int skipprev = GetSkipHeight(walk - 1);
20        if (skip == target_height || (skip > target_height && !(skipprev < skip - 2 && skipprev >= target_height))) {
21            steps_by_delta[delta] = steps_by_delta[skip - target_height] + 1;
22        } else {
23            steps_by_delta[delta] = steps_by_delta[delta - 1] + 1;
24        }
25        if (steps_by_delta[delta] > max_by_delta[delta].first) {
26            max_by_delta[delta].first = steps_by_delta[delta];
27            max_by_delta[delta].second = target_height;
28            std::cout << std::unitbuf << "delta=" << delta << ": " << starting_height << "->" << target_height << ": " << steps_by_delta[delta] << " steps\n";
29        }
30    }
31}
32} // namespace
33
34int main() {
35    std::vector<std::pair<int, int>> max_by_delta(1000000);
36    for (int target_height = 1; target_height <= 999899; ++target_height) {
37        Analyse(target_height, max_by_delta);
38    }
39    std::cout << "\n";
40    int farmax = 0;
41    for (int delta = 1; delta <= 999999; ++delta) {
42        farmax = std::max(farmax, max_by_delta[delta].first);
43        std::cout << "delta=" << delta << ": " << (delta + max_by_delta[delta].second) << "->" << max_by_delta[delta].second << ": " << max_by_delta[delta].first << " steps (max " << farmax << ")\n";
44    }
45}

And found:

  0delta=1: 2->1: 1 steps (max 1)
  1delta=2: 4->2: 2 steps (max 2)
  2delta=3: 5->2: 3 steps (max 3)
  3delta=4: 9->5: 4 steps (max 4)
  4delta=6: 8->2: 5 steps (max 5)
  5delta=7: 9->2: 6 steps (max 6)
  6delta=9: 11->2: 7 steps (max 7)
  7delta=14: 16->2: 8 steps (max 8)
  8delta=15: 17->2: 9 steps (max 9)
  9delta=17: 19->2: 10 steps (max 10)
 10delta=25: 35->10: 11 steps (max 11)
 11delta=30: 32->2: 12 steps (max 12)
 12delta=31: 33->2: 13 steps (max 13)
 13delta=33: 35->2: 14 steps (max 14)
 14delta=42: 44->2: 15 steps (max 15)
 15delta=58: 60->2: 16 steps (max 16)
 16delta=63: 65->2: 17 steps (max 17)
 17delta=65: 67->2: 18 steps (max 18)
 18delta=74: 76->2: 19 steps (max 19)
 19delta=90: 92->2: 20 steps (max 20)
 20delta=122: 156->34: 21 steps (max 21)
 21delta=127: 129->2: 22 steps (max 22)
 22delta=129: 131->2: 23 steps (max 23)
 23delta=138: 140->2: 24 steps (max 24)
 24delta=154: 156->2: 25 steps (max 25)
 25delta=250: 252->2: 26 steps (max 26)
 26delta=255: 257->2: 27 steps (max 27)
 27delta=257: 259->2: 28 steps (max 28)
 28delta=266: 268->2: 29 steps (max 29)
 29delta=282: 284->2: 30 steps (max 30)
 30delta=378: 380->2: 31 steps (max 31)
 31delta=506: 540->34: 32 steps (max 32)
 32delta=511: 513->2: 33 steps (max 33)
 33delta=513: 515->2: 34 steps (max 34)
 34delta=522: 524->2: 35 steps (max 35)
 35delta=538: 540->2: 36 steps (max 36)
 36delta=634: 636->2: 37 steps (max 37)
 37delta=1014: 1016->2: 38 steps (max 38)
 38delta=1023: 1025->2: 39 steps (max 39)
 39delta=1025: 1027->2: 40 steps (max 40)
 40delta=1034: 1036->2: 41 steps (max 41)
 41delta=1050: 1052->2: 42 steps (max 42)
 42delta=1146: 1148->2: 43 steps (max 43)
 43delta=1526: 1528->2: 44 steps (max 44)
 44delta=2038: 2040->2: 45 steps (max 45)
 45delta=2047: 2049->2: 46 steps (max 46)
 46delta=2049: 2051->2: 47 steps (max 47)
 47delta=2058: 2060->2: 48 steps (max 48)
 48delta=2074: 2076->2: 49 steps (max 49)
 49delta=2170: 2172->2: 50 steps (max 50)
 50delta=2550: 2552->2: 51 steps (max 51)
 51delta=3062: 3064->2: 52 steps (max 52)
 52delta=4086: 4088->2: 53 steps (max 53)
 53delta=4097: 4099->2: 54 steps (max 54)
 54delta=4106: 4108->2: 55 steps (max 55)
 55delta=4122: 4124->2: 56 steps (max 56)
 56delta=4218: 4220->2: 57 steps (max 57)
 57delta=4598: 4600->2: 58 steps (max 58)
 58delta=5110: 5112->2: 59 steps (max 59)
 59delta=6134: 6136->2: 60 steps (max 60)
 60delta=8182: 8184->2: 61 steps (max 61)
 61delta=8193: 8195->2: 62 steps (max 62)
 62delta=8202: 8204->2: 63 steps (max 63)
 63delta=8218: 8220->2: 64 steps (max 64)
 64delta=8314: 8316->2: 65 steps (max 65)
 65delta=8694: 8696->2: 66 steps (max 66)
 66delta=9206: 9208->2: 67 steps (max 67)
 67delta=10230: 10232->2: 68 steps (max 68)
 68delta=12278: 12280->2: 69 steps (max 69)
 69delta=16374: 16376->2: 70 steps (max 70)
 70delta=16394: 16396->2: 71 steps (max 71)
 71delta=16410: 16412->2: 72 steps (max 72)
 72delta=16506: 16508->2: 73 steps (max 73)
 73delta=16886: 16888->2: 74 steps (max 74)
 74delta=17398: 17400->2: 75 steps (max 75)
 75delta=18422: 18424->2: 76 steps (max 76)
 76delta=20470: 20472->2: 77 steps (max 77)
 77delta=24566: 24568->2: 78 steps (max 78)
 78delta=32758: 32760->2: 79 steps (max 79)
 79delta=32778: 32780->2: 80 steps (max 80)
 80delta=32794: 32796->2: 81 steps (max 81)
 81delta=32890: 32892->2: 82 steps (max 82)
 82delta=33270: 33272->2: 83 steps (max 83)
 83delta=33782: 33784->2: 84 steps (max 84)
 84delta=34806: 34808->2: 85 steps (max 85)
 85delta=36854: 36856->2: 86 steps (max 86)
 86delta=40950: 40952->2: 87 steps (max 87)
 87delta=49142: 49144->2: 88 steps (max 88)
 88delta=65526: 65528->2: 89 steps (max 89)
 89delta=65562: 65564->2: 90 steps (max 90)
 90delta=65658: 65660->2: 91 steps (max 91)
 91delta=66038: 66040->2: 92 steps (max 92)
 92delta=66550: 66552->2: 93 steps (max 93)
 93delta=67574: 67576->2: 94 steps (max 94)
 94delta=69622: 69624->2: 95 steps (max 95)
 95delta=73718: 73720->2: 96 steps (max 96)
 96delta=81910: 81912->2: 97 steps (max 97)
 97delta=98294: 98296->2: 98 steps (max 98)
 98delta=131062: 131064->2: 99 steps (max 99)
 99delta=131098: 131100->2: 100 steps (max 100)
100delta=131194: 131196->2: 101 steps (max 101)
101delta=131574: 131576->2: 102 steps (max 102)
102delta=132086: 132088->2: 103 steps (max 103)
103delta=133110: 133112->2: 104 steps (max 104)
104delta=135158: 135160->2: 105 steps (max 105)
105delta=139254: 139256->2: 106 steps (max 106)
106delta=147446: 147448->2: 107 steps (max 107)
107delta=163830: 163832->2: 108 steps (max 108)
108delta=196598: 196600->2: 109 steps (max 109)
109delta=262134: 262136->2: 110 steps (max 110)
110delta=262266: 262268->2: 111 steps (max 111)
111delta=262646: 262648->2: 112 steps (max 112)
112delta=263158: 263160->2: 113 steps (max 113)
113delta=264182: 264184->2: 114 steps (max 114)
114delta=266230: 266232->2: 115 steps (max 115)
115delta=270326: 270328->2: 116 steps (max 116)
116delta=278518: 278520->2: 117 steps (max 117)
117delta=294902: 294904->2: 118 steps (max 118)
118delta=327670: 327672->2: 119 steps (max 119)
119delta=393206: 393208->2: 120 steps (max 120)
120delta=524278: 524280->2: 121 steps (max 121)
121delta=524410: 524412->2: 122 steps (max 122)
122delta=524790: 524792->2: 123 steps (max 123)
123delta=525302: 525304->2: 124 steps (max 124)
124delta=526326: 526328->2: 125 steps (max 125)
125delta=528374: 528376->2: 126 steps (max 126)
126delta=532470: 532472->2: 127 steps (max 127)
127delta=540662: 540664->2: 128 steps (max 128)
128delta=557046: 557048->2: 129 steps (max 129)
129delta=589814: 589816->2: 130 steps (max 130)
130delta=655350: 655352->2: 131 steps (max 131)
131delta=786422: 786424->2: 132 steps (max 132)

optout21 marked this as a draft on Feb 15, 2026

Add indirect test for skip height through ancestor search

Add a unit test that indirectly tests the `GetSkipHeight()` function,
by doing checks on the performance of ancestor search that relies on
skip heights. Random searches are performed on a long subchain,
and the effective number of steps are counted, and asserted that are
much less than the distance to the ancestor. For boundary value a
log(distance)-based formula is used (the formula is of the form
$A*log(y−x) - B$, with empirically established A and B values).

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
Co-authored-by: sipa <pieter@wuille.net>

ec914417c6

Expose GetSkipHeight() in chain.h

In preparation for its test, the `GetSkipHeight()` method is exposed
in `chain.h`. Thus it can be used directly in the test
(not through `extern`, that's prone to drifting).

Co-authored-by: sipa <pieter@wuille.net>

b9d7fee3ae

Add low-level tests for GetSkipHeight()

This previously internal method in chain.cpp has low and only
indirect test coverage, and it is not documented well.
A straightforward, simple, direct unit test is added.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>

0f6b58ba9a

optout21 force-pushed on Feb 15, 2026

optout21 commented at 7:53 am on February 15, 2026: contributor

I was wrong, I stand corrected. @sipa, elegant way to do the exhaustive search in less than O(N^2).

Adjusted the parameters: 130 -> 132, -29 -> -28. Verified with all new max values found (in the above code). The test execution with the given parameters is not affected, but the checks are more correct this way.

Note: The max iteration count charts show the fractal nature of the structure. Interestingly, plotting the max iteration count against log2(delta) shows that a quadratic equation is a better fit than linear (0.25*log^2 + 1.5*log + 2 vs 8*log - 28).

 0diff --git a/src/test/skiplist_tests.cpp b/src/test/skiplist_tests.cpp
 1index 56a15d5203..409119a914 100644
 2--- a/src/test/skiplist_tests.cpp
 3+++ b/src/test/skiplist_tests.cpp
 4@@ -316,9 +316,9 @@ BOOST_AUTO_TEST_CASE(skip_height_analysis)
 5 
 6         // Asserts on the number of iterations:
 7         // Absolute value maximum (derived empirically)
 8-        BOOST_CHECK_LE(iteration_count, 130);
 9+        BOOST_CHECK_LE(iteration_count, 132);
10         // Dynamic bound, function of distance. Formula derived empirically.
11-        const auto bound_log_distance{8 * std::bit_width<unsigned>(delta) - 29};
12+        const auto bound_log_distance{8 * std::bit_width<unsigned>(delta) - 28};
13         BOOST_CHECK_LE(iteration_count, bound_log_distance);
14 
15         // Also check that `GetAncestor` gives the same result

 0log(delta)  maxiter
 17	22
 28	27
 39	33
 410	39
 511	46
 612	53
 713	61
 814	70
 915	79
1016	89
1117	99
1218	110
1319	121
1420	132

optout21 marked this as ready for review on Feb 15, 2026

l0rinc commented at 10:28 am on February 15, 2026: contributor

ACK 0f6b58ba9a2989feac6292e74e2d721e35492051

Since last ack the formula was adjusted to be more precise. This documents the change better, it provides more accurate documentation. Thanks for persevering!

optout21 commented at 3:28 pm on February 15, 2026: contributor

Note: The subchain 1'048'568 → 2 takes 133 iterations (1048568 = 2^20-8). This is more than 132, but this subchain is outside the 1'000'000-long chain considered. I believe that this 133 is the largest value of any subchain with 20-bit length.

sedited commented at 9:43 am on March 6, 2026: contributor

Concept ACK

The first commit seems like a good thing to exercise, but I’m not sure about the latter two. Do we really need to validate this internal behaviour?

l0rinc commented at 10:29 am on March 6, 2026: contributor

Do we really need to validate this internal behaviour?

I see them as living documentation, specifying instances of the exact behavior. I think we should keep them.

sedited commented at 11:33 am on March 6, 2026: contributor

I see them as living documentation, specifying instances of the exact behavior. I think we should keep them.

I’m a bit worried about the scope creep this might invite towards testing similar data structures mid state. To improve the documenting aspect of it, how about interleaving the binary values with the decimals in the 0-10 value range?

optout21 commented at 12:46 pm on March 6, 2026: contributor

I’m not sure about the latter two. Do we really need to validate this internal behaviour?

I’m a bit indifferent about the low-level tests. The cost is minimal (in test code amount, maintenance overhead and and execution time), the change to expose a method is also low-impact. However, the are really low-level (one-liner tests for a one-liner method), and they don’t add much test coverage, other then explicitly showing how bits are changed. Possibly a few examples as comments above the implementation would be sufficient.

how about interleaving the binary values with the decimals in the 0-10 value range?

Sorry, I’m not sure I understand this proposal or how it would simplify the tests.

test: Add test on skip heights in CBlockIndex #33661

Code Coverage & Benchmarks

Reviews

Conflicts