test: Add test on skip heights in CBlockIndex #33661

pull optout21 wants to merge 4 commits into bitcoin:master from optout21:skip-height-test2 changing 2 files +136 −1
  1. optout21 commented at 12:08 pm on October 20, 2025: contributor

    The skip height values, used by CBlockIndex::BuildSkip() and GetSkipHeight are not tested and not well documented. (noticed while reviewing #33515, recently merged).

    The motivation is to document the skip value computation through a test.

    The first commit adds a test for the properties of the distribution of skip values, namely that they have non-uniform distribution: most values are small but there are some large ones as well.

    The second commit adds low-level tests to the GetSkipHeight() internal method. The tests are low level, thus of limited value, and the commit also touches the non-test code minimally (to make the method accessible). Therefore this commit has lower priority.

  2. DrahtBot added the label Tests on Oct 20, 2025
  3. DrahtBot commented at 12:08 pm on October 20, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33661.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK sipa
    Stale ACK l0rinc

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • precice -> precise [spelling error: “precice” should be “precise”]

    2025-12-16

  4. optout21 force-pushed on Oct 20, 2025
  5. l0rinc commented at 8:48 pm on October 20, 2025: contributor
    Concept ACK, thanks for helping with the coverage and documentation. For reference https://maflcko.github.io/b-c-cov/test_bitcoin.coverage/src/chain.cpp.gcov.html is pretty well covered otherwise, but these tests can help with the specifics.
  6. optout21 commented at 9:04 am on October 21, 2025: contributor

    For reference https://maflcko.github.io/b-c-cov/test_bitcoin.coverage/src/chain.cpp.gcov.html is pretty well covered otherwise, but these tests can help with the specifics.

    Thanks for the link! Indirectly it is tested extensively, as the skip mechanism is used while searching in the chain. However, there are no tests focusing on this specifically, and the why and how on the exact skip generation algorithm is not documented.

  7. optout21 marked this as ready for review on Oct 30, 2025
  8. in src/chain.cpp:79 in 6a22013f5a outdated
    75@@ -76,7 +76,7 @@ CBlockIndex* CChain::FindEarliestAtLeast(int64_t nTime, int height) const
    76 int static inline InvertLowestOne(int n) { return n & (n - 1); }
    77 
    78 /** Compute what height to jump back to with the CBlockIndex::pskip pointer. */
    79-int static inline GetSkipHeight(int height) {
    80+int GetSkipHeight(int height) {
    


    l0rinc commented at 8:50 am on November 1, 2025:
    👍 for the static removal, but why remove the inline keyword?

    optout21 commented at 2:37 pm on November 3, 2025:
    If it’s inlined, it can’t be used as extern from the test :(

    l0rinc commented at 4:02 pm on November 3, 2025:
    how so, what’s the failure you’re seeing? It was working for me locally

    optout21 commented at 4:22 pm on November 3, 2025:

    If GetSkipHeight is inline, I get the error:

    0[ 75%] Built target bitcoin-node
    1/usr/bin/ld: CMakeFiles/test_bitcoin.dir/chain_tests.cpp.o: in function `chain_tests::get_skip_height_test::test_method()':
    2/home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:50:(.text+0x68b): undefined reference to `GetSkipHeight(int)'
    3/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:51:(.text+0x7cf): undefined reference to `GetSkipHeight(int)'
    4/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:52:(.text+0x90c): undefined reference to `GetSkipHeight(int)'
    5/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:53:(.text+0xa49): undefined reference to `GetSkipHeight(int)'
    6/usr/bin/ld: /home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:54:(.text+0xb86): undefined reference to `GetSkipHeight(int)'
    7/usr/bin/ld: CMakeFiles/test_bitcoin.dir/chain_tests.cpp.o:/home/bitcoin/bitcoin-core/build/src/test/./test/chain_tests.cpp:55: more undefined references to `GetSkipHeight(int)' follow
    8collect2: error: ld returned 1 exit status
    9gmake[2]: *** [src/test/CMakeFiles/test_bitcoin.dir/build.make:2463: bin/test_bitcoin] Error 1
    

    Inlined method is not included in the obj file as a method. The solution would be to expose it through the header file, but I did not want to make that change just for the sake of the test.


    l0rinc commented at 9:53 pm on November 3, 2025:
    interesting, it does work for me, but inline might indeed be problematic in this case. Removing it should be fine, please resolve this comment.

    optout21 commented at 9:39 am on November 4, 2025:
    Indeed. Inline is only a hint, not crucial for modern compliers.
  9. in src/test/chain_tests.cpp:51 in cdffd675d5 outdated
    46+    // The test collects and analyses the distribution of the
    47+    // skip difference numbers, and checks some properties:
    48+    // non-uniform distribution with most values small but
    49+    // with some large values as well.
    50+
    51+    FastRandomContext ctx;
    


    l0rinc commented at 9:00 am on November 1, 2025:
    This seems unused

    optout21 commented at 2:58 pm on November 3, 2025:
    Removed
  10. in src/chain.cpp:75 in 6a22013f5a outdated
    75@@ -76,7 +76,7 @@ CBlockIndex* CChain::FindEarliestAtLeast(int64_t nTime, int height) const
    76 int static inline InvertLowestOne(int n) { return n & (n - 1); }
    77 
    78 /** Compute what height to jump back to with the CBlockIndex::pskip pointer. */
    79-int static inline GetSkipHeight(int height) {
    80+int GetSkipHeight(int height) {
    81     if (height < 2)
    


    l0rinc commented at 11:34 am on November 3, 2025:
    nit: can this be if (height <= 2) as well? It’s likely outside the scope of the change though

    optout21 commented at 1:01 pm on November 3, 2025:
    Indeed, the input 2 also gives 0 as result. The intent may have been that got indices 0 and 1 it makes no sense to jump, for the rest follow the computation (but 2 accidentally also gives 0). In any case, I don’t want to change product behavior to the slightest in this PR.
  11. in src/test/chain_tests.cpp:54 in 6a22013f5a
    49+    BOOST_CHECK(GetSkipHeight(0b0001011101101001) == 0b0001011101000001);
    50+    BOOST_CHECK(GetSkipHeight(0b0110101101011000) == 0b0110101101010000);
    51+    BOOST_CHECK(GetSkipHeight(0b0110101101011001) == 0b0110101101000001);
    52+    BOOST_CHECK(GetSkipHeight(0) == 0);
    53+    BOOST_CHECK(GetSkipHeight(1) == 0);
    54+    BOOST_CHECK(GetSkipHeight(2) == 0);
    


    l0rinc commented at 11:38 am on November 3, 2025:

    we could add comments explaining the magic values: # height <= 2 # powers of 2 # odd/even heights # absurdly big values or whatever

    Otherwise it’s not obvious why 4 is 0 and 5 is 1 and there is no 6 but 7 is 1 again…

    I need to understand why we chose these numbers, especially since the PR is meant to document the change


    optout21 commented at 2:58 pm on November 3, 2025:
    I’ve adjusted the numbers used and added comments
  12. in src/test/chain_tests.cpp:48 in 6a22013f5a
    42@@ -41,6 +43,78 @@ const CBlockIndex* NaiveLastCommonAncestor(const CBlockIndex* a, const CBlockInd
    43 
    44 } // namespace
    45 
    46+BOOST_AUTO_TEST_CASE(get_skip_height_test)
    47+{
    48+    BOOST_CHECK(GetSkipHeight(0b0001011101101000) == 0b0001011101100000);
    


    l0rinc commented at 11:40 am on November 3, 2025:

    I like the binary representation, but maybe we could use BOOST_CHECK_EQUAL insead (helps with the reason when they fail) and maybe we could align these to simplify finding the differences in their endings:

    0    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101000),
    1                                    0b0001011101100000);
    2    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101001),
    3                                    0b0001011101000001);
    4    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011000),
    5                                    0b0110101101010000);
    6    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011001),
    7                                    0b0110101101000001);
    

    And a short comment could help in drawing our attention to the ending


    optout21 commented at 2:59 pm on November 3, 2025:
    Changed to BOOST_CHECK_EQUAL, added comments

    l0rinc commented at 9:36 pm on November 3, 2025:
    Formatting them to be on top of each other as the above example shows allows easier bit comparison, otherwise the binary representation doesn’t really help

    optout21 commented at 9:40 am on November 4, 2025:
    I did that for the longer values, for shorter values with only a few ones is not really need.

    l0rinc commented at 2:13 pm on November 4, 2025:
    Can we do it for the shorter ones as well? it’s easier to compare them and more consistent with the rest
  13. in src/test/chain_tests.cpp:105 in 6a22013f5a outdated
    66+
    67+BOOST_AUTO_TEST_CASE(skip_height_properties_test)
    68+{
    69+    // The test collects and analyses the distribution of the
    70+    // skip difference numbers, and checks some properties:
    71+    // non-uniform distribution with most values small but
    


    l0rinc commented at 11:41 am on November 3, 2025:

    it seems to test more than non-uniformity. My understanding is that you want to show that most jumps are small, right? If so, maybe we don’t have to be that specific, can we just check that the median jump is tiny?

     0BOOST_AUTO_TEST_CASE(skip_height_properties_test)
     1{
     2    constexpr int n{30'000};
     3
     4    std::vector<std::unique_ptr<CBlockIndex>> block_index;
     5    block_index.push_back(std::make_unique<CBlockIndex>());
     6    for(int i{0}; i < n; ++i) {
     7        auto new_index{std::make_unique<CBlockIndex>()};
     8        new_index->pprev = block_index.back().get();
     9        new_index->nHeight = new_index->pprev->nHeight + 1;
    10        new_index->BuildSkip();
    11        block_index.push_back(std::move(new_index));
    12    }
    13
    14    std::vector<int> skip_diffs;
    15    skip_diffs.reserve(n);
    16    for(int i{1}; i <= n; ++i) {
    17        skip_diffs.push_back(i - block_index[i]->pskip->nHeight);
    18    }
    19
    20    std::ranges::sort(skip_diffs);
    21    const int median{skip_diffs[n / 2]};
    22    
    23    BOOST_CHECK_LT(median, 10); // Most jumps are small
    24}
    

    optout21 commented at 2:44 pm on November 3, 2025:
    I check the avg, which is computationally easier to obtain. It would make sense to check the median as well, and maybe compare the median with the average. However, I prefer to avoid the O(N log N) complexity of the sorting.

    l0rinc commented at 4:05 pm on November 3, 2025:
    median is less sensitive to outliers - and nlogn is fine in tests

    optout21 commented at 9:41 am on November 4, 2025:
    Added median check as well. Much lower than the avg!

    l0rinc commented at 2:15 pm on November 4, 2025:
    The test became more complicated than it was before (which was my main critique of it, seems arbitrary). Could we simplify it? I think we’re testing too many things, a single median check should likely suffice in my opinion.
  14. in src/test/chain_tests.cpp:106 in 6a22013f5a outdated
    88+        new_index->nHeight = new_index->pprev->nHeight + 1;
    89+        new_index->BuildSkip();
    90+        block_index.push_back(std::move(new_index));
    91+    }
    92+
    93+    // Analyze the diff (in height) of each element and its pskip
    


    l0rinc commented at 11:43 am on November 3, 2025:
    I need more context, how did we get to these numbers exactly? Do we really need to be this precise? Wouldn’t it suffice to check that their median is low enough?

    optout21 commented at 9:41 am on November 4, 2025:
    Added explanatory comments
  15. in src/test/chain_tests.cpp:98 in 6a22013f5a
    93+    // Analyze the diff (in height) of each element and its pskip
    94+    int count_diff_smaller_16 = 0;
    95+    int max_diff = 0;
    96+    int total_diff = 0;
    97+    for(int i = 1; i < n; ++i) {
    98+        BOOST_CHECK(block_index[i]->nHeight == i);
    


    l0rinc commented at 11:52 am on November 3, 2025:
    this sounds like a BOOST_REQUIRE_EQUAL instead
  16. in src/test/chain_tests.cpp:80 in 6a22013f5a
    75+    std::vector<std::unique_ptr<CBlockIndex>> block_index;
    76+
    77+    // Create genesis block
    78+    auto genesis = std::make_unique<CBlockIndex>();
    79+    genesis->nHeight = 0;
    80+    block_index.push_back(std::move(genesis));
    


    l0rinc commented at 11:55 am on November 3, 2025:

    maybe this would also work

    0    std::vector<std::unique_ptr<CBlockIndex>> block_index;
    1    block_index.push_back(std::make_unique<CBlockIndex>());
    

    optout21 commented at 3:00 pm on November 3, 2025:
    Done

    l0rinc commented at 9:49 pm on November 3, 2025:

    I see you’ve moved it as well, clang complains with:

    Clangd: Moving a temporary object prevents copy elision


    optout21 commented at 9:41 am on November 4, 2025:
    Oops, fixed!
  17. in src/test/chain_tests.cpp:83 in 6a22013f5a
    78+    auto genesis = std::make_unique<CBlockIndex>();
    79+    genesis->nHeight = 0;
    80+    block_index.push_back(std::move(genesis));
    81+
    82+    // Build a chain
    83+    const auto n{30000};
    


    l0rinc commented at 12:02 pm on November 3, 2025:

    was wondering how slow this is, but seems acceptable:

    0cmake -B build  -DCMAKE_BUILD_TYPE=Debug && cmake --build build && time ./build/bin/test_bitcoin --run_test=chain_tests/skip_height_properties_test
    1...
    2./build/bin/test_bitcoin --run_test=chain_tests/skip_height_properties_test  0.32s user 0.02s system 27% cpu 1.269 total 
    

    optout21 commented at 2:45 pm on November 3, 2025:
    Yes, nontheless I reduce the count from 30000 to 10000, that’s also enough for the test.

    l0rinc commented at 9:50 pm on November 3, 2025:
    nit: consider formatting it as 10'000 for readability

    optout21 commented at 9:41 am on November 4, 2025:
    Done
  18. in src/test/chain_tests.cpp:110 in 6a22013f5a outdated
    92+
    93+    // Analyze the diff (in height) of each element and its pskip
    94+    int count_diff_smaller_16 = 0;
    95+    int max_diff = 0;
    96+    int total_diff = 0;
    97+    for(int i = 1; i < n; ++i) {
    


    l0rinc commented at 12:07 pm on November 3, 2025:

    how come we’re iterating one less than before?

    0    for(int i = 1; i <= n; ++i) {
    

    optout21 commented at 2:34 pm on November 3, 2025:
    It doesn’t make sense to check the skip for block index 0, that’s why 0 is excluded.

    l0rinc commented at 4:01 pm on November 3, 2025:
    yeah, but why is n skipped (see my suggested code)

    optout21 commented at 4:25 pm on November 3, 2025:
    Oh, I see. There are in fact n+1 elements. Will change.

    optout21 commented at 9:42 am on November 4, 2025:
    All cycles go from 1 to n-1; 0 is the genesis.
  19. in src/chain.cpp:81 in 6a22013f5a


    l0rinc commented at 12:09 pm on November 3, 2025:
    Can we add a fuzz test that asserts for all h > 0: GetSkipHeight(h) < h and popcount(result) < popcount(h) (and just skip 0 since that’s already unit tested)?

    optout21 commented at 9:43 am on November 4, 2025:
    A fuzz test for this functionality (essentially a one-liner) seems a bit too much.

    l0rinc commented at 2:18 pm on November 4, 2025:
    k, please resolve the comment
  20. l0rinc changes_requested
  21. l0rinc commented at 12:33 pm on November 3, 2025: contributor
    Concept ACK, during review I also understood this behavior better :) I have a few suggestions for simplification and am throwing you into the deep end by asking for a property based (fuzz) test as well to check certain invariants. Not a must-have, but seems simple enough. I also suggest simplifying the distribution test to just check for the median instead.
  22. optout21 force-pushed on Nov 3, 2025
  23. optout21 commented at 3:02 pm on November 3, 2025: contributor

    Review comments addressed, mostly minor:

    • Reduce the number of iterations to 10000 (from 30000), for faster test execution.
    • Add comments on choice of test inputs.
    • Use BOOST_REQUIRE_EQUAL or BOOST_CHECK_GT where possible.
  24. optout21 commented at 3:03 pm on November 3, 2025: contributor
    @sipa, as this is triggered by #33515 of yours, I would kindly ask for a review.
  25. sipa requested review from sipa on Nov 3, 2025
  26. optout21 force-pushed on Nov 4, 2025
  27. optout21 commented at 9:38 am on November 4, 2025: contributor

    Changes applied:

    • Fix lint issue
    • Added check on the median as well
    • some minor formatting
  28. optout21 force-pushed on Nov 4, 2025
  29. optout21 commented at 12:26 pm on November 4, 2025: contributor

    CI failed, limits of the 3rd for cycle were off; fixed

    0     int count_diff_smaller_16 = 0;
    1     int max_diff = 0;
    2     int total_diff = 0;
    3-    for(int i{1}; i < n; ++i) {
    4+    for(size_t i{0}; i < skip_diffs.size(); ++i) {
    5         auto diff = skip_diffs[i];
    6         if (diff < 16) {
    7             ++count_diff_smaller_16;
    
  30. in src/test/chain_tests.cpp:49 in 6adafaa721
    42@@ -41,6 +43,113 @@ const CBlockIndex* NaiveLastCommonAncestor(const CBlockIndex* a, const CBlockInd
    43 
    44 } // namespace
    45 
    46+BOOST_AUTO_TEST_CASE(get_skip_height_test)
    47+{
    48+    // Even values: the rightmost bit is zeroed
    49+    // Test with variouse even values with at least 2 bits set
    


    DrahtBot commented at 2:06 pm on November 4, 2025:
    variouse -> various [spelling error in comment making the word invalid]
    
  31. in src/test/chain_tests.cpp:73 in 6adafaa721
    68+                                    0b0001011101000001);
    69+    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011000),
    70+                                    0b0110101101010000);
    71+    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011001),
    72+                                    0b0110101101000001);
    73+    // All values 0--10
    


    l0rinc commented at 2:15 pm on November 4, 2025:
    what does -- mean here?

    optout21 commented at 2:02 pm on November 6, 2025:
    “0–10” meant “from 0 to 10 inclusive”. Changed to “0-10” to avoid parsing issues.
  32. in src/test/chain_tests.cpp:143 in 6adafaa721
    138+    // but general enough to be resilient to the exact test
    139+    // details and algorithm
    140+
    141+    // The median skip diff is lower than the average
    142+    BOOST_CHECK_LT(median, avg);
    143+    // Most skip diffs are small (more than 67% are below 16):
    


    l0rinc commented at 2:17 pm on November 4, 2025:
    I don’t understand where 16 and the 67% come from. Can we just keep the very last check of the median instead?

    optout21 commented at 2:03 pm on November 6, 2025:
    The checks have been simplified, usage of “magic numbers” reduced.
  33. optout21 force-pushed on Nov 6, 2025
  34. optout21 commented at 2:01 pm on November 6, 2025: contributor

    Pushed minor fixes after comments:

    • simplify test assertions
    • simplify computing of total/avg and max (no for loop)
    • formatting
  35. in src/test/chain_tests.cpp:135 in ea998e18f6
    130+        skip_diffs.push_back(i - skip_height);
    131+    }
    132+
    133+    // Analyze the diffs
    134+    // Compute the average
    135+    const auto avg = double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size());
    


    l0rinc commented at 3:09 pm on November 6, 2025:

    nit: below we’re using brace init + dividing a double by a size_t is also double-division (untested):

    0    const auto avg{double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / skip_diffs.size()};
    
  36. in src/test/chain_tests.cpp:142 in ea998e18f6
    137+    std::ranges::sort(skip_diffs);
    138+    const auto median{skip_diffs[n / 2]};
    139+    // Take the maximum
    140+    const auto max{skip_diffs[skip_diffs.size() - 1]};
    141+
    142+    // Assert some parameters of the distribution
    


    l0rinc commented at 3:10 pm on November 6, 2025:
    nit: redundant comment (there a few more)
  37. in src/test/chain_tests.cpp:139 in ea998e18f6
    134+    // Compute the average
    135+    const auto avg = double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size());
    136+    // Compute the median
    137+    std::ranges::sort(skip_diffs);
    138+    const auto median{skip_diffs[n / 2]};
    139+    // Take the maximum
    


    l0rinc commented at 3:10 pm on November 6, 2025:
    nit: redudnant comment, the code already expresses this clearly
  38. in src/test/chain_tests.cpp:122 in ea998e18f6
    112+    block_index.push_back(std::make_unique<CBlockIndex>());
    113+    // Build a chain
    114+    for(int i{1}; i < n; ++i) {
    115+        auto new_index = std::make_unique<CBlockIndex>();
    116+        new_index->pprev = block_index.back().get();
    117+        BOOST_CHECK(new_index->pprev);
    


    l0rinc commented at 3:12 pm on November 6, 2025:
    nit: should likely be a BOOST_REQUIRE since this is still the setup phase, not yet the validation stage
  39. l0rinc approved
  40. l0rinc commented at 3:14 pm on November 6, 2025: contributor

    ACK ea998e18f6d64f485d4e2d85a1028474fb35120a

    Thanks for taking my suggestion, the code is very clean and easy to understand now. I left some nits I still saw, but they’re not blocking.

  41. optout21 force-pushed on Nov 14, 2025
  42. optout21 commented at 1:24 pm on November 14, 2025: contributor

    Minor nits applied

     0@@ -114,7 +114,7 @@ BOOST_AUTO_TEST_CASE(skip_height_properties_test)
     1     for(int i{1}; i < n; ++i) {
     2         auto new_index = std::make_unique<CBlockIndex>();
     3         new_index->pprev = block_index.back().get();
     4-        BOOST_CHECK(new_index->pprev);
     5+        BOOST_REQUIRE(new_index->pprev);
     6         new_index->nHeight = new_index->pprev->nHeight + 1;
     7         new_index->BuildSkip();
     8         block_index.push_back(std::move(new_index));
     9@@ -131,16 +131,12 @@ BOOST_AUTO_TEST_CASE(skip_height_properties_test)
    10     }
    11 
    12     // Analyze the diffs
    13-    // Compute the average
    14-    const auto avg = double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size());
    15-    // Compute the median
    16+    // Compute the average, median, maximum
    17+    const auto avg{double(std::accumulate(skip_diffs.begin(), skip_diffs.end(), 0)) / double(skip_diffs.size())};
    18     std::ranges::sort(skip_diffs);
    19     const auto median{skip_diffs[n / 2]};
    20-    // Take the maximum
    21     const auto max{skip_diffs[skip_diffs.size() - 1]};
    22 
    23-    // Assert some parameters of the distribution
    24-
    25     // The median skip diff is lower than the average (non-uniform distribution)
    26     BOOST_CHECK_LT(median, avg);
    27     // There are some large skip diffs (more than half of n):
    
  43. optout21 requested review from l0rinc on Nov 14, 2025
  44. optout21 force-pushed on Nov 15, 2025
  45. DrahtBot added the label CI failed on Dec 8, 2025
  46. in src/chain.cpp:79 in e2e25f4947
    75@@ -70,7 +76,7 @@ CBlockIndex* CChain::FindEarliestAtLeast(int64_t nTime, int height) const
    76 int static inline InvertLowestOne(int n) { return n & (n - 1); }
    77 
    78 /** Compute what height to jump back to with the CBlockIndex::pskip pointer. */
    79-int static inline GetSkipHeight(int height) {
    80+int GetSkipHeight(int height) {
    


    maflcko commented at 12:46 pm on December 9, 2025:

    nit: clang-format this line, while touching. (leaving this nit, because the CI has to be re-run anyway)

    0+int GetSkipHeight(int height)
    1+{
    

    optout21 commented at 7:09 am on December 11, 2025:
    I kept the change minimal, but since the line is touched, I’ve adjusted the curly brace placement to be consistent, as suggested.
  47. optout21 force-pushed on Dec 11, 2025
  48. optout21 commented at 7:06 am on December 11, 2025: contributor

    Very minor touch: moving of a curly nit-fix, rebase (noop); mostly to get CI to run again (had some apparently temporary issue).

    (e2e25f4947e65869b014483be70b764a62bd0df7 -> 1a0423cf25bb60f98a29e06c39b7db2998860cd9)

  49. DrahtBot removed the label CI failed on Dec 11, 2025
  50. l0rinc commented at 6:39 pm on December 11, 2025: contributor
    reACK 1a0423cf25bb60f98a29e06c39b7db2998860cd9
  51. sipa commented at 7:06 pm on December 11, 2025: member

    Concept ACK, but disagree with the approach of the first commit.

    I don’t think it tests anything useful:

    • If it wanted to test the exact behavior of the skip-height pointer building, it could test that directly. No need for averages or medians here; it could test that block 171 points to block 161 and no other, for example.
    • If it wanted to test something related to the performance of the constructed index, without hardcoding the exact behavior, it would need to do something like pick random pairs of blocks in the chain, and see how many iterations CBlockIndex::GetAncestor (or a test-only reimplementation thereof that keeps track of that count) takes to get from one to the other. While this is related to properties of the differences, the structure really matters for this, and just distribution/median/average aren’t sufficient to guarantee this number is reasonable.
  52. sipa removed review request from sipa on Dec 11, 2025
  53. optout21 commented at 7:17 pm on December 11, 2025: contributor

    I don’t think it tests anything useful:

    * If it wanted to test the exact behavior of the skip-height pointer building, it could test that directly. No need for averages or medians here; it could test that block 171 points to block 161 and no other, for example.
    
    * If it wanted to test something related to the performance of the constructed index, without hardcoding the exact behavior, it would need to do something like pick random pairs of blocks in the chain, and see how many iterations `CBlockIndex::GetAncestor` (or a test-only reimplementation thereof that keeps track of that count) takes to get from one to the other. While this is related to properties of the differences, the structure really matters for this, and just distribution/median/average aren't sufficient to guarantee this number is reasonable.
    

    I’m more interested in the emergent behavior of the chain of skips, not in the exact values for some concrete values. Therefore I don’t think a unit test checking the result directly for some values has much added value. But testing the number of steps to get between two points on the chain is a good idea, that’s the important property of the skips. Thanks for the suggestion, I will reconsider the test.

  54. sipa commented at 7:19 pm on December 11, 2025: member

    My point is that these two properties, regardless of how they are accomplished, are the interesting ones to test.

    • Your test achieves the first, but only approximately, and is much more complicated than needed for that purpose.
    • Your test does not achieve the second at all, despite having complexity that seems intended to test for it.

    Pick one of the two, and do it well.

  55. optout21 marked this as a draft on Dec 12, 2025
  56. optout21 force-pushed on Dec 12, 2025
  57. optout21 force-pushed on Dec 12, 2025
  58. DrahtBot added the label CI failed on Dec 12, 2025
  59. optout21 force-pushed on Dec 12, 2025
  60. optout21 force-pushed on Dec 12, 2025
  61. optout21 marked this as ready for review on Dec 12, 2025
  62. optout21 commented at 11:57 am on December 12, 2025: contributor

    I redid the test, following @sipa’s remark.

    • The test on skip height properties – median, avg – is dropped
    • A new test is added, that tests the skip heights indirectly, by doing a bunch of ancestor searches, and counting the number of steps needed, and doing some asserts on the step count. In a 200k long chain a series 1000 runs are performed. In each run a start and a target heights are picked randomly: the start is in the upper half of the chain, and the target is below it, by at least 500. Ancestor search is performed, by duplicated the GetAncestor algorithm. (Deterministic randomness is used.)
    • The simple one-input-assert-the-output GetSkipHeight unit test is kept unmodified. This test has limited test harness value, it serves more as a documentation-by-test for this ill-documented method. I’m not very keen on this unit test, I can remove it if reviewers don’t see it useful.
  63. fanquake commented at 1:24 pm on December 12, 2025: member

    https://github.com/bitcoin/bitcoin/actions/runs/20165856545/job/57889037572#step:8:1220:

    0 /home/runner/work/bitcoin/bitcoin/src/test/chain_tests.cpp:87:30: error: use of undeclared identifier 'GetSkipHeight'
    1   87 |             int heightSkip = GetSkipHeight(heightWalk);
    2      |                              ^
    3/home/runner/work/bitcoin/bitcoin/src/test/chain_tests.cpp:88:34: error: use of undeclared identifier 'GetSkipHeight'
    4   88 |             int heightSkipPrev = GetSkipHeight(heightWalk - 1);
    5      |                                  ^
    62 errors generated.
    
  64. in src/test/chain_tests.cpp:152 in 2a1a0f9d89 outdated
    147+                                        heightSkipPrev >= target_height)))) {
    148+                // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
    149+                pindexWalk = pindexWalk->pskip;
    150+                heightWalk = heightSkip;
    151+            } else {
    152+                assert(pindexWalk->pprev);
    


    l0rinc commented at 2:19 pm on December 12, 2025:
    Could unify these to BOOST_REQUIRE instead of assert

    optout21 commented at 6:09 pm on December 15, 2025:
    The traversal code is copied CBlockIndex::GetAncestor(), hence the assert. Changed to BOOST_REQUIRE.
  65. in src/test/chain_tests.cpp:157 in 2a1a0f9d89 outdated
    149+                pindexWalk = pindexWalk->pskip;
    150+                heightWalk = heightSkip;
    151+            } else {
    152+                assert(pindexWalk->pprev);
    153+                pindexWalk = pindexWalk->pprev;
    154+                heightWalk--;
    


    l0rinc commented at 2:20 pm on December 12, 2025:

    return value isn’t needed, could be simplified to:

    0                --heightWalk;
    

    optout21 commented at 6:09 pm on December 15, 2025:
    As this code is copied, and not so important, I leave it as it is.

    l0rinc commented at 6:15 pm on December 15, 2025:
    We usually prefer new code be added according to the best practices instead of surrounding code - the history doesn’t matter, see: https://github.com/bitcoin/bitcoin/blob/fa4395dffd432b999002dfd24eb6f8d7384fbcbe/doc/developer-notes.md?plain=1#L7-L8

    optout21 commented at 10:17 am on December 16, 2025:
    Changed.
  66. in src/test/chain_tests.cpp:162 in 2a1a0f9d89 outdated
    157+        }
    158+
    159+        // Asserts on the number of iterations:
    160+        // always less than 100 and less than 5% of the distance
    161+        BOOST_CHECK_LT(iteration_count, 100);
    162+        BOOST_REQUIRE(delta > 0);
    


    l0rinc commented at 2:20 pm on December 12, 2025:
    0        BOOST_REQUIRE_GT(delta, 0);
    

    And if we keep this, it could go closer to the delta declaration


    optout21 commented at 6:12 pm on December 15, 2025:
    This check protects the next line, which divides by delta. I keep the check just before it is needed. Changed BOOST_REQUIRE to BOOST_REQUIRE_GT
  67. in src/test/chain_tests.cpp:129 in 2a1a0f9d89 outdated
    124+    const auto runs{1000};
    125+    const auto min_distance{500};
    126+    FastRandomContext ctx(/*fDeterministic =*/true);
    127+    for (auto i{0}; i < runs; ++i) {
    128+        // pick a height in the second half of the chain
    129+        auto start_height = chain_size / 2 + ctx.randrange(chain_size / 2);
    


    l0rinc commented at 2:23 pm on December 12, 2025:

    To make sure we don’t have signed/unsigned problems here (since we’re subtracting from this later), we might as well make it explicitly int:

    0        int start_height{chain_size / 2 + ctx.randrange(chain_size / 2)};
    

    Actually we could reduce duplication if we subtract from the end instead:

    0        const int start_height{(chain_size - 1) - ctx.randrange(chain_size / 2)};       // pick a height in the second half of the chain
    

    optout21 commented at 6:12 pm on December 15, 2025:
    Changed.
  68. in src/test/chain_tests.cpp:131 in 2a1a0f9d89 outdated
    126+    FastRandomContext ctx(/*fDeterministic =*/true);
    127+    for (auto i{0}; i < runs; ++i) {
    128+        // pick a height in the second half of the chain
    129+        auto start_height = chain_size / 2 + ctx.randrange(chain_size / 2);
    130+        BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
    131+        auto& start_p = block_index[start_height];
    


    l0rinc commented at 2:25 pm on December 12, 2025:
    can you use brace init consistently to avoid narrowing? And corecheck suggests const reference - though we can likely inline it instead…

    optout21 commented at 10:19 am on December 16, 2025:
    Changed.
  69. in src/test/chain_tests.cpp:62 in 2a1a0f9d89 outdated
    57+                                    0b00010000);
    58+    BOOST_CHECK_EQUAL(GetSkipHeight(0b00011000),
    59+                                    0b00010000);
    60+    BOOST_CHECK_EQUAL(GetSkipHeight(0b10101010),
    61+                                    0b10101000);
    62+    // Odd values: the 2nd and 3rd bits are zeroed
    


    l0rinc commented at 2:31 pm on December 12, 2025:

    I think you mean that the second and third set bits are zerod, right?

    0    // Even heights: clear the lowest set bit.
    1...
    2    // Odd heights: clear the 2nd and 3rd set bits.
    

    optout21 commented at 6:14 pm on December 15, 2025:
    Yes, that’s how it was meant, I added the clarification!
  70. in src/test/chain_tests.cpp:130 in 2a1a0f9d89 outdated
    125+    const auto min_distance{500};
    126+    FastRandomContext ctx(/*fDeterministic =*/true);
    127+    for (auto i{0}; i < runs; ++i) {
    128+        // pick a height in the second half of the chain
    129+        auto start_height = chain_size / 2 + ctx.randrange(chain_size / 2);
    130+        BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
    


    l0rinc commented at 2:37 pm on December 12, 2025:

    This may be useful during development, but it’s basically testing the testing logic itself, isn’t it? If you keep them, can you add them after the initializations to have separate init and check sections, something like:

    0const int start_height{(chain_size - 1) - ctx.randrange(chain_size / 2)};       // pick a height in the second half of the chain
    1const int delta{min_distance + ctx.randrange(start_height - min_distance + 1)}; // pick a target height, earlier by at least min_distance
    2const int target_height{start_height - delta};
    3
    4BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
    5BOOST_REQUIRE(target_height > 0 && target_height < chain_size && target_height < start_height);
    6BOOST_REQUIRE(delta > 0);
    

    optout21 commented at 10:20 am on December 16, 2025:
    Done; delta check no longer needed.
  71. sipa commented at 2:37 pm on December 12, 2025: member
    @optout21 Cool, approach ACK. You’ll need to make sure both commits compile, though.
  72. in src/test/chain_tests.cpp:141 in 2a1a0f9d89 outdated
    133+        auto delta{min_distance + ctx.randrange(start_height - min_distance)};
    134+        auto target_height{start_height - delta};
    135+        BOOST_REQUIRE(target_height > 0 && target_height < chain_size && target_height < start_height);
    136+
    137+        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
    138+        const CBlockIndex* pindexWalk = start_p.get();
    


    l0rinc commented at 2:42 pm on December 12, 2025:

    it seems to me we can inline start_p here:

    0        const auto* pindexWalk{block_index[start_height].get()};
    

    optout21 commented at 6:16 pm on December 15, 2025:
    It is saved in a local variable because it is also used at the end of the test method (auto p_ancestor{start_p->GetAncestor(target_height)};)
  73. in src/test/chain_tests.cpp:144 in 2a1a0f9d89 outdated
    136+
    137+        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
    138+        const CBlockIndex* pindexWalk = start_p.get();
    139+        auto heightWalk = start_height;
    140+        auto iteration_count{0};
    141+        while (heightWalk > target_height) {
    


    l0rinc commented at 2:43 pm on December 12, 2025:

    nit: we could reduce the scope of the iteration-only variables:

    0int iteration_count{0};
    1for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
    

    nit: heightWalk -> height_walk


    optout21 commented at 6:16 pm on December 15, 2025:
    I rather keep it to match the original code.
  74. in src/test/chain_tests.cpp:164 in 2a1a0f9d89 outdated
    155+            }
    156+            ++iteration_count;
    157+        }
    158+
    159+        // Asserts on the number of iterations:
    160+        // always less than 100 and less than 5% of the distance
    


    l0rinc commented at 2:45 pm on December 12, 2025:

    why 100 and why 5%? The numbers are already visible in the code, but not sure why - can we have that in the comment instead?

    The 5% doesn’t seem to hold for non-deterministic random - seems arbitrary. Neither does the 100 - so my suspicion seems to have been accurate that these aren’t real properties of the system, just accidental ones of the sample.


    optout21 commented at 10:21 am on December 16, 2025:
    These were empirically derived values. I’ve adjusted them, see #33661 (comment) .
  75. in src/test/chain_tests.cpp:146 in 2a1a0f9d89 outdated
    138+        const CBlockIndex* pindexWalk = start_p.get();
    139+        auto heightWalk = start_height;
    140+        auto iteration_count{0};
    141+        while (heightWalk > target_height) {
    142+            int heightSkip = GetSkipHeight(heightWalk);
    143+            int heightSkipPrev = GetSkipHeight(heightWalk - 1);
    


    l0rinc commented at 2:51 pm on December 12, 2025:

    Given that we’ve already called BuildSkip, do we need to test this directly or can we do it indirectly as well?

    0            const int height_skip{pindexWalk->pskip->nHeight};
    1            const int height_skip_prev{pindexWalk->pprev->pskip->nHeight};
    

    optout21 commented at 6:17 pm on December 15, 2025:
    Same here, I’d rather keep it to match the original code.
  76. in src/test/chain_tests.cpp:167 in 2a1a0f9d89 outdated
    162+        BOOST_REQUIRE(delta > 0);
    163+        auto rel_iter_count = double(iteration_count) / double(delta);
    164+        BOOST_CHECK_LT(rel_iter_count, 0.05);
    165+
    166+        // Since we are at it, we can also check that `GetAncestor` works
    167+        // and gives the same result
    


    l0rinc commented at 2:52 pm on December 12, 2025:

    If you insist on the comment - though it duplicates what the code already does

    0        // Check that `GetAncestor` works and gives the same result
    

    optout21 commented at 6:17 pm on December 15, 2025:
    Changed
  77. in src/test/chain_tests.cpp:129 in 2a1a0f9d89 outdated
    121+        block_index.push_back(std::move(new_index));
    122+    }
    123+
    124+    const auto runs{1000};
    125+    const auto min_distance{500};
    126+    FastRandomContext ctx(/*fDeterministic =*/true);
    


    l0rinc commented at 2:54 pm on December 12, 2025:

    making it deterministic kinda’ defeats the purpose in my opinion (it’s a low bar to construct something that accidentally works, a lot more difficult to find the exact properties of a system that work less predictable values)

    nit: extra space before =


    optout21 commented at 10:25 am on December 16, 2025:
    Non-deterministic unit tests are a bad practice, they can result in tests passing locally, but fail occasionally in CI, which is a pain. Deterministic random is a good way to make an unbiased selection out of a huge number of possibilities (which would be too slow to test). I disagree that deterministic selection defeats the purpose. (Glancing at the chart attached shows that it covers the problem space well.)

    l0rinc commented at 10:33 am on December 16, 2025:
    when those tests fail they should uncover an actual bug, so I’d argue that’s very useful. If they don’t, the test wasn’t really exercising the algorithm but was testing a special case only, so it was deceptive.

    optout21 commented at 11:14 am on December 16, 2025:
    In my opinion (unit) tests executed in a gating CI should be 100% reproducible. We may disagree on this, which is fine :) There is not much difference between a single run of a randomized test using a fixed seed vs. a variable (e.g. time-based). Multiple runs with the non-fixed seed will cover different input, that’s true, but they will be hard to reproduce. Imagine a random test that has a chance to fail once in every 1000 runs. It can be annoying if it fails suddenly in someone’s PR. Also it’s very difficult to reproduce to analyze it (unless the seed is recorded, which is usually not the case). The failure may be due to a test issue, but it may also have caught a long-hidden bug. A robust and reliable test harness is very valuable from software engineering point of view.
  78. in src/test/chain_tests.cpp:123 in 2a1a0f9d89 outdated
    114+    block_index.push_back(std::make_unique<CBlockIndex>());
    115+    for(auto i{1}; i < chain_size; ++i) {
    116+        auto new_index{std::make_unique<CBlockIndex>()};
    117+        new_index->pprev = block_index.back().get();
    118+        BOOST_CHECK(new_index->pprev);
    119+        new_index->nHeight = new_index->pprev->nHeight + 1;
    


    l0rinc commented at 3:00 pm on December 12, 2025:
    0        new_index->nHeight = i;
    

    optout21 commented at 6:19 pm on December 15, 2025:
    Makes sense, but I keep it as it is. This way it shows better, the for each object nHeight = pprev->nHieght + 1 holds. Plus this is done the same way in the original chain_test test.
  79. in src/test/chain_tests.cpp:164 in 2a1a0f9d89 outdated
    159+        // Asserts on the number of iterations:
    160+        // always less than 100 and less than 5% of the distance
    161+        BOOST_CHECK_LT(iteration_count, 100);
    162+        BOOST_REQUIRE(delta > 0);
    163+        auto rel_iter_count = double(iteration_count) / double(delta);
    164+        BOOST_CHECK_LT(rel_iter_count, 0.05);
    


    l0rinc commented at 3:05 pm on December 12, 2025:

    can we avoid floating point arithmetic here?

    0        BOOST_CHECK_LT(iteration_count * 20, delta);
    

    optout21 commented at 6:20 pm on December 15, 2025:
    I think this makes the intent a bit less clear, and performance is not crucial in this line.
  80. in src/test/chain_tests.cpp:118 in 2a1a0f9d89 outdated
    109+    // Build a chain
    110+    const auto chain_size{200'000};
    111+    std::vector<std::unique_ptr<CBlockIndex>> block_index;
    112+    block_index.reserve(chain_size);
    113+    // Create genesis block
    114+    block_index.push_back(std::make_unique<CBlockIndex>());
    


    l0rinc commented at 4:05 pm on December 12, 2025:

    genesis is only special that it’s the head of the linked list, right? We could include that in the loop:

    0    std::vector<std::unique_ptr<CBlockIndex>> block_index;
    1    block_index.reserve(chain_size);
    2    for (int i{0}; i < chain_size; ++i) {
    3        auto index{std::make_unique<CBlockIndex>()};
    4        index->pprev = i ? block_index.back().get() : nullptr;
    5        index->nHeight = i;
    6        index->BuildSkip();
    7        block_index.push_back(std::move(index));
    8    }
    

    optout21 commented at 6:21 pm on December 15, 2025:
    Makes sense, but it make sense to keep the genesis special in this test.
  81. in src/test/chain_tests.cpp:176 in 2a1a0f9d89 outdated
    166+        // Since we are at it, we can also check that `GetAncestor` works
    167+        // and gives the same result
    168+        auto p_ancestor = start_p->GetAncestor(target_height);
    169+        BOOST_CHECK_EQUAL(p_ancestor, block_index[target_height].get());
    170+    }
    171+}
    


    l0rinc commented at 4:42 pm on December 12, 2025:

    we have some checks in the while, some after, but we don’t have a check for all runs - we could add whether we saw both skips and prevs, something like:

     0    bool saw_skip{false}, saw_prev{false};
     1...
     2    for (int run{0}; run < 1'000; ++run) {
     3...
     4        for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
     5...
     6            if (skip && ... {
     7                saw_skip |= true;
     8            } else {
     9                saw_prev |= true;
    10            }
    11...
    12    BOOST_CHECK(saw_skip && saw_prev);
    

    optout21 commented at 6:22 pm on December 15, 2025:
    It’s sufficient if all test runs are OK, I don’t see what could be checked across all the runs.
  82. in src/test/chain_tests.cpp:151 in 2a1a0f9d89 outdated
    143+            int heightSkipPrev = GetSkipHeight(heightWalk - 1);
    144+            if (pindexWalk->pskip != nullptr &&
    145+                (heightSkip == target_height ||
    146+                (heightSkip > target_height && !(heightSkipPrev < heightSkip - 2 &&
    147+                                        heightSkipPrev >= target_height)))) {
    148+                // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
    


    l0rinc commented at 4:50 pm on December 12, 2025:

    Instead of commenting, can we extract the logic to variables that explain what happens here? After a lot of fiddling with this part, it seems to me this may explain it a bit better:

     0// Perform traversal from start to target, skipping, as implemented in `GetAncestor`
     1const auto* pindex_walk{block_index[start_height].get()};
     2int iteration_count{0};
     3for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
     4    BOOST_REQUIRE(pindex_walk);
     5    const auto* const prev{Assert(pindex_walk->pprev)};
     6    const auto* const skip{Assert(pindex_walk->pskip)};
     7    BOOST_REQUIRE(prev->pskip);
     8
     9    const bool prev_skip_better{prev->pskip->nHeight >= target_height && prev->pskip->nHeight < skip->nHeight - 2};
    10    if (skip->nHeight == target_height || (skip->nHeight > target_height && !prev_skip_better)) {
    11        saw_skip = true;
    12        pindex_walk = skip;
    13        BOOST_REQUIRE(skip->nHeight < height_walk);
    14        height_walk = skip->nHeight;
    15    } else {
    16        saw_prev = true;
    17        pindex_walk = prev;
    18        --height_walk;
    19    }
    20    BOOST_REQUIRE(height_walk >= target_height && height_walk == pindex_walk->nHeight);
    21}
    

    optout21 commented at 6:22 pm on December 15, 2025:
    Again, not changed, copied code.
  83. in src/test/chain_tests.cpp:127 in 2a1a0f9d89 outdated
    120+        new_index->BuildSkip();
    121+        block_index.push_back(std::move(new_index));
    122+    }
    123+
    124+    const auto runs{1000};
    125+    const auto min_distance{500};
    


    l0rinc commented at 4:51 pm on December 12, 2025:
    This will make boundary conditions less likely, running it with constexpr int min_distance{100}; revealed new failures for the randomized case for me locally.

    optout21 commented at 6:24 pm on December 15, 2025:
    The shorter the runs, the more variation there is. The ratio depends on the length (probably a log relationship). Performance-wise the long traversals are interested, that’s why I put a min distance limit.

    sipa commented at 6:27 pm on December 15, 2025:
    On average, when considering all pairs of heights (x,y), with $0 \leq x \leq y \leq N$, I believe the average number of steps scales roughly as $\log^2(N)$.

    optout21 commented at 10:27 am on December 16, 2025:
    Details in #33661 (comment)

    optout21 commented at 10:29 am on December 16, 2025:

    This will make boundary conditions less likely, running it with constexpr int min_distance{100}; revealed new failures for the randomized case for me locally.

    The “5%” rule was very rough, and broke with short distances, you are right. I added it as the global maximum check does not enforce much for short distances. But now the formula-based boundary is stricter over a wide range of distance values.

  84. in src/test/chain_tests.cpp:124 in 2a1a0f9d89 outdated
    119+        new_index->nHeight = new_index->pprev->nHeight + 1;
    120+        new_index->BuildSkip();
    121+        block_index.push_back(std::move(new_index));
    122+    }
    123+
    124+    const auto runs{1000};
    


    l0rinc commented at 4:52 pm on December 12, 2025:

    I think we can safely inline this if we call the loop iteration run:

    0    for (int run{0}; run < 1'000; ++run) {
    

    optout21 commented at 6:24 pm on December 15, 2025:
    Changed.
  85. in src/test/chain_tests.cpp:153 in 2a1a0f9d89 outdated
    145+                (heightSkip == target_height ||
    146+                (heightSkip > target_height && !(heightSkipPrev < heightSkip - 2 &&
    147+                                        heightSkipPrev >= target_height)))) {
    148+                // Only follow pskip if pprev->pskip isn't better than pskip->pprev.
    149+                pindexWalk = pindexWalk->pskip;
    150+                heightWalk = heightSkip;
    


    l0rinc commented at 5:08 pm on December 12, 2025:

    we should assert that this will always converge towards genesis (i.e. it’s decreasing):

    0                BOOST_REQUIRE(skip->nHeight < height_walk);
    

    optout21 commented at 6:31 pm on December 15, 2025:
    It should hold, but validating the traversal algorithm is not the primary aim, and I prefer to keep the duplicated code as close as possible. For this test it is relevant that the traversal completes in a finite number of steps.

    optout21 commented at 10:30 am on December 16, 2025:
    Added now.
  86. in src/test/chain_tests.cpp:108 in 2a1a0f9d89 outdated
    102+    // and asserts that its properties are suitable for performant
    103+    // ancestor search.
    104+    // It does so by imitating the `GetAncestor` function to
    105+    // look for an ancestor, and ensuring that an arbitrary
    106+    // ancestor is found in a number of steps that is much lower
    107+    // than simple linear traversal.
    


    l0rinc commented at 5:11 pm on December 12, 2025:
    “much lower” sounds very subjective and I couldn’t find an obvious asymptote - can we just assert that it’s always lower? Maybe we can use the 110 from GetSkipHeight as an upper limit instead

    optout21 commented at 10:31 am on December 16, 2025:
    Now there is a upper-bound formula that quantifies how much lower.
  87. in src/test/chain_tests.cpp:100 in 2a1a0f9d89 outdated
    94+    BOOST_CHECK_EQUAL(GetSkipHeight(8), 0);
    95+    BOOST_CHECK_EQUAL(GetSkipHeight(9), 1);
    96+    BOOST_CHECK_EQUAL(GetSkipHeight(10), 8);
    97+}
    98+
    99+BOOST_AUTO_TEST_CASE(skip_height_analysis)
    


    l0rinc commented at 5:39 pm on December 12, 2025:

    Seems fast enough:

    ./build/bin/test_bitcoin –run_test=chain_tests/skip_height_analysis 0.53s user 0.02s system 32% cpu 1.714 total

  88. l0rinc changes_requested
  89. l0rinc commented at 5:43 pm on December 12, 2025: contributor

    I might have gone a bit overboard, but I have tried explaining every change that I suggested in context.

    For convenience, here’s the full local diff I did during review (this is the latest state, some of the suggestions may reflect a previous one):

      0BOOST_AUTO_TEST_CASE(get_skip_height_test)
      1{
      2    // Even heights: clear the lowest set bit.
      3    BOOST_CHECK_EQUAL(GetSkipHeight(0b00010010),
      4                                    0b00010000);
      5    BOOST_CHECK_EQUAL(GetSkipHeight(0b00100010),
      6                                    0b00100000);
      7    BOOST_CHECK_EQUAL(GetSkipHeight(0b01000010),
      8                                    0b01000000);
      9    BOOST_CHECK_EQUAL(GetSkipHeight(0b00010100),
     10                                    0b00010000);
     11    BOOST_CHECK_EQUAL(GetSkipHeight(0b00011000),
     12                                    0b00010000);
     13    BOOST_CHECK_EQUAL(GetSkipHeight(0b10101010),
     14                                    0b10101000);
     15    // Odd heights: clear the 2nd and 3rd set bits.
     16    BOOST_CHECK_EQUAL(GetSkipHeight(0b10010011),
     17                                    0b10000001);
     18    BOOST_CHECK_EQUAL(GetSkipHeight(0b10100011),
     19                                    0b10000001);
     20    BOOST_CHECK_EQUAL(GetSkipHeight(0b11000011),
     21                                    0b10000001);
     22    BOOST_CHECK_EQUAL(GetSkipHeight(0b10010101),
     23                                    0b10000001);
     24    BOOST_CHECK_EQUAL(GetSkipHeight(0b10011001),
     25                                    0b10000001);
     26    BOOST_CHECK_EQUAL(GetSkipHeight(0b10101011),
     27                                    0b10100001);
     28    // Some longer random values (even and odd)
     29    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101000),
     30                                    0b0001011101100000);
     31    BOOST_CHECK_EQUAL(GetSkipHeight(0b0001011101101001),
     32                                    0b0001011101000001);
     33    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011000),
     34                                    0b0110101101010000);
     35    BOOST_CHECK_EQUAL(GetSkipHeight(0b0110101101011001),
     36                                    0b0110101101000001);
     37    // All values 0-10
     38    BOOST_CHECK_EQUAL(GetSkipHeight(0), 0);
     39    BOOST_CHECK_EQUAL(GetSkipHeight(1), 0);
     40    BOOST_CHECK_EQUAL(GetSkipHeight(2), 0);
     41    BOOST_CHECK_EQUAL(GetSkipHeight(3), 1);
     42    BOOST_CHECK_EQUAL(GetSkipHeight(4), 0);
     43    BOOST_CHECK_EQUAL(GetSkipHeight(5), 1);
     44    BOOST_CHECK_EQUAL(GetSkipHeight(6), 4);
     45    BOOST_CHECK_EQUAL(GetSkipHeight(7), 1);
     46    BOOST_CHECK_EQUAL(GetSkipHeight(8), 0);
     47    BOOST_CHECK_EQUAL(GetSkipHeight(9), 1);
     48    BOOST_CHECK_EQUAL(GetSkipHeight(10), 8);
     49}
     50
     51BOOST_AUTO_TEST_CASE(skip_height_analysis)
     52{
     53    // This test case indirectly tests the `GetSkipHeight` function,
     54    // and asserts that its properties are suitable for performant
     55    // ancestor search.
     56    // It does so by imitating the `GetAncestor` function to
     57    // look for an ancestor, and ensuring that an arbitrary
     58    // ancestor is found in no more steps than linear traversal.
     59
     60    constexpr int chain_size{200'000};
     61    std::vector<std::unique_ptr<CBlockIndex>> block_index;
     62    block_index.reserve(chain_size);
     63    for (int i{0}; i < chain_size; ++i) {
     64        auto index{std::make_unique<CBlockIndex>()};
     65        index->pprev = i ? block_index.back().get() : nullptr;
     66        index->nHeight = i;
     67        index->BuildSkip();
     68        block_index.push_back(std::move(index));
     69    }
     70
     71    bool saw_skip{false}, saw_prev{false};
     72
     73    FastRandomContext rng(/*fDeterministic=*/false);
     74    for (int run{0}; run < 1'000; ++run) {
     75        const int start_height{(chain_size - 1) - rng.randrange(chain_size / 2)};   // pick a height in the second half of the chain
     76        constexpr int min_distance{100};
     77        const int delta{min_distance + rng.randrange(start_height - min_distance)}; // pick a target height, earlier by at least min_distance
     78        const int target_height{start_height - delta};
     79
     80        BOOST_REQUIRE(start_height > 0 && start_height < chain_size);
     81        BOOST_REQUIRE(target_height > 0 && target_height < chain_size && target_height < start_height);
     82        BOOST_REQUIRE(delta > 0);
     83
     84        // Perform traversal from start to target, skipping, as implemented in `GetAncestor`
     85        const auto* pindex_walk{block_index[start_height].get()};
     86        int iteration_count{0};
     87        for (int height_walk{start_height}; height_walk > target_height; ++iteration_count) {
     88            BOOST_REQUIRE(pindex_walk);
     89            const auto* const prev{Assert(pindex_walk->pprev)};
     90            const auto* const skip{Assert(pindex_walk->pskip)};
     91            BOOST_REQUIRE(prev->pskip);
     92
     93            const bool prev_skip_better{prev->pskip->nHeight >= target_height && prev->pskip->nHeight < skip->nHeight - 2};
     94            if (skip->nHeight == target_height || (skip->nHeight > target_height && !prev_skip_better)) {
     95                saw_skip = true;
     96                pindex_walk = skip;
     97                BOOST_REQUIRE(skip->nHeight < height_walk);
     98                height_walk = skip->nHeight;
     99            } else {
    100                saw_prev = true;
    101                pindex_walk = prev;
    102                --height_walk;
    103            }
    104            BOOST_REQUIRE(height_walk >= target_height && height_walk == pindex_walk->nHeight);
    105        }
    106
    107        // End-of-walk invariants.
    108        BOOST_CHECK_EQUAL(pindex_walk->nHeight, target_height);
    109        BOOST_CHECK_LE(iteration_count, 110); // See GetSkipHeight
    110
    111        // Cross-check with the real implementation.
    112        const auto* ancestor{block_index[start_height]->GetAncestor(target_height)};
    113        BOOST_CHECK_EQUAL(ancestor, block_index[target_height].get());
    114    }
    115
    116    BOOST_CHECK(saw_skip && saw_prev);
    117}
    
  90. optout21 force-pushed on Dec 15, 2025
  91. optout21 commented at 1:48 pm on December 15, 2025: contributor
    Fixed build error in 1st commit (the extern, etc, was added to the 2nd commit only, was missing from 1st).
  92. optout21 commented at 1:48 pm on December 15, 2025: contributor

    2 errors generated.

    Fixed in 464ce57745894d4038c970c5baac3c9da5933c19

  93. optout21 force-pushed on Dec 15, 2025
  94. optout21 force-pushed on Dec 15, 2025
  95. optout21 commented at 5:01 pm on December 15, 2025: contributor
    Applied review comment changes to chain_tests.cpp. -> 2f77d8dd91a87071bc01cfb5b68f01ab176f06f7
  96. Add indirect test for skip height through ancestor search
    Add a unit test that indirectly tests the `GetSkipHeight()` function,
    by doing checks on the performance of ancestor search, that relies on
    GetSkipHeight. Random searches are performed on a long chain, and the
    effective number of steps are counted, and asserted that are much less
    than the distance to the ancestor.
    ca2c9bc63e
  97. optout21 force-pushed on Dec 15, 2025
  98. optout21 commented at 6:36 pm on December 15, 2025: contributor

    I might have gone a bit overboard, but I have tried explaining every change that I suggested in context.

    Thanks for the details comments; addressed case-by-case.

  99. sipa commented at 6:39 pm on December 15, 2025: member

    There is an annoying conflict in design goals here.

    On the one hand, it’s copying code to mimick the traversal algorithm. When copying code, you should not assume that the two remain in sync, and reviewers might rightfully be confused about whether that might be a problem. From this perspective, it’d be better if instead, CChain::GetAncestor itself was modified to count its number of steps performed, and use that in the test rather than a copy/reimplementation.

    On the other hand, the goal isn’t to validate the traversal algorithm itself, but to validate a property of the skiplist it uses. The two are linked, but are not identical. In fact, there is no real reason why the two need to be identical in behavior. The added test here could use a simpler traversal algorithm (e.g. just always pick following pskip whenever that doesn’t jump past the target, and pprev otherwise) - the rest can be seen as a local optimization in the traversal code, which isn’t essential for verifying the skiplist itself.

    So I think it’s worth considering one of those two approaches (annotating the original, or using a simplified copy), but I’m also ok with the current approach - perhaps if some comments are added that explain it’s copied, but doesn’t require keeping the two in sync.

  100. DrahtBot removed the label CI failed on Dec 15, 2025
  101. optout21 marked this as a draft on Dec 16, 2025
  102. optout21 commented at 7:55 am on December 16, 2025: contributor

    So I think it’s worth considering one of those two approaches

    I agree, point taken. I have considered adding the counting to the product code, but it’s not a good idea to modify/extend the production code for testing purpose only, even just slightly. You are right (both of you) that duplicated likely will drift away in the futures, so there is no point in keeping it verbatim copy. I was just thinking that it’s easier to find the original code if it’s a verbatim copy, but that’s a minor consideration. So I give up on the verbatim copy, and I can do @l0rinc ’s suggestions (code style).

  103. optout21 commented at 9:52 am on December 16, 2025: contributor

    Regarding model/approximation (https://github.com/bitcoin/bitcoin/pull/33661#discussion_r2620420093):

    • If there was only the 1-bit clearing (applied for even numbers), the expected average for the skip is $log(y) - 1$. For the number of iterations, the approximation is be $(y-x)/log(y)$.

    • The two-bit clearing is a more complex, but it approximates to $(log(y) - 1)(log(y) - 1)/2$. Combining the two (for 50% even numbers 1-bit, for 50% odd numbers 2-bit) yields $(log(y) - 1)(log(y) + 1)/2$. For iteration count estimate: $2*(y-x)/(log(y) - 1)(log(y) + 1)$ (having the order of $(y-x)/log^2(y)$ ).

    • However, we are interested of the sum of skips in a particular traversal, not random. There is the effect that as bits are cleared, they tend to progress towards numbers with lesser low-order bits set, which mean higher skips. Modeling this is beyond me (and not warranted by this simple test)

    • I looked empirically at the number of iterations vs. the log of the distance ($y-x$). Apparently, the function $8*log(y-x)-40$ gives a good-enough fit for the upper bound (visually derived; log is 2-based). See chart:

    I will use this formula as an upper bound in the test (together with the fixed overall-max of 115, also empirically measured).

  104. Adjust test selection parameters and checks, use log(distance)-derived formula
    I swithed to chain of len 1M, for better fit with actual main chain.
    The start I chose to be in the upper 90% of the chain (that is, over 100,000).
    The target is below start, with the only constraint that the distance is at least 100 (reduced from 1000).
    The absolute value bound check is kept, 115 (higher than the previous 110, as chain length was increased).
    The liner ratio to distance check (5%) was removed, it is too rough, and fails for low distances.
    Added check with the $8*log(y−x) - 40$ formula.
    157caa98da
  105. Styling changes in the (copied) traverasal code a0e30410dc
  106. Add low-level tests for GetSkipHeight()
    This internal method in chain.cpp has low and indirect test
    coverage, and not documented well.
    A straightforward, simple, direct unit test is added.
    The method is changed to be not static and inline, so that
    the test can link with it.
    5f3513b0e0
  107. optout21 force-pushed on Dec 16, 2025
  108. optout21 commented at 10:39 am on December 16, 2025: contributor

    Reworked based on the comments (@l0rinc @sipa, thx):

    • I switched to chain of len 1,000,000, for better fit with actual main chain.
    • The start I chose to be in the upper 90% of the chain (that is, over 100,000).
    • The target is below start, with the only constraint that the distance is at least 100 (reduced from 1,000).
    • The absolute value bound check is kept, 115 (higher than the previous 110, as chain length was increased).
    • The liner ratio to distance check (5%) was removed, it was too rough, and failed for low distances.
    • Added boundary check with the $8*log(y−x) - 40$ formula.
    • The copied traversal code was reworked for consistent style, and some added checks.

    I have kept commits separately, but I plan to squash the commits before merge (the first 3 commits touching the same test method)

    Ready for new review.

  109. optout21 commented at 10:42 am on December 16, 2025: contributor

    An observation regarding the original code in CBlockIndex::GetAncestor:

    These two lines:

    0        int heightSkip = GetSkipHeight(heightWalk);
    1        int heightSkipPrev = GetSkipHeight(heightWalk - 1);
    

    could be changed to

    0            const int skip_height{walk_p->pskip->nHeight};
    1            const int prev_skip_height{walk_p->pprev->pskip->nHeight};
    

    resulting in a slight performance optimization (less invocations to GetSkipHeight). I may explore in the future the option to change if (possible, worth it).

  110. optout21 marked this as ready for review on Dec 16, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-12-23 03:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me