index: Fix backwards search for bestblock #23365

pull mzumsande wants to merge 2 commits into bitcoin:master from mzumsande:202110_index_backtrack changing 2 files +19 −7
  1. mzumsande commented at 10:05 PM on October 26, 2021: contributor

    This PR attempts to fix an intermittent Init issue encountered during the stress testing of #23289, which relates to the pruning-compatible filter reconstruction logic introduced in #15946.

    The problem would occur when the node starts with -txindex=1 but ThreadSync is interrupted after it sets m_best_block_index to Genesis, and before it gets do any further work. In that case, during the next restart of the node, an Init error would be thrown because BaseIndex::Init() tries to backtrack from the tip to the last block which has been successfully indexed (here: Genesis), but the backtracking logic didn't work properly in this case: The loop while (block_to_test && block->pprev && (block->pprev->nStatus & BLOCK_HAVE_DATA)) checks if a predecessor exists before performing the check block_to_test == block and then possbily setting prune_violation = false If block_to_test and block are the Genesis block this check will not be reached because block->pprev does not exist.

    To reproduce this bug on regtest:

    1. start a node with a fresh datadir using -txindex=1 (or any other index)
    2. stop and restart without any index
    3. mine a block
    4. stop and restart again with the index enabled ->InitError Error: txindex best block of the index goes beyond pruned data. (...)

    Fix this by requiring that we have the data for the block of the current iteration block (instead of requiring it for the predecessor block->pprev) That way, the check for block_to_test == block is also reached when block_to_test is the Genesis block. No longer requiring the data of block->pprev also means that we can now prune up to m_best_block_index height without requiring a reindex (one block more than before). I added this edge case to feature_blockfilterindex_prune.py, the new version should fail on master.

  2. DrahtBot added the label UTXO Db and Indexes on Oct 26, 2021
  3. jamesob commented at 2:30 PM on October 27, 2021: member

    Concept ACK

  4. mzumsande marked this as a draft on Oct 27, 2021
  5. mzumsande commented at 11:17 PM on October 27, 2021: contributor

    The proposed change would also mean that it would now be possible to prune up to the best block of an index without needing to reindex (+1 block compared to master). I hope I'm not missing something, but I don't see why we still need the full data of block->pprev in the case where block is the best block of the index and block->pprev was indexed in the past. I added a test for this edge case to feature_blockfilterindex_prune.py.

  6. DrahtBot commented at 1:35 AM on October 28, 2021: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #21726 (Improve Indices on pruned nodes via prune blockers by fjahr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. mzumsande marked this as ready for review on Nov 9, 2021
  8. mzumsande commented at 12:20 AM on November 9, 2021: contributor

    Moved out of draft and added steps to reproduce to OP. Not sure if the second commit (test) should be merged, maybe it's too detailed, plus the test gets reworked in #21726 anyway.

    Note that the last commit of #21726 would partly "fix" the issue too by just not running the affected code unless -prune=1 is specified but doesn't address the root cause.

  9. in test/functional/feature_blockfilterindex_prune.py:27 in a36e23b183 outdated
      23 | @@ -24,11 +24,7 @@ def run_test(self):
      24 |          self.log.info("check if we can access a blockfilter when pruning is enabled but no blocks are actually pruned")
      25 |          self.sync_index(height=200)
      26 |          assert_greater_than(len(self.nodes[0].getblockfilter(self.nodes[0].getbestblockhash())['filter']), 0)
      27 | -        # Mine two batches of blocks to avoid hitting NODE_NETWORK_LIMITED_MIN_BLOCKS disconnection
      28 | -        self.generate(self.nodes[0], 250)
      29 | -        self.sync_all()
      30 | -        self.generate(self.nodes[0], 250)
      31 | -        self.sync_all()
      32 | +        self.generate(self.nodes[0], 500)
    


    maflcko commented at 9:54 AM on November 9, 2021:

    Why is this change correct?


    mzumsande commented at 10:09 AM on November 9, 2021:

    Because preventing disconnections due to NODE_NETWORK_LIMITED_MIN_BLOCKS makes little sense when there is just one node involved in the test.

  10. laanwj commented at 12:45 PM on November 10, 2021: member

    Concept ACK

  11. DrahtBot added the label Needs rebase on Nov 16, 2021
  12. jamesob approved
  13. jamesob commented at 4:35 PM on November 22, 2021: member

    ACK a36e23b183a6b8d1c60598618f2a5c8f31da02e0 (jamesob/ackr/23365.1.mzumsande.index_fix_backwards_sear)

    Thanks for looking into this and figuring out a good fix. Once this is merged, I can update #23289 to cover interrupting init after index initialization.


    I built this locally and confirmed that the attached test fails if the index/base.cpp changes are reverted. Logically the change makes sense - if the index's best block is the genesis, we should be able to continue indexation from that point (which is all the rephrased while condition does).

    The block = block->pprev; does make me a little nervous (although as-written the code would never cause any problems given how we initialize block_to_test), so if you wanted to you could include an assertion like

    diff --git a/src/index/base.cpp b/src/index/base.cpp
    index 88c74ec543..8525dcbfa0 100644
    --- a/src/index/base.cpp
    +++ b/src/index/base.cpp
    @@ -96,6 +96,7 @@ bool BaseIndex::Init()
                         prune_violation = false;
                         break;
                     }
    +                assert(block->pprev);
                     block = block->pprev;
                 }
             }
    

    <details><summary>Show signature data</summary> <p>

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA512
    
    ACK a36e23b183a6b8d1c60598618f2a5c8f31da02e0 ([`jamesob/ackr/23365.1.mzumsande.index_fix_backwards_sear`](https://github.com/jamesob/bitcoin/tree/ackr/23365.1.mzumsande.index_fix_backwards_sear))
    
    -----BEGIN PGP SIGNATURE-----
    
    iQIzBAEBCgAdFiEEGNRVI1NPYuZCSIrGepNdrbLETwUFAmGbxWwACgkQepNdrbLE
    TwVIdg/8CKYBVyw9DvvSx/G2E31/z7avW4uwj5cq10HMmjtsT/cCHhFtofDtVvIG
    9binTF/dlYhyzQmv7FduzQkgpZtehokrrBugoktyh6zeJpRV+mPUxZvFJ0VUCK1w
    +dVJ2zdbDmDuogJR7+ZZbGgBGxPKjS9t/Q+/LQIP221lgAVi1MPJ8oYuyg4+R6hn
    ueS3piEtXmHnQt0F05PUhJDZZidB/3hRp1bu9Z0H79TrL65wnA1bnDf3+IKbGeQL
    CPsYxZR4VZRiIqkSESsKuIbIVE2ELNJcQHP5pKoA389dBDbDlMfw1Lo/N+opB1tm
    kbkrCt0ZB4HEAwsbyi3KWOue5rYkx5zuz6/7QW6sChkfGLwE+F+UfyvzCzy2P8FI
    Fzd8lx/74O/+0NtcIpY+kSlPxztdQCuqCZHmx8cuYvnfPDIWkg2TrgX77JPoGJHO
    x8mJBGWrTkxYoEM2brLWBKK24pPbafrx860NWEzJEoJ5xreZOfAzsBZPLtEZtymq
    hggWvPjgGljGUFnKakjhP3ahMgDBNyQxoXM/n/XIE3NSepvRE7WOc5yxj2gGScM8
    h6oIYZVmlgbbImA1asNDgXRk4acrA/N+3dZdUCAyjPz0lhx3/g2XznCZsfRyZdjA
    cftewu0RF0l/XnyRZ37K5xvTUsjDYqb2omufwnmgy2giD9QnHdA=
    =5ydW
    -----END PGP SIGNATURE-----
    
    

    </p></details>

    <details><summary>Show platform data</summary> <p>

    Tested on Linux-5.10.0-9-amd64-x86_64-with-glibc2.28
    
    Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin/db4/include/ CXXFLAGS=-fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis --enable-wallet --enable-debug --with-daemon --enable-natpmp-default CXX=/usr/bin/clang++ CC=/usr/bin/clang PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental --disable-openssl-tests --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental --disable-openssl-tests --no-create --no-recursion
    
    Compiled with /usr/bin/ccache /usr/bin/clang++ -std=c++17 -mavx -mavx2 -fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis -O0 -g3 -ftrapv -fdebug-prefix-map=$(abs_top_srcdir)=.  -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection -msse4 -msha -msse4.1 -msse4.2  i
    
    Compiler version: Debian clang version 13.0.1-++20211019122913+8a93745a7121-1~exp1~20211019003538.10
    

    </p></details>

  14. index: Fix backwards search for bestblock
    This allows filters to be reconstructed when the best known block is
    the Genesis block without needing to reindex.
    It fixes Init errors seen in #23289.
    698c524698
  15. test: Add edge case of pruning up to index height 9600ea0145
  16. mzumsande force-pushed on Nov 26, 2021
  17. DrahtBot removed the label Needs rebase on Nov 26, 2021
  18. mzumsande commented at 3:43 PM on November 26, 2021: contributor

    The block = block->pprev; does make me a little nervous (although as-written the code would never cause any problems given how we initialize block_to_test), so if you wanted to you could include an assertion like

    Thanks - I added the assertion as suggested, and rebased. Note that adding the assert only changes the kind of error: Before, if block->pprev had not existed (which I agree shouldn't be possible), the InitError would have been thrown (because the while condition includes a check for block).

  19. in src/index/base.cpp:99 in 9600ea0145
      95 | +            while (block_to_test && block && (block->nStatus & BLOCK_HAVE_DATA)) {
      96 |                  if (block_to_test == block) {
      97 |                      prune_violation = false;
      98 |                      break;
      99 |                  }
     100 | +                assert(block->pprev);
    


    jnewbery commented at 1:16 PM on December 7, 2021:

    It may be worth adding a comment here on why block->pprev must exist (because we're walking backwards from the active chain tip to a block index in the active chain. If a block index that we encounter doesn't have a backwards pointer, then the chain is malformed).


    jamesob commented at 2:37 PM on December 7, 2021:

    This isn't actually a correct explanation (the genesis block is in the active chain and doesn't have pprev set). The reason this assertion will hold is because of how block and block_to_test are chosen; the conditional above this line is guaranteed to break at-or-higher-than genesis because of the few lines of logic above (which, if summarized in a comment, would just end up repeating the fairly-straightforward code above).


    jnewbery commented at 3:46 PM on December 7, 2021:

    I think you're saying the same thing as I am. Because of the way block and block_to_test are chosen, if we walk backwards from block, we're guaranteed to encounter block_to_test (either at or before the genesis block).


    mzumsande commented at 9:50 PM on December 13, 2021:

    Sorry I didn't get to this earlier, would have added a comment, do people think that this warrants a follow-up?


    fanquake commented at 1:30 AM on December 14, 2021:

    @mzumsande sounds like it could be.


    mzumsande commented at 10:28 PM on December 14, 2021:

    done in #23777

  20. in test/functional/feature_blockfilterindex_prune.py:53 in 9600ea0145
      49 | -        self.generate(self.nodes[0], 1000)
      50 | +        self.generate(self.nodes[0], 502)
      51 | +
      52 | +        self.log.info("prune exactly up to the blockfilterindexes best block while blockfilters are disabled")
      53 | +        pruneheight_2 = self.nodes[0].pruneblockchain(1000)
      54 | +        assert_equal(pruneheight_2, 998)
    


    jnewbery commented at 1:18 PM on December 7, 2021:

    Is there a specific reason why calling pruneblockchain(1000) prunes up to exactly block 998? Is it because that's where the block file happens to wrap (i.e. blocks 999 and 1000 are in the same file as later blocks and so can't be pruned). If so, that seems a bit brittle (e.g. if the default block file size of disk serialization changes, then this test will fail), and definitely seems worthy of a comment.


    mzumsande commented at 9:45 PM on December 13, 2021:

    Is it because that's where the block file happens to wrap

    Yes, I believe that is the reason, and I agree it's brittle (although the test relied before on this, e.g. at assert_equal(pruneheight, 248) - I'm not sure though if there is a better way to test pruning in a detailed way.


    mzumsande commented at 10:28 PM on December 14, 2021:

    comment added in #23777

  21. jnewbery commented at 1:18 PM on December 7, 2021: contributor

    Concept ACK.

  22. ryanofsky commented at 8:18 PM on December 7, 2021: contributor

    Partial code review ACK 9600ea01450b0d39be90eb2971c1ac5c9b69a66e for the code change, not the test changes. (Test changes are indirect and little over my head.) It seems obvious that previous code prune_violation = true, while (block->pprev) would incorrectly detect a prune violation at the genesis block, and the fix here make sense and looks correct.

  23. maflcko merged this on Dec 13, 2021
  24. maflcko closed this on Dec 13, 2021

  25. jamesob referenced this in commit 0fc6b72d1d on Dec 13, 2021
  26. sidhujag referenced this in commit 1fd54088ea on Dec 13, 2021
  27. mzumsande referenced this in commit 59ba4cbb0f on Dec 14, 2021
  28. mzumsande referenced this in commit 43ccaf6fdf on Dec 14, 2021
  29. bitcoin deleted a comment on Dec 14, 2021
  30. mzumsande referenced this in commit e4a8d561ed on Dec 15, 2021
  31. maflcko referenced this in commit df6e961c41 on Dec 16, 2021
  32. sidhujag referenced this in commit 8eeb7d1aaf on Dec 16, 2021
  33. 0xB10C referenced this in commit 6c5ecf97f9 on Dec 16, 2021
  34. jamesob referenced this in commit 8904f17ea7 on Dec 29, 2021
  35. mzumsande referenced this in commit 82ec24709b on Jan 17, 2022
  36. rebroad referenced this in commit d1a11217eb on Feb 3, 2022
  37. rebroad referenced this in commit f4db8ed45e on Feb 3, 2022
  38. kwvg referenced this in commit 27bcd1b42b on Feb 26, 2022
  39. Fabcien referenced this in commit eacbda7591 on Apr 5, 2022
  40. Fabcien referenced this in commit 7a31619610 on Nov 25, 2022
  41. bitcoin locked this on Dec 14, 2022
  42. mzumsande deleted the branch on Oct 13, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-02 15:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me