index: Fix backwards search for bestblock #23365

pull mzumsande wants to merge 2 commits into bitcoin:master from mzumsande:202110_index_backtrack changing 2 files +19 −7
  1. mzumsande commented at 10:05 pm on October 26, 2021: contributor

    This PR attempts to fix an intermittent Init issue encountered during the stress testing of #23289, which relates to the pruning-compatible filter reconstruction logic introduced in #15946.

    The problem would occur when the node starts with -txindex=1 but ThreadSync is interrupted after it sets m_best_block_index to Genesis, and before it gets do any further work. In that case, during the next restart of the node, an Init error would be thrown because BaseIndex::Init() tries to backtrack from the tip to the last block which has been successfully indexed (here: Genesis), but the backtracking logic didn’t work properly in this case: The loop while (block_to_test && block->pprev && (block->pprev->nStatus & BLOCK_HAVE_DATA)) checks if a predecessor exists before performing the check block_to_test == block and then possbily setting prune_violation = false If block_to_test and block are the Genesis block this check will not be reached because block->pprev does not exist.

    To reproduce this bug on regtest:

    1. start a node with a fresh datadir using -txindex=1 (or any other index)
    2. stop and restart without any index
    3. mine a block
    4. stop and restart again with the index enabled ->InitError Error: txindex best block of the index goes beyond pruned data. (...)

    Fix this by requiring that we have the data for the block of the current iteration block (instead of requiring it for the predecessor block->pprev) That way, the check for block_to_test == block is also reached when block_to_test is the Genesis block. No longer requiring the data of block->pprev also means that we can now prune up to m_best_block_index height without requiring a reindex (one block more than before). I added this edge case to feature_blockfilterindex_prune.py, the new version should fail on master.

  2. DrahtBot added the label UTXO Db and Indexes on Oct 26, 2021
  3. jamesob commented at 2:30 pm on October 27, 2021: member
    Concept ACK
  4. mzumsande marked this as a draft on Oct 27, 2021
  5. mzumsande commented at 11:17 pm on October 27, 2021: contributor
    The proposed change would also mean that it would now be possible to prune up to the best block of an index without needing to reindex (+1 block compared to master). I hope I’m not missing something, but I don’t see why we still need the full data of block->pprev in the case where block is the best block of the index and block->pprev was indexed in the past. I added a test for this edge case to feature_blockfilterindex_prune.py.
  6. DrahtBot commented at 1:35 am on October 28, 2021: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #21726 (Improve Indices on pruned nodes via prune blockers by fjahr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. mzumsande marked this as ready for review on Nov 9, 2021
  8. mzumsande commented at 0:20 am on November 9, 2021: contributor

    Moved out of draft and added steps to reproduce to OP. Not sure if the second commit (test) should be merged, maybe it’s too detailed, plus the test gets reworked in #21726 anyway.

    Note that the last commit of #21726 would partly “fix” the issue too by just not running the affected code unless -prune=1 is specified but doesn’t address the root cause.

  9. in test/functional/feature_blockfilterindex_prune.py:27 in a36e23b183 outdated
    23@@ -24,11 +24,7 @@ def run_test(self):
    24         self.log.info("check if we can access a blockfilter when pruning is enabled but no blocks are actually pruned")
    25         self.sync_index(height=200)
    26         assert_greater_than(len(self.nodes[0].getblockfilter(self.nodes[0].getbestblockhash())['filter']), 0)
    27-        # Mine two batches of blocks to avoid hitting NODE_NETWORK_LIMITED_MIN_BLOCKS disconnection
    28-        self.generate(self.nodes[0], 250)
    29-        self.sync_all()
    30-        self.generate(self.nodes[0], 250)
    31-        self.sync_all()
    32+        self.generate(self.nodes[0], 500)
    


    maflcko commented at 9:54 am on November 9, 2021:
    Why is this change correct?

    mzumsande commented at 10:09 am on November 9, 2021:
    Because preventing disconnections due to NODE_NETWORK_LIMITED_MIN_BLOCKS makes little sense when there is just one node involved in the test.
  10. laanwj commented at 12:45 pm on November 10, 2021: member
    Concept ACK
  11. DrahtBot added the label Needs rebase on Nov 16, 2021
  12. jamesob approved
  13. jamesob commented at 4:35 pm on November 22, 2021: member

    ACK a36e23b183a6b8d1c60598618f2a5c8f31da02e0 (jamesob/ackr/23365.1.mzumsande.index_fix_backwards_sear)

    Thanks for looking into this and figuring out a good fix. Once this is merged, I can update #23289 to cover interrupting init after index initialization.


    I built this locally and confirmed that the attached test fails if the index/base.cpp changes are reverted. Logically the change makes sense - if the index’s best block is the genesis, we should be able to continue indexation from that point (which is all the rephrased while condition does).

    The block = block->pprev; does make me a little nervous (although as-written the code would never cause any problems given how we initialize block_to_test), so if you wanted to you could include an assertion like

     0diff --git a/src/index/base.cpp b/src/index/base.cpp
     1index 88c74ec543..8525dcbfa0 100644
     2--- a/src/index/base.cpp
     3+++ b/src/index/base.cpp
     4@@ -96,6 +96,7 @@ bool BaseIndex::Init()
     5                     prune_violation = false;
     6                     break;
     7                 }
     8+                assert(block->pprev);
     9                 block = block->pprev;
    10             }
    11         }
    
     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK a36e23b183a6b8d1c60598618f2a5c8f31da02e0 ([`jamesob/ackr/23365.1.mzumsande.index_fix_backwards_sear`](https://github.com/jamesob/bitcoin/tree/ackr/23365.1.mzumsande.index_fix_backwards_sear))
     4
     5-----BEGIN PGP SIGNATURE-----
     6
     7iQIzBAEBCgAdFiEEGNRVI1NPYuZCSIrGepNdrbLETwUFAmGbxWwACgkQepNdrbLE
     8TwVIdg/8CKYBVyw9DvvSx/G2E31/z7avW4uwj5cq10HMmjtsT/cCHhFtofDtVvIG
     99binTF/dlYhyzQmv7FduzQkgpZtehokrrBugoktyh6zeJpRV+mPUxZvFJ0VUCK1w
    10+dVJ2zdbDmDuogJR7+ZZbGgBGxPKjS9t/Q+/LQIP221lgAVi1MPJ8oYuyg4+R6hn
    11ueS3piEtXmHnQt0F05PUhJDZZidB/3hRp1bu9Z0H79TrL65wnA1bnDf3+IKbGeQL
    12CPsYxZR4VZRiIqkSESsKuIbIVE2ELNJcQHP5pKoA389dBDbDlMfw1Lo/N+opB1tm
    13kbkrCt0ZB4HEAwsbyi3KWOue5rYkx5zuz6/7QW6sChkfGLwE+F+UfyvzCzy2P8FI
    14Fzd8lx/74O/+0NtcIpY+kSlPxztdQCuqCZHmx8cuYvnfPDIWkg2TrgX77JPoGJHO
    15x8mJBGWrTkxYoEM2brLWBKK24pPbafrx860NWEzJEoJ5xreZOfAzsBZPLtEZtymq
    16hggWvPjgGljGUFnKakjhP3ahMgDBNyQxoXM/n/XIE3NSepvRE7WOc5yxj2gGScM8
    17h6oIYZVmlgbbImA1asNDgXRk4acrA/N+3dZdUCAyjPz0lhx3/g2XznCZsfRyZdjA
    18cftewu0RF0l/XnyRZ37K5xvTUsjDYqb2omufwnmgy2giD9QnHdA=
    19=5ydW
    20-----END PGP SIGNATURE-----
    
    0Tested on Linux-5.10.0-9-amd64-x86_64-with-glibc2.28
    1
    2Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin/db4/include/ CXXFLAGS=-fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis --enable-wallet --enable-debug --with-daemon --enable-natpmp-default CXX=/usr/bin/clang++ CC=/usr/bin/clang PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental --disable-openssl-tests --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental --disable-openssl-tests --no-create --no-recursion
    3
    4Compiled with /usr/bin/ccache /usr/bin/clang++ -std=c++17 -mavx -mavx2 -fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis -O0 -g3 -ftrapv -fdebug-prefix-map=$(abs_top_srcdir)=.  -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection -msse4 -msha -msse4.1 -msse4.2  i
    5
    6Compiler version: Debian clang version 13.0.1-++20211019122913+8a93745a7121-1~exp1~20211019003538.10
    
  14. index: Fix backwards search for bestblock
    This allows filters to be reconstructed when the best known block is
    the Genesis block without needing to reindex.
    It fixes Init errors seen in #23289.
    698c524698
  15. test: Add edge case of pruning up to index height 9600ea0145
  16. mzumsande force-pushed on Nov 26, 2021
  17. DrahtBot removed the label Needs rebase on Nov 26, 2021
  18. mzumsande commented at 3:43 pm on November 26, 2021: contributor

    The block = block->pprev; does make me a little nervous (although as-written the code would never cause any problems given how we initialize block_to_test), so if you wanted to you could include an assertion like

    Thanks - I added the assertion as suggested, and rebased. Note that adding the assert only changes the kind of error: Before, if block->pprev had not existed (which I agree shouldn’t be possible), the InitError would have been thrown (because the while condition includes a check for block).

  19. in src/index/base.cpp:99 in 9600ea0145
     95+            while (block_to_test && block && (block->nStatus & BLOCK_HAVE_DATA)) {
     96                 if (block_to_test == block) {
     97                     prune_violation = false;
     98                     break;
     99                 }
    100+                assert(block->pprev);
    


    jnewbery commented at 1:16 pm on December 7, 2021:
    It may be worth adding a comment here on why block->pprev must exist (because we’re walking backwards from the active chain tip to a block index in the active chain. If a block index that we encounter doesn’t have a backwards pointer, then the chain is malformed).

    jamesob commented at 2:37 pm on December 7, 2021:
    This isn’t actually a correct explanation (the genesis block is in the active chain and doesn’t have pprev set). The reason this assertion will hold is because of how block and block_to_test are chosen; the conditional above this line is guaranteed to break at-or-higher-than genesis because of the few lines of logic above (which, if summarized in a comment, would just end up repeating the fairly-straightforward code above).

    jnewbery commented at 3:46 pm on December 7, 2021:
    I think you’re saying the same thing as I am. Because of the way block and block_to_test are chosen, if we walk backwards from block, we’re guaranteed to encounter block_to_test (either at or before the genesis block).

    mzumsande commented at 9:50 pm on December 13, 2021:
    Sorry I didn’t get to this earlier, would have added a comment, do people think that this warrants a follow-up?

    fanquake commented at 1:30 am on December 14, 2021:
    @mzumsande sounds like it could be.

    mzumsande commented at 10:28 pm on December 14, 2021:
    done in #23777
  20. in test/functional/feature_blockfilterindex_prune.py:53 in 9600ea0145
    49-        self.generate(self.nodes[0], 1000)
    50+        self.generate(self.nodes[0], 502)
    51+
    52+        self.log.info("prune exactly up to the blockfilterindexes best block while blockfilters are disabled")
    53+        pruneheight_2 = self.nodes[0].pruneblockchain(1000)
    54+        assert_equal(pruneheight_2, 998)
    


    jnewbery commented at 1:18 pm on December 7, 2021:
    Is there a specific reason why calling pruneblockchain(1000) prunes up to exactly block 998? Is it because that’s where the block file happens to wrap (i.e. blocks 999 and 1000 are in the same file as later blocks and so can’t be pruned). If so, that seems a bit brittle (e.g. if the default block file size of disk serialization changes, then this test will fail), and definitely seems worthy of a comment.

    mzumsande commented at 9:45 pm on December 13, 2021:

    Is it because that’s where the block file happens to wrap

    Yes, I believe that is the reason, and I agree it’s brittle (although the test relied before on this, e.g. at assert_equal(pruneheight, 248) - I’m not sure though if there is a better way to test pruning in a detailed way.


    mzumsande commented at 10:28 pm on December 14, 2021:
    comment added in #23777
  21. jnewbery commented at 1:18 pm on December 7, 2021: contributor
    Concept ACK.
  22. ryanofsky commented at 8:18 pm on December 7, 2021: contributor
    Partial code review ACK 9600ea01450b0d39be90eb2971c1ac5c9b69a66e for the code change, not the test changes. (Test changes are indirect and little over my head.) It seems obvious that previous code prune_violation = true, while (block->pprev) would incorrectly detect a prune violation at the genesis block, and the fix here make sense and looks correct.
  23. maflcko merged this on Dec 13, 2021
  24. maflcko closed this on Dec 13, 2021

  25. jamesob referenced this in commit 0fc6b72d1d on Dec 13, 2021
  26. sidhujag referenced this in commit 1fd54088ea on Dec 13, 2021
  27. mzumsande referenced this in commit 59ba4cbb0f on Dec 14, 2021
  28. mzumsande referenced this in commit 43ccaf6fdf on Dec 14, 2021
  29. bitcoin deleted a comment on Dec 14, 2021
  30. mzumsande referenced this in commit e4a8d561ed on Dec 15, 2021
  31. maflcko referenced this in commit df6e961c41 on Dec 16, 2021
  32. sidhujag referenced this in commit 8eeb7d1aaf on Dec 16, 2021
  33. 0xB10C referenced this in commit 6c5ecf97f9 on Dec 16, 2021
  34. jamesob referenced this in commit 8904f17ea7 on Dec 29, 2021
  35. mzumsande referenced this in commit 82ec24709b on Jan 17, 2022
  36. rebroad referenced this in commit d1a11217eb on Feb 3, 2022
  37. rebroad referenced this in commit f4db8ed45e on Feb 3, 2022
  38. kwvg referenced this in commit 27bcd1b42b on Feb 26, 2022
  39. Fabcien referenced this in commit eacbda7591 on Apr 5, 2022
  40. Fabcien referenced this in commit 7a31619610 on Nov 25, 2022
  41. bitcoin locked this on Dec 14, 2022
  42. mzumsande deleted the branch on Oct 13, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me