Optimize reindex #7917

pull sipa wants to merge 5 commits into bitcoin:master from sipa:betterreindex changing 12 files +130 −49
  1. sipa commented at 2:13 pm on April 20, 2016: member

    Several changes:

    • Make reindex/import use AcceptBlock rather than ProcessNewBlock, so activation of the resulting best chain is delayed.
    • That delayed activation is handled by the existing “Activating best chain” step in the startup process, which is moved to ThreadImport.
    • Optimize ActivateBestChain to not walk the entire chain to find the most work chain to connect for each block (O(n^2) -> O(n)).

    This has several advantages:

    • As the actual activation is done after all blocks have been added back to the index, it gets to use the checkpoints information (better progress indicator, skipping of signature checks, …).
    • Having a block index with many unactivated blocks (for example, after deleting the chainstate directory) becomes much faster
    • That case also runs in the background now (as it’s simply treated as part of the importing process).

    Downsides:

    • All blocks are read twice from disk during reindex (once to rebuild the index, once to activate).

    Todo:

    • Better progress indicator during the index rebuild phase.
  2. laanwj added the label Validation on Apr 20, 2016
  3. laanwj commented at 2:20 pm on April 20, 2016: member

    Good to see this!

    All blocks are read twice from disk during reindex (once to rebuild the index, once to activate)

    Could this (theoretically) be avoided by just reading the header during the initial scan? after all, the extra header that we tack on includes the size of the block, so we could skip the rest the first time.

  4. sipa commented at 2:22 pm on April 20, 2016: member
    @laanwj Yes, I’ve considered that. That could lead to an inconsistent block index though (as in: I’m not sure whether there are any internal assumptions that all entries in the block index are valid).
  5. jonasschnelli commented at 2:23 pm on April 20, 2016: contributor
    Nice. Concept ACK. Testing this now on a mainnet node -reindex.
  6. sipa commented at 2:25 pm on April 20, 2016: member
    @laanwj Another problem with that approach: say you get a block corrupted near the chain tip on disk, you find it and do a reindex. You expect it to reindex up to that corrupted block and then start back from network. Instead, if you don’t do transaction validation during the rebuild loop, it may happily add the block to the index, and have activation fail (making it mark the chain as invalid).
  7. laanwj commented at 2:27 pm on April 20, 2016: member
    Yes, it would be quite more complex to handle those issues well. In any case if the speed-up is more than the extra disk i/o that this generates takes, it doesn’t matter, no need to worry about it in this pull.
  8. sipa commented at 2:30 pm on April 20, 2016: member
    @jonasschnelli If you feel like benchmarking, an interesting thing to test is (reindex on master) vs (deleting chainstate on this PR), both with signature checks hackishly disabled. Deleting of chainstate caused a very slow, foreground, and (before #7821) uninterruptible process earlier. With this PR is should be (nearly) the same speed as an old reindex.
  9. sipa force-pushed on Apr 20, 2016
  10. sipa commented at 2:39 pm on April 20, 2016: member
    Would it be useful to have a -weakreindex or -rebuild now, which does not throw out the block index, and only the chainstate? That would be faster, as it wouldn’t need the rebuild phase.
  11. laanwj commented at 3:37 pm on April 20, 2016: member

    Would it be useful to have a -weakreindex or -rebuild now, which does not throw out the block index, and only the chainstate? That would be faster, as it wouldn’t need the rebuild phase.

    Would need better reporting about what database is corrupted for that to be useful to end-users. But even without that I can see that being very useful for benchmarking the UTXO database.

    I like the idea of being able to rebuild databases separately. What about the other way around? If you have a UTXO set, but want to reindex the block chain? (e.g. when switching a pruned node to a non-pruned one by copying block data from another node)

    If it was up to me I wouldn’t use the name ‘weakindex’ or ‘rebuid’ (that’s another one for the rescan, reindex, re…. category of vaguenes) but call it what it is: -rebuild-blockindex -rebuild-chainstate etc (using the same name as the db directory names).

  12. sipa commented at 3:41 pm on April 20, 2016: member

    I like the idea of being able to rebuild databases separately. What about the other way around? If you have a UTXO set, but want to reindex the block chain? (e.g. when switching a pruned node to a non-pruned one by copying block data from another node)

    You can just overwrite your blocks/ database with another version (including its index), as long as it has your currently active chain in it.

  13. sipa commented at 4:02 pm on April 20, 2016: member

    If it was up to me I wouldn’t use the name ‘weakindex’ or ‘rebuid’ (that’s another one for the rescan, reindex, re…. category of vaguenes) but call it what it is: -rebuild-blockindex -rebuild-chainstate etc (using the same name as the db directory names).

    -rebuild-chainstate sounds like an excellent idea. I don’t know what -rebuild-blockindex would mean though…

  14. sipa force-pushed on Apr 21, 2016
  15. sipa commented at 12:17 pm on April 21, 2016: member

    Added a commit to implement -reindex-chainstate.

    I guess the semantics for -reindex-blockindex could be:

    • Rebuild the block index without rebuilding the chainstate
    • If afterwards the chainstate’s tip is in the resulting block index, mark every block on the best chain as validated and pretend nothing happened
    • If not, wipe the chainstate and rebuild that as well.
  16. laanwj commented at 12:23 pm on April 21, 2016: member

    I don’t know what -rebuild-blockindex would mean though…

    That hypothetical option would rebuild the blockindex database, say after some new-fangled corruption check detects that that database was corrupted.

    • Throw away the block index (so the database in blocks/index)
    • Reindex blocks

    But don’t touch the chain state in any way, it is assumed to be still good. If the best chain is not included after reindexing the blocks, throw another error that a full reindex -reindex-chainstate is necessary.

    I guess the semantics for -reindex-blockindex could be:

    Yes, something like that, sounds good to me.

  17. sipa force-pushed on Apr 21, 2016
  18. sipa force-pushed on Apr 21, 2016
  19. dcousens commented at 0:03 am on April 22, 2016: contributor
    concept ACK -(rebuild|reindex)-*
  20. sipa commented at 10:24 am on April 22, 2016: member
    I tried implementing -reindex-blockindex, but it is significantly harder: it would mean first initializing with a dummy chainstate, then reindexing the blockindex while leaving the original chainstate untouched, then stopping the node, loading the real chainstate, marking the blocks along the best chain as valid (or fail if not found anymore), and starting the node again.
  21. laanwj commented at 1:09 pm on April 26, 2016: member
    Going to test this
  22. laanwj commented at 1:29 pm on April 26, 2016: member

    Not sure if this is expected behavior, it’s probably harmless; but during the “Reindexing block file” initial scan, rev*.dat files are created with size 0:

    0-rw------- 1 user user         0 Apr 26 15:26 rev00415.dat
    1-rw------- 1 user user         0 Apr 26 15:26 rev00416.dat
    2-rw------- 1 user user         0 Apr 26 15:26 rev00417.dat
    3-rw------- 1 user user         0 Apr 26 15:26 rev00418.dat
    

    Edit: the block counts are off? possibly due to out-of-order blocks not being counted. It looks a bit strange but probably harmless:

    02016-04-26 13:30:48 Reindexing block file blk00471.dat...
    12016-04-26 13:30:51 Loaded 3 blocks from external file in 2585ms
    
  23. laanwj commented at 5:58 pm on April 26, 2016: member

    Did a reindex of my test chain using this patch:

    02016-04-26 13:07:36 Reindexing block file blk00000.dat...
    12016-04-26 13:32:16 UpdateTip: new best=00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048 height=1 version=0x00000001 log2_work=33.000022 tx=2 date='2009-01-09 02:54:25' progress=0.000000 cache=0.0MiB(1tx)
    22016-04-26 17:44:18 UpdateTip: new best=00000000000000000360e6d03330560950285d1062c6f6e6a4558ba1720d6e73 height=407838 version=0x00000004 log2_work=84.507935 tx=123156514 date='2016-04-18 11:34:09' progress=0.994753 cache=43.8MiB(23470tx)
    

    Times:

    0phase1   00:24:40 (1480s)
    1phase2   04:12:02 (15122s)
    2         -----------------
    3total    04:36:42 (16602s)
    

    For comparison, a sync from a local node on the same machine, under the same conditions, takes 05:32:29 (19949s). That’s very good. I have not tried doing a ’traditional’ reindex,

    Edit: also verified that this fixes #7038. Also tested that interrupting the reindex then restarting works, both in the first phase and second phase.

    Edit.2: also did a “traditional” reindex - it took 04:45:49 (17149s). Only a bit longer. Note that this is a fast CPU with 8 cores, so possibly the difference with not validating is less than it would otherwise be.

    Edit.3: verified that utxo set hash is the same after old reindex versus new reindex (just to be sure).

  24. in src/init.cpp: in 691cbc972a outdated
    607     }
    608 
    609+    // scan for better chains in the block chain database, that are not yet connected in the active best chain
    610+    CValidationState state;
    611+    if (!ActivateBestChain(state, chainparams))
    612+        LogPrintf("Failed to connect best block");
    


    laanwj commented at 9:22 am on April 27, 2016:
    Shouldn’t this error be fatal?
  25. jonasschnelli commented at 9:05 am on April 28, 2016: contributor

    Tested on a mainnet node:

    0jonasschnelli@bitcoinsrv:~/.bitcoin$ cat debug.log | grep "Bitcoin version\|Reindexing finished\|height=409149"
    12016-04-27 14:11:51 Bitcoin version v0.12.99.0-fb682d7 (2016-04-27 15:34:26 +0200)
    22016-04-27 14:34:17 Reindexing finished
    32016-04-27 18:52:50 UpdateTip: new best=000000000000000000c2963b062bebef4beff8ef38ae5d81e7ccfa8e05abcb1a height=409149 version=0x00000004 log2_work=84.559763 tx=125119626 date='2016-04-27 14:05:06' progress=0.999875 cache=4.0MiB(0tx)
    

    = Times:

    0phase1    1'346s (00:22:26)
    1phase2   15'513s (04:18:33)
    2===========================
    3total    16'859s (04:40:59)
    

    Full debug log with -debug=reindex http://bitcoinsrv.jonasschnelli.ch/debug.reindex.log (>240MB!)

  26. sipa commented at 2:19 pm on April 28, 2016: member
    Added GUI progress reporting
  27. sipa force-pushed on Apr 28, 2016
  28. sipa commented at 9:22 pm on May 1, 2016: member

    The two phases in the GUI:

    reindexing processing

    If the chainstate is ahead of the block index at startup, you get the same “Processing blocks on disk” now, while the main GUI is running (and not during loading).

  29. sipa force-pushed on May 1, 2016
  30. laanwj commented at 11:22 am on May 2, 2016: member
    @sipa Very nice!
  31. jonasschnelli commented at 1:50 pm on May 2, 2016: contributor

    Just started to sync with master+thisPR a new node over the GUI on Ubuntu 16.04 with dbcache=3500 and got this:

    The GUI used chainActive.tip(). Is this no longer updated continuously?

  32. sipa force-pushed on May 5, 2016
  33. sipa commented at 4:56 pm on May 5, 2016: member
    @jonasschnelli I added another commit that hopefully fixes that (untested).
  34. sipa force-pushed on May 9, 2016
  35. sipa commented at 6:38 pm on May 9, 2016: member
    Rebased.
  36. jonasschnelli commented at 7:54 am on May 10, 2016: contributor

    Just started a Qt reindex with https://bitcoin.jonasschnelli.ch/pulls/7917/ (7ac4114). The debug window shows 0 as current number of blocks (is this intentional?). The progress bar in the main window is updating.

  37. jonasschnelli commented at 9:41 am on May 10, 2016: contributor
    After a while it started updating the debug window (maybe after first reindex [chain index] phase).
  38. laanwj commented at 9:50 am on May 10, 2016: member
    During the first phase the number of blocks will be 0, that makes sense. I guess it could show the longest header chain as well in the debug window, but right now it only shows what is fully validated.
  39. laanwj commented at 9:55 am on May 10, 2016: member

    Hm but I see the same behavior, reindexing a regtest chain. Log shows:

    02016-05-10 09:53:32 UpdateTip: new best=57d8789c33ee61db98c8b87076cae5c637056bff4a4c340b3ef28e9bc853a4a3 height=250 version=0x00000003 log2_work=8.9715436 tx=251 date='2015-06-23 12:20:30' progress=1.000000 cache=0.1MiB(250tx)
    

    While the current number of blocks in the debug window is still at 1. Same for the onmouseover “Processed 1 block of transaction history”

  40. laanwj commented at 10:05 am on May 10, 2016: member

    In my case it is a problem is in the rate limiting in BlockTipChanged. If the updating goes too fast, it will only send a signal for the initial value.

    0if (!initialSync || now - nLastBlockTipUpdateNotification > MODEL_UPDATE_DELAY) {
    

    This is not new in this patch, just exposed by the new syncing behavior. I think we should move ahead with this and solve the UI issues separately.

  41. sipa commented at 7:53 pm on May 15, 2016: member
    @laanwj @jonasschnelli Added another commit to make header updates not compete for rate limiting with block updates.
  42. Make ProcessNewBlock dbp const and update comment d253ec4baa
  43. Switch reindexing to AcceptBlock in-loop and ActivateBestChain afterwards 316623f2c1
  44. Optimize ActivateBestChain for long chains fb8fad1586
  45. Add -reindex-chainstate that does not rebuild block index d3d7547911
  46. Report reindexing progress in GUI b4d24e142e
  47. in src/main.cpp: in 7bb8649816 outdated
    4101@@ -4060,17 +4102,19 @@ bool LoadExternalBlockFile(const CChainParams& chainparams, FILE* fileIn, CDiskB
    4102                         std::multimap<uint256, CDiskBlockPos>::iterator it = range.first;
    4103                         if (ReadBlockFromDisk(block, it->second, chainparams.GetConsensus()))
    4104                         {
    4105-                            LogPrintf("%s: Processing out of order child %s of %s\n", __func__, block.GetHash().ToString(),
    4106+                            LogPrint("reindex", "%s: Processing out of order child %s of %s\n", __func__, block.GetHash().ToString(),
    


    MarcoFalke commented at 10:45 am on May 16, 2016:
    Does this require mention in the -debug=<category> help text?

    sipa commented at 10:38 pm on May 16, 2016:
    Agree!
  48. sipa force-pushed on May 16, 2016
  49. sipa commented at 10:46 pm on May 16, 2016: member
    Squashed and rebased.
  50. pstratem commented at 1:58 am on May 17, 2016: contributor
    utACK fb8fad1586ced69fa37c665a11916ae4c4d0df05
  51. pstratem commented at 3:00 am on May 17, 2016: contributor
    utACK d3d75479115bc3480f163df774ee9dd2f8bd9f54
  52. pstratem commented at 3:08 am on May 17, 2016: contributor
    very untested ACK b4d24e142e25a21c78ab74146c3520f2259fd7c2 (sanity not guaranteed)
  53. pstratem commented at 3:29 am on May 17, 2016: contributor
    On a related note access to chainActive immediately after threadGroup.create_thread(boost::bind(&ThreadImport, vImportFiles)); is not protected by cs_main
  54. laanwj merged this on May 18, 2016
  55. laanwj closed this on May 18, 2016

  56. laanwj referenced this in commit 239d419864 on May 18, 2016
  57. str4d referenced this in commit f42cde3c30 on Feb 23, 2017
  58. sickpig referenced this in commit 2142ea0cd4 on Feb 25, 2017
  59. sickpig referenced this in commit ae828d508d on Feb 25, 2017
  60. sickpig referenced this in commit f1773ac268 on Feb 27, 2017
  61. sickpig referenced this in commit 9ba0220027 on Mar 4, 2017
  62. random-zebra referenced this in commit b993c56afe on Apr 4, 2021
  63. LarryRuane referenced this in commit d68a5581ee on May 27, 2021
  64. zkbot referenced this in commit ab8ed50c4e on Aug 9, 2021
  65. zkbot referenced this in commit f721e3b067 on Aug 9, 2021
  66. in src/qt/clientmodel.cpp:27 in b4d24e142e
    23@@ -24,6 +24,7 @@
    24 class CBlockIndex;
    25 
    26 static const int64_t nClientStartupTime = GetTime();
    27+static int64_t nLastHeaderTipUpdateNotification = 0;
    


    rebroad commented at 11:48 am on September 7, 2021:
    This variable seems to be initialized here, but there’s nowhere in the code where it’s ever updated, and yet it is checked as if the value might change.

    MarcoFalke commented at 11:54 am on September 7, 2021:

    Please don’t use more than 5 year old pull requests to ask beginner programming questions.

    You can open a new issue, if you are absolutely unable to find the answer yourself.

    In this case it is updated via a reference (a reference is like a pointer).

  67. MarcoFalke locked this on Sep 7, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-04 19:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me