assumeutxo (2) #27596

pull jamesob wants to merge 26 commits into bitcoin:master from jamesob:assumeutxo changing 43 files +1597 −289
  1. jamesob commented at 3:08 pm on May 8, 2023: member

    This changeset finishes the first phase of the assumeutxo project. It makes UTXO snapshots loadable via RPC (loadtxoutset) and adds assumeutxo parameters to chainparams. It contains all the remaining changes necessary to both use an assumedvalid snapshot chainstate and do a full validation sync in the background.

    This may look like a lot to review, but note that

    • ~200 lines are a (non-essential) demo shell script
    • Many lines are functional test, documentation, and relatively dilute RPC code.

    So it shouldn’t be as burdensome to review as the linecount might suggest.

    • P2P: minor changes are made to init.cpp and net_processing.cpp to make simultaneous IBD across multiple chainstates work.
    • Pruning: implement correct pruning behavior when using a background chainstate
    • Blockfile separation: to prevent “fragmentation” in blockfile storage, have background chainstates use separate blockfiles from active snapshot chainstates to avoid interleaving heights and impairing pruning.
    • Indexing: some CValidationInterface events are given with an additional parameter, ChainstateRole, and all indexers ignore events from ChainstateRole::ASSUMEDVALID so that indexation only happens sequentially.
    • Have -reindex properly wipe snapshot chainstates.
    • RPC: introduce RPC commands loadtxoutset and (hidden) getchainstates.
    • Release docs & first assumeutxo commitment: add notes and a particular assumeutxo hash value for first AU-enabled release.
      • This will complete the project and allow use of UTXO snapshots for faster node bootstrap.

    The next phase, if it were to be pursued, would be coming up with a way to distribute the UTXO snapshots over the P2P network.


    UTXO snapshots

    Create your own with ./contrib/devtools/utxo_snapshot.sh, e.g.

    0./contrib/devtools/utxo_snapshot.sh 788000 utxo.dat ./src/bitcoin-cli -datadir=$(pwd)/testdata`)
    

    or use the pre-generated ones listed below.

    • Testnet: 2'500'000 (Sjors):
      • torrent: magnet:?xt=urn:btih:511e09f4bf853aefab00de5c070b1e031f0ecbe9&dn=utxo-testnet-2500000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969
      • sha256: 79db4b025448cc0ac388d8589a28eab02de53055d181e34eb47391717aa16388
    • Signet: 160'000 (Sjors):
      • torrent: magnet:?xt=urn:btih:9da986cb27b3980ea7fd06b21e199b148d486880&dn=utxo-signet-160000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969
      • sha256: eeeca845385ba91e84ef58c09d38f98f246a24feadaad57fe1e5874f3f92ef8c
    • Mainnet: 800'000 (Sjors):

    Testing

    For fun (~5min)

    If you want to do a quick test, you can run ./contrib/devtools/test_utxo_snapshots.sh and follow the instructions. This is mostly obviated by the functional tests, though.

    For real (longer)

    If you’d like to experience a real usage of assumeutxo, you can do that too. I’ve cut a new snapshot at height 788'000 (http://img.jameso.be/utxo-788000.dat - but you can do it yourself with ./contrib/devtools/utxo_snapshot.sh if you want). Download that, and then create a datadir for testing:

     0$ cd ~/src/bitcoin  # or whatever
     1
     2# get the snapshot
     3$ curl http://img.jameso.be/utxo-788000.dat > utxo-788000.dat
     4
     5# you'll want to do this if you like copy/pasting 
     6$ export AU_DATADIR=/home/${USER}/au-test # or wherever
     7
     8$ mkdir ${AU_DATADIR}
     9$ vim ${AU_DATADIR}/bitcoin.conf
    10
    11dbcache=8000  # or, you know, something high
    12blockfilterindex=1
    13coinstatsindex=1
    14prune=3000
    15logthreadnames=1
    

    Obtain this branch, build it, and then start bitcoind:

    0$ git remote add jamesob https://github.com/jamesob/bitcoin
    1$ git fetch jamesob assumeutxo
    2$ git checkout jamesob/assumeutxo
    3
    4$ ./configure $conf_args && make  # (whatever you like to do here)
    5
    6# start 'er up and watch the logs
    7$ ./src/bitcoind -datadir=${AU_DATADIR}
    

    Then, in some other window, load the snapshot

    0$ ./src/bitcoin-cli -datadir=${AU_DATADIR} loadtxoutset $(pwd)/utxo-788000.dat
    

    You’ll see some log messages about headers retrieval and waiting to see the snapshot in the headers chain. Once you get the full headers chain, you’ll spend a decent amount of time (~10min) loading the snapshot, checking it, and flushing it to disk. After all that happens, you should be syncing to tip in pretty short order, and you’ll see the occasional [background validation] log message go by.

    In yet another window, you can check out chainstate status with

    0$ ./src/bitcoin-cli -datadir=${AU_DATADIR} getchainstates
    

    as well as usual favorites like getblockchaininfo.

  2. DrahtBot commented at 3:08 pm on May 8, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK achow101
    Concept ACK D33r-Gee, Sjors
    Stale ACK pablomartin4btc, fjahr, naumenkogs, ryanofsky

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28550 (Covenant tools softfork by jamesob)
    • #28453 (wallet: Receive silent payment transactions by achow101)
    • #28368 (Fee Estimator updates from Validation Interface/CScheduler thread by ismaelsadeeq)
    • #28120 (p2p: make block download logic aware of limited peers threshold by furszy)
    • #28031 (Package Relay 1/3: Introduce TxDownloadManager and improve orphan-handling by glozow)
    • #27837 (net: introduce block tracker to retry to download blocks after failure by furszy)
    • #27770 (Introduce ‘getblockfileinfo’ RPC command by furszy)
    • #26966 (index: blockfilter initial sync speedup, parallelize process by furszy)
    • #26762 (bugfix: Make CCheckQueue RAII-styled (attempt 2) by hebasto)
    • #26045 (rpc: Optimize serialization disk space of dumptxoutset by aureleoules)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label CI failed on May 8, 2023
  4. jamesob force-pushed on May 8, 2023
  5. DrahtBot removed the label CI failed on May 8, 2023
  6. jamesob commented at 5:26 pm on May 8, 2023: member

    CI’s passing after a silent conflict in the rebase. I’ve added a link to @Sjors’ snapshot torrent in the PR description.

    No known outstanding issues here; ready for testing!

  7. jamesob force-pushed on May 9, 2023
  8. jamesob commented at 3:30 pm on May 9, 2023: member
    Rebased.
  9. Sjors commented at 7:10 am on May 10, 2023: member
    When running 8f431ad3b600a8f7e6dff83354ac3475b8936505 I noticed (and reproduced) that -prune is not fully honored when loading a snapshot and doing background validation. When I set -prune=550 the blocks dir ends up somewhere between 3 and 5 GB. Judging by the blk….dat timestamps it seems that both the snapshot and background IBD hold on to more blocks than they should. Tested on Ubuntu 23.04 and with coinstats- and blockfilterindex enabled. The first time I tested I allowed the node to sync for a few hours before loading the snapshot, the second time I loaded the snapshot almost immediately.
  10. fjahr commented at 12:58 pm on May 10, 2023: contributor
    @jamesob What’s your feeling on how important figuring out the fix for the pruning issue is at the moment? People don’t consider it blocking #24008, right? Just curious about where to spend review time right now. FWIW, I noted the same issue on getblockfrompeer and wrote a test for it back then, in case you haven’t seen it, it might come in handy: https://github.com/bitcoin/bitcoin/pull/23813
  11. jamesob commented at 3:05 pm on May 10, 2023: member

    What’s your feeling on how important figuring out the fix for the pruning issue is at the moment? People don’t consider it blocking #24008, right?

    Right, pruning issues shouldn’t block #24008 - those changes need to happen regardless of how we manage pruning with snapshots.

    I’ll reproduce/investigate the pruning issues over the coming days. But it’s worth noting that @Sjors didn’t see any regressions in pruning on this branch before loading the snapshot, meaning the degraded pruning behavior only kicks in if the user loads a snapshot. And even though, the code just doesn’t prune as aggressively as it seems it should.

  12. jamesob commented at 3:17 pm on May 10, 2023: member

    Hm, for what it’s worth, pruning is working as expected for me on this branch.

    image

     0Every 2.0s: cat <(./src/bitcoin-cli -datadir=/home/james/tmp/bitcoin-au-testing getmempoolinfo) <( ./src/bitcoin-cli -datadir=/home/james/tmp/bitcoin-au-testing getchainstates)                                                                                                                                                                         fido: Wed May 10 11:16:49 2023
     1
     2{
     3  "loaded": true,
     4  "size": 108146,
     5  "bytes": 42192499,
     6  "usage": 234882016,
     7  "total_fee": 23.04459003,
     8  "maxmempool": 300000000,
     9  "mempoolminfee": 0.00013381,
    10  "minrelaytxfee": 0.00001000,
    11  "incrementalrelayfee": 0.00001000,
    12  "unbroadcastcount": 0,
    13  "fullrbf": false
    14}
    15{
    16  "active_chain_type": "validated_snapshot",
    17  "validated_snapshot": {
    18    "blocks": 789094,
    19    "bestblockhash": "00000000000000000002dad6b3cd82fc00075978eb0d63dcbf6bce3bdc2c3f32",
    20    "difficulty": 48005534313578.78,
    21    "verificationprogress": 0.9999931560421257,
    22    "snapshot_blockhash": "00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e",
    23    "initialblockdownload": false,
    24    "coins_db_cache_bytes": 8388608,
    25    "coins_tip_cache_bytes": 11498684416,
    26    "has_mempool": true
    27  },
    28  "headers": 789094
    29}
    

    Will attempt to repro with a fresh run, I guess?

  13. DrahtBot added the label Needs rebase on May 11, 2023
  14. jamesob force-pushed on May 12, 2023
  15. jamesob commented at 1:22 am on May 12, 2023: member
    Rebased.
  16. DrahtBot removed the label Needs rebase on May 12, 2023
  17. jamesob commented at 2:43 pm on May 12, 2023: member

    I’ve done a number of runs here with some combination of indexing and pruning, and I think this is in about as good a shape as it can be, with the following caveat:

    When using assumeutxo with pruning and indexing, the snapshot chainstate cannot be pruned because of how we handle indexing during bg sync. Since the snapshot chainstate isn’t indexed until after the background sync completes (in order to avoid out-of-order indexation), we aren’t able to prune the snapshot chainstate based upon the prune blocks that the indices configure.

    So the upshot of all this is that pruning, when used with assumeutxo and indexing, is basically best-effort. I think that’s an okay state as long as we’re clear about it to end users. Probably merits some kind of GUI alert which I think can be done later.

    After bg sync completes, we’re in a pretty good state (-blockfilterindex + -coinstatsindex + -prune=550):

     0777M    /home/james/tmp/bitcoin-au-testing/blocks
     1total 667M
     2drwx------ 3 james users 4.0K May 12 07:02 .
     3drwx------ 5 james users 4.0K May 12 10:34 ..
     4-rw------- 1 james users 128M May 11 21:24 blk00014.dat
     5-rw------- 1 james users 128M May 11 21:24 blk00015.dat
     6-rw------- 1 james users 127M May 11 21:24 blk00016.dat
     7-rw------- 1 james users 128M May 12 05:04 blk00017.dat
     8-rw------- 1 james users  32M May 12 06:41 blk02910.dat
     9-rw------- 1 james users  48M May 12 10:31 blk03593.dat
    10drwx------ 2 james users 4.0K May 12 10:34 index
    11-rw------- 1 james users  18M May 11 21:24 rev00014.dat
    12-rw------- 1 james users  18M May 11 21:24 rev00015.dat
    13-rw------- 1 james users  18M May 11 21:24 rev00016.dat
    14-rw------- 1 james users  16M May 12 05:04 rev00017.dat
    15-rw------- 1 james users 3.3M May 12 06:41 rev02910.dat
    16-rw------- 1 james users 6.0M May 12 10:31 rev03593.dat
    

    Through the last few days of testing, I did find that I forgot to the divide the prune target by the number of chainstates outstanding, so that may have created some of the weirdness @Sjors was seeing. But otherwise I just think that enabling all these features together results in the necessary trade-off of blowing out the pruning budget.

    Also worth noting that based on the current size of blocks over the last few weeks, the 244 block trailing prune window is potentially going to result in over 550MB of block data, which is our lowest allowable target.

  18. mzumsande commented at 10:18 pm on May 12, 2023: contributor

    I started testing this by creating / loading a snapshot on signet at height 100016. I haven’t really looked at the code yet and also didn’t read the provided docs so that I could do things by trial and error, and hopefully get to the desired result anyway but have a better chance to encounter some edge cases / bugs along the way.

    Things worked really well! Just some observations / minor issues:

    • Didn’t encounter any problems with indexes, pruning, -reindex or reindex-chainstate
    • In one run, I successfully loaded the chainstate from disk, and then restarted bitcoind (without loading any additional blocks via p2p). Then, init would fail the verification checks with the next startup because the blocks aren’t there: (Verification error: ReadBlockFromDisk failed at 100016, hash=000001301bada810147463d5b1d72caf0807ad1fdaf015d084f5c474b8cc44c6)
    • I am allowed to do invalidateblock on a block smaller than the assumeUTXO block before the background state has been loaded and corrupt my block database that way. That’s obviously not an issue in practice, so probably no need to fix - just wanted to mention it.
    • could add an empty m_assumeutxo_data entry in chainparams for signet, like testnet has
    • in the getchainstates rpc, the field verificationprogress has some numerical fluctuations - maybe round this to some reasonable precision
    • after the background chainstate has been fully validated, I observed some warnings in the log while running with indexes, not sure if this is something serious: 2023-05-12T22:06:22Z ChainStateFlushed: WARNING: Locator contains block (hash=000001301bada810147463d5b1d72caf0807ad1fdaf015d084f5c474b8cc44c6) not on known best chain (tip=000000f04acd86089bf33802cfa46e31fa9f99918c32d5a4e7535ae7de57f805); not writing index locator
  19. DrahtBot added the label CI failed on May 16, 2023
  20. Sjors commented at 10:42 am on May 16, 2023: member

    Here’s a signet snapshot torrent: magnet:?xt=urn:btih:0e1a61b4ed9a57c41d98f1ddd5061bc088c857b6&dn=utxo-signet-140000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969

    Can be added to kernel/chainparams.cpp with:

    0m_assumeutxo_data = MapAssumeutxo{
    1            {
    2                140'000,
    3                {AssumeutxoHash{uint256S("0x91412d7018c8564beef1938eef2bfe18feee53674fd26bea425daa4280437641")}, 2233882},
    4            },
    5        };
    

    This basically gets you to the signet tip faster than you can hit enter :-)

  21. DrahtBot added the label Needs rebase on May 17, 2023
  22. in src/rpc/blockchain.cpp:2732 in bbaa581225 outdated
    2722+    fs::path path = fs::PathFromString(request.params[0].get_str());
    2723+    if (path.is_relative()) {
    2724+        path = gArgs.GetDataDirNet() / path;
    2725+    }
    2726+    FILE* file{fsbridge::fopen(path, "rb")};
    2727+    CAutoFile afile{file, SER_DISK, CLIENT_VERSION};
    


    theStack commented at 11:25 pm on May 17, 2023:

    Right now if a non-existing file path is passed, a rather cryptic error message appears:

    0$ ./src/bitcoin-cli loadtxoutset jslkdf 
    1error code: -1
    2error message:
    3CAutoFile::operator>>: file handle is nullptr: unspecified iostream_category error
    

    Should ensure that afile is not null and throw an explicit error if it is (like dumptxoutset already does), e.g.:

    0    CAutoFile afile{file, SER_DISK, CLIENT_VERSION};
    1    if (afile.IsNull()) {
    2        throw JSONRPCError(
    3            RPC_INVALID_PARAMETER,
    4            "Couldn't open file " + path.u8string() + " for reading.");
    5    }
    

    jamesob commented at 8:34 pm on August 24, 2023:
    Thanks - fixed and added you as a coauther.
  23. theStack commented at 0:37 am on May 18, 2023: contributor

    Did the following more or less random test scenario (build at commit b8fd765a06998a79ef6efef5355c956a4679df6f) over the last days:

    • start up bitcoind -prune=800 -blockfilterindex=1 (-dbcache intentionally left at default)
    • load snapshot from height 788000 via bitcoind loadtxoutset /path/to/utxo-788000.dat (thanks for providing the file!)
    • wait until snapshot chainstate reaches network tip
    • shut down the node and wait 1-2 days (with the idea to again provoke both chainstates being in IBD again)
    • restart bitcoind -prune=800 -blockfilterindex=1
    • wait until snapshot chainstate reaches network tip
    • watch background validation proceed

    So far so good, everything went quite smooth and nothing unexpected happened! The only disappointment was finding that apparently the provided prune target of 800MB was not respected:

    0$ du -h ./datadir/blocks/ 
    1116M    ./datadir/blocks/index
    24.8G    ./datadir/blocks/
    

    (background validation was at block ~230000 at that point)

    Looking at the block files (being in ranges blk00001-blk00030 and blk00084-blk00086), it seems that in my case the prune target is respected only for the background chainstate, but the block files blk00001.dat-blk00030.dat (which were obviously created in the course of snapshot chainstate validation, catching up to the network tip for the second time, looking at the timestamp) are just sitting there and didn’t get pruned anymore for some reason. Any idea what’s going on here?

  24. jamesob commented at 2:15 pm on May 18, 2023: member

    Thanks for testing, @theStack!

    So far so good, everything went quite smooth and nothing unexpected happened! The only disappointment was finding that apparently the provided prune target of 800MB was not respected:

    Yes, this is unfortunately expected behavior. Since you have -blockfilterindex enabled and the background sync hasn’t yet completed, prune locks are preventing any pruning from happening on the snapshot chainstate. I mentioned this obliquely here: #27596 (comment)

    The only way around this would be to mark indexes which can be built out of order (i.e. everything except coinstats) and only halt snapshot indexation for those that can’t. But this is something that could be done after the initial assumeutxo feature.

  25. theStack commented at 11:55 pm on May 22, 2023: contributor

    So far so good, everything went quite smooth and nothing unexpected happened! The only disappointment was finding that apparently the provided prune target of 800MB was not respected:

    Yes, this is unfortunately expected behavior. Since you have -blockfilterindex enabled and the background sync hasn’t yet completed, prune locks are preventing any pruning from happening on the snapshot chainstate. I mentioned this obliquely here: #27596 (comment) @jamesob: Thanks for clarifying (admittedly I didn’t read your comment just a few posts above before posting my test results), it makes sense that we have to keep the blocks from the snapshot chainstate for the indexer at least until background sync completes. I’d agree that this is not a problem as long as it is clearly communicated to the user, e.g. by putting a well-visible warning into the loadtxoutset RPC help. The ultra-conservative approach here would be to simply throw an error if a snapshot is loaded on a node that was started with both pruning and an index enabled (if keeping the -prune target promise has absolute highest priority), but that seems to be too limiting.

  26. jamesob force-pushed on May 23, 2023
  27. jamesob commented at 4:12 pm on May 23, 2023: member
    Rebased
  28. jamesob commented at 4:48 pm on May 23, 2023: member

    Cross-linking to some discussion on how we load the block index that @sdaftuar has surfaced: https://github.com/bitcoin/bitcoin/pull/23174/files#r1198450499

    If anyone has encountered an issue like this while testing this branch, a comment would be welcome.

  29. DrahtBot removed the label CI failed on May 23, 2023
  30. DrahtBot removed the label Needs rebase on May 23, 2023
  31. jamesob commented at 7:49 pm on May 23, 2023: member

    New bug, albeit pretty minor:

    • if you restart the node after activating the snapshot but before connecting any blocks to the snapshot chainstate, ReadBlockFromDisk checks fail when calling VerifyDB() during chainstate initialization. I’m not sure when this would happen in practice, but it’s an annoying edge case.

    Edit: fixed this.

  32. DrahtBot added the label Needs rebase on May 24, 2023
  33. jamesob force-pushed on May 25, 2023
  34. jamesob commented at 8:16 pm on May 25, 2023: member

    Rebased.

    I’ve made a few changes/improvements that I think fixes the bug @sdaftuar found (not adding all potential tips to setBlockIndexCandidates), albeit I’m not yet integrating the changes in #27746.

    • More comprehensive functional tests: the tests now stop and start at more points throughout the sync process, including immediately after snapshot load and before bg validation completes.
    • We now include the blockhash of the snapshot base in m_assumeutxo_data; this allows us to move the nChainTx reconstruction where it belongs, into BlockManager::LoadBlockIndex().
    • All blocks for which we have data (or are assumedvalid) are added to all chainstates’ setBlockIndexCandidates.
      • Some related changes in CheckBlockIndex() needed to be made to update some outdated assertions.

    I’m going to be taking a close look at @sdaftuar’s proposed changes over the next few days to see if I can integrate them here.

  35. jamesob force-pushed on May 25, 2023
  36. DrahtBot added the label CI failed on May 25, 2023
  37. DrahtBot removed the label Needs rebase on May 25, 2023
  38. sdaftuar commented at 10:13 pm on May 25, 2023: member

    All blocks for which we have data (or are assumedvalid) are added to all chainstates’ setBlockIndexCandidates.

    Being assumedvalid is unrelated to whether we should add an entry to setBlockIndexCandidates; setBlockIndexCandidates should have the current tip and any block for which we HAVE_DATA and for which we HAVE_DATA for all the blocks that we’d need to connect, in order to get from our current tip to that target block.

    The assumedvalid bit is really just something we’re using to help with adjusting the invariants in CheckBlockIndex(); it should have no bearing on any of the actual validation logic outside of that.

  39. jamesob force-pushed on May 26, 2023
  40. jamesob force-pushed on May 26, 2023
  41. jamesob force-pushed on May 26, 2023
  42. DrahtBot removed the label CI failed on May 26, 2023
  43. in contrib/devtools/test_utxo_snapshots.sh:20 in fc6299c8c2 outdated
    15+export LC_ALL=C
    16+set -e
    17+
    18+BASE_HEIGHT=${1:-30000}
    19+INCREMENTAL_HEIGHT=20000
    20+FINAL_HEIGHT=$(("$BASE_HEIGHT" + "$INCREMENTAL_HEIGHT"))
    


    alexanderwiederin commented at 12:38 pm on May 30, 2023:

    Getting syntax error: operand expected (error token is ""30000" + "20000"") on macOS

    0FINAL_HEIGHT=$(($BASE_HEIGHT + $INCREMENTAL_HEIGHT))
    

    Sjors commented at 2:45 pm on May 30, 2023:
    Which version of Python are you using?

    jamesob commented at 3:05 pm on May 30, 2023:
    @Sjors this is bash :). If he’s on macOS, probably some kind of zsh?

    maflcko commented at 3:52 pm on May 30, 2023:
    macOS ships an ancient version of bash

    alexanderwiederin commented at 10:41 am on May 31, 2023:

    @jamesob line 1 (#!/usr/bin/env bash) sets the interpreter to Bash regardless of what the system default is. @MarcoFalke you were right! macOS comes with an outdated version of bash (3.2.57). I have updated to 5.2.15. Script runs like a charm now.

    Thanks for the help!


    Sjors commented at 1:56 pm on May 31, 2023:

    @Sjors this is bash :)

    bashes head against table


    jamesob commented at 3:13 pm on May 31, 2023:
    Fixed, thanks all.
  44. in contrib/devtools/test_utxo_snapshots.sh:141 in fc6299c8c2 outdated
    136+echo "-- IBDing more blocks to the server node (height=$FINAL_HEIGHT) so there is a diff between snapshot and tip..."
    137+./src/bitcoind $SERVER_PORTS -logthreadnames=1 -datadir="$SERVER_DATADIR" \
    138+    $CHAIN_HACK_FLAGS -stopatheight="$FINAL_HEIGHT" >/dev/null
    139+
    140+echo
    141+echo "-- Staring the server node to provide blocks to the client node..."
    


    alexanderwiederin commented at 12:39 pm on May 30, 2023:

    Missing the second “t” of “Starting”

    0echo "-- Starting the server node to provide blocks to the client node..."
    

    jamesob commented at 3:13 pm on May 31, 2023:
    Fixed, thanks
  45. DrahtBot added the label Needs rebase on May 30, 2023
  46. jamesob force-pushed on May 31, 2023
  47. jamesob commented at 3:16 pm on May 31, 2023: member

    Rebased

    Also fixed the minor issues with the test_utxo_snapshots.sh demo script.

    Since last push, I’ve completed a full test of the mainnet snapshot; after a few days having finished the background sync, -prune=550 (and everything else) working as expected.

     0759M    /home/james/tmp/bitcoin-au-testing/blocks
     1total 650M
     2drwx------ 3 james users 4.0K May 31 10:54 .
     3drwx------ 5 james users 4.0K May 31 11:09 ..
     4-rw------- 1 james users 127M May 29 15:33 blk03626.dat
     5-rw------- 1 james users 128M May 30 02:38 blk03627.dat
     6-rw------- 1 james users 128M May 30 17:29 blk03628.dat
     7-rw------- 1 james users 128M May 31 06:05 blk03629.dat
     8-rw------- 1 james users  64M May 31 10:57 blk03630.dat
     9drwx------ 2 james users 4.0K May 31 10:54 index
    10-rw------- 1 james users  17M May 29 15:30 rev03626.dat
    11-rw------- 1 james users  15M May 30 02:36 rev03627.dat
    12-rw------- 1 james users  19M May 30 17:23 rev03628.dat
    13-rw------- 1 james users  18M May 31 06:01 rev03629.dat
    14-rw------- 1 james users 8.0M May 31 10:57 rev03630.dat
    
  48. DrahtBot removed the label Needs rebase on May 31, 2023
  49. jamesob force-pushed on May 31, 2023
  50. DrahtBot added the label CI failed on May 31, 2023
  51. DrahtBot added the label Needs rebase on May 31, 2023
  52. jamesob force-pushed on May 31, 2023
  53. jamesob commented at 7:21 pm on May 31, 2023: member
    Rebased
  54. DrahtBot removed the label Needs rebase on May 31, 2023
  55. DrahtBot removed the label CI failed on Jun 1, 2023
  56. hazeycode commented at 0:05 am on June 9, 2023: contributor
    Related minor issue: #27841. I don’t think it matters but posting for visibility in case it happens to be helpful to someone.
  57. ajtowns commented at 9:45 am on June 10, 2023: contributor

    This may look like a lot to review, but note that

    * 200 lines are duplicated in [#24008](/bitcoin-bitcoin/24008/)
    

    This should refer to #27746 now, shouldn’t it?

  58. DrahtBot added the label Needs rebase on Jun 12, 2023
  59. ismaelsadeeq commented at 9:45 pm on June 28, 2023: member

    While testing.

    Once you get the full headers chain, you’ll spend a decent amount of time (~10min) loading the snapshot, checking it, and flushing it to disk

    FlushSnapshotToDisk: saving snapshot chain state (10588 MB) started takes about 5 hours to complete on pop-OS 22. During the FlushSnapshotToDisk process, I believe there should be some progress indication, similar to what occurs during the snapshot loading process. Currently, it appears to be like it is stuck.

    When I attempted to shut down bitcoind while FlushSnapshotToDisk was in progress, it resulted in some sort of a deadlock.

    Below are the logs after I stop bitcoind while saving the snapshot.

    02023-06-27T17:45:52Z [httpworker.0] [snapshot] 90000000 coins loaded (99.72%, 10561 MB)
    12023-06-27T17:45:52Z [httpworker.0] [snapshot] loaded 90249783 (10588 MB) coins from snapshot 00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e
    22023-06-27T17:45:52Z [httpworker.0] FlushSnapshotToDisk: saving snapshot chainstate (10588 MB) started
    32023-06-27T18:00:33Z [httpworker.1] [snapshot] waiting to see blockheader 00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e in headers chain before snapshot activation
    42023-06-27T18:15:54Z [init] tor: Thread interrupt
    52023-06-27T18:15:54Z [addcon] addcon thread exit
    62023-06-27T18:15:54Z [torcontrol] torcontrol thread exit
    72023-06-27T18:15:54Z [init] Shutdown: In progress...
    82023-06-27T18:15:54Z [net] net thread exit
    

    Tried to pkill bitcoind but it’s still running possibly waiting for the flush to finish.

  60. bitcoin deleted a comment on Jun 29, 2023
  61. D33r-Gee commented at 5:04 pm on July 6, 2023: none

    Tested ACK built with and tested on WSL Ubuntu 22.04

    you’ll spend a decent amount of time (~10min) loading the snapshot

    this took closer to 30 min with an Intel(R) Core(TM) i9-10980HK CPU @ 2.40GHz 3.10 GHz

    getchainstates returns:

     0{
     1  "active_chain_type": "snapshot",
     2  "ibd": {
     3    "blocks": 896,
     4    "bestblockhash": "00000000dabe35441efb615ad072e22ece58326cf0a43b990fc00aab808e8589",
     5    "difficulty": 1,
     6    "verificationprogress": 1.076164026301743e-06,
     7    "snapshot_blockhash": "0000000000000000000000000000000000000000000000000000000000000000",
     8    "initialblockdownload": true,
     9    "coins_db_cache_bytes": 419430,
    10    "coins_tip_cache_bytes": 366490419,
    11    "has_mempool": false
    12  },
    13  "snapshot": {
    14    "blocks": 789045,
    15    "bestblockhash": "000000000000000000030939094d57c25369fbed9b6bafe94896de37f47d77f5",
    16    "difficulty": 48005534313578.78,
    17    "verificationprogress": 0.9786754966337523,
    18    "snapshot_blockhash": "00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e",
    19    "initialblockdownload": true,
    20    "coins_db_cache_bytes": 7969177,
    21    "coins_tip_cache_bytes": 6963317964,
    22    "has_mempool": true
    23  },
    24  "headers": 797452
    25}
    

    And getblockchaininfo returns:

     0{
     1  "chain": "main",
     2  "blocks": 789140,
     3  "headers": 797452,
     4  "bestblockhash": "000000000000000000049973648ea7a15bf2c7f3585eb8700f889e9aca5af73c",
     5  "difficulty": 48005534313578.78,
     6  "time": 1683751586,
     7  "mediantime": 1683748352,
     8  "verificationprogress": 0.9788762180334918,
     9  "initialblockdownload": true,
    10  "chainwork": "0000000000000000000000000000000000000000481500309c4ab2d789f587f6",
    11  "size_on_disk": 2225580771,
    12  "pruned": true,
    13  "pruneheight": 788001,
    14  "automatic_pruning": true,
    15  "prune_target_size": 3145728000,
    16  "warnings": "This is a pre-release test build - use at your own risk - do not use for mining or merchant applications"
    17}
    
  62. Sjors commented at 8:11 am on August 1, 2023: member
    With #24008 landed this would a great time for a rebase (I briefly tried but it’s non-trivial).
  63. jamesob commented at 6:21 pm on August 1, 2023: member
    In the process of rebasing now.
  64. jamesob force-pushed on Aug 7, 2023
  65. jamesob force-pushed on Aug 7, 2023
  66. DrahtBot removed the label Needs rebase on Aug 7, 2023
  67. jamesob commented at 7:36 pm on August 7, 2023: member

    Rebased au.027 -> au.029

    The merge of #27607 really made that difficult.

    This rebase fixes merge conflicts as well as incorporates @sdaftuar’s changes from #27746. His net changes are in the first commit here.

    I’m very tired of this project and hope we can focus effort on getting this changeset reviewed and merged to be done with the development of this feature.

  68. DrahtBot added the label Needs rebase on Aug 7, 2023
  69. maflcko commented at 7:29 am on August 8, 2023: member
    (The next rebase should be trivial, only a one-line include conflict, I think)
  70. jamesob force-pushed on Aug 8, 2023
  71. DrahtBot removed the label Needs rebase on Aug 8, 2023
  72. in src/net_processing.cpp:1387 in 253bcd5fd3 outdated
    1409@@ -1406,6 +1410,62 @@ void PeerManagerImpl::FindNextBlocksToDownload(const Peer& peer, unsigned int co
    1410     }
    1411 }
    1412 
    1413+void PeerManagerImpl::TryDownloadingHistoricalBlocks(const Peer& peer, unsigned int count, std::vector<const CBlockIndex*>& vBlocks, const CBlockIndex *from_tip, const CBlockIndex* target_block)
    


    dergoegge commented at 2:30 pm on August 10, 2023:

    This look suspiciously similar to FindNextBlocksToDownload, is there no way to unify the two?

    (sorry if I missed previous discussion on this)


    ryanofsky commented at 1:14 pm on August 15, 2023:

    In commit “First draft of background sync logic” (6be9fd22ee2b9759cdd30a7bca4d07a9346eb57b)

    re: #27596 (review)

    This look suspiciously similar to FindNextBlocksToDownload, is there no way to unify the two?

    This seems pretty bad. The most complicated part of the function is copied and pasted basically verbatim, and to make things worse, the comments which explain how it works are removed in the duplicated code.

    Would suggest 43b322188de227c1161fe3ee2dfd7b3a019cfc6a (branch) as a simpler alternative. It also add a commit description and a doxygen comment to explain all the variables which I found confusing. Feel free to add it to this PR or adopt anything there.


    jamesob commented at 8:36 pm on August 24, 2023:
    Thanks guys - I’ve taken https://github.com/bitcoin/bitcoin/commit/43b322188de227c1161fe3ee2dfd7b3a019cfc6a and made a cosmetic refactoring change.
  73. in src/net_processing.cpp:1974 in 253bcd5fd3 outdated
    1969@@ -1901,7 +1970,10 @@ void PeerManagerImpl::BlockDisconnected(const std::shared_ptr<const CBlock> &blo
    1970  * Maintain state about the best-seen block and fast-announce a compact block
    1971  * to compatible peers.
    1972  */
    1973-void PeerManagerImpl::NewPoWValidBlock(const CBlockIndex *pindex, const std::shared_ptr<const CBlock>& pblock)
    1974+void PeerManagerImpl::NewPoWValidBlock(
    


    dergoegge commented at 2:38 pm on August 10, 2023:
    NewPoWValidBlock is already gated with m_highest_fast_announce but I think it would make sense to also ignore bg chainstate events explicitly.

    jamesob commented at 2:01 pm on August 25, 2023:
    Yep, makes sense. Done!
  74. in src/validationinterface.h:91 in 253bcd5fd3 outdated
    87@@ -87,13 +88,13 @@ class CValidationInterface {
    88      * but may not be called on every intermediate tip. If the latter behavior is desired,
    89      * subscribe to BlockConnected() instead.
    90      *
    91-     * Called on a background thread.
    92+     * Called on a background thread. Only called for the active chainstate.
    


    dergoegge commented at 2:41 pm on August 10, 2023:
    propbably not worth it to pass the chainstate role then?

    jamesob commented at 8:42 pm on August 24, 2023:
    Good idea; done.
  75. dergoegge commented at 2:54 pm on August 10, 2023: member
    The net processing changes look much better (no chainstate pointers on CNodeState, yay!).
  76. in src/node/chainstate.cpp:193 in 3b9d3d63d9 outdated
    189+    bool has_snapshot = chainman.DetectSnapshotChainstate(options.mempool);
    190+
    191+    if (has_snapshot && (options.reindex || options.reindex_chainstate)) {
    192+        return {ChainstateLoadStatus::FAILURE_NO_REINDEX, _(
    193+            "Reindex cannot be performed while background validation is in progress. "
    194+            "Please wait for background sync to complete, or delete data directory "
    


    fjahr commented at 2:42 pm on August 11, 2023:
    Better to be precise and ask them to only recommend deleting blocks, chainstate and indexes. Or alternatively add a reminder that they should backup their wallet before deleting the whole datadir.

    ryanofsky commented at 3:29 pm on August 16, 2023:

    In commit “assumeutxo: disallow -reindex when background syncing” (3b9d3d63d9ccb609ec08da34776f7c5f29d21a20)

    re: #27596 (review)

    I don’t understand the advice to “wait for background sync to complete” in this error message. I thought use-cases for reindex were mostly things like recovering from data corruption or restoring from backup. Is there a case where you would ever want to wait for a sync to complete and then reindex after that?

    Also, I’m not sure why it needs be an error to reindex when a snapshot is loaded. The reindex options are supposed delete the UTXO database (and optionally other databases) without deleting block data, and rebuild using whatever data is left over. So if a snapshot is loaded it seems like the -reindex implementation could just delete both UTXO databases and then continue as normal without the snapshot. If the user still had a copy of the snapshot, they could reload it with the loadtxoutset RPC later and reuse the blocks that were already downloaded.

    It seems like implementing this making making reindex work cleanly with assumeutxo would not be much harder than adding this error message.


    jamesob commented at 2:45 pm on August 25, 2023:
    Okay, good points. I’ve removed this error message completely and instead have implemented the behavior that the snapshot chainstate is completely removed during -reindex{-chainstate}. This is consistent with the spirit of the existing reindex operation, I think. The change is also about as simple as this one.
  77. in src/util/vector.h:73 in cdb73e44b7 outdated
    50@@ -49,4 +51,15 @@ inline V Cat(V v1, const V& v2)
    51     return v1;
    52 }
    53 
    54+template<typename V, typename L>
    55+inline std::optional<V> FindFirst(const std::vector<V>& vec, const L fnc)
    


    fjahr commented at 2:46 pm on August 11, 2023:
    Doesn’t std::find_if work for this?

    jamesob commented at 3:03 pm on August 25, 2023:

    I went to change this and found that, surprisingly, massaging find_if into a std::optional output was actually wordier than the current approach:

     0diff --git a/src/util/vector.h b/src/util/vector.h
     1index 9cd5e118a6..32d1e7f933 100644
     2--- a/src/util/vector.h
     3+++ b/src/util/vector.h
     4@@ -54,12 +54,12 @@ inline V Cat(V v1, const V& v2)
     5 template<typename V, typename L>
     6 inline std::optional<V> FindFirst(const std::vector<V>& vec, const L fnc)
     7 {
     8-    for (const auto& el : vec) {
     9-        if (fnc(el)) {
    10-            return el;
    11-        }
    12+    auto got = std::find_if(vec.begin(), vec.end(), fnc);
    13+    if (got != vec.end()) {
    14+        return *got;
    15+    } else {
    16+        return std::nullopt;
    17     }
    18-    return std::nullopt;
    19 }
    20 
    21 #endif // BITCOIN_UTIL_VECTOR_H
    
  78. in src/zmq/zmqnotificationinterface.cpp:176 in f2a1542827 outdated
    173@@ -173,6 +174,9 @@ void CZMQNotificationInterface::TransactionRemovedFromMempool(const CTransaction
    174 
    175 void CZMQNotificationInterface::BlockConnected(const ChainstateRole role, const std::shared_ptr<const CBlock>& pblock, const CBlockIndex* pindexConnected)
    176 {
    177+    if (role == ChainstateRole::BACKGROUND) {
    


    fjahr commented at 5:31 pm on August 11, 2023:
    Hmm, might be good to update ZMQ and wallet docs as a follow-up that blocks that are connected to the background chain will not be notified…

    jamesob commented at 4:28 pm on August 25, 2023:
    Good point, added release notes for ZMQ.
  79. in src/node/blockstorage.cpp:195 in 9e2b8ef836 outdated
    195+    ChainstateManager& chainman)
    196 {
    197     LOCK2(cs_main, cs_LastBlockFile);
    198-    if (chain_tip_height < 0 || GetPruneTarget() == 0) {
    199+    // Distribute our -prune budget over all chainstates.
    200+    const auto target = GetPruneTarget() / chainman.GetAll().size();
    


    fjahr commented at 6:09 pm on August 11, 2023:
    I am not sure it makes sense to distribute the prune budget between all chainstates equally. Ideally we would sync the background chainstate block by block and throw them away instantly because if we are pruning the likelihood that we keep the bg blocks at the end of IBD is much lower. I’m not sure what the perfect distribution would be but I would shift it more towards the assumed chainstate.

    jamesob commented at 5:28 pm on August 25, 2023:
    Yeah I definitely think we can be smarter about pruning, but because we do it chainstate-by-chainstate I think that better management of the prune budget between chainstates might take a little more rework. Maybe something for a follow-up?

    ryanofsky commented at 1:55 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    I think if you divide by 2 you also need to take max with MIN_DISK_SPACE_FOR_BLOCK_FILES to avoid going below the minimum supported pruning size.

    Maybe the easiest and most conservative thing to do for now would just be to keep the pruning target the same and just write documentation that says if you load a snapshot, you may temporarily need twice as much disk space as otherwise,


    jamesob commented at 8:19 pm on August 29, 2023:

    Good point @ryanofsky. I’m taking your first recommendation here because doubling the prune budget in all cases feels excessive to me.

    I’ve added release notes describing this issue.

  80. in src/rpc/blockchain.cpp:2764 in 8ac7427b41 outdated
    2735+        base_blockhash.ToString());
    2736+
    2737+    NodeContext& node = EnsureAnyNodeContext(request.context);
    2738+    ChainstateManager& chainman = *node.chainman;
    2739+
    2740+    while (max_secs_to_wait_for_headers > 0) {
    


    fjahr commented at 6:37 pm on August 11, 2023:
    A bit simpler approach for this RPC would be to return to the user that the headers are not synced yet immediately and tell them they should try again when the headers are fully synced. I would be interested if others favor this approach as well.

    jamesob commented at 5:27 pm on August 25, 2023:
    Yeah, agree this is weirdly implicit and am open to alternative approaches here. Since we expect headers to sync within a relatively short amount of time, I can see the utility of waiting. If anyone has strong feelings here please chime in.

    Sjors commented at 10:11 am on October 3, 2023:

    Any automated tool that loads a snapshot (e.g. for a server deployment) needs to wait for something before calling this RPC, since they’re not available immediately upon startup.

    So they’ll probably just end up waiting a little bit and then repeatedly calling it until it no long throws an error.

    With that in mind, it seems fine to just fail if header sync is in progress. At the same time these 10 lines of RPC code seem fine and convenient.

    A more thorough improvement would be to make the call return immediately and have another method to track progress. E.g. the loading progress could be shoehorned into getchainstates. Having a way to poll for progress is nice for any UI as well (including our GUI).

    A loadtxoutset progress RPC also helps regular command-line users, because the logs are currently much too noisy to easily follow what’s happening; if you call this very early on the node will be churning through early mainnet blocks at breakneck pace.

  81. fjahr commented at 10:00 pm on August 11, 2023: contributor
    Leaving some initial comments and will do some manual testing next. @jamesob there are several todo comments in the code that sound like you were planning to address them before merge. Are you still planning to address those? Otherwise, could you update the comments so that we know what your plan is for those changes?
  82. Sjors commented at 11:17 am on August 14, 2023: member

    Test failure for a66790c662d6516b9fec9ca2cc97e410a72fe899 on Ubuntu 23.04 (gcc 12.3.0)

     0$ src/test/test_bitcoin --run_test=validation_chainstatemanager_tests
     1Running 7 test cases...
     2node/blockstorage.cpp:307 LoadBlockIndex: Assertion `GetParams().AssumeutxoForBlockhash(*snapshot_blockhash)' failed.
     3unknown location(0): fatal error: in "validation_chainstatemanager_tests/chainstatemanager_loadblockindex": signal: SIGABRT (application abort requested)
     4test/validation_chainstatemanager_tests.cpp(441): last checkpoint
     5test_bitcoin: common/args.cpp:577: void ArgsManager::AddArg(const std::string&, const std::string&, unsigned int, const OptionsCategory&): Assertion `ret.second' failed.
     6unknown location(0): fatal error: in "validation_chainstatemanager_tests/chainstatemanager_snapshot_init": signal: SIGABRT (application abort requested)
     7test/validation_chainstatemanager_tests.cpp(508): last checkpoint: "chainstatemanager_snapshot_init" fixture ctor
     8test_bitcoin: common/args.cpp:577: void ArgsManager::AddArg(const std::string&, const std::string&, unsigned int, const OptionsCategory&): Assertion `ret.second' failed.
     9unknown location(0): fatal error: in "validation_chainstatemanager_tests/chainstatemanager_snapshot_completion": signal: SIGABRT (application abort requested)
    10test/validation_chainstatemanager_tests.cpp(577): last checkpoint: "chainstatemanager_snapshot_completion" fixture ctor
    11test_bitcoin: common/args.cpp:577: void ArgsManager::AddArg(const std::string&, const std::string&, unsigned int, const OptionsCategory&): Assertion `ret.second' failed.
    12unknown location(0): fatal error: in "validation_chainstatemanager_tests/chainstatemanager_snapshot_completion_hash_mismatch": signal: SIGABRT (application abort requested)
    13test/validation_chainstatemanager_tests.cpp(660): last checkpoint: "chainstatemanager_snapshot_completion_hash_mismatch" fixture ctor
    

    It’s happy again the next commit fe735572d05b95569e9c82fa1b1767c7da23b423.

    — Here’s a torrent for 800,000: magnet:?xt=urn:btih:50ee955bef37f5ec3e5b0df4cf0288af3d715a2e&dn=utxo-800000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969

    You’ll need this:

    0{
    1  "coins_written": 111535121,
    2  "base_hash": "00000000000000000002a7c4c1e48d76c5a37902165a270156b7a8d72728a054",
    3  "base_height": 800000,
    4  "txoutset_hash": "7d69b87512db3d13b9758ea32b93ce468d18cf7456fb5d250c9e1bed9339e4d2",
    5  "nchaintx": 868965226
    6}
    

    I’ll try a sync with this shortly. Update: ran successful IBD on Intel macOS and on Ubuntu 23. With the latter I first had it sync until 2013 so the dbcache would be 1 GB, before loading the snapshot.

    On macOS the shutdown (a day) after completing background sync seems gets stuck:

    0023-08-16T07:48:43Z [msghand] [net] Saw new cmpctblock header hash=0000000000000000000229118fcb007fc035045c3a3394233116aa25d496129e peer=398
    12023-08-16T07:48:44Z [msghand] UpdateTip: new best=0000000000000000000229118fcb007fc035045c3a3394233116aa25d496129e height=803421 version=0x20000000 log2_work=94.362393 tx=879900343 date='2023-08-16T07:48:37Z' progress=1.000000 cache=114.9MiB(924996txo)
    22023-08-16T07:49:00Z [msghand] New outbound peer connected: version: 70016, blocks=803421, peer=491 (block-relay-only)
    3^C2023-08-16T07:50:33Z [addcon] addcon thread exit
    42023-08-16T07:50:33Z [opencon] opencon thread exit
    52023-08-16T07:50:33Z [init] Shutdown: In progress...
    62023-08-16T07:50:33Z [net] net thread exit
    72023-08-16T07:50:33Z [msghand] msghand thread exit
    82023-08-16T07:55:12Z [scheduler] Flushed fee estimates to fee_estimates.dat.
    92023-08-16T08:23:36Z [scheduler] Potential stale tip detected, will try using extra outbound peer (last tip update: 2092 seconds ago)
    

    I didn’t run with useful debug flags unfortunately. It was a pruned node. Despite kill -9 it was able to restart in good shape though:

    02023-08-16T09:03:57Z [init] [snapshot] cleaning up unneeded background chainstate, then reinitializing
    
  83. in src/net_processing.cpp:1439 in 6be9fd22ee outdated
    1434+    int nMaxHeight = std::min<int>(target_block->nHeight, nWindowEnd + 1);
    1435+
    1436+    std::vector<const CBlockIndex*> vToFetch;
    1437+
    1438+    while (pindexWalk->nHeight < nMaxHeight) {
    1439+        int nToFetch = std::min(nMaxHeight-pindexWalk->nHeight, std::max<int>(count-vBlocks.size(), 128));
    


    ryanofsky commented at 4:30 pm on August 15, 2023:

    In commit “First draft of background sync logic” (6be9fd22ee2b9759cdd30a7bca4d07a9346eb57b)

    This code is just copied and pasted from previous code which was added back in #4468, but std::max seems like a bug here, and I would expect it to be std::min if the point is to read up to 128 successors of pIndexWalk at a time.

  84. ryanofsky commented at 4:49 pm on August 15, 2023: contributor

    Started review (will update list below with progress).

    • 6be9fd22ee2b9759cdd30a7bca4d07a9346eb57b First draft of background sync logic (1/23)
    • a62ed3267d69ae8f4e9edf68f4a0f5348fdefe2d bugfix: correct is_snapshot_cs in VerifyDB (2/23)
    • 3b9d3d63d9ccb609ec08da34776f7c5f29d21a20 assumeutxo: disallow -reindex when background syncing (3/23)
    • cdb73e44b7f0d4ea186ba657ddf7809434f0a12a chainparams: add blockhash to AssumeutxoData (4/23)
    • 5ea9abfbac3543f2d2cef90a1528abed35623822 validation: MaybeRebalanceCaches when chain leaves IBD (5/23)
    • d9afcd70910f30fe9c99afa4dbbcdd7ccaf29d6e validation: add ChainstateRole (6/23)
    • 2856710077abbc8f07a97adfff5f6829358155be validation: pass ChainstateRole for validationinterface calls (7/23)
    • f2a15428277090d6ad387de6d517303b9bc4e2af validationinterface: only send zmq notifications for active (8/23)
    • 550ac6248d1a596207c357181edbec79236d576d wallet: validationinterface: only handle active chain notifications (9/23)
    • ddbd0639e0af4a56cd5f647e11e0d38f06d20a68 net_processing: validationinterface: ignore some events for bg chain (10/23)
    • c177b8f442892a9b865b0b771f0a4823f797322d index: cache indexer references on ChainstateManager (11/23)
    • f64eeaeeb83d1c21f2d9e65c630eee2a1a5d48ba validation: indexing changes for assumeutxo (12/23)
    • 9e2b8ef836aeb1dc2e4b57449fc56badebea6772 validation: pruning for multiple chainstates (13/23)
    • a66790c662d6516b9fec9ca2cc97e410a72fe899 validation: populate nChainTx value for assumedvalid chainstates (14/23)
    • fe735572d05b95569e9c82fa1b1767c7da23b423 blockstorage: segment normal/assumedvalid blockfiles (15/23)
    • d342085067306231a2a454c39d276495dbd13cd3 validation: assumeutxo: swap m_mempool on snapshot activation (16/23)
    • 8ac7427b41fdec1dbc7e06d0aa2c0e469b879ba2 rpc: add loadtxoutset (17/23)
    • bc3042436e4e4a84328a2e56ca021a617d37c27a rpc: add getchainstates (18/23)
    • 24497a881ebf699b92c01342a24f1c4b35b79590 test: add feature_assumeutxo functional test (19/23)
    • c5ce27802be60d81f909a8f0f6a3827f956d88e3 contrib: add script to demo/test assumeutxo (20/23)
    • 9e46cf539533076d6206964e8d74e504030fc4a5 doc: update release-process.md for assumeutxo (21/23)
    • 9e62b849852bfea64eda18f51137a4a53e80f8c0 chainparams: add assumeutxo param at height 788_000 (22/23)
    • 253bcd5fd392f70de82ec7c675f9777d1e098d50 assumeutxo: swap mempool on existing activation (23/23)
  85. in src/net_processing.cpp:1395 in 6be9fd22ee outdated
    1418+    vBlocks.reserve(count);
    1419+    CNodeState *state = State(peer.m_id);
    1420+    assert(state != nullptr);
    1421+
    1422+    if (state->pindexBestKnownBlock == nullptr || state->pindexBestKnownBlock->GetAncestor(target_block->nHeight) != target_block) {
    1423+        // This peer is useless to us -- it has a chain that doesn't contain
    


    ryanofsky commented at 4:58 pm on August 15, 2023:

    In commit “First draft of background sync logic” (6be9fd22ee2b9759cdd30a7bca4d07a9346eb57b)

    This is an exaggeration, right? Peer doesn’t seem completely useless if it doesn’t contain the snapshot block because it could contain blocks leading up the snapshot block. It this is right it would be good to clarify, or maybe change to allow requesting blocks from the peer.


    ryanofsky commented at 9:11 pm on August 28, 2023:

    re: #27596 (review)

    This is an exaggeration, right?

    In commit “net_processing: Request assumeutxo background chain blocks” (08a34e56ba7aa8714982d352238f1cb6eba43e29)

    Still curious about this and think it would be good to either explain it more, or change the behavior, or just s/useless/less useful/ if that would be more accurate to avoid confusion.


    jamesob commented at 4:57 pm on August 29, 2023:
    I’ve rephrased the comment and added a note about potentially requesting partial subset of blocks in the future.
  86. in src/validation.cpp:1647 in 5ea9abfbac outdated
    1641@@ -1642,6 +1642,9 @@ bool Chainstate::IsInitialBlockDownload() const
    1642     }
    1643     LogPrintf("Leaving InitialBlockDownload (latching to false)\n");
    1644     m_cached_finished_ibd.store(true, std::memory_order_relaxed);
    1645+    // Important that this comes *after* the previous line, else we'll wind up in an
    1646+    // infinite recursion.
    1647+    m_chainman.MaybeRebalanceCaches();
    


    ryanofsky commented at 5:56 pm on August 16, 2023:

    In commit “validation: MaybeRebalanceCaches when chain leaves IBD” (5ea9abfbac3543f2d2cef90a1528abed35623822)

    It seems dangerous if just calling IsInitialBlockDownload to find out IBD state could now block and resize caches and flush state to disk. It also makes the codebase hard to understand if you can’t answer a basic question like “when do caches get resized” without having to look up IsInitialBlockDownload() calls all over the codebase.

    I think it would be better to choose one place where it would be logical to check for the end of IBD and rebalance caches, like after a block is connected or after a header is received, and put the MaybeRebalanceCaches in a specific place instead of letting it happen anywhere.


    jamesob commented at 8:47 pm on August 24, 2023:
    Yeah, that’s fair criticism. I’ve moved this potential rebalance to within ActivateBestChain(), which requires a little bit of extra code but is as you say much more explicit.
  87. DrahtBot added the label Needs rebase on Aug 21, 2023
  88. in src/validation.cpp:3184 in 2856710077 outdated
    3188-                    // the caller so the caller could distinguish between
    3189-                    // completed and interrupted operations.
    3190-                    break;
    3191+                GetMainSignals().UpdatedBlockTip(this->GetRole(), pindexNewTip, pindexFork, fInitialDownload);
    3192+
    3193+                if (this == &m_chainman.ActiveChainstate() &&
    


    ryanofsky commented at 5:28 pm on August 23, 2023:

    In commit “validation: pass ChainstateRole for validationinterface calls” (2856710077abbc8f07a97adfff5f6829358155be)

    Would suggest moving this change to a separate commit. Otherwise adding the chainstate check here is a behavior change in a refactoring commit. Also it doesn’t seem directly related to the other changes here


    jamesob commented at 5:32 pm on August 25, 2023:
    Done, thanks.
  89. in src/validation.cpp:4074 in 2856710077 outdated
    4070+                const auto au_data = Assert(GetParams().AssumeutxoForBlockhash(*snapshot));
    4071+                role = (pindex->nHeight >= au_data->height) ?
    4072+                    ChainstateRole::ASSUMEDVALID : ChainstateRole::BACKGROUND;
    4073+            }
    4074+
    4075+            GetMainSignals().BlockChecked(role, *block, state);
    


    ryanofsky commented at 5:31 pm on August 23, 2023:

    In commit “validation: pass ChainstateRole for validationinterface calls” (2856710077abbc8f07a97adfff5f6829358155be)

    I don’t see the BlockChecked role parameter being used later in the PR, and I don’t know another reason why it should be added. It seems like it would be simpler to not add the parameter and get rid of the code above, which is not very straightforward and seems to basically be making up the role value.


    jamesob commented at 8:47 pm on August 24, 2023:
    Good find! I’ll remove this unused parameter.
  90. in src/validationinterface.h:97 in 2856710077 outdated
     95+    virtual void UpdatedBlockTip(const ChainstateRole, const CBlockIndex *pindexNew, const CBlockIndex *pindexFork, bool fInitialDownload) {}
     96     /**
     97      * Notifies listeners of a transaction having been added to mempool.
     98      *
     99-     * Called on a background thread.
    100+     * Called on a background thread. Only called for the active chainstate.
    


    ryanofsky commented at 5:37 pm on August 23, 2023:

    In commit “validation: pass ChainstateRole for validationinterface calls” (2856710077abbc8f07a97adfff5f6829358155be)

    These TransactionAddedToMempool and TransactionRemovedFromMempool comments are a little confusing because I wouldn’t expect the background chainstate to have a mempool, and even if there were multiple mempools, I wouldn’t think knowing what chainstate they build on to be enough to identify them. I think it would be better to drop this new comment and avoid mixing up chainstate/mempool concepts.


    jamesob commented at 5:45 pm on August 25, 2023:
    Good point, I think these are a vestige when mempool ownership wasn’t so clear. I’ve removed them.
  91. jamesob force-pushed on Aug 24, 2023
  92. jamesob commented at 9:11 pm on August 24, 2023: member

    Rebased, but only part of the way addressing all feedback so far.

    • Took @ryanofsky’s net_processing cleanup commit
    • Better error handling in loadtxoutset (@theStack)
    • Cleaned up unnecessary chainstate roles (@dergoegge, @ryanofsky)
      • Refactoring and behavior changes are now separate commits
  93. DrahtBot removed the label Needs rebase on Aug 24, 2023
  94. DrahtBot added the label CI failed on Aug 25, 2023
  95. DrahtBot removed the label CI failed on Aug 25, 2023
  96. jamesob force-pushed on Aug 25, 2023
  97. jamesob commented at 6:20 pm on August 25, 2023: member

    Okay, the latest rebase is pushed, and I believe I’ve addressed all outstanding feedback (with the exception of two marginal comments from @ryanofsky relating to the first commit here, which perhaps I’ll leave to @sdaftuar… unless he doesn’t get back in some amount of time, in which case I’ll take a look myself).

    I’ve iterated through all commits to ensure they build and pass the chainstate unittests. I’ve also fixed the intermediate-commit test failure that @Sjors discovered, which was based on not updating the unittest to use a “recognized” snapshot early enough. I’ve introduced a commit that fixes that sequencing.

    For anyone curious, the tag updates here are au.029 -> au.031 (GitHub range diff) but the diff is kind of useless because of the rebase.

    Much more useful is doing the range-diff locally yourself:

    0git range-diff master jamesob/au.029 jamesob/au.031
    

    Thanks again for continued review and testing @ryanofsky @dergoegge @fjahr @Sjors @theStack @achow101 @D33r-Gee @ismaelsadeeq @alexanderwiederin @mzumsande @hazeycode and anyone else I forgot.

  98. jamesob commented at 6:21 pm on August 25, 2023: member
    Oops - forgot these comments. :)
  99. fjahr commented at 12:50 pm on August 27, 2023: contributor
    Small update, I did a lot of manual testing on 157e510bc6deaaca43401f15f10105e44926ca68 while also monitoring stats like the size of the datadir etc. I mostly created my own snapshots on the current tip and then patched chainparams myself to be able to use them to push the limit a bit. It worked well and I didn’t run into any issues aside from what I mention in #28350 which is not only an assumeutxo problem.
  100. in src/net_processing.cpp:920 in 08a34e56ba outdated
    915+    *                     first requested the first in-flight block in the
    916+    *                     download window. It is only set if vBlocks is empty at
    917+    *                     the end of this function call and if increasing
    918+    *                     nWindowEnd by 1 would have caused it to be non-empty
    919+    *                     (which indicates the download might be stalled since
    920+    *                     every block in the window is in flight.)
    


    ryanofsky commented at 9:04 pm on August 28, 2023:

    In commit “net_processing: Request assumeutxo background chain blocks” (08a34e56ba7aa8714982d352238f1cb6eba43e29)

    Suggested comment updates to clarify effect of passing activeChain and better describe what it means for a nodeStaller value to be returned:

     0--- a/src/net_processing.cpp
     1+++ b/src/net_processing.cpp
     2@@ -906,18 +906,20 @@ private:
     3     * \param nWindowEnd   Maximum height of blocks to allow in vBlocks. No
     4     *                     blocks will be added above this height.
     5     * \param activeChain  Optional pointer to a chain to compare against. If
     6-    *                     provided, any blocks contained by this chain will only
     7-    *                     be used to update state->pindexLastCommonBlock, and
     8-    *                     will not be appended to vBlocks.
     9+    *                     provided, any next blocks which are already contained
    10+    *                     in this chain will not be appended to vBlocks, but
    11+    *                     instead will be used to update the
    12+    *                     state->pindexLastCommonBlock pointer.
    13     * \param nodeStaller  Optional pointer to a NodeId variable that will receive
    14     *                     the ID of another peer that might be causing this peer
    15     *                     to stall. This is set to the ID of the peer which
    16     *                     first requested the first in-flight block in the
    17     *                     download window. It is only set if vBlocks is empty at
    18     *                     the end of this function call and if increasing
    19-    *                     nWindowEnd by 1 would have caused it to be non-empty
    20-    *                     (which indicates the download might be stalled since
    21-    *                     every block in the window is in flight.)
    22+    *                     nWindowEnd by 1 would cause it to be non-empty (which
    23+    *                     indicates the download might be stalled because every
    24+    *                     block in the window is in flight and no other peer is
    25+    *                     trying to download the next block).
    26     */
    27     void FindNextBlocks(std::vector<const CBlockIndex*>& vBlocks, const Peer& peer, CNodeState *state, const CBlockIndex *pindexWalk, unsigned int count, int nWindowEnd, const CChain* activeChain=nullptr, NodeId* nodeStaller=nullptr) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
    28 
    

    jamesob commented at 4:50 pm on August 29, 2023:
    Done, thanks.
  101. in src/validation.h:1039 in 08a34e56ba outdated
    1032@@ -1036,12 +1033,25 @@ class ChainstateManager
    1033     //! Otherwise, revert to using the ibd chainstate and shutdown.
    1034     SnapshotCompletionResult MaybeCompleteSnapshotValidation() EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    1035 
    1036+    //! Returns nullptr if no snapshot has been loaded.
    1037+    const CBlockIndex* GetSnapshotBaseBlock() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    ryanofsky commented at 9:14 pm on August 28, 2023:

    In commit “net_processing: Request assumeutxo background chain blocks” (08a34e56ba7aa8714982d352238f1cb6eba43e29)

    Any reason these two lines are moving? This change seems like it have been accidentally added in a rebase. Would probably be good to revert.


    jamesob commented at 4:47 pm on August 29, 2023:
    Yep - this method now must be public to be made use of in PeerManagerImpl::SendMessages() (which was a refactoring from the commit you offered in an earlier review - the content of this function had been reproduced).
  102. in src/validation.cpp:4074 in 08a34e56ba outdated
    4069@@ -4070,6 +4070,11 @@ bool ChainstateManager::ProcessNewBlock(const std::shared_ptr<const CBlock>& blo
    4070     if (!ActiveChainstate().ActivateBestChain(state, block)) {
    4071         return error("%s: ActivateBestChain failed (%s)", __func__, state.ToString());
    4072     }
    4073+    BlockValidationState background_sync_state;
    4074+    if (WITH_LOCK(::cs_main, return BackgroundSyncInProgress()) &&
    


    ryanofsky commented at 9:24 pm on August 28, 2023:

    In commit “net_processing: Request assumeutxo background chain blocks” (08a34e56ba7aa8714982d352238f1cb6eba43e29)

    background_sync_state variable is not actually referenced below, and code which locks and unlocks cs_main twice seems potentially racy. Would suggest cleaning this up with:

     0--- a/src/validation.cpp
     1+++ b/src/validation.cpp
     2@@ -4070,10 +4070,11 @@ bool ChainstateManager::ProcessNewBlock(const std::shared_ptr<const CBlock>& blo
     3     if (!ActiveChainstate().ActivateBestChain(state, block)) {
     4         return error("%s: ActivateBestChain failed (%s)", __func__, state.ToString());
     5     }
     6-    BlockValidationState background_sync_state;
     7-    if (WITH_LOCK(::cs_main, return BackgroundSyncInProgress()) &&
     8-            !WITH_LOCK(::cs_main, return *m_ibd_chainstate).ActivateBestChain(state, block)) {
     9-        return error("%s: [background] ActivateBestChain failed (%s)", __func__, state.ToString());
    10+
    11+    Chainstate* bg_chain{WITH_LOCK(cs_main, return BackgroundSyncInProgress() ? m_ibd_chainstate.get() : nullptr)};
    12+    BlockValidationState bg_state;
    13+    if (bg_chain && !bg_chain->ActivateBestChain(bg_state, block)) {
    14+        return error("%s: [background] ActivateBestChain failed (%s)", __func__, bg_state.ToString());
    15     }
    16 
    17     return true;
    

    naumenkogs commented at 10:13 am on August 29, 2023:

    seems potentially racy

    How so?


    ryanofsky commented at 11:49 am on August 29, 2023:

    seems potentially racy

    How so?

    I don’t think there is a real race condition here because if stale a BackgroundSyncInProgress value were used, it wouldn’t cause any harm. By “potentially racy” I just meant that any time I see a mutex being unlocked and immediately relocked I have to turn on the part of my brain that thinks about race conditions, so it’s nicer to use one lock than worry about what state can change in the gap between locks.


    jamesob commented at 5:21 pm on August 29, 2023:
    Good point and thanks for the suggestion; took your diff.
  103. in src/net_processing.cpp:5894 in 08a34e56ba outdated
    5888@@ -5831,6 +5889,14 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
    5889             std::vector<const CBlockIndex*> vToDownload;
    5890             NodeId staller = -1;
    5891             FindNextBlocksToDownload(*peer, MAX_BLOCKS_IN_TRANSIT_PER_PEER - state.vBlocksInFlight.size(), vToDownload, staller);
    5892+            if (m_chainman.BackgroundSyncInProgress() && !IsLimitedPeer(*peer)) {
    5893+                const CBlockIndex *target = m_chainman.GetSnapshotBaseBlock();
    5894+                if (target != nullptr) {
    


    ryanofsky commented at 9:35 pm on August 28, 2023:

    In commit “net_processing: Request assumeutxo background chain blocks” (08a34e56ba7aa8714982d352238f1cb6eba43e29)

    It seems impossible that target could be nullptr here if condition above is true and cs_main is held. Would be clearer to just assert(target) rather than checking for this condition which should be impossible. Or if this is actually expected would be good to add a comment saying when target may be null.


    jamesob commented at 5:03 pm on August 29, 2023:
    Sounds good, since we’ve ensured that the snapshot base block index exists prior to snapshot activation (PopulateAndValidateSnapshot()).
  104. in src/validation.cpp:5727 in 1755c53365 outdated
    5722+    AssertLockHeld(::cs_main);
    5723+    Assert(m_snapshot_chainstate);
    5724+    Assert(m_ibd_chainstate);
    5725+
    5726+    fs::path snapshot_datadir = GetSnapshotCoinsDBPath(*m_snapshot_chainstate);
    5727+    DeleteCoinsDBFromDisk(snapshot_datadir, /*is_snapshot=*/ true);
    


    ryanofsky commented at 9:52 pm on August 28, 2023:

    In commit “assumeutxo: remove snapshot during -reindex{-chainstate}” (1755c53365d53693213a721138a44bdd889d8794)

    Return value from this delete is being ignored. I think DeleteSnapshotChainstate should return a bool and return false if this fails. It would also be good to mark both functions [[nodiscard]] and exit with InitError if the DeleteSnapshotChainstate call fails. I don’t think it makes sense to ignore the failure and do a partial reindexing of the background chain only.


    jamesob commented at 5:35 pm on August 29, 2023:
    Good points, fixed per your advice.
  105. in src/validation.cpp:3291 in 3e7d3537fc outdated
    3201@@ -3195,6 +3202,11 @@ bool Chainstate::ActivateBestChain(BlockValidationState& state, std::shared_ptr<
    3202         }
    3203         // When we reach this point, we switched to a new tip (stored in pindexNewTip).
    3204 
    3205+        if (exited_ibd) {
    3206+            LOCK(::cs_main);
    3207+            m_chainman.MaybeRebalanceCaches();
    


    ryanofsky commented at 10:21 pm on August 28, 2023:

    In commit “validation: MaybeRebalanceCaches when chain leaves IBD” (3e7d3537fc06c8aff8a48edc8f179a70af46e04c)

    I think it would be helpful to have a reminder about why this is necessary, maybe “// Call MaybeRebalanceCaches in case background and snapshot chainstates both exist, and the snapshot chainstate is almost finished downloading blocks, so more resources can be shifted to the background download.”


    jamesob commented at 5:42 pm on August 29, 2023:
    Added a comment that’s slightly more vague because I think the logic for how we balance the caches should be confined to MaybeRebalanceCaches() so that this comment doesn’t fall out of date.
  106. in src/validation.cpp:5872 in 95f45e38b5 outdated
    5728@@ -5729,6 +5729,16 @@ void ChainstateManager::DeleteSnapshotChainstate()
    5729     m_snapshot_chainstate.reset();
    5730 }
    5731 
    5732+ChainstateRole Chainstate::GetRole() const
    


    ryanofsky commented at 10:38 pm on August 28, 2023:

    In commit “validation: add ChainstateRole” (95f45e38b5e47eaef56468909d04418b93e9c962)

    I don’t have a specific suggestion here, just an observation that this method seems like it is called from multiple threads but it doesn’t require cs_main. So potentially there could be a race condition where a snapshot is loaded during this call, or validated during this call and it returns a stale value. I think this is not a new issue, since returning stale values also seems to be characteristic of the existing GetAll() and ActiveChainstate() methods. But in the long term it would probably be better if all of these methods were marked EXCLUSIVE_LOCKS_REQUIRED(cs_main) and stopped acquiring cs_main internally so they could not return stale values and create potential race conditions.


    jamesob commented at 6:19 pm on August 29, 2023:

    Definitely agree here, but it’d be nice to tackle this as a follow-up since inverting all the cs_main acquisitions for ActiveChainstate() will be a big diff (if not mechanical).

    For now I can add EXCLUSIVE_LOCKS_REQUIRED annotations to GetRole(), which compiles okay at the HEAD of this branch.

  107. in src/validation.cpp:3271 in 9eda822411 outdated
    3185@@ -3186,7 +3186,7 @@ bool Chainstate::ActivateBestChain(BlockValidationState& state, std::shared_ptr<
    3186 
    3187             // Notify external listeners about the new tip.
    3188             // Enqueue while holding cs_main to ensure that UpdatedBlockTip is called in the order in which blocks are connected
    3189-            if (pindexFork != pindexNewTip) {
    3190+            if (this == &m_chainman.ActiveChainstate() && pindexFork != pindexNewTip) {
    


    ryanofsky commented at 10:45 pm on August 28, 2023:

    In commit “validation: only call UpdatedBlockTip for active chainstate” (9eda8224119450b8614837e0c8decfc81a6b9915)

    Could update commit message to mention that in addition to skipping CValidationInterface::UpdatedBlockTip notifications for the background chainstate, it also skips kernel::Notifications::blockTip notifications.


    jamesob commented at 7:30 pm on August 29, 2023:
    Done.
  108. in src/test/util/validation.h:24 in 2873d91ada outdated
    18@@ -19,7 +19,11 @@ struct TestChainstateManager : public ChainstateManager {
    19 class ValidationInterfaceTest
    20 {
    21 public:
    22-    static void BlockConnected(CValidationInterface& obj, const std::shared_ptr<const CBlock>& block, const CBlockIndex* pindex);
    23+    static void BlockConnected(
    24+        const ChainstateRole role,
    25+        CValidationInterface& obj,
    


    ryanofsky commented at 10:50 pm on August 28, 2023:

    In commit “validation: pass ChainstateRole for validationinterface calls” (2873d91ada1c240de280a35bbf169ed0c708e057)

    Argument order here is a little weird. It would make more sense for it to be obj, role rather than role, obj so the object is still first and so role is right before block just like everywhere else.


    jamesob commented at 7:31 pm on August 29, 2023:
    I figured that since we’d immediately be filtering on role for many of these notifications it kind of made sense going first. I can change it but I won’t do that unless you really want me to because I’m lazy.
  109. in src/validationinterface.h:91 in 2873d91ada outdated
    87@@ -87,7 +88,7 @@ class CValidationInterface {
    88      * but may not be called on every intermediate tip. If the latter behavior is desired,
    89      * subscribe to BlockConnected() instead.
    90      *
    91-     * Called on a background thread.
    92+     * Called on a background thread. Only called for the active chainstate.
    


    ryanofsky commented at 10:51 pm on August 28, 2023:

    In commit “validation: pass ChainstateRole for validationinterface calls” (2873d91ada1c240de280a35bbf169ed0c708e057)

    Would make more sense to update this comment in the previous commit “validation: only call UpdatedBlockTip for active chainstate” (9eda8224119450b8614837e0c8decfc81a6b9915) instead of this commit.


    jamesob commented at 7:33 pm on August 29, 2023:
    Done.
  110. in src/validationinterface.h:208 in 2873d91ada outdated
    204@@ -199,11 +205,11 @@ class CMainSignals {
    205     void UpdatedBlockTip(const CBlockIndex *, const CBlockIndex *, bool fInitialDownload);
    206     void TransactionAddedToMempool(const CTransactionRef&, uint64_t mempool_sequence);
    207     void TransactionRemovedFromMempool(const CTransactionRef&, MemPoolRemovalReason, uint64_t mempool_sequence);
    208-    void BlockConnected(const std::shared_ptr<const CBlock> &, const CBlockIndex *pindex);
    209+    void BlockConnected(const ChainstateRole, const std::shared_ptr<const CBlock> &, const CBlockIndex *pindex);
    


    ryanofsky commented at 11:05 pm on August 28, 2023:

    In commit “validation: pass ChainstateRole for validationinterface calls” (2873d91ada1c240de280a35bbf169ed0c708e057)

    Ignore this comment if this was intentional, but it seems unusual to make the ChainstateRole parameter const, when the codebase doesn’t use const value parameters in general or in this interface for other parameters like MemPoolRemovalReason. IMO the const just adds noise without being more safe, and makes the method less convenient to implement. But not a problem if this is just a style difference and was intended.


    jamesob commented at 7:35 pm on August 29, 2023:
    Fixed. Just dumb muscle memory on my part.
  111. in doc/release-notes-27596.md:1 in b48729144f outdated
    0@@ -0,0 +1,7 @@
    1+ZMQ
    


    ryanofsky commented at 11:13 pm on August 28, 2023:

    In commit “validationinterface: only send zmq notifications for active” (b48729144fba1dbc64d603b2644a76c3f00d44eb)

    I’m surprised to see this release note. It seems like no observable behavior change could be happening with this PR because it wasn’t possible to load a snapshot previously. If this is the case, I think putting this information in release notes could only be confusing and it would be better to document this behavior in doc/zmq.md instead


    jamesob commented at 7:44 pm on August 29, 2023:
    Done.
  112. in src/zmq/zmqnotificationinterface.cpp:148 in b48729144f outdated
    143@@ -144,7 +144,8 @@ void TryForEachAndRemoveFailed(std::list<std::unique_ptr<CZMQAbstractNotifier>>&
    144 
    145 void CZMQNotificationInterface::UpdatedBlockTip(const CBlockIndex *pindexNew, const CBlockIndex *pindexFork, bool fInitialDownload)
    146 {
    147-    if (fInitialDownload || pindexNew == pindexFork) // In IBD or blocks were disconnected without any new ones
    148+    // In IBD or blocks were disconnected without any new ones
    149+    if (fInitialDownload || pindexNew == pindexFork)
    


    ryanofsky commented at 11:22 pm on August 28, 2023:

    In commit “validationinterface: only send zmq notifications for active” (b48729144fba1dbc64d603b2644a76c3f00d44eb)

    This seems like no-op change that could be reverted. Or if this is supposed to be a style cleanup, would probably make sense to add {} and modernize it since the line is changing anyway.


    jamesob commented at 7:44 pm on August 29, 2023:
    Done.
  113. in src/validation.h:913 in 688b68f34f outdated
    909@@ -909,6 +910,8 @@ class ChainstateManager
    910 
    911     explicit ChainstateManager(const util::SignalInterrupt& interrupt, Options options, node::BlockManager::Options blockman_options);
    912 
    913+    std::vector<BaseIndex*> indexers{};
    


    ryanofsky commented at 11:35 pm on August 28, 2023:

    In commit “index: cache indexer references on ChainstateManager” (688b68f34fe061c3064a516cdf77d4c155bfcafd)

    Haven’t reviewed later commits yet, but it seems not ideal for ChainstateManager to be aware of indexes. I think it would might be better to add a new ValidationInterface method that indexes (and wallets) could handle to start resyncing themselves whenever the background download finishes.

    Minor: Since this is a member, should be prefixed with m_. Also would be good to s/indexer/index/ everywhere for consistency with `NodeContext::indexes and other index variables. (In general I think “index” sounds better than “indexer” and “indexing” sounds better than “indexation” to stick to more common terms.)


    jamesob commented at 7:59 pm on August 29, 2023:
    This is resolved after removing ChainstateManager.indexers.
  114. in src/init.cpp:1486 in 5d01151b41 outdated
    1481+
    1482+            // Drain the validation interface queue to ensure that the old indexers
    1483+            // don't have any pending work.
    1484+            SyncWithValidationInterfaceQueue();
    1485+
    1486+            for (auto* indexer : chainman.indexers) {
    


    ryanofsky commented at 0:04 am on August 29, 2023:

    In commit “net_processing: validationinterface: ignore some events for bg chain” (5d096add65e84530c907b5ee934af144067d5ef1)

    This could probably loop over node.indexes instead of chainman.indexers and then the chainman.indexers variable could be eliminated.


    jamesob commented at 7:55 pm on August 29, 2023:
    Good point! Fixed.
  115. in src/init.cpp:1480 in 5d01151b41 outdated
    1472@@ -1473,6 +1473,26 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1473         node.chainman = std::make_unique<ChainstateManager>(node.kernel->interrupt, chainman_opts, blockman_opts);
    1474         ChainstateManager& chainman = *node.chainman;
    1475 
    1476+        // This is defined and set here instead of inline in validation.h to avoid a hard
    1477+        // dependency between validation and index/base, since the latter is not in
    1478+        // libbitcoinkernel.
    1479+        chainman.restart_indexers = [](const ChainstateManager& chainman) {
    1480+            LogPrintf("[snapshot] restarting indexers\n");
    


    ryanofsky commented at 0:05 am on August 29, 2023:

    In commit “net_processing: validationinterface: ignore some events for bg chain” (5d096add65e84530c907b5ee934af144067d5ef1)

    Would be nice to s/indexers/indexes/, s/indexer/s/index/ /s/indexation/indexing/ everywhere.


    jamesob commented at 7:55 pm on August 29, 2023:
    Done.
  116. in src/validation.h:916 in 5d01151b41 outdated
    910@@ -911,6 +911,10 @@ class ChainstateManager
    911     explicit ChainstateManager(const util::SignalInterrupt& interrupt, Options options, node::BlockManager::Options blockman_options);
    912 
    913     std::vector<BaseIndex*> indexers{};
    914+    //! Function to restart active indexers; set dynamically to avoid a circular
    915+    //! dependency on `base/index.cpp`.
    916+    std::function<void(const ChainstateManager&)> restart_indexers =
    


    ryanofsky commented at 0:13 am on August 29, 2023:

    In commit “net_processing: validationinterface: ignore some events for bg chain” (5d096add65e84530c907b5ee934af144067d5ef1)

    Would suggest simplifying function type to std::function<void()> because handler should be able to determine what state is needs to run and capture it explicitly. Also the reference might be unnecessary if the m_indexers variable is removed as suggested.


    jamesob commented at 8:00 pm on August 29, 2023:
    Done.
  117. in src/validation.cpp:5854 in 5d01151b41 outdated
    5847@@ -5838,3 +5848,9 @@ bool ChainstateManager::ValidatedSnapshotCleanup()
    5848     }
    5849     return true;
    5850 }
    5851+
    5852+Chainstate& ChainstateManager::GetChainstateForIndexing()
    5853+{
    5854+    return (IsSnapshotActive() && !IsSnapshotValidated()) ?
    


    ryanofsky commented at 0:16 am on August 29, 2023:

    In commit “net_processing: validationinterface: ignore some events for bg chain” (5d096add65e84530c907b5ee934af144067d5ef1)

    I’m surprised this method can’t just return m_ibd_chainstate.get(). It would be good to simplify this code if possible or else add some explanation for the logic, which seems pretty complicated.


    jamesob commented at 7:53 pm on August 29, 2023:
    We can’t just return m_ibd_chainstate because we want the indexes to switch to the active (snapshot) chainstate once the background validation completes (but before a restart has happened). I’ll make a note.
  118. in src/validation.cpp:5876 in b9ac4f7925 outdated
    5871+        return {0, 0};
    5872+    }
    5873+    unsigned int prune_start{0};
    5874+
    5875+    if (this->GetAll().size() > 1) {
    5876+        std::optional<int> snapshot_height{GetSnapshotBaseHeight()};
    


    ryanofsky commented at 0:25 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    I think it would be good assert(snapshot_height) or else add a comment saying when snapshot_height is expected to be nullopt, rather than write code that seems like it is trying to handle an impossible condition.


    jamesob commented at 8:09 pm on August 29, 2023:
    Good point, fixed.
  119. in src/validation.cpp:5875 in b9ac4f7925 outdated
    5870+    if (chainstate.m_chain.Height() <= 0) {
    5871+        return {0, 0};
    5872+    }
    5873+    unsigned int prune_start{0};
    5874+
    5875+    if (this->GetAll().size() > 1) {
    


    ryanofsky commented at 0:26 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    Seems like the code would be a little simpler if this said if (this->GetAll().size() > 1 && &chainstate == m_snapshot_chainstate.get()) so relevant conditions would be stated up front, and nested if statement below could go away


    jamesob commented at 8:27 pm on August 29, 2023:
    Fixed.
  120. in src/validation.h:1251 in b9ac4f7925 outdated
    1243@@ -1248,6 +1244,17 @@ class ChainstateManager
    1244     //!   fully validated chain.
    1245     Chainstate& GetChainstateForIndexing() EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    1246 
    1247+    //! Return the [start, end] (inclusive) of block heights we can prune.
    1248+    //!
    1249+    //! If we're pruning the snapshot chainstate, be sure not to
    1250+    //! prune blocks being used by the background chainstate.
    1251+    std::pair<unsigned int, unsigned int> GetPruneRange(
    


    ryanofsky commented at 0:42 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    Would definitely suggest using plain int instead of unsigned int for heights everywhere to avoid overflow bugs and extra type casts. The rest of the code represents block heights with ints, and unsigned types are great for representing bits but less great at representing quantities you would like to perform arithmetic on.


    jamesob commented at 8:40 pm on August 29, 2023:
    Done.
  121. in src/validation.cpp:5991 in b9ac4f7925 outdated
    5862@@ -5854,3 +5863,40 @@ Chainstate& ChainstateManager::GetChainstateForIndexing()
    5863     return (IsSnapshotActive() && !IsSnapshotValidated()) ?
    5864         *m_ibd_chainstate : *m_active_chainstate;
    5865 }
    5866+
    5867+std::pair<unsigned int, unsigned int> ChainstateManager::GetPruneRange(
    5868+    const Chainstate& chainstate, unsigned int prune_height)
    5869+{
    5870+    if (chainstate.m_chain.Height() <= 0) {
    5871+        return {0, 0};
    


    ryanofsky commented at 0:48 am on August 29, 2023:

    In commit “net_processing: validationinterface: ignore some events for bg chain” (5d096add65e84530c907b5ee934af144067d5ef1)

    Probably doesn’t matter in practice, but according to the function documentation, this would indicate the pruning range includes the genesis block. Would suggest returning {1, 0} here and also initializing prune_start to 1 instead of 0 below.


    jamesob commented at 8:05 pm on August 29, 2023:

    Fixed.

    Edit: Initializing prune_start to 1 prevents the pruning of the blockfile with block height=0 in it, which causes the feature_pruning.py functional test to fail, so I’m keeping that part as-is.

  122. in src/validation.cpp:2481 in b9ac4f7925 outdated
    2474@@ -2474,11 +2475,19 @@ bool Chainstate::FlushStateToDisk(
    2475             if (nManualPruneHeight > 0) {
    2476                 LOG_TIME_MILLIS_WITH_CATEGORY("find files to prune (manual)", BCLog::BENCH);
    2477 
    2478-                m_blockman.FindFilesToPruneManual(setFilesToPrune, std::min(last_prune, nManualPruneHeight), m_chain.Height());
    2479+                m_blockman.FindFilesToPruneManual(
    2480+                    setFilesToPrune,
    2481+                    std::min(last_prune, nManualPruneHeight),
    2482+                    m_chain.Height(),
    


    ryanofsky commented at 1:16 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    No reason to pass m_chain.Height() here anymore now this is passing *this chainstate reference. Having the redundant parameter makes code in the function more complicated and hard to follow because there are multiple variables referring to the same height.


    jamesob commented at 8:27 pm on August 29, 2023:
    Fixed.
  123. in src/node/blockstorage.h:135 in b9ac4f7925 outdated
    127@@ -122,7 +128,13 @@ class BlockManager
    128      *
    129      * @param[out]   setFilesToPrune   The set of file indices that can be unlinked will be returned
    130      */
    131-    void FindFilesToPrune(std::set<int>& setFilesToPrune, uint64_t nPruneAfterHeight, int chain_tip_height, int prune_height, bool is_ibd);
    132+    void FindFilesToPrune(
    133+        std::set<int>& setFilesToPrune,
    134+        uint64_t nPruneAfterHeight,
    135+        int chain_tip_height,
    136+        int prune_height,
    


    ryanofsky commented at 1:31 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    Same as earlier comment, but again it is confusing to keep this chain_tip_height parameter which is redundant with the chain parameter and makes the implementation more complicated because there are multiple variables referring to the same height.

    Also would suggest calling the prune_height parameter last_prune to be be more descriptive, and consistent with the argument that is passed in validation.cpp. Also could add documentation for the parameter as “the last height we can prune” or something similar.


    jamesob commented at 8:25 pm on August 29, 2023:
    Fixed.
  124. in src/validation.h:1239 in b9ac4f7925 outdated
    1243@@ -1248,6 +1244,17 @@ class ChainstateManager
    1244     //!   fully validated chain.
    1245     Chainstate& GetChainstateForIndexing() EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    1246 
    1247+    //! Return the [start, end] (inclusive) of block heights we can prune.
    


    ryanofsky commented at 1:40 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    Would be good to say that start > end is possible and indicates prune range is empty.


    jamesob commented at 8:40 pm on August 29, 2023:
    Done.
  125. in src/node/blockstorage.cpp:167 in b9ac4f7925 outdated
    164+    unsigned int nLastBlockWeCanPrune;
    165+
    166+    std::tie(min_block_to_prune, nLastBlockWeCanPrune) = chainman.GetPruneRange(
    167+        chain, chain_tip_height);
    168+
    169+    nLastBlockWeCanPrune = std::min(nLastBlockWeCanPrune, (unsigned)nManualPruneHeight);
    


    ryanofsky commented at 2:07 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    Code would be simpler if this line were dropped and nManualPruneHeight was passed as the parameter to GetPruneRange above instead of chain_tip_height. Passing chain_tip_height above has no effect because GetPruneRange already takes the min with the chain.m_chain.Height().


    jamesob commented at 8:20 pm on August 29, 2023:
    Done.
  126. in src/validation.cpp:5899 in b9ac4f7925 outdated
    5894+        // While you might be tempted to prune the background chainstate more
    5895+        // aggressively (i.e. fewer MIN_BLOCKS_TO_KEEP), this won't work with index
    5896+        // building - specifically blockfilterindex requires undo data, and if
    5897+        // we don't maintain this trailing window, we hit indexation failures.
    5898+        prune_end = std::min(prune_height, chain_height - MIN_BLOCKS_TO_KEEP);
    5899+    }
    


    ryanofsky commented at 2:29 am on August 29, 2023:

    In commit “validation: pruning for multiple chainstates” (b9ac4f7925e2d9dec67be06fdc2a381532581136)

    Would suggest compressing this all down to:

    0// Set the end of the prune range to the maximum height which is at least
    1// MIN_BLOCKS_TO_KEEP blocks away from the chain tip, and is not higher than the
    2// highest block that is allowed to be pruned according to the caller.
    3prune_end = std::min(std::max<int>(0, chain_height - MIN_BLOCKS_TO_KEEP), last_prune);
    

    Assuming prune_height is renamed back to last_prune as suggested earlier and made into an int.


    jamesob commented at 8:40 pm on August 29, 2023:
    Done.
  127. ryanofsky commented at 2:48 am on August 29, 2023: contributor

    This all looks very good and I feel like I got through the most complicated changes, but I’m still working through the rest. Most of my comments are not too important but I do think the indexing and pruning changes can be simplified a bit.

    Progress so far:

    • 08a34e56ba7aa8714982d352238f1cb6eba43e29 net_processing: Request assumeutxo background chain blocks (1/24)
    • 37af5f47541e77753d20bfbb7c105b55c0caf305 bugfix: correct is_snapshot_cs in VerifyDB (2/24)
    • 1755c53365d53693213a721138a44bdd889d8794 assumeutxo: remove snapshot during -reindex{-chainstate} (3/24)
    • 3d103bbf266bf3c8f66b3aff2237cc0a95f96a75 chainparams: add blockhash to AssumeutxoData (4/24)
    • 3e7d3537fc06c8aff8a48edc8f179a70af46e04c validation: MaybeRebalanceCaches when chain leaves IBD (5/24)
    • 95f45e38b5e47eaef56468909d04418b93e9c962 validation: add ChainstateRole (6/24)
    • 9eda8224119450b8614837e0c8decfc81a6b9915 validation: only call UpdatedBlockTip for active chainstate (7/24)
    • 2873d91ada1c240de280a35bbf169ed0c708e057 validation: pass ChainstateRole for validationinterface calls (8/24)
    • b48729144fba1dbc64d603b2644a76c3f00d44eb validationinterface: only send zmq notifications for active (9/24)
    • ba2f21a9ef88ce9631a2143f16569ee231953a91 wallet: validationinterface: only handle active chain notifications (10/24)
    • 5d096add65e84530c907b5ee934af144067d5ef1 net_processing: validationinterface: ignore some events for bg chain (11/24)
    • 688b68f34fe061c3064a516cdf77d4c155bfcafd index: cache indexer references on ChainstateManager (12/24)
    • 5d01151b41b244dabeda54122400df17a3dfc4d2 validation: indexing changes for assumeutxo (13/24)
    • b9ac4f7925e2d9dec67be06fdc2a381532581136 validation: pruning for multiple chainstates (14/24)
    • abaf61487bd5eb2c33ed62dae602b5f479f757f9 test: adjust chainstate tests to use recognized snapshot base (15/24)
    • a2a49ea7f1ed2931ab388bac677196dedc10714e validation: populate nChainTx value for assumedvalid chainstates (16/24)
    • 07517a2311c410dd107c4feb4e113bb7c629da8e blockstorage: segment normal/assumedvalid blockfiles (17/24)
    • 0e507021932f72aa7345bf84b805c7cda68eea65 validation: assumeutxo: swap m_mempool on snapshot activation (18/24)
    • 03de0d5ac9457b89e577a11e146cc394a8494b7b rpc: add loadtxoutset (19/24)
    • 05aba13ce27bbe7346ffb72a619638b6f84d3679 rpc: add getchainstates (20/24)
    • 60d6f6bbe91879d542439bec05ef406eb9954eb8 test: add feature_assumeutxo functional test (21/24)
    • 5394f2e9b7fea36d3cba43af82143f5c72c08ae5 contrib: add script to demo/test assumeutxo (22/24)
    • a3f96fbee900fdf23a160a461164a65a0c0cfe84 doc: update release-process.md for assumeutxo (23/24)
    • e44815e9a754c967415402b34807633625393084 chainparams: add assumeutxo param at height 788_000 (24/24)
  128. naumenkogs commented at 11:02 am on August 29, 2023: member

    Did code review at e44815e9a754c967415402b34807633625393084. Most of the @ryanofsky I think are worth addressing. Once these are addressed or resolved I will do my own clean path.

    I also spotted a couple of typos along the way.

  129. jamesob force-pushed on Aug 29, 2023
  130. jamesob commented at 9:44 pm on August 29, 2023: member

    Okay, all feedback addressed - some good simplifications there. Thanks for the comprehensive review, @ryanofsky @naumenkogs.

    You can check the diff with

    0git range-diff upstream/master jamesob/au.031 jamesob/au.033
    
  131. DrahtBot added the label CI failed on Aug 29, 2023
  132. jamesob force-pushed on Aug 30, 2023
  133. jamesob commented at 2:36 pm on August 30, 2023: member

    Had to push a small commit to fix underflow in pruning calculations.

    0git range-diff upstream/master jamesob/au.033 jamesob/au.034
    
  134. in src/validation.cpp:5890 in 5e4c824433 outdated
    5885+        prune_start = *Assert(GetSnapshotBaseHeight()) + 1;
    5886+    }
    5887+
    5888+    int max_prune = chainstate.m_chain.Height() - static_cast<int>(MIN_BLOCKS_TO_KEEP);
    5889+
    5890+    // last block to prune is the lesser of (user-specified height, MIN_BLOCKS_TO_KEEP from the tip)
    


    ryanofsky commented at 2:51 pm on August 30, 2023:

    In commit “validation: pruning for multiple chainstates” (5e4c824433fb4bf8b380d0f61a0aa424dc5456d4)

    Would s/user-specified/caller-specified/. In the manual pruning case it is actually user-specified, but in the automatic case it depends on prune locks.


    jamesob commented at 1:37 pm on August 31, 2023:
    Done.
  135. in src/node/blockstorage.cpp:270 in 5e4c824433 outdated
    158+    if (chain.m_chain.Height() < 0) {
    159         return;
    160     }
    161 
    162-    // last block to prune is the lesser of (user-specified height, MIN_BLOCKS_TO_KEEP from the tip)
    163-    unsigned int nLastBlockWeCanPrune = std::min((unsigned)nManualPruneHeight, chain_tip_height - MIN_BLOCKS_TO_KEEP);
    


    ryanofsky commented at 2:55 pm on August 30, 2023:

    In commit “validation: pruning for multiple chainstates” (5e4c824433fb4bf8b380d0f61a0aa424dc5456d4)

    It looks like there was an overflow bug here previously which is fixed now? If chain_tip_height was less than MIN_BLOCKS_TO_KEEP, that would overflow and produce a large unsigned integer, so it would have been possible to manually prune chain data very close to the tip which the chain height was low previously, IIUC


    jamesob commented at 1:34 pm on August 31, 2023:
    Yep, looks so to me. Although MIN_BLOCKS_TO_KEEP is 288, so probably this would never be able to happen because of how quickly that part of the chain connects.
  136. in src/node/blockstorage.cpp:193 in 5e4c824433 outdated
    197+    int height = chain.m_chain.Height();
    198+    if (height < 0 || target == 0) {
    199         return;
    200     }
    201-    if ((uint64_t)chain_tip_height <= nPruneAfterHeight) {
    202+    if ((uint64_t)height <= nPruneAfterHeight) {
    


    ryanofsky commented at 3:03 pm on August 30, 2023:

    In commit “validation: pruning for multiple chainstates” (5e4c824433fb4bf8b380d0f61a0aa424dc5456d4)

    I think it would simplify the code more to drop the nPruneAfterHeight parameter and replace it directly with _chainman.GetParams().PruneAfterHeight() to make it obvious this is a hardcoded parameter.

    Also maybe replace the C cast with a static cast if this line is changing anyway.


    jamesob commented at 1:36 pm on August 31, 2023:
    Yep, good point. Done.
  137. DrahtBot removed the label CI failed on Aug 30, 2023
  138. in src/node/blockstorage.cpp:189 in 5e4c824433 outdated
    192-    if (chain_tip_height < 0 || GetPruneTarget() == 0) {
    193+    // Distribute our -prune budget over all chainstates.
    194+    const auto target = std::max(
    195+        MIN_DISK_SPACE_FOR_BLOCK_FILES, GetPruneTarget() / chainman.GetAll().size());
    196+
    197+    int height = chain.m_chain.Height();
    


    ryanofsky commented at 3:10 pm on August 30, 2023:

    In commit “validation: pruning for multiple chainstates” (5e4c824433fb4bf8b380d0f61a0aa424dc5456d4)

    Suggest keeping the name “chain_tip_height” instead of “height” for this variable. There are a lot of heights in this function so easy to forget what the the generic “height” is.


    jamesob commented at 1:35 pm on August 31, 2023:
    Done.
  139. in src/validation.cpp:5895 in 5e4c824433 outdated
    5890+    // last block to prune is the lesser of (user-specified height, MIN_BLOCKS_TO_KEEP from the tip)
    5891+    //
    5892+    // While you might be tempted to prune the background chainstate more
    5893+    // aggressively (i.e. fewer MIN_BLOCKS_TO_KEEP), this won't work with index
    5894+    // building - specifically blockfilterindex requires undo data, and if
    5895+    // we don't maintain this trailing window, we hit indexation failures.
    


    ryanofsky commented at 3:12 pm on August 30, 2023:

    In commit “validation: pruning for multiple chainstates” (5e4c824433fb4bf8b380d0f61a0aa424dc5456d4)

    Last remaining s/indexation/indexing/


    jamesob commented at 1:37 pm on August 31, 2023:
    Done.
  140. in src/test/validation_chainstatemanager_tests.cpp:88 in 76d616c957 outdated
    88+    }
    89     BlockValidationState _;
    90     BOOST_CHECK(c2.ActivateBestChain(_, nullptr));
    91 
    92+    BOOST_CHECK_EQUAL(manager.SnapshotBlockhash().value(), snapshot_blockhash);
    93+    BOOST_CHECK(c2.ActivateBestChain(_));
    


    ryanofsky commented at 9:55 pm on August 30, 2023:

    In commit “test: adjust chainstate tests to use recognized snapshot base” (76d616c957522e7be52eaf10b3ac33accaf75c45)

    Seems pointless to call ActivateBestChain twice in a row here. Would be good to remove this line and simplify test setup or add a comment explaining why this is necessary.


    jamesob commented at 1:41 pm on August 31, 2023:
    Done.
  141. in src/test/validation_chainstatemanager_tests.cpp:79 in 76d616c957 outdated
    79-    WITH_LOCK(::cs_main, c2.InitCoinsCache(1 << 23));
    80-    c2.m_chain.SetTip(*active_tip);
    81+    {
    82+        LOCK(::cs_main);
    83+        c2.InitCoinsCache(1 << 23);
    84+        c2.m_chain.SetTip(*active_tip);
    


    ryanofsky commented at 10:01 pm on August 30, 2023:

    In commit “test: adjust chainstate tests to use recognized snapshot base” (76d616c957522e7be52eaf10b3ac33accaf75c45)

    This SetTip call seems pointless now because the LoadChainTip call below will overwrite the chain tip. Would suggest removing this line or adding more explanation in a comment.


    jamesob commented at 1:40 pm on August 31, 2023:
    Good point, done.
  142. in src/test/validation_chainstatemanager_tests.cpp:106 in 76d616c957 outdated
    104 
    105     auto active_tip2 = WITH_LOCK(manager.GetMutex(), return manager.ActiveTip());
    106     auto exp_tip2 = c2.m_chain.Tip();
    107     BOOST_CHECK_EQUAL(active_tip2, exp_tip2);
    108 
    109-    BOOST_CHECK_EQUAL(exp_tip, exp_tip2);
    


    ryanofsky commented at 10:14 pm on August 30, 2023:

    In commit “test: adjust chainstate tests to use recognized snapshot base” (76d616c957522e7be52eaf10b3ac33accaf75c45)

    I think it would be good to clarify this code and verify both tips are set correctly at the end of the test.

    Would suggest:

    0-    auto exp_tip2 = c2.m_chain.Tip();
    1-    BOOST_CHECK_EQUAL(active_tip2, exp_tip2);
    2+    BOOST_CHECK_EQUAL(active_tip, active_tip2->pprev);
    3+    BOOST_CHECK_EQUAL(active_tip, c1.m_chain.Tip());
    4+    BOOST_CHECK_EQUAL(active_tip2, c2.m_chain.Tip());
    

    jamesob commented at 1:42 pm on August 31, 2023:
    Makes sense, done.
  143. in src/test/validation_chainstatemanager_tests.cpp:493 in 76d616c957 outdated
    489@@ -488,13 +490,13 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
    490 
    491     // Note: cs2's tip is not set when ActivateExistingSnapshot is called.
    492     Chainstate& cs2 = WITH_LOCK(::cs_main,
    493-        return chainman.ActivateExistingSnapshot(&mempool, *assumed_base->phashBlock));
    494+        return chainman.ActivateExistingSnapshot(&mempool, *assumed_tip->phashBlock));
    


    ryanofsky commented at 10:33 pm on August 30, 2023:

    In commit “test: adjust chainstate tests to use recognized snapshot base” (76d616c957522e7be52eaf10b3ac33accaf75c45)

    What is going on here and in on line 499 below? It seems strange to advance the snapshot chain all the way to the chain tip if the point of the test below is to make sure setBlockIndexCandidates is initialized correctly after loading the snapshot, and setBlockIndexCandidates is supposed to be use to work towards the chain tip.

    Also changing assumed_base to assumed_tip here and below seems contrary to the comments “Set tip of the assume-valid-based chain to the assume-valid block”

    If I revert the changes this line and on line 499 the test still passes as of b6b62a96bd6d1eccaf6bd9eadfa1c60232aadeed, so I think they may be unnecessary


    jamesob commented at 1:47 pm on August 31, 2023:
    Good catch. I think I was grasping at straws trying to get the tests to pass having changed heights to match up with the recognized chainparams snapshots. Not sure what I was thinking.
  144. in src/node/blockstorage.cpp:293 in 9df91113b8 outdated
    289@@ -290,7 +290,13 @@ bool BlockManager::LoadBlockIndex()
    290         // Pruned nodes may have deleted the block.
    291         if (pindex->nTx > 0) {
    292             if (pindex->pprev) {
    293-                if (pindex->pprev->nChainTx > 0) {
    294+                if (snapshot_blockhash && pindex->GetBlockHash() == *snapshot_blockhash) {
    


    ryanofsky commented at 11:38 pm on August 30, 2023:

    In commit “validation: populate nChainTx value for assumedvalid chainstates” (9df91113b8bbab8c98924bef61da388a8141b4e6)

    It seems like this might be broken if a user loaded a snapshot and then quit before the snapshot block was successfully downloaded. In this case I think pindex->nTx for the snapshot block on line 291 would be 0 so the new code would not run.

    A good way to fix this might be to add something like if (snapshot_blockhash) LookupBlockIndex(hash)->nChainTx = AssumeutxoForBlockhash(*snapshot_blockhash) right after the LoadBlockIndexGuts call, before the for-loop.

    The other advantage setting nChainTx before the for-loop would be that the for-loop can remain simpler and unchanged, and also be more efficient because it will no longer be need to compare pindex->GetBlockHash() == *snapshot_blockhash) for each block.


    jamesob commented at 2:10 pm on August 31, 2023:
    Oh wow, that’s a great simplification - thanks. Fixed.
  145. in src/validation.cpp:4843 in 9df91113b8 outdated
    4764@@ -4765,6 +4765,10 @@ void ChainstateManager::CheckBlockIndex()
    4765     CBlockIndex* pindexFirstAssumeValid = nullptr; // Oldest ancestor of pindex which has BLOCK_ASSUMED_VALID
    4766     while (pindex != nullptr) {
    4767         nNodes++;
    4768+        if (pindex->pprev && pindex->nTx > 0) {
    4769+            // nChainTx should increase monotonically
    4770+            assert(pindex->pprev->nChainTx <= pindex->nChainTx);
    


    ryanofsky commented at 11:44 pm on August 30, 2023:

    In commit “validation: populate nChainTx value for assumedvalid chainstates” (9df91113b8bbab8c98924bef61da388a8141b4e6)

    Could you assert not that just that it increases monotonically, but that it increases by the number of transactions?

    0 assert(pindex->nChainTx == pindex->nTx + pindex->pprev->nChainTx);
    

    Or maybe add a comment if there’s a special case where this does not hold.


    jamesob commented at 2:17 pm on August 31, 2023:

    Getting test failures with this change, not certain why:

     010:17:24 james@fido src/bitcoin ( 11555828ee 1155582) % ./src/test/test_bitcoin
     1Running 578 test cases...
     2test_bitcoin: validation.cpp:4767: void ChainstateManager::CheckBlockIndex(): Assertion `pindex->nChainTx == pindex->nTx + pindex->pprev->nChainTx' failed.
     3unknown location(0): fatal error: in "validation_block_tests/processnewblock_signals_ordering": signal: SIGABRT (application abort requested)
     4test/validation_block_tests.cpp(160): last checkpoint
     5unknown location(0): fatal error: in "validation_block_tests/mempool_locks_reorg": memory access violation at address: 0x5824eadac8d1: no mapping at fault address
     6test/validation_block_tests.cpp(224): last checkpoint: "mempool_locks_reorg" fixture ctor
     7
     8*** 2 failures are detected in the test module "Bitcoin Core Test Suite"
     9test_bitcoin: checkqueue.h:198: CCheckQueue<CScriptCheck>::~CCheckQueue() [T = CScriptCheck]: Assertion `m_worker_threads.empty()' failed.
    10zsh: IOT instruction (core dumped)  ./src/test/test_bitcoin
    11./src/test/test_bitcoin  18.78s user 2.94s system 118% cpu 18.291 total
    

    Maybe worth doing this improvement as a follow-up.


    naumenkogs commented at 8:52 am on September 4, 2023:
    Well in theory that code is certainly prone to overflows, but I’m not sure whether that’s the cause of a failure :)

    ryanofsky commented at 3:32 pm on September 12, 2023:

    re: #27596 (review)

    In commit “validation: populate nChainTx value for assumedvalid chainstates” (e3cae85738c8bf4116547af8d868c6c01577b80f)

    It seems like one thing that was causing this assert to fail is that the hardcoded regtest nChainTx value is wrong. If I change the snapshot nChainTx value from 110 to 111 tests are able to pass.

    Another problem is that some tests avoid storing blocks, so leave the nChainTx value unset. If I update the assert to allow nChainTx to still be unset, all tests seem to pass.

    So I think the following changes would be make the check stronger and clarify the code:

     0--- a/src/kernel/chainparams.cpp
     1+++ b/src/kernel/chainparams.cpp
     2@@ -493,7 +493,7 @@ public:
     3             {
     4                 .height = 110,
     5                 .hash_serialized = AssumeutxoHash{uint256S("0x1ebbf5850204c0bdb15bf030f47c7fe91d45c44c712697e4509ba67adb01c618")},
     6-                .nChainTx = 110,
     7+                .nChainTx = 111,
     8                 .blockhash = uint256S("0x696e92821f65549c7ee134edceeeeaaa4105647a3c4fd9f298c0aec0ab50425c")
     9             },
    10             {
    11--- a/src/test/validation_tests.cpp
    12+++ b/src/test/validation_tests.cpp
    13@@ -138,11 +138,11 @@ BOOST_AUTO_TEST_CASE(test_assumeutxo)
    14 
    15     const auto out110 = *params->AssumeutxoForHeight(110);
    16     BOOST_CHECK_EQUAL(out110.hash_serialized.ToString(), "1ebbf5850204c0bdb15bf030f47c7fe91d45c44c712697e4509ba67adb01c618");
    17-    BOOST_CHECK_EQUAL(out110.nChainTx, 110U);
    18+    BOOST_CHECK_EQUAL(out110.nChainTx, 111U);
    19 
    20     const auto out110_2 = *params->AssumeutxoForBlockhash(uint256S("0x696e92821f65549c7ee134edceeeeaaa4105647a3c4fd9f298c0aec0ab50425c"));
    21     BOOST_CHECK_EQUAL(out110_2.hash_serialized.ToString(), "1ebbf5850204c0bdb15bf030f47c7fe91d45c44c712697e4509ba67adb01c618");
    22-    BOOST_CHECK_EQUAL(out110_2.nChainTx, 110U);
    23+    BOOST_CHECK_EQUAL(out110_2.nChainTx, 111U);
    24 }
    25 
    26 BOOST_AUTO_TEST_SUITE_END()
    27--- a/src/validation.cpp
    28+++ b/src/validation.cpp
    29@@ -4762,10 +4762,15 @@ void ChainstateManager::CheckBlockIndex()
    30     CBlockIndex* pindexFirstAssumeValid = nullptr; // Oldest ancestor of pindex which has BLOCK_ASSUMED_VALID
    31     while (pindex != nullptr) {
    32         nNodes++;
    33-        if (pindex->pprev && pindex->nTx > 0) {
    34-            // nChainTx should increase monotonically
    35-            assert(pindex->pprev->nChainTx <= pindex->nChainTx);
    36-        }
    37+
    38+        // Make sure nChainTx sum is correctly computed.
    39+        unsigned int prev_chain_tx = pindex->pprev ? pindex->pprev->nChainTx : 0;
    40+        assert((pindex->nChainTx == pindex->nTx + prev_chain_tx)
    41+               // For testing, allow transaction counts to be completely unset.
    42+               || (pindex->nChainTx == 0 && pindex->nTx == 0)
    43+               // For testing, allow this nChainTx to be unset if previous is also unset.
    44+               || (pindex->nChainTx == 0 && prev_chain_tx == 0 && pindex->pprev));
    45+
    46         if (pindexFirstAssumeValid == nullptr && pindex->nStatus & BLOCK_ASSUMED_VALID) pindexFirstAssumeValid = pindex;
    47         if (pindexFirstInvalid == nullptr && pindex->nStatus & BLOCK_FAILED_VALID) pindexFirstInvalid = pindex;
    48         if (pindexFirstMissing == nullptr && !(pindex->nStatus & BLOCK_HAVE_DATA)) {
    

    fjahr commented at 4:23 pm on October 6, 2023:
    This is addressed in #28562
  146. in src/node/blockstorage.cpp:759 in 1c668b690e outdated
    752@@ -680,13 +753,14 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
    753         pos.nPos = m_blockfile_info[nFile].nSize;
    754     }
    755 
    756-    if ((int)nFile != m_last_blockfile) {
    757+    if (nFile != last_blockfile) {
    758         if (!fKnown) {
    759-            LogPrint(BCLog::BLOCKSTORAGE, "Leaving block file %i: %s\n", m_last_blockfile, m_blockfile_info[m_last_blockfile].ToString());
    760+            LogPrint(BCLog::BLOCKSTORAGE, "Leaving block file %i: %s (onto %i) (height %i)\n",
    761+                m_last_blockfile, m_blockfile_info[last_blockfile].ToString(), nFile, nHeight);
    


    ryanofsky commented at 0:12 am on August 31, 2023:

    In commit “validation: assumeutxo: swap m_mempool on snapshot activation” (f2dd7a0c1b67d3e4a6e086a619eff34fe6625263)

    This is still referencing m_last_blockfile instead of last_blockfile, which does not make sense. I think it would be a good idea to remove the m_last_blockfile member because this appears to be the only remaining usage.


    jamesob commented at 5:45 pm on August 31, 2023:
    Oops, done.
  147. in src/node/blockstorage.h:124 in 1c668b690e outdated
    87+    // Track the largest height block whose undo data is in `file_num` blockfile.
    88+    // When we move to the next blockfile, if the undo data is keeping up then
    89+    // we know we can trim the size of the file and truncate any excess
    90+    // allocations. If not, we assume that eventually we'll add more undo data
    91+    // to this file, and when we revisit it we'll trim it then.
    92+    int undo_height{0};
    


    ryanofsky commented at 0:27 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    This comment appears to be copied from an old version of PR #27746 (commit 6b5b53eafefa38250bb4e5d914e5ad5743416563) that was updated later. The final version of the #27746 had a more complete comment which can be found on line 178 below documenting the m_undo_height_in_last_blockfile member.

    I think to avoid confusion and bugs it’s pretty important to delete the m_last_blockfile and m_undo_height_in_last_blockfile members below fully replace them with new members in this struct and move the latest m_undo_height_in_last_blockfile comment over here. I also do think the old names last_blockfile and undo_height_in_last_blockfile are more descriptive than the new names file_num and undo_height, but I guess there was some reasoning for changing them, so not sure about that.


    jamesob commented at 5:47 pm on August 31, 2023:

    Oops, silent merge conflict. Fixed.

    m_last_blockfile just didn’t make much sense to me when you’re already in the context of a blockfile cursor; within that scope, file_num felt like it was a complete description. Same rationale for the other.

  148. in src/node/blockstorage.h:75 in 1c668b690e outdated
    70@@ -70,6 +71,30 @@ struct PruneLockInfo {
    71     int height_first{std::numeric_limits<int>::max()}; //! Height of earliest block that should be kept and not pruned
    72 };
    73 
    74+enum class BlockfileType {
    75+    // For the purposes of blockfile fragmentation, treat background IBD chainstates
    76+    // as normal.
    


    ryanofsky commented at 0:46 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    I think it would make sense to move this comment to the BlockfileTypeForRole function. The comment seems out of context here since the enum isn’t used yet here and the purpose isn’t described until the m_blockfile_cursors comment 100 lines below.


    jamesob commented at 5:46 pm on August 31, 2023:
    I’ve just removed the comment because I don’t think it adds much anymore.

    ryanofsky commented at 5:22 pm on September 12, 2023:

    re: #27596 (review)

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    I’ve just removed the comment because I don’t think it adds much anymore.

    Comment seems to be here still (and would be good to remove)


    jamesob commented at 4:52 pm on September 17, 2023:
    Oops! Fixed.
  149. in src/node/blockstorage.cpp:403 in 1c668b690e outdated
    398+            if (snapshot_base_height && fileinfo.nHeightLast > *snapshot_base_height) {
    399+                blockfile_type = BlockfileType::ASSUMED;
    400+            }
    401+            m_blockfile_cursors[blockfile_type] = {static_cast<int>(i), 0};
    402+            LogPrint(BCLog::BLOCKSTORAGE, "Set blockfile cursor %s to %i\n",
    403+                blockfile_type, *m_blockfile_cursors[blockfile_type]);
    


    ryanofsky commented at 1:00 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    This seems like it is going to create log spam if it is printing an intermediate cursor value for each block file on disk. I think it would make more sense just to print two final NORMAL and ASSUMED last file numbers after the loop, than all the intermediate last file numbers during the loop.


    jamesob commented at 2:50 pm on August 31, 2023:
    Removed.
  150. in src/node/blockstorage.cpp:759 in 1c668b690e outdated
    645+
    646+void BlockManager::FlushChainstateBlockFile(ChainstateRole chainstate_role, bool finalize, bool finalize_undo)
    647+{
    648+    LOCK(cs_LastBlockFile);
    649+    auto& cursor = m_blockfile_cursors[BlockfileTypeForRole(chainstate_role)];
    650+    if (cursor) {
    


    ryanofsky commented at 1:09 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    I was confused by this at first about why it was not an error for the cursor to be null here. Would be helpful to add comment saying it is safe to skip flushing if the cursor does not exist, because this only happens when a new snapshot has been loaded and and no blocks have been downloaded or written yet.


    jamesob commented at 3:12 pm on August 31, 2023:
    Thanks, added.

    ryanofsky commented at 4:57 pm on September 12, 2023:

    re: #27596 (review)

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    Would suggest following change to mention it should be safe not to flush and also move the comment to be clearer it is describing the if condition not the flush call:

     0--- a/src/node/blockstorage.cpp
     1+++ b/src/node/blockstorage.cpp
     2@@ -634,9 +634,10 @@ void BlockManager::FlushChainstateBlockFile(int tip_height, bool finalize, bool
     3 {
     4     LOCK(cs_LastBlockFile);
     5     auto& cursor = m_blockfile_cursors[BlockfileTypeForHeight(tip_height)];
     6+    // If the cursor does not exist, it means an assumeutxo snapshot is loaded,
     7+    // but no blocks past the snapshot height have been written yet, so there
     8+    // is no data associated with the chainstate, and it is safe not to flush.
     9     if (cursor) {
    10-        // The cursor may not exist after a snapshot has been loaded but before any
    11-        // blocks have been downloaded.
    12         return FlushBlockFile(cursor->file_num, finalize, finalize_undo);
    13     }
    14 }
    

    naumenkogs commented at 7:13 am on September 21, 2023:

    bd770ff59075fa2fd0441311ef1c02439ed68fdd

    I agree with this suggested change.

  151. in src/node/blockstorage.cpp:517 in 1c668b690e outdated
    407+        // last NORMAL blockfile.
    408+        if (snapshot_blockhash && !m_blockfile_cursors[BlockfileType::ASSUMED]) {
    409+            const auto newcursor = BlockfileCursor{this->MaxBlockfileNum() + 1, 0};
    410+            m_blockfile_cursors[BlockfileType::ASSUMED] = newcursor;
    411+            LogPrint(BCLog::BLOCKSTORAGE, "Initialized empty assumed blockfile cursor to %i\n", newcursor);
    412+        }
    


    ryanofsky commented at 1:13 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    It seems pointless and little wasteful to create this cursor and increment the max block number before any blocl data is downloaded. Wouldn’t it be simpler to just delete this code and let the code in BlockManager::FindBlockPos create the cursor when there is actually data ready to be written?


    jamesob commented at 7:16 pm on August 31, 2023:
    Done, thanks.
  152. in src/node/blockstorage.cpp:715 in 1c668b690e outdated
    712+        assert(chain_type == BlockfileType::ASSUMED);
    713+        const auto new_cursor = BlockfileCursor{this->MaxBlockfileNum() + 1, 0};
    714+        m_blockfile_cursors[chain_type] = new_cursor;
    715+        LogPrint(BCLog::BLOCKSTORAGE, "[%s] initializing blockfile cursor to %s\n", chainstate_role, new_cursor);
    716+    }
    717+    const unsigned int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
    


    ryanofsky commented at 1:27 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    nFile was already unsigned previously, but I think it’s a small regression to change last_blockfile variable from unsigned to signed and have this implicit cast. I think ideally all these numbers would just be signed and it would be better to move in that direction instead of the other way.


    jamesob commented at 5:43 pm on August 31, 2023:
    Okay, sounds good.
  153. in src/node/blockstorage.cpp:640 in 1c668b690e outdated
    638+}
    639+
    640+static BlockfileType BlockfileTypeForRole(ChainstateRole chainstate_role)
    641+{
    642+    return chainstate_role == ChainstateRole::ASSUMEDVALID ?
    643+        BlockfileType::ASSUMED : BlockfileType::NORMAL;
    


    ryanofsky commented at 1:50 am on August 31, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (1c668b690eeb4b98899be7e0ff33373b79932f2a)

    This seems like like as soon as the snapshot is validated, the new blocks from the snapshot chainstate will start being added to the last background block file, preventing that file from being pruned potentially for a long time.

    This really makes me think it is a mistake expose ChainstateRole to all these individual BlockManager methods and try to use it as a heuristic for selecting the cursor. I think a better approach that would keep block files more tidy and simplify the BlockManager API would be to give BlockManager an int m_snapshot_base_height variable that could be initialized in BlockManager::LoadBlockIndexDB and used internally to derive the BlockfileType value without requiring ChainstateRole chainstate_role parameters to be passed around everywhere. This would also simplify the validation code, avoiding the need to change ChainstateManager::AcceptBlock.

    I think the new BlockManager::FlushChainstateBlockFile method would still need to be added, but the rest of the BlockManager API changes could be reverted, and FlushChainstateBlockFile could take a chain tip height instead of a ChainstateRole, so the same cursor select code could be used everywhere.


    jamesob commented at 3:10 pm on August 31, 2023:

    Hey thanks for this feedback, this has led to a big simplification in this commit. I had tried to do something like this earlier, but got hung up on the “snapshot activated during runtime” case. I forgot that ChainstateManager can just reach into the BlockManager and set the snapshot height dynamically.

    So that’s what I’m doing here, without trying to adapt ChainstateRole to BlockManager behavior. The result is a lot simpler and - as you point out - less buggy!

  154. ryanofsky commented at 2:20 am on August 31, 2023: contributor

    Got as far as the blockstorage commit this time and wrote a lot of feedback on it. There’s a lot of small cleanups I suggested there, but I also suggested a different approach in #27596 (review) that I think would simplify code, avoid the needing to add any ChainstateRole arguments to BlockManager methods, and avoid potentially mixing blocks from the background and snapshot chainstates into the last background block file. I think the current approach is ok if you want to keep it, but it does seems messier than it needs to be and not as reliable as it could be.

    Progress so far:

    • 7b93d4adc976bdb32306951b21d7c98d9dacc972 net_processing: Request assumeutxo background chain blocks (1/23)
    • b85530c18bfe24ff434f32f959d8dd843afca54d bugfix: correct is_snapshot_cs in VerifyDB (2/23)
    • 8f6d6a49552c84a3a9112b39499685f018abfa40 assumeutxo: remove snapshot during -reindex{-chainstate} (3/23)
    • 7a62705f421b998a812f0f018dbf8f1211d57535 chainparams: add blockhash to AssumeutxoData (4/23)
    • 167f94d905ca1cc77cb613e89cea74275807fed5 validation: MaybeRebalanceCaches when chain leaves IBD (5/23)
    • 288dd732b5c0085618598fe3dbba23ac6d194173 validation: add ChainstateRole (6/23)
    • b06daa581f35a8e4736682929e356d560fa6a5da validation: only call UpdatedBlockTip for active chainstate (7/23)
    • d3b533d3f1e80addca2ab0b3ee97215d5d03d984 validation: pass ChainstateRole for validationinterface calls (8/23)
    • 1fda82e7192c25363d57ab203b5db8eac6088491 validationinterface: only send zmq notifications for active (9/23)
    • 3e446ffd122e11b206692f6ac702df989884fbe7 wallet: validationinterface: only handle active chain notifications (10/23)
    • fbd05a9eed1e40be0c4915e2aef98b3771d7af6c net_processing: validationinterface: ignore some events for bg chain (11/23)
    • 0dd0cda274c244c466b27ee5f390d4cdd2e1e195 validation: indexing changes for assumeutxo (12/23)
    • 5e4c824433fb4bf8b380d0f61a0aa424dc5456d4 validation: pruning for multiple chainstates (13/23)
    • 76d616c957522e7be52eaf10b3ac33accaf75c45 test: adjust chainstate tests to use recognized snapshot base (14/23)
    • 9df91113b8bbab8c98924bef61da388a8141b4e6 validation: populate nChainTx value for assumedvalid chainstates (15/23)
    • 1c668b690eeb4b98899be7e0ff33373b79932f2a blockstorage: segment normal/assumedvalid blockfiles (16/23)
    • f2dd7a0c1b67d3e4a6e086a619eff34fe6625263 validation: assumeutxo: swap m_mempool on snapshot activation (17/23)
    • 82a8744f1384f07dc30aaa6339b0b8fb32dc3af0 rpc: add loadtxoutset (18/23)
    • 555206238129c6ed9fcbbc498c81455c7fe6d851 rpc: add getchainstates (19/23)
    • cc85742de115a1e9533459acc8707b58c397ab6a test: add feature_assumeutxo functional test (20/23)
    • 194eb1fba6c9237affc273e5c087e54502ee7598 contrib: add script to demo/test assumeutxo (21/23)
    • 36cece6c488bfb810282e352f282a92941443be1 doc: update release-process.md for assumeutxo (22/23)
    • b6b62a96bd6d1eccaf6bd9eadfa1c60232aadeed chainparams: add assumeutxo param at height 788_000 (23/23)
  155. in src/net_processing.cpp:5904 in 7b93d4adc9 outdated
    5897@@ -5831,6 +5898,13 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
    5898             std::vector<const CBlockIndex*> vToDownload;
    5899             NodeId staller = -1;
    5900             FindNextBlocksToDownload(*peer, MAX_BLOCKS_IN_TRANSIT_PER_PEER - state.vBlocksInFlight.size(), vToDownload, staller);
    5901+            if (m_chainman.BackgroundSyncInProgress() && !IsLimitedPeer(*peer)) {
    5902+                TryDownloadingHistoricalBlocks(
    5903+                    *peer,
    5904+                    MAX_BLOCKS_IN_TRANSIT_PER_PEER - state.vBlocksInFlight.size(),
    


    naumenkogs commented at 9:08 am on August 31, 2023:

    7b93d4adc976bdb32306951b21d7c98d9dacc972

    Could use an assert this can’t underflow


    jamesob commented at 6:06 pm on August 31, 2023:
    Thanks, now avoiding underflow with cast + max.
  156. in src/validation.cpp:5874 in 288dd732b5 outdated
    5736@@ -5737,6 +5737,16 @@ bool ChainstateManager::DeleteSnapshotChainstate()
    5737     return true;
    5738 }
    5739 
    5740+ChainstateRole Chainstate::GetRole() const
    5741+{
    5742+    if (m_chainman.GetAll().size() <= 1) {
    


    naumenkogs commented at 9:25 am on August 31, 2023:

    288dd732b5c0085618598fe3dbba23ac6d194173

    I wish the chainstate state machine was briefly mentioned here, or referenced where it’s defined. A reviewer needs a way to understand why it’s impossible to have 2+ chainstates, one of them being NORMAL (and others being something).


    jamesob commented at 6:08 pm on August 31, 2023:
    Added a pointer to the ChainstateManager documentation this.
  157. in src/net_processing.cpp:1968 in d3b533d3f1 outdated
    1964@@ -1961,7 +1965,10 @@ void PeerManagerImpl::BlockDisconnected(const std::shared_ptr<const CBlock> &blo
    1965  * Maintain state about the best-seen block and fast-announce a compact block
    1966  * to compatible peers.
    1967  */
    1968-void PeerManagerImpl::NewPoWValidBlock(const CBlockIndex *pindex, const std::shared_ptr<const CBlock>& pblock)
    1969+void PeerManagerImpl::NewPoWValidBlock(
    


    naumenkogs commented at 9:30 am on August 31, 2023:

    d3b533d3f1e80addca2ab0b3ee97215d5d03d984

    could add an assert for the invariant: can’t be BACKGROUND role.


    naumenkogs commented at 9:37 am on August 31, 2023:
    This is done later in fbd05a9eed1e40be0c4915e2aef98b3771d7af6c, although in a return way for some reason?

    jamesob commented at 6:14 pm on August 31, 2023:
    Unclear if clients only want to know about NewPoWValidBlocks for background chainstate, but since the only usage is ignored (PeerManagerImpl) I’ll just remove the role parameter and avoiding sending the notification from non-active chainstates.
  158. in src/net_processing.cpp:1934 in fbd05a9eed outdated
    1915@@ -1916,9 +1916,15 @@ void PeerManagerImpl::BlockConnected(
    1916     const std::shared_ptr<const CBlock>& pblock,
    1917     const CBlockIndex* pindex)
    1918 {
    1919-    m_orphanage.EraseForBlock(*pblock);
    1920+    // Update this for all chainstate roles so that we don't mistakenly see peers
    1921+    // helping us do background IBD as having a stale tip.
    1922     m_last_tip_update = GetTime<std::chrono::seconds>();
    1923 
    1924+    if (role == ChainstateRole::BACKGROUND) {
    


    naumenkogs commented at 9:39 am on August 31, 2023:

    fbd05a9eed1e40be0c4915e2aef98b3771d7af6c

    I see why we don’t care about mempool stuff here, but m_block_stalling_timeout I’m not so sure. Is this logic completely irrelevant for background sync?


    jamesob commented at 6:28 pm on August 31, 2023:
    Good point! The m_block_stalling_timeout was added about a year ago, so when preparing these changes I didn’t consider it. It looks like we should keep it up to date for both chainstates. Fixed.
  159. naumenkogs commented at 9:53 am on August 31, 2023: member
    Reviewd up to 5e4c824433fb4bf8b380d0f61a0aa424dc5456d4
  160. in src/validation.cpp:4314 in b85530c18b outdated
    4203@@ -4204,7 +4204,7 @@ VerifyDBResult CVerifyDB::VerifyDB(
    4204     bool skipped_l3_checks{false};
    4205     LogPrintf("Verification progress: 0%%\n");
    4206 
    4207-    const bool is_snapshot_cs{!chainstate.m_from_snapshot_blockhash};
    4208+    const bool is_snapshot_cs{chainstate.m_from_snapshot_blockhash};
    


    ryanofsky commented at 12:03 pm on August 31, 2023:

    In commit “bugfix: correct is_snapshot_cs in VerifyDB” (b85530c18bfe24ff434f32f959d8dd843afca54d)

    Would it make sense to put this fix or other fixes into a backport PR? If this PR were merged and released with Bitcoin Core 26, would it be possible for someone to load an assumeutxo snapshot with Bitcoin Core 26, then downgrade to Bitcoin Core 25 or 24 and run into bugs processing the snapshot chainstate?


    jamesob commented at 6:29 pm on August 31, 2023:
    My inclination is to say that if the user begins snapshot use with a client that has this changeset, they won’t be able to downgrade without issue until the snapshot is validated (based on e.g. the net_processing changes here). I’m not sure where the best place to convey that would be… release notes?

    ryanofsky commented at 9:18 pm on August 31, 2023:

    re: #27596 (review)

    My inclination is to say that if the user begins snapshot use with a client that has this changeset, they won’t be able to downgrade without issue until the snapshot is validated (based on e.g. the net_processing changes here). I’m not sure where the best place to convey that would be… release notes?

    I think the most helpful place would be the loadtxoutset RPC documentation, saying that using the loadtxoutset RPC will make it temporarily not possible to downgrade the node software until the snapshot is validated, and then after that it will be safe to downgrade again.

    I wonder if we can do something simple to avoid this problem, though, like tweaking the snapshot chainstate directory name before releasing, so old software ignores it, or backporting a patch to the stable branches that makes them ignore the directory.

  161. in src/validation.cpp:5292 in f2dd7a0c1b outdated
    5204@@ -5205,6 +5205,9 @@ bool ChainstateManager::ActivateSnapshot(
    5205         const bool chaintip_loaded = m_snapshot_chainstate->LoadChainTip();
    5206         assert(chaintip_loaded);
    5207 
    5208+        // Transfer possession of the mempool to the snapshot chianstate.
    5209+        m_snapshot_chainstate->m_mempool = m_active_chainstate->m_mempool;
    


    ryanofsky commented at 5:59 pm on August 31, 2023:

    In commit “validation: assumeutxo: swap m_mempool on snapshot activation” (f2dd7a0c1b67d3e4a6e086a619eff34fe6625263)

    Can this code assert that the mempool is empty, or clear it, or else explain why contents of the mempool transactions are compatible with m_snapshot_chainstate?

    It would also be good to assert existing m_snapshot_chainstate pointer is not set, and make the ActivateSnapshot and ActivateExistingSnapshot versions of this code the same.


    jamesob commented at 7:10 pm on August 31, 2023:

    It would also be good to assert existing m_snapshot_chainstate pointer is not set, and make the ActivateSnapshot and ActivateExistingSnapshot versions of this code the same.

    That assertion happens here: https://github.com/bitcoin/bitcoin/blob/1b1d7110e67da11752ae1efadb1c12a9054eb867/src/validation.cpp#L5189

    Edit: oh oops, you’re probably talking about m_snapshot_chainstate->m_mempool, which isn’t checked.


    jamesob commented at 9:52 am on September 8, 2023:
    Added an assertion and comments. In practice, the mempool will be empty here because we wouldn’t activate a snapshot outside of IBD, and acceptance to the mempool is closed until IBD completes. I’ll add assertions elsewhere for that.
  162. in src/rpc/blockchain.cpp:2723 in 82a8744f13 outdated
    2718+        },
    2719+        [&](const RPCHelpMan& self, const JSONRPCRequest& request) -> UniValue
    2720+{
    2721+    fs::path path = fs::PathFromString(request.params[0].get_str());
    2722+    if (path.is_relative()) {
    2723+        path = gArgs.GetDataDirNet() / path;
    


    ryanofsky commented at 6:13 pm on August 31, 2023:

    In commit “rpc: add loadtxoutset” (82a8744f1384f07dc30aaa6339b0b8fb32dc3af0)

    Would be good to simplify and remove gArgs reference:

     0--- a/src/rpc/blockchain.cpp
     1+++ b/src/rpc/blockchain.cpp
     2@@ -2718,10 +2718,8 @@ static RPCHelpMan loadtxoutset()
     3         },
     4         [&](const RPCHelpMan& self, const JSONRPCRequest& request) -> UniValue
     5 {
     6-    fs::path path = fs::PathFromString(request.params[0].get_str());
     7-    if (path.is_relative()) {
     8-        path = gArgs.GetDataDirNet() / path;
     9-    }
    10+    NodeContext& node = EnsureAnyNodeContext(request.context);
    11+    fs::path path{AbsPathForConfigVal(EnsureArgsman(node), fs::u8path(request.params[0].get_str()))};
    12     FILE* file{fsbridge::fopen(path, "rb")};
    13     CAutoFile afile{file, SER_DISK, CLIENT_VERSION};
    14     if (afile.IsNull()) {
    15@@ -2739,7 +2737,6 @@ static RPCHelpMan loadtxoutset()
    16     LogPrintf("[snapshot] waiting to see blockheader %s in headers chain before snapshot activation\n",
    17         base_blockhash.ToString());
    18 
    19-    NodeContext& node = EnsureAnyNodeContext(request.context);
    20     ChainstateManager& chainman = *node.chainman;
    21 
    22     while (max_secs_to_wait_for_headers > 0) {
    

    jamesob commented at 10:17 am on September 8, 2023:
    Done, thanks.
  163. in src/rpc/blockchain.cpp:2770 in 82a8744f13 outdated
    2765+    }
    2766+
    2767+    // TODO jamesob: no real need to lock cs_main here, but during testing it makes it
    2768+    // easier to follow the logs. We may want to remove this before merge.
    2769+    //
    2770+    LOCK(::cs_main);
    


    ryanofsky commented at 6:18 pm on August 31, 2023:

    In commit “rpc: add loadtxoutset” (82a8744f1384f07dc30aaa6339b0b8fb32dc3af0)

    Should probably remove this if testing code which isn’t assuming cs_main is locked.


    jamesob commented at 10:18 am on September 8, 2023:
    Done.
  164. in src/rpc/blockchain.cpp:2735 in 82a8744f13 outdated
    2710+                RPCArg::Optional::NO,
    2711+                "path to the snapshot file. If relative, will be prefixed by datadir."},
    2712+        },
    2713+        RPCResult{
    2714+            RPCResult::Type::ELISION, "", "",
    2715+        },
    


    ryanofsky commented at 6:19 pm on August 31, 2023:

    In commit “rpc: add loadtxoutset” (82a8744f1384f07dc30aaa6339b0b8fb32dc3af0)

    Missing result description


    jamesob commented at 10:17 am on September 8, 2023:
    Fixed.
  165. in src/rpc/blockchain.cpp:2880 in 82a8744f13 outdated
    2813@@ -2729,6 +2814,7 @@ void RegisterBlockchainRPCCommands(CRPCTable& t)
    2814         {"hidden", &waitforblockheight},
    2815         {"hidden", &syncwithvalidationinterfacequeue},
    2816         {"hidden", &dumptxoutset},
    2817+        {"hidden", &loadtxoutset},
    


    ryanofsky commented at 6:21 pm on August 31, 2023:

    In commit “rpc: add loadtxoutset” (82a8744f1384f07dc30aaa6339b0b8fb32dc3af0)

    Should dumptxoutset and loadtxoutset still be hidden?


    jamesob commented at 10:19 am on September 8, 2023:
    Ah, good point. Fixed.
  166. in src/rpc/blockchain.cpp:2794 in 5552062381 outdated
    2789+return RPCHelpMan{
    2790+        "getchainstates",
    2791+        "\nReturn information about chainstates.\n",
    2792+        {},
    2793+        RPCResult{
    2794+            RPCResult::Type::ELISION, "", "",
    


    ryanofsky commented at 6:22 pm on August 31, 2023:

    In commit “rpc: add getchainstates” (555206238129c6ed9fcbbc498c81455c7fe6d851)

    Missing documentation for return value


    jamesob commented at 10:53 am on September 8, 2023:
    Fixed.
  167. in src/rpc/blockchain.cpp:2843 in 5552062381 outdated
    2816+
    2817+        data.pushKV("blocks",                (int)chain.Height());
    2818+        data.pushKV("bestblockhash",         tip->GetBlockHash().GetHex());
    2819+        data.pushKV("difficulty",            (double)GetDifficulty(tip));
    2820+        data.pushKV("verificationprogress",  GuessVerificationProgress(Params().TxData(), tip));
    2821+        data.pushKV("snapshot_blockhash",    cs->m_from_snapshot_blockhash.value_or(uint256{}).ToString());
    


    ryanofsky commented at 6:27 pm on August 31, 2023:

    In commit “rpc: add getchainstates” (555206238129c6ed9fcbbc498c81455c7fe6d851)

    Probably would be a little better to make this null instead of "" if unset


    jamesob commented at 10:51 am on September 8, 2023:
    Actually have no idea how to express “std::string or null” in our RPC framework… is this possible?

    ryanofsky commented at 0:28 am on September 14, 2023:

    re: #27596 (review)

    In commit “rpc: add getchainstates” (51dbde2bb73bcd03f73e3d5eb6c998e5e01b5536)

    Actually have no idea how to express “std::string or null” in our RPC framework… is this possible?

    I think you already did it on the documentation side /*optional=*/true above, since in general we treat null and missing values the same.


    jamesob commented at 6:52 pm on September 17, 2023:
    Done.
  168. jamesob force-pushed on Aug 31, 2023
  169. jamesob commented at 6:44 pm on August 31, 2023: member

    Yesterday’s feedback addressed; thanks @ryanofsky @naumenkogs. The diff gets smaller.

    0git range-diff upstream/master au.034 au.035
    
  170. in src/node/blockstorage.cpp:189 in 51622e9c1f outdated
    192+    // Distribute our -prune budget over all chainstates.
    193+    const auto target = std::max(
    194+        MIN_DISK_SPACE_FOR_BLOCK_FILES, GetPruneTarget() / chainman.GetAll().size());
    195+
    196+    auto chain_tip_height = static_cast<uint64_t>(chain.m_chain.Height());
    197+    if (chain_tip_height < 0 || target == 0) {
    


    andrewtoth commented at 5:51 pm on September 3, 2023:

    In commit “validation: pruning for multiple chainstates” (51622e9c1f7ed16f65844817a4a5e9bcbc157b4f)

    This line gives a compilation warning. Checking if a uint64_t is negative is always false, and the static_cast<uint64_t> is also a possible underflow error if chain.m_chain.Height() == -1.

    Perhaps check if chain.m_chain.Height() < 0 first and then assign to chain_tip_height after this line?


    jamesob commented at 10:56 am on September 8, 2023:
    Oops, good catch. Fixed.
  171. in doc/release-notes-27596.md:4 in 51622e9c1f outdated
    0@@ -0,0 +1,7 @@
    1+Pruning
    2+-------
    3+
    4+When using assumeutxo with `-prune`, the prune budget may be exceeded. Prune budget is
    


    naumenkogs commented at 8:33 am on September 4, 2023:

    51622e9c1f7ed16f65844817a4a5e9bcbc157b4f

    It might be helpful to be more precise here (and in the prune doc). This is what a user (not a dev) will need to know assuming they have limited memory, no?

    How large the exceeding can be?


    jamesob commented at 10:58 am on September 8, 2023:
    Mentioned specific values here.
  172. in src/validation.cpp:5884 in 51622e9c1f outdated
    5879+        // Leave the blocks in the background IBD chain alone if we're pruning
    5880+        // the snapshot chain.
    5881+        prune_start = *Assert(GetSnapshotBaseHeight()) + 1;
    5882+    }
    5883+
    5884+    int max_prune = chainstate.m_chain.Height() - static_cast<int>(MIN_BLOCKS_TO_KEEP);
    


    naumenkogs commented at 8:39 am on September 4, 2023:

    51622e9c1f7ed16f65844817a4a5e9bcbc157b4f

    it’s better to have std::max at this line, not below. This way I spend much less time thinking about the underflow, so does a future reviewer.


    jamesob commented at 11:00 am on September 8, 2023:
    Done.
  173. naumenkogs commented at 9:27 am on September 4, 2023: member
    Reviewed everything 1b1d7110e67da11752ae1efadb1c12a9054eb867. ACK once some remaining comments are addressed.
  174. jamesob force-pushed on Sep 8, 2023
  175. jamesob commented at 2:54 pm on September 8, 2023: member

    All feedback addressed, thanks @ryanofsky @andrewtoth @naumenkogs.

    au.036; range-diff:

    0( git remote -v | grep jamesob ) || git remote add jamesob https://github.com/jamesob/bitcoin
    1git fetch jamesob
    2git range-diff upstream/master jamesob/au.035 jamesob/au.036
    
  176. in src/test/validation_chainstatemanager_tests.cpp:429 in 64f431029f outdated
    427@@ -431,12 +428,16 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
    428     int num_indexes{0};
    429     int num_assumed_valid{0};
    430     const int expected_assumed_valid{20};
    431-    const int last_assumed_valid_idx{40};
    432+    const int last_assumed_valid_idx{111};
    


    ryanofsky commented at 2:21 am on September 11, 2023:

    In commit “test: adjust chainstate tests to use recognized snapshot base” (64f431029f04cc47889dbe8fad55ccc139af9c07)

    One problem with this change is that it makes the test less meaningful, now the snapshot block is the tip of the chain, and there are no blocks after it, so the test can no longer check that blocks after the snapshot block are added to the snapshot chain’s setBlockIndex set. But it seems like if you just mine 20 blocks on line 435 and check for 120 blocks on line 440, the test will pass with no other changes and cover these blocks.

    Also just in general I’ve been finding this test and the code around it very confusing, and think some extra comments and checks could help clarify it without having to change the existing code:

      0--- a/src/test/validation_chainstatemanager_tests.cpp
      1+++ b/src/test/validation_chainstatemanager_tests.cpp
      2@@ -422,17 +422,20 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
      3 
      4     int num_indexes{0};
      5     int num_assumed_valid{0};
      6+    // Blocks in range [assumed_valid_start_idx, last_assumed_valid_idx) will be
      7+    // marked as assumed-valid and not having data.
      8     const int expected_assumed_valid{20};
      9     const int last_assumed_valid_idx{111};
     10     const int assumed_valid_start_idx = last_assumed_valid_idx - expected_assumed_valid;
     11 
     12-    // Get to a valid assumeutxo snapshot
     13-    mineBlocks(10);
     14+    // Mine to height 120, past the hardcoded regtest assumeutxo snapshot at
     15+    // height 110
     16+    mineBlocks(20);
     17 
     18     CBlockIndex* validated_tip{nullptr};
     19     CBlockIndex* assumed_base{nullptr};
     20     CBlockIndex* assumed_tip{WITH_LOCK(chainman.GetMutex(), return chainman.ActiveChain().Tip())};
     21-    BOOST_CHECK_EQUAL(assumed_tip->nHeight, 110);
     22+    BOOST_CHECK_EQUAL(assumed_tip->nHeight, 120);
     23 
     24     auto reload_all_block_indexes = [&]() {
     25         // For completeness, we also reset the block sequence counters to
     26@@ -458,7 +461,7 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
     27         LOCK(::cs_main);
     28         auto index = cs1.m_chain[i];
     29 
     30-        // Blocks with heights in range [90, 110] are marked ASSUMED_VALID
     31+        // Blocks with heights in range [91, 110] are marked ASSUMED_VALID
     32         if (i < last_assumed_valid_idx && i >= assumed_valid_start_idx) {
     33             index->nStatus = BlockStatus::BLOCK_VALID_TREE | BlockStatus::BLOCK_ASSUMED_VALID;
     34         }
     35@@ -492,10 +495,36 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
     36     // Set tip of the assume-valid-based chain to the assume-valid block
     37     cs2.m_chain.SetTip(*assumed_base);
     38 
     39+    // Sanity check test variables.
     40+    BOOST_CHECK_EQUAL(num_indexes, 121); // 121 total blocks, including genesis
     41+    BOOST_CHECK_EQUAL(assumed_tip->nHeight, 120);  // original chain has height 120
     42+    BOOST_CHECK_EQUAL(validated_tip->nHeight, 90); // current cs1 chain has height 90
     43+    BOOST_CHECK_EQUAL(assumed_base->nHeight, 110); // current cs2 chain has height 110
     44+
     45+    // Regenerate cs1.setBlockIndexCandidates and cs2.setBlockIndexCandidate and
     46+    // check contents below.
     47     reload_all_block_indexes();
     48 
     49-    // The fully validated chain should have the current validated tip
     50-    // and the assumed valid base as candidates.
     51+    // The fully validated chain should only have the current validated tip and
     52+    // the assumed valid base as candidates, blocks 90 and 110. Specifically:
     53+    //
     54+    // - It does not have blocks 0-89 because they contain less work than the
     55+    //   chain tip.
     56+    //
     57+    // - It has block 90 because it has data and equal work to the chain tip,
     58+    //   (since it is the chain tip).
     59+    //
     60+    // - It does not have blocks 91-109 because they do not contain data.
     61+    //
     62+    // - It has block 110 even though it does not have data, because
     63+    //   LoadBlockIndex has a special case to always add the snapshot block as a
     64+    //   candidate. The special case is only actually intended to apply to the
     65+    //   snapshot chainstate cs2, not the background chainstate cs1, but it is
     66+    //   written broadly and applies to both.
     67+    //
     68+    // - It does not have any blocks after height 110 because cs1 is a background
     69+    //   chainstate, and only blocks where are ancestors of the snapshot block
     70+    //   are added as candidates for the background chainstate.
     71     BOOST_CHECK_EQUAL(cs1.setBlockIndexCandidates.size(), 2);
     72     BOOST_CHECK_EQUAL(cs1.setBlockIndexCandidates.count(validated_tip), 1);
     73     BOOST_CHECK_EQUAL(cs1.setBlockIndexCandidates.count(assumed_base), 1);
     74@@ -503,8 +532,25 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
     75     // The assumed-valid tolerant chain has the assumed valid base as a
     76     // candidate, but otherwise has none of the assumed-valid (which do not
     77     // HAVE_DATA) blocks as candidates.
     78+    //
     79+    // Specifically:
     80+    // - All blocks below height 110 are not candidates, because cs2 chain tip
     81+    //   has height 110 and they have less work than it does.
     82+    //
     83+    // - Block 110 is a candidate even though it does not have data, because it
     84+    //   is the snapshot block, which is assumed valid.
     85+    //
     86+    // - Blocks 111-120 are added because they have data.
     87+
     88+    // Check that block 90 is absent
     89     BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.count(validated_tip), 0);
     90+    // Check that block 109 is absent
     91+    BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.count(assumed_base->pprev), 0);
     92+    // Check that block 110 is present
     93+    BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.count(assumed_base), 1);
     94+    // Check that block 120 is present
     95     BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.count(assumed_tip), 1);
     96+    // Check that 11 blocks total are present.
     97     BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.size(), num_indexes - last_assumed_valid_idx + 1);
     98 }
     99 
    100diff --git a/src/validation.cpp b/src/validation.cpp
    101index f77325daedb5..9c8815914a57 100644
    102--- a/src/validation.cpp
    103+++ b/src/validation.cpp
    104@@ -3462,7 +3462,8 @@ void Chainstate::ResetBlockFailureFlags(CBlockIndex *pindex) {
    105 void Chainstate::TryAddBlockIndexCandidate(CBlockIndex* pindex)
    106 {
    107     AssertLockHeld(cs_main);
    108-    // The block only is a candidate for the most-work-chain if it has more work than our current tip.
    109+    // The block only is a candidate for the most-work-chain if it has the same
    110+    // or more work than our current tip.
    111     if (m_chain.Tip() != nullptr && setBlockIndexCandidates.value_comp()(pindex, m_chain.Tip())) {
    112         return;
    113     }
    

    I think it would be possible to make other improvements to the test beyond just adding comments. For example, I think it would be good to get rid of most of the variables and constants and just refer to blocks directly by height. Also, it would be good to add a fork to the chain below the snapshot height, to make sure blocks not leading up to the snapshot are not added as candidates to the background chainstate.

  177. ryanofsky commented at 2:28 am on September 11, 2023: contributor

    I made some more progress reviewing 9634d71e8b2f9cdccb437e2a689ac298c05c0583, but unfortunately got bogged down in the chainstatemanager_loadblockindex test again and suggested a lot of clarifications for it below.

    Related to this, I also found a possible bug in the ChainstateManager::LoadBlockIndex function and posted a comment in the earlier PR: #27746 (review)

  178. DrahtBot added the label CI failed on Sep 11, 2023
  179. jamesob force-pushed on Sep 11, 2023
  180. DrahtBot removed the label CI failed on Sep 11, 2023
  181. jamesob commented at 8:10 pm on September 11, 2023: member

    Pushed an update adding @ryanofsky’s good test changes, an import fix for the RPC code, and a documentation commit describing the confusing naming of HaveTxsDownloaded().

    au.037

    0( git remote -v | grep jamesob ) || git remote add jamesob https://github.com/jamesob/bitcoin
    1git fetch jamesob
    2git range-diff upstream/master jamesob/au.036 jamesob/au.037
    

    I’ve run this locally for a few IBD cycles.

  182. naumenkogs commented at 7:13 am on September 12, 2023: member
    ACK 9f9ffb655e4236f8d567b961102f53c5933e8aa8
  183. in src/node/blockstorage.h:86 in 3ff26bc733 outdated
    81+
    82+struct BlockfileCursor {
    83+    // The latest blockfile number.
    84+    int file_num{0};
    85+
    86+    // Track the height of the highest block in m_last_blockfile whose undo
    


    ryanofsky commented at 2:27 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    s/m_last_blockfile/file_num/


    jamesob commented at 4:53 pm on September 17, 2023:
    Done.
  184. in src/node/blockstorage.h:124 in 3ff26bc733 outdated
    120@@ -92,7 +121,8 @@ class BlockManager
    121      */
    122     bool LoadBlockIndex(const std::optional<uint256>& snapshot_blockhash)
    123         EXCLUSIVE_LOCKS_REQUIRED(cs_main);
    124-    void FlushBlockFile(bool fFinalize = false, bool finalize_undo = false);
    125+    void FlushBlockFile(int blockfile_num, bool fFinalize = false, bool finalize_undo = false);
    


    ryanofsky commented at 3:27 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    Prepending new parameter to a method with default arguments can be dangerous because it can lead to silent merge conflicts. (Argument value intended as the fFinalize parameter could be interpreted as the blockfile_num parameter).

    It would be pretty easy to just drop all the default arguments and make the code explicit:

     0--- a/src/node/blockstorage.h
     1+++ b/src/node/blockstorage.h
     2@@ -121,8 +121,8 @@ private:
     3      */
     4     bool LoadBlockIndex(const std::optional<uint256>& snapshot_blockhash)
     5         EXCLUSIVE_LOCKS_REQUIRED(cs_main);
     6-    void FlushBlockFile(int blockfile_num, bool fFinalize = false, bool finalize_undo = false);
     7-    void FlushChainstateBlockFile(int tip_height, bool fFinalize = false, bool finalize_undo = false);
     8+    void FlushBlockFile(int blockfile_num, bool fFinalize, bool finalize_undo);
     9+    void FlushChainstateBlockFile(int tip_height);
    10     void FlushUndoFile(int block_file, bool finalize = false);
    11     bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime, bool fKnown);
    12     bool FindUndoPos(BlockValidationState& state, int nFile, FlatFilePos& pos, unsigned int nAddSize);
    13--- a/src/node/blockstorage.cpp
    14+++ b/src/node/blockstorage.cpp
    15@@ -630,14 +630,14 @@ BlockfileType BlockManager::BlockfileTypeForHeight(int height)
    16     return (height >= *m_snapshot_height) ? BlockfileType::ASSUMED : BlockfileType::NORMAL;
    17 }
    18 
    19-void BlockManager::FlushChainstateBlockFile(int tip_height, bool finalize, bool finalize_undo)
    20+void BlockManager::FlushChainstateBlockFile(int tip_height)
    21 {
    22     LOCK(cs_LastBlockFile);
    23     auto& cursor = m_blockfile_cursors[BlockfileTypeForHeight(tip_height)];
    24     if (cursor) {
    25         // The cursor may not exist after a snapshot has been loaded but before any
    26         // blocks have been downloaded.
    27-        return FlushBlockFile(cursor->file_num, finalize, finalize_undo);
    28+        return FlushBlockFile(cursor->file_num, /*fFinalize=*/false, /*finalize_undo=*/false);
    29     }
    30 }
    31 
    

    jamesob commented at 4:55 pm on September 17, 2023:
    Yep, easy to do and done.
  185. in src/node/blockstorage.cpp:516 in 3ff26bc733 outdated
    394@@ -386,6 +395,15 @@ bool BlockManager::LoadBlockIndexDB(const std::optional<uint256>& snapshot_block
    395         }
    396     }
    397 
    398+    {
    399+        // Initialize the blockfile cursors.
    400+        LOCK(cs_LastBlockFile);
    401+        for (size_t i = 0; i < m_blockfile_info.size(); ++i) {
    402+            const auto last_height_in_file = m_blockfile_info[i].nHeightLast;
    403+            m_blockfile_cursors[BlockfileTypeForHeight(last_height_in_file)] = {static_cast<int>(i), 0};
    


    ryanofsky commented at 4:18 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    No need to change, but if you wanted an easy way to avoid repeatedly looking up the same two map elements in a loop, you could just replace the m_blockfile_cursors map with a fixed size array like @ajtowns recently suggested for #27213 (review). I think the benefits would be similar: not just more efficiency but also more straightforward code where you don’t have think about the [] operator default constructing elements or at method throwing exceptions.


    jamesob commented at 4:49 pm on September 17, 2023:
    Good suggestion! Done.
  186. in src/node/blockstorage.h:238 in 3ff26bc733 outdated
    233+     * comingling different height regions of the chain when an assumedvalid chainstate
    234+     * is in use. If heights are drastically different in the same blockfile, pruning
    235+     * suffers.
    236+     *
    237+     * This is set during ActivateSnapshot() or upon LoadBlockIndex() if a snapshot
    238+     * had been previously loaded.
    


    ryanofsky commented at 4:31 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    Can the comment also say what happens after the snapshot is validated?


    jamesob commented at 5:01 pm on September 17, 2023:
    Done.
  187. in src/node/blockstorage.h:184 in 3ff26bc733 outdated
    191+        {BlockfileType::NORMAL, BlockfileCursor{}},
    192+        {BlockfileType::ASSUMED, std::nullopt},
    193+    };
    194+    int MaxBlockfileNum() const EXCLUSIVE_LOCKS_REQUIRED(cs_LastBlockFile)
    195+    {
    196+        static const BlockfileCursor empty_cursor{0, 0};
    


    ryanofsky commented at 4:45 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    Here and other places in this PR, I think it would be nice avoid hardcoding these 0’s, and just use the default values that are already assigned in the struct definition. Alternately, if you think it is better to explicitly assign every field, it might be better to remove the default assignments in the struct so the compiler can warn when the assignments are missing.


    jamesob commented at 4:59 pm on September 17, 2023:
    Good point, fixed.
  188. in src/node/blockstorage.cpp:846 in 3ff26bc733 outdated
    842@@ -783,10 +843,10 @@ bool BlockManager::WriteUndoDataForBlock(const CBlockUndo& blockundo, BlockValid
    843         // in the block file info as below; note that this does not catch the case where the undo writes are keeping up
    844         // with the block writes (usually when a synced up node is getting newly mined blocks) -- this case is caught in
    845         // the FindBlockPos function
    846-        if (_pos.nFile < m_last_blockfile && static_cast<uint32_t>(block.nHeight) == m_blockfile_info[_pos.nFile].nHeightLast) {
    847+        if (_pos.nFile < static_cast<int>(cursor.file_num) && static_cast<uint32_t>(block.nHeight) == m_blockfile_info[_pos.nFile].nHeightLast) {
    


    ryanofsky commented at 4:47 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    static_cast<int> here and below seems unnecessary because file_num should already be an int.


    jamesob commented at 4:52 pm on September 17, 2023:
    Fixed.
  189. in src/node/blockstorage.cpp:738 in 3ff26bc733 outdated
    736+                    Assert(m_blockfile_cursors[chain_type])->undo_height);
    737+
    738+            // Try the next unclaimed blockfile number
    739+            nFile = this->MaxBlockfileNum() + 1;
    740+            // Set to increment MaxBlockfileNum() for next iteration
    741+            m_blockfile_cursors[chain_type] = BlockfileCursor{static_cast<int>(nFile), 0};
    


    ryanofsky commented at 5:18 pm on September 12, 2023:

    In commit “blockstorage: segment normal/assumedvalid blockfiles” (3ff26bc73374b53550dc2dd17dd3e1c58258fc30)

    static_cast<int>(nFile) cast here and blow seems unnecessary since nFile is already an int


    jamesob commented at 4:52 pm on September 17, 2023:
    Ah good catch, thanks. Fixed.
  190. in src/node/blockstorage.cpp:281 in e3cae85738 outdated
    277+    if (snapshot_blockhash) {
    278+        const AssumeutxoData au_data = *Assert(GetParams().AssumeutxoForBlockhash(*snapshot_blockhash));
    279+        snapshot_height = au_data.height;
    280+        CBlockIndex* base{LookupBlockIndex(*snapshot_blockhash)};
    281+
    282+        // Since nChainTx (responsible for estiamted progress) isn't persisted
    


    ryanofsky commented at 5:25 pm on September 12, 2023:

    In commit “validation: populate nChainTx value for assumedvalid chainstates” (e3cae85738c8bf4116547af8d868c6c01577b80f)

    estimated


    jamesob commented at 4:49 pm on September 17, 2023:
    Done.

    naumenkogs commented at 6:48 am on September 21, 2023:

    a7a5debb6a6bdc4f7217050e043cebb1c6e0d33a

    estiamted still not fixed


    naumenkogs commented at 7:05 am on September 21, 2023:
    ok, fixed later in bd770ff59075fa2fd0441311ef1c02439ed68fdd
  191. ryanofsky commented at 5:37 pm on September 12, 2023: contributor
    Working through 9f9ffb655e4236f8d567b961102f53c5933e8aa8. Got through the segmented storage change which is much simpler now, but I did leave some comments. Will continue reviewing.
  192. in src/validation.cpp:5793 in 0958f11c18 outdated
    5689-        std::make_unique<Chainstate>(mempool, m_blockman, *this, base_blockhash);
    5690