Add a switch to allow running in a pruned state #4481

pull rdponticelli wants to merge 2 commits into bitcoin:master from Criptomonedas:pruned changing 3 files +130 −41
  1. rdponticelli commented at 6:45 pm on July 7, 2014: contributor
    To avoid stalling the downloads of other peers, we disconnect those who ask us for a block we don’t have anymore.
  2. petertodd commented at 4:25 am on July 8, 2014: contributor

    I think this is acceptable so long as you add a “-allowpruned” option that disables the “we have all blocks” check on startup and unsets the NODE_NETWORK service bit. (meaning we act like a SPV client)

    You could argue with that you might as well not disconnect peers that ask for blocks you don’t have, as you aren’t advertising that you have any blocks at all. That might be useful for some types of local usage, and probably does no harm.

  3. laanwj commented at 7:18 am on July 8, 2014: member

    Agree with @petertodd here. If you don’t run a full node, you should not have it advertise NODE_NETWORK. The P2P protocol at this point does not support advertising availability of only ranges of blocks, and disconnecting anyone that happens to request a block that’s missing seems very crude.

    Also: -allowpruned should imply -disablewallet, as the wallet currently relies on being able to rescan the blocks on disk, for example on importkey or importwallet.

  4. petertodd commented at 8:23 am on July 8, 2014: contributor
    @laanwj Good point. You could make pruning just disable some wallet functionality, but disabling all wallet functionality is simpler for now - we might not have the wallet in the future anyway.
  5. rdponticelli commented at 0:01 am on July 9, 2014: contributor
    Thanks for the feedback. Pull updated, addressing the input.
  6. laanwj commented at 6:20 am on July 9, 2014: member

    I like the idea of an UTXO-only, non-archival node.

    • Excluding the wallet, we still need to test the rest of the RPC API for robustness against missing block data. How does this interact with gettransaction (w/ possible txindex), for example?
    • Ideally this would be accompanied with an option to not write blocks to disk at all. I think the current expectation is that some external program does the ‘pruning’?
  7. rdponticelli commented at 1:12 am on July 10, 2014: contributor
    @ashleyholman: I guess we should try and test fully what happens unsetting the flag, but that looks like the sensible thing to do.
  8. gmaxwell commented at 1:28 pm on July 10, 2014: contributor

    It should also check that so much hasn’t been removed that it can’t survive realistic reorganizations… a bunch of nodes that just refuse to reorg would be fairly harmful to the network (not to mention their users!). I’d suggest the check go back 288 blocks from tip since thats our default for the extensive checkblocks.

    This also needs at least a modest testplan to check that none of the exposed RPCs will crash, and that any that would crash are disabled. It should also be made incompatible with txindex.

  9. gmaxwell commented at 1:33 pm on July 10, 2014: contributor
    We may also want to leave this out of usage until it’s had a fair amount of testing (e.g. in the first release with it)… esp I’d worry that there may be additional network triggerable misbehavior from places where this violates assumptions.
  10. gmaxwell commented at 4:21 pm on July 10, 2014: contributor
    So after giving this more though, instead of allowpruned— why not just go all the way and have a pruned=1 … with the logic that if our validated height is >100k and a blockfiles highest block is <tip-288, just delete the block file? This would allow you to sync up without much free disk space (the first 119k blocks are small and fit in the first block file).
  11. gmaxwell commented at 8:44 pm on July 10, 2014: contributor
    (Sipa pointed out to me that the rational for the 100k limit was not obvious. The reason I proposed that was because without it if someone gave you a 400 block reorg during the very early initial block download you would not recover. The 100k limit gets the difficulty to a non-completely-trivial level, and is also few enough that its still in a single block file, so there would be no peak usage increase. Better to have a total-work check, but that would probably need headers first in order to be easy to implement).
  12. rdponticelli commented at 11:53 am on July 11, 2014: contributor

    So, there are several possible modes of operation after we allow pruning. @laanwj’s utxosonly, @gmaxwell’s autoprune are both interesting, and there could be more…

    Should I modify this patch to allow setting them as options, or would that add too much unneeded complexity?

  13. gmaxwell commented at 12:43 pm on July 11, 2014: contributor
    UTXO only cannot be done without substantial work because the node would be unable to reorganize.
  14. rdponticelli commented at 12:59 pm on July 11, 2014: contributor

    Yeah, obviously autoprune is a way lower hanging fruit.

    Maybe after #4468 utxoonly can be possible, but yes, at a significantly cost.

  15. laanwj commented at 1:00 pm on July 11, 2014: member
    Right, I did not think about reorganizations. A node that would fall over on even a small reorganization would be pointless. And storing a few blocks isn’t a problem. @gmaxwell’s idea about keeping only the recent N>288 blocks when height>100k sounds great, and is how auto-pruning should work.
  16. gmaxwell commented at 1:17 pm on July 11, 2014: contributor

    WRT UTXO-only, we’d also need to add code to hash the undo files, and add network code to fetch them (and make the undo datastructure normative along the way)… Or, just don’t delete the undo files but this is unattractive since there is currently about 2gb of undo data.

    So I think for now we should just do the autopruned which disables the wallet and txindex and keeps enough blocks that reorg failure is very unlikely. Then we can think about the above steps with the undo data, along with keeping the ability to remain acting as a node-network.

  17. wtogami commented at 9:28 am on July 12, 2014: contributor
    Shouldn’t this also disable UPNP as the only reason it is used is to allow incoming connections to possibly sync from your node?
  18. petertodd commented at 9:58 am on July 12, 2014: contributor

    I wouldn’t bother, as we’ll want to let nodes advertise useful services beyond the blockchain, e.g. tx relaying.

    On 12 July 2014 10:28:41 BST, Warren Togami notifications@github.com wrote:

    Shouldn’t this also disable UPNP as the only reason it is used is to allow incoming connections to possibly sync from your node?


    Reply to this email directly or view it on GitHub: #4481 (comment)

  19. gmaxwell commented at 12:32 pm on July 12, 2014: contributor
    We don’t want to disable listening or UPNP because you may want to use the node with p2pool or the like, but we should probably suppress address messages when we’re not node-network. I think that should just wait until another pull though.
  20. petertodd commented at 12:36 pm on July 12, 2014: contributor

    @gmaxwell Sounds good to me.

    On 12 July 2014 13:32:58 GMT+01:00, Gregory Maxwell notifications@github.com wrote:

    We don’t want to disable listening or UPNP because you may want to use the node with p2pool or the like, but we should probably suppress address messages when we’re not node-network. I think that should just wait until another pull though.


    Reply to this email directly or view it on GitHub: #4481 (comment)

  21. jgarzik commented at 0:06 am on July 14, 2014: contributor

    Looks mostly OK, modulo nits.

    Do we want to be able to find pruned nodes? Seems like allocating a NODE_xxx service bit in this PR is appropriate.

  22. gmaxwell commented at 4:16 am on July 14, 2014: contributor
    The bit ought to depend on willingness to at least relay the most recent N transactions from the tip. N=288 by definition sounds good to me, but if it’s in the bit definition perhaps it needs more shed painting, which is why I didn’t recommend this above (for this pull, e.g. we could do it in a separate one).
  23. jgarzik commented at 4:36 am on July 14, 2014: contributor

    @gmaxwell To forestall shed painting, I observe that NODE_NETWORK behavior has varied over time; only it’s general definition “a full node!” has remained static.

    As such, it seems reasonable to add NODE_PRUNED or whatever we want to call the high level attribute being advertised. If it’s only minor tweaks to find a good value for N, that seems like something that could be performed in-tree before the next release.

    It’s all about communication. A release note “NODE_PRUNED is experimental until 0.11” can further extend the life.

  24. laanwj commented at 6:48 am on July 14, 2014: member

    What about adding a service bit NODE_UTXO? And then having full nodes advertise it as well? This has been proposed before and I think that was a good suggestion (by @maaku?). Pruned nodes would only advertise NODE_UTXO.

    This would give NODE_NETWORK - starting from version X - the meaning of ‘can serve blocks’, and NODE_UTXO the meaning of ’tracks UTXO and can validate transactions’.

    This way, service bits actually tell about provided services. I like ’negative’ service bits like NODE_PRUNED considerably less.

  25. sipa commented at 12:03 pm on July 14, 2014: member

    I would move the meaning of block-servavility out of NODE_NETWORK, not the complement. Reason: we don’t really know in what ways the remainder of the functionality will be split off in the future, while the ability to serve blocks is well defined. Let NODE_NETWORK remain the kitchen sink, and give new bits a well defined meaning.

    As to the meaning itself, I have previously proposed to use 2 service bits, the combination of which indicates one of 4 levels (e.g. 0,288,4032,all or 288,2016,52500,all if we want some mandatory serving). Discussion mostly turned into using such a solution vs. adding a field to CAddress to indicate this information with more granularity.

  26. gmaxwell commented at 12:37 pm on July 14, 2014: contributor
    I was trying to avoid the granularity rathole here. :) Can we please put the new flag in a subsequent pull to avoid delaying this based on that? The approach of removing node network on a pruning node is safe if far from perfect… and will suffice until the extra behavior is hammered out.
  27. jgarzik commented at 12:46 pm on July 14, 2014: contributor

    I see that shed painting was not forestalled, @gmaxwell :)

    ACK

  28. rdponticelli commented at 10:46 pm on July 14, 2014: contributor

    Most nits has been addressed, and I added a branch in my repository for anybody willing to test the rejection instead of disconnecting case:

    https://github.com/Criptomonedas/bitcoin/tree/prunedreject

    It seems to work pretty fine, just a little more resource wasteful than disconnecting.

  29. sipa commented at 5:37 pm on July 15, 2014: member
    utACK
  30. maaku commented at 5:45 pm on July 15, 2014: contributor
    ut?
  31. sipa commented at 5:46 pm on July 15, 2014: member
    untested
  32. gmaxwell commented at 5:54 pm on July 15, 2014: contributor

    NAK. Important nits remain: This must not allow a node to start which cannot reorg.

    Please change the test in LoadBlockIndexDB() to check that all of the last 288 blocks are available. (then I’ll test and ACK)

  33. sipa commented at 5:56 pm on July 15, 2014: member
    Undo ACK. Agree with @gmaxwell
  34. jgarzik commented at 5:57 pm on July 15, 2014: contributor
    +1
  35. maaku commented at 9:40 pm on July 15, 2014: contributor

    You would want to serve not just the most recent blocks, but also a range of past blocks. Otherwise it would be very difficult to perform an initial block download in a scenario where most nodes are pruning. Adding a field to CAddress seems the more compatible approach with this requirement.

    On 07/14/2014 08:03 AM, Pieter Wuille wrote:

    I would move the meaning of block-servavility out of NODE_NETWORK, not the complement. Reason: we don’t really know in what ways the remainder of the functionality will be split off in the future, while the ability to serve blocks is well defined. Let NODE_NETWORK remain the kitchen sink, and give new bits a well defined meaning.

    As to the meaning itself, I have previously proposed to use 2 service bits, the combination of which indicates one of 4 levels (e.g. 0,288,4032,all or 288,2016,52500,all if we want some mandatory serving). Discussion mostly turned into using such a solution vs. adding a field to CAddress to indicate this information with more granularity.

    — Reply to this email directly or view it on GitHub #4481 (comment).

  36. gmaxwell commented at 10:37 pm on July 15, 2014: contributor
    @maaku absolutely, but I think we should do that as part of a pull that turns back on network services bits. This pull just turns the node into a client, and the only reason to keep any blocks at all is otherwise a reorg leaves the node stuck and it needs a total (30gbyte) reinit, and while it’s stuck it’s forked and potentially following a losing chain.
  37. in src/init.cpp: in 29965a779e outdated
    615@@ -615,7 +616,12 @@ bool AppInit2(boost::thread_group& threadGroup)
    616     fLogTimestamps = GetBoolArg("-logtimestamps", true);
    617     fLogIPs = GetBoolArg("-logips", false);
    618     setvbuf(stdout, NULL, _IOLBF, 0);
    619+    fAllowPruned = GetBoolArg("-allowpruned", false);
    620 #ifdef ENABLE_WALLET
    621+    if (fAllowPruned) {
    622+        if (SoftSetBoolArg("-disablewallet", true))
    623+            LogPrintf("AppInit2 : parameter interaction: -allowpruned=1 -> setting -disablewallet=1\n");
    


    Diapolo commented at 1:48 pm on July 16, 2014:
    Those checks and the message don’t belong here but in Step 2: parameter interactions in init.
  38. in src/main.cpp: in 29965a779e outdated
    2989@@ -2990,18 +2990,70 @@ bool static LoadBlockIndexDB()
    2990 
    2991     // Check presence of blk files
    2992     LogPrintf("Checking all blk files are present...\n");
    2993-    set<int> setBlkDataFiles;
    2994+    map<int, bool> mapBlkDataFileReadable,mapBlkUndoFileReadable;
    


    Diapolo commented at 1:49 pm on July 16, 2014:
    Why the Readable at the end?

    rdponticelli commented at 2:06 pm on July 16, 2014:
    Because they’re used to store that condition about the files.
  39. in src/main.cpp: in 29965a779e outdated
    3517+                            // his download. He shouldn't ask, anyway, as we unset NODE_NETWORK on this mode.
    3518+                            LogPrintf("cannot load block from disk, disconnecting peer=%d\n", pfrom->id);
    3519+                            pfrom->fDisconnect = true;
    3520+                        }
    3521+                        else
    3522+                            assert(!"cannot load block from disk");
    


    Diapolo commented at 1:53 pm on July 16, 2014:
    Should this be an assert or can we handle this a little nicer?

    jgarzik commented at 1:55 pm on July 16, 2014:
    At a minimum, assert() is wrong for runtime conditions such as disk errors or out-of-space conditions.

    rdponticelli commented at 2:08 pm on July 16, 2014:
    It was an assert in the original code. Maybe it should be changed, but I would left that for another pull, later.

    laanwj commented at 2:15 pm on July 16, 2014:
    Using AbortNode would make sense here (though I agree it’s out of scope for this pull)
  40. in src/main.cpp: in 29965a779e outdated
    2989@@ -2990,18 +2990,70 @@ bool static LoadBlockIndexDB()
    2990 
    2991     // Check presence of blk files
    2992     LogPrintf("Checking all blk files are present...\n");
    2993-    set<int> setBlkDataFiles;
    2994+    map<int, bool> mapBlkDataFileReadable,mapBlkUndoFileReadable;
    2995+    int nKeepBlksFromHeight = mapBlockIndex.size() - MIN_BLOCKS_TO_KEEP;
    


    sipa commented at 3:29 pm on July 16, 2014:
    This needs to be chainActive.Height(), not mapBlockIndex.size() (which may include stale branches).

    rdponticelli commented at 6:12 pm on July 16, 2014:
    chainActive.Height() is returning -1. I suppose it isn’t initialized at startup.

    sipa commented at 7:18 pm on July 16, 2014:
    That’s a problem, because it means we don’t know the height yet.

    rdponticelli commented at 7:31 pm on July 16, 2014:
    I’ll search for a workaround. Meanwhile, maybe increasing MIN_BLOCKS_TO_KEEP, as a safety net, would allow to keep testing this. What would be a sensible number?

    sipa commented at 7:35 pm on July 16, 2014:
    chainActive is initialized at the end of this function. Just move the test down.
  41. in src/main.cpp: in 29965a779e outdated
    3050+        }
    3051+        else
    3052+        {
    3053+            if((pindex->nHeight > nKeepBlksFromHeight) && (pindex->nHeight != ((int)mapBlockIndex.size()-1)))
    3054+            {
    3055+                LogPrintf("Missing undo in index for block: %i. mapBlockIndex.size: %i\n", pindex->nHeight, mapBlockIndex.size());
    


    sipa commented at 3:29 pm on July 16, 2014:
    Index has nothing to do with it? We’re just missing undo data.
  42. laanwj added the label UTXO Db and Indexes on Jul 31, 2014
  43. in src/util.cpp: in 42e1f8665c outdated
     96@@ -97,6 +97,7 @@ string strMiscWarning;
     97 bool fLogTimestamps = false;
     98 bool fLogIPs = false;
     99 volatile bool fReopenDebugLog = false;
    100+bool fPruned = false;
    


    sipa commented at 9:45 am on August 3, 2014:

    This really doesn’t belong in util.

    I’d say main, but apparently you need it inside net, which feels wrong. Feel like moving the pnodeLocalHost code to init?


    rdponticelli commented at 5:16 pm on August 3, 2014:
    What about passing an extra parameter to StartNode to allow setting the service bits from the caller in init? Would that be a correct solution? It would seem to be simpler and less invasive, wouldn’t it?

    sipa commented at 6:02 pm on August 3, 2014:
    Sounds good to me.

    jgarzik commented at 6:22 pm on August 3, 2014:
    Agree it does not belong in util, but also a parameter to StartNode does not seem appropriate. See NODE_EXT_SERVICES in the following WIP branch at https://github.com/jgarzik/bitcoin/tree/2014_ext_services
  44. sipa commented at 6:25 pm on August 3, 2014: member

    @jgarzik Please, let’s not make this dependent on how to expose this functionality to the network yet, that can come later. All we need for now is something that disables full node functionality when running pruned.

    EDIT: Oh, you mean just modifying nLocalServices from init? That seems fine too.

  45. jgarzik commented at 6:45 pm on August 3, 2014: contributor
    @sipa Correct, that’s all I meant. Local service configuration will ultimately be multi-step, and can be handled directly from init.
  46. rdponticelli force-pushed on Aug 26, 2014
  47. rdponticelli force-pushed on Aug 26, 2014
  48. BitcoinPullTester commented at 8:06 pm on August 26, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4481_d00a934cc05301d6a08f8d686428c4fd704ed76b/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  49. rdponticelli renamed this:
    Allow the node to run with history pruned.
    Add a switch to allow running in a pruned state
    on Aug 26, 2014
  50. rdponticelli force-pushed on Aug 27, 2014
  51. BitcoinPullTester commented at 6:42 pm on August 27, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4481_eae33b1af4f866257a84bcb438a1768fff07a63e/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  52. sipa commented at 6:46 pm on August 27, 2014: member
    Untested ACK.
  53. sipa commented at 6:48 pm on August 27, 2014: member

    There are a few changes I’d like to make, but let’s do that post-merge:

    • Merge the block/undo data file handling; for now, it seems they will be deleted simultaneously anyway.
    • Make the pruning depth configurable.
    • Abstract the deletion code out of the startup routine, and into something that can run regularly.

    EDIT: this is actually about #4701.

  54. rdponticelli force-pushed on Aug 28, 2014
  55. rdponticelli commented at 2:17 pm on August 28, 2014: contributor
    Added incompatibility with -txindex, as requested.
  56. BitcoinPullTester commented at 2:28 pm on August 28, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4481_9045aa715c2a043af244d14170fc11b84f986dcb/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  57. Include undo data to startup checks and refactor them onto a function.
    The check is now simpler, somewhat more efficient, and
    covers more data, to be sure the node can reorganize.
    709a98405e
  58. Add a switch to allow running in a pruned state
    These are the main functional changes on this state:
    
    * Do not allow running with a wallet or a txindex.
    * Check for data at startup is mandatory up to last 288 blocks.
    * NODE_NETWORK flag is unset.
    * Requests for pruned blocks from other peers is answered with "notfound" and they are disconnected, not to stall their IBD.
    * The range of blocks pruned is logged.
    efbf886cbe
  59. rdponticelli force-pushed on Sep 12, 2014
  60. BitcoinPullTester commented at 11:41 pm on September 12, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4481_efbf886cbe560950f432b5af78f63d4a57c81b51/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  61. rdponticelli closed this on Sep 18, 2014

  62. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-08 22:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me