-prune doesn’t work during import #23852

MarcoFalke opened this issue on December 23, 2021
  1. MarcoFalke commented at 12:56 pm on December 23, 2021: member

    It appears that -prune is ignored during import (-reindex, or -loadblock).

    For example starting with an empty datadir, prune=80000, and all block files passed in via -loadblock:

    ...
    2021-12-23T12:49:44Z [loadblk] [node/blockstorage.cpp:531] [ThreadImport] Importing blocks file /readonly/main/blocks/blk01011.dat...
    2021-12-23T12:49:44Z [loadblk] [node/blockstorage.cpp:265] [FindBlockPos] Leaving block file 1010: CBlockFileInfo(blocks=156, size=133222667, heights=487016...487360, time=2017-09-26...2017-09-28)
    2021-12-23T12:49:45Z [loadblk] [flatfile.cpp:69] [Allocate] Pre-allocating up to position 0x1000000 in blk01011.dat
    2021-12-23T12:49:45Z [loadblk] [logging/timer.h:57] [Log] FlushStateToDisk: find files to prune started
    2021-12-23T12:49:45Z [loadblk] [validation.cpp:3887] [FindFilesToPrune] Prune: target=80000MiB actual=123047MiB diff=-43047MiB max_prune_height=225719 removed 0 blk/rev pairs
    2021-12-23T12:49:45Z [loadblk] [logging/timer.h:57] [Log] FlushStateToDisk: find files to prune completed (0.05ms)
    ...
    

    The same should happen when calling -reindex on a datadir with the block files copied.

    So prune only works with IBD from the network?
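To make the numbers in the Prune: log line easier to read, here is a minimal Python sketch that parses that one line and checks the arithmetic. It assumes that diff is simply target minus actual, which is what the logged values suggest; the function name parse_prune_line and the regular expression are illustrative and not part of Bitcoin Core.

```python
import re

# Prune log line copied from the report above (timestamp and source tags omitted).
LOG_LINE = ("Prune: target=80000MiB actual=123047MiB diff=-43047MiB "
            "max_prune_height=225719 removed 0 blk/rev pairs")


def parse_prune_line(line):
    """Extract the numeric fields of a 'Prune:' log line as integers.

    Hypothetical helper for illustration only; the field layout is taken
    from the single log line shown in this issue.
    """
    pattern = (
        r"target=(?P<target>\d+)MiB actual=(?P<actual>\d+)MiB "
        r"diff=(?P<diff>-?\d+)MiB max_prune_height=(?P<max_prune_height>\d+) "
        r"removed (?P<removed>\d+) blk/rev pairs"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a Prune: log line")
    return {name: int(value) for name, value in match.groupdict().items()}


fields = parse_prune_line(LOG_LINE)

# Assumption: diff = target - actual, so a negative diff means the block
# storage is above the configured prune target.
over_target = fields["actual"] - fields["target"]

print(f"over target by {over_target} MiB, removed {fields['removed']} file pairs")
```

Read this way, the node is 43047 MiB over its 80000 MiB target at that point and still removed no blk/rev pairs, which is the behavior being reported.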

  2. MarcoFalke added the label Bug on Dec 23, 2021
  3. MarcoFalke commented at 9:50 am on December 24, 2021: member
    As a workaround, it is possible to start one node with its datadir on one hard drive and a second node with its datadir on the other hard drive, using -connect=localhost to import the blocks.
  4. MarcoFalke closed this on Feb 17, 2022

  5. mruddy commented at 5:12 pm on April 20, 2022: contributor

    @MarcoFalke I was just looking at this because I’ve been looking at issues related to re-indexing. Please re-open this issue if you agree with my thoughts below.

    1. I think that -prune=N plus -loadblock=path could be made to work. That seems to make sense. An example use case is like what you were doing in your issue report above, where you have a clean set of block files and you want to create a pruned node from them, but you don’t want to change the input set of block files.

    2. I think that -prune=N plus -reindex makes potentially less sense together and should not be made to work.

      1. Enabling this combo could be dangerous by being destructive of the input blk files. If a problem occurs during re-indexing, then this combo may increase the chance of having to re-download lots of block data. If we leave them as a non-effective combo, then many re-indexing failures can be overcome by simply starting over with the original blk files still intact.
      2. A case where this combo may not make sense has to do with out of order blocks in blk files. The -reindex process can process blocks out of order. We wouldn’t want to prune a blk file that contains a block that was encountered out of order because when an out of order block is processed during re-indexing, its location is saved for a later re-read from the blk file after its parent is processed. If we pruned the blk file containing it, then we’d lose it and all of its ancestor blocks. We’d have to re-download them because the re-index process would consider all of its ancestors as being encountered out of order and not connectable. Therefore, we would not prune those blk files and thus the -prune=N would be a potentially meaningless suggestion (depending on how out of order a block was in a set of input block files) rather than a limit set by the node operator that will be generally adhered to. A worst case would have the last/best block inserted after genesis in the first blk file (dumb, but possible). No pruning would be possible until after everything is re-indexed.
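The out-of-order concern in point 2.2 can be illustrated with a toy model. The Python sketch below is not Bitcoin Core's re-index code; it is a simplified simulation under stated assumptions: blocks are (block_id, parent_id) pairs, a block whose parent has not been seen yet is deferred with its file number remembered, and a file counts as "pinned" (unsafe to prune) while any deferred block still lives in it. All names here (reindex, pending, pinned_log) are hypothetical.

```python
from collections import defaultdict, deque


def reindex(blk_files):
    """Toy re-index over a list of blk files.

    Each file is a list of (block_id, parent_id) pairs in on-disk order;
    parent_id None marks the genesis block. Returns (order, pinned_log):
    order is the sequence in which blocks were connected, and
    pinned_log[i] is the set of file numbers that still hold deferred
    blocks after file i has been read.
    """
    known = set()                 # blocks connected so far
    order = []                    # connection order
    pending = defaultdict(list)   # parent_id -> [(file_no, block_id)]
    pinned_log = []

    for file_no, blocks in enumerate(blk_files):
        for block_id, parent_id in blocks:
            if parent_id is not None and parent_id not in known:
                # Parent not seen yet: remember where the block lives
                # so it can be re-read once the parent is connected.
                pending[parent_id].append((file_no, block_id))
                continue
            known.add(block_id)
            order.append(block_id)
            # Connect any deferred descendants that are now reachable.
            queue = deque([block_id])
            while queue:
                parent = queue.popleft()
                for _child_file, child in pending.pop(parent, []):
                    known.add(child)
                    order.append(child)
                    queue.append(child)
        pinned_log.append({f for locs in pending.values() for f, _ in locs})
    return order, pinned_log


# Worst case from the comment above: the last block (b4) sits right
# after genesis in the first file.
files = [
    [("g", None), ("b4", "b3")],
    [("b1", "g"), ("b2", "b1")],
    [("b3", "b2")],
]
order, pinned_log = reindex(files)
print(order)
print(pinned_log)
```

In this toy run, file 0 stays pinned until the final file has been read, so under these assumptions no amount of -prune=N could free it earlier without losing the deferred block.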
  6. mruddy referenced this in commit da8e95c014 on Apr 24, 2022
  7. luke-jr referenced this in commit c86f129fd1 on May 21, 2022
  8. mruddy referenced this in commit 5c1bb1b0d9 on Oct 26, 2022
  9. mruddy referenced this in commit 7a022e4155 on Oct 27, 2022
  10. mruddy referenced this in commit 9a2d5ea407 on Oct 27, 2022
  11. mruddy referenced this in commit 347664ec71 on Oct 28, 2022
  12. mruddy referenced this in commit 2f62704430 on Oct 30, 2022
  13. mruddy referenced this in commit 488682e785 on Nov 1, 2022
  14. mruddy referenced this in commit f5c16495ed on Nov 18, 2022
  15. mruddy referenced this in commit 734355b470 on Nov 18, 2022
  16. mruddy referenced this in commit c4981e7f63 on Feb 22, 2023
  17. DrahtBot locked this on Apr 20, 2023
  18. achow101 referenced this in commit aebcd18c65 on May 3, 2023
  19. sidhujag referenced this in commit d03f4ab2d6 on May 4, 2023
  20. RandyMcMillan referenced this in commit 83c6c50e81 on May 27, 2023



github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 00:12 UTC
