Split headers-only versions off CheckBlock/AcceptBlock #3884

pull sipa wants to merge 2 commits into bitcoin:master from sipa:headersfirst5 changing 2 files +179 −72
  1. sipa commented at 8:17 pm on March 16, 2014: member

    There is a small semantic change here, namely that blocks with invalid transaction data (eg duplicate hashes, non-first coinbase, …) will have their headers accepted into the tree (but never considered for connecting).

    That’s an inevitable consequence of moving towards headers-first, as we’ll do header connectivity checkinng before transaction checking anyway (the transaction data won’t be available until later, so we rather do more extensive checking on the headers that are available first).

    This has very low DoS risk, as it still requires faking PoW.

  2. luke-jr commented at 8:30 pm on March 16, 2014: member
    This doesn’t change the actual behaviour of the pre-existing functions, correct? So, eg, unit test assumptions remain valid..
  3. sipa commented at 8:34 pm on March 16, 2014: member

    A better summary of the changes:

    • before: standalone headers/tx checks, (orphan store), tree headers/tx checks, store block, connect
    • after: standalone headers checks, tree headers checks, store headers, standalone tx checks, (orphan store), tree tx checks, store block, connect

    In a later pull request, the orphan storage will be removed entirely, as blocks can be stored with just headers validated but not transactions (potentially resulting in out-of-order block storagre).

  4. sipa commented at 7:06 pm on March 17, 2014: member
    Please take your time, and ask for clarification as necessary.
  5. sipa commented at 4:16 pm on March 29, 2014: member
    There is a bug in this code, which prevents syncing. Closing until fixed.
  6. sipa closed this on Mar 29, 2014

  7. sipa reopened this on Apr 7, 2014

  8. sipa commented at 0:14 am on April 8, 2014: member
    \o/ segfault
  9. sipa commented at 6:22 pm on April 8, 2014: member
    Bug fixed. @luke-jr I don’t think any assumptions that are used in unit tests change. The only real change is that now CBlockIndex entries in the block tree may exist that do not have actual block data associated with them.
  10. mikehearn commented at 3:08 pm on April 10, 2014: contributor

    I did a review, comments ended up on the origin commit rather than here - oops.

    Overall looks good, just would appreciate some more comments and helper methods in a few places to make the logic easier to follow and reduce the chances of typos with all the bit twiddling.

  11. laanwj added this to the milestone 0.10.0 on Apr 11, 2014
  12. in src/main.cpp: in 2c82f2e5be outdated
    2488+
    2489+    // Enforce block.nVersion=2 rule that the coinbase starts with serialized block height
    2490+    if (block.nVersion >= 2)
    2491+    {
    2492+        // if 750 of the last 1,000 blocks are version 2 or greater (51/100 if testnet):
    2493+        if ((!TestNet() && CBlockIndex::IsSuperMajority(2, pindex->pprev, 750, 1000)) ||
    


    Diapolo commented at 12:42 pm on April 11, 2014:
    Shouldn’t these be constants perhaps in CChainParams?

    sipa commented at 1:12 pm on April 11, 2014:
    Agree, but let’s do that in another PR.
  13. rdponticelli commented at 4:43 pm on April 16, 2014: contributor
    There seems to be some bug on the connection logic in this pull. I got a node stuck after it got shutdown during syncing. After launching it again, all the remaining blocks got marked as orphans, and it never resumed syncing. Shutting down and launching again several times didn’t help.
  14. sipa commented at 5:24 pm on April 16, 2014: member
    @rdponticelli With a recent version of this code (less than 8 days old?)
  15. rdponticelli commented at 5:35 pm on April 16, 2014: contributor

    @sipa Yes, last version. Culprit seems to be some kind of race condition with the low disk space code. Some logs:

    2014-04-16 14:18:49 UpdateTip: new best=0000000000000000937bdca5485b8a02da4cbfea74fe982903a2da1ba8537866 height=295933 log2_work=77.988128 tx=36873027 date=2014-04-15 07:18:51 progress=0.993805 2014-04-16 14:18:49 ProcessBlock: ACCEPTED 2014-04-16 14:18:49 *** Error: Disk space is low! 2014-04-16 14:19:03 ERROR: AcceptBlock() : FindBlockPos failed 2014-04-16 14:19:03 ERROR: ProcessBlock() : AcceptBlock FAILED 2014-04-16 14:19:03 Requesting shutdown 2014-04-16 14:19:03 ProcessBlock: ORPHAN BLOCK 0, prev=0000000000000000a89d601cd42705cfe8a25b6546ed892309c67e5a197e3fdb 2014-04-16 14:19:03 ProcessBlock: ORPHAN BLOCK 1, prev=000000000000000005ea1ca8a944a574826bd856b83a29871f46c8e2a2323deb 2014-04-16 14:19:03 Running Shutdown in thread

    After that all the chain is detected as orphan:

    2014-04-16 14:58:04 Initialization result: 1 2014-04-16 14:58:14 ProcessBlock: ORPHAN BLOCK 0, prev=0000000000000000a89d601cd42705cfe8a25b6546ed892309c67e5a197e3fdb 2014-04-16 14:58:15 ProcessBlock: ORPHAN BLOCK 1, prev=000000000000000005ea1ca8a944a574826bd856b83a29871f46c8e2a2323deb 2014-04-16 14:58:17 ProcessBlock: ORPHAN BLOCK 2, prev=00000000000000001179ce5490d2ff833c0afb1799b6beb0981096c9800002c8 2014-04-16 14:58:17 ProcessBlock: ORPHAN BLOCK 3, prev=00000000000000001b4fd2aaec801ac068ea2a39a4b9f9eb16569a69c6713d6f 2014-04-16 14:58:18 ProcessBlock: ORPHAN BLOCK 4, prev=000000000000000023524d33b27b7ea7135332b32853a7582cf504e20644ea48 2014-04-16 14:58:19 ProcessBlock: ORPHAN BLOCK 5, prev=0000000000000000a5344e5239e17ae8995d3c7c1417b38d0dc1e047595a3807

  16. rdponticelli commented at 6:18 pm on April 16, 2014: contributor

    Running the code in master in this state I got:

    2014-04-16 17:10:31 init message: Done loading 2014-04-16 17:10:31 Initialization result: 1 2014-04-16 17:10:41 Pre-allocating up to position 0x7000000 in blk00132.dat 2014-04-16 17:10:47 ERROR: ReadBlockFromDisk : OpenBlockFile failed 2014-04-16 17:10:47 *** Failed to read block 2014-04-16 17:10:51 ERROR: AcceptBlock() : AddToBlockIndex failed 2014-04-16 17:10:51 ERROR: ProcessBlock() : AcceptBlock FAILED 2014-04-16 17:10:51 ERROR: ReadBlockFromDisk : OpenBlockFile failed 2014-04-16 17:10:51 *** Failed to read block 2014-04-16 17:10:51 ERROR: AcceptBlock() : AddToBlockIndex failed 2014-04-16 17:10:51 ERROR: ProcessBlock() : AcceptBlock FAILED 2014-04-16 17:10:51 Requesting shutdown 2014-04-16 17:10:51 Running Shutdown in thread 2014-04-16 17:10:51 ERROR: ReadBlockFromDisk : OpenBlockFile failed 2014-04-16 17:10:51 *** Failed to read block 2014-04-16 17:10:51 addcon thread interrupt 2014-04-16 17:10:51 net thread interrupt 2014-04-16 17:10:51 dumpaddr thread stop 2014-04-16 17:10:51 opencon thread interrupt 2014-04-16 17:10:53 ERROR: AcceptBlock() : AddToBlockIndex failed 2014-04-16 17:10:53 ERROR: ProcessBlock() : AcceptBlock FAILED 2014-04-16 17:10:53 msghand thread interrupt 2014-04-16 17:10:53 Shutdown : In progress…

    And after that the new code again:

    2014-04-16 17:45:19 init message: Done loading 2014-04-16 17:45:19 Initialization result: 1 2014-04-16 17:45:28 ProcessBlock: ACCEPTED 2014-04-16 17:45:31 ProcessBlock: ACCEPTED 2014-04-16 17:45:31 CheckForkWarningConditions: Warning: Found invalid chain at least ~6 blocks longer than our best chain. Chain state database corruption likely. 2014-04-16 17:45:31 ProcessBlock: ACCEPTED 2014-04-16 17:45:31 CheckForkWarningConditions: Warning: Large valid fork found forking the chain at height 295933 (0000000000000000937bdca5485b8a02da4cbfea74fe982903a2da1ba8537866) lasting to height 295941 (00000000000000002cda160c4452e7ad5252453a969dc6ff9b6e9e8bb584093f). Chain state database corruption likely. 2014-04-16 17:45:31 ProcessBlock: ACCEPTED 2014-04-16 17:45:33 CheckForkWarningConditions: Warning: Large valid fork found forking the chain at height 295933 (0000000000000000937bdca5485b8a02da4cbfea74fe982903a2da1ba8537866) lasting to height 295942 (0000000000000000adfe890f4c33801b26fac05207747652d8c311244e59a6d0). Chain state database corruption likely.

    And it’s still going like this….

  17. sipa commented at 11:21 am on April 17, 2014: member

    @rdponticelli Thanks for testing!

    So the problem was that if the block data failed to write, we ended up with a block header in the index, but no corresponding block data in the block file. At next startup, this would mean it would need to download it again, but the block download logic can’t deal with the in-index-but-not-on-disk case yet, which will be added later.

    I’ve (hopefully) fixed it by only writing the block header to the index after the block data is present.

  18. rdponticelli commented at 7:19 pm on April 17, 2014: contributor

    Now I’m getting this on another node. It seems that it’s breaking when it changes the block file:

    2014-04-17 19:02:56 ProcessBlock: ACCEPTED 2014-04-17 19:03:07 UpdateTip: new best=0000000000000000551b394408e467f83cd5d6f36612f8e33643d188f0d4498f height=293947 log2_work=77.722856 tx=36140885 date=2014-04-03 11:35:26 progress=0.931634 2014-04-17 19:03:07 ProcessBlock: ACCEPTED 2014-04-17 19:03:18 Pre-allocating up to position 0x300000 in rev00129.dat 2014-04-17 19:03:19 UpdateTip: new best=000000000000000055c9d3b2a816877d3b1a9ca6048170ed7cee1c8629255e0e height=293948 log2_work=77.72298 tx=36141736 date=2014-04-03 11:56:14 progress=0.931702 2014-04-17 19:03:19 ProcessBlock: ACCEPTED 2014-04-17 19:03:21 Pre-allocating up to position 0x2000000 in blk00129.dat 2014-04-17 19:03:52 UpdateTip: new best=0000000000000000b2a0b9c58aa86994aa85385d0d8a8676945f6d3b642b27a9 height=293949 log2_work=77.723105 tx=36142897 date=2014-04-03 12:08:09 progress=0.931744 2014-04-17 19:03:53 ProcessBlock: ACCEPTED 2014-04-17 19:03:53 ProcessBlock: ORPHAN BLOCK 0, prev=00000000000000006c3f5ca15b21e43e7328c6ea033a25671f08c396cfd027ca 2014-04-17 19:03:59 UpdateTip: new best=00000000000000006e97b96972f7f31f0847fcf35e723193c0949dad4bbc730c height=293950 log2_work=77.723229 tx=36143254 date=2014-04-03 12:14:21 progress=0.931765 2014-04-17 19:03:59 ProcessBlock: ACCEPTED 2014-04-17 19:04:03 UpdateTip: new best=0000000000000000cf179e61ecbe56eb19561396ef72650f2a8b3e7d29d21cde height=293951 log2_work=77.723353 tx=36143569 date=2014-04-03 12:20:27 progress=0.931785 2014-04-17 19:04:03 ProcessBlock: ACCEPTED 2014-04-17 19:04:05 UpdateTip: new best=000000000000000039022b02422dfee754c2dd91cf8dac8289c956b30da37b80 height=293952 log2_work=77.723478 tx=36143600 date=2014-04-03 12:20:35 progress=0.931785 2014-04-17 19:04:05 ProcessBlock: ACCEPTED 2014-04-17 19:04:05 ProcessBlock: ORPHAN BLOCK 1, prev=00000000000000007f49a436045b4190c4164adef4479ebc290ae1379826b40a 2014-04-17 19:04:07 UpdateTip: new best=00000000000000001c9b145536ce2f3ffd529742b0ab58a4bdee1ebd985e4c69 height=293953 log2_work=77.723602 tx=36143765 date=2014-04-03 12:24:37 progress=0.931799 2014-04-17 19:04:07 ProcessBlock: ACCEPTED 2014-04-17 19:04:07 ProcessBlock: ORPHAN BLOCK 2, prev=0000000000000000c0b4bede793045606417892efcbb83c3d5c52b9aa74b9086 2014-04-17 19:04:08 UpdateTip: new best=0000000000000000c9cbe25307641906939f3e767d94dd7d2553b860dad7a835 height=293954 log2_work=77.723726 tx=36143895 date=2014-04-03 12:27:15 progress=0.931808

  19. sipa commented at 7:49 pm on April 17, 2014: member
    Just seeing orphans while syncing is perfectly expected (not intentional though, but it will require more to fix that).
  20. sipa commented at 9:42 pm on April 18, 2014: member
    Rebased and addressed @mikehearn’s comments.
  21. Split up CheckBlock in a block and header version f457347053
  22. Split AcceptBlockHeader from AcceptBlock.
    Also modify some connection logic to deal with non-full blocks in the index.
    942b33a19d
  23. sipa commented at 10:36 pm on April 24, 2014: member
    Rebased.
  24. BitcoinPullTester commented at 11:00 pm on April 24, 2014: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/942b33a19d3aa96326acc044c1834c48beff777c for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  25. mikehearn commented at 9:42 pm on April 29, 2014: contributor
    @gavinandresen How does this look to you?
  26. leofidus commented at 1:38 pm on May 2, 2014: none
    Tested this on the Dogecoin blockchain. Both initial sync and keeping up with the blockchain worked as normal, both restarting the client and waking up from hibernation worked fine and I didn’t spot anything unusual in the logs.
  27. rebroad commented at 0:09 am on May 9, 2014: contributor
    I’m currently testing this. It’s still incredibly slow due to the peer selection for the block download. I really think it should take the speed of download into consideration for selecting a peer, and I still don’t understand why it selects only one peer - why can’t it select several? After all, most download tools these days choose to download from multiple sources to reduce the load on any one source.
  28. gmaxwell commented at 0:14 am on May 9, 2014: contributor
    @rebroad This is an incremental reorganization. All that stuff you are talking about was implemented months ago in sipa’s headers first branch but it was too risky a change to take at once and wasn’t seeing adequate testing. This pull (and others) are breaking up the steps to move to it incrementally in a way which is easier to verify.
  29. rebroad commented at 1:48 am on May 9, 2014: contributor
    @gmaxwell yes, I know. The stuff I am talking about hasn’t been implemented yet. This is why I am mentioning it.
  30. gmaxwell commented at 2:00 am on May 9, 2014: contributor

    The stuff I am talking about hasn’t been implemented yet.

    Pulling from multiple peers and taking speed into account were both implemented in the headers first branch. (actually it does something which is usually better than take speed into account it tracks when peers are stalling the process and kicks out the straggler(s)).

  31. laanwj commented at 5:46 am on May 9, 2014: member
    @rebroad That’s just rude. Please keep discussion in issues about the goal of the issue. @sipa does not claim that this change solves the problems that you mention, so you should not be disappointed that they are not solved by it.
  32. sipa commented at 10:25 am on May 9, 2014: member

    @rebroad Selecting multiple peers to download from would just bring duplicate block downloads back, unless it happens in a coordinated way. No such coordination exists, because we don’t know in advance which peers have which blocks - we just download what they tell us they have.

    Nobody claims we can’t download from multiple peers at once - that is one of the goals of headers-first synchronization (which this is incremental work towards, but it will take a few more pull requests before we’re actually there).

  33. laanwj added the label P2P on May 9, 2014
  34. mikehearn commented at 11:00 am on May 9, 2014: contributor
    How do we decide when there’s been enough testing and review? Seems like the path forward here isn’t clear.
  35. laanwj commented at 11:31 am on May 9, 2014: member
    @mikehearn What part is unclear? This is tagged ‘0.10’ so will be merged after the branch-off of 0.9.2.
  36. mikehearn commented at 11:46 am on May 9, 2014: contributor
    This may be a discussion for the mailing list, but I’m not sure it’s healthy to keep pull requests that are “done” queued up where they have to be constantly rebased, waiting to enter master, missing out on testing that people might do. If you want to have a branch for stabilisation, why not branch for 0.9.2 now and cherry pick fixes into it until you’re confident it’s ready?
  37. laanwj commented at 12:48 pm on May 9, 2014: member
    I know that - the plan is to split off 0.9.2 this week.
  38. mikehearn commented at 12:59 pm on May 9, 2014: contributor
    Okie dokie.
  39. rebroad commented at 1:18 pm on May 9, 2014: contributor

    @gmaxwell ah, so it’s not in the master branch? Am I able to submit pull requests to the branch you mention? What is the purpose of this branch and will it as some point be merged into master?

    My main reason for mentioning these concerns is that the current master as it is doesn’t work well, and gets 300 or so orphan blocks piling up without catching up on the blockchain, often requiring a restart of bitcoind to get it downloading again, at which point the hundreds of orphan blocks already downloaded are then lost (since they are not saved to disk).

    It’s also frustrating to see this, considering I raised a pull request about 2 years ago which downloaded blocks far faster and more efficiently than it currently does, and which did detect stuck downloads, unlike the current master branch.

  40. sipa commented at 1:57 pm on May 9, 2014: member
    @rebroad Gregory talking about my personal headers-first branch (#2964), which was never merged because of multiple reasons, and later abandoned. I’ve since (slowly) been working on several pull requests with pieces of that functionality. This pull request is just some refactoring work towards that.
  41. laanwj merged this on May 9, 2014
  42. laanwj closed this on May 9, 2014

  43. laanwj referenced this in commit d54985f3f1 on May 9, 2014
  44. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-17 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me