Deserialize error: overlapping blocks in block files? #5850

laanwj opened this issue on March 3, 2015
  1. laanwj commented at 11:09 am on March 3, 2015: member

    Recently I had an error on an ARM node running latest master. Its file system is on a USB stick, so it is notoriously unreliable. Normally I just start a reindex, but since other similar problems have been reported, this time I decided to investigate.

    2015-02-27 04:30:05 ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large at CBlockDiskPos(nFile=196, nPos=67432382)
    

    Looking at the file:

    -rw------- 1 debian debian 134202024 Feb 11 03:28 blk00196.dat
    

    blk00196.dat, offset 0x404efbe. Not a block header in sight.

    0404efb0  aa 65 b2 9d 80 ae 02 21  00 e0 43 0e 47 6d fe 26  |.e.....!..C.Gm.&|
    0404efc0  0b 51 ae 47 c2 9a 4c c5  d0 10 0d da ff c5 9f de  |.Q.G..L.........|
    0404efd0  4a 73 63 92 3c 07 33 65  c8 01 21 02 5b 2c d7 c3  |Jsc.<.3e..!.[,..|
    0404efe0  00 c4 67 38 f8 48 94 00  1e f6 fc 8a eb 33 ee bc  |..g8.H.......3..|
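
The decimal `nPos` from the log and the hex offset used for the dump are the same number, and the position should sit right after an 8-byte record prefix (network magic plus size). A minimal sketch of that check, assuming a mainnet block file; the helper name `record_prefix_ok` is made up for illustration and is not part of bitcoind:

```python
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # assumption: a mainnet block file
PREFIX_LEN = 8  # 4 bytes network magic + 4 bytes little-endian block size


def record_prefix_ok(data: bytes, n_pos: int) -> bool:
    """Return True if the bytes just before n_pos start with the network magic.

    n_pos is treated as the start of the serialized block, i.e. the position
    directly after the magic and size fields.
    """
    start = n_pos - PREFIX_LEN
    if start < 0 or n_pos > len(data):
        return False
    return data[start:start + 4] == MAINNET_MAGIC


n_pos = 67432382           # nPos from the error message
print(f"{n_pos:#x}")       # 0x404efbe
```

Reading `blk00196.dat` into `data` and calling `record_prefix_ok(data, n_pos)` would be expected to return False here, matching the "not a block header in sight" observation.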
    

    According to block database:

    nHeight=330411 nFile=196 nDataPos=0404d320 nUndoPos=007ff580 hash=000000000000000011d3e32f9276f2b2aa154dedfe4a7a63df9cbaaf86f3939d
    nHeight=330388 nFile=196 nDataPos=0404efbe nUndoPos=00718a26 hash=00000000000000000d61fbd0001a82706380d6b9ba5a09011097a6e9b967bf65
    nHeight=330412 nFile=196 nDataPos=0405586d nUndoPos=00800983 hash=00000000000000000f611e05c0d6f1800cfd05291e224fa5377d988ea627ff94
    

    After creating a script that scans for all valid block headers in a block file, I found a strange occurrence in blk00196.dat:

    Range             DataPos  Hash                                                             Height
    03df9be8-03e935a2 03df9bf0 0000000000000000057e7aea33fc6d3599023ff14bb8f7cccf0517cfe53ca110 330364
    03e935a2-03edce1a 03e935aa 0000000000000000090fc723e88fc12543e328dacc9c5d59664a10dd5e11f4c0 330404
    03edce1a-03f484c7 03edce22 000000000000000002d445d11051e3c55dd7c083b8a0ed2c38808cddb3d433be 330405
    03ee3121-03f4e7ce 03ee3129 000000000000000002d445d11051e3c55dd7c083b8a0ed2c38808cddb3d433be 330405 Overlap with last block (03ee3121,03f4e7ce)
    03f4e7ce-03f78aab 03f4e7d6 00000000000000000788eca10eaa37ad18c28bdb7b1abd0de580c77bb9e5dfbc 330406
    

    It looks like overlapping blocks are written. This doesn’t seem like normal disk corruption.
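
A scan of this kind could be sketched roughly as follows. This is not the `list-blocks.py` from the archive, only an illustration under stated assumptions: mainnet magic, each record laid out as magic, 4-byte little-endian size, then the block, and a header counted as valid when its double-SHA256 hash meets the target encoded in its own nBits field. The names `scan_blocks` and `find_overlaps` are hypothetical.

```python
import hashlib
import struct

MAGIC = bytes.fromhex("f9beb4d9")  # assumption: mainnet network magic
HEADER_LEN = 80                    # size of a serialized block header


def block_hash(header: bytes) -> str:
    """Double SHA-256 of the header, shown in the usual byte-reversed hex form."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


def target_from_bits(bits: int) -> int:
    """Expand the compact nBits encoding into the full target (sign bit ignored)."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def scan_blocks(data: bytes):
    """Yield (start, end, data_pos, hash) for every magic followed by a valid header."""
    pos = data.find(MAGIC)
    while pos != -1:
        data_pos = pos + 8
        if data_pos + HEADER_LEN <= len(data):
            (size,) = struct.unpack_from("<I", data, pos + 4)
            header = data[data_pos:data_pos + HEADER_LEN]
            (bits,) = struct.unpack_from("<I", header, 72)  # nBits sits at offset 72
            h = block_hash(header)
            if size >= HEADER_LEN and int(h, 16) <= target_from_bits(bits):
                yield pos, data_pos + size, data_pos, h
        # Step by one byte so a record starting inside another is still found.
        pos = data.find(MAGIC, pos + 1)


def find_overlaps(records):
    """Return (previous, current) pairs where a record starts before the previous one ends."""
    overlaps = []
    prev = None
    for rec in sorted(records):
        if prev is not None and rec[0] < prev[1]:
            overlaps.append((prev, rec))
        prev = rec
    return overlaps
```

Searching for the magic at every byte offset, instead of jumping to the end of each record, is what lets a second copy of a block that begins inside the first one show up in the listing.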

    The above offset 0404efbe for block 330388 would fall into the range claimed by another block, 330411:

    Range             DataPos  Hash                                                             Height
    0404d318-04055865 0404d320 000000000000000011d3e32f9276f2b2aa154dedfe4a7a63df9cbaaf86f3939d 330411
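
The cross-check between the database entries and the scanned ranges can be written down in a few lines. The sketch below uses only the offsets quoted in this comment; the function name `misplaced_entries` is invented for illustration, and the rule "a valid nDataPos equals range start + 8" is inferred from the Range and DataPos columns above.

```python
# (range_start, range_end, height) from the block scan quoted above
SCANNED = [(0x0404D318, 0x04055865, 330411)]

# (nDataPos, nHeight) from the block database dump quoted above
INDEX = [
    (0x0404D320, 330411),
    (0x0404EFBE, 330388),
    (0x0405586D, 330412),
]


def misplaced_entries(index, scanned):
    """Return (data_pos, height, owner_height) for index entries that point
    into the middle of a scanned record instead of at its data start."""
    bad = []
    for data_pos, height in index:
        for start, end, owner in scanned:
            if start <= data_pos < end and data_pos != start + 8:
                bad.append((data_pos, height, owner))
    return bad


for data_pos, height, owner in misplaced_entries(INDEX, SCANNED):
    print(f"block {height}: nDataPos={data_pos:08x} lies inside block {owner}")
```

With these inputs only the entry for 330388 is reported: 330411 points at its own data start, and 330412 begins past the end of the range.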
    

    Full information (as well as supporting scripts) can be found here: https://download.visucore.com/tmp/2015_03_arm_error.tar.xz

    • blk00196.dat Block data
    • db_file196.txt Database dump for all blocks in 196, sorted by offset
    • dump_blk00196.txt Block scan list for blocks in 196
    • list-blocks.py Script to scan and list blocks in a .dat file
    • log_block_database.patch Patch to bitcoind to log block database at startup
  2. laanwj added the label Priority High on Mar 3, 2015
  3. laanwj added the label UTXO Db and Indexes on Mar 3, 2015
  4. laanwj commented at 4:50 pm on March 3, 2015: member
    I scanned the other block files, and it turns out that 196 is the only one in which this inconsistency happens. It looks like a one-time incident, consistent with (for example) a crash that left the database in an inconsistent state.
  5. theuni commented at 5:56 pm on March 3, 2015: member
    Nice job getting something tangible to investigate. I’ll poke at this some as well.
  6. ajweiss commented at 4:00 pm on March 4, 2015: contributor

    I have looked into this a bit. The offset between the two magics for the repeated beginning of block 330405 is 25351 bytes, and as far as I can tell the file write pointer is only advanced in FindBlockPos() in increments of a full block size (+ 8 bytes for network magic and size). I’ve scanned through my local block files for blocks that are 25343 bytes (+ 8 for network magic and size) and only really found matches in much earlier (<300k) blocks. (Incidentally, I also tried 25351 and 25351+8 and found nothing within a thousand blocks, although 334940 is 25359 bytes, i.e. 25351 plus the 8 bytes of network magic and size.)
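
The arithmetic behind those sizes, worked from the range starts in laanwj's scan table. The variable names are illustrative only:

```python
PREFIX_LEN = 8  # network magic (4 bytes) + block size field (4 bytes)

first_copy = (0x03EDCE1A, 0x03F484C7)   # first record for block 330405
second_copy = (0x03EE3121, 0x03F4E7CE)  # overlapping second record

# Distance between the two magics.
gap = second_copy[0] - first_copy[0]

# Block size that would explain the gap if the write pointer had advanced
# by exactly one full record (block + 8-byte prefix).
candidate_block_size = gap - PREFIX_LEN

# On-disk record lengths of the two copies.
first_len = first_copy[1] - first_copy[0]
second_len = second_copy[1] - second_copy[0]

print(gap, candidate_block_size, gap + PREFIX_LEN)
print(first_len == second_len)
```

The gap comes out as 25351, so the block sizes searched for are 25343 (gap minus prefix), 25351, and 25359 (gap plus prefix). Both copies of 330405 also span the same number of bytes, which fits the same block having been written twice at different offsets.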

    I’ve also poked around looking for orphans that are this size (blockchain.info, is there a better place?) and haven’t turned anything up.

    I’m tempted to believe that there may indeed be corruption of the block index database itself, since that is what stores the offsets within the block files, including the erroneous entry for 330388.

    To dig deeper, I think I’d want to dump the entire block index (and the block file index) and compare it to the entire block file set… I dunno if that can be easily shared somewhere though…

  7. laanwj commented at 11:56 am on March 10, 2015: member
    Closing this. I have since restarted the node from scratch and haven’t had this issue yet, and #5668 seems to be about the same problem.
  8. laanwj closed this on Mar 10, 2015

  9. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 12:12 UTC