UndoReadFromDisk: Checksum mismatch #6923

laanwj opened this issue on October 31, 2015
  1. laanwj commented at 4:12 pm on October 31, 2015: member

    While hammering hard on this as part of testing #6917, I think I found a new crash corruption issue (possibly a rare one).

    The leveldb database was still OK, however the start-up check fails in a new place: in the undo data check: https://github.com/bitcoin/bitcoin/blob/master/src/main.cpp#L1494

    Haven’t analysed the details but will keep the data around.

  2. laanwj added the label Block storage on Oct 31, 2015
  3. laanwj commented at 4:50 pm on November 4, 2015: member

    According to @gmaxwell in #6917:

    Difference in VM flushing behavior, I guess. We appear to be missing a FileCommit in UndoWriteToDisk-- I think we need to have synced the blocks and undo before calling the insert.

    It would probably be better for performance if it wrote the block then undo, then did the syncs on both however.
    
  4. sipa commented at 5:00 pm on November 4, 2015: member
    Will have a look.
  5. sipa commented at 11:17 pm on November 4, 2015: member

    We don’t need a FileCommit in UndoWriteToDisk, because we always call it via FlushBlockFile 1) before writing the block index data that refers to it (in FlushStateToDisk) and 2) when transitioning to a new disk block file (in FindBlockPos).

    However, when reindexing, fKnown is set to true in FindBlockPos, so this flush is not called. That is the correct behaviour for the block files (which aren’t being changed in a reindex), but incorrect for undo files (which are being rewritten).
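
The ordering discussed in this thread (write the block data and the undo data, sync both, and only then record an index entry that refers to them) can be sketched roughly as follows. This is a minimal POSIX-only illustration, not Bitcoin Core's code: `CommitFile` and `WriteBlockAndUndo` are hypothetical names, with `CommitFile` only loosely modelled on the `FileCommit` mentioned above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <unistd.h>  // fsync (POSIX)

// Hypothetical stand-in for a FileCommit-style helper: push stdio buffers to
// the OS, then ask the OS to write the file through to stable storage.
bool CommitFile(FILE* f)
{
    if (std::fflush(f) != 0) return false;
    return fsync(fileno(f)) == 0;
}

// Hypothetical: append the block and undo payloads, then sync both files.
// Returns true only if both were written and committed; a caller would
// record the index entry pointing at them only after this succeeds.
bool WriteBlockAndUndo(const std::string& blockPath, const std::string& undoPath,
                       const std::string& block, const std::string& undo)
{
    FILE* bf = std::fopen(blockPath.c_str(), "ab");
    if (!bf) return false;
    FILE* uf = std::fopen(undoPath.c_str(), "ab");
    if (!uf) { std::fclose(bf); return false; }

    // Write both payloads first...
    bool ok = std::fwrite(block.data(), 1, block.size(), bf) == block.size() &&
              std::fwrite(undo.data(), 1, undo.size(), uf) == undo.size();
    // ...then sync both files.
    ok = ok && CommitFile(bf);
    ok = ok && CommitFile(uf);

    // fclose can also report a write error, so its result is checked too.
    ok = (std::fclose(bf) == 0) && ok;
    ok = (std::fclose(uf) == 0) && ok;
    return ok;
}
```

The point of the sketch is only the ordering: if the index entry were written before either sync completed, a crash could leave the index pointing at undo data that never reached the disk, which would show up as a checksum mismatch on the next start-up check.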

    I’m writing a fix.

  6. laanwj commented at 10:04 am on November 5, 2015: member

    Looking at the debug.log I saved here I noticed something peculiar:

    0001b070  65 3d 33 2e 39 4d 69 42  28 39 37 37 74 78 29 0d  |e=3.9MiB(977tx).|
    0001b080  0a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    0001b090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    *
    00030d40  32 30 31 35 2d 31 30 2d  33 31 20 31 33 3a 35 31  |2015-10-31 13:51|
    

    The crash also left a whole block of \x00 bytes at the end of the log. I suppose the space was allocated but the data wasn’t actually written yet. It is to be expected as we don’t ever sync the debug log to disk (just flush it, which flushes only application-side buffers), but this may give further insight into the kind of corruption.
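
A region like this can also be located programmatically rather than by reading a hex dump. The sketch below is a hypothetical diagnostic helper, not part of Bitcoin Core. `FindNulRuns` is an invented name, and it assumes the log contents have already been read into a `std::string`. It reports every run of consecutive `\x00` bytes of at least a given length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns (offset, length) for every run of at least minLen consecutive
// NUL bytes in data. Hypothetical helper for inspecting a damaged log.
std::vector<std::pair<std::size_t, std::size_t>>
FindNulRuns(const std::string& data, std::size_t minLen)
{
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    std::size_t i = 0;
    while (i < data.size()) {
        if (data[i] != '\0') { ++i; continue; }
        const std::size_t start = i;
        // Advance to the first non-NUL byte (or end of data).
        while (i < data.size() && data[i] == '\0') ++i;
        if (i - start >= minLen) runs.emplace_back(start, i - start);
    }
    return runs;
}
```

Applied to the log shown above, such a scan would report one run starting at offset 0x1b081 (right after the `0d 0a` line ending) and extending to just before 0x30d40, where the next timestamp begins.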

  7. laanwj closed this on Nov 5, 2015

  8. laanwj added the label Data corruption on Feb 9, 2016
  9. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC
