Fatal LevelDB error: Corruption: block checksum mismatch on Linux ext4 SATA SSDs #30692

cryptoquick opened this issue on August 21, 2024
  1. cryptoquick commented at 8:27 pm on August 21, 2024: none

    Is there an existing issue for this?

    • I have searched the existing issues

    Current behaviour

    I’ve been struggling with the errors mentioned here for a little while now: #30159 (comment)

    I’m running this on a desktop machine, not in the cloud, and on external SATA ext4 disks because my main system drive is formatted with Btrfs, which doesn’t work very well with key-value databases, I’ve found.

    Expected behaviour

    I get this error during IBD and also during reindexing.

    Steps to reproduce

    This is the command I’m running:

    bitcoind -server -txindex=1 -datadir=/mnt/ThreeEight/bitcoind-mainnet -rpccookiefile=/mnt/ThreeEight/bitcoind-mainnet/.cookie

    Relevant log output

    2024-08-21T19:18:30Z UpdateTip: new best=0000000000000000000187d6a944e562d55881b738db06f0580d138c6af52754 height=856674 version=0x2192a000 log2_work=95.095637 tx=1058965890 date='2024-08-14T01:14:02Z' progress=0.995862 cache=253.5MiB(2160229txo)
    2024-08-21T19:18:30Z UpdateTip: new best=0000000000000000000197cc7a6d13c63cd086ca46372cc94577ff62fa56d568 height=856675 version=0x3e000000 log2_work=95.095650 tx=1058971585 date='2024-08-14T01:18:33Z' progress=0.995863 cache=253.8MiB(2162814txo)
    2024-08-21T19:18:30Z LevelDB read failure: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/chainstate/873236.ldb
    2024-08-21T19:18:30Z Fatal LevelDB error: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/chainstate/873236.ldb
    2024-08-21T19:18:30Z You can use -debug=leveldb to get more complete diagnostic messages
    2024-08-21T19:18:30Z Error: Error reading from database, shutting down.
    Error: Error reading from database, shutting down.
    2024-08-21T19:18:30Z Error reading from database: Fatal LevelDB error: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/chainstate/873236.ldb
    fish: Job 1, 'bitcoind -server -txindex=1 -da…' terminated by signal SIGABRT (Abort)
    

    How did you obtain Bitcoin Core

    Package manager

    What version of Bitcoin Core are you using?

    v27.1.0

    Operating system and version

    Arch Linux

    Machine specifications

    AMD 5950 CPU, 128GB DDR4, 4TB NVMe for OS, 2TB & 3.8 TB 2.5" SATA SSDs

  2. sipa commented at 9:16 pm on August 21, 2024: member
    This looks like a LevelDB corruption inside the txindex index. Does -reindex wipe those? If not, that would explain why a reindex doesn’t fix the situation.
  3. maflcko commented at 6:40 am on August 22, 2024: member

    This looks like a LevelDB corruption inside the txindex index.

    Are you sure? To me, the error message reads Error reading from database: Fatal LevelDB error: Corruption: block checksum mismatch: $DATADIR/chainstate/873236.ldb, whereas an index corruption should happen in $DATADIR/indexes/, according to https://github.com/bitcoin/bitcoin/blob/master/doc/files.md#data-directory-layout, no?

    Btrfs, which doesn’t work very well with key-value databases, I’ve found.

    Are there any observable downsides?

    on external SATA ext4 disks

    What is the exact setup and connection, given that you are using several storage units?

    Without further information, my guess would be that the external USB cable (or external connection) is flaky, but this is just a blind guess.
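
    The path-based distinction drawn earlier in this comment (chainstate vs. indexes/) can be checked mechanically. The following is only a minimal sketch, assuming the log message format shown in this thread; the function name classify_corruption and its return labels are hypothetical and are not part of Bitcoin Core.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Matches the corruption message format seen in the logs in this thread.
CORRUPTION_RE = re.compile(
    r"Corruption: block checksum mismatch: (?P<path>\S+\.ldb)"
)


def classify_corruption(line: str, datadir: str) -> Optional[str]:
    """Name the database a corrupted .ldb file belongs to (hypothetical helper).

    Returns None when the line is not a checksum-mismatch message.
    """
    match = CORRUPTION_RE.search(line)
    if match is None:
        return None
    try:
        relative = PurePosixPath(match.group("path")).relative_to(datadir)
    except ValueError:
        # The reported file does not live under the given data directory.
        return "outside datadir"
    parts = relative.parts
    if not parts:
        return "unknown"
    if parts[0] == "chainstate":
        return "chainstate"
    if parts[0] == "indexes" and len(parts) > 2:
        # e.g. indexes/txindex/052774.ldb -> "index:txindex"
        return "index:" + parts[1]
    return "unknown"
```

    Feeding it each "Fatal LevelDB error" line from debug.log, together with the value passed to -datadir, shows whether the failures hit one database or several.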

  4. maflcko commented at 6:47 am on August 22, 2024: member

    Bitcoin Core makes heavy use of CPU, RAM and storage IO. Hardware defects might only become visible when running Bitcoin Core. You might want to check your hardware for defects.

    • Use software such as memtest86 to check your RAM.
    • Use software such as linpack, or Prime95 to check the CPU behaviour under load.
    • Use software such as smartctl, fsck, badblocks, or CrystalDiskInfo to test your storage devices.

    Source: https://bitcoin.stackexchange.com/a/12206
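
    As a rough complement to the tools listed above, a write-and-read-back check can be sketched in a few lines. This is only an illustration, not a replacement for memtest86, smartctl, or badblocks: the function name write_read_verify and its parameters are hypothetical, and because the operating system's page cache may serve the reads, a clean result does not prove the disk is healthy.

```python
import hashlib
import os
import tempfile
from typing import List


def write_read_verify(directory: str, chunks: int = 64,
                      chunk_size: int = 1 << 20) -> List[int]:
    """Write random chunks to a scratch file, read them back, and return
    the indexes of chunks whose SHA-256 digest no longer matches.

    Hypothetical helper; reads may come from the page cache rather than
    from the device itself.
    """
    expected = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                data = os.urandom(chunk_size)
                expected.append(hashlib.sha256(data).digest())
                f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        bad = []
        with open(path, "rb") as f:
            for i, digest in enumerate(expected):
                if hashlib.sha256(f.read(chunk_size)).digest() != digest:
                    bad.append(i)
        return bad
    finally:
        os.remove(path)  # always clean up the scratch file
```

    Calling write_read_verify on a directory of the suspect disk returns an empty list when every chunk reads back intact; any index in the list points at a mismatch somewhere between RAM, controller, cable, and disk.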

  5. maflcko added the label Block storage on Aug 22, 2024
  6. maflcko added the label Data corruption on Aug 22, 2024
  7. maflcko added the label Questions and Help on Aug 22, 2024
  8. sipa commented at 9:34 am on August 22, 2024: member
    @maflcko Ah yes, it is the version here that had a corruption in the txindex: https://bitcoin.stackexchange.com/q/123999/208
  9. cryptoquick commented at 11:00 am on August 22, 2024: none

    Apologies, I meant to say internal SATA drives. There’s no USB enclosure, they’re inside the case and plugged into SATA 3.

    I tried upgrading the BIOS. I’ll see if that helps.

  10. cryptoquick commented at 8:40 pm on August 22, 2024: none

    I ran into the error earlier on in the IBD this time. It seems like there’s no rhyme or reason as to why this is occurring.

    2024-08-22T12:58:13Z UpdateTip: new best=0000000000000000011a06e4b1a3c497d247f785a4899e44ae800a9236438e16 height=424472 version=0x20000000 log2_work=85.108754 tx=148148770 date='2016-08-10T00:15:52Z' progress=0.137300 cache=124.8MiB(1146194txo)
    2024-08-22T12:58:13Z UpdateTip: new best=0000000000000000003a58cdf5401248a1330480a1c9b99440a5f974fb61ce17 height=424473 version=0x30000000 log2_work=85.108784 tx=148149386 date='2016-08-10T00:19:04Z' progress=0.137301 cache=124.8MiB(1144515txo)
    2024-08-22T12:58:13Z Fatal LevelDB error: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/indexes/txindex/052774.ldb
    2024-08-22T12:58:13Z You can use -debug=leveldb to get more complete diagnostic messages
    2024-08-22T12:58:13Z

    ************************
    EXCEPTION: 15dbwrapper_error
    Fatal LevelDB error: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/indexes/txindex/052774.ldb
    bitcoin in scheduler



    ************************
    EXCEPTION: 15dbwrapper_error
    Fatal LevelDB error: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/indexes/txindex/052774.ldb
    bitcoin in scheduler

    terminate called after throwing an instance of 'dbwrapper_error'
      what():  Fatal LevelDB error: Corruption: block checksum mismatch: /mnt/ThreeEight/bitcoind-mainnet/indexes/txindex/052774.ldb
    fish: Job 1, 'bitcoind -server -txindex=1 -da…' terminated by signal SIGABRT (Abort)
    

    I’m not quite sure what to do.

  11. maflcko commented at 6:13 am on August 23, 2024: member

    I’m not quite sure what to do.

    Given that the corruption happens in more than one database (txindex, chainstate), a hardware or software issue on your side is most likely.

    I’d try to check if the internal cable is properly attached and the connector isn’t dusty or dirty. Then I’d try some stuff from #30692 (comment) with some caution (Backups are generally recommended, especially if data corruption is likely).

  12. github12101 commented at 11:47 am on August 24, 2024: none
    I am also inclined to say this is a hardware error. I had these and it turned out to be unreliable RAM. Put that database on Btrfs; it should never have a checksum corruption error there. If it does, on Btrfs, and you have kernel errors about checksums, then this is 100% a hardware problem. I run a full node on Btrfs and I don’t have any problems.
  13. cryptoquick commented at 1:43 pm on August 27, 2024: none
    I’m trying to check my RAM using memtest86, but I’m having trouble booting into that too. Give me a bit to troubleshoot.
  14. Tajuras commented at 3:00 pm on August 29, 2024: none
    I have the same problem, with Windows 10 and Windows 11. The SSD is a SanDisk 2 TB. The computer is a Ryzen 3900X and absolutely stable. The error occurs in version 27.1. Since I rolled back to version 25, it has been completely stable.
  15. maflcko commented at 3:04 pm on August 29, 2024: member

    I have the same problem, with Windows 10 and Windows 11. The SSD is a SanDisk 2 TB. The computer is a Ryzen 3900X and absolutely stable. The error occurs in version 27.1. Since I rolled back to version 25, it has been completely stable.

    Are you saying this is reproducible in 27.1, or did the error happen only once with 27.1?

  16. Tajuras commented at 3:10 pm on August 29, 2024: none
    It happened to me only in 27.1. I tested on 2 machines. On a standard HDD it takes more time to occur. On the one with the SanDisk 2 TB SSD, it occurs fast (the same day, after a few hours of running). My conclusion is that the problem might be in version 27.1. I gave up on it because every time the error occurs I have to reindex the chainstate, which takes 1-2 days. Making a copy from a working node is a pain in the ass too. Only the chainstate seems to get corrupted. I tested the FAT32 and NTFS file systems. I am a computer programmer and have 3 different nodes on 3 computers. Currently I am only running version 25.
  17. Tajuras commented at 3:15 pm on August 29, 2024: none
    My main machine is a Ryzen with 32 GB of DDR4 in 2 sticks of memory. The other machine is a Xeon with 32 GB of quad-channel DDR3. So I don’t think it’s memory, because the 2 computers are stable. I can run Prime95 on both for 2 hours with no problem.
  18. maflcko commented at 3:19 pm on August 29, 2024: member
    Which type of Sandisk SSD is it? The “Extreme” ones are known to eat the data, according to https://duckduckgo.com/?q=SSD+Sandisk+Extreme+data+loss
  19. Tajuras commented at 3:22 pm on August 29, 2024: none

    SanDisk SSD PLUS 2000GB : 2000,3 GB Serial: 23281C800047 Standard: ACS-3 | ACS-2 Revision 3 S.M.A.R.T., APM, NCQ, TRIM, DevSleep, GPL SATA/600 | SATA/600

    From CrystalDiskInfo

  20. Tajuras commented at 3:26 pm on August 29, 2024: none
    I will wait for version 28 to come out and test it again. For now I am using v25 with this SanDisk 2 TB, and it has been stable for about 1 month.
    I leave Bitcoin Core running with Electrum Personal Server all day while I am working.
  21. cryptoquick commented at 10:01 pm on August 30, 2024: none
    @Tajuras Which version of 25 are you running? I tried both 26.2 and 25.2 and I get the same problem with both.
  22. github12101 commented at 11:01 pm on August 30, 2024: none

    @Tajuras Which version of 25 are you running? I tried both 26.2 and 25.2 and I get the same problem with both.

    Have you memtested your machine yet?

  23. cryptoquick commented at 11:03 pm on August 30, 2024: none

    Have you memtested your machine yet?

    I’ve been having trouble with getting the grub configuration correct to actually boot into it. I’ll need to use a USB drive.

    How long do you recommend I run the test?

  24. github12101 commented at 11:35 pm on August 30, 2024: none

    Have you memtested your machine yet?

    I’ve been having trouble with getting the grub configuration correct to actually boot into it. I’ll need to use a USB drive.

    How long do you recommend I run the test?

    Running from USB will definitely simplify the procedure. I recommend running for several hours, maybe overnight. Also, get your storage media scanned for errors. SMART long scan + ext4 filesystem check. Also, you were saying bitcoind doesn’t work very well with Btrfs, can you check again and report back? Let’s see if you will be getting any errors. I am running bitcoind on Btrfs no problem, like previously mentioned.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC