Heap corruption while loading from bootstrap.dat in master? #4345

laanwj opened this issue on June 16, 2014
  1. laanwj commented at 6:07 AM on June 16, 2014: member

    bsm117532 on IRC reported the following crashes (7 of them) while bootstrapping a clean node with master:

    http://pastebin.com/H44pAkkR

    It looks like some kind of heap corruption issue. He is going to try with 0.9.2 to see if the issue exists there as well. I suppose not; it most likely has to do with one of the recent changes on master.

    Edit: his system is a 4-core Intel Core 2 Quad with 8 GB of RAM, so default parallelism will have been 4.

  2. laanwj added the label Bug on Jun 16, 2014
  3. laanwj added the label UTXO Db and Indexes on Jun 16, 2014
  4. jgarzik commented at 12:21 PM on June 16, 2014: contributor

    FWIW that backtrace appears to go off the rails inside leveldb, IIRC.

  5. mcelrath commented at 2:55 PM on June 16, 2014: none

    Testing with 0.9.2, it did not crash (overnight) reloading the bootstrap.dat, but it ended at block 278356 with a ton of orphan blocks, and not downloading anything despite having lots of peers (26).

    This is what happened to me with 0.9.1...I was unable to synchronize with the network at all because it stopped downloading blocks. Someone on IRC suggested that my blockchain was corrupted, which led to me trying to reload everything. Right now I'm running a -reindex.

    I don't understand why 0.9.1 and 0.9.2 ended up stalled in downloading the blockchain (orphan blocks?), which seems to be a separate problem from the SIGSEGVs above. For now I will bisect and/or valgrind to find the memory corruption, but I'd appreciate suggestions about what else is going on that is preventing me from downloading the entire blockchain.

  6. jgarzik commented at 3:18 PM on June 16, 2014: contributor

    Network blockchain download stalls are (unfortunately) common and have nothing to do with corruption.

    Just stop and restart, if you get impatient.

  7. mcelrath commented at 3:31 PM on June 16, 2014: none

    By "stall" I mean more than 3 weeks, many (at least 10) restarts, and never progressing past a particular block. The cause seemed to be this:

    2014-06-14 20:00:23 CheckForkWarningConditions: Warning: Large valid fork found forking the chain at height 285298 (0000000000000000ea7984d2eef34d494e5ec5c31607f396f4a2b1decb6a1ca7) lasting to height 305806 (00000000000000000018257d507dcb0365a9176fe3aaa0e9539b0d9a3a5b5570).

  8. laanwj commented at 3:44 PM on June 16, 2014: member

    That's strange (and abnormal). It's starting to sound like something is wrong on your machine causing database and/or memory corruption.

  9. mcelrath commented at 3:53 PM on June 16, 2014: none

    I've run memtest and not found anything. I'm writing a little python script to run sha256 in circles and check for inconsistencies...

  10. mcelrath commented at 3:35 PM on June 17, 2014: none

    Further examining the possibility of bad hardware, I wrote this little script. It detected a hardware problem so I under-clocked my CPU and the problem seems to have gone away. I did see crashes with v0.9.2 but right now it is running and synchronized.

    I see a few other "issues" involving ORPHAN BLOCK (etc). It might be a good idea to hand this script to people who report problems synchronizing the blockchain to check their hardware. I had to run it for several hours before it reported an error, but bitcoin is very sensitive to a single hash being computed incorrectly.

    #!/usr/bin/env python3
    
    # Repeatedly run a sha256 on random data.  Keeps a rolling buffer of the last
    # <buflen> hashes and re-checks them.  Prints an error ONLY if a mismatch is
    # found.  If a mismatch is found, you have a hardware problem.
    
    from hashlib import sha256
    from collections import deque
    import random
    
    buflen = 100000
    hashbuf = deque(maxlen=buflen)
    
    # Seed the buffer with hashes of the decimal strings "0" .. str(buflen - 1).
    for i in range(buflen):
        k = str(i)
        hashbuf.append([k, sha256(k.encode()).hexdigest()])
    
    while True:
        # Re-hash the oldest entry and compare it with the stored digest.
        k, khash = hashbuf.popleft()
        pophash = sha256(k.encode()).hexdigest()
        if pophash != khash:
            print("ERROR: sha256(%s) = %s does not match:" % (k, khash))
            print("       sha256(%s) = %s" % (k, pophash))
        # Replace it with the hash of a fresh 1000-bit random number.
        k = str(random.getrandbits(1000))
        khash = sha256(k.encode()).hexdigest()
        hashbuf.append([k, khash])
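
    For handing the check to other people, a bounded variant that stops after a fixed number of rounds and returns a mismatch count may be easier to use than an endless loop. The following is only a sketch, assuming Python 3; the names `sha256_hex` and `count_mismatches` and the `rounds` and `seed` parameters are hypothetical and not part of the script above.

```python
from collections import deque
from hashlib import sha256
import random


def sha256_hex(text):
    """Return the hex SHA-256 digest of an ASCII string."""
    return sha256(text.encode("ascii")).hexdigest()


def count_mismatches(rounds, buflen=1000, seed=None):
    """Re-check a rolling buffer of hashes `rounds` times.

    Returns how many stored digests failed to match a fresh recomputation.
    On healthy hardware this should be 0.
    """
    rng = random.Random(seed)  # hypothetical: seed only makes the inputs repeatable
    # Fill the buffer with (input, digest) pairs for "0" .. str(buflen - 1).
    hashbuf = deque((str(i), sha256_hex(str(i))) for i in range(buflen))
    mismatches = 0
    for _ in range(rounds):
        # Recompute the oldest entry and compare with what was stored earlier.
        data, stored = hashbuf.popleft()
        if sha256_hex(data) != stored:
            mismatches += 1
        # Replace it with the hash of a fresh 1000-bit random number.
        data = str(rng.getrandbits(1000))
        hashbuf.append((data, sha256_hex(data)))
    return mismatches
```

    Calling `count_mismatches(rounds=1000000, buflen=100000)` runs a finite check. Since the error above took several hours to appear, a short run that returns 0 does not prove the hardware is sound.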
    

    Please close this issue.

  11. laanwj commented at 3:42 PM on June 17, 2014: member

    Ok, thanks for letting us know.

  12. laanwj closed this on Jun 17, 2014

  13. laanwj commented at 6:26 AM on June 18, 2014: member

    FYI I've run various stress tests with master, importing bootstrap.dat files with different degrees of parallelism over the last few days, and not one crash or instance of corruption.

  14. MarcoFalke locked this on Sep 8, 2021
