Assert inside leveldbwrapper to avoid continuing after failure. #5621

pull gmaxwell wants to merge 1 commits into bitcoin:master from gmaxwell:leveldb_failure changing 1 files +2 −0
  1. gmaxwell commented at 3:52 PM on January 8, 2015: contributor

    Database failure due to faulty hardware could leave users on a fork where they are vulnerable to opportunistic attack because they are rejecting the longest chain.

    If there is a software fault in the OS or LevelDB that causes many nodes to experience database failure at once its also important for overall consensus stability that the failing nodes shut down and not continue operating.

    In theory, the exceptions in leveldbwrapper should cause shutdowns but in practice the software calls into that database from many places, some of which throw away all exceptions.

    A more organized approach to errors should be used in the long-term, but this works for the moment.

  2. Assert inside leveldbwrapper to avoid continuing after failure.
    Database failure due to faulty hardware could leave users on a fork where
     they are vulnerable to opportunistic attack because they are rejecting
     the longest chain.
    
    If there is a software fault in the OS or LevelDB that causes many nodes
     to experience database failure at once its also important for overall
     consensus stability that the failing nodes shut down and not continue
     operating.
    
    In theory, the exceptions in leveldbwrapper should cause shutdowns but
     in practice the software calls into that database from many places, some
     of which throw away all exceptions.
    
    A more organized approach to errors should be used in the long-term,
     but this works for the moment.
    513fc67cd6
  3. gmaxwell commented at 3:53 PM on January 8, 2015: contributor

    Tested by running on a faulty host that was reliably getting LevelDB checksum errors and continuing operation afterwards.

  4. gmaxwell added this to the milestone 0.10.0 on Jan 8, 2015
  5. jgarzik commented at 3:55 PM on January 8, 2015: contributor

    ut ACK

  6. laanwj commented at 8:47 AM on January 9, 2015: member

    So, #5619 or this one? Or both?

  7. laanwj added the label Bug on Jan 9, 2015
  8. sipa commented at 2:23 PM on January 9, 2015: member

    Not both. This one will cause an assert before #5619 has a chance to catch anything anyway.

  9. gmaxwell commented at 2:55 PM on January 9, 2015: contributor

    I'm testing 5619 on my faulty host. So far it successfully has failed one of one attempts, another one is ongoing. (5621 was at 3 of 3 shortly after opening the PR) Sorry, the host isn't fast.

    I would accept 5619 if, after a more testing we're not surprised by any additional places where an exception here could escape. Though, ... I'm not sure we should be using exceptions for critical failures while we are still using them for normal conditions. Eliminating them from normal conditions would require a substantial rewrite of the network deserialization code, so asserting on the critical failures is the thing to do if you buy into my argument that we shouldn't be mixing them.

  10. laanwj removed this from the milestone 0.10.0 on Jan 12, 2015
  11. gmaxwell commented at 4:14 PM on January 12, 2015: contributor

    #5619 was merged.

  12. gmaxwell closed this on Jan 12, 2015

  13. MarcoFalke locked this on Sep 8, 2021
Labels

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-18 21:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me