Log explicit error message when coindb is found in inconsistent state #28350

pull fjahr wants to merge 1 commits into bitcoin:master from fjahr:202308-coindb-error changing 1 files +3 −0
  1. fjahr commented at 11:29 am on August 27, 2023: contributor

    While doing manual testing on assumeutxo this week I managed to put the coindb into an inconsistent state twice. For a normal user, this can also happen if their computer crashes during a flush or if they try to stop their node during a flush and then get tired of waiting and just shut their computer down or kill the process. It’s an edge case but I wouldn’t be surprised if this does happen more often when assumeutxo gets used more widely because there might be multiple flushes happening during loading of the UTXO set in the beginning and users may think something is going wrong because of the unexpected wait or they forgot some configs and want to start over quickly.

    The problem is, when this happens at first the node starts up normally until it’s time to flush again and then it hits an assert that the user can not understand.

    02023-08-25T16:31:09Z [httpworker.0] [snapshot] 52000000 coins loaded (43.30%, 6768 MB)
    12023-08-25T16:31:16Z [httpworker.0] Cache size (7272532192) exceeds total space (7256510300)
    22023-08-25T16:31:16Z [httpworker.0] FlushSnapshotToDisk: flushing coins cache (7272 MB) started
    3Assertion failed: (old_heads[0] == hashBlock), function BatchWrite, file txdb.cpp, line 126.
    4Abort trap: 6
    

    We should at least log an error message that gives users a hint of what the problem is and what they can do to resolve it. I am keeping this separate from the assumeutxo project since this issue can also happen during any regular flush.

  2. DrahtBot commented at 11:29 am on August 27, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK jonatack, ryanofsky, jamesob, achow101

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label CI failed on Aug 27, 2023
  4. DrahtBot removed the label CI failed on Aug 27, 2023
  5. in src/txdb.cpp:127 in be1b6d5122 outdated
    122@@ -123,6 +123,9 @@ bool CCoinsViewDB::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlock, boo
    123         // We may be in the middle of replaying.
    124         std::vector<uint256> old_heads = GetHeadBlocks();
    125         if (old_heads.size() == 2) {
    126+            if (old_heads[0] != hashBlock) {
    127+                LogPrintLevel(BCLog::COINDB, BCLog::Level::Error, "The coindb detected an inconsistent state, likely due to a previous crash or shutdown. You will need to restart with -reindex-chainstate or -reindex.\n");
    


    jonatack commented at 1:39 pm on August 27, 2023:

    The log output will be prefixed by [coindb:error], so for the log message perhaps go with the more user-friendly s/coindb/coins database/.

    0                LogPrintLevel(BCLog::COINDB, BCLog::Level::Error, "The coins database detected an inconsistent state, likely due to a previous crash or shutdown. You will need to restart bitcoind with the -reindex-chainstate or -reindex configuration option.\n");
    

    Wo


    fjahr commented at 6:17 pm on August 27, 2023:
    Done, thanks!
  6. jonatack commented at 1:45 pm on August 27, 2023: contributor
    utACK be1b6d5122355e2eff6bde4efd88a51c2740761e
  7. log: Print error message when coindb is in inconsistent state df60de770d
  8. fjahr force-pushed on Aug 27, 2023
  9. jonatack commented at 6:40 pm on August 27, 2023: contributor
    ACK df60de770d8e27c00eced29d8a4a4bce7c6e4b30
  10. ryanofsky approved
  11. ryanofsky commented at 4:20 pm on August 29, 2023: contributor

    Code review ACK df60de770d8e27c00eced29d8a4a4bce7c6e4b30

    This is an improvement, but I would think just killing the process should not put the coindb in an inconsistent state that would require a reindex. Am I wrong about that, or is there more work that could be done here to debug the issue and update the database atomically?

  12. jamesob approved
  13. jamesob commented at 4:37 pm on August 29, 2023: member
    Code review ACK df60de770d8e27c00eced29d8a4a4bce7c6e4b30
  14. glozow added the label Utils/log/libs on Sep 1, 2023
  15. achow101 commented at 5:17 pm on September 1, 2023: member
    ACK df60de770d8e27c00eced29d8a4a4bce7c6e4b30
  16. achow101 merged this on Sep 1, 2023
  17. achow101 closed this on Sep 1, 2023

  18. luke-jr commented at 8:33 pm on September 5, 2023: member

    For a normal user, this can also happen if their computer crashes during a flush or if they try to stop their node during a flush and then get tired of waiting and just shut their computer down or kill the process.

    It shouldn’t… That would be a bug.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 07:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me