peers.dat easily wiped out after unclean shutdown/hasty relaunch #4669

issue zw opened this issue on August 10, 2014
  1. zw commented at 1:16 AM on August 10, 2014: contributor

    I've found that peers.dat gets wiped out from time to time, manifesting as "Loaded 0 addresses from peers.dat <n>ms" when it previously had hundreds of peers recorded. I've associated this power outages/kernel oopses, but seen it more often. I think I've tracked this down: StopNode() calls DumpAddresses() but doesn't account for how we started up. When we'd started up and found .lock held, we reach DumpAddresses() with zero peers and dump an empty database, overwriting the good one.

    It's easy to forget about .lock after an outage. When you're tinkering and bitcoin-cli stop and hastily relaunch bitcoind, which I used to do before I realised (a) that bitcoin-cli stop is asynchronous, and (b) just how slow shutdown can sometimes be on my hardware, you trigger the same thing, which I think accounts for every other time I've seen it. Probably annoys me more because working Tor HS peers are harder to find.

    Maybe it's as simple as wrapping the dump in if (addrman.size() != 0) { ... }, but I haven't tested it.

  2. Diapolo commented at 12:18 PM on August 10, 2014: none

    DumpAddresses() is run 15 minutes after starting the node AFAIK. So it shouldn't directly overwrite peers.dat. All seems unrelated to the .lock file. That file can be there, if it's locked by another bitcoin process we don't startup another instance. If we crash-reboot that lock is gone and another instance can be started.

    Mind explaining it again, perhaps I don't understand. List the exact steps and what happend to get you an empty peers.dat.

  3. jgarzik commented at 12:43 PM on August 10, 2014: contributor

    @Diapolo DumpAddresses() also runs at shutdown.

    We can see the following code logic:

    • bitcoind.cpp calls AppInit2()
    • AppInit2() returns false upon hitting .lock (step 4), and does not read addrman db (step 10).
    • bitcoind.cpp calls Shutdown() -> StopNode() -> DumpAddresses()

    In fact, we flush several databases we have not read, if .lock is encountered!

    Definitely a bug.

  4. thereisnofromaddress commented at 4:07 PM on August 10, 2014: none

    I can confirm that I experienced this with an un-clean shutdown. Same result with 0 addresses read.

  5. jgarzik added the label Bug on Aug 11, 2014
  6. jgarzik added the label Priority Medium on Aug 11, 2014
  7. jgarzik added the label P2P on Aug 11, 2014
  8. laanwj commented at 11:47 AM on September 18, 2014: member

    In fact, we flush several databases we have not read, if .lock is encountered!

    I've looked at this in more detail, and it is not as bad as it seems:

    • GenerateBitcoins(false) does nothing if GenerateBitcoins(true) was never called
    • StopRPCThreads() does nothing if StartRPCThreads() wasn't called (rpc_io_service == NULL)
    • pwalletMain: is still 0, so nothing happens with it, and also not bitdb.Flush()
    • pblocktree: is still 0, so no Flush() happens with it

    However:

    • StopNode() should check whether the node was initialized, and do nothing if it wasn't- anologous to StopRPCThreads. Because this doesn't happen, an empty addresses file is written.
    • Same for mempool fee estimation logic. Because this doesn't happen, an empty fee estimates file is written.
    • boost::filesystem::remove(GetPidFile()) should only be called if we created the PID file in the first place. But more is wrong here, the PID file is created in bitcoind.cpp far before the datadir lock is checked.

    We should add a comment to Shutdown() that it must handle cases in which AppInit2() failed partway.

  9. laanwj commented at 12:14 PM on September 18, 2014: member

    Another option would be to change the initialization logic so that Shutdown() is not run when AppInit2() fails (ie, everything takes care of its own cleanup on failure). This has more code impact, and harder to test, but may be saner in the long term.

  10. laanwj referenced this in commit 7d3e00f6a8 on Sep 18, 2014
  11. laanwj referenced this in commit 94064710b9 on Sep 18, 2014
  12. gavinandresen commented at 1:29 PM on September 18, 2014: contributor

    How about simplifying and doing a quick, unclean, exit(1) if anything in startup initialization fails?

  13. jgarzik commented at 1:30 PM on September 18, 2014: contributor

    Was thinking similar to @gavinandresen -- throw an exception.

  14. laanwj commented at 2:10 PM on September 18, 2014: member

    That sounds very ugly to me. It also breaks any assumption held by the GUI (makes it impossible to report anything back to the user, for example).

    Also - things that are initialized must be shut down properly. IE if initializing the wallet fails but the block database has already been fiddled with, the block database must be closed on failure.

  15. laanwj closed this on Oct 14, 2014

  16. reddink referenced this in commit 995c716981 on May 27, 2020
  17. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 18:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me