File descriptor problem, causing leveldb crash #12323

laanwj opened this issue on February 1, 2018
  1. laanwj commented at 11:26 am on February 1, 2018: member

    Two users have reported file descriptor issues with 0.16.0 that they did not have with 0.15.1. The error looks like:

    IO Error: ...chainstate/244997.ldb: Bad file descriptor", system error while flushing: Database I/O error.
    

    The process then shuts down. “Bad file descriptor” could be caused by running into the file descriptor limit. Another possibility is runaway closing of file descriptors due to a bug somewhere else.

    • It is likely unrelated to #12274, which deals with conserving file descriptors under heavy HTTP load. None of the reporters have mentioned extreme RPC usage. Also, this should not have regressed since 0.15 because, as far as I know, the internal HTTP server had no significant changes.
    • It is likely not caused by leveldb itself using more file descriptors due to a larger database, because the same database with 0.15.1 does not exhibit the issue.
    • #11785 might be a work-around, although without knowing the root cause it is risky and might just be kicking the can down the road.
  2. laanwj added the label Resource usage on Feb 1, 2018
  3. laanwj added this to the milestone 0.16.0 on Feb 1, 2018
  4. TheBlueMatt commented at 2:23 pm on February 1, 2018: member
    Likely a duplicate of #12285, which may indicate that this is something like the net code randomly closing a socket fd that isn’t allocated to it.
  5. david60 commented at 3:14 pm on February 1, 2018: none
    Hello. CloseSocket may be called with hSocket uninitialised, at net.cpp:448 (not confirmed to be the cause of this bug, but it seems likely)
  6. laanwj commented at 3:21 pm on February 1, 2018: member
    @david60 Thanks! We’ll investigate. Closing an uninitialized value would perfectly explain the apparent buckshot close() behavior seen here.
  7. laanwj commented at 9:21 am on February 2, 2018: member
    @david60 That was indeed the problem, combined with endless retrying of failed oneshot connections. I’ll add you to the release credits. It should be fixed by #12326 + #12329.
  8. laanwj closed this on Feb 2, 2018

  9. MarcoFalke locked this on Sep 8, 2021
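
The failure mode discussed above (one part of a process closing a descriptor number it never owned, and an unrelated writer then failing with “Bad file descriptor”) can be reproduced in a few lines. This is only an illustrative sketch in Python, not the Bitcoin Core code: `INVALID_FD`, `close_fd` and `write_after_stray_close` are hypothetical names, and the stray value is set deliberately because an uninitialized variable cannot be reproduced portably. The sentinel check in `close_fd` shows the general defensive pattern of initializing a handle to an invalid value and refusing to close it. Whether the linked fixes work exactly this way is not something this sketch claims.

```python
import errno
import os
import tempfile

INVALID_FD = -1  # hypothetical sentinel meaning "no descriptor held"


def close_fd(fd: int) -> bool:
    """Close fd unless it is the sentinel; return True only if a close succeeded."""
    if fd == INVALID_FD:
        return False
    try:
        os.close(fd)
    except OSError:
        return False
    return True


def write_after_stray_close(path: str) -> int:
    """Open path, let unrelated code close the same fd number, then try to write.

    Returns the errno of the failed write, or 0 if the write succeeded.
    """
    db_fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    stray = db_fd  # stands in for a garbage value that happens to match db_fd
    close_fd(stray)  # the "network" side closes a descriptor it never owned
    try:
        os.write(db_fd, b"block data")
    except OSError as exc:
        return exc.errno
    os.close(db_fd)
    return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        err = write_after_stray_close(os.path.join(tmp, "demo.ldb"))
        print(errno.errorcode.get(err, err))  # prints "EBADF"
    print(close_fd(INVALID_FD))  # prints "False": the sentinel is never closed
```

The writer did nothing wrong here. Its descriptor was invalidated from elsewhere in the same process, which is why the error surfaces in the database layer rather than in the code that performed the stray close.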



