Severe memory leak on current master #11822

issue achow101 openend this issue on December 4, 2017
  1. achow101 commented at 8:27 am on December 4, 2017: member

    Sometime during normal operation, my node crashed (possibly due to this issue, but I do not know) and I had to perform a reindex. During this reindex, I noticed that it was consuming upwards of 26 GB of memory and constantly consuming more. Unfortunately this caused my computer to lock up and I had to reboot it to fix that. Starting Core again resulted in a reindex and the same behavior again.

    I am using a build of the current master branch on commit 00d25e90db06149fa456b0a8f15b7b68005ff9c5

    Machine info: Ubuntu 17.10 32 GB RAM

    I was going to run it in valgrind but it was taking too long to get anywhere remotely past initialization to the part where the memory leak was happening, so I killed that.

    Now I am doing git bisect to find where the leak was introduced. AFAIK this leak is not present in 0.15.1.

    I believe that @sipa experienced the same issue on his odroid but was not able to reproduce the problem.

  2. fanquake added the label Resource usage on Dec 4, 2017
  3. sipa commented at 8:39 am on December 4, 2017: member
    I managed to hit this issue twice on master as of a week ago, OOMing a 32 GiB RAM system - but not since.
  4. achow101 commented at 9:18 am on December 4, 2017: member
    A fairly quick and rather unscientific git bisect says that e545dedf72bff2bd41c93c93eb576929fce37112 (part of #10286) is the commit that introduces the problem
  5. jamesob commented at 8:10 pm on December 4, 2017: member

    I’ve recently hit OOM kills and surprisingly lengthy reindexing times while running -reindex at https://github.com/bitcoin/bitcoin/commit/a89221873a3ee2451c73b41bbe2d99d36f439d31. Can dig up old syslog output if that would be helpful.

    Edit: machine OOM’d was 8GB RAM.

  6. achow101 commented at 9:15 pm on December 4, 2017: member

    This appears to be an unbounded queue rather than a memory leak. From IRC

    0<BlueMatt> validation running ahead of validationinterface, you end up with a deep queue of shared_ptr<CBlock>s of all the shit you connected
    1<BlueMatt> not normally an issue, but reindex is particularly easy to hit there
    
  7. sipa added this to the milestone 0.16.0 on Dec 5, 2017
  8. jamesob commented at 5:53 pm on December 6, 2017: member

    #11824 definitely fixes some of the issue, but I’m still seeing characteristics of a leak. I’ve had HEAD of @TheBlueMatt’s branch running for roughly 20 hours and just hit another OOM kill an hour and a half ago.

    Currently, bitcoind is consuming ~4GB of memory (and climbing) despite being configured with dbcache=2048.

    selection_912

    Also currently, bitcoind is in the middle of a flurry of UpdateTip calls:

     02017-12-06 17:51:26 UpdateTip: new best=000000000000000004d83a24935d4cbeac752a6694c54b6b7458adbfad597930 height=353991 version=0x00000003 log2_work=82.693085 tx=66925319 date='2015-04-27 21:53:03' progress=0.241454 cache=1354.9MiB(9831263txo)
     12017-12-06 17:51:26   - Connect postprocess: 3.43ms [71.70s (2.66ms/blk)]
     22017-12-06 17:51:26 - Connect block: 190.19ms [4649.96s (172.82ms/blk)]
     32017-12-06 17:51:26   - Load block from disk: 84.67ms [950.08s]
     42017-12-06 17:51:26     - Sanity checks: 28.50ms [295.75s (10.99ms/blk)]
     52017-12-06 17:51:26     - Fork checks: 0.35ms [5.42s (0.20ms/blk)]
     62017-12-06 17:51:26       - Connect 1485 transactions: 146.39ms (0.099ms/tx, 0.035ms/txin) [2062.80s (76.66ms/blk)]
     72017-12-06 17:51:26     - Verify 4230 txins: 146.72ms (0.035ms/txin) [2068.85s (76.89ms/blk)]
     82017-12-06 17:51:26     - Index writing: 74.44ms [898.25s (33.38ms/blk)]
     92017-12-06 17:51:26     - Callbacks: 0.39ms [5.49s (0.20ms/blk)]
    102017-12-06 17:51:26   - Connect total: 251.71ms [3293.98s (122.42ms/blk)]
    112017-12-06 17:51:27   - Flush: 32.85ms [326.96s (12.15ms/blk)]
    122017-12-06 17:51:27   - Writing chainstate: 0.82ms [7.61s (0.28ms/blk)]
    
  9. jamesob commented at 6:53 pm on December 6, 2017: member

    bitcoind memory usage up to 92%, still running UpdateTips.

    selection_913

     02017-12-06 18:51:57 UpdateTip: new best=0000000000000000066614453ab2d15e48fb703fc08d32b81c1ef57ab5bdc03f height=365172 version=0x00000003 log2_work=83.070049 tx=76111954 date='2015-07-13 18:59:49' progress=0.274586 cache=565.1MiB(3504388txo)
     12017-12-06 18:51:57   - Connect postprocess: 6.17ms [110.58s (2.90ms/blk)]
     22017-12-06 18:51:57 - Connect block: 2932.15ms [8166.81s (214.42ms/blk)]
     32017-12-06 18:51:57 receive version message: /Satoshi:0.15.1/: version 70015, blocks=497965, us=73.241.180.136:8333, peer=645
     42017-12-06 18:51:57   - Load block from disk: 84.30ms [1459.33s]
     52017-12-06 18:51:57     - Sanity checks: 21.27ms [463.79s (12.18ms/blk)]
     62017-12-06 18:51:57     - Fork checks: 0.28ms [7.94s (0.21ms/blk)]
     72017-12-06 18:51:58       - Connect 1152 transactions: 1372.18ms (1.191ms/tx, 0.343ms/txin) [3601.05s (94.54ms/blk)]
     82017-12-06 18:51:58     - Verify 3997 txins: 1372.48ms (0.343ms/txin) [3610.31s (94.79ms/blk)]
     92017-12-06 18:51:58     - Index writing: 68.24ms [1372.24s (36.03ms/blk)]
    102017-12-06 18:51:58     - Callbacks: 0.26ms [7.95s (0.21ms/blk)]
    112017-12-06 18:51:58   - Connect total: 1463.99ms [5493.27s (144.22ms/blk)]
    122017-12-06 18:51:58   - Flush: 24.53ms [558.88s (14.67ms/blk)]
    132017-12-06 18:51:58   - Writing chainstate: 0.44ms [546.31s (14.34ms/blk)]
    142017-12-06 18:51:58 UpdateTip: new best=00000000000000000390dcaa5744640915209882458e99669ef7132b9e49d4a6 height=365173 version=0x00000003 log2_work=83.07008 tx=76113106 date='2015-07-13 19:18:53' progress=0.274591 cache=565.2MiB(3505570txo)
    152017-12-06 18:51:58   - Connect postprocess: 6.69ms [110.59s (2.90ms/blk)]
    162017-12-06 18:51:58 - Connect block: 1579.95ms [8168.39s (214.46ms/blk)]
    172017-12-06 18:51:59   - Load block from disk: 91.57ms [1459.42s]
    182017-12-06 18:51:59     - Sanity checks: 23.40ms [463.82s (12.18ms/blk)]
    192017-12-06 18:51:59     - Fork checks: 0.28ms [7.94s (0.21ms/blk)]
    202017-12-06 18:52:02       - Connect 1106 transactions: 3571.52ms (3.229ms/tx, 0.671ms/txin) [3604.62s (94.63ms/blk)]
    212017-12-06 18:52:02     - Verify 5319 txins: 3571.92ms (0.672ms/txin) [3613.88s (94.88ms/blk)]
    222017-12-06 18:52:02 Pre-allocating up to position 0xc00000 in rev00299.dat
    232017-12-06 18:52:02     - Index writing: 82.79ms [1372.32s (36.03ms/blk)]
    242017-12-06 18:52:02     - Callbacks: 0.27ms [7.95s (0.21ms/blk)]
    252017-12-06 18:52:02   - Connect total: 3681.05ms [5496.96s (144.31ms/blk)]
    262017-12-06 18:52:02   - Flush: 30.14ms [558.91s (14.67ms/blk)]
    272017-12-06 18:52:02   - Writing chainstate: 1.61ms [546.32s (14.34ms/blk)]
    282017-12-06 18:52:02 UpdateTip: new best=000000000000000002ce778c882e1c43c5d221da8c50a259be85b605ed6bf83f height=365174 version=0x00000003 log2_work=83.070111 tx=76114212 date='2015-07-13 19:16:22' progress=0.274595 cache=565.8MiB(3510024txo)
    292017-12-06 18:52:02   - Connect postprocess: 4.83ms [110.60s (2.90ms/blk)]
    302017-12-06 18:52:02 - Connect block: 3809.21ms [8172.20s (214.55ms/blk)]
    312017-12-06 18:52:02   - Load block from disk: 102.39ms [1459.53s]
    322017-12-06 18:52:02     - Sanity checks: 24.25ms [463.84s (12.18ms/blk)]
    332017-12-06 18:52:02     - Fork checks: 0.64ms [7.94s (0.21ms/blk)]
    342017-12-06 18:52:05       - Connect 1260 transactions: 3033.98ms (2.408ms/tx, 0.585ms/txin) [3607.66s (94.71ms/blk)]
    352017-12-06 18:52:05     - Verify 5185 txins: 3034.36ms (0.585ms/txin) [3616.91s (94.95ms/blk)]
    362017-12-06 18:52:06     - Index writing: 81.44ms [1372.40s (36.03ms/blk)]
    372017-12-06 18:52:06     - Callbacks: 0.45ms [7.95s (0.21ms/blk)]
    382017-12-06 18:52:06   - Connect total: 3142.66ms [5500.10s (144.39ms/blk)]
    392017-12-06 18:52:06   - Flush: 25.42ms [558.93s (14.67ms/blk)]
    402017-12-06 18:52:06   - Writing chainstate: 0.51ms [546.32s (14.34ms/blk)]
    412017-12-06 18:52:06 UpdateTip: new best=00000000000000000be8f3cefa606cde3289637d5b5a2d211f00de01026129e0 height=365175 version=0x00000003 log2_work=83.070142 tx=76115472 date='2015-07-13 19:19:11' progress=0.274599 cache=566.2MiB(3513332txo)
    422017-12-06 18:52:06   - Connect postprocess: 5.67ms [110.60s (2.90ms/blk)]
    432017-12-06 18:52:06 - Connect block: 3276.65ms [8175.47s (214.63ms/blk)]
    
  10. TheBlueMatt commented at 6:56 pm on December 6, 2017: member
    I’m currently trying to reproduce @jamesob’s issues…there is no question #11824 fixes the majority of the issue, but there’s something else hiding somewhere….massif couldn’t find it which implies heap fragmentation or so again, but I’m digging.
  11. sipa commented at 7:00 pm on December 6, 2017: member
    I don’t think heap fragmentation can result in several GB of wasted space. Normally I don’t see RSS go more than a few 100 MB above dbcache.
  12. achow101 commented at 7:08 pm on December 6, 2017: member
    I definitely saw the issue @jamesob described last night as my node crashed again (same crash as before) but after it had finished reindexing. It seems to take a very long time to hit this though. I also think that whatever is causing this other OOM is what caused my need to reindex in the first place.
  13. jamesob commented at 7:40 am on December 7, 2017: member
    Oh wow nevermind, apologies. Systemd was referencing /usr/local/bin/bitcoind and I forgot to make install when I recompiled for @TheBlueMatt’s branch, so I’ve been running on some recent revision of master. Oops, disregard. :blush:
  14. jonasschnelli commented at 6:53 am on December 16, 2017: contributor
    Just ran into this (significant leak/OOM issue) with current master on a Debian 9 (headless) pure bitcoin non VM machine with 32GB of ram.
  15. Sjors commented at 7:16 pm on December 21, 2017: member

    I added txindex=1 today and then started QT with -reindex=1. This resulted in a Killed: 9, followed by a new IBD the next time I started. It was using ~30 GB of memory according to the OSX activity monitor, while dbcache=8000 and my system only has 16 GB physical memory. My system was also much more unresponsive than it usually is when I have QT running, but I assumed that was just normal during the reindex process.

    This was on the SegWit wallet branch.

    The new IBD is behaving so far ~ 1 GB memory use in “2013”.

  16. sipa closed this on Dec 29, 2017

  17. sipa referenced this in commit d9fdac130a on Dec 29, 2017
  18. UdjinM6 referenced this in commit 871b8585ca on Mar 25, 2020
  19. codablock referenced this in commit c383bca1a6 on Mar 25, 2020
  20. ckti referenced this in commit ef6b4d8ffd on Mar 28, 2021
  21. MarcoFalke locked this on Sep 8, 2021
  22. gades referenced this in commit 9fb1e5b3a0 on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-06 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me