Memory leak loading legacy wallet (BDB 4.8.30) #27283

issue Sjors openend this issue on March 20, 2023
  1. Sjors commented at 12:01 pm on March 20, 2023: member

    On Ubuntu 22.10 (GNU/Linux 6.0.6-060006-generic x86_64) with gcc (12.2.0) and then running under valgrind (3.18.1).

    Loading a legacy wallet and then shutting down the node is the easiest way to reproduce.

    Can be seen on master f4e42a78c75719ad6a99962360ec67d92a563a9d, but see #27283 (comment) for when the leak was introduced. It does not happen with BDB 5.13.

     0./configure BDB_LIBS= --enable-debug --disable-asm 
     1make
     2 
     3valgrind leak-check=full src/bitcoind -regtest -nowallet -wallet=test_legacy
     4 
     5023-03-20T19:32:09Z [test_legacy] Releasing wallet
     62023-03-20T19:32:09Z Shutdown: done
     7==1744646== 
     8==1744646== HEAP SUMMARY:
     9==1744646==     in use at exit: 2,377 bytes in 12 blocks
    10==1744646==   total heap usage: 296,219 allocs, 296,207 frees, 103,153,206 bytes allocated
    11==1744646== 
    12==1744646== 160 bytes in 1 blocks are definitely lost in loss record 6 of 12
    13==1744646==    at 0x4844899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
    14==1744646==    by 0xE702AD: __os_malloc (in /home/sjors/dev/bitcoin2/src/bitcoind)
    15==1744646==    by 0xE455CC: __env_alloc (in /home/sjors/dev/bitcoin2/src/bitcoind)
    16==1744646==    by 0xEE3CC1: __lock_open (in /home/sjors/dev/bitcoin2/src/bitcoind)
    17==1744646==    by 0xE497EB: __env_attach_regions (in /home/sjors/dev/bitcoin2/src/bitcoind)
    18==1744646==    by 0xE49A8E: __env_open (in /home/sjors/dev/bitcoin2/src/bitcoind)
    19==1744646==    by 0xE49C01: __env_open_pp (in /home/sjors/dev/bitcoin2/src/bitcoind)
    20==1744646==    by 0xE16E7D: DbEnv::open(char const*, unsigned int, int) (in /home/sjors/dev/bitcoin2/src/bitcoind)
    21==1744646==    by 0x9EDDE8: wallet::BerkeleyEnvironment::Open(bilingual_str&) (bdb.cpp:164)
    22==1744646==    by 0x9EEF34: wallet::BerkeleyDatabase::Verify(bilingual_str&) (bdb.cpp:271)
    23==1744646==    by 0x9F41C8: wallet::MakeBerkeleyDatabase(fs::path const&, wallet::DatabaseOptions const&, wallet::DatabaseStatus&, bilingual_str&) (bdb.cpp:849)
    24==1744646==    by 0x9A074A: wallet::MakeDatabase(fs::path const&, wallet::DatabaseOptions const&, wallet::DatabaseStatus&, bilingual_str&) (walletdb.cpp:1228)
    25==1744646== 
    26==1744646== LEAK SUMMARY:
    27==1744646==    definitely lost: 160 bytes in 1 blocks
    28==1744646==    indirectly lost: 0 bytes in 0 blocks
    29==1744646==      possibly lost: 0 bytes in 0 blocks
    30==1744646==    still reachable: 2,217 bytes in 11 blocks
    31==1744646==         suppressed: 0 bytes in 0 blocks
    32==1744646== Reachable blocks (those to which a pointer was found) are not shown.
    33==1744646== To see them, rerun with: --leak-check=full --show-leak-kinds=all
    34==1744646== 
    35==1744646== For lists of detected and suppressed errors, rerun with: -s
    36==1744646== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
    

    Originally I found it inside a test by using the address sanitizer. I kept the original text below so the comments make sense:


    0src/test/test_bitcoin --run_test=wallet_tests
    

    Passed when configured without BDB, fails when configured with BDB:

     0./configure 'BDB_LIBS=-L/…' BDB_CFLAGS=-I/…/include CC=clang CXX=clang++ --enable-suppress-external-warnings --disable-asm --with-sanitizers=address --disable-fuzz-binary --without-gui
     1make -C src/test
     2src/test/test_bitcoin --run_test=wallet_tests
     3
     4 5
     6*** No errors detected
     7
     8=================================================================
     9==637422==ERROR: LeakSanitizer: detected memory leaks
    10
    11Direct leak of 320 byte(s) in 2 object(s) allocated from:
    12    [#0](/bitcoin-bitcoin/0/) 0x555592a03dbe in malloc (/home/sjors/dev/bitcoin/src/test/test_bitcoin+0x470dbe) (BuildId: 8a0fe1527885800d0ca318fefe6c7e66c608d5eb)
    13    [#1](/bitcoin-bitcoin/1/) 0x555595965e64 in __os_malloc (/home/sjors/dev/bitcoin/src/test/test_bitcoin+0x33d2e64) (BuildId: 8a0fe1527885800d0ca318fefe6c7e66c608d5eb)
    14
    15SUMMARY: AddressSanitizer: 320 byte(s) leaked in 2 allocation(s).
    16make[3]: *** [Makefile:21823: wallet/test/wallet_tests.cpp.test] Error 1
    

    There’s other BDB related -with-sanitizers=address issues: #22592, #19034.

    There’s currently no suppression file for the address sanitizer and I don’t know how to make one. It could make sense to suppress the ones we’ve found, keep a Github issue open for each of one to fix & remove the suppression.

    Without BDB I can run the full unit and functional test suite just fine, so that’s good news in the long run. But in the shorter run it’s good to be able to thoroughly test all the legacy -> descriptor migration code that’s being written.

  2. Sjors commented at 12:03 pm on March 20, 2023: member
  3. fanquake commented at 12:08 pm on March 20, 2023: member
    Doesn’t look like you’ve provided anything more actionable or useful than the last comment in #19034?
  4. fanquake commented at 12:12 pm on March 20, 2023: member
    Also if you are going to open issues like this, can you provide more information. i.e you don’t even tell us what version of the code this happens with. I guess we are meant to assume it’s the latest commit in master? What compiler versions did you use? What system are you running on?
  5. maflcko commented at 12:15 pm on March 20, 2023: member

    There’s currently no suppression file for the address sanitizer

    See https://github.com/bitcoin/bitcoin/blob/master/test/sanitizer_suppressions/lsan

  6. Sjors commented at 12:34 pm on March 20, 2023: member

    @fanquake that was 65840 bytes, this one is 320 bytes, so I assume it’s a different issue?

    I was going to add more details if anyone couldn’t reproduce it, but here we go:

    Found it while testing a PR, but reproduced on master @ 40e1c4d4024b8ad35f2511b2e10bf80c5531dfde.

    0Ubuntu clang version 15.0.6
    1Target: x86_64-pc-linux-gnu
    2
    3Ubuntu 22.10 (GNU/Linux 6.0.6-060006-generic x86_64)
    

    @MarcoFalke I’ll see if I can get that too work.

  7. maflcko commented at 12:36 pm on March 20, 2023: member
    This leak may be fixed in bdb5.3, but I haven’t re-checked this.
  8. Sjors commented at 12:58 pm on March 20, 2023: member
    Narrowed it down to BOOST_FIXTURE_TEST_CASE(CreateWallet, TestChain100Setup).
  9. fanquake commented at 1:03 pm on March 20, 2023: member

    I was going to add more details if anyone couldn’t reproduce it, but here we go

    In future, save us all wasting our time, and provide the (trivial to do so) details initially.

  10. Sjors commented at 1:09 pm on March 20, 2023: member

    That test can be further narrowed down:

    0BOOST_FIXTURE_TEST_CASE(CreateWallet, TestChain100Setup)
    1{
    2    WalletContext context;
    3    context.args = &m_args;
    4    context.chain = m_node.chain.get();
    5    auto wallet = TestLoadWallet(context);
    6    // TestUnloadWallet(std::move(wallet));
    7    // wallet = TestLoadWallet(context);
    8    TestUnloadWallet(std::move(wallet));
    9}
    

    Uncommenting the extra unload & load will trigger the sanitizer.

    This also implies it’s different from #19034 because that specifically refers to “This happens when trying to load a wallet with corrupt logs.”. I might do a bisect…

  11. Sjors commented at 2:44 pm on March 20, 2023: member

    I went back to the #18727 merge commit 0f204dd3f21b997334a0e99954c939db154b64ca where this test was introduced. In order to test that old commit, I used contrib/install_db4.sh as it existed at the time, but otherwise didn’t use depends (can’t get that old version of boost to build). I then recreated a narrowed down test:

    0    auto chain = interfaces::MakeChain(m_node);
    1    auto wallet = TestLoadWallet(*chain);
    2    TestUnloadWallet(std::move(wallet));
    3    wallet = TestLoadWallet(*chain);
    4    TestUnloadWallet(std::move(wallet));
    

    This did not produce a memory leak (the full test file does).

    So I guess we have a good and bad for bisect now, but with all the build system changes it’s a bit non-trivial. It may be more efficient to just figure out what causes the memory leak.

  12. maflcko added the label Wallet on Mar 20, 2023
  13. Sjors commented at 3:58 pm on March 20, 2023: member

    Git bisect points to 8b5e7297c02f3100a9cb27bfe206e3fc617ec173 as the culprit (#19619). I verified that this commit triggers the error whereas its parent 3c815cfe54087fd139169161d2fd175e99840e6a doesn’t.

    It makes sense that this went unnoticed because wallet_tests was already suffering from a different memory leak at the time. cc @ryanofsky

  14. Sjors commented at 6:43 pm on March 20, 2023: member

    It’s not limited to the test. Running that commit in regtest, creating a wallet and then exiting triggers it too. That should make debugging easier… And after that it’s just a matter of starting and stopping bitcoind.

    Or just do it on master @ f4e42a78c75719ad6a99962360ec67d92a563a9d. But there it only happens if you load a legacy wallet. Not when you create one.

  15. Sjors commented at 7:32 pm on March 20, 2023: member

    Building bdb with DEBUG=1 did not make the LeakSanitizer output more usable.

    I also tried compiling with gcc (12.2.0) and then running under valgrind (3.18.1). This yielded much more useful info:

    0./configure BDB_LIBS=… --enable-suppress-external-warnings --enable-debug --disable-asm  --disable-fuzz-binary --disable-bench --without-gui --disable-test
    12valgrind —leak-check=full src/bitcoind -regtest -nowallet -wallet=test_legacy
    
     0023-03-20T19:32:09Z [test_legacy] Releasing wallet
     12023-03-20T19:32:09Z Shutdown: done
     2==1744646== 
     3==1744646== HEAP SUMMARY:
     4==1744646==     in use at exit: 2,377 bytes in 12 blocks
     5==1744646==   total heap usage: 296,219 allocs, 296,207 frees, 103,153,206 bytes allocated
     6==1744646== 
     7==1744646== 160 bytes in 1 blocks are definitely lost in loss record 6 of 12
     8==1744646==    at 0x4844899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
     9==1744646==    by 0xE702AD: __os_malloc (in /home/sjors/dev/bitcoin2/src/bitcoind)
    10==1744646==    by 0xE455CC: __env_alloc (in /home/sjors/dev/bitcoin2/src/bitcoind)
    11==1744646==    by 0xEE3CC1: __lock_open (in /home/sjors/dev/bitcoin2/src/bitcoind)
    12==1744646==    by 0xE497EB: __env_attach_regions (in /home/sjors/dev/bitcoin2/src/bitcoind)
    13==1744646==    by 0xE49A8E: __env_open (in /home/sjors/dev/bitcoin2/src/bitcoind)
    14==1744646==    by 0xE49C01: __env_open_pp (in /home/sjors/dev/bitcoin2/src/bitcoind)
    15==1744646==    by 0xE16E7D: DbEnv::open(char const*, unsigned int, int) (in /home/sjors/dev/bitcoin2/src/bitcoind)
    16==1744646==    by 0x9EDDE8: wallet::BerkeleyEnvironment::Open(bilingual_str&) (bdb.cpp:164)
    17==1744646==    by 0x9EEF34: wallet::BerkeleyDatabase::Verify(bilingual_str&) (bdb.cpp:271)
    18==1744646==    by 0x9F41C8: wallet::MakeBerkeleyDatabase(fs::path const&, wallet::DatabaseOptions const&, wallet::DatabaseStatus&, bilingual_str&) (bdb.cpp:849)
    19==1744646==    by 0x9A074A: wallet::MakeDatabase(fs::path const&, wallet::DatabaseOptions const&, wallet::DatabaseStatus&, bilingual_str&) (walletdb.cpp:1228)
    20==1744646== 
    21==1744646== LEAK SUMMARY:
    22==1744646==    definitely lost: 160 bytes in 1 blocks
    23==1744646==    indirectly lost: 0 bytes in 0 blocks
    24==1744646==      possibly lost: 0 bytes in 0 blocks
    25==1744646==    still reachable: 2,217 bytes in 11 blocks
    26==1744646==         suppressed: 0 bytes in 0 blocks
    27==1744646== Reachable blocks (those to which a pointer was found) are not shown.
    28==1744646== To see them, rerun with: --leak-check=full --show-leak-kinds=all
    29==1744646== 
    30==1744646== For lists of detected and suppressed errors, rerun with: -s
    31==1744646== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
    
  16. Sjors commented at 7:46 pm on March 20, 2023: member

    This leak may be fixed in bdb5.3, but I haven’t re-checked this.

    I tried with apt install libdb++-dev version 5.3.28 and ./configure --with-incompatible-bdb …

    This shows “Using BerkeleyDB version Berkeley DB 5.3.28: (September 9, 2013)” in the log.

    There’s no leak. Not with valgrind (gcc) and not with the address sanitizer (clang).

    I clarified the top description (but also kept the original text).

  17. Sjors renamed this:
    Memory leak in wallet_tests with BDB using address sanitizer
    Memory leak loading legacy wallet (BDB 4.8.30)
    on Mar 20, 2023
  18. ryanofsky commented at 10:10 pm on March 20, 2023: contributor

    This is puzzling because #27283 (comment) seems to suggest that memory is leaked while creating the BDB environment, but the commit which triggers the leak 8b5e7297c02f3100a9cb27bfe206e3fc617ec173 does not seem to be changing the way the environment is created or how it is used. The commit does switch from using a function called CreateWalletDatabase to another one called MakeWalletDatabase, but in both cases the functions are calling the same GetWalletEnv function to get a shared_ptr to the BerkeleyEnvironment, and passing that pointer to the BerkeleyDatabase constructor.

    I guess the fact that the leak happens in one version of BDB but another suggests it is a BDB bug, but it’s not clear what could be changing in commit 8b5e7297c02f3100a9cb27bfe206e3fc617ec173 which triggers the bug.

    I haven’t tried to reproduce this myself, though, so just going off of the information above.

  19. achow101 commented at 11:13 pm on March 20, 2023: member

    I’ve been able to reproduce the leak that valgrind finds on master. It appears to be related to the -privdb option; looking at the BDB source, the problem occurs inside a block of code guarded by a check for the DB_PRIVATE flag. Disabling that with -privdb=0 “resolves” the leak. Changing this setting in the code also resolves the LeakSanitizer error.

    None of the commits mentioned that are supposed to be “good” are not for me. They all fail in valgrind with the same error. I believe this is an issue with BDB itself that we cannot fix. I think the only solution is for us to either update BDB or ditch it entirely.

    I’ve looked through the BDB change logs between 4.8 and 5.3, and it seems like having memory leaks for private dbs is not uncommon. Multiple changelogs mention a couple fixes of memory leaks regarding private dbs.

  20. maflcko added the label Upstream on Mar 21, 2023
  21. Sjors commented at 12:21 pm on March 21, 2023: member

    It turns out 8b5e7297c02f3100a9cb27bfe206e3fc617ec173 triggers the test failure I described above, but it had no impact on the regular wallet loading / unloading. So maybe it’s not exactly the same issue.

    Starting with -privdb=0 indeed removes the memory leak for me for wallet loading / unloading. Dropping the DB_PRIVATE in bdb.cpp also fixes the test for me.

    The release notes for 0.3.13 state “Dropped DB_PRIVATE Berkeley DB flag.”, but it was added back in #1367.

  22. Sjors referenced this in commit 8a9303dc5f on Mar 21, 2023
  23. Sjors commented at 2:14 pm on March 21, 2023: member

    One solution is to ignore the problem but fix the address sanitizer test suite.

    This branch does that: https://github.com/bitcoin/bitcoin/compare/master...Sjors:bitcoin:2023/03/legacy-leak

    It sets options.use_shared_memory = true for wallet unit tests, and it adds -dbpriv=0 to the default config of functional tests.

    The second commit also makes the wallet migration tool use_shared_memory. This seems safe, since the operation only runs briefly, so the concern about other processes snooping seems less pressing.

    Strangely there’s one test that still fails with a memory leak: wallet_migration.py, at “Test migration of a basic keys only wallet with a balance”.

    Even with this “fix”, creating a legacy wallet under valgrind still generates heaps (haha) of warnings (log).

    I can make a PR if people think it’s the right approach.

    Adding a suppression (still not sure how) would be another way to satisfy the address sanitizer, though not valgrind(?).

  24. Sjors referenced this in commit db848805f5 on Mar 21, 2023
  25. maflcko commented at 2:51 pm on March 21, 2023: member
    My preference would be to suppress libdb4.8, but I guess that is not possible because the sanitizer doesn’t see the lib in static builds?
  26. willcl-ark commented at 11:01 am on July 3, 2024: member

    Can we close this and #19034 as “unplanned”?

    In theory v28.0 will (hopefully) be the final version we will read BDB wallets at all, and we now have our own BDB parser, and will (soon) be able to migrate without BDB.

    I don’t see these issues ever being fixed, and therefore they are not useful to keep open IMO…

  27. maflcko commented at 12:01 pm on July 3, 2024: member

    Many workarounds have been mentioned in the discussion above. For example:

    • disable privdb temporarily
    • upgrade bdb
    • suppress the warning
    • remove bdb

    Closing for now. I am happy to answer questions how to implement the workarounds, but I don’t think there is anything that can be done in this repo that hasn’t already an open pull request. If a pull request is missing, please create one, or explain exactly what should be implemented.

  28. maflcko closed this on Jul 3, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me