wallet: build: Don’t write BDB logs and bump BDB to 5.3 #18844

pull achow101 wants to merge 3 commits into bitcoin:master from achow101:bdb-5.3 changing 4 files +17 −37
  1. achow101 commented at 7:38 pm on May 1, 2020: member

    Berkeley DB uses a database transaction log to ensure Durability (the D of ACID). But we already have a couple of options enabled that remove the guarantee of Durability anyways (we enable DB_TXN_WRITE_NOSYNC). So the log already isn’t as useful to us as it could be. Furthermore, we don’t really make use of the BDB’s transaction feature. Almost every single read and write that we do is itself atomic and a single transaction. We do each read and write of a record as a single transaction. Flushing is always done when a WalletBatch is destroyed, and we have many single use WalletBatch objects.

    We do, in a few places, batch transactions so even after a transaction is committed, it may not be flushed to the database. In those instances, the log is useful as it does recover whatever wasn’t written. But whatever didn’t get written also doesn’t matter. If a new transaction was being written, the blockchain can be rescanned. If new metadata was being written, the user can just re-enter it. For newly generated keys, those keys didn’t see any use if their records didn’t get flushed to the db (that flushing always happens before the key generation function returns). So it’s fine to lose those keys, and for HD wallets, they will be regenerated anyways, so nothing is lost.

    The single exception to the single record single transaction that we usually do is with encrypting existing private keys. This encryption is done in a single transaction so that it is atomic, and that is many records being written. But even with that, the transaction log isn’t useful. We abort before anything is flushed to disk so the wallet state is not inconsistent and all of the unencrypted private keys are still there. The encryption can then be redone. So the log is not useful there.

    Additionally, we are already trying to get rid of the transaction log as much as possible. We turn on the auto remove (DB_LOG_AUTO_REMOVE) and the logs are always cleaned up on a clean shutdown. As mentioned earlier, the transaction log already has a flag that doesn’t always flush the log anyways. So I don’t think that just completely avoiding transaction logs entirely is unreasonable.

    Logs are disabled by using the db flag DB_TXN_NOT_DURABLE to prevent logs being written for a db, and using the db env flag DB_LOG_IN_MEMORY to keep all db environment logs in memory.

    Not having the BDB logs additionally allows us to remove the BDB 4.8 requirement. This requirement was because the Oracle keeps changing the log file format. The database file format itself has not changed in a long time. The log file format difference would supposedly make downgrades more difficult because the log files are not downgradable. However in my testing, this was not actually the case as the log files would be moved out of the way if the database could not be opened, so downgrading wasn’t that much of a problem, especially with clean shutdowns. Not having the transaction log be written at all makes this not a problem entirely as there aren’t any log files to get in the way. So we can now allow for other BDB versions.

    Bumping to BDB 5.3 is done only because that’s the version that is provided by many (most?) distros. With this change, --with-incompatible-bdb is no longer needed and we can simply build and installation instructions to just “install bdb from your package manager”.


    This change might be contentions, so here are some alternatives that we might want to discuss:

    • Not having logs at all might make the wallet more unreliable. Maybe instead of getting rid of the logs entirely we can output compatible log files for use in BDB 4.8. But that’s clunky and just locks us into BDB 5.3 (or whatever version).
      • We could possibly have our own consistent log file format which is then loaded into BDB somehow
    • Instead of no logs at all, when we encounter newer log files, copy them (as we do now) and the wallet.dat file to a backup location in the walletdir so that users can recover them if they see consistency issues with the wallet file. This still lets us change BDB.
  2. wallet: Don't write BDB logs
    We don't use BDB logs so it's pointless to keep them around. These log
    files are also what has locked us into BDB 4.8. We can just not write
    them by settting DbEnv flag DB_LOG_IN_MEMORY and Db flag
    DB_TXN_NOT_DURABLE. The creation of the db log dir and other log
    settings have been removed. Removing/moving the log is still kept around
    to remove/move the log files produced previously.
    9404990490
  3. build: remove --with-incompatible-bdb and locking to BDB 4.8 44e42094fc
  4. depends: Bump BDB to 5.3.28 cb29c38727
  5. luke-jr commented at 7:47 pm on May 1, 2020: member

    Concept NACK: This breaks backward compatibility, which is why we’ve always stuck to 4.8.

    --with-incompatible-bdb isn’t “needed” - it’s available, and not recommended. Only BDB 4.8 should be used.

  6. ryanofsky commented at 7:53 pm on May 1, 2020: member

    Concept NACK: This breaks backward compatibility, which is why we’ve always stuck to 4.8.

    This is maybe addressed in the PR description, which says that database format hasn’t actually changes just log format.

    I’m confused about motivation for this PR though. The goal of this is to make database writes more reliable? Have we seen cases where data has been lost and we think this change would fix? What exactly is the case where this would happen? Or is the goal something different like making writes more efficient?

  7. DrahtBot added the label Build system on May 1, 2020
  8. DrahtBot added the label Wallet on May 1, 2020
  9. achow101 commented at 9:36 pm on May 1, 2020: member

    Concept NACK: This breaks backward compatibility, which is why we’ve always stuck to 4.8.

    It doesn’t and testing that is incredibly easy. You don’t even need the changes in this PR to find that out for yourself. The log changes here aren’t strictly necessary for BDB 5.3 The wallet.dat files themselves are portable between these versions. From the documentation that I’ve found, we should be able to use the same wallet.dat files down to BDB 4.0, and up to the latest version of BDB

    I’m confused about motivation for this PR though.

    The goal is to remove the strict requirement for BDB 4.8.

    The goal of this is to make database writes more reliable?

    It actually does the opposite currently, which is why I added the section at the bottom about alternatives. When I wrote this, I assumed that flushes were occurring immediately after we were done writing a record, but this appears to not be the case. There is a bit of a delay which is mostly not human noticeable but this has an effect on a test like feature_backwards_compatibility where a wallet file ends up being copied before it’s flushed so is missing data.

  10. hebasto commented at 10:41 pm on May 1, 2020: member

    @achow101

    the Oracle keeps changing the log file format

    From the documentation that I’ve found, we should be able to use the same wallet.dat files down to BDB 4.0, and up to the latest version of BDB

    Mind pointing to the docs?

  11. luke-jr commented at 11:03 pm on May 1, 2020: member
    Oh, so disabling the logs fixes the compatibility problem? :o
  12. sipa commented at 11:10 pm on May 1, 2020: member

    My recollection is that the log files have no compatibility guarantees, not even backward compatibility for minor versions (i.e. you can’t use a database+log from 4.7 in 4.8) - which is why we flush logs on exit. The database itself is backward compatible, but not forward compatible (you can use db files from 4.7 in 4.8 but not the other way around).

    I’m unsure about the goal here. There used to be frequent issues with wallets being unable to be opened, and we ended up making several frequent changes to the BDB settings we open envs/databases with. At some point those issues stopped, so I suspect we found a fairly stable configuration. I wouldn’t touch these things without going back through history to see what has been tried before.

  13. achow101 commented at 11:32 pm on May 1, 2020: member

    Mind pointing to the docs?

    There are some changelogs on the previous releases page: https://www.oracle.com/database/technologies/related/berkeleydb-release-history.html. There is more documentation in the source downloads themselves.

    Oh, so disabling the logs fixes the compatibility problem? :o

    That is what I have found in my experimentation and dive through their source code.

    The database itself is backward compatible, but not forward compatible (you can use db files from 4.7 in 4.8 but not the other way around).

    This isn’t true for 5.3 -> 4.8, I’ve tested this. A quick look through the code for 4.7 and 4.8 indicate that it should be compatible too. I’m pretty sure the incompatibility there is with the logs.

  14. laanwj commented at 9:04 am on May 2, 2020: member

    Back in 2012 I had hoped we’d switch away from BerkeleyDB before needing to bump it ever.

    I’m still not convinced this is a particular thing that needs to be changed. I’m kind of scared of these kind of foundational changes to the wallet format and all the unknown issues it might cause downstream.

    So concept NACK, sorry.

  15. achow101 commented at 7:43 pm on May 2, 2020: member

    I’ve gone through their older versions to check compatibility. The only changes that were made were the depends package for BDB and configure.ac to allow other versions. The following versions could be compiled and feature_backwards_compatibility.py passes with a minor change to unload the w3 wallets before the files are copied. feature_backwards_compatibility.py opens new wallets from the branch in 0.19, 0.18, and 0.17 and the reverse. So this tests the compatibility of the wallet.dat file.

    • 4.7.25
    • 4.8.30
    • 5.0.32
    • 5.1.29
    • 5.2.42
    • 5.3.29

    Versions 6.0 introduces a new database file version which is not backwards compatible. But it is possible to take an existing wallet.dat, use it with 6.0+, and then go back to the older BDB versions. BDB won’t change a database’s format without an explicit upgrade.

    Versions 4.6 and lower have API differences. 4.6.21 had a minimal change and after applying it, the wallet.dat file still worked. 4.5.20 and lower have further API differences that I didn’t feel like going through to check the database file compatibility.


    Given this information, I think that something reasonable to do is to remove --with-incompatible-db and the messaging about the wallet.dat file not being portable. As these tests show, the wallet.dat file is portable across many versions. Only the log files are not portable and those are removed upon clean shutdown. Thus I think we should let people use system BDB instead of recommending 4.8. We can still ship 4.8 in release builds.

  16. MarcoFalke commented at 2:51 pm on May 3, 2020: member

    Concept ACK. I think everyone compiling from source is using bdb 5.3 and I haven’t heard of any issues with that.

    Please include tests to check that bitcoin core master (bdb5.3) can load bitcoin core 0.20.0 (bdb4.8) wallets (and vice-versa).

    If there are concerns that removing the --with-incompatible-bdb flag (added in 941dba17838a1a40f5a5862eb2f84ffec511911e) is going to make people use bdb4.5 or other untested configurations, then maybe an option would be to only whitelist 4.8 and 5.3, but keep the --with-incompatible-bdb flag for all other bdb versions.

  17. luke-jr commented at 7:42 pm on May 4, 2020: member

    I think everyone compiling from source is using bdb 5.3

    What is the basis for this assumption? I don’t know how people are building on Fedora/etc, but Gentoo still has bdb 4.8, and there are Ubuntu PPAs with it.

  18. sipa commented at 7:51 pm on May 4, 2020: member
    I can’t remember the last time I built without –with-incompatible-bdb.
  19. Sjors commented at 7:58 pm on May 4, 2020: member

    The backwards compatibility test creates a wallet on master (or the PR) and then opens it with an older release. Those older releases are downloaded binaries which ship with BDB 4.8.

    So as long as this PR doesn’t break that test, the change is probably backwards compatible.

    Narrator: it does break the test: https://travis-ci.org/github/bitcoin/bitcoin/jobs/682087327#L9993

  20. achow101 commented at 8:03 pm on May 4, 2020: member

    What is the basis for this assumption? I don’t know how people are building on Fedora/etc, but Gentoo still has bdb 4.8, and there are Ubuntu PPAs with it.

    I can’t remember the last time I built without –with-incompatible-bdb.

    I installed BDB 4.8 so I never build with --with-incompatible-bdb.

    So as long as this PR doesn’t break that test, the change is probably backwards compatible.

    Narrator: it does break the test: https://travis-ci.org/github/bitcoin/bitcoin/jobs/682087327#L9993

    Yes, this particular change is not backwards compatible, so I will be closing it. However, the particular thing it is failing on is the log files as the w3* wallets as those wallets are not being unloaded before they are copied. If you change the test to unload them first, it will pass. That’s because unloading the wallet flushes them and then removes the log file.

  21. achow101 closed this on May 4, 2020

  22. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me