wallet: Improve AvailableCoins performance by reducing duplicated operations

achow101 commented at 10:08 pm on March 28, 2022: member

While running my coin selection simulations, I noticed that towards the end of the simulation, the wallet would become slow to make new transactions. The wallet generally performs much more slowly when there are a large number of transactions and/or a large number of keys. The improvements here are focused on wallets with a large number of transactions as that is what the simulations produce.

Most of the slowdown I observed was due to DescriptorScriptPubKeyMan::GetSigningProvider re-deriving keys every time it is called. To avoid this, it will now cache the SigningProvider produced so that repeatedly fetching the SigningProvider for the same script will not result in the same key being derived over and over. This has a side effect of making the function non-const, which makes a lot of other functions non-const as well. This helps with wallets with lots of address reuse (as my coin selection simulations are), but not if addresses are not reused as keys will end up needing to be derived the first time GetSigningProvider is called for a script.

The GetSigningProvider problem was also exacerbated by unnecessarily fetching a SigningProvider for the same script multiple times. A SigningProvider is retrieved to be used inside of IsSolvable. A few lines later, we use GetTxSpendSize which fetches a SigningProvider and then calls CalculateMaximumSignedInputSize. We can avoid a second call to GetSigningProvider by using CalculateMaximumSignedInputSize directly with the SigningProvider already retrieved for IsSolvable.

There is an additional slowdown where ProduceSignature with a dummy signer is called twice for each output. The first time is IsSolvable checks that ProduceSignature succeeds, thereby informing whether we have solving data. The second is CalculateMaximumSignedInputSize which returns -1 if ProduceSignature fails, and returns the input size otherwise. We can reduce this to one call of ProduceSignature by using CalculateMaximumSignedInputSize’s result to set solvable.

Lastly, a lot of time is spent looking in mapWallet and mapTxSpends to determine whether an output is already spent. The performance of these lookups is slightly improved by changing those maps to use std::unordered_map and std::unordered_multimap respectively.

DrahtBot added the label RPC/REST/ZMQ on Mar 28, 2022

DrahtBot added the label Wallet on Mar 28, 2022

achow101 force-pushed on Mar 29, 2022

DrahtBot commented at 2:12 pm on March 29, 2022: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#25695 (tidy: add modernize-use-using by fanquake)
#25664 (refactor: Redefine IsSolvable() using descriptors by darosior)
#25659 (wallet: simplify ListCoins implementation by furszy)
#23417 (wallet, spkm: Move key management from DescriptorScriptPubKeyMan to wallet level KeyManager by achow101)
#22693 (RPC/Wallet: Add “use_txids” to output of getaddressinfo by luke-jr)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

josibake commented at 4:19 pm on March 29, 2022: member

Concept ACK

very nice, cursory glance looks good. do you have any benchmarks on how much this improves performance when dealing with large wallets with many transactions?

achow101 commented at 10:10 pm on March 29, 2022: member

do you have any benchmarks on how much this improves performance when dealing with large wallets with many transactions?

One of the simulation scenarios I ran has 10050 deposits and 4950 spends. It’s runtime went from 4 hr 18 min to 38 min after the changes in this PR (plus one additional change to the simulation script which would apply a speedup that I’m not sure about, but would not dominate the total runtime).

in src/wallet/spend.cpp:190 in 7d64540c22 outdated

186@@ -187,10 +187,13 @@ void AvailableCoins(const CWallet& wallet, std::vector<COutput>& vCoins, const C
187 
188             std::unique_ptr<SigningProvider> provider = wallet.GetSolvingProvider(wtx.tx->vout[i].scriptPubKey);
189 
190-            bool solvable = provider ? IsSolvable(*provider, wtx.tx->vout[i].scriptPubKey) : false;

promag commented at 5:13 pm on April 5, 2022:

7d64540c22b0d0e0776f4e146f36d8b5ac462d09

This ternary was unnecessary, right?

Empact commented at 6:45 pm on April 6, 2022:

GetSolvingProvider can return nullptr: https://github.com/bitcoin/bitcoin/blob/41720a1f540ef3c16a283a6cce6f0a63552a4937/src/wallet/wallet.cpp#L3221-L3229

Empact commented at 6:59 pm on April 6, 2022:

Handled here in GetTxSpendSize / CalculateMaximumSignedInputSize: https://github.com/bitcoin/bitcoin/blob/41720a1f540ef3c16a283a6cce6f0a63552a4937/src/wallet/spend.cpp#L38-L40

achow101 commented at 8:07 pm on April 10, 2022:

It probably was.

furszy commented at 2:07 pm on April 20, 2022:

Self-note for 7d64540c:

The difference, aside from the clear speedup, is that we are no longer going to call VerifyScript after producing the dummy signature (which if would had failed in the past, the node would had crashed for the IsSolvable assertion).

in src/script/signingprovider.cpp:91 in a1299a9a5a outdated

86+{
87+    a.scripts.insert(b.scripts.begin(), b.scripts.end());
88+    a.pubkeys.insert(b.pubkeys.begin(), b.pubkeys.end());
89+    a.keys.insert(b.keys.begin(), b.keys.end());
90+    a.origins.insert(b.origins.begin(), b.origins.end());
91+    a.tr_spenddata = a.tr_spenddata;

promag commented at 5:19 pm on April 5, 2022:

a1299a9a5aee964ed75acac38e6d99d86952f4e4

👀

promag commented at 8:31 pm on April 5, 2022:

That’s not correct

w0xlt commented at 9:26 pm on April 5, 2022:

Yes, I agree. That is really not correct. b.tr_spenddata data is merged into a.tr_spenddata on the next line. But I do not see the point of a.tr_spenddata = a.tr_spenddata;.

promag commented at 10:27 pm on April 5, 2022:

Should be removed

achow101 commented at 8:07 pm on April 10, 2022:

Fixed

achow101 commented at 8:08 pm on April 10, 2022:

Fixed

in src/wallet/scriptpubkeyman.h:549 in a1299a9a5a outdated

545@@ -546,12 +546,14 @@ class DescriptorScriptPubKeyMan : public ScriptPubKeyMan
546 
547     KeyMap GetKeys() const EXCLUSIVE_LOCKS_REQUIRED(cs_desc_man);
548 
549+    std::map<int32_t, FlatSigningProvider> m_map_signing_providers;

promag commented at 5:43 pm on April 5, 2022:

a1299a9a5aee964ed75acac38e6d99d86952f4e4

Have you considered mutable instead of dropping const everywhere?

An annotation would be nice BTW.

achow101 commented at 8:07 pm on April 10, 2022:

Changed to mutable

in src/wallet/interfaces.cpp:326 in d3a6bba3fd outdated

325-        std::vector<WalletTx> result;
326-        result.reserve(m_wallet->mapWallet.size());
327+        std::set<WalletTx> result;
328         for (const auto& entry : m_wallet->mapWallet) {
329-            result.emplace_back(MakeWalletTx(*m_wallet, entry.second));
330+            result.emplace(MakeWalletTx(*m_wallet, entry.second));

promag commented at 6:08 pm on April 5, 2022:

d3a6bba3fdeba44a630f14755f73a9cb1a301b09

This seems to be a bad change for the GUI when loading a big wallet.

The order by hash requirement comes from TransactionTablePriv::updateWallet implementation, definitely something to improve/refactor.

achow101 commented at 8:08 pm on April 10, 2022:

I think it won’t be too bad, but fixing that seems like a bigger change than suitable for this PR.

promag changes_requested

promag commented at 6:08 pm on April 5, 2022: member

Concept ACK

w0xlt approved

w0xlt commented at 8:29 pm on April 5, 2022: contributor

Code Review ACK e6e77b3

achow101 force-pushed on Apr 10, 2022

w0xlt approved

w0xlt commented at 4:43 pm on April 11, 2022: contributor

reACK 91c3f77

in src/wallet/spend.cpp:195 in af54709713 outdated

188@@ -189,7 +189,7 @@ void AvailableCoins(const CWallet& wallet, std::vector<COutput>& vCoins, const C
189 
190             bool solvable = provider ? IsSolvable(*provider, wtx.tx->vout[i].scriptPubKey) : false;
191             bool spendable = ((mine & ISMINE_SPENDABLE) != ISMINE_NO) || (((mine & ISMINE_WATCH_ONLY) != ISMINE_NO) && (coinControl && coinControl->fAllowWatchOnly && solvable));
192-            int input_bytes = GetTxSpendSize(wallet, wtx, i, (coinControl && coinControl->fAllowWatchOnly));
193+            int input_bytes = CalculateMaximumSignedInputSize(wtx.tx->vout[i], provider.get(), /*use_max_sig=*/(coinControl && coinControl->fAllowWatchOnly));

furszy commented at 1:52 pm on April 20, 2022:

Seeing af547097: nit: would be good to extract wtx.tx->vout[I] into its own ref variable at the top of the vout for loop (line 165). We are accessing the same value in the vector several times.

achow101 commented at 10:36 pm on April 20, 2022:

Will do this if I have to touch again.

furszy commented at 3:43 pm on April 20, 2022: member

Code reviewed 91c3f774.

Two things to add:

About the signing providers cache (fcc2160d):

~~Seems that this going to cache the private keys when the wallet is unencrypted/unlocked and then keep them available in memory when the wallet gets encrypted/locked.~~ –> update: meh nah.. could actually assert that the provider in the cache does not have a sk stored.

Even when the changes are small, wouldn’t hurt to add test coverage for it (like getting the signing provider without including the private keys, then get it with them, check cache update, etc). Maybe, if you are ok, I could work on it (not sure how much it will take me to do it but.. I can give it a shot to start getting deeper over the descriptors architecture).
I do agree with @promag comment for 3b8b47b7 but.. yeah, better to work on it on a future PR as well.

achow101 commented at 7:18 pm on April 20, 2022: member

I wonder how this would perform by splitting up the COutPoint with this data structure:
0using TxSpends = std::unordered_map<uint256, std::unordered_multimap<uint32_t, uint256>, SaltedOutpointHasher>;
This would get rid of the O(n) iterations of the whole vout.size(), which should therefor have a lot less lookups.

Also I’m not sure how fast the unordered_multimap is, maybe it’s better to use a vector:
0using TxSpends = std::unordered_map<uint256, std::unordered_map<uint32_t, std::vector<uint256>>, SaltedOutpointHasher>;

Hmm, that would probably be better. I’ll try that. I don’t think there’s really a benchmark for this so it’ll be hard to measure the effect. I think the second solution is also faster as getting the range from the multimap appears to be O(n) on average too.

~~Seems that this going to cache the private keys when the wallet is unencrypted/unlocked and then keep them available in memory when the wallet gets encrypted/locked.~~ –> update: meh nah.. could actually assert that the provider in the cache does not have a sk stored.

This was considered and it specifically only caches the SigningProvider before private keys are added by making a copy.

Even when the changes are small, wouldn’t hurt to add test coverage for it (like getting the signing provider without including the private keys, then get it with them, check cache update, etc). Maybe, if you are ok, I could work on it (not sure how much it will take me to do it but.. I can give it a shot to start getting deeper over the descriptors architecture).

Feel free to try. You can reach out to me if you have any questions.

achow101 commented at 10:19 pm on April 20, 2022: member

I wonder how this would perform by splitting up the COutPoint with this data structure:

Actually there are a lot of other places that mapTxSpends is used and splitting it up requires making a ton of changes throughout the wallet to deal with it. The vast majority of the usage of mapTxSpends is to do lookups by outpoint.

furszy commented at 10:57 pm on April 26, 2022: member

If you like it, feel free to cherry-pick it: https://github.com/furszy/bitcoin/commit/42fbb693f5ad4c1c086089cfe0b95d2388c07996

furszy approved

furszy commented at 7:20 pm on April 27, 2022: member

code ACK 91c3f774

achow101 referenced this in commit 8be652e439 on Jun 17, 2022

DrahtBot added the label Needs rebase on Jun 17, 2022

fjahr commented at 1:24 pm on June 19, 2022: contributor

utACK 91c3f774c7ce55cb81d8e238d9469b1cacb10674 modulo rebase

achow101 force-pushed on Jun 20, 2022

DrahtBot removed the label Needs rebase on Jun 21, 2022

in src/wallet/spend.cpp:201 in 008d1178c8 outdated

197@@ -198,7 +198,7 @@ CoinsResult AvailableCoins(const CWallet& wallet,
198             // Filter by spendable outputs only
199             if (!spendable && only_spendable) continue;
200 
201-            int input_bytes = GetTxSpendSize(wallet, wtx, i, (coinControl && coinControl->fAllowWatchOnly));
202+            int input_bytes = CalculateMaximumSignedInputSize(wtx.tx->vout[i], provider.get(), /*use_max_sig=*/(coinControl && coinControl->fAllowWatchOnly));

furszy commented at 12:51 pm on June 27, 2022:

0            int input_bytes = CalculateMaximumSignedInputSize(output, provider.get(), /*use_max_sig=*/(coinControl && coinControl->fAllowWatchOnly));

achow101 commented at 3:40 pm on July 6, 2022:

Done

achow101 force-pushed on Jul 6, 2022

DrahtBot added the label Needs rebase on Jul 8, 2022

in src/script/signingprovider.h:91 in 6f9e59143d outdated

87@@ -88,6 +88,7 @@ struct FlatSigningProvider final : public SigningProvider
88 };
89 
90 FlatSigningProvider Merge(const FlatSigningProvider& a, const FlatSigningProvider& b);
91+void MergeInto(FlatSigningProvider& a, const FlatSigningProvider& b);

murchandamus commented at 6:43 pm on July 12, 2022:

Is it possible that “Merge” should have been replaced with “MergeInto” and a return of the merged value? It seems odd that we would both merge a and b, and then also merge b into a.

achow101 commented at 9:51 pm on July 13, 2022:

Looking at this further, it’s actually possible to do just use Merge, so I’ve removed MergeInto and now only use the preexisting Merge.

murchandamus commented at 7:22 pm on July 12, 2022: contributor

Concept ACK, code looks good to me, one question

achow101 force-pushed on Jul 12, 2022

DrahtBot removed the label Needs rebase on Jul 12, 2022

achow101 force-pushed on Jul 13, 2022

DrahtBot added the label Needs rebase on Jul 21, 2022

in src/wallet/spend.cpp:216 in 45736de91e outdated

186@@ -187,13 +187,15 @@ CoinsResult AvailableCoins(const CWallet& wallet,
187 
188             std::unique_ptr<SigningProvider> provider = wallet.GetSolvingProvider(output.scriptPubKey);
189 
190-            bool solvable = provider ? IsSolvable(*provider, output.scriptPubKey) : false;

josibake commented at 5:31 pm on July 21, 2022:

in https://github.com/bitcoin/bitcoin/pull/24699/commits/45736de91e9350258994183c7032d51bf2e1142e:

not sure if this is super important, but IsSolvable does a static_assert to check for STANDARD_SCRIPT_VERIFY_FLAGS and SCRIPT_VERIFY_WITNESS_PUBKEYTYPE and also runs VerifyScript, neither of which are happening in CalculateMaximumSignedInputSize

josibake commented at 5:51 pm on July 21, 2022:

maybe it’s worth adding these checks to DummySignInput so there is no behavior change?

achow101 commented at 6:44 pm on July 21, 2022:

static_assert is a compile time check so it doesn’t matter where it is done.

Both IsSolvable and CalculateMaximumSignedInputSize call ProduceSignature, which does VerifyScript at the very end of it if the rest of it was successful. So IsSolvable doing VerifyScript is actually redundant, and both IsSolvable and CalculatemaximumSignedInputSize are doing the same checks.

josibake commented at 6:58 pm on July 21, 2022:

ah, i missed the VerifyScript call in ProduceSignature, thanks for the explanation!

josibake approved

josibake commented at 5:56 pm on July 21, 2022: member

ACK https://github.com/bitcoin/bitcoin/pull/24699/commits/4323943403aa45333e415086bad735e0e7066da0 mod rebase

worth mentioning that I (and others) have been using this code to run wallet simulations based on real usage scenarios, so the code has been tested “in the wild”. left one comment re: IsSolvable but could just be my own lack of understanding. I also noticed the feature_backwards_compatibility.py --descriptors test is falling in CI , but i ran it locally without any failures

achow101 force-pushed on Jul 21, 2022

josibake commented at 7:06 pm on July 21, 2022: member

reACK 1f61630

DrahtBot removed the label Needs rebase on Jul 21, 2022

furszy approved

furszy commented at 6:22 pm on July 22, 2022: member

Code review ACK 1f616304

Something that has been floating around my head is the lack of TTL for each entry in the cache. So the map doesn’t grow indefinitely but.. not blocking, could tackle it in a follow-up work.

achow101 commented at 3:33 pm on July 23, 2022: member

@fjahr @w0xlt re-review?

fjahr commented at 8:05 pm on July 26, 2022: contributor

Code review re-ACK 1f61630497fd52089886903fdfc8e75b1c7f4d37

w0xlt approved

w0xlt commented at 1:36 pm on July 27, 2022: contributor

reACK https://github.com/bitcoin/bitcoin/pull/24699/commits/1f61630497fd52089886903fdfc8e75b1c7f4d37

DrahtBot added the label Needs rebase on Jul 28, 2022

wallet: Use CalculateMaximumSignedInputSize to indicate solvability

In AvailableCoins, we need to know whether we can solve for an output.
This was done by using IsSolvable, which just calls ProduceSignature and
produces a dummy signature. However, we already do that in order to get
the size of the input by using CalculateMaximumSignedInputSize. As this
function returns -1 if ProduceSignature fails, we can just remove the
use of IsSolvable and check that input_bytes is not -1 to determine
the solvability of an output.

8a105ecd1a

achow101 force-pushed on Jul 29, 2022

josibake commented at 3:28 pm on July 29, 2022: member

reACK https://github.com/bitcoin/bitcoin/commit/38ead65dae9c6bb14ef95b1405510d02317f520e

verified rebase with git range-diff master 1f61630 38ead65

DrahtBot removed the label Needs rebase on Jul 29, 2022

fjahr commented at 10:56 pm on July 29, 2022: contributor

re-ACK 38ead65dae9c6bb14ef95b1405510d02317f520e

furszy approved

furszy commented at 1:09 am on August 3, 2022: member

diff ACK 38ead65d

in src/wallet/scriptpubkeyman.cpp:2091 in 7bb08eb91d outdated

2089+        // Get the scripts, keys, and key origins for this script
2090+        std::vector<CScript> scripts_temp;
2091+        if (!m_wallet_descriptor.descriptor->ExpandFromCache(index, m_wallet_descriptor.cache, scripts_temp, *out_keys)) return nullptr;
2092+
2093+        // Cache SigningProvider so we don't need to re-derive if we need this SigningProvider again
2094+        m_map_signing_providers[index] = Merge(m_map_signing_providers[index], *out_keys);

murchandamus commented at 6:45 pm on August 3, 2022:

Since m_map_signing_providers.find(index) == m_map_signing_providers.end() in the else branch here, why isn’t this just:

0        m_map_signing_providers[index] = *out_keys;

Isn’t this merging with an empty object?

achow101 commented at 7:22 pm on August 3, 2022:

IIRC the compiler complained about doing it that way, presumably because out_keys is a unique_ptr

achow101 commented at 7:38 pm on August 3, 2022:

Apparently I remember incorrectly. Changed.

wallet: Cache SigningProviders

In order to avoid constantly re-deriving the same keys in
DescriptorScriptPubKeyMan, cache the SigningProviders generated inside
of GetSigningProvider.

1f798fe85b

Change mapTxSpends to be a std::unordered_multimap 97532867cf

Change getWalletTxs to return a set instead of a vector

For some reason, the primary consumer of getWalletTxs requires the
transactions to be in hash order when it is processing them. std::map
will iterate in hash order so the transactions end up in that order when
placed into the vector. To ensure this order when mapWallet is no longer
ordered, the vector is replaced with a set which will maintain the hash
order.

272356024d

Change mapWallet to be a std::unordered_map bc886fcb31

achow101 force-pushed on Aug 3, 2022

murchandamus commented at 7:42 pm on August 3, 2022: contributor

ACK bc886fcb31e1afa7bbf7b86bfd93e51da7076ccf

furszy approved

furszy commented at 10:13 pm on August 3, 2022: member

diff re-reACK bc886fcb

achow101 merged this on Aug 5, 2022

achow101 closed this on Aug 5, 2022

sidhujag referenced this in commit a9fe50175b on Aug 6, 2022

bitcoinhodler commented at 0:30 am on October 28, 2022: contributor

I filed #26406 to report an instability triggered by bc886fcb3

murchandamus commented at 5:48 pm on October 28, 2022: contributor

I filed #26406 to report an instability triggered by bc886fc

Reading the linked issue, this has meanwhile been resolved.

achow101 referenced this in commit 139ba2bf12 on Jan 4, 2023

sidhujag referenced this in commit b23c80f51c on Jan 4, 2023

hebasto referenced this in commit daebf9ebb0 on Feb 3, 2023

sidhujag referenced this in commit c665392cad on Feb 3, 2023

bitcoin locked this on Oct 28, 2023

wallet: Improve AvailableCoins performance by reducing duplicated operations #24699

Conflicts