The current hex tx decoding logic will refuse to decode valid extended-encoded transactions if the result fails the heuristic sanity check, even when the legacy-encoding fails. Fix this.
Fixes #20579
125@@ -126,31 +126,53 @@ static bool CheckTxScriptsSanity(const CMutableTransaction& tx)
126
127 static bool DecodeTx(CMutableTransaction& tx, const std::vector<unsigned char>& tx_data, bool try_no_witness, bool try_witness)
128 {
129+ CMutableTransaction tx_wit, tx_nowit;
130+ bool ok_wit = false, ok_nowit = false;
wtx
, tx
, wtx_ok
, tx_ok
143 } catch (const std::exception&) {
144 // Fall through.
145 }
146 }
147
148+ // Try decoding with legacy serialization, and remember if the result succesfully consumes the entire input.
ACK daf98fec173d259afe3cd5640e73ac51a91e0a4
Could perhaps use some tests.
Can you grab the example in the issue(or generate one) as a regression test?
On Tue, Dec 8, 2020, 7:37 AM Pieter Wuille notifications@github.com wrote:
@sipa commented on this pull request.
In src/core_read.cpp https://github.com/bitcoin/bitcoin/pull/20595#discussion_r537917709:
} catch (const std::exception&) {
// Fall through. } }
- // Try decoding with legacy serialization, and remember if the result succesfully consumes the entire input.
Fixed.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bitcoin/bitcoin/pull/20595#discussion_r537917709, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABMAFU63TLZS43YZI27WVZ3STVRM7ANCNFSM4URDWYYQ .
171+ if (ok_legacy && CheckTxScriptsSanity(tx_legacy)) {
172+ tx = std::move(tx_legacy);
173+ return true;
174+ }
175+
176+ // If only one deserialization succeeded, use that one.
The following if
statement also handles the case where both deserializations succeeded but neither passed sanity, which the comments skip.
Maybe
0// If both deserializations succeeded, prefer the one for which CheckTxScriptsSanity returns true.
1// If that's the case for both ** or neither **, prefer the extended one
2// If only one deserialization succeeded, use that one.
3if (ok_ext && ok_legacy)
4 if (TxSanity(tx_ext) || !TxSanity(tx_legacy)) {
5 return tx_ext;
6 } else {
7 return tx_legacy;
8 }
9} else if (ok_ext) {
10 return tx_ext;
11} else if (ok_legacy) {
12 return tx_legacy;
13} else {
14 fail;
15}
would match the comments better? Or could consolidate the conditions something like:
0if (ok_ext && !(ok_legacy && !TxSanity(tx_ext) && TxSanity(tx_legacy)) {
1 // If extended decode succeeded, prefer it, except if legacy also succeeded,
2 // extended didn't pass sanity checks, and legacy did pass sanity checks
3 return tx_ext;
4} else if (ok_legacy) {
5 // Otherwise, if legacy succeeded use it
6 return tx_legacy;
7} else {
8 // Otherwise fail
9 fail;
10}
Alternatively you could stay closer to the current code with:
0if (try_witness) {
1 ok_ext = Decode(tx_ext);
2 if (ok_ext && (!try_no_witness || Sanity(tx_ext))) return tx_ext;
3}
4if (try_no_witness) {
5 ok_legacy = Decode(tx_legacy);
6 if (ok_legacy && (!ok_ext || Sanity(tx_legacy)) return tx_legacy;
7}
8if (ok_ext) return tx_ext; // neither pass sanity checks
9fail;
The comment implies that only one deserialization has succeeded if this line is reached, which is not true
0 // If both deserializations succeeded, prefer the extended one. Otherwise use the one that succeeded deserialization.
Sanity(tx_ext)
), I prefer your third option.
125@@ -126,31 +126,55 @@ static bool CheckTxScriptsSanity(const CMutableTransaction& tx)
126
127 static bool DecodeTx(CMutableTransaction& tx, const std::vector<unsigned char>& tx_data, bool try_no_witness, bool try_witness)
128 {
129+ CMutableTransaction tx_extended, tx_legacy;
DecodeTx
(previously, up to 4MB for each of tx
, tx_data
; now an additional 4MB for each of tx_extended
and tx_legacy
). Is that a potential problem anywhere? (I don’t think it is, but…)
For the example in the bug report, decoderawtransaction
(without specifying witness-only decode) fails, but getrawtransaction c61adde6f1c09ef4c9864e5c5aa7d92e0c6e76e4413710cf989bae9145ef141e true 0000000000000000001051d0017518f9fb7a7d201751a34fa91621ed32c0d88f
succeeds, so I think this is working as I expected. I didn’t suggest a change along these lines because of the extra memory usage, but if we’re okay with that, +1 from me.
Separately should we consider doing anything to remove the output that fails the sanity check from the utxo set, given it doesn’t parse and hence is unspendable?
ACK 451556858a23d81fd23ab2f3f7a72f61856c62d3
ACK, it seems this comes with two changes. It prefers the extended encoding:
Whenever both encodings are permitted, try both, and if only one succeeds,
return that one. Otherwise prefer the one for which the heuristic sanity
check passes. If that is the case for neither or for both, return the
extended-permitting deserialization.
Tested ACK 0f949cde3dff15170db7930b0f7345ff995c267d
The new naming, comments, and code organisation are much improved.
0diff --git a/src/core_read.cpp b/src/core_read.cpp
1index 7687a86185..fe64b7b95e 100644
2--- a/src/core_read.cpp
3+++ b/src/core_read.cpp
4@@ -148,37 +148,39 @@ static bool DecodeTx(CMutableTransaction& tx, const std::vector<unsigned char>&
5 CDataStream ssData(tx_data, SER_NETWORK, PROTOCOL_VERSION);
6 try {
7 ssData >> tx_extended;
8- if (ssData.empty()) ok_extended = true;
9+ if (ssData.empty()) {
10+ ok_extended = true;
11+ // Optimization: if extended decoding succeeded and the result passes CheckTxScriptsSanity,
12+ // don't bother decoding the other way.
13+ if (CheckTxScriptsSanity(tx_extended)) {
14+ tx = std::move(tx_extended);
15+ return true;
16+ }
17+ }
18 } catch (const std::exception&) {
19 // Fall through.
20 }
21 }
22
23- // Optimization: if extended decoding succeeded and the result passes CheckTxScriptsSanity,
24- // don't bother decoding the other way.
25- if (ok_extended && CheckTxScriptsSanity(tx_extended)) {
26- tx = std::move(tx_extended);
27- return true;
28- }
29-
30 // Try decoding with legacy serialization, and remember if the result successfully consumes the entire input.
31 if (try_no_witness) {
32 CDataStream ssData(tx_data, SER_NETWORK, PROTOCOL_VERSION | SERIALIZE_TRANSACTION_NO_WITNESS);
33 try {
34 ssData >> tx_legacy;
35- if (ssData.empty()) ok_legacy = true;
36+ if (ssData.empty()) {
37+ ok_legacy = true;
38+ // If legacy decoding succeeded and passes CheckTxScriptsSanity, that's our answer, as we know
39+ // at this point that extended decoding either failed or doesn't pass the sanity check.
40+ if (CheckTxScriptsSanity(tx_legacy)) {
41+ tx = std::move(tx_legacy);
42+ return true;
43+ }
44+ }
45 } catch (const std::exception&) {
46 // Fall through.
47 }
48 }
49
50- // If legacy decoding succeeded and passes CheckTxScriptsSanity, that's our answer, as we know
51- // at this point that extended decoding either failed or doesn't pass the sanity check.
52- if (ok_legacy && CheckTxScriptsSanity(tx_legacy)) {
53- tx = std::move(tx_legacy);
54- return true;
55- }
56-
57 // If extended decoding succeeded, and neither decoding passes sanity, return the extended one.
Code review ACK 0f949cde3dff15170db7930b0f7345ff995c267d
The new naming, comments, and code organisation are much improved.
Yes the comments definitely help clarify why the function is like this.