Fixes UBSan implicit-integer-sign-change issue per #19610 (comment).
Credit to Crypt-iQ for finding and reporting the issue and to vasild for the original review suggestion in #19590#pullrequestreview-455788826.
Closes #19678.
CInv::type
from int
to uint32_t
, fix UBSan warning
#19818
Fixes UBSan implicit-integer-sign-change issue per #19610 (comment).
Credit to Crypt-iQ for finding and reporting the issue and to vasild for the original review suggestion in #19590#pullrequestreview-455788826.
Closes #19678.
fixes issue #19678 UBSan implicit-integer-sign-change
Credit to Eugene (Crypt-iQ) for finding and reporting the issue
and to Vasil Dimov (vasild) for the original suggestion
424@@ -425,7 +425,7 @@ class CInv
425 // Combined-message helper methods
426 bool IsGenTxMsg() const { return type == MSG_TX || type == MSG_WTX || type == MSG_WITNESS_TX; }
427
428- int type;
429+ uint32_t type;
According to https://en.bitcoin.it/wiki/Protocol_documentation#Inventory_Vectors (section written in 2010 by “MagicalTux”) this is unsigned. Though, our python tests also treat this as signed. https://btcinformation.org/en/developer-reference#inv doesn’t say anything about signedness.
Would be good to at least update the python test framework and also add a p2p test.
Would be good to at least update the python test framework and also add a p2p test.
Updated the python test framework and added test coverage.
127@@ -125,6 +128,16 @@ def test_msgtype(self):
128 conn.sync_with_ping(timeout=1)
129 self.nodes[0].disconnect_p2ps()
130
131+ def test_inv_msg_with_negative_type(self):
132+ self.log.info("Test inv with negative type raises argument out of range error")
133+ invalid_inv = msg_inv([CInv(t=-1, h=1)]) # inv type must be an unsigned integer
134+ try:
135+ self.nodes[0].add_p2p_connection(P2PInterface()).send_message(invalid_inv)
136+ except struct.error as e:
We generally don’t test the test framework. Sending an uint32_t{-1} and observing the node doesn’t crash with sanitizers enabled should be a sufficient smoke test. Maybe add an assert_debug_log to observe how the inv will be formatted.
0 LogPrint(BCLog::NET, "got inv: %s %s peer=%d\n", inv.ToString(), fAlreadyHave ? "have" : "new", pfrom.GetId());
139+ else:
140+ raise AssertionError("No exception raised")
141+
142+ # Repeat, passing signed integer CInv type format to override default.
143+ LE_signed_int = "<i" # Python format chars for little-endian signed int
144+ inv_with_signed_type = msg_inv([CInv(t=-MSG_TX, h=1, f=LE_signed_int)])
0 inv_with_signed_type = msg_inv([CInv(t=0xffffffff, h=1)])
Serialized -1 as signed integer is identical to 0xffffffff serialized as unsigned integer, so might as well use the latter
That was breaking this method in messages.py::CInv
with KeyError: 4294967295
from calling CInv::typemap[4294967295]
0 def __repr__(self):
1 return "CInv(type=%s hash=%064x)" \
2 % (self.typemap[abs(self.type)], self.hash)
0xffffffff: "Error"
?
Concept ACK
Thanks for fixing UBSan fuzzing finds!
src/protocol.{cpp,h}
) then all functional tests would still pass, right?
127@@ -125,6 +128,22 @@ def test_msgtype(self):
128 conn.sync_with_ping(timeout=1)
129 self.nodes[0].disconnect_p2ps()
130
131+ def test_inv_msg_with_signed_integer_type(self):
132+ self.log.info("Test inv message with MSG_TX type as signed int raises out of range error")
133+ inv_with_signed_type = msg_inv([CInv(t=-MSG_TX, h=1)])
134+ try:
135+ self.nodes[0].add_p2p_connection(P2PInterface()).build_message(inv_with_signed_type)
136+ except struct.error as e:
@jonasnick Just to confirm: if the test changes we’re dropped in this change set (keeping only the changes to
src/protocol.{cpp,h}
) then all functional tests would still pass, right?
Right. I pushed the changes to src/protocol.{h/cpp}
several times in other PR changesets (#19610, #19611) before dropping them, and the CI passes.
According to https://en.bitcoin.it/wiki/Protocol_documentation#Inventory_Vectors (section written in 2010 by “MagicalTux”) this is unsigned. Though, our python tests also treat this as signed. https://btcinformation.org/en/developer-reference#inv doesn’t say anything about signedness.
btcinformation.org does show this as uint32_t: https://btcinformation.org/en/developer-reference#data-messages. That was added by @harding to the bitcoin.org developer docs here: https://github.com/bitcoin-dot-org/Bitcoin.org/commit/1634212dd5550bb9b610341668b4b7b5fa3690a5#diff-693f5b3e9d88b97c899694c8d1656cbbR110
Note that even though comparing an int to a uint32_t causes a UBSan warning, it’s not actually undefined behaviour: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#:~:text=%2Dfsanitize%3Dimplicit%2Dinteger%2D,the%20new%20value%20is%20negative. However, I think it’s still worth changing this, so that we’re always comparing / xor’ing a uint32_t with a uint32_t, so concept ACK.
Great links @jnewbery, thanks.
Note that even though comparing an int to a uint32_t causes a UBSan warning, it’s not actually undefined behaviour: clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#:~:text=%2Dfsanitize%3Dimplicit%2Dinteger%2D,the%20new%20value%20is%20negative.
“-fsanitize=implicit-integer-sign-change: Implicit conversion between integer types, if that changes the sign of the value…issues caught by this sanitizer are not undefined behavior, but are often unintentional.”
268 return r
269
270 def __repr__(self):
271 return "CInv(type=%s hash=%064x)" \
272- % (self.typemap[self.type], self.hash)
273+ % (self.typemap[abs(self.type)], self.hash)
I don’t think this abs(…)
is correct. There should be no reason to take an absolute value here, convert, say, -1
to 1
.
maybe
0self.typemap.get(self.type, "Error")
KeyError
to hide deeper issues, like argument out of datatype range.
243@@ -244,27 +244,28 @@ class CInv:
244 MSG_TX | MSG_WITNESS_FLAG: "WitnessTx",
245 MSG_BLOCK | MSG_WITNESS_FLAG: "WitnessBlock",
246 MSG_FILTERED_BLOCK: "filtered Block",
247- 4: "CompactBlock",
248- 5: "WTX",
249+ MSG_CMPCT_BLOCK: "CompactBlock",
250+ MSG_WTX: "WTX",
251+ 0xffffffff: "Error",
0xffffffff
here.
typemap
. I don’t think any of the tests cover the protocol.{h,cpp} CInv::type
type change, only the test framework change.
ACK 7984c39be
All the values we ever assign to CInv::type
are non-negative within the reach of a signed 32 bit integer. GetDataMsg
is uint32_t
and in some places we |=
a variable of type uint32_t
into CInv::type
:
So it makes sense to have CInv::type
as uint32_t
.
Actually serializing a signed integer in the way we do it is not portable, it just works because all platforms use two’s complement, but that’s not guaranteed.
All the values we ever assign to CInv::type are non-negative within the reach of a signed 32 bit integer. GetDataMsg is uint32_t and in some places we |= a variable of type uint32_t into CInv::type
Bitwise arithmetic should definitely use unsigned integers, good point.
Actually serializing a signed integer in the way we do it is not portable, it just works because all platforms use two’s complement, but that’s not guaranteed.
I thought this was already guaranteed in C++11, but no, apparently only C++20 will guarantee two’s complement signed integers.
ACK 7984c39be11ca04460883365e1ae2a496aaa6c0e 🌻
Signature:
0-----BEGIN PGP SIGNED MESSAGE-----
1Hash: SHA512
2
3ACK 7984c39be11ca04460883365e1ae2a496aaa6c0e 🌻
4-----BEGIN PGP SIGNATURE-----
5
6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
7pUjtLAwAtfwkLzUyVHkPG+CsHcnD4C7PQgsfTFBMxOM9trXOHY299iYHJehs2jwr
8j748j23g+pETizxSTTPal7FmbYGvmVrITJiLVfLtMeW+9vOoP59qzb1/mvxEWFWw
98hQnBJtUxYH1t8DxdWcVLew4jTlGyds2E6hGb09htsPam9LAxW+eUDynlwFXQgE1
10HNxmk8K2mMLYoHeLidculEx2UcQFIH7GqFWDplFDpzSAruFneWqyg+V2LUqhCTpH
11Jq/YFXupkfmHrySzWkJz3Ix2mGslNaxtqw18KnFFIc8Kx1qzZKOvloWIZCjvnCQt
12GbWMZfSvvf1ALu44WX+dZLl6NtGU8x18bS0y6tkKrPyW51XyouCn7J3aOk03IcXn
133LizPJLA5DFf9jCILzLwoQLRhZbLAVsSoWIBVIyIIuoOrsbf2omAoxe13HoepQzK
14a02g086odTrNRvKdwARXzC6tLEdgQHW4mJbZJLDCJ1Gz+qXJOla2tCIrieZq0Y8T
15vC5hwFiZ
16=81Qa
17-----END PGP SIGNATURE-----
Timestamp of file with hash a72c948f27f3e6c7aa323c708d61eff780f8ac68076405b22c540f1cb6199d2f -