FuzzedSock
has a few issues that block a fuzzer from making progress. See commit messages for details.
fuzz: Make FuzzedSock fuzz friendlier #30211
pull dergoegge wants to merge 3 commits into bitcoin:master from dergoegge:2024-05-fix-fuzzsock changing 2 files +19 −10-
dergoegge commented at 1:50 pm on May 31, 2024: member
-
[fuzz] Use fuzzer friendly ConsumeRandomLengthByteVector in FuzzedSock::Recv
See comment on FuzzedDataProvider::ConsumeRandomLengthString.
-
DrahtBot commented at 1:50 pm on May 31, 2024: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage
For detailed information about the code coverage, see the test coverage report.
Reviews
See the guideline for information on the review process.
Type Reviewers ACK marcofleon, brunoerg Approach ACK vasild If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
-
DrahtBot added the label Tests on May 31, 2024
-
dergoegge commented at 1:53 pm on May 31, 2024: member
-
brunoerg commented at 1:10 pm on June 1, 2024: contributorConcept ACK
-
in src/test/fuzz/util/net.cpp:209 in 617d43e656 outdated
209 pad_to_len_bytes = false; 210 } 211 } else { 212- random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>( 213- m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, len)); 214+ random_bytes = ConsumeRandomLengthByteVector(m_fuzzed_data_provider, len);
vasild commented at 5:15 am on June 2, 2024:I guess using
ConsumeRandomLengthByteVector()
is slower thanConsumeBytes()
because it uses an intermediatestd::string
, but is probably insignificant.The comment says:
0// Returns a std::string of length from 0 to |max_length|. When it runs out of 1// input data, returns what remains of the input. Designed to be more stable 2// with respect to a fuzzer inserting characters than just picking a random 3// length and then consuming that many bytes with |ConsumeBytes|. 4inline std::string 5FuzzedDataProvider::ConsumeRandomLengthString(size_t max_length) {
I am just curious, did you encounter a case where the current/previous implementation that uses
ConsumeBytes()
is “unstable” or had problems in some other way?
dergoegge commented at 9:39 am on June 3, 2024:did you encounter a case where the current/previous implementation that uses ConsumeBytes() is “unstable” or had problems in some other way?
No I just found this while investigating the peek issue (which was the main blocker).
in src/test/fuzz/util/net.cpp:196 in 617d43e656 outdated
192@@ -193,21 +193,20 @@ ssize_t FuzzedSock::Recv(void* buf, size_t len, int flags) const 193 bool pad_to_len_bytes{m_fuzzed_data_provider.ConsumeBool()}; 194 if (m_peek_data.has_value()) { 195 // `MSG_PEEK` was used in the preceding `Recv()` call, return `m_peek_data`. 196- random_bytes.assign({m_peek_data.value()}); 197+ random_bytes = m_peek_data.value();
vasild commented at 5:28 am on June 2, 2024:This will overflow the buffer inbuf
in the memcopy below if peek read is called with N bytes and then non-peek read is called with M bytes and M < N.
dergoegge commented at 9:40 am on June 3, 2024:Fixed.in src/test/fuzz/util/net.h:46 in 617d43e656 outdated
42@@ -43,7 +43,7 @@ class FuzzedSock : public Sock 43 * If `MSG_PEEK` is used, then our `Recv()` returns some random data as usual, but on the next 44 * `Recv()` call we must return the same data, thus we remember it here. 45 */ 46- mutable std::optional<uint8_t> m_peek_data; 47+ mutable std::optional<std::vector<uint8_t>> m_peek_data;
vasild commented at 5:34 am on June 2, 2024:from the commit message:
FuzzedSock only supports peeking at one byte at a time, which is not fuzzer friendly when trying to receive long data.
I am not sure how likely is this to happen, but at least it is not ruled out by the
recv(2)
documentation: a peek read of 10 bytes returns just 1 byte and repeating the peek read of 10 bytes returns the same 1 byte over and over again. The app can never peek past the 1 byte without actually consuming/reading that byte. I guess it must be ready for this behavior from the OS or from a mocked fuzzed socket.
dergoegge commented at 9:45 am on June 3, 2024:I could only find the following in the docs and based the changes here on that.
0This flag causes the receive operation to return data from 1the beginning of the receive queue without removing that 2data from the queue. Thus, a subsequent receive call will 3return the same data.
We could make it so that a peek read only returns one byte and a subsequent normal read return N bytes where the first byte corresponds to the peek byte. But I’m not sure if that would change much for our use case.
vasild commented at 12:52 pm on June 10, 2024:We could make it so that a peek read only returns one byte and a subsequent normal read return N bytes where the first byte corresponds to the peek byte.
This was the case before this PR, right?
But I’m not sure if that would change much for our use case.
Where can I see that use case? Is there some code that does not work without the changes to
FuzzedSock::Recv()
done in this PR?The doc you cited, in my understanding, means that what I described in the first comment of this thread is not ruled out and might happen.
vasild commented at 12:53 pm on June 10, 2024:std::vector<uint8_t>
can be used instead ofstd::optional<std::vector<uint8_t>>
and if the vector is empty that is the same as the optional without a value.
dergoegge commented at 1:09 pm on June 10, 2024:This was the case before this PR, right?
No, a peek read before this PR would return 1 byte and a normal read following that would also only return that one byte (with the exception of sometimes padding with a bunch of zeros). This is what causes the fuzz input to never be interpreted as one continuous sequence of data, which is why mutations that insert pieces of data (e.g. from a dictionary) are ineffective.
Where can I see that use case? Is there some code that does not work without the changes to FuzzedSock::Recv() done in this PR?
Yes the i2p harness is unable to cover most of the target code without these changes: #30230.
dergoegge commented at 1:12 pm on June 10, 2024:Feel free to open a follow up.
in src/test/fuzz/util/net.cpp:389 in 617d43e656 outdated
378@@ -380,7 +379,7 @@ bool FuzzedSock::Wait(std::chrono::milliseconds timeout, Event requested, Event* 379 return false; 380 } 381 if (occurred != nullptr) { 382- *occurred = m_fuzzed_data_provider.ConsumeBool() ? requested : 0; 383+ *occurred = m_fuzzed_data_provider.ConsumeBool() ? 0 : requested;
vasild commented at 5:40 am on June 2, 2024:Wow! Good catch! Maybe this deserves a comment because there is deeper reason for the way it is written than one may see at a first glance. (aka to prevent somebody restoring it back to? requested : 0
in a future refactor, code reorganization).
dergoegge commented at 9:41 am on June 3, 2024:Added some commentsvasild commented at 5:44 am on June 2, 2024: contributorApproach ACK 617d43e656816a532ca275707dffeb6e2a2e1fc7
Just one blocker below (about the overflow).
Thank you for the improvements!
dergoegge force-pushed on Jun 3, 2024[fuzz] Make peeking through FuzzedSock::Recv fuzzer friendly
FuzzedSock only supports peeking at one byte at a time, which is not fuzzer friendly when trying to receive long data. Fix this by supporting peek data of arbitrary length instead of only one byte.
[fuzz] Avoid endless waiting in FuzzedSock::{Wait,WaitMany}
Currently, when the FuzzedDataProvider of a FuzzedSock runs out of data, FuzzedSock::Wait and WaitMany will simulate endless waiting as the requested events are never simulated as occured. Fix this by simulating event occurence when ConsumeBool() returns false (e.g. when the data provider runs out). Co-authored-by: dergoegge <n.goeggi@gmail.com>
dergoegge force-pushed on Jun 3, 2024marcofleon approvedmarcofleon commented at 12:29 pm on June 4, 2024: contributorTested ACK 22d0f1a27ef7733b51b3c2138a8d01713df8f248 Ini2p
the private key needs to be at least 387 bytes. When this key is generated (using FuzzedSock) the bytes were being broken up such that the length of the saved key was always less than 387. HavingRecv
return the length of a byte vector, instead of a single byte, whenMSG_PEEK
is enabled fixes this and allows the fuzzer to continue through the I2P target (https://github.com/bitcoin/bitcoin/issues/28803).DrahtBot requested review from vasild on Jun 4, 2024DrahtBot requested review from brunoerg on Jun 4, 2024brunoerg approvedbrunoerg commented at 12:39 pm on June 4, 2024: contributorutACK 22d0f1a27ef7733b51b3c2138a8d01713df8f248fanquake merged this on Jun 4, 2024fanquake closed this on Jun 4, 2024
in src/test/fuzz/util/net.cpp:222 in 22d0f1a27e
214@@ -216,7 +215,11 @@ ssize_t FuzzedSock::Recv(void* buf, size_t len, int flags) const 215 } 216 return r; 217 } 218- std::memcpy(buf, random_bytes.data(), random_bytes.size()); 219+ // `random_bytes` might exceed the size of `buf` if e.g. Recv is called with 220+ // len=N and MSG_PEEK first and afterwards called with len=M (M < N) and 221+ // without MSG_PEEK. 222+ size_t recv_len{std::min(random_bytes.size(), len)}; 223+ std::memcpy(buf, random_bytes.data(), recv_len);
vasild commented at 1:04 pm on June 10, 2024:Now it will not overflow
buf
but it will lose the trailingN - M
bytes from the peek.To fix this, when assigning
random_bytes
earlier, it should only take/pop the firstlen
bytes fromm_peek_data
.I think all this may be unnecessary if the app has to work with peek only returning 1 byte, as discussed in #30211 (review)
dergoegge commented at 1:16 pm on June 10, 2024:These changes were very much necessary forFuzzedSock
to be fuzzer friendly. Feel free to open a follow up, if you have suggestion how further improve things.
vasild commented at 2:01 pm on June 12, 2024:… will lose the trailing
N - M
bytes from the peek.vasild commented at 1:04 pm on June 10, 2024: contributorI think this was merged prematurely.
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 03:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me