CCoinsViewCache
objects, and to simulation data, comparing the two at the end.
Add simulation-based CCoinsViewCache
fuzzer
#27011
pull
sipa
wants to merge
4
commits into
bitcoin:master
from
sipa:202301_coinscache_fuzz
changing
7
files
+514 −6
-
sipa commented at 11:12 pm on January 31, 2023: memberThe fuzzer goes through a sequence of operations that get applied to both a real stack of
-
DrahtBot commented at 11:12 pm on January 31, 2023: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #25325 (Add pool based memory resource by martinus)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
-
sipa force-pushed on Feb 1, 2023
-
in src/test/fuzz/coinscache_sim.cpp:243 in f87382af7a outdated
238+ // to both the real cache stack and the simulation. 239+ auto it = buffer.begin(); 240+ unsigned iters = 0; 241+ while (it != buffer.end()) { 242+ // Upper limit on number of operations. 243+ if (++iters == 10000U) break;
maflcko commented at 8:24 am on February 1, 2023:nit: Could replace those four lines with justLIMITED_WHILE
?
dergoegge commented at 11:35 am on February 1, 2023:Why not consume until the buffer is empty?
Input sizes are limited by the fuzzing engines anyway (e.g. libFuzzer’s default is 4096 bytes when starting from an empty corpus), so the max. number of ops is capped by that.
maflcko commented at 12:00 pm on February 1, 2023:libFuzzer’s default is 4096 bytes
We use more fuzz engines than libFuzzer on default settings ;)
You can look up the history of LIMITED_WHILE to get an idea of how much is broken if the macro is removed.
sipa commented at 2:07 pm on February 1, 2023:nit: Could replace those four lines with just
LIMITED_WHILE
?Done.
Why not consume until the buffer is empty?
It does, by the way (in addition to having an upper limit).
in src/test/fuzz/coinscache_sim.cpp:254 in f87382af7a outdated
249+ if (caches.empty()) { 250+ caches.emplace_back(new CCoinsViewCache(&bottom)); 251+ sim_caches[caches.size()].Wipe(); 252+ } 253+ 254+ unsigned cmd = (*(it++)) % 17U;
maflcko commented at 8:24 am on February 1, 2023:nit: If you do switch to FuzzedDataProvider, could use CallOneOf?
.
sipa commented at 2:07 pm on February 1, 2023:Done.in src/test/fuzz/coinscache_sim.cpp:258 in f87382af7a outdated
253+ 254+ unsigned cmd = (*(it++)) % 17U; 255+ switch (cmd) { 256+ 257+ case 0: { // GetCoin 258+ uint32_t outpointidx = (it == buffer.end()) ? 0 : *(it++);
maflcko commented at 8:27 am on February 1, 2023:nit: If you do switch to FuzzedDataProvider, could use the
<uint8_t>
int getter? Alternatively, maybe write a lambda:const auto get_uint8_t{[](auto& it) { return it == buffer.end() ? 0 : *(it++); }}
and use it in all places with0 uint32_t outpointidx{get_uint8_t(it)};
sipa commented at 2:08 pm on February 1, 2023:Done, switchedprovider.ConsumeIntegralInRange<uint32_t>(0, NUM_OUTPOINTS - 1);
, which makes it not strongly tied to the range of auint8_t
.maflcko commented at 8:34 am on February 1, 2023: memberLeft some style nits. (Can be ignored unless there is other stuff to push)fanquake requested review from dergoegge on Feb 1, 2023in src/test/fuzz/coinscache_sim.cpp:194 in f87382af7a outdated
189+} // namespace 190+ 191+FUZZ_TARGET(coinscache_sim) 192+{ 193+ /** Precomputed COutPoint and CCoins values. */ 194+ const PrecomputedData data;
dergoegge commented at 11:20 am on February 1, 2023:0 static const PrecomputedData data;
PrecomputedData
is generated deterministically, right? Given that and it being const, there should be no need to generate it on each iteration.
sipa commented at 2:08 pm on February 1, 2023:Oops, good catch. Made itstatic
, which makes fuzzing several times faster.dergoegge commented at 11:52 am on February 1, 2023: memberConcept ACKin src/test/fuzz/coinscache_sim.cpp:26 in f87382af7a outdated
20+constexpr uint32_t NUM_OUTPOINTS = 256; 21+/** Number of distinct Coin values used in this test (ignoring nHeight). */ 22+constexpr uint32_t NUM_COINS = 256; 23+/** Maximum number CCoinsViewCache objects used in this test. */ 24+constexpr uint32_t MAX_CACHES = 4; 25+/** Data type large enough to hold 2^NUM_COINS-1. */
jamesob commented at 1:10 pm on February 1, 2023:Confused by this comment -2**8
doesn’t seem to be able to fit2**256-1
. Did you mean to writeNUM_COINS - 1
?
sipa commented at 2:28 pm on February 1, 2023:Indeed!sipa force-pushed on Feb 1, 2023in src/test/fuzz/coinscache_sim.cpp:397 in c7d20096b6 outdated
391+ // Apply to real caches. 392+ caches.back()->Flush(); 393+ }, 394+ 395+ [&]() { // Sync. 396+ // Apply to simulation data (note that in our simulation, syncing and flushing is the same thing).
jamesob commented at 2:13 pm on February 1, 2023:Hm. Can’t you emulateSync()
by adding anerase
parameter toflush()
that just doesn’t set the cache entry toEntryType::NONE
? If you don’t do this, won’t the contents of the simulation cache differ from the real caches?
sipa commented at 2:30 pm on February 1, 2023:Semantically, Flush() and Sync() have the same effect: they push the changes made in this cache to the parent coinsview. Whether or not entries stay in a cache is an implementation aspect that’s not observable (except through CacheSize() etc).
The NONE entry in the simulated cache is also different from “missing” in the real cache, in that lookups in the simulated cache never pull in entries from parent caches, so you can’t really compare the actual cache entrytype with anything the real caches do. They’re wholly different implementations, but their observable behavior should be identical.
You’re right that I could choose to not reset things to NONE in the simulated cache, but that’d in effect be extending the test to test the behavior of the simulation: resetting to NONE is just a non-observable optimization too there.
jamesob approvedjamesob commented at 2:20 pm on February 1, 2023: memberACK c7d20096b6808b11aadda7a652273298b3dd98a8 (
jamesob/ackr/27011.2.sipa.add_simulation_based_cco
)Built and reviewed locally. Always happy to see more coins tests.
Just a few questions, but as-is this represents a mergeable improvement.
0-----BEGIN PGP SIGNED MESSAGE----- 1Hash: SHA512 2 3ACK c7d20096b6808b11aadda7a652273298b3dd98a8 ([`jamesob/ackr/27011.2.sipa.add_simulation_based_cco`](https://github.com/jamesob/bitcoin/tree/ackr/27011.2.sipa.add_simulation_based_cco)) 4 5Built and reviewed locally. Always happy to see more coins tests. 6 7Just a few questions, but as-is this represents a mergeable improvement. 8-----BEGIN PGP SIGNATURE----- 9 10iQIzBAEBCgAdFiEEGNRVI1NPYuZCSIrGepNdrbLETwUFAmPadUMACgkQepNdrbLE 11TwUs4w//V3HDJYwqiLfMmjKTT1Mo80v0cyFG2WQrUlnLIEBOdSZSor1ur2AgzOjk 124paKsSCuMzUVfm+CHVvuIRxpPyg9Q4sEONql979/H1yB8VvK7zbddJJ1nyKuEHJQ 13mex99u0TWspnQRst6T1B1STwSX7Sjc6UUAcVG1dkIZxHlaR0ll8ZDaQcHo9vUWyy 14gHVRi6V+YsiyVqTJ59hEYgkbHQASjLIFFSGyghMDJTp3jTtOKEDJZzdzpL8KgqSE 151Mj1/IsEasgekgxAWKuAC29PetSC+XktzHVrLGfR1IHnWnMQ1eoe8lLg86FD1byS 16U+QpTjb4KMf6zWiflzWcIE0EHgmxaBqug4uAuufPrZKd5r2OQyDxy1LkiawatTgu 17iwSvFriNH6X2B7I8J34l+/9g9NxubwxVoJvscuuk14vW80IK6lUyPsxc/ujIqiRl 18ganCc9Obon1JViEdh/t/sjmmHzCFpYmftvjNn+Kwl0XVSMa2a1zKIKMDoR18l/8L 19v60G6haqyl49H2mmjIPEnomsujwtQQW6mGYolYhAI+LuYOCmkPHQ6Mw8VsGZRmdg 20xxoATDjx5wMoGCjg3/p8TWszUTHDo3UxpkcBlR9lteWpOTCi3LtaNIYuv8/aBYJJ 21nbtJlxNXUGFoYBbC/RxLhTx6orbkcldElwXPtxhx/CYKmxWXMiA= 22=f9q/ 23-----END PGP SIGNATURE-----
0Tested on Linux-6.1.6-arch1-3-x86_64-with-glibc2.36 1 2Configured with ./configure LDFLAGS=-L/home/james/src/bitcoin2/db4/lib/ CPPFLAGS=-I/home/james/src/bitcoin2/db4/include/ 'CXXFLAGS=-fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis' --enable-wallet --enable-debug --with-daemon --enable-natpmp-default CXX=/usr/bin/clang++ CC=/usr/bin/clang --disable-shared --with-pic --enable-benchmark=no --enable-module-recovery --disable-module-ecdh --no-create --no-recursion 3 4Compiled with /usr/bin/ccache /usr/bin/clang++ -std=c++17 -mavx -mavx2 -mpclmul -fPIE -pipe -O2 -g -Wthread-safety-analysis -Wall -Werror=sign-compare -Wsign-compare -Werror=thread-safety-analysis -O0 -g3 -ftrapv -fdebug-prefix-map=$(abs_top_srcdir)=. -Wstack-protector -fstack-protector-all -fcf-protection=full -fstack-clash-protection -msse4.1 -msse4.2 -msse4 -msha i 5 6Compiler version: clang version 15.0.7
sipa force-pushed on Feb 1, 2023jamesob approvedjamesob commented at 4:15 pm on February 1, 2023: memberre-ACK fd84485
A small refactoring change and comment updates.
0git range-diff upstream/master ackr/27011.2.sipa.add_simulation_based_cco ackr/27011.3.sipa.add_simulation_based_cco
Add simulation-based CCoinsViewCache fuzzer
The fuzzer goes through a sequence of operations that get applied to both a real stack of CCoinsViewCache objects, and to simulation data, comparing the two at the end.
sipa force-pushed on Feb 1, 2023in src/coins.cpp:336 in bc2cc7243f outdated
328+ // Recompute cachedCoinsUsage. 329+ recomputed_usage += entry.coin.DynamicMemoryUsage(); 330+ } 331+ assert(recomputed_usage == cachedCoinsUsage); 332+} 333+
unknown commented at 11:53 pm on February 1, 2023:Not sure about 0, 1, 2 etc. used here:
Readability: Instead of using numbers, meaningful names for the different coin attributes could be used. This makes the code easier to understand and maintain.
sipa commented at 11:59 pm on February 1, 2023:There really is no meaning at all; they’re arbitrary bits assigned to certain properties.sipa force-pushed on Feb 1, 2023Add CCoinsViewCache::SanityCheck() and use it in fuzz test b0ff310840sipa force-pushed on Feb 2, 2023Add deterministic mode to CCoinsViewCache 59e6828bb5in src/coins.cpp:36 in 883c2c8258 outdated
31@@ -32,7 +32,9 @@ bool CCoinsViewBacked::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlock, 32 std::unique_ptr<CCoinsViewCursor> CCoinsViewBacked::Cursor() const { return base->Cursor(); } 33 size_t CCoinsViewBacked::EstimateSize() const { return base->EstimateSize(); } 34 35-CCoinsViewCache::CCoinsViewCache(CCoinsView* baseIn) : CCoinsViewBacked(baseIn) {} 36+CCoinsViewCache::CCoinsViewCache(CCoinsView* baseIn, bool deterministic) : 37+ CCoinsViewBacked(baseIn), cacheCoins(0, SaltedOutpointHasher(deterministic))
dergoegge commented at 11:21 am on February 2, 2023:I think currently the determinism is lost when
ReallocateCache
is called, because we call the constructor there without the deterministicSaltedOutpointHasher
.
sipa commented at 2:58 pm on February 2, 2023:Fixed, I think.sipa force-pushed on Feb 2, 2023dergoegge approveddergoegge commented at 4:48 pm on February 2, 2023: memberCode review ACK 59e6828bb5b56a2354a80059d3f660f551f3e207fanquake requested review from jamesob on Feb 2, 2023maflcko commented at 9:48 pm on February 2, 2023: memberin src/test/fuzz/coinscache_sim.cpp:177 in 59e6828bb5 outdated
172+ m_data[it->first] = it->second.coin; 173+ } 174+ } else { 175+ /* For non-dirty entries being written, compare them with what we have. */ 176+ if (it->second.coin.IsSpent()) { 177+ assert(m_data.count(it->first) == 0);
maflcko commented at 9:51 pm on February 2, 2023:
sipa commented at 10:00 pm on February 2, 2023:That’s not what I would have expected when I wrote it, but it makes sense; it’s related to how line 55 of coins.cpp (https://drahtbot.space/host_reports/DrahtBot/reports/coverage_fuzz/monotree/59e6828bb5b56a23/fuzz.coverage/src/coins.cpp.gcov.html) isn’t covered either.
When a coin is looked up in a child cache, and the parent reports it doesn’t exist, the child does not cache that absence (this is in contrast to a successful lookup, which does get copied into the child). As a result, we never actually exercise the case where non-
DIRTY
IsSpent()
appear in the caches.I don’t think this is a problem. This fuzz test is designed to simulate a relatively realistic scenario of cache creations/deletions/flushes/syncs/additions/spends/lookups, and I believe the above just means that a lack of non-DIRTY spent coins also doesn’t appear in actual block processing.
sipa commented at 3:34 pm on February 3, 2023:I added a commit that ought to exercise these cases. I’ll also submit a follow-up to the qa-assets repo to make use of these.Exercise non-DIRTY spent coins in caches in fuzz test 561848aaf2jamesob commented at 4:34 pm on February 7, 2023: memberSjors commented at 2:32 pm on February 8, 2023: memberI ran the fuzzer using the latest corpus for about an hour and it didn’t find any crashes, so that’s nice.sipa commented at 2:44 pm on February 8, 2023: memberThe seeds I’ve added to qa-assets for this fuzzer (https://github.com/bitcoin-core/qa-assets/pull/103) are the result of a few months of CPU time, FWIW.DrahtBot requested review from dergoegge on Feb 10, 2023dergoegge commented at 12:08 pm on February 13, 2023: memberCode review ACK 561848aaf2d67791e92754f3d11813bc53959a8fmaflcko merged this on Feb 13, 2023maflcko closed this on Feb 13, 2023
sidhujag referenced this in commit fed4b784cb on Feb 13, 2023martinus commented at 6:40 am on February 14, 2023: contributor@sipa While rebasing #25325 I saw that the fuzzer in
coins_view.cpp
still usesCCoinsMap coins_map;
and doesn’t use the deterministic seeds, shouldn’t this fuzzer do that as well?I can add it while rebasing, I’m touching that line anyways. Unless you want to do this separately
EDIT: I had a look what happens when I remove the default
bool deterministic = false
inSaltedOutpointHasher
, and found another use of the map in the fuzzer, intx_pool.cpp
throughconst CCoinsViewMemPool amount_view{WITH_LOCK(::cs_main, return &chainstate.CoinsTip()), tx_pool};
sipa commented at 0:14 am on February 15, 2023: memberI saw that the fuzzer in coins_view.cpp still uses CCoinsMap coins_map;
I think that should probably be addressed.
and doesn’t use the deterministic seeds, shouldn’t this fuzzer do that as well?
I believe it does?
0 CCoinsViewCache coins_view_cache{&backend_coins_view, /*deterministic=*/true};
EDIT: I had a look what happens when I remove the default bool deterministic = false in SaltedOutpointHasher, and found another use of the map in the fuzzer, in tx_pool.cpp through const CCoinsViewMemPool amount_view{WITH_LOCK(::cs_main, return &chainstate.CoinsTip()), tx_pool};
Makes sense to change that too, I think.
martinus commented at 5:15 am on February 15, 2023: contributorI believe it does?
It does in line 49 for
CCoinsViewCache
, but not in line 118 for theCCoinsMap coins_map;
. I’ve added that in #25325 herebitcoin locked this on Feb 18, 2024
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 06:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me