sipa
commented at 7:36 pm on March 11, 2024:
member
This PR contains a number of vaguely-related improvements to the random module.
The specific changes and more detailed rationale is in the commit messages, but the highlights are:
XoRoShiRo128PlusPlus (previously a test-only RNG) moves to random.h and becomes InsecureRandomContext, which is even faster than FastRandomContext but non-cryptographic. It also gets all helper randomness functions (randrange, fillrand, …), making it a lot more succinct to use.
During tests, all randomness is made deterministic (except for GetStrongRandBytes) but non-repeating (like GetRand() used to be when g_mock_deterministic_tests was used), either fixed, or from a random seed (overridden by env var).
Several infrequently used top-level functions (GetRandMillis, GetRandMicros, GetExponentialRand) are converted into member functions of FastRandomContext (and InsecureRandomContext).
GetRand<T>() (without argument) can now return the maximum value of the type (previously e.g. GetRand<uint32_t>() would never return 0xffffffff).
DrahtBot
commented at 7:36 pm on March 11, 2024:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
#30277 ([DO NOT MERGE] Erlay: bandwidth-efficient transaction relay protocol (Full implementation) by sr-gi)
#29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
#29543 (refactor: Avoid unsigned integer overflow in script/interpreter.cpp by hebasto)
#29536 (fuzz: fuzz connman with non-empty addrman + ASMap by brunoerg)
#29480 (Drop log category in SeedStartup by hebasto)
#29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
#26114 (net: Make AddrFetch connections to fixed seeds by mzumsande)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
DrahtBot added the label
CI failed
on Mar 12, 2024
DrahtBot
commented at 4:41 am on March 12, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
dergoegge
commented at 10:09 am on March 12, 2024:
member
Concept ACK
There are a few places in the fuzz tests where this will allow to easily replace FastRandomContext with a InsecureRandomContext, which is beneficial for performance (e.g. the addrman harnesses partially fill the addrman with addresses from rng) and we don’t need cryptographic rng there anyway.
During tests, all randomness is made deterministic
Sjors
commented at 3:11 pm on March 12, 2024:
member
What’s the impact on the fuzz corpus of switching to a different (?) deterministic RNG?
dergoegge
commented at 3:53 pm on March 12, 2024:
member
What’s the impact on the fuzz corpus of switching to a different (?) deterministic RNG?
I would expect that switching to a different rng should have no meaningful effect on the corpus itself. The corpus for a particular harness might change but the coverage for the code we intend to test should remain the same. This is because using rng in a fuzz harness only makes sense in very rare cases. It should never be used in a way that can significantly affect the coverage reached, otherwise there is no point in using a coverage-guided fuzzer, we could just pipe /dev/random to our harnesses.
For example, if we need to populate some data that we don’t really expect to have an impact on the thing we are testing, we might use rng instead of consuming from the fuzz input (we do this in the p2p transport harnesses to fill message contents, which are essentially irrelevant to the transport logic).
Switching to deterministic rng can cause a corpus’ coverage to grow because coverage-guided feedback loops start working more reliably when the code under test is deterministic. This can vary from harness to harness, but we’ve seen coverage-guided fuzzers find bugs once we’ve improved on determinism.
sipa force-pushed
on Mar 12, 2024
sipa force-pushed
on Mar 12, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa
commented at 11:21 pm on March 14, 2024:
member
I’m baffled by what is happening here. In the fuzz test CI job, the data directory gets deleted in between two fuzz input runs??
maflcko
commented at 8:57 am on March 15, 2024:
member
This is expected, because all fuzz tests in this pull request branch use the same datadir (5a19fd2bc079272a9b138710c6e50c5ea512c1a0736333c9817c4df1d77c8518). Maybe the randomness is too deterministic and returns a constant?
Also, the issue should happen locally as well, not only on CI. Something like ./test/fuzz/test_runner.py -j 9 -l DEBUG ./folder_temp reproduces it for me.
Edit:
025d4e22d8b25c33f60b91193673336e4ad1c9406 is the first bad commit
sipa force-pushed
on Mar 15, 2024
sipa
commented at 12:55 pm on March 15, 2024:
member
@maflcko Thank you! I suspect that is it. I indeed made all fuzz test use the same deterministic seed. I’ve dropped that change now.
DrahtBot removed the label
CI failed
on Mar 15, 2024
sipa force-pushed
on Mar 15, 2024
sipa force-pushed
on Mar 15, 2024
sipa force-pushed
on Mar 15, 2024
DrahtBot added the label
CI failed
on Mar 15, 2024
DrahtBot
commented at 7:04 pm on March 15, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
What’s the impact on the fuzz corpus of switching to a different (?) deterministic RNG?
Let’s break it down into cases:
The main “random.h” RNG (GetRand() and friends, default-constructed FastRandomContext objects, …). This used to be truly random in fuzz tests, and will now be deterministic with a fixed seed. In theory this should have no impact, because the fuzz tests shouldn’t be relying on this randomness in the first place. But possibly there are some which do, indirectly through code that wasn’t properly mocked or otherwise avoiding it, in which case making things deterministic should be a strict improvement by not making the fuzzer waste time on chasing the effects of that randomness.
XoRoShiRo128PlusPlus. There are a few fuzz tests (bip324, crypto_chacha20, p2p_transport_serialization, poolresource) that use this very fast deterministic RNG to construct certain data. This PR changes the behavior of some of them by replacing ad-hoc code to use that randomness with general helper functions that become available for all RNGs. In theory this might invalidate part of the fuzz corpus for those tests, but in practice I expect it won’t, because the data drawn from those RNGs is data that shouldn’t matter for the test much (if it did, it’d be drawn from the fuzz input instead).
So overall, it might invalidate a few tests’ corpus (but probably not), and for others it should either have no effect or be a strict improvement.
EthanHeilman
commented at 10:10 pm on April 5, 2024:
contributor
#28558 made PeerManager own a FastRandomContext, so we could (should?) use m_rng here instead (otherwise PeerManager::Options::deterministic_rng still only applies to some of the randomness).
Since this PR kind of makes individual “make this component deterministic” options redudant, we could consider reverting #28558 (not necessarily in this PR)?
I was thinking that in the long run we could break the dependencies between components and the specific rng they use (maybe something like template<RandomNumberGenerator R> class PeerManager { ... }), which would allow more fine grained mocking than a global “make rng deterministic” in tests (e.g. we could have a “rng” type that consumes from a FuzzedDataProvider). I guess this can be done by using globals as well.
#28558 made PeerManager own a FastRandomContext, so we could (should?) use m_rng here instead (otherwise PeerManager::Options::deterministic_rng still only applies to some of the randomness).
I’ve changed the PR to reuse PeerManagerImpl::m_rng.
Since this PR kind of makes individual “make this component deterministic” options redudant, we could consider reverting #28558 (not necessarily in this PR)?
Maybe. I’ve opted to use it where possible for now as it’s a smaller change, and has some (possibly negligible) performance advantage (no need to lock the global RNG mutex to get randomness when you already hold g_msgproc_mutex), but I think that can be reconsidered.
Independently, we may be able to just drop PeerManager::Options::deterministic_rng, relying on global deterministic mode instead.
I was thinking that in the long run we could break the dependencies between components and the specific rng they use (maybe something like template<RandomNumberGenerator R> class PeerManager { ... }), which would allow more fine grained mocking than a global “make rng deterministic” in tests (e.g. we could have a “rng” type that consumes from a FuzzedDataProvider). I guess this can be done by using globals as well.
Maybe, though that means testing something very different from what we’re doing here: testing under conditions where the RNG returns actually decidedly non-random results (which is different from a deterministic FastRandomContext which is still cryptographically-strong, just deterministic. I don’t know for how many things this makes sense.
sipa force-pushed
on Apr 30, 2024
DrahtBot added the label
CI failed
on Apr 30, 2024
DrahtBot removed the label
CI failed
on May 2, 2024
dergoegge approved
dergoegge
commented at 12:38 pm on May 20, 2024:
member
140- * is completely deterministic and does not gather more entropy after that.
141+// Forward declaration of RandomMixin, used in RandomNumberGenerator concept.
142+template<typename T>
143+class RandomMixin;
144+
145+/** A concept for RandomMixin-based random number generators. */
I think I decided that the compiler should be smart enough to optimize the final “else” case with Bits == 0 to effectively a return 0;, but it’s more obvious to just make it explicit. Done.
in
src/random.h:283
in
89fa561ab9outdated
271+ while (span.size() >= 8) {
272+ uint64_t gen = Impl().rand64();
273+ WriteLE64(UCharCast(span.data()), gen);
274+ span = span.subspan(8);
275+ }
276+ while (span.size()) {
Yes, it is required. However, adding the annotation will only silence the compiler warning.
FastRandomContext is not thread safe, so the lock will actually have to be taken.
Putting an EXCLUSIVE_LOCKS_REQUIRED only into the c++ file and leaving it out from the header file will not work, because the compiler has no way to see the annotation outside of the module. For example, it can not be seen from init.cpp.
I’ll give that patched branch a shot next week and see if it turns up anything else in our codebase. Will chime in upstream with a +1 as well (assuming it works).
I used the WIP above to check for other issues like this in our codebase. It only turned up a few false-positives because of duplicate annotations in declarations + definitions. Thankfully there were no cases of them only in definitions. PR’d here: #30316.
I have addressed this by making StartScheduledTasks not use m_rng, but just grab a new FastRandomContext instead; performance doesn’t matter for this function.
theuni
commented at 7:13 pm on May 28, 2024:
member
Concept ACK and first-pass code review ACK excluding tests.
I don’t feel qualified to review the rand_expo_duration changes.
I manually verified the randbits impls with some local tests as the shifts weren’t intuitive to me at first glance.
in
src/random.h:407
in
89fa561ab9outdated
362@@ -351,19 +363,19 @@ class FastRandomContext : public RandomMixin<FastRandomContext>
363 return ReadLE64(UCharCast(buf.data()));
364 }
365366- /** Fill a byte Span with random bytes. */
367+ /** Fill a byte Span with random bytes. This overrides the RandomMixin version. */
This is kinda sneaky and surprising (to me at least) to see with CRTP.
It would be nice to specify as part of the concept which functions are not allowed to be overridden, but sadly
std::experimental::detected_t is.. well.. experimental.
Well being able to ad-hoc override individual functions, while still being able to have non-overridden functions be able to refer to it, is sort of the point of CRTP (virtual functions have the same behavior, but CRTP gives it without runtime overhead). If we didn’t want this ability, it’d be possible to just have a template<typename R> class RNGUtility { R m_int; ...}, where there are low-level ChaCha20RNG and XoRoShiRo128PlusPlusRNG classes, and high-level using FastRandomContext = RNGUtility<ChaCha20RNG>; and using InsecureRandomContext = RNGUtility<XoRoShiRo128PlusPlusRNG>.
Being able to selectively mark certain functions as non-overridable would be nice of course, but I don’t see a way to do that currently.
sipa force-pushed
on May 28, 2024
bitcoin deleted a comment
on May 28, 2024
DrahtBot added the label
CI failed
on May 28, 2024
maflcko
commented at 2:02 pm on May 29, 2024:
member
0 node0 stderr random.h:218:26: runtime error: shift exponent -23 is negative
1 [#0](/bitcoin-bitcoin/0/) 0x61cb8c410ee5 in RandomMixin<FastRandomContext>::randbits(int) src/./random.h:218:26
2 [#1](/bitcoin-bitcoin/1/) 0x61cb8c460b2c in long RandomMixin<FastRandomContext>::randrange<long>(long) src/./random.h:270:35
3 [#2](/bitcoin-bitcoin/2/) 0x61cb8c460b2c in std::chrono::duration<long, std::ratio<1l, 1000l>> RandomMixin<FastRandomContext>::randrange<std::chrono::duration<long, std::ratio<1l, 1000l>>>(std::common_type<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type) src/./random.h:352:27
4 [#3](/bitcoin-bitcoin/3/) 0x61cb8c460b2c in (anonymous namespace)::PeerManagerImpl::StartScheduledTasks(CScheduler&) src/net_processing.cpp:2080:38
5 [#4](/bitcoin-bitcoin/4/) 0x61cb8c340ea6 in AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1980:37
6 [#5](/bitcoin-bitcoin/5/) 0x61cb8c2fa474 in AppInit(node::NodeContext&) src/bitcoind.cpp:227:43
7 [#6](/bitcoin-bitcoin/6/) 0x61cb8c2fa474 in main src/bitcoind.cpp:273:10
8 [#7](/bitcoin-bitcoin/7/) 0x77f2e3bd21c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 4d9090d61bf70e6b3225d583f0f08193f54670b2)
9 [#8](/bitcoin-bitcoin/8/) 0x77f2e3bd228a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 4d9090d61bf70e6b3225d583f0f08193f54670b2)
10 [#9](/bitcoin-bitcoin/9/) 0x61cb8c21cfe4 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/bitcoind+0xa1dfe4) (BuildId: 478a61c7d2f8deae95e82eb455a5b93dcb36ffa7)
11SUMMARY: UndefinedBehaviorSanitizer: invalid-shift-exponent random.h:218:26
214@@ -215,6 +215,7 @@ class RandomMixin
215 // number becomes the new bitbuf.
216 uint64_t gen = Impl().rand64();
217 ret = (gen << bitbuf_size) | bitbuf;
218+ assert(bits >= bitbuf_size);
I tried to track down if/how it could be a race condition, but given that it’s in init and all locks seems to be in place, I don’t see how that could possibly be the case.
Clearing the cache helped, for some reason. (Adding the assert should have already cleared the ccache entry for this translation unit, but now it seems to work :shrug: )
Looks like this commit is doing several things at once. It may be easier to review if move-only stuff (like moving rand256 and randbytes function bodies to the header file, and removing the explicit template instantiation) was done in a separate prepare commit?
I’ve added a few preparatory commits before this one, please have a look.
in
src/random.h:144
in
0b92cd2ae4outdated
149+ { rng.rand64() } noexcept -> std::same_as<uint64_t>;
150+ // A random number generator must provide randfill(Span<std::byte>).
151+ { rng.fillrand(s) } noexcept;
152+ // A random number generator must derive from RandomMixin, which adds other rand* functions.
153+ requires std::derived_from<std::remove_reference_t<T>, RandomMixin<std::remove_reference_t<T>>>;
154+};
nit in 0b92cd2ae4a317be4fe0dfd208017a7d05d1eb2a: Not sure I understand this concept. Is there anything that is not already enforced normally in C++ that would be enforced by this concept?
For example, static_cast in the Impl() method, which enforces this concept should already check for inheritance:
0<source>:17:40: error: static_cast from 'Mixin<FRC> *' to 'FRC *', which are not related by inheritance, is not allowed
Conversely, if the class wasn’t derived from RandomMixin, the RandomMixin::Impl() method wouldn’t be called and this check would be skipped anyway.
Even when the code is rewritten to C++23 using __cpp_explicit_this_parameter, the same should hold.
Similarly, calling rand64, when it is not provided should also result in a compile failure without this concept.
I guess it enforces the function signature of rand64 is noexcept?
Well in my view, concepts are really a kind of compiler-enforced documentation. Their primary purpose is giving a readable compiler error when template instantiation ought to be failing, but often also would be failing anyway. In this case, someone implementing a new RNG class FancyRNG : public RandomMixin<FancyRNG> but say missing rand64, would instead of getting an error in the instantiation of RandomMixin get an error message that references RandomNumberGenerator, which then explicitly lists what one is required to do to make it work.
383+
384+ constexpr explicit XoRoShiRo128PlusPlus(uint64_t seedval) noexcept
385+ : m_s0(SplitMix64(seedval)), m_s1(SplitMix64(seedval)) {}
386+
387+ // no copy - that is dangerous, we don't want accidentally copy the RNG and then have two streams
388+ // with exactly the same results. If you need a copy, call copy().
It was added in #26153, with a commit originally taken from #25325. I assume it originally had an explicit copy() member function, which was removed before merge. I’ve added a commit to remove it.
sipa force-pushed
on Jun 5, 2024
sipa force-pushed
on Jun 5, 2024
maflcko
commented at 1:38 pm on June 5, 2024:
member
I’ve rebased on top of the now-merged #30161, and believe I’ve addressed outstanding comments.
in
src/net.cpp:3479
in
462cca90c4outdated
3474@@ -3475,7 +3475,8 @@ std::vector<CAddress> CConnman::GetAddresses(CNode& requestor, size_t max_addres
3475 // nodes to be "terrible" (see IsTerrible()) if the timestamps are older than 30 days,
3476 // max. 24 hours of "penalty" due to cache shouldn't make any meaningful difference
3477 // in terms of the freshness of the response.
3478- cache_entry.m_cache_entry_expiration = current_time + std::chrono::hours(21) + GetRandMillis(std::chrono::hours(6));
3479+ cache_entry.m_cache_entry_expiration = current_time +
3480+ 21h + FastRandomContext().randrange<std::chrono::microseconds>(6h);
EthanHeilman
commented at 11:14 pm on June 8, 2024:
Is the change from milliseconds to microseconds intentional?
Yes, because m_cache_entry_expiration is expressed in microseconds anyway; no need to reduce accuracy. I’ve moved the behavior change to a separate commit.
in
src/random.h:360
in
462cca90c4outdated
473+ * certain type of message) to minimize leaking information to observers.
474+ *
475+ * The probability of an event occurring before time x is 1 - e^-(x/a) where a
476+ * is the average interval between events.
477+ * */
478+ std::chrono::microseconds rand_expo_duration(std::chrono::microseconds mean) noexcept
EthanHeilman
commented at 11:22 pm on June 8, 2024:
Nit: Why not rand_exp_duration instead? It is one character shorter and exp reads more clearly to me as exponential than expo?
This is done in commit e5658e7a40f03b0b07b1f83f22a9d834bc8fd5c6 which only improves the code organization. Specifically here the reason for moving is to have all the non-exposed functions in random.cpp within one anonymous namespace at the start of the file, and all public one below.
EthanHeilman
commented at 2:24 pm on June 9, 2024:
That makes sense. This improved organization is worth it.
Although GetOSRand is declared in random.h, it isn’t actually used by anything outside of random.cpp, so it could remain inside the anonymous namespace.
in
src/random.cpp:421
in
462cca90c4outdated
416 /** Extract up to 32 bytes of entropy from the RNG state, mixing in new entropy from hasher.
417 *
418 * If this function has never been called with strong_seed = true, false is returned.
419+ *
420+ * If allow_deterministic is true, and MakeDeterministic has been called before, output
421+ * from the deterministic PRNG is output instead.
EthanHeilman
commented at 11:49 pm on June 8, 2024:
This confused me because MixExtract is a deterministic function based on hasher, m_counter, and m_state. Unless I am mistaken the non-determinism comes from the calling function determining on the value of ret. Why not address strengthen ret value in the calling function rather than here?
EthanHeilman
commented at 11:51 pm on June 8, 2024:
0 * If allow_deterministic is true, and MakeDeterministic has been called before, output
1 * from the deterministic PRNG.
I see how this is confusing; MixExtract is indeed a deterministic function, and arguably, all of RNGState is deterministic when considered in isolation.
The “determinism” referred here is the introduction of a new deterministic mode for the entire RNG (module-wide, including seeding/initializations/hardware, …). In order to do that, a separate RNG is embedded within RNGState which is only ever explicitly initialized, and if it is, MixExtract can be asked to draw its output from there instead of from its normal SHA512-based mechanism. The entropy fed to MixExtract is still processed normally in this case because for some purposes (cryptographic keys) I think we always want normal high-quality randomness for belt-and-suspenders reasons. Finally, all of this is done inside MixExtract instead of elsewhere so that the same lock can be used for everything (RNGState::m_mutex), meaning the performance penalty of introducing this mechanism is limited to the if (allow_deterministic && m_deterministic_prng.has_value()) conditional, which involves no atomics or locks.
So really I think we want to call the RNG-wide name for this feature “deterministic mode” (e.g. at ProcRand), but maybe using the same name at the level of RNGState is confusing? Would it help to call it call it something else?
EthanHeilman
commented at 2:38 pm on June 9, 2024:
If I understand, you don’t want to replace MixExtract entirely when the deterministic test RNG is used, because when MixExtract to still run and add/mix entropy in case someone decides to generate a keying material while the deterministic test RNG is on. This was what I was missing.
I like the name deterministic mode because it makes it more clear this is a RNG wide change.
Thought I had after waking up this morning was to change this to bool fail_on_deterministic_mode. Thus if someone had the deterministic test RNG activated and then tried to generate a key it would throw an error because fail_on_deterministic_mode=true. That said, I’m not sure this works with the current approach since if someone calls SeedStrengthen with the deterministic test RNG activated it would fail and it is reasonable to call SeedStrengthen in tests/fuzzing. Edit: You make a similar point in the next comment. I don’t think my proposed change here is worth making in this PR.
How about having an always_use_real_rng bool instead, with meaning !allow_deterministic. That means the “deterministic” terminology would be more clearly referring to the mode (is the deterministic PRNG enabled), but not to the produced numbers.
in
src/random.cpp:521
in
462cca90c4outdated
518+void SeedStrengthen(CSHA512& hasher, RNGState& rng, SteadyClock::duration dur) noexcept
519 {
520 // Generate 32 bytes of entropy from the RNG, and a copy of the entropy already in hasher.
521 unsigned char strengthen_seed[32];
522- rng.MixExtract(strengthen_seed, sizeof(strengthen_seed), CSHA512(hasher), false);
523+ rng.MixExtract(strengthen_seed, sizeof(strengthen_seed), CSHA512(hasher), false, /*allow_deterministic=*/true);
EthanHeilman
commented at 11:59 pm on June 8, 2024:
If I am understanding this logic correctly, if m_deterministic_prng.has_value() is true then MixExtract in SeedStrengthen will never use m_counter or m_state to do MixExtract?
Strengthen is going to make the output of this call non-deterministic anyways, so what value does allow_deterministic=true as an argument provide in MixExtract? Shouldn’t this be false?
Is the intention to make Strengthen deterministic at some later point?
437@@ -479,6 +438,13 @@ class RNGState {
438 hasher.Finalize(buf);
439 // Store the last 32 bytes of the hash output as new RNG state.
440 memcpy(m_state, buf + 32, 32);
441+ // Handle requests for deterministic randomness.
442+ if (allow_deterministic && m_deterministic_prng.has_value()) {
EthanHeilman
commented at 0:02 am on June 9, 2024:
Does allow_deterministic=false and m_deterministic_prng.has_value()=true imply that something has gone wrong? Should we log an error or take some action or are there cases where this is expected behavior?
Hmm. The intent was that this is supported behavior; GetStrongRandBytes() never sets allow_deterministic, and thus in deterministic mode would still return from the normal fully-seeded RNG (because I don’t want to introduce anything in the code path of GetStrongRandBytes that could return anything but high quality randomness). This means that if GetStrongRandBytes is called anywhere in test code that enables deterministic randomness, the condition you state occurs.
It’s probably possible to instead make the behavior to abort if allow_deterministic=false and m_deterministic_prng.has_value()=true, but it would involve going over all cases where tests use GetStrongRandBytes (directly, or indirectly). I don’t think I want to do that in this PR.
sipa force-pushed
on Jun 9, 2024
DrahtBot added the label
CI failed
on Jun 9, 2024
EthanHeilman approved
EthanHeilman
commented at 4:02 pm on June 9, 2024:
contributor
ACK125509f395a214465ebf379164e41e4c6d19d443
I have not tested the code. All my comments have been addressed.
DrahtBot requested review from dergoegge
on Jun 9, 2024
DrahtBot requested review from theuni
on Jun 9, 2024
DrahtBot removed the label
CI failed
on Jun 9, 2024
DrahtBot added the label
CI failed
on Jun 9, 2024
DrahtBot removed the label
CI failed
on Jun 9, 2024
DrahtBot added the label
CI failed
on Jun 9, 2024
DrahtBot removed the label
CI failed
on Jun 9, 2024
maflcko
commented at 11:30 am on June 10, 2024:
member
The false positive CI error still happens. I am taking another look.
maflcko
commented at 8:34 pm on June 10, 2024:
member
I could reproduce once outside the CI env on a 24.04 Ubuntu vm:
DrahtBot added the label
Needs rebase
on Jun 11, 2024
sipa force-pushed
on Jun 11, 2024
sipa
commented at 11:19 pm on June 11, 2024:
member
Rebased.
DrahtBot removed the label
Needs rebase
on Jun 12, 2024
DrahtBot
commented at 7:52 am on June 12, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 12, 2024
DrahtBot removed the label
CI failed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 12, 2024
DrahtBot removed the label
CI failed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 13, 2024
DrahtBot removed the label
CI failed
on Jun 13, 2024
maflcko
commented at 7:41 am on June 13, 2024:
member
I am still looking into the clang error. I couldn’t find an issue with your code, but minimizing/reducing the clang bug will take some time, because it appears non-deterministic.
maflcko
commented at 9:33 am on June 15, 2024:
member
It is in your code, and not a compiler bug :sweat_smile:
Wow, nice catch, thank you. Will fix next week.
sipa force-pushed
on Jun 18, 2024
sipa
commented at 5:29 pm on June 18, 2024:
member
Made the following changes:
0diff --git a/src/random.cpp b/src/random.cpp
1index 71a2fcc781..39a1e3eab9 100644
2--- a/src/random.cpp
3+++ b/src/random.cpp
4@@ -763,5 +763,12 @@ void RandomInit()
5 6 double MakeExponentiallyDistributed(uint64_t uniform) noexcept
7 {
8- return -std::log1p(uniform * -5.42101086242752217e-20 /* -1/2^64 */);
9+ // To convert uniform into an exponentially-distributed double, we use two steps:
10+ // - Convert uniform into a uniformly-distributed double in range [0, 1), use the expression
11+ // ((uniform >> 11) * 0x1.0p-53), as described in https://prng.di.unimi.it/ under
12+ // "Generating uniform doubles in the unit interval". Call this value x.
13+ // - Given an x in uniformly distributed in [0, 1), we find an exponentially distributed value
14+ // with mean 1 as (-log(1 - x)).
15+ // Combining the two, and using log1p(x) = log(1 + x), we obtain the following:
16+ return -std::log1p((uniform >> 11) * -0x1.0p-53);
17 }
18diff --git a/src/random.h b/src/random.h
19index 5766411bb8..8e8001d078 100644
20--- a/src/random.h
21+++ b/src/random.h
22@@ -150,7 +150,7 @@ concept StdChronoDuration = requires {
23 std::type_identity<T>());
24 };
2526-/** Given a uniformly random 64-bit value uniform, return an exponentially distributed value with mean 1. */
27+/** Given a uniformly random uint64_t, return an exponentially distributed double with mean 1. */
28 double MakeExponentiallyDistributed(uint64_t uniform) noexcept;
2930 /** Mixin class that provides helper randomness functions.
31diff --git a/src/net_processing.cpp b/src/net_processing.cpp
32index 7a0b0996d0..e52b897921 100644
33--- a/src/net_processing.cpp
34+++ b/src/net_processing.cpp
35@@ -513,7 +514,7 @@ public:
36 EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex, !m_recent_confirmed_transactions_mutex, !m_most_recent_block_mutex, g_msgproc_mutex);
3738 /** Implement PeerManager */
39- void StartScheduledTasks(CScheduler& scheduler) override EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
40+ void StartScheduledTasks(CScheduler& scheduler) override;
41 void CheckForStaleTipAndEvictPeers() override;
42 std::optional<std::string> FetchBlock(NodeId peer_id, const CBlockIndex& block_index) override
43@@ -2077,7 +2080,7 @@ void PeerManagerImpl::StartScheduledTasks(CScheduler& scheduler)
44 scheduler.scheduleEvery([this] { this->CheckForStaleTipAndEvictPeers(); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});
4546 // schedule next run for 10-15 minutes in the future
47- const auto delta = 10min + m_rng.randrange<std::chrono::milliseconds>(5min);
48+ const auto delta = 10min + FastRandomContext().randrange<std::chrono::milliseconds>(5min);
49 scheduler.scheduleFromNow([&] { ReattemptInitialBroadcast(scheduler); }, delta);
50 }
DrahtBot added the label
CI failed
on Jun 18, 2024
DrahtBot removed the label
CI failed
on Jun 19, 2024
DrahtBot added the label
Needs rebase
on Jun 20, 2024
random: write rand256() in function of fillrand()4e74e19d3e
random: move rand256() and randbytes() to .h file3e46525fc9
random: add a few noexcepts to FastRandomContext18fd6487d5
random: use BasicByte concept in randbytes01ef95ca98
random: refactor: move rand* utilities to RandomMixin
Rather than make all the useful types of randomness be exclusive to
FastRandomContext, move it to a separate RandomMixin class where it can be reused by
other RNGs.
A Curiously Recurring Template Pattern (CRTP) is used for this, to provide the ability
for individual RNG classes to override one or more randomness functions, without
needing the runtime-cost of virtual classes.
Specifically, RNGs are expected to only provide fillrand and rand64, while all others
are derived from those:
- randbits
- randrange
- randbytes
- rand32
- rand256
- randbool
- rand_uniform_delay
- rand_uniform_duration
- min(), max(), operator()(), to comply with C++ URBG concept.
7f24615249
random: Improve RandomMixin::randbits
The previous randbits code would, when requesting more randomness than available
in its random bits buffer, discard the remaining entropy and generate new.
Benchmarks show that it's usually better to first consume the existing randomness
and only then generate new ones. This adds some complexity to randbits, but it
doesn't weigh up against the reduced need to generate more randomness.
0807afa5c4
random: add RandomMixin::randbits with compile-known bits
In many cases, it is known at compile time how many bits are requested from
randbits. Provide a variant of randbits that accepts this number as a template,
to make sure the compiler can make use of this knowledge. This is used immediately
in rand32() and randbool(), and a few further call sites.
e4eb03715d
random: modernize XoRoShiRo128PlusPlus a bit
Make use of C++20 functions in XoRoShiRo128PlusPlus.
059b39c24f
xoroshiro128plusplus: drop comment about nonexisting copy()83c6356a70
random: move XoRoShiRo128PlusPlus into random module
This is preparation for making it more generally accessible.
9ce910381b
random: convert XoRoShiRo128PlusPlus into full RNG
Convert XoRoShiRo128PlusPlus into a full RandomMixin-based RNG class,
providing all utility functionality that FastRandomContext has. In doing so,
it is renamed to InsecureRandomContext, highlighting its non-cryptographic
nature.
To do this, a fillrand fallback is added to RandomMixin (where it is used by
InsecureRandomContext), but FastRandomContext still uses its own fillrand.
38bd2062cb
tests: overhaul deterministic test randomness
The existing code provides two randomness mechanisms for test purposes:
- g_insecure_rand_ctx (with its wrappers InsecureRand*), which during tests is
initialized using either zeros (SeedRand::ZEROS), or using environment-provided
randomness (SeedRand::SEED).
- g_mock_deterministic_tests, which controls some (but not all) of the normal
randomness output if set, but then makes it extremely predictable (identical
output repeatedly).
Replace this with a single mechanism, which retains the SeedRand modes to control
all randomness. There is a new internal deterministic PRNG inside the random
module, which is used in GetRandBytes() when in test mode, and which is also used
to initialize g_insecure_rand_ctx. This means that during tests, all random numbers
are made deterministic. There is one exception, GetStrongRandBytes(), which even
in test mode still uses the normal PRNG state.
This probably opens the door to removing a lot of the ad-hoc "deterministic" mode
functions littered through the codebase (by simply running relevant tests in
SeedRand::ZEROS mode), but this isn't done yet.
6d7f91db50
random: make GetRand() support entire range (incl. max)
The existing code uses GetRand(nMax), with a default value for nMax, where nMax is the
range of values (not the maximum!) that the output is allowed to take. This will always
miss the last possible value (e.g. GetRand<uint32_t>() will never return 0xffffffff).
Fix this, by moving the functionality largely in RandomMixin, and also adding a
separate RandomMixin::rand function, which returns a value in the entire (non-negative)
range of an integer.
cf04e6c90e
net: use GetRandMicros for cache expiration
This matches the data type of m_cache_entry_expiration.
b3d715f2f6
random: convert GetRand{Micros,Millis} into randrange
There are only a few call sites of these throughout the codebase, so
move the functionality into FastRandomContext, and rewrite all call sites.
This requires the callers to explicit construct FastRandomContext objects,
which do add to the verbosity, but also make potentially apparent locations
where the code can be improved by reusing a FastRandomContext object (see
further commit).
22c03710b3
random: convert GetExponentialRand into rand_exp_delay75bcc1057b
random: improve precision of MakeExponentiallyDistributed
The old code only used 48 bits of entropy to construct a uniformly random
double. Switch to the technique suggested in https://prng.di.unimi.it/,
which uses 53 bits.
e2956a4319
net, net_processing: use existing RNG objects more
PeerManagerImpl, as well as several net functions, already have existing
FastRandomContext objects. Reuse them instead of constructing new ones.
5a93d08f35
random: cleanup order, comments, static818fb4aa83
tests: make fuzz tests (mostly) deterministic with fixed seeddac177c79e
random: use LogError for init failurefb02b74c09
random: replace construct/assign with explicit Reseed()f0e5681e97
sipa force-pushed
on Jun 20, 2024
sipa
commented at 6:19 pm on June 20, 2024:
member
DrahtBot removed the label
Needs rebase
on Jun 20, 2024
in
src/random.h:437
in
cf04e6c90eoutdated
429@@ -445,6 +430,33 @@ void Shuffle(I first, I last, R&& rng)
430 }
431 }
432433+/** Generate a uniform random integer of type T in the range [0..nMax)
434+ * Precondition: nMax > 0, T is an integral type, no larger than uint64_t
435+ */
436+template<typename T>
437+T GetRand(T nMax) noexcept {
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2024-06-26 16:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me