sipa
commented at 7:36 pm on March 11, 2024:
member
This PR contains a number of vaguely-related improvements to the random module.
The specific changes and more detailed rationale is in the commit messages, but the highlights are:
XoRoShiRo128PlusPlus (previously a test-only RNG) moves to random.h and becomes InsecureRandomContext, which is even faster than FastRandomContext but non-cryptographic. It also gets all helper randomness functions (randrange, fillrand, …), making it a lot more succinct to use.
During tests, all randomness is made deterministic (except for GetStrongRandBytes) but non-repeating (like GetRand() used to be when g_mock_deterministic_tests was used), either fixed, or from a random seed (overridden by env var).
Several infrequently used top-level functions (GetRandMillis, GetRandMicros, GetExponentialRand) are converted into member functions of FastRandomContext (and InsecureRandomContext).
GetRand<T>() (without argument) can now return the maximum value of the type (previously e.g. GetRand<uint32_t>() would never return 0xffffffff).
DrahtBot
commented at 7:36 pm on March 11, 2024:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#30377 (refactor: Make uint256S(const char*) consteval by hodlinator)
#30277 ([DO NOT MERGE] Erlay: bandwidth-efficient transaction relay protocol (Full implementation) by sr-gi)
#30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
#29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
#29543 (refactor: Avoid unsigned integer overflow in script/interpreter.cpp by hebasto)
#29536 (fuzz: fuzz connman with non-empty addrman + ASMap by brunoerg)
#29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
#26114 (net: Make AddrFetch connections to fixed seeds by mzumsande)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
DrahtBot added the label
CI failed
on Mar 12, 2024
DrahtBot
commented at 4:41 am on March 12, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
dergoegge
commented at 10:09 am on March 12, 2024:
member
Concept ACK
There are a few places in the fuzz tests where this will allow to easily replace FastRandomContext with a InsecureRandomContext, which is beneficial for performance (e.g. the addrman harnesses partially fill the addrman with addresses from rng) and we don’t need cryptographic rng there anyway.
During tests, all randomness is made deterministic
Sjors
commented at 3:11 pm on March 12, 2024:
member
What’s the impact on the fuzz corpus of switching to a different (?) deterministic RNG?
dergoegge
commented at 3:53 pm on March 12, 2024:
member
What’s the impact on the fuzz corpus of switching to a different (?) deterministic RNG?
I would expect that switching to a different rng should have no meaningful effect on the corpus itself. The corpus for a particular harness might change but the coverage for the code we intend to test should remain the same. This is because using rng in a fuzz harness only makes sense in very rare cases. It should never be used in a way that can significantly affect the coverage reached, otherwise there is no point in using a coverage-guided fuzzer, we could just pipe /dev/random to our harnesses.
For example, if we need to populate some data that we don’t really expect to have an impact on the thing we are testing, we might use rng instead of consuming from the fuzz input (we do this in the p2p transport harnesses to fill message contents, which are essentially irrelevant to the transport logic).
Switching to deterministic rng can cause a corpus’ coverage to grow because coverage-guided feedback loops start working more reliably when the code under test is deterministic. This can vary from harness to harness, but we’ve seen coverage-guided fuzzers find bugs once we’ve improved on determinism.
sipa force-pushed
on Mar 12, 2024
sipa force-pushed
on Mar 12, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 13, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa force-pushed
on Mar 14, 2024
sipa
commented at 11:21 pm on March 14, 2024:
member
I’m baffled by what is happening here. In the fuzz test CI job, the data directory gets deleted in between two fuzz input runs??
maflcko
commented at 8:57 am on March 15, 2024:
member
This is expected, because all fuzz tests in this pull request branch use the same datadir (5a19fd2bc079272a9b138710c6e50c5ea512c1a0736333c9817c4df1d77c8518). Maybe the randomness is too deterministic and returns a constant?
Also, the issue should happen locally as well, not only on CI. Something like ./test/fuzz/test_runner.py -j 9 -l DEBUG ./folder_temp reproduces it for me.
Edit:
025d4e22d8b25c33f60b91193673336e4ad1c9406 is the first bad commit
sipa force-pushed
on Mar 15, 2024
sipa
commented at 12:55 pm on March 15, 2024:
member
@maflcko Thank you! I suspect that is it. I indeed made all fuzz test use the same deterministic seed. I’ve dropped that change now.
DrahtBot removed the label
CI failed
on Mar 15, 2024
sipa force-pushed
on Mar 15, 2024
sipa force-pushed
on Mar 15, 2024
sipa force-pushed
on Mar 15, 2024
DrahtBot added the label
CI failed
on Mar 15, 2024
DrahtBot
commented at 7:04 pm on March 15, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
What’s the impact on the fuzz corpus of switching to a different (?) deterministic RNG?
Let’s break it down into cases:
The main “random.h” RNG (GetRand() and friends, default-constructed FastRandomContext objects, …). This used to be truly random in fuzz tests, and will now be deterministic with a fixed seed. In theory this should have no impact, because the fuzz tests shouldn’t be relying on this randomness in the first place. But possibly there are some which do, indirectly through code that wasn’t properly mocked or otherwise avoiding it, in which case making things deterministic should be a strict improvement by not making the fuzzer waste time on chasing the effects of that randomness.
XoRoShiRo128PlusPlus. There are a few fuzz tests (bip324, crypto_chacha20, p2p_transport_serialization, poolresource) that use this very fast deterministic RNG to construct certain data. This PR changes the behavior of some of them by replacing ad-hoc code to use that randomness with general helper functions that become available for all RNGs. In theory this might invalidate part of the fuzz corpus for those tests, but in practice I expect it won’t, because the data drawn from those RNGs is data that shouldn’t matter for the test much (if it did, it’d be drawn from the fuzz input instead).
So overall, it might invalidate a few tests’ corpus (but probably not), and for others it should either have no effect or be a strict improvement.
EthanHeilman
commented at 10:10 pm on April 5, 2024:
contributor
#28558 made PeerManager own a FastRandomContext, so we could (should?) use m_rng here instead (otherwise PeerManager::Options::deterministic_rng still only applies to some of the randomness).
Since this PR kind of makes individual “make this component deterministic” options redudant, we could consider reverting #28558 (not necessarily in this PR)?
I was thinking that in the long run we could break the dependencies between components and the specific rng they use (maybe something like template<RandomNumberGenerator R> class PeerManager { ... }), which would allow more fine grained mocking than a global “make rng deterministic” in tests (e.g. we could have a “rng” type that consumes from a FuzzedDataProvider). I guess this can be done by using globals as well.
#28558 made PeerManager own a FastRandomContext, so we could (should?) use m_rng here instead (otherwise PeerManager::Options::deterministic_rng still only applies to some of the randomness).
I’ve changed the PR to reuse PeerManagerImpl::m_rng.
Since this PR kind of makes individual “make this component deterministic” options redudant, we could consider reverting #28558 (not necessarily in this PR)?
Maybe. I’ve opted to use it where possible for now as it’s a smaller change, and has some (possibly negligible) performance advantage (no need to lock the global RNG mutex to get randomness when you already hold g_msgproc_mutex), but I think that can be reconsidered.
Independently, we may be able to just drop PeerManager::Options::deterministic_rng, relying on global deterministic mode instead.
I was thinking that in the long run we could break the dependencies between components and the specific rng they use (maybe something like template<RandomNumberGenerator R> class PeerManager { ... }), which would allow more fine grained mocking than a global “make rng deterministic” in tests (e.g. we could have a “rng” type that consumes from a FuzzedDataProvider). I guess this can be done by using globals as well.
Maybe, though that means testing something very different from what we’re doing here: testing under conditions where the RNG returns actually decidedly non-random results (which is different from a deterministic FastRandomContext which is still cryptographically-strong, just deterministic. I don’t know for how many things this makes sense.
sipa force-pushed
on Apr 30, 2024
DrahtBot added the label
CI failed
on Apr 30, 2024
DrahtBot removed the label
CI failed
on May 2, 2024
dergoegge approved
dergoegge
commented at 12:38 pm on May 20, 2024:
member
140- * is completely deterministic and does not gather more entropy after that.
141+// Forward declaration of RandomMixin, used in RandomNumberGenerator concept.
142+template<typename T>
143+class RandomMixin;
144+
145+/** A concept for RandomMixin-based random number generators. */
I think I decided that the compiler should be smart enough to optimize the final “else” case with Bits == 0 to effectively a return 0;, but it’s more obvious to just make it explicit. Done.
in
src/random.h:279
in
89fa561ab9outdated
271+ while (span.size() >= 8) {
272+ uint64_t gen = Impl().rand64();
273+ WriteLE64(UCharCast(span.data()), gen);
274+ span = span.subspan(8);
275+ }
276+ while (span.size()) {
Yes, it is required. However, adding the annotation will only silence the compiler warning.
FastRandomContext is not thread safe, so the lock will actually have to be taken.
Putting an EXCLUSIVE_LOCKS_REQUIRED only into the c++ file and leaving it out from the header file will not work, because the compiler has no way to see the annotation outside of the module. For example, it can not be seen from init.cpp.
I’ll give that patched branch a shot next week and see if it turns up anything else in our codebase. Will chime in upstream with a +1 as well (assuming it works).
I used the WIP above to check for other issues like this in our codebase. It only turned up a few false-positives because of duplicate annotations in declarations + definitions. Thankfully there were no cases of them only in definitions. PR’d here: #30316.
I have addressed this by making StartScheduledTasks not use m_rng, but just grab a new FastRandomContext instead; performance doesn’t matter for this function.
theuni
commented at 7:13 pm on May 28, 2024:
member
Concept ACK and first-pass code review ACK excluding tests.
I don’t feel qualified to review the rand_expo_duration changes.
I manually verified the randbits impls with some local tests as the shifts weren’t intuitive to me at first glance.
in
src/random.h:403
in
89fa561ab9outdated
362@@ -351,19 +363,19 @@ class FastRandomContext : public RandomMixin<FastRandomContext>
363 return ReadLE64(UCharCast(buf.data()));
364 }
365366- /** Fill a byte Span with random bytes. */
367+ /** Fill a byte Span with random bytes. This overrides the RandomMixin version. */
This is kinda sneaky and surprising (to me at least) to see with CRTP.
It would be nice to specify as part of the concept which functions are not allowed to be overridden, but sadly
std::experimental::detected_t is.. well.. experimental.
Well being able to ad-hoc override individual functions, while still being able to have non-overridden functions be able to refer to it, is sort of the point of CRTP (virtual functions have the same behavior, but CRTP gives it without runtime overhead). If we didn’t want this ability, it’d be possible to just have a template<typename R> class RNGUtility { R m_int; ...}, where there are low-level ChaCha20RNG and XoRoShiRo128PlusPlusRNG classes, and high-level using FastRandomContext = RNGUtility<ChaCha20RNG>; and using InsecureRandomContext = RNGUtility<XoRoShiRo128PlusPlusRNG>.
Being able to selectively mark certain functions as non-overridable would be nice of course, but I don’t see a way to do that currently.
sipa force-pushed
on May 28, 2024
bitcoin deleted a comment
on May 28, 2024
DrahtBot added the label
CI failed
on May 28, 2024
maflcko
commented at 2:02 pm on May 29, 2024:
member
0 node0 stderr random.h:218:26: runtime error: shift exponent -23 is negative
1 [#0](/bitcoin-bitcoin/0/) 0x61cb8c410ee5 in RandomMixin<FastRandomContext>::randbits(int) src/./random.h:218:26
2 [#1](/bitcoin-bitcoin/1/) 0x61cb8c460b2c in long RandomMixin<FastRandomContext>::randrange<long>(long) src/./random.h:270:35
3 [#2](/bitcoin-bitcoin/2/) 0x61cb8c460b2c in std::chrono::duration<long, std::ratio<1l, 1000l>> RandomMixin<FastRandomContext>::randrange<std::chrono::duration<long, std::ratio<1l, 1000l>>>(std::common_type<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type) src/./random.h:352:27
4 [#3](/bitcoin-bitcoin/3/) 0x61cb8c460b2c in (anonymous namespace)::PeerManagerImpl::StartScheduledTasks(CScheduler&) src/net_processing.cpp:2080:38
5 [#4](/bitcoin-bitcoin/4/) 0x61cb8c340ea6 in AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1980:37
6 [#5](/bitcoin-bitcoin/5/) 0x61cb8c2fa474 in AppInit(node::NodeContext&) src/bitcoind.cpp:227:43
7 [#6](/bitcoin-bitcoin/6/) 0x61cb8c2fa474 in main src/bitcoind.cpp:273:10
8 [#7](/bitcoin-bitcoin/7/) 0x77f2e3bd21c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 4d9090d61bf70e6b3225d583f0f08193f54670b2)
9 [#8](/bitcoin-bitcoin/8/) 0x77f2e3bd228a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 4d9090d61bf70e6b3225d583f0f08193f54670b2)
10 [#9](/bitcoin-bitcoin/9/) 0x61cb8c21cfe4 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/bitcoind+0xa1dfe4) (BuildId: 478a61c7d2f8deae95e82eb455a5b93dcb36ffa7)
11SUMMARY: UndefinedBehaviorSanitizer: invalid-shift-exponent random.h:218:26
214@@ -215,6 +215,7 @@ class RandomMixin
215 // number becomes the new bitbuf.
216 uint64_t gen = Impl().rand64();
217 ret = (gen << bitbuf_size) | bitbuf;
218+ assert(bits >= bitbuf_size);
I tried to track down if/how it could be a race condition, but given that it’s in init and all locks seems to be in place, I don’t see how that could possibly be the case.
Clearing the cache helped, for some reason. (Adding the assert should have already cleared the ccache entry for this translation unit, but now it seems to work :shrug: )
Looks like this commit is doing several things at once. It may be easier to review if move-only stuff (like moving rand256 and randbytes function bodies to the header file, and removing the explicit template instantiation) was done in a separate prepare commit?
I’ve added a few preparatory commits before this one, please have a look.
in
src/random.h:152
in
0b92cd2ae4outdated
149+ { rng.rand64() } noexcept -> std::same_as<uint64_t>;
150+ // A random number generator must provide randfill(Span<std::byte>).
151+ { rng.fillrand(s) } noexcept;
152+ // A random number generator must derive from RandomMixin, which adds other rand* functions.
153+ requires std::derived_from<std::remove_reference_t<T>, RandomMixin<std::remove_reference_t<T>>>;
154+};
nit in 0b92cd2ae4a317be4fe0dfd208017a7d05d1eb2a: Not sure I understand this concept. Is there anything that is not already enforced normally in C++ that would be enforced by this concept?
For example, static_cast in the Impl() method, which enforces this concept should already check for inheritance:
0<source>:17:40: error: static_cast from 'Mixin<FRC> *' to 'FRC *', which are not related by inheritance, is not allowed
Conversely, if the class wasn’t derived from RandomMixin, the RandomMixin::Impl() method wouldn’t be called and this check would be skipped anyway.
Even when the code is rewritten to C++23 using __cpp_explicit_this_parameter, the same should hold.
Similarly, calling rand64, when it is not provided should also result in a compile failure without this concept.
I guess it enforces the function signature of rand64 is noexcept?
Well in my view, concepts are really a kind of compiler-enforced documentation. Their primary purpose is giving a readable compiler error when template instantiation ought to be failing, but often also would be failing anyway. In this case, someone implementing a new RNG class FancyRNG : public RandomMixin<FancyRNG> but say missing rand64, would instead of getting an error in the instantiation of RandomMixin get an error message that references RandomNumberGenerator, which then explicitly lists what one is required to do to make it work.
383+
384+ constexpr explicit XoRoShiRo128PlusPlus(uint64_t seedval) noexcept
385+ : m_s0(SplitMix64(seedval)), m_s1(SplitMix64(seedval)) {}
386+
387+ // no copy - that is dangerous, we don't want accidentally copy the RNG and then have two streams
388+ // with exactly the same results. If you need a copy, call copy().
It was added in #26153, with a commit originally taken from #25325. I assume it originally had an explicit copy() member function, which was removed before merge. I’ve added a commit to remove it.
sipa force-pushed
on Jun 5, 2024
sipa force-pushed
on Jun 5, 2024
maflcko
commented at 1:38 pm on June 5, 2024:
member
I’ve rebased on top of the now-merged #30161, and believe I’ve addressed outstanding comments.
in
src/net.cpp:3479
in
462cca90c4outdated
3474@@ -3475,7 +3475,8 @@ std::vector<CAddress> CConnman::GetAddresses(CNode& requestor, size_t max_addres
3475 // nodes to be "terrible" (see IsTerrible()) if the timestamps are older than 30 days,
3476 // max. 24 hours of "penalty" due to cache shouldn't make any meaningful difference
3477 // in terms of the freshness of the response.
3478- cache_entry.m_cache_entry_expiration = current_time + std::chrono::hours(21) + GetRandMillis(std::chrono::hours(6));
3479+ cache_entry.m_cache_entry_expiration = current_time +
3480+ 21h + FastRandomContext().randrange<std::chrono::microseconds>(6h);
EthanHeilman
commented at 11:14 pm on June 8, 2024:
Is the change from milliseconds to microseconds intentional?
Yes, because m_cache_entry_expiration is expressed in microseconds anyway; no need to reduce accuracy. I’ve moved the behavior change to a separate commit.
in
src/random.h:360
in
462cca90c4outdated
473+ * certain type of message) to minimize leaking information to observers.
474+ *
475+ * The probability of an event occurring before time x is 1 - e^-(x/a) where a
476+ * is the average interval between events.
477+ * */
478+ std::chrono::microseconds rand_expo_duration(std::chrono::microseconds mean) noexcept
EthanHeilman
commented at 11:22 pm on June 8, 2024:
Nit: Why not rand_exp_duration instead? It is one character shorter and exp reads more clearly to me as exponential than expo?
This is done in commit e5658e7a40f03b0b07b1f83f22a9d834bc8fd5c6 which only improves the code organization. Specifically here the reason for moving is to have all the non-exposed functions in random.cpp within one anonymous namespace at the start of the file, and all public one below.
EthanHeilman
commented at 2:24 pm on June 9, 2024:
That makes sense. This improved organization is worth it.
Although GetOSRand is declared in random.h, it isn’t actually used by anything outside of random.cpp, so it could remain inside the anonymous namespace.
Moved GetOSRand to the anonymous namespace (meaning it can stay in its relative location in random.cpp, reducing the diff size), and dropped it from random.h.
in
src/random.cpp:421
in
462cca90c4outdated
416 /** Extract up to 32 bytes of entropy from the RNG state, mixing in new entropy from hasher.
417 *
418 * If this function has never been called with strong_seed = true, false is returned.
419+ *
420+ * If allow_deterministic is true, and MakeDeterministic has been called before, output
421+ * from the deterministic PRNG is output instead.
EthanHeilman
commented at 11:49 pm on June 8, 2024:
This confused me because MixExtract is a deterministic function based on hasher, m_counter, and m_state. Unless I am mistaken the non-determinism comes from the calling function determining on the value of ret. Why not address strengthen ret value in the calling function rather than here?
EthanHeilman
commented at 11:51 pm on June 8, 2024:
0 * If allow_deterministic is true, and MakeDeterministic has been called before, output
1 * from the deterministic PRNG.
I see how this is confusing; MixExtract is indeed a deterministic function, and arguably, all of RNGState is deterministic when considered in isolation.
The “determinism” referred here is the introduction of a new deterministic mode for the entire RNG (module-wide, including seeding/initializations/hardware, …). In order to do that, a separate RNG is embedded within RNGState which is only ever explicitly initialized, and if it is, MixExtract can be asked to draw its output from there instead of from its normal SHA512-based mechanism. The entropy fed to MixExtract is still processed normally in this case because for some purposes (cryptographic keys) I think we always want normal high-quality randomness for belt-and-suspenders reasons. Finally, all of this is done inside MixExtract instead of elsewhere so that the same lock can be used for everything (RNGState::m_mutex), meaning the performance penalty of introducing this mechanism is limited to the if (allow_deterministic && m_deterministic_prng.has_value()) conditional, which involves no atomics or locks.
So really I think we want to call the RNG-wide name for this feature “deterministic mode” (e.g. at ProcRand), but maybe using the same name at the level of RNGState is confusing? Would it help to call it call it something else?
EthanHeilman
commented at 2:38 pm on June 9, 2024:
If I understand, you don’t want to replace MixExtract entirely when the deterministic test RNG is used, because when MixExtract to still run and add/mix entropy in case someone decides to generate a keying material while the deterministic test RNG is on. This was what I was missing.
I like the name deterministic mode because it makes it more clear this is a RNG wide change.
Thought I had after waking up this morning was to change this to bool fail_on_deterministic_mode. Thus if someone had the deterministic test RNG activated and then tried to generate a key it would throw an error because fail_on_deterministic_mode=true. That said, I’m not sure this works with the current approach since if someone calls SeedStrengthen with the deterministic test RNG activated it would fail and it is reasonable to call SeedStrengthen in tests/fuzzing. Edit: You make a similar point in the next comment. I don’t think my proposed change here is worth making in this PR.
How about having an always_use_real_rng bool instead, with meaning !allow_deterministic. That means the “deterministic” terminology would be more clearly referring to the mode (is the deterministic PRNG enabled), but not to the produced numbers.
518+void SeedStrengthen(CSHA512& hasher, RNGState& rng, SteadyClock::duration dur) noexcept
519 {
520 // Generate 32 bytes of entropy from the RNG, and a copy of the entropy already in hasher.
521 unsigned char strengthen_seed[32];
522- rng.MixExtract(strengthen_seed, sizeof(strengthen_seed), CSHA512(hasher), false);
523+ rng.MixExtract(strengthen_seed, sizeof(strengthen_seed), CSHA512(hasher), false, /*allow_deterministic=*/true);
EthanHeilman
commented at 11:59 pm on June 8, 2024:
If I am understanding this logic correctly, if m_deterministic_prng.has_value() is true then MixExtract in SeedStrengthen will never use m_counter or m_state to do MixExtract?
Strengthen is going to make the output of this call non-deterministic anyways, so what value does allow_deterministic=true as an argument provide in MixExtract? Shouldn’t this be false?
Is the intention to make Strengthen deterministic at some later point?
437@@ -479,6 +438,13 @@ class RNGState {
438 hasher.Finalize(buf);
439 // Store the last 32 bytes of the hash output as new RNG state.
440 memcpy(m_state, buf + 32, 32);
441+ // Handle requests for deterministic randomness.
442+ if (allow_deterministic && m_deterministic_prng.has_value()) {
EthanHeilman
commented at 0:02 am on June 9, 2024:
Does allow_deterministic=false and m_deterministic_prng.has_value()=true imply that something has gone wrong? Should we log an error or take some action or are there cases where this is expected behavior?
Hmm. The intent was that this is supported behavior; GetStrongRandBytes() never sets allow_deterministic, and thus in deterministic mode would still return from the normal fully-seeded RNG (because I don’t want to introduce anything in the code path of GetStrongRandBytes that could return anything but high quality randomness). This means that if GetStrongRandBytes is called anywhere in test code that enables deterministic randomness, the condition you state occurs.
It’s probably possible to instead make the behavior to abort if allow_deterministic=false and m_deterministic_prng.has_value()=true, but it would involve going over all cases where tests use GetStrongRandBytes (directly, or indirectly). I don’t think I want to do that in this PR.
sipa force-pushed
on Jun 9, 2024
DrahtBot added the label
CI failed
on Jun 9, 2024
EthanHeilman approved
EthanHeilman
commented at 4:02 pm on June 9, 2024:
contributor
ACK125509f395a214465ebf379164e41e4c6d19d443
I have not tested the code. All my comments have been addressed.
DrahtBot requested review from dergoegge
on Jun 9, 2024
DrahtBot requested review from theuni
on Jun 9, 2024
DrahtBot removed the label
CI failed
on Jun 9, 2024
DrahtBot added the label
CI failed
on Jun 9, 2024
DrahtBot removed the label
CI failed
on Jun 9, 2024
DrahtBot added the label
CI failed
on Jun 9, 2024
DrahtBot removed the label
CI failed
on Jun 9, 2024
maflcko
commented at 11:30 am on June 10, 2024:
member
The false positive CI error still happens. I am taking another look.
maflcko
commented at 8:34 pm on June 10, 2024:
member
I could reproduce once outside the CI env on a 24.04 Ubuntu vm:
DrahtBot added the label
Needs rebase
on Jun 11, 2024
sipa force-pushed
on Jun 11, 2024
sipa
commented at 11:19 pm on June 11, 2024:
member
Rebased.
DrahtBot removed the label
Needs rebase
on Jun 12, 2024
DrahtBot
commented at 7:52 am on June 12, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 12, 2024
DrahtBot removed the label
CI failed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 12, 2024
DrahtBot removed the label
CI failed
on Jun 12, 2024
DrahtBot added the label
CI failed
on Jun 13, 2024
DrahtBot removed the label
CI failed
on Jun 13, 2024
maflcko
commented at 7:41 am on June 13, 2024:
member
I am still looking into the clang error. I couldn’t find an issue with your code, but minimizing/reducing the clang bug will take some time, because it appears non-deterministic.
maflcko
commented at 9:33 am on June 15, 2024:
member
DrahtBot removed the label
Needs rebase
on Jun 20, 2024
in
src/random.h:488
in
cf04e6c90eoutdated
429@@ -445,6 +430,33 @@ void Shuffle(I first, I last, R&& rng)
430 }
431 }
432433+/** Generate a uniform random integer of type T in the range [0..nMax)
434+ * Precondition: nMax > 0, T is an integral type, no larger than uint64_t
435+ */
436+template<typename T>
437+T GetRand(T nMax) noexcept {
nit in e4eb03715d5944e999a399538919089862d8c7c6: I don’t understand this branch. To me it looks identical to the fallback branch, just with Bits==1 manually hardcoded. But it is known at compile time, so the compiler can do it for you, and you can just remove this branch, no? (Same for Bits==0)
I tested it locally and didn’t see a speedup when including this branch. (Maybe a trivial speedup when removing it, but this is likely noise)
If there is a benefit that I am missing, it would be good to mention it in the commit message or in a benchmark, or otherwise.
In fact, I don’t understand this whole function. I’d presume the performance bottleneck is always Impl().rand64() and whether a few bit operations or shifts use an operator known at compile time or not shouldn’t matter, no?
Locally I could only find a difference when using InsecureRandomContext, but that doesn’t use the randbits<>() method outside of test, or at all? So I wonder if the function is useful at all.
Yeah, I don’t expect this to matter for FastRandomContext, where the production of randomness is expensive enough that it doesn’t matter. But for InsecureRandomContext, I want to make sure the compiler creates a suitably specialized instance of randbits for use in randbool for example. While not done in this PR, the original reason for starting to work on this PR is because I want a fast InsecureRandomContext for use in #30286.
That said, you’re right that all the different manual specializations in this function are overkill, and should be obviously creatable by the compiler. I’ve dropped them all except BITS == 64 (which needs special handling anyway).
in
src/random.h:151
in
7f24615249outdated
148+ // A random number generator must provide rand64().
149+ { rng.rand64() } noexcept -> std::same_as<uint64_t>;
150+ // A random number generator must provide randfill(Span<std::byte>).
151+ { rng.fillrand(s) } noexcept;
152+ // A random number generator must derive from RandomMixin, which adds other rand* functions.
153+ requires std::derived_from<std::remove_reference_t<T>, RandomMixin<std::remove_reference_t<T>>>;
25@@ -26,7 +26,7 @@
26 * The following (classes of) functions interact with that state by mixing in new
27 * entropy, and optionally extracting random output from it:
28 *
29- * - The GetRand*() class of functions, as well as construction of FastRandomContext objects,
30+ * - GetRandBytes, GetRandHash, as well as construction of FastRandomContext objects,
0diff --git a/src/random.h b/src/random.h
1index ab4e049817..fd05e7e575 100644
2--- a/src/random.h
3+++ b/src/random.h
4@@ -28,8 +28,8 @@
5 * The following (classes of) functions interact with that state by mixing in new
6 * entropy, and optionally extracting random output from it:
7 *
8- * - The GetRand*() class of functions, as well as construction of FastRandomContext objects,
9- * perform 'fast' seeding, consisting of mixing in:
10+ * - GetRandBytes, GetRandHash, GetRand, GetRandDur, as well as construction of FastRandomContext
11+ * objects, perform 'fast' seeding, consisting of mixing in:
12 * - A stack pointer (indirectly committing to calling thread and call stack)
sipa
commented at 5:32 pm on June 27, 2024:
member
I’ve addressed the outstanding comments, I believe, including:
Rename allow_deterministic to always_use_real_rng (with inverted meaning).
Dropped GetRand() (but not GetRand(nRange)), inlining it instead.
Removed GetOSRand from the public random.h interface, and moved it into the anonymous namespace.
Fixed/improved a few comments
Dropped most of the specializations in randbits<BITS>.
in
src/random.cpp:326
in
326c76909foutdated
324@@ -323,29 +325,6 @@ static void Strengthen(const unsigned char (&seed)[32], SteadyClock::duration du
325 memory_cleanse(buffer, sizeof(buffer));
326 }
327328-#ifndef WIN32
329-/** Fallback: get 32 bytes of system entropy from /dev/urandom. The most
62@@ -63,6 +63,10 @@
63 * When mixing in new entropy, H = SHA512(entropy || old_rng_state) is computed, and
64 * (up to) the first 32 bytes of H are produced as output, while the last 32 bytes
65 * become the new RNG state.
66+ *
67+ * During tests, the RNG is put into a special deterministic mode, in which the output of
68+ * all RNG functions, with the exception of GetStrongRandBytes(), is replaced with the
69+ * output of a deterministic RNG that is initialized once and then never reseeded.
0diff --git a/src/random.h b/src/random.h
1index e67e295b5f..f447216776 100644
2--- a/src/random.h
3+++ b/src/random.h
4@@ -64,9 +64,11 @@
5 * (up to) the first 32 bytes of H are produced as output, while the last 32 bytes
6 * become the new RNG state.
7 *
8- * During tests, the RNG is put into a special deterministic mode, in which the output of
9- * all RNG functions, with the exception of GetStrongRandBytes(), is replaced with the
10- * output of a deterministic RNG that is initialized once and then never reseeded.
11+ * During tests, the RNG can be put into a special deterministic mode, in which the output
12+ * of all RNG functions, with the exception of GetStrongRandBytes(), is replaced with the
13+ * output of a deterministic RNG. This deterministic RNG does not gather entropy, and is
14+ * unaffected by RandAddPeriodic() or RandAddEvent(). It produces pseudorandom data that
15+ * only depends on the seed it was initialized with, possibly until it is reinitialized.
16 */
maflcko
commented at 5:57 pm on June 27, 2024:
member
467@@ -261,7 +468,7 @@ class FastRandomContext
468 * debug mode detects and panics on. This is a known issue, see
469 * https://stackoverflow.com/questions/22915325/avoiding-self-assignment-in-stdshuffle
About whether the stdlib is faster or uses more entropy, I haven’t benchmarked it, but wanted to note that it generates two ints with one 64-bit rng call and uses https://github.com/lemire/FastShuffleExperiments
Going to leave this as a follow-up (I think making this work with CRTP may mean splitting it up into two concepts).
in
src/random.h:488
in
15cc11b4e5outdated
480@@ -274,29 +481,26 @@ void Shuffle(I first, I last, R&& rng)
481 }
482 }
483484+/** Generate a uniform random integer of type T in the range [0..nMax)
485+ * Precondition: nMax > 0, T is an integral type, no larger than uint64_t
486+ */
487+template<typename T>
488+T GetRand(T nMax) noexcept {
nit: This is used in 5 places, in 3 of them “incorrectly”. (BIP 330 doesn’t disallow a uint64_t max as salt, same for CSipHasher in PriorityComputer). So you could replace them with a call to rand. The remaining two could just be inlined with a call to randrange.
507 /* Number of random bytes returned by GetOSRand.
508 * When changing this constant make sure to change all call sites, and make
509 * sure that the underlying OS APIs for all platforms support the number.
510 * (many cap out at 256 bytes).
511 */
512 static const int NUM_OS_RANDOM_BYTES = 32;
sipa
commented at 4:02 pm on June 28, 2024:
member
If there are further nits, I will address them in a follow-up.
Ready for more (re)review.
DrahtBot added the label
Needs rebase
on Jul 1, 2024
random: write rand256() in function of fillrand()493a2e024e
random: move rand256() and randbytes() to .h fileb3b382dde2
random: add a few noexcepts to FastRandomContext27cefc7fd6
random: use BasicByte concept in randbytes40dd86fc3b
random: refactor: move rand* utilities to RandomMixin
Rather than make all the useful types of randomness be exclusive to
FastRandomContext, move it to a separate RandomMixin class where it can be reused by
other RNGs.
A Curiously Recurring Template Pattern (CRTP) is used for this, to provide the ability
for individual RNG classes to override one or more randomness functions, without
needing the runtime-cost of virtual classes.
Specifically, RNGs are expected to only provide fillrand and rand64, while all others
are derived from those:
- randbits
- randrange
- randbytes
- rand32
- rand256
- randbool
- rand_uniform_delay
- rand_uniform_duration
- min(), max(), operator()(), to comply with C++ URBG concept.
9b14d3d2da
random: Improve RandomMixin::randbits
The previous randbits code would, when requesting more randomness than available
in its random bits buffer, discard the remaining entropy and generate new.
Benchmarks show that it's usually better to first consume the existing randomness
and only then generate new ones. This adds some complexity to randbits, but it
doesn't weigh up against the reduced need to generate more randomness.
21ce9d8658
random: add RandomMixin::randbits with compile-known bits
In many cases, it is known at compile time how many bits are requested from
randbits. Provide a variant of randbits that accepts this number as a template,
to make sure the compiler can make use of this knowledge. This is used immediately
in rand32() and randbool(), and a few further call sites.
ddb7d26cfd
random: modernize XoRoShiRo128PlusPlus a bit
Make use of C++20 functions in XoRoShiRo128PlusPlus.
8924f5120f
xoroshiro128plusplus: drop comment about nonexisting copy()8f5ac0d0b6
random: move XoRoShiRo128PlusPlus into random module
This is preparation for making it more generally accessible.
8cc2f45065
random: convert XoRoShiRo128PlusPlus into full RNG
Convert XoRoShiRo128PlusPlus into a full RandomMixin-based RNG class,
providing all utility functionality that FastRandomContext has. In doing so,
it is renamed to InsecureRandomContext, highlighting its non-cryptographic
nature.
To do this, a fillrand fallback is added to RandomMixin (where it is used by
InsecureRandomContext), but FastRandomContext still uses its own fillrand.
6cfdc5b104
tests: overhaul deterministic test randomness
The existing code provides two randomness mechanisms for test purposes:
- g_insecure_rand_ctx (with its wrappers InsecureRand*), which during tests is
initialized using either zeros (SeedRand::ZEROS), or using environment-provided
randomness (SeedRand::SEED).
- g_mock_deterministic_tests, which controls some (but not all) of the normal
randomness output if set, but then makes it extremely predictable (identical
output repeatedly).
Replace this with a single mechanism, which retains the SeedRand modes to control
all randomness. There is a new internal deterministic PRNG inside the random
module, which is used in GetRandBytes() when in test mode, and which is also used
to initialize g_insecure_rand_ctx. This means that during tests, all random numbers
are made deterministic. There is one exception, GetStrongRandBytes(), which even
in test mode still uses the normal PRNG state.
This probably opens the door to removing a lot of the ad-hoc "deterministic" mode
functions littered through the codebase (by simply running relevant tests in
SeedRand::ZEROS mode), but this isn't done yet.
810cdf6b4e
random: make GetRand() support entire range (incl. max)
The existing code uses GetRand(nMax), with a default value for nMax, where nMax is the
range of values (not the maximum!) that the output is allowed to take. This will always
miss the last possible value (e.g. GetRand<uint32_t>() will never return 0xffffffff).
Fix this, by moving the functionality largely in RandomMixin, and also adding a
separate RandomMixin::rand function, which returns a value in the entire (non-negative)
range of an integer.
DrahtBot removed the label
Needs rebase
on Jul 1, 2024
random: get rid of GetRand by inliningddc184d999
net: use GetRandMicros for cache expiration
This matches the data type of m_cache_entry_expiration.
82de1b80d9
random: convert GetRand{Micros,Millis} into randrange
There are only a few call sites of these throughout the codebase, so
move the functionality into FastRandomContext, and rewrite all call sites.
This requires the callers to explicit construct FastRandomContext objects,
which do add to the verbosity, but also make potentially apparent locations
where the code can be improved by reusing a FastRandomContext object (see
further commit).
4eaa239dc3
random: convert GetExponentialRand into rand_exp_durationcfb0dfe2cf
random: improve precision of MakeExponentiallyDistributedd5fcbe966b
net, net_processing: use existing RNG objects more
PeerManagerImpl, as well as several net functions, already have existing
FastRandomContext objects. Reuse them instead of constructing new ones.
8e31cf9c9b
random: cleanup order, comments, static2c91330dd6
tests: make fuzz tests (mostly) deterministic with fixed seed97e16f5704
random: use LogError for init failure2ae392d561
random: replace construct/assign with explicit Reseed()ce8094246e
sipa force-pushed
on Jul 1, 2024
achow101
commented at 7:14 pm on July 1, 2024:
member
ACKce8094246ee95232e9d84f7e37f3c0a43ef587ce
DrahtBot requested review from maflcko
on Jul 1, 2024
in
src/random.h:209
in
21ce9d8658outdated
211+ Assume(bits <= 64);
212+ // Requests for the full 64 bits are passed through.
213+ if (bits == 64) return Impl().rand64();
214+ uint64_t ret;
215+ if (bits <= bitbuf_size) {
216+ // If there is enough entropy left in bitbuf, return its bottom bits bits.
maflcko removed review request from EthanHeilman
on Jul 3, 2024
maflcko removed review request from theuni
on Jul 3, 2024
maflcko removed review request from dergoegge
on Jul 3, 2024
maflcko requested review from dergoegge
on Jul 3, 2024
maflcko requested review from EthanHeilman
on Jul 3, 2024
maflcko requested review from theuni
on Jul 3, 2024
in
src/random.cpp:787
in
d5fcbe966boutdated
783+ // - Given an x in uniformly distributed in [0, 1), we find an exponentially distributed value
784+ // by applying the quantile function to it. For the exponential distribution with mean 1 this
785+ // is F(x) = -log(1 - x).
786+ //
787+ // Combining the two, and using log1p(x) = log(1 + x), we obtain the following:
788+ return -std::log1p((uniform >> 11) * -0x1.0p-53);
In d5fcbe966bc501db8bf6a3809633f0b82e6ae547: Always throwing away 11 bits of entropy in the new version compared to 0 before the PR. I guess you want to preserve the expression from the linked site and it’s unclear whether not throwing away the bits would be more performant.
Yeah, I wanted to keep MakeExponential as stand-alone reviewable as possible. I don’t think any of the call sites are so performance-critical that the difference matters.
in
src/test/util/net.cpp:129
in
e2d1f84858outdated
In e2d1f84858485650ff743753ffa5c679f210a992: Now we are in C++20 land, it might be time to use real designated initializers instead of comments when touching blocks like these?
This is already a list initialization, so I don’t think clang-tidy can pick up the named args at all. Happy to review a follow-up, if you decide to open one.
In ddc184d999d7e1a87efaf6bcb222186f0dcd87ec: The commit message in e2d1f84858485650ff743753ffa5c679f210a992 made me do a survey of (mis)uses of GetRand() . 2 similar cases stood out to me at first, but they appear correct after some noodling. This is one of them and the other is check_block_index in validation.cpp.
(check_ratio is often set to 1 (always) or 0 (never)).
Sharing the same return path actually makes the behavior slightly quicker for me to grok:
hodlinator
commented at 10:04 pm on July 3, 2024:
contributor
ACKce8094246ee95232e9d84f7e37f3c0a43ef587ce
Mainly code review. Did not attempt to check thread safety compliance. (Passed make check & test/functional/test_runner.py).
e2d1f84858485650ff743753ffa5c679f210a992 - random: make GetRand() support entire range (incl. max)
The commit message title probably should be “random: make GetRand<T>() support entire range (incl. max)”, since the overloads taking parameters still are exclusive at the end. It felt dangerous that GetRand<T>(void) behavior differed in this way from the overloads - happy others suggested the calls be inlined as randrange() and rand<T>() are much clearer!
dergoegge approved
dergoegge
commented at 9:57 am on July 4, 2024:
member
utACKce8094246ee95232e9d84f7e37f3c0a43ef587ce
fanquake merged this
on Jul 4, 2024
fanquake closed this
on Jul 4, 2024
hebasto added the label
Needs CMake port
on Jul 4, 2024
achow101 referenced this in commit
10677713ca
on Jul 9, 2024
fanquake referenced this in commit
e51653985c
on Jul 11, 2024
hebasto
commented at 6:51 am on July 14, 2024:
member
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-06-23 21:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me