test: Fix RANDOM_CTX_SEED use with parallel tests

hodlinator commented at 1:43 pm on August 28, 2024: contributor

To fix the issue with test directories clashing when running in parallel with the env var RANDOM_CTX_SEED set, we now mix in the test name.

Issue was introduced by 97e16f57042cab07e5e73f6bed19feec2006e4f7, and comment for g_insecure_rand_ctx_temp_path became invalid with that commit. Make the determinism clear through removing the no longer useful g_insecure_rand_ctx_temp_path.

Fixes #30696.

DrahtBot commented at 1:44 pm on August 28, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process. A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30748 (test: Pin and document TEST_DIR_PATH_ELEMENT, SeedRand::FIXED_SEED by maflcko)
#30618 (test: support std::optional in BOOST_CHECK_* and increase FromUserHex fuzz feature coverage by l0rinc)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the label Tests on Aug 28, 2024

hodlinator commented at 1:49 pm on August 28, 2024: contributor

Can be verified through running

0RANDOM_CTX_SEED=21 make -j10 check

in src/test/util/setup_common.cpp:157 in eeeb56b74c outdated

154@@ -156,9 +155,10 @@ BasicTestingSetup::BasicTestingSetup(const ChainType chainType, TestOpts opts)
155     // data directories use a random name that doesn't overlap with other tests.
156     SeedRandomForTest(SeedRand::SEED);
157 
158+    const std::string test_path{G_TEST_GET_FULL_NAME ? G_TEST_GET_FULL_NAME() : ""};

maflcko commented at 2:59 pm on August 28, 2024:

This isn’t set in the fuzz tests, IIRC. So, as explained in #30696 (comment) it would fail if there were two fuzz tests that set up a (Basic)TestingSetup each. No?

maflcko commented at 3:01 pm on August 28, 2024:

Also, obviously, the issue would remain if the same test is run in parallel

hodlinator commented at 9:09 am on August 29, 2024:

Thank you for your detailed linked comment @maflcko. Apologies for not going back to re-read it before publishing this PR.

Before publishing, I did recall that you were worried about fuzz tests and ran:

0RANDOM_CTX_SEED=21 ./test/fuzz/test_runner.py -j40 .

..with this added temporarily:

0printf("%s\n", rand_str.c_str());

The hashes outputted were unique. I admit it surprised me and I also printed test_path and saw they were empty. Should have dug deeper at that point to figure out the cause of randomness before publishing the PR.

After your latest comment I dug deeper, discovering:

test/fuzz/test_runner.py does not automatically forward on environment variables, I guess it wants to keep the environment under tighter control. Making it explicitly forward RANDOM_CTX_SEED makes the deterministic randomness of the fuzz tests start from the same seed.

Outputting getpid() in the BasicTestingSetup-ctor now shows unique PIDs per fuzz test. I hesitate to add the PID to the seed since the intent of 97e16f57042cab07e5e73f6bed19feec2006e4f7 seems to be to decrease determinism). In the case of benchmarks, the PID would probably not help either, as the intent seems to be to run them single-threaded(?) - which would at the same time mean directories being created/destroyed should not be an issue for benchmarks.

I’ve opted in the latest push to implement forwarding of RANDOM_CTX_SEED and implemented G_TEST_GET_FULL_NAME for bench+fuzz. Please let me know what you think about that approach.

(Noticed in the debugger that the BasicTestingSetup-ctor, after creating the test directories, creates the kernel::Context which calls RandomInit() which calls ProcRand(..., /*always_use_real_rng=*/true) …which should be harmless as long as RNGState::m_deterministic_prng has been set).

maflcko commented at 9:16 am on August 29, 2024:

To offer another alternative fix, beside the ones proposed in #30696 (comment):

Another possible fix would be to ensure the g_rng_temp_path is seeded before any call to SeedRandomForTest.

I think it would help why you didn’t think that any of the 4 alternative fixes are worth a try and instead went for this, which is an incomplete fix, as explained in my previous comment.

hodlinator commented at 10:09 am on August 29, 2024:

I think it would help why you didn’t think that any of the 4 alternative fixes are worth a try and instead went for this, which is an incomplete fix, as explained in my previous comment.

In general I think determinism should be increased, not decreased. Increased determinism appears to be the original intent behind moving SeedRandomForTest earlier in 97e16f57042cab07e5e73f6bed19feec2006e4f7, please let me know if you have a different interpretation. All your suggestions lead to decreased determinism, to avoid repetition, I only left additional comments:

“Use GetStrongRandBytes, however, I am not sure how slow that’d be in the worst case.” - You pointed out that it may be a bit slow.
“Move the SeedRandomForTest after the fuzz target’s init function, but before ResetCoverageCounters. However that’d still break if the fuzz target spun up a TestingSetup.” - As init often creates BasicTestingSetup, that approach would also need to have the setup_common.cpp part of 97e16f57042cab07e5e73f6bed19feec2006e4f7 reverted. As you said any later created BasicTestingSetup after init would become deterministic.
“Use the hash of the current folder generator, as well as the hash of the pid. Which may obviously break once the kernel wraps the pids, but I am not sure how likely that is.” - Addressed in my previous comment.
Ensure the g_rng_temp_path is seeded before any call to SeedRandomForTest. - Just the general determinism concern above.

Also, obviously, the issue would remain if the same test is run in parallel

The same test being run in parallel with itself could use -testdatadir. Is that a common case?

Edit: Could use different RANDOM_CTX_SEED if wanting to run the same test in parallel with itself as well.

maflcko commented at 10:27 am on August 29, 2024:

Ah, I see. There may be a misunderstanding here. According to the comment you removed, the datadir should actually be “unique”, not “deterministic”. (“Random context to get unique temp data dirs.”)

Once you make it deterministic, there will be a chance of collision, which may lead to a crash, which this pull request is trying to fix, no?

The same test being run in parallel with itself could use -testdatadir. Is that a common case?

Edit: Could use different RANDOM_CTX_SEED if wanting to run the same test in parallel with itself as well.

I don’t think anything here is a common case. So in theory an alternative “fix” would be to simply say in a comment that RANDOM_CTX_SEED can only be used when running a single test process at a time.

hodlinator commented at 11:51 am on August 29, 2024:

Ah, I see. There may be a misunderstanding here. According to the comment you removed, the datadir should actually be “unique”, not “deterministic”. (“Random context to get unique temp data dirs.”)

Aha, time to bring out the git log -S "Random context to get unique temp data dirs" command. ;)

I see it was your fae43a97ca947cd0802392e9bb86d9d0572c0fba from 2019, so it’s hard to argue that the author may have meant “unique” as in simply different from other tests. :) See you introduced the later seeding of the RNG in that commit too.

On the other hand, you (together with me) ACKed #29625 which contains 97e16f57042cab07e5e73f6bed19feec2006e4f7, moving the seeding prior to directory naming randomization. Do you now disagree with it?

Once you make it deterministic, there will be a chance of collision, which may lead to a crash, which this pull request is trying to fix, no?

The PR does fix collisions between different tests that would otherwise happen when RANDOM_CTX_SEED was set to the same value, which was resulting in the same folder name being used. (RANDOM_CTX_SEED=21 make -j10 check).

I don’t think anything here is a common case. So in theory an alternative “fix” would be to simply say in a comment that RANDOM_CTX_SEED can only be used when running a single test process at a time.

Combining tests with setting RANDOM_CTX_SEED is indeed an edge case. This PR arguably implements missing test names for bench+fuzz, missing support for forwarding RANDOM_CTX_SEED to fuzz tests, simplifies the code through removing g_rng_temp_path which had a no longer valid comment after 97e16f57042cab07e5e73f6bed19feec2006e4f7, the main fix/complexity lies in adding entropy to the random folder path using the test name.

The extremely paranoid would say that the folder name generation could influence test success/failure, so having it deterministic is helpful. Unit tests are poorly written if that is ever the case, but for fuzzing it’s still an open question to me.

maflcko commented at 12:06 pm on August 29, 2024:

On the other hand, you (together with me) ACKed #29625 which contains 97e16f5, moving the seeding prior to directory naming randomization. Do you now disagree with it?

I think it was simply an oversight. At least for me.

Once you make it deterministic, there will be a chance of collision, which may lead to a crash, which this pull request is trying to fix, no?

The PR does fix collisions between different tests that would otherwise happen when RANDOM_CTX_SEED was set to the same value, which was resulting in the same folder name being used. (RANDOM_CTX_SEED=21 make -j10 check).

Yes, I understand the code diff. I am just saying that it would be nicer if running the same test also didn’t crash for that reason, when run in parallel. It seems plausible that someone would want to do that, given that it is possible and supported right now (when not setting RANDOM_CTX_SEED).

Otherwise, someone else will report that as an issue and the code has to be changed again.

The extremely paranoid would say that the folder name generation could influence test success/failure

Ok, that helps to understand your concern. However, I’d say that the folder name is printed. So if it lead to an issue, it would be trivial to reproduce.

(If it is not printed on failure, then that should be fixed)

hodlinator commented at 1:25 pm on August 29, 2024:

On the other hand, you (together with me) ACKed #29625 which contains 97e16f57042cab07e5e73f6bed19feec2006e4f7, moving the seeding prior to directory naming randomization. Do you now disagree with it?

I think it was simply an oversight. At least for me.

Alright, thanks for addressing this. I think it would be good to wait for sipa to have time as this PR continues in the same direction as his commit.

Yes, I understand the code diff. I am just saying that it would be nicer if running the same test also didn’t crash for that reason, when run in parallel. It seems plausible that someone would want to do that, given that it is possible and supported right now (when not setting RANDOM_CTX_SEED).

Otherwise, someone else will report that as an issue and the code has to be changed again.

If they don’t want to set different RANDOM_CTX_SEED they can set different -testdatadir (and probably should, for clarity in case there are failures).

The extremely paranoid would say that the folder name generation could influence test success/failure

Ok, that helps to understand your concern. However, I’d say that the folder name is printed. So if it lead to an issue, it would be trivial to reproduce.

It will be even more trivial to reproduce if one can just set the RANDOM_CTX_SEED env var and re-run the same binary without modifying and recompiling the code (and potentially having to set up a git & build environment).

maflcko commented at 5:40 am on August 30, 2024:

It will be even more trivial to reproduce if one can just set the RANDOM_CTX_SEED env var and re-run the same binary without modifying and recompiling the code (and potentially having to set up a git & build environment).

I just think this optimized in the wrong direction. Has there ever been a known issue due to a fixed-size ascii-hex-encoded path element? Stuff like variable-size or non-hex-ascii are known to possibly find issues, but a fixed size [a-f0-9] element, I am not aware of.

Moreover, adding code like fuzz_env['RANDOM_CTX_SEED'] = random_seed to the fuzz runner also seems to go in the wrong direction a bit. The goal of the fuzz tests is to be fully deterministic and stable, within a single run and across different runs. If the fuzz randomness is seeded with anything other than stuff provided by the fuzz engine, it is easy to get non-determinism, or coverage instability.

An having different types of “seed” in the fuzz context is also confusing.

If they don’t want to set different RANDOM_CTX_SEED they can set different -testdatadir (and probably should, for clarity in case there are failures).

Well, the only point of setting RANDOM_CTX_SEED is to reproduce a past failure. However, if you want to go down the route of requiring to set a different testdatadir when wanting to run the same test in parallel, it may be best to fully go that route. Instead of a hash, just use the test name as-is? This would also be parallel to the functional tests.

maflcko commented at 12:07 pm on October 2, 2024:

However, if you want to go down the route of requiring to set a different testdatadir when wanting to run the same test in parallel, it may be best to fully go that route. Instead of a hash, just use the test name as-is? This would also be parallel to the functional tests.

For reference, #31000 should be one step into that direction.

DrahtBot added the label Needs rebase on Aug 28, 2024

jonatack commented at 4:35 pm on August 28, 2024: member

Will review after rebase.

test: bench+fuzz - Implement G_TEST_GET_FULL_NAME dfef02f938

test: Make fuzz/test_runner pass on RANDOM_CTX_SEED if set 8e5f8a4f77

test: Fix RANDOM_CTX_SEED use with parallel tests

To fix the issue with test directories clashing when running in parallel with the env var RANDOM_CTX_SEED set, we now mix in the test name.

Issue was introduced by 97e16f57042cab07e5e73f6bed19feec2006e4f7, and comment for g_insecure_rand_ctx_temp_path became invalid with that commit. Make the determinism clear through removing the no longer useful g_insecure_rand_ctx_temp_path.

Fixes #30696.

065f0f66d6

hodlinator force-pushed on Aug 29, 2024

DrahtBot removed the label Needs rebase on Aug 29, 2024

DrahtBot added the label Needs rebase on Sep 6, 2024

DrahtBot commented at 9:12 am on September 6, 2024: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

maflcko commented at 2:49 pm on October 16, 2024: member

Cross-ref to #30748 (comment), depending on how this evolves.

hodlinator commented at 1:52 pm on November 14, 2024: contributor

Closing since the main issue was fixed in a different way by the now merged #31000.

hodlinator closed this on Nov 14, 2024

hodlinator deleted the branch on Nov 14, 2024

maflcko commented at 4:19 pm on November 14, 2024: member

I think the issue in the fuzz tests still persists.

hodlinator commented at 9:42 pm on November 14, 2024: contributor

I think the issue in the fuzz tests still persists.

That’s correct. Fuzzing was not my main concern, but could maybe resurrect that part of this PR.

I’ve rebased on current master and re-pushed the branch I had deleted, but am not allowed to reopen this one. https://github.com/bitcoin/bitcoin/compare/master...hodlinator:bitcoin:2024-08/RANDOM_CTX_SEED_jobs_fix_alt

Are you still against passing through RANDOM_CTX_SEED in the test_runner.py? In that case I can drop that commit.

maflcko commented at 7:56 am on November 15, 2024: member

If you want you can just open a new pull, it will have to be reviewed from scratch anyway.

Are you still against passing through RANDOM_CTX_SEED in the test_runner.py? In that case I can drop that commit.

Yes, adding code like fuzz_env['RANDOM_CTX_SEED'] = random_seed to the fuzz runner also seems to go in the wrong direction a bit. The goal of the fuzz tests is to be fully deterministic and stable, within a single run and across different runs. If the fuzz randomness is seeded with anything other than stuff provided by the fuzz engine, it is easy to get non-determinism, or coverage instability. (Copied from #30737 (review))

If you disagree, it would be good to explain why.

maflcko commented at 11:55 am on November 18, 2024: member

For reference, the fuzz issue can be hit more frequent on machines that do not have nanosecond precision. GCE may not have it, according to the oss-fuzz crashes of the form:

0libc++abi: terminating due to uncaught exception of type std::__1::__fs::filesystem::filesystem_error: filesystem error: in remove_all: No such file or directory ["/tmp/test_common bitcoin/1731742732339168000/regtest/fuzzed_i2p_private_key"]

Ref: https://oss-fuzz.com/testcase-detail/5425475725361152

https://issues.oss-fuzz.com/issues/379418970

Edit: Fixed in https://github.com/bitcoin/bitcoin/pull/31317

maflcko commented at 7:15 am on November 20, 2024: member

Edit: Fixed in #31317

My understanding is that with the current state of that pull, all outstanding observable issues mentioned in this thread are fixed? If not, please leave a comment.

There may or may not be some small style-fixups/follow-ups such as #31317 (review), but no real issue to be fixed.

hodlinator commented at 10:30 am on November 20, 2024: contributor

Agree that issues here are fixed. For orderliness, maybe the fuzz-part of dfef02f9382579352abcb4fcc8d85170de9f441a from this PR could be done as a new PR just to ensure named folders?

maflcko commented at 10:53 am on November 20, 2024: member

Ah yes. I am happy to review a pull for named folders when fuzzing

hodlinator commented at 2:19 pm on November 20, 2024: contributor

New PR #31333.

test: Fix RANDOM_CTX_SEED use with parallel tests #30737

Code Coverage

Reviews

Conflicts