This change rushes a fix for #23099.
This approach works for test/functional/test_runner.py in my system, changes to test/fuzz/test_runner.py are untested at the time of writing (I don't know how to run this command).
This change rushes a fix for #23099.
This approach works for test/functional/test_runner.py in my system, changes to test/fuzz/test_runner.py are untested at the time of writing (I don't know how to run this command).
Concept ACK
182 | @@ -182,6 +183,11 @@ def main(): 183 | use_valgrind=args.valgrind, 184 | ) 185 | 186 | + shutil_disk_usage_stat_ = shutil.disk_usage(args.corpus_dir) 187 | + if shutil_disk_usage_stat_.free < 1: 188 | + logging.debug("No space left on device") 189 | + sys.exit(-1)
Is this needed? Pretty sure libFuzzer will print this error by itself.
Also, the fuzz inputs usually do not reside in the /tmp dir, which was the motivation for this patch?
543 | @@ -544,6 +544,11 @@ def run_tests(*, test_list, src_dir, build_dir, tmpdir, jobs=1, enable_coverage= 544 | logging.debug("Early exiting after test failure") 545 | break 546 | 547 | + shutil_disk_usage_stat_ = shutil.disk_usage(testdir) 548 | + if shutil_disk_usage_stat_.free < 1:
I think this can be bumped to 16 M, because it is not possible to create a datadir that is smaller than that.
The first block file will pre-allocate 16M:
16M /tmp/aaa/regtest/blocks/blk00000.dat
We cannot apply that, because that could make the output line "no free space left on device" be shown when that's not really the failure reason.
Though I've applied your idea in a different way.
543 | @@ -544,6 +544,11 @@ def run_tests(*, test_list, src_dir, build_dir, tmpdir, jobs=1, enable_coverage= 544 | logging.debug("Early exiting after test failure") 545 | break 546 | 547 | + shutil_disk_usage_stat_ = shutil.disk_usage(testdir) 548 | + if shutil_disk_usage_stat_.free < 1: 549 | + logging.debug("No space left on device") 550 | + sys.exit(-1)
This will leave dangling processes
Unable to reproduce on my system
when current test is feature_taproot.py --previous_release
when execution flow reaches the line (don't know what other
tests were running concurrently).
26 | @@ -27,6 +27,15 @@ 27 | import logging 28 | import unittest 29 | 30 | +# Should fail testing and immediately stop if free space falls below. 31 | +# This would be checked before running any and every test. 32 | +# TODO consider increasing to 16 * 1024 * 1024 (https://github.com/bitcoin/bitcoin/pull/23100#discussion_r716825510)
why add a todo?
because you wrote you think a test requires 16M
a todo is the best I can add here about that
I think it would be better to change the number instead of adding a TODO to change the number
I think so as well, but I can't personally commit it
why?
because I don't trust you, because I don't even know you, because one should only commit code one can take ownership of, because I can't take ownership of an assertion you seem not to be even sure of, because I would lie by committing it!
You don't have to trust me. The value is in the source code BLOCKFILE_CHUNK_SIZE, or you can test it yourself: #23100 (review)
Even if wasn't, what is the worst thing that could happen by increasing the number?
The value is in the source code
BLOCKFILE_CHUNK_SIZE, or you can test it yourself
I would have a look, thanks.
I was hoping that many of the >200 functional tests would not strictly require that a large space reserve. 🙄
Even if wasn't, what is the worst thing that could happen by increasing the number?
In case the constant didn't apply to all tests, it would result on unnecessary test halting for maybe successfully passable tests, and even maybe aborting a test run able to end successfully.
I have had a casual look, I'd say there are many functional tests ending and being removed upon success way before reaching 16MiB, I'm feeling like removing MIN_FREE_SPACE_BEFORE_TEST. 🤔
There's even some functional tests occupying only KiBs and in a blink of an eye it's successful and gone.
In a few days I'd be removing MIN_FREE_SPACE_BEFORE_TEST.
@katesalazar just a few notes:
MIN_FREE_SPACE_BEFORE_TEST is currently unused, so that needs to be corrected (unless you decide to remove it - but it think it is better to check there is enough space before starting each test, rather than checking after each).@meshcollider wrote:
- I agree with Marco's comments about the 16M (and note that he is the test framework maintainer, so please do take his advice seriously 🙂)
Does agreeing come simply from trusting Marco, or does it come from real positive a posteriori knowledge of the technology, you have?
Again, no one asked you to trust me. Everything is in the source code, which you can read or execute yourself: #23100 (review)
You keep not proving your point, I am not applying your suggestion myself. It's better you commit your unproven magic numbers under your name in a subsequent PR or an alternative PR.
I am not applying your suggestion myself. It's better you commit your unproven magic numbers under your name
That's fine, you don't have to do anything you don't want to. However, this Pull Request is not going to be merged with the review comments left unaddressed, or the TODO left in the code. So you either need to address everything, or you can close this PR.
I've watched example_test.py failing to run successfully, apparently refusing to start under ~10M free space. It's likely I'll investigate and end up adding Marco's suggestion.
I think it can refuse to boot with 'Disk space too low' with more than 16M free, tried using 30M, it seems it refused to boot.
26 | @@ -27,6 +27,9 @@ 27 | import logging 28 | import unittest 29 | 30 | +# Should fail testing and immediately stop if free space falls below. 31 | +MIN_FREE_SPACE = 16 * 1024 * 1024 # Could be larger.
Instead of a comment "Could be larger" it could make sense to explain why this was chosen or where the magic number is from.
Also, why are you not choosing the value from constexpr min_disk_space?
550 | @@ -546,6 +551,12 @@ def run_tests(*, test_list, src_dir, build_dir, tmpdir, jobs=1, enable_coverage= 551 | logging.debug("Early exiting after test failure") 552 | break 553 | 554 | + shutil_disk_usage_stat_ = shutil.disk_usage(testdir) 555 | + if (shutil_disk_usage_stat_.free < MIN_FREE_SPACE 556 | + and i < test_count - jobs): 557 | + logging.debug("No space left on device to continue testing") 558 | + break
The comment below " # Clean up dangling processes if any. This may only happen with --failfast option." will need to be updated
Yes! I will just let the CI build finish successfully before changing it.
26 | @@ -27,6 +27,11 @@ 27 | import logging 28 | import unittest 29 | 30 | +# Should (in general but not always) fail testing and immediately stop if free space falls below. 31 | +# Constant value 50M comes from `bool CheckDiskSpace(2)` at 32 | +#Â `src/util/system.h` via uses in `src/node/blockstorage.cpp`.
Please don't put source file names / C++ implementation details here, as they are likely to just become outdated in regards to the code. Also, comments like "in general but not always" are not really useful, unless they are elaborated on (and that would be overkill here), as the reader is just left wondering about the "but not always" conditions.
Please squash your commits again :)
It is good practice to squash after every fixup so that the PR is immediately ready to review and merge. Otherwise, squashing later will invalidate the ACKs on that commit hash.
Conceptually it'd be cleaner if you could detect Bitcoin Core complaining about free space once and then abort the test, but I don't know if that's feasible.
It is good practice to squash after every fixup so that the PR is immediately ready to review and merge. Otherwise, squashing later will invalidate the ACKs on that commit hash.
Understood and grabbed this advice for urgent changes. Which you won't see me taking care of in the next maybe 5 or 10 years, for reasons.
I'll squash on demand, because that's the rules. But squashing is a lossy* process, on Git and on GitHub, and I don't think I should be doing it lightly.
* Let's imagine for a while the references log does not exist.
Conceptually it'd be cleaner if you could detect Bitcoin Core complaining about free space once and then abort the test, but I don't know if that's feasible.
Could spawn a thread which monitors outputs looking for "Disk space too low". Probably that output and more cases, "No space left on drive" (IDK exactly)...
That is a great idea if a test can take a very long time to naturally abort once is theoretically clear it will fail. I would doubt that's the case, but IDK.
BTW I very much appreciate your patience with my bi-polar bear things, thank you.
dc439011d155033dec4ad0af2059808ce9b8cd73 is an amend of (only) the commit message of 646029c4365bc921a84cf8952200d81f6e68b1b4 (which passed CI checks).
But squashing is a lossy* process, on Git and on GitHub, and I don't think I should be doing it lightly.
You should, because in general people don't review PRs that look WIP - and unsquashed fixup commits immediately signal WIP. PRs are required to be clean and tidy for merge. There's no need to worry about loss because the new commit should always be an improvement over the old. Feel free to keep a backup of your old commits on a separate branch locally though if you want to.
utACK dc439011d155033dec4ad0af2059808ce9b8cd73
Thanks also for your patience with the review process :)
Concept ACK
It happened more than once to me that /tmp got clogged up with left-overs of failed test runs that I forgot to clean, so I'm supporting an early low-space detection. Will test later.
FWIW, on my system, functional tests start to fail already when less than ~67MB are available on the /tmp partition. This matches the code in CheckDiskSpace which needs min_disk_space (50MB) plus the value additional_bytes that is passed to this function, which in turn are the 16MB mentioned above for first block pre-allocation.
FWIW, on my system, functional tests start to fail already when less than ~67MB are available on the /tmp partition. This matches the code in
CheckDiskSpacewhich needsmin_disk_space(50MB) plus the valueadditional_bytesthat is passed to this function, which in turn are the 16MB mentioned above for first block pre-allocation.
I never tested those conditions, but really looked like it should behave the way you describe.
Let's set MIN_FREE_SPACE to ((50 + 16) * 1024 * 1024).
A ((50 + 16) * 1024 * 1024) value should be better. I'll wait a few days and if there's no opposition I'll submit such change and report it to ACKer/s.
Thanks @theStack.
@meshcollider wrote:
utACK dc43901
I pushed 4b41e9d7745a6856263f on top of dc439011d155033dec.
I don't know what's the software allowing to push an 4b41e9d7745a6856263f ACK (would that be the case) on top of an dc439011d155033dec ACK, more importantly I don't know what's the software allowing to automatically drafting a replay of an 4b41e9d7745a6856263f ACK (would that be the case) on the rev resulting from squashing 4b41e9d7745a6856263f in dc439011d155033dec.
As the code should be the same and only the commit message should have to be reviewed, a tool should automatically draft such ACK on the squashed rev for accelerating the review process.
If that's a manual process as I suspect, then this tool fails to provide the features this project needs.
@katesalazar I don't understand your previous comment. In any case, you need to squash your commits. You should just be doing that by default.
@fanquake wrote:
I don't understand your previous comment.
Don't worry, I'm surely missing some task you do when you review.
Let's suppose there is a rev A changing 1000 lines, it's acknowledged. A change happens and rev B has to be submitted changing 1 single line.
Submitting B on top of A is straight forward and safe.
You want me to directly push B fixed up in A, i.e. rev C, so that C(codebase) = B(A(codebase)) = (B∘A)(codebase).
I just need you to teach me what's the command that tells you that I didn't sneak a rogue change in C changing some 10-ish lines unexpectedly.
How do you compare A to C and get the equivalent to B without me even ever pushing B?
How do you compare A to C and get the equivalent to B without me even ever pushing B?
Clarification,
I know git diff A C gives it; but my question is, do you really do this to
see B? Or do you just read all over all of the changes in A once again
when C is pushed?
You probably want git range-diff. I'd suggest reading our productivity docs.
You probably want
git range-diff. I'd suggest reading our productivity docs.
Oh, we stand on the shoulders of giants!
But even if range-diff works, how would you make sure that the old feature branch tip revision (rev) isn't removed by the Git garbage collector when you finally range-diff a feature branch tip against the previously reviewed rev which is no longer part of a branch AKA reference (ref)?
A reviewed rev is worth a lot more an asset than an unreviewed rev. You would want to make sure that the garbage collector doesn't remove a reviewed rev orphaned from a reference until the new tip is reviewed.
Don't you need to reconfigure the references log or the garbage collector so that they use a larger buffer or timeframe?
Or what you do is to just never update your local copy of a feature branch ref unless you are going to review the new feature branch tip right away, hence the previously reviewed rev orphaned from the pull request ref becomes obsolete and discardable in the single step of reviewing the new feature branch tip?
You can create a branch or tag to make sure git doesn't gc it
Understood. Good enough, thanks.
Concept ACK. I tend to run functional tests on a RAM disk which occasionally runs out of space, resulting in weird behavior. Hopefully this PR helps.
save dev time good :) code review ack is invested time ;D thanks for the necromancy, cheers
On Thu, Dec 9, 2021 at 11:45 AM Sjors Provoost @.***> wrote:
Concept ACK. I tend to run functional tests on a RAM disk which occasionally runs out of space, resulting in weird behavior. Hopefully this PR helps.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bitcoin/bitcoin/pull/23100#issuecomment-989733726, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMRS4WZBYMCU4X3DRKAEJRDUQCCGHANCNFSM5EY4R27A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
As an exception, last functional tests to be run do not impose drive
space requirements.
Some neighboring lines do not follow PEP8 strictly, so neither does
some of the lines in this change.
Includes many suggestions courtesy of MarcoFalke and fanquake and
theStack.
564 | @@ -565,7 +565,11 @@ def run_tests(*, test_list, src_dir, build_dir, tmpdir, jobs=1, enable_coverage= 565 | logging.debug("Early exiting after test failure") 566 | break 567 | 568 | - shutil_disk_usage_stat_ = shutil.disk_usage(testdir) 569 | + try: # XXX 570 | + shutil_disk_usage_stat_ = shutil.disk_usage(testdir) 571 | + except FileNotFoundError: # XXX
have to investigate why this fixed CI
This reverts commit 4bc38b1a74c772d608bd947ef92e7faac3e47314.
This reverts commit 4af6367cdf39b85202ededa879c89a2adb5dcf92.
have to investigate why this fixed CI
What is the status of this? Are you still working on it? It's not mergable in it's current state.
What is the status of this? Are you still working on it?
This was finished at some point, but I failed to keep up with changes in the master branch.
It's not mergable in it's current state.
I don't have the time, hardware and drive I'd want in order to clean this PR up (4bc38b1a74) but I'm happy to let it go whenever you need so.
I don't have the time, hardware and drive I'd want in order to clean this PR up
That's fine. We'll close it as "Up for Grabs".