- Fix intermittent failures like https://cirrus-ci.com/task/4571677035069440 and https://cirrus-ci.com/task/6566987122868224
- Document and tidy up RPC walletpassphrase MAX_SLEEP_TIME/nSleepTime
- Add test logging and an assertion and use named args
test: fix intermittent wallet_encryption failures on win64 task #25021
pull jonatack wants to merge 3 commits into bitcoin:master from jonatack:fix-intermittent-wallet_encryption-test changing 3 files +41 −38-
jonatack commented at 10:32 pm on April 28, 2022: contributor
-
jonatack force-pushed on Apr 28, 2022
-
jonatack force-pushed on Apr 28, 2022
-
jonatack force-pushed on Apr 28, 2022
-
jonatack force-pushed on Apr 28, 2022
-
DrahtBot added the label Tests on Apr 28, 2022
-
mzumsande commented at 0:11 am on April 29, 2022: contributorStrange that this happens twice in a day suddenly, without any recent changes that look related. The linked CI runs fail at different subtests, one at (“Check a timeout less than the limit”) and one at (“Check a timeout greater than the limit”) - so a fix should probably also affect both.
-
MarcoFalke commented at 5:38 am on April 29, 2022: memberI can’t see how this would happen unless the python clock is ahead of the system clock??
-
jonatack force-pushed on Apr 29, 2022
-
jonatack commented at 12:27 pm on April 29, 2022: contributor
Yeah, it’s weird. Win64-only so far. Let’s see if there are more failures.
Thanks @mzumsande, re-pushed with a smaller additional margin for both tests.
-
jonatack force-pushed on Apr 29, 2022
-
jonatack commented at 4:35 pm on April 29, 2022: contributor
Another one: https://cirrus-ci.com/task/4631236957175808
0 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\wallet_encryption.py", line 93, in run_test 1 assert_greater_than(expected_time_with_buffer, actual_time) 2 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\util.py", line 56, in assert_greater_than 3 raise AssertionError("%s <= %s" % (str(thing1), str(thing2))) 4 5AssertionError: 1751237054.993892 <= 1751237055
-
MarcoFalke added the label Windows on Apr 29, 2022
-
jonatack renamed this:
test: fix intermittent wallet_encryption functional test
test: fix intermittent wallet_encryption failures on win64 task
on Apr 29, 2022 -
bitcoin deleted a comment on Apr 29, 2022
-
bitcoin deleted a comment on Apr 29, 2022
-
jonatack commented at 2:15 pm on May 3, 2022: contributor
-
vasild commented at 4:27 am on May 6, 2022: contributor
I think a possible explanation would be if the
GetTime()
in the C++ code on Windows ends up rounding up to a whole second. So, for example:- expected time = int(10.5) + MAX = 10 + MAX
- 0.1 second passes
- walletpassphrase RPC is executed, asking it to keep the phrase for MAX seconds, it calculates now=ceil(10.6) + MAX = 11 + MAX // ceil() wtf!?
- 0.1 second passes
- expected with buffer = 10.7 + MAX // not truncated to whole seconds
So, we end up with actual time (11 + MAX) < expected with buffer (10.7 + MAX).
If this guess is correct, then the following should fix it:
0before = math.floor(time.time()) + MAX_VALUE 1self.nodes[0].walletpassphrase(passphrase2, MAX_VALUE + 1000) 2after = math.ceil(time.time()) + MAX_VALUE 3actual = self.nodes[0].getwalletinfo()['unlocked_until'] 4assert_greater_than_or_equal(actual, before) 5assert_greater_than_or_equal(after, actual)
-
jonatack force-pushed on May 10, 2022
-
in test/functional/wallet_encryption.py:76 in 2178ad2e79 outdated
88- self.nodes[0].walletpassphrase(passphrase2, MAX_VALUE + 1000) 89- expected_time_with_buffer = time.time() + MAX_VALUE 90- actual_time = self.nodes[0].getwalletinfo()['unlocked_until'] 91- assert_greater_than_or_equal(actual_time, expected_time) 92- assert_greater_than(expected_time_with_buffer, actual_time) 93+ timeout_limit = 100000000 # maximum time to keep the decryption key, in seconds (~3 years)
vasild commented at 6:21 am on May 11, 2022:nit: maybe mention that this is the same asMAX_SLEEP_TIME
insrc/wallet/rpc/encrypt.cpp
.
jonatack commented at 3:15 pm on May 12, 2022:Done – seems best to avoid specifying the file to reduce future maintainance but added the constant name and documentation where it is declared; good idea.vasild approvedvasild commented at 6:24 am on May 11, 2022: contributorACK 76f8dc9bd1b4efd543dc54c17a5f8e9d512580e1jonatack force-pushed on May 12, 2022MarcoFalke commented at 3:25 pm on May 12, 2022: memberIs this still an issue? I haven’t seen it recently.jonatack commented at 3:30 pm on May 12, 2022: contributorHaven’t seen it in a week. Could drop the fix for now and leave the improvements.ryanofsky commented at 9:34 pm on May 12, 2022: contributorJust saw this error: https://cirrus-ci.com/task/4948737314258944?logs=functional_tests#L25 in #24230
Log:
0 node0 2022-05-12T20:03:02.859261Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\rpc\request.cpp:179] [parse] ThreadRPCServer method=walletpassphrase user=__cookie__ 1 node0 2022-05-12T20:03:03.012258Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\rpc\server.cpp:530] [RPCRunLater] queue run of timer lockwallet(default_wallet) in 99999400 seconds (using HTTP) 2 test 2022-05-12T20:03:03.014000Z TestFramework (ERROR): Assertion failed 3 Traceback (most recent call last): 4 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\test_framework.py", line 133, in main 5 self.run_test() 6 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\wallet_encryption.py", line 85, in run_test 7 assert_greater_than(expected_time_with_buffer, actual_time) 8 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\util.py", line 56, in assert_greater_than 9 raise AssertionError("%s <= %s" % (str(thing1), str(thing2))) 10 AssertionError: 1752385182.999136 <= 1752385183
MarcoFalke commented at 5:33 am on May 13, 2022: memberOk, then I think we should only maim the test on windows, not on all platforms.vasild approvedMarcoFalke commented at 9:37 am on May 13, 2022: memberWhy “maim”? The test functionality if fully preserved, no?
Sure, “maim” is a bit too strong. It is making the test lax by up to another second, while no one understands why.
MarcoFalke commented at 1:03 pm on May 13, 2022: memberWould be nice if someone could run this on msvc:
0#include <iostream> 1#include <chrono> 2#include <string> 3 4template <typename TimePoint> 5std::string to_string(const TimePoint& time_point) 6{ 7 return std::to_string(time_point.time_since_epoch().count()); 8} 9 10int main() 11{ 12 using namespace std::literals::chrono_literals; 13 using Sec = std::chrono::seconds; 14 15 std::cout << "Time point\t" "Cast\t" "Floor\t" "Round\t" "Ceil\n"; 16 std::cout << "(ms)\t\t" "(s)\t" "(s)\t" "(s)\t" "(s)\n"; 17 for (const auto value_ms: {5432ms, 5678ms}) { 18 std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds> 19 time_point_ms(value_ms); 20 21 std::cout 22 << to_string(time_point_ms) << "\t\t" 23 << to_string(std::chrono::time_point_cast<Sec>(time_point_ms)) << "\t" 24 << to_string(std::chrono::floor <Sec>(time_point_ms)) << "\t" 25 << to_string(std::chrono::round <Sec>(time_point_ms)) << "\t" 26 << to_string(std::chrono::ceil <Sec>(time_point_ms)) << "\n"; 27 } 28}
From https://en.cppreference.com/w/cpp/chrono/time_point/ceil. Should print:
0Time point Cast Floor Round Ceil 1(ms) (s) (s) (s) (s) 25432 5 5 5 6 35678 5 5 6 6
vasild commented at 1:26 pm on May 13, 2022: contributorIt is making the test lax by up to another second
That is true, but the point of the test is to ensure that the time limit is honored - 1. cannot be negative, 2. is capped at some limit and 3. if within the limits, then is accepted. The original test in 134cdc7cee3da7c554e40ad947a9cdcbb3069f13 contained 5 seconds buffer, so the point of the test is not to check that the value is accepted up to the second, exactly.
in test/functional/wallet_encryption.py:86 in 1a82609ce0 outdated
111+ timeout = timeout_limit - 600 112+ before = floor(time.time()) + timeout 113+ self.nodes[0].walletpassphrase(passphrase2, timeout) 114+ after = ceil(time.time()) + timeout 115+ actual = self.nodes[0].getwalletinfo()['unlocked_until'] 116+ assert before <= actual <= after
vasild commented at 1:28 pm on May 13, 2022:Would this print the values if the assertion fails?
0>>> a=1 1>>> b=2 2>>> c=3 3>>> assert a <= b <= c 4>>> 5>>> c=0 6>>> assert a <= b <= c 7Traceback (most recent call last): 8 File "<stdin>", line 1, in <module> 9AssertionError 10>>>
If not, then maybe revert back to using
assert_greater_than()
because seeing the values in CI logs is so precious in debugging this.
jonatack commented at 3:05 pm on August 10, 2022:Good point. Updated the assertion error to print i.e.AssertionError: walletpassphrase "unlocked_until" value of 1760141911 not in expected range of 1760141909..1760141910
MarcoFalke commented at 1:43 pm on May 13, 2022: memberso the point of the test is not to check that the value is accepted up to the second, exactly.
Right. I don’t care about this test.
nRelockTime
is only there for aesthetic reasons, it can’t be relied upon anyway, since it doesn’t take into account how long it takes to register the event in libevent. (If that takes 5 seconds, the value is off by 5 seconds)I mostly worry the clock on windows doesn’t work as expected and it will lead to more severe issues down the line (extremely unlikely, but worst case a consensus split)
vasild commented at 1:54 pm on May 13, 2022: contributorOk, in case it helps, here is another test prog I played with on my computer (not windows):
0int main(int, char**) 1{ 2 const auto now = std::chrono::system_clock::now().time_since_epoch(); 3 std::cout << std::chrono::duration_cast<std::chrono::seconds>(now).count() << std::endl 4 << std::chrono::duration_cast<std::chrono::milliseconds>(now).count() << std::endl; 5 return 0; 6}
It prints, for example:
01652449734 11652449734758
which confirms my implementation is rounding down, as in the spec:
casting to an integer duration is subject to truncation as with any static_cast to an integer type
PS: even if the problem is elsewhere I think the changes in this PR make the test more robust and readable.
MarcoFalke commented at 1:59 pm on May 13, 2022: membercc @sipsorcery Could you please try #25021 (comment) on msvc?sipsorcery commented at 1:37 pm on May 15, 2022: memberVisual Studio 2022 v17.1.6 9 (C++ compiler version 19.31.31107 for x86)
0Time point Cast Floor Round Ceil 1(ms) (s) (s) (s) (s) 25432 5 5 5 6 35678 5 5 6 6
in test/functional/wallet_encryption.py:84 in 1a82609ce0 outdated
109+ 110+ self.log.info("Test walletpassphrase with a timeout less than the limit") 111+ timeout = timeout_limit - 600 112+ before = floor(time.time()) + timeout 113+ self.nodes[0].walletpassphrase(passphrase2, timeout) 114+ after = ceil(time.time()) + timeout
MarcoFalke commented at 5:12 pm on May 15, 2022:This doesn’t seem right. If the Bitcoin Core clock was doing
ceil
, we’d see a failure rate of about 100%.round
seems unlikely as well, as the failure rate would be about 50%.Statistically, the error on windows should be 10-100ms, which can’t be explained by cast/round/ceil.
vasild commented at 1:53 pm on May 19, 2022:Well, this just ignores sub-second fractions and accounts for the time passed between the calls, keeping the original intention of the test (I hope!). Do you think it could still fail on Windows?MarcoFalke commented at 9:17 am on May 20, 2022: memberAn alternative to this would be to use mocktime to set the exact seconds and then check the exact result. (Obviously mocktime couldn’t shorten or extend the actual locking duration, but the test doesn’t check that anyway).MarcoFalke commented at 9:24 am on May 20, 2022: member(I still don’t know what the correct fix is, just handing out thoughts fyi)jonatack commented at 3:41 pm on May 20, 2022: contributorAnother one https://cirrus-ci.com/task/5010386603737088
0176/244 - wallet_encryption.py --legacy-wallet failed, Duration: 6 s 1stdout: 22022-05-20T15:08:20.442000Z TestFramework (INFO): Initializing test directory C:\Users\ContainerAdministrator\AppData\Local\Temp\test_runner_₿_🏃_20220520_141819\wallet_encryption_66 32022-05-20T15:08:25.754000Z TestFramework (INFO): Check a timeout less than the limit 42022-05-20T15:08:25.879000Z TestFramework (INFO): Check a timeout greater than the limit 52022-05-20T15:08:26.006000Z TestFramework (ERROR): Assertion failed 6Traceback (most recent call last): 7 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\test_framework.py", line 133, in main 8 self.run_test() 9 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\wallet_encryption.py", line 93, in run_test 10 assert_greater_than(expected_time_with_buffer, actual_time) 11 File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\util.py", line 56, in assert_greater_than 12 raise AssertionError("%s <= %s" % (str(thing1), str(thing2))) 13AssertionError: 1753059305.991002 <= 1753059306 142022-05-20T15:08:26.069000Z TestFramework (INFO): Stopping nodes
jonatack commented at 12:15 pm on May 25, 2022: contributorjonatack commented at 6:30 pm on May 27, 2022: contributorvasild commented at 1:03 pm on July 13, 2022: contributorI think this is an improvement over the currentmaster
, even not considering the test failures.MarcoFalke commented at 11:31 am on July 19, 2022: memberIs this still happening after the windows upgrade?MarcoFalke commented at 1:40 pm on July 25, 2022: memberI haven’t seen it yet, so it indeed seems like a bug in Windows 2019jonatack commented at 8:11 am on July 27, 2022: contributorjonatack commented at 3:28 pm on July 28, 2022: contributorvasild commented at 4:24 am on July 29, 2022: contributorjonatack commented at 2:12 pm on July 30, 2022: contributorjonatack commented at 2:53 pm on July 30, 2022: contributorSeems this won’t be merged.jonatack closed this on Jul 30, 2022
MarcoFalke commented at 3:05 pm on July 30, 2022: memberThe reason I haven’t merged this yet is that it has only one ACK, but outstanding review comments such as #25021 (review) and #25021 (review)
Also, I am not sure if a test fix should modify the cpp source code. No objection to those changes, but maybe try to submit them separate from a test fix?
vasild commented at 12:32 pm on August 10, 2022: contributorThe reason for closing - “this won’t be merged” seems unjustified to me. If Marco did not merge it in the past, that does not mean he will not merge it in the future. Or that another maintainer will not merge it in the future.
Indeed it has just one ACK. @achow101, maybe you can review this, since you are the original author of the tests.
I think this PR should be reopened, IMO it is an improvement even if it does not fix the mysterious test failures.
MarcoFalke commented at 12:35 pm on August 10, 2022: memberFor reference, I created #25750 two weeks agojonatack commented at 2:23 pm on August 10, 2022: contributorI think this PR should be reopened, IMO it is an improvement even if it does not fix the mysterious test failures.
Ok, re-opening.
Note also that many test changes (and regular codebase ones, #25656 would be an example merged today) are merged with one ACK, so this doesn’t seem to be a hard rule.
jonatack reopened this on Aug 10, 2022
Work around intermittent Win failure in walletpassphrase tests
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
Document and tidy up RPC walletpassphrase MAX_SLEEP_TIME/nSleepTime
note: using std::min here shouldn't add to the compilation time of src/wallet/rpc/encrypt.cpp, as src/wallet/wallet.h that it depends on already includes the algorithm library. Also added this file to the clang-tidy include-what-you-use CI script and updated the headers.
Add logging and an assertion to wallet_encryption tests and use named args 189f57bd03jonatack force-pushed on Aug 10, 2022jonatack commented at 3:08 pm on August 10, 2022: contributorUpdated to improve assertion error, changed the first commit message from “fix” to “work around” and rebased to current master to enable clang-tidy iwyu checks on the second commit (will re-push after the clang-tidy CI run). Happy to split out the some of changes to a separate pull or to update the approach if reviewers prefer.fanquake commented at 3:20 pm on August 10, 2022: memberhttps://github.com/bitcoin/bitcoin/pull/25021/files#r873198183
Has this been is addressed? If yes, can you mark it as resolved, if not, can you comment?
changed the first commit message from “fix” to “work around”
If this is now a work around, rather than a fix, I’d rather use the work-around that is contained to the platform that is broken, i.e #25750, rather than refactoring code that works for all other platforms, and essentially hiding the bug on Windows. At least until we properly understand the issue; see debugging in #25482.
The PR title and description need updating in any case.
MarcoFalke commented at 5:00 pm on August 10, 2022: memberCan you explain why the workaround here is better than the one in #25750?DrahtBot commented at 1:40 pm on August 11, 2022: contributorThe following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #25878 (tests: Use mocktime for wallet encryption timeout by achow101)
- #25750 (test: Temporarily work around Windows bug in walletpassphrase by MarcoFalke)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
achow101 commented at 6:21 pm on August 15, 2022: memberIt seems like instead of messing around with buffers and relaxing ranges, this could just be resolved through the use of mock time so that bitcoind is set to a fixed time.jonatack commented at 6:25 pm on August 15, 2022: contributorI’ll give that a shot 👍MarcoFalke commented at 5:23 pm on August 18, 2022: memberFor reference, I suggested mocktime in May: #25021 (comment)
This should be fine to do here, as the time returned has no guarantee to be related to the real relock time, which happens completely independent in the libevent scheduler thread.
jonatack closed this on Aug 20, 2022
bitcoin locked this on Aug 20, 2023
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 06:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me