test: Fix intermittent failure in feature_dbcrash #19022
pull MarcoFalke wants to merge 1 commits into bitcoin:master from MarcoFalke:2005-testFixIntermittentFailDbCrash changing 1 files +6 −1-
MarcoFalke commented at 10:59 pm on May 19, 2020: memberExample backtrace https://cirrus-ci.com/task/6005716207009792?command=functional_test#L817
-
test: Fix intermittent failure in feature_dbcrash fa2ca0cbdd
-
MarcoFalke commented at 10:59 pm on May 19, 2020: memberThis was presumably introduced in aab81720de237b258ed4e15f1b1831c11abf74f0
-
fanquake added the label Tests on May 19, 2020
-
in test/functional/feature_dbcrash.py:259 in fa2ca0cbdd
255@@ -256,7 +256,11 @@ def run_test(self): 256 self.log.debug("Mining longer tip") 257 block_hashes = [] 258 while current_height + 1 > self.nodes[3].getblockcount(): 259- block_hashes.extend(self.nodes[3].generate(min(10, current_height + 1 - self.nodes[3].getblockcount()))) 260+ block_hashes.extend(self.nodes[3].generatetoaddress(
jnewbery commented at 2:51 pm on May 29, 2020:How about doing the generate call on its own line? newlines are free!
MarcoFalke commented at 3:01 pm on May 29, 2020:That would require poisoning the namespace with a symbol name. Named arguments and functional programming ftw. Bracket are free! :rocket:
jnewbery commented at 3:08 pm on May 29, 2020:Really? You think 4-nested brackets are easier to read or are you just trolling?
MarcoFalke commented at 3:15 pm on May 29, 2020:Not trolling. 4 brackets is how I write code. :grimacing:
glowang commented at 0:57 am on June 1, 2020:@MarcoFalke lol I see a couple of places with style like this, no wonder 😆
MarcoFalke commented at 12:17 pm on June 1, 2020:Huh, is this generally considered to be less readable? If yes, I might change my style, but comparing the two:
0in_1 = 'foo' 1in_2 = 'bar' 2res = run(in_1, in_2)
vs
0res = run( 1 in_1='foo', 2 in_2='bar', 3)
The latter one has a lower mental load because there is two less symbols in the surrounding scope. I can’t find references, but I am pretty sure we had a bug in the tests because symbols were leaked from one test to the next test by accident.
Also, inlining promotes use of named arguments, which obviously promotes code readability. I am sure we had at least one bug due to not using named arguments.
jnewbery commented at 2:09 pm on June 1, 2020:This isn’t an honest comparison. Nesting functions 4 deep inside is very different from the example you’ve given. It’s also completely unrelated to whether functions are called with named or positional arguments.
glowang commented at 3:12 pm on June 1, 2020:The indentation @MarcoFalke uses does help a lot, and I can see your point on reducing mental load now.
amitiuttarwar commented at 3:41 am on June 4, 2020:I agree in the
in_1
,in_2
example the second looks more clear, but I think thats mainly because we’re using mocked out variable names.if it was something like
0blockhash = getbestblockhash() 1coinbase_txid = getblock(blockhash)['tx'][0] 2res = calculate_fee_amount(coinbase_txid, blockhash)
that seems pretty clear to me. The variable naming is relevant for capturing intent. In this case you could just look at the
res=calculate..
line and get an idea of what’s happening (because at first pass, you can assume the variables are named correctly.)–
in regards to this code snippet, I’d find something like:
0num_blocks_to_create = min(10, current_height + 1 - self.nodes[3].getblockcount()) 1new_block_hashes = self.nodes[3].generatetoaddress( 2 nblocks = num_blocks_to_create, 3 # new address to avoid mining a block that has just been invalidated 4 address=self.nodes[3].getnewaddress(), 5) 6block_hashes.extend(new_block_hashes)
significantly more clear than the current;
0block_hashes.extend(self.nodes[3].generatetoaddress( 1 nblocks=min(10, current_height + 1 - self.nodes[3].getblockcount()), 2 # new address to avoid mining a block that has just been invalidated 3 address=self.nodes[3].getnewaddress(), 4))
if I were proposing the code, I’d think about the variable names a bit more… but the information they can provide about the content can be super useful, esp in untyped languages. I actually looked up the docs for
generatetoaddress
to clarify what was happening here to write up the alternative.Coming from rubyland, this is pretty common practice for making code easier to parse. A quick google of “self documenting code” landed me on this page, where # 1 and # 2 are about extracting logic to abstract it behind a variable name (either via function or expression).
–
Lol, lots of words. I want to share because when I come across the deeply nested functions in the functional tests, I find it difficult to parse and will often re-write them into more incremental chunks just to read what’s happening.
jnewbery commented at 2:57 pm on May 29, 2020: memberutACK fa2ca0cbdde5c6c5e407ec037e52e3f6315a0b37
Perhaps a longer-term solution would be to change the
generate()
method inTestNode
:- set an hd seed on the node instead of a private key in
import_deterministic_coinbase_privkeys()
- use
generatetodescriptor
to generate to a key in the hd tree - keep a static variable in
generate()
to ensure a new keypath is used for each call.
Could be a good first issue?
MarcoFalke commented at 3:03 pm on May 29, 2020: memberkeep a static variable in generate() to ensure a new keypath is used for each call.
You mean a member variable? (Different nodes shouldn’t influence each other)
jnewbery commented at 3:12 pm on May 29, 2020: memberYou mean a member variable? (Different nodes shouldn’t influence each other)
You’re right. Should be a member variable.
MarcoFalke merged this on May 29, 2020MarcoFalke closed this on May 29, 2020
MarcoFalke deleted the branch on May 29, 2020glowang commented at 3:37 pm on May 30, 2020: contributorutACK fa2ca0c
Perhaps a longer-term solution would be to change the
generate()
method inTestNode
:- set an hd seed on the node instead of a private key in
import_deterministic_coinbase_privkeys()
- use
generatetodescriptor
to generate to a key in the hd tree - keep a static variable in
generate()
to ensure a new keypath is used for each call.
Could be a good first issue?
I can work on this!
MarcoFalke commented at 3:48 pm on May 30, 2020: memberNice! I see that this method is lacking a comment to explain its motivation. The background the tests need deterministic keys is:
- When the wallet is not compiled into Bitcoin Core, the tests should still be able to generate blocks. Blocks need to be generated to an address (or descriptor), so we need some information to derive those
- When the wallet is compiled into Bitcoin Core (and mock-time is used), we want blocks to be generated deterministically in some tests. See for example
test/functional/data/rpc_getblockstats.json
- Finally, a minor side-effect, is that we don’t need to cache the wallets directory in the functional test cache.
Just make sure to preserve those use-cases.
Let me know if you need any other pointers.
sidhujag referenced this in commit 4534a9a222 on May 31, 2020glowang commented at 3:08 pm on June 1, 2020: contributorThank you so much @MarcoFalke and @jnewbery !
1.You mentioned that
Blocks need to be generated to an address (or descriptor), so we need some information to derive those
. I understand that you want to generate blocks to an address/descriptor, but what woud happen if the keys are not deterministic anyways? What benefits do we get from having information to derive them? Sorry if this is super obvious.2.I did a little research, but I’m not sure if I’m getting what you meant by
Do not cache the wallets directory in functional test cache
. Looks like this behavior has been enforced in test_framework.py,os.rmdir(cache_path('wallets'))
. Is this what you mean? Is this the exact behavior we want to maintain?3.Additionally, I would like to make sure I understand the context & goal of this task:
Background: feature_dbcrash test experienced intermittent failures due to the fact that
generate()
uses a deterministic address. Deterministic addresses don’t work when a block is invalidated uniformly randomly (but why does invalid block not work well with deterministic addresses? Why does using a new address solve this problem?). This has been worked around by calling thegetnewaddress()
function.Goal: we want to rewrite the generate() function to this logic:
- If the wallet is compiled && mocktime is used:
- Blocks should be generated deterministically, like rpc_getblockstats.json
- If the wallet is not compiled:
- Blocks are generated to an address or descriptor:
- Set an hd seed on the node instead of using
get_deterministic_priv_key().address
- Use generatetodesriptor to generate keys from the seed - Use a ember variable in generate() to ensure a new keypath is used
- Blocks are generated to an address or descriptor:
- Set an hd seed on the node instead of using
MarcoFalke commented at 3:26 pm on June 1, 2020: memberWhy does using a new address solve this problem?
If you call
- generate
- invalidateblock(best_block)
- generate
And all of this happens in the same second, then the two generated blocks will be completely identical in all fields of their struct. Thus, the second call to
generate
will fail because the generated block is already marked invalid.MarcoFalke commented at 4:11 pm on June 1, 2020: memberLooks like this behavior has been enforced in test_framework.py, os.rmdir(cache_path(‘wallets’)). Is this what you mean? Is this the exact behavior we want to maintain?
Yes, I’d say so. It allows to simply cache the blockchain once instead of MAX_NODES datadirs (including the wallet dirs)
And about your other questions: In general, the more deterministic a test is, the easier it is to reproduce failures.
glowang commented at 3:54 pm on June 5, 2020: contributorthank you for your answers!! I have some follow up questions 😂
In your answer, there seems to be an emphasis of using
mock time
when the wallet is enabled, but not so important if we are only generating to address…is there a reason for that? It seems that we setmocktime
when we expect there is some time consuming processes (such as tx rebroadcast) & that we usemocktime
to time travel.The feature_crash.py test is using
generatetoaddress
for mining after this pr, so right now it is independent of wallet. Does this mean that this test scenario does not require a wallet to be enabled? Is there any harm if I pass in the -disablewallet param?In the case that wallet is enabled & that we want to generate blocks deterministically, is it okay if we reuse the block data in rpc_getblockstats.json?
In the case that wallet is not enable, do you have any preference as to whether we should use
generatetodescriptor
orgeneratetoaddress
? @jnewbery asked specifically to usegeneratetodescriptor
in his previous comment, but @MarcoFalke seemed to suggest that either is okay?Thank you again 🙏
MarcoFalke commented at 4:32 pm on June 5, 2020: memberDon’t worry too much about mocktime. This is just another dimension to achieve more determinism. It should be orthogonal to your changes.
rpc_getblockstats.json
was just an example. It shouldn’t affect your changes.Sometimes mocktime is used to “speed up” tests by telling them how much time has passed. In other cases, it is used to deterministically generate blocks. (The time ends up in the block header, see https://btcinformation.org/en/developer-reference#block-headers)
Is there any harm if I pass in the -disablewallet param?
Currently the test does require a wallet. However, it might be possible to remove that requirement.
jasonbcox referenced this in commit 4abe3c4a7f on Sep 9, 2020PastaPastaPasta referenced this in commit 9d6a5a7d0d on Sep 17, 2021PastaPastaPasta referenced this in commit ff7c61d068 on Sep 19, 2021thelazier referenced this in commit 354637828e on Sep 25, 2021DrahtBot locked this on Feb 15, 2022
MarcoFalke jnewbery glowang amitiuttarwarLabels
Tests
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 21:12 UTC
More mirrored repositories can be found on mirror.b10c.me