[test] Add aborttrescan tests

kallewoof commented at 8:13 am on April 18, 2017: member

This PR adds tests for the new abortrescan RPC command.

A new function ref_node was added to util.py, which works like start_node except it never actually launches the process. This is used to get two node objects in python which work separately from each other, in order to make two simultaneous requests (importprivkey and abortrescan).

in test/functional/test_runner.py:111 in e68c168eac outdated

107@@ -108,6 +108,7 @@
108     'rpcnamedargs.py',
109     'listsinceblock.py',
110     'p2p-leaktests.py',
111+    'import-abort-rescan.py',           # ~17s

jonasschnelli commented at 8:16 am on April 18, 2017:

I think we should remove the time hint, otherwise this will become “the standard” and, with that, the “keep-it-updated” problem will follow.

kallewoof commented at 8:20 am on April 18, 2017:

Hm, yeah, you’re right. Removing.

jonasschnelli commented at 8:17 am on April 18, 2017: contributor

Concept ACK. Nice way with the ref node & threading.

kallewoof force-pushed on Apr 18, 2017

fanquake added the label Tests on Apr 18, 2017

jnewbery commented at 4:43 pm on April 18, 2017: member

Thanks for opening this PR to cover abortrescan. Definitely worth doing.

I don’t like the change you’ve made to start_node(). Adding an optional parameter to a function called start_node() which causes the function to not start a node is really an abuse of the function.

I’d much prefer us to move toward having an encapsulated class for a test node, where we could have multiple rpc connections, including asynchronous rpc connections if required. I’ve been trying to push towards that model in #10082, but I haven’t had much luck in attracting reviewers for that PR or the linked PR. I think adding random functionality into util.py is unmaintainable and is moving in the wrong direction.

So concept ACK for covering this with a functional test, but please lets not add more to util.py.

kallewoof commented at 0:35 am on April 19, 2017: member

@jnewbery

I don’t like the change you’ve made to start_node(). Adding an optional parameter to a function called start_node() which causes the function to not start a node is really an abuse of the function.

Yeah, I was wondering if I should have reversed it (i.e. ref_node takes a launch bool and start_node just proxies).

As for encapsulate with multiple rpc connections, that sounds great but the functionality is, as you say, not in there. My change is minimal and works, so maybe it’s an okay for now until #10082 is ready? (I’ll gladly review btw. I just gotta wake up first..)

Edit: Alternatively I could base this PR on top of #10082. Never done that kind of thing before though.

kallewoof force-pushed on Apr 19, 2017

kallewoof commented at 2:20 am on April 19, 2017: member

#10082 seems like a big project (I’d love to help btw), so I am proposing a solution until that is resolved here. start_node now calls ref_node which now has an optional spawnproc flag. This means start_node always starts a node, and ref_node can do both, depending on the flag.

Unsquashed history: 1 → 2 → 3⊱1 → 4⊱2

kallewoof force-pushed on Apr 19, 2017

jnewbery commented at 9:05 pm on April 19, 2017: member

@kallewoof thanks for being so accepting of my feedback! I don’t like NACKing PRs, but I really want to try to not put any additional complexity in util.py if we can help it.

Have you had a look at getblocktemplate_longpoll.py ? That’s doing something similar to this test where an additional asynchronous RPC thread is required. I’ve tried to rewrite your testcase in the same style here: https://github.com/jnewbery/bitcoin/tree/pr10225. Can you take a look and tell me what you think? I’d prefer to do it this way than adding more complexity to the mainline start_node() function. If we can get #10082 merged, then we can then look at adding an asynchronous RPC thread to the TestNode class so it’s available more generally.

EDIT: I’ve found that this test fails intermittently when I run it locally. I think perhaps the importprivkey() is completing very quickly so it’s a race condition and the abortrescan() call needs to arrive at the right moment.

kallewoof commented at 1:44 am on April 20, 2017: member

@jnewbery not a worry at all – you don’t have to hold back the punches with me. I am learning a lot from the feedback I get from you guys. :)

Gotcha on the no-touching-util.py. I will look at getblocktemplate_longpoll and see if I can adapt. Worst case I have two options: I can drop this PR until #10082 or I can put the ref_node code into the test itself with a # TODO and we simply rip it out when it’s time to replace.

As for the intermittent failures, yes, I am trying to keep the time to run as low as possible and I may have put it a bit too low (upping the range in one or both of the top for loops should stabilize it, I think).

kallewoof force-pushed on Apr 20, 2017

kallewoof commented at 2:16 am on April 20, 2017: member

@jnewbery Wow, the solution in getblocktemplate_longpoll.py was so much cleaner. I switched to that and dropped some commits. I also upped the range to hopefully address the intermittent fails you experienced. Edit: beginning to suspect problem is in fact in the abortres aborting too early. Added small sleep (7’’⊱2).

Unsquashed history: 1 → 2 → ~~3⊱1~~ → 4⊱2 → 5⊱2 → 6⊱2 → 7’’⊱2

kallewoof force-pushed on Apr 20, 2017

jnewbery commented at 1:52 pm on April 20, 2017: member

Looks better, but the test is still failing more often than not for me. I ran the test 20 times (with 4 tests running in parallel):

 0TEST                                 | STATUS    | DURATION
 1
 2import-abort-rescan.py --portseed=1  | ✓ Passed  | 11 s
 3import-abort-rescan.py --portseed=10 | ✖ Failed  | 11 s
 4import-abort-rescan.py --portseed=11 | ✖ Failed  | 11 s
 5import-abort-rescan.py --portseed=12 | ✖ Failed  | 11 s
 6import-abort-rescan.py --portseed=13 | ✖ Failed  | 11 s
 7import-abort-rescan.py --portseed=14 | ✖ Failed  | 11 s
 8import-abort-rescan.py --portseed=15 | ✖ Failed  | 11 s
 9import-abort-rescan.py --portseed=16 | ✖ Failed  | 11 s
10import-abort-rescan.py --portseed=17 | ✖ Failed  | 11 s
11import-abort-rescan.py --portseed=18 | ✓ Passed  | 11 s
12import-abort-rescan.py --portseed=19 | ✖ Failed  | 11 s
13import-abort-rescan.py --portseed=2  | ✖ Failed  | 12 s
14import-abort-rescan.py --portseed=20 | ✖ Failed  | 11 s
15import-abort-rescan.py --portseed=3  | ✖ Failed  | 12 s
16import-abort-rescan.py --portseed=4  | ✖ Failed  | 13 s
17import-abort-rescan.py --portseed=5  | ✖ Failed  | 11 s
18import-abort-rescan.py --portseed=6  | ✖ Failed  | 11 s
19import-abort-rescan.py --portseed=7  | ✖ Failed  | 11 s
20import-abort-rescan.py --portseed=8  | ✖ Failed  | 11 s
21import-abort-rescan.py --portseed=9  | ✖ Failed  | 11 s
22
23ALL                                  | ✖ Failed  | 224 s (accumulated)

(ignore the portseed argument - that’s just a hack to allow instances of the same test to be run by the test_runner. The value of portseed is ignored).

I’m not why this fails so much for me, but seems to pass for you and Travis.

kallewoof commented at 0:30 am on April 21, 2017: member

@jnewbery Is there an easy way to run the test like that? E.g. 20 times 4 in parallel?

Edit: tests keep succeeding for me on a MacBook Pro. Running them on a linux machine (lubuntu) resulted in sporadic failures. Looking into it now.

Edit: there are two cases where the test will fail; one is when the abortres loop sleeps right over the importprivkey time-to-finish (for 0.01), and one, I realized, is when the importprivkey doesn’t actually start up before the abortres loop ends. I increased the range from 200 to 2000, with sleep kept at 0.001, which means a total approximate time of 2 seconds in the loop. This almost fixed it on my end but test started failing on the next assertion (aborted, so should not have balance). I bumped block chain size and that seems to have done it. This all feels very flakey though. :(

[…]: → 8⊱2 → 9⊱2

kallewoof force-pushed on Apr 21, 2017

[test] Test abortrescan command. ed60970c83

kallewoof force-pushed on Apr 21, 2017

kallewoof renamed this:
~~[test] Add aborttrescan tests~~
[WIP] [test] Add aborttrescan tests
on Apr 23, 2017

kallewoof commented at 11:31 pm on April 23, 2017: member

~~Still seeing intermittent failures. Marking this WIP until this is fully resolved.~~

Edit: Actually, the failures I were seeing were related to a debug line that triggered #10265 so I am removing the WIP part.

kallewoof renamed this:
~~[WIP] [test] Add aborttrescan tests~~
[test] Add aborttrescan tests
on Apr 24, 2017

laanwj merged this on Apr 25, 2017

laanwj closed this on Apr 25, 2017

laanwj referenced this in commit e0a7e1994e on Apr 25, 2017

kallewoof deleted the branch on Apr 25, 2017

jnewbery commented at 2:55 pm on April 25, 2017: member

Is there an easy way to run the test like that? E.g. 20 times 4 in parallel?

You can do this by changing the BASE_SCRIPTS list in test_runner.py to be the same test multiple times. test_runner will automatically remove duplicates, but if you add --portseed=x to the test name, then it will run them as separate tests. The dummy portseed parameter is overridden by an actual portseed further down in test_runner.

This all feels very flakey though. :(

Indeed! Is it true to say there’s a tradeoff between making the test case last longer and making it more robust? If that’s the case I think you should err towards making it run longer and perhaps add it as an extended script rather than a base script.

jnewbery commented at 7:10 pm on May 2, 2017: member

This test is still failing intermittently for me in two different ways:

02017-05-02 18:51:29.922000 TestFramework (ERROR): Assertion failed
1Traceback (most recent call last):
2  File "/home/ubuntu/bitcoin/test/functional/test_framework/test_framework.py", line 146, in main
3    self.run_test()
4  File "./import-abort-rescan.py", line 50, in run_test
5    assert abortres # if false, we failed to abort
6AssertionError

and:

0Traceback (most recent call last):
1  File "/home/ubuntu/bitcoin/test/functional/test_framework/test_framework.py", line 146, in main
2    self.run_test()
3  File "./import-abort-rescan.py", line 58, in run_test
4    assert_equal(self.nodes[1].getbalance(), 0.0)
5  File "/home/ubuntu/bitcoin/test/functional/test_framework/util.py", line 408, in assert_equal
6    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
7AssertionError: not(0.12300000 == 0.0)

This is also causing our Jenkins build machine to fail occasionally (2 times out of 30 builds) @laanwj - was this merged accidentally? There haven’t been any reviews/ACKs. Can we revert it and get it reviewed before remerging?

laanwj commented at 1:51 pm on May 3, 2017: member

@laanwj - was this merged accidentally?

I think so, sorry @kallewoof, needs a new PR now.

jnewbery commented at 1:53 pm on May 3, 2017: member

@kallewoof - this has been backed out by #10327 . Please open a new PR so this test can be reviewed before being merged back in. A few suggestions for making this less flakey:

increase the number of generated blocks by at least an order of magnitude. Run the test many times on a fast machine, with bitcoind’s datadir in /dev/shm. I think this passes on Travis because bitcoind runs slowly so you have a larger window for the abortrescan RPC to hit.
check the debug logs to see how long the rescan actually takes so we’re not guessing on what a safe window is.
I think this test should be in the extended_test list rather than the base_tests list.
alternatively, tack this test onto a longer test script which already has a long chain, for example pruning.py. Seems a bit untidy, but should guarantee that the rescan takes a long time.

kallewoof commented at 12:13 pm on May 4, 2017: member

No problem - I will do proper testing and make a new PR once done. Sorry for the trouble!

PastaPastaPasta referenced this in commit a307c32504 on Jun 10, 2019

PastaPastaPasta referenced this in commit 55c011c55a on Jun 10, 2019

PastaPastaPasta referenced this in commit 5608a08249 on Jun 10, 2019

PastaPastaPasta referenced this in commit 3f0dced66d on Jun 11, 2019

PastaPastaPasta referenced this in commit 2f6c6d5ee2 on Jun 11, 2019

PastaPastaPasta referenced this in commit fc987f2561 on Jun 12, 2019

PastaPastaPasta referenced this in commit 869aafdb92 on Jun 14, 2019

PastaPastaPasta referenced this in commit db2f8cb131 on Jun 14, 2019

PastaPastaPasta referenced this in commit 06777ff992 on Jun 15, 2019

PastaPastaPasta referenced this in commit 806b7599cc on Jun 19, 2019

barrystyle referenced this in commit 62adfac188 on Jan 22, 2020

MarcoFalke locked this on Sep 8, 2021

[test] Add aborttrescan tests #10225