Fixes #20934 by using the “sync up” method described in #20538 (comment).
After improving robustness with this approach (commits 1-3), it turned out that there were still some fails, but those were unrelated to zmq: Out of 500 runs, 3 times sync_mempool()
or sync_blocks()
timed out, which can happen because the trickle relay time has no upper bound – hence in rare cases, it takes longer than 60s. This is fixed by enabling immediate tx relay on node1 (commit 4), which as a nice side-effect also gives us a rough 2x speedup for the test.
For further details, also see the explanations in the commit messages.
There is no guarantee that the test is still not flaky, but it would help if potential reviewers would run the following script locally and report how many runs failed (feel free to do less than 1000 runs, as this takes quite a long if ran with --valgrind
):
0#!/bin/sh
1OUTPUT_FILE=./zmq_results
2echo ===== repeated zmq test ===== > $OUTPUT_FILE
3
4for i in `seq 1000`; do
5 echo ------------------------
6 echo ----- test run $i -----
7 echo ------------------------
8 echo --- $i --- >> $OUTPUT_FILE
9 ./test/functional/interface_zmq.py --valgrind
10 if [ $? -ne 0 ]; then
11 echo "FAILED. /o\\" >> $OUTPUT_FILE
12 else
13 echo "PASSED. \\o/" >> $OUTPUT_FILE
14 fi
15done
16
17echo Failed test runs:
18grep FAILED $OUTPUT_FILE | wc -l