Looks like this happens only for some workers/users on the same machine. For example, it happens for `ci_worker_1738693519_026122670`, but for none of the others. It also tends to persist for several hours: once it has happened at least once, it happens on almost all subsequent runs on that worker.
Example failures:
- https://cirrus-ci.com/task/6482591089426432?logs=ci#L603
- https://cirrus-ci.com/task/6102859474796544?logs=ci#L624
- https://cirrus-ci.com/task/6278234062454784?logs=ci#L614
However, the same worker sometimes still passes, which can also be seen in the worker summary: https://0xb10c.github.io/bitcoin-core-ci-stats/tags/workerci_worker_1738693519_026122670/
On the passing run, the command `podman container rm --force --all` prints:

```
[05:49:48.929] time="2025-02-17T05:49:48-05:00" level=warning msg="StopSignal SIGTERM failed to stop container ci_native_nowallet_libbitcoinkernel in 10 seconds, resorting to SIGKILL"
[05:49:50.607] time="2025-02-17T05:49:50-05:00" level=error msg="Unable to clean up network for container be7a1370684d21a332fca91b3323679159787b30b9aa5469b8960e4ec632404c: \"tearing down network namespace configuration for container be7a1370684d21a332fca91b3323679159787b30b9aa5469b8960e4ec632404c: netavark: IO error: aardvark pid not found\""
[05:49:52.396] be7a1370684d21a332fca91b3323679159787b30b9aa5469b8960e4ec632404c
```
Also, the last aborted run seems to have happened during `podman run`: https://cirrus-ci.com/task/6702139617050624?logs=ci#L268
Thus, I believe this may be an upstream race bug in podman.
It would be good to minimize, report, and fix the bug upstream.
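A rough sketch of what a minimization attempt could look like, assuming the race is between the network namespace teardown and the aardvark-dns process going away when many containers are removed at once; the image and container names below are placeholders and not taken from the actual CI config:

```bash
#!/usr/bin/env bash
# Hypothetical stress loop to try to trigger the netavark/aardvark teardown
# race ("aardvark pid not found"). Not the real CI script; image and
# container names are placeholders.
set -euo pipefail

for i in $(seq 1 50); do
  # Each container gets its own network namespace managed by netavark,
  # with aardvark-dns staying alive while containers are running.
  podman run -d --name "netavark_repro_${i}" \
    docker.io/library/alpine:latest sleep 300
done

# Tear everything down at once, mirroring the CI cleanup step that
# produced the warning shown above.
podman container rm --force --all
```

If this reproduces the error, the container count and timing could be reduced step by step until a minimal case remains, which would make an upstream report against podman/netavark much easier to act on.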