test: Run framework unit tests in parallel #29771

pull tdb3 wants to merge 1 commit into bitcoin:master from tdb3:parallel_unit_tests changing 2 files +53 −30
  1. tdb3 commented at 3:42 AM on March 31, 2024: contributor

    Functional test framework unit tests are currently run prior to all other functional tests.

    This PR enables execution of the test framework unit tests in parallel with the functional tests, rather than before the functional tests, saving runtime and more efficiently using available cores.

    This is a follow up to #29470 (comment)

    New behavior:

    1. When running all tests, the framework unit tests are run in parallel with the other tests (unless explicitly skipped with --exclude). This parallelization introduces marginal time savings when running all tests, depending on the machine used. As an example, a 2-3% time savings (9 seconds) was observed on a machine using --jobs=18 (with 18 available cores).
    2. When running specific functional tests, framework unit tests are now skipped by default. Framework unit tests can be added by including feature_framework_unit_tests.py in the list of specific tests being executed. The rationale for skipping by default is that if the tester is running specific functional tests, there is a conscious decision to focus testing, and choosing to run all tests (where unit tests are run by default) would be a next step.
    3. The --skipunit option is now removed, since the unit tests are parallelized and no longer delay other tests. Unit tests are now treated the same as functional tests.

    Implementation notes:

    Since TextTestRunner can be noisy even with verbosity=0 (and any non-failure stderr output would trigger job failure in the test runner), the approach taken was to send all output to stdout and to determine pass/fail from the returned TestResult. This aligns with the previous check for unit test failure (if not result.wasSuccessful():).
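    The idea can be sketched with plain unittest. This is an illustrative stand-in, not the PR's actual feature_framework_unit_tests.py; the test case and function names here are hypothetical:

```python
import sys
import unittest

class ExampleTest(unittest.TestCase):
    """Stand-in for the framework's real test cases."""
    def test_arithmetic(self):
        self.assertEqual(1 + 1, 2)

def run_suite(test_cases):
    """Run TestCase classes with all runner output directed to stdout."""
    suite = unittest.TestSuite()
    for case in test_cases:
        suite.addTests(unittest.defaultTestLoader.loadTestsFromTestCase(case))
    # TextTestRunner writes to stderr by default; redirect to stdout so a
    # harness that treats any stderr output as a failure stays quiet on pass.
    result = unittest.TextTestRunner(stream=sys.stdout, verbosity=1).run(suite)
    # Pass/fail is taken from the returned TestResult, not from the stream.
    return result.wasSuccessful()
```

    On failure, the runner's report (dots, tracebacks, summary) is still available on stdout for the test runner to display, matching the failure output shown above.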

    This approach was tested by inserting self.assertEqual(True, False) into test_framework/address.py and verifying that the specifics of the failure were reported.

    135/302 - feature_framework_unit_tests.py failed, Duration: 0 s
    
    stdout:
    .F
    ======================================================================
    FAIL: test_bech32_decode (test_framework.address.TestFrameworkScript.test_bech32_decode)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/dev/myrepos/bitcoin/test/functional/test_framework/address.py", line 228, in test_bech32_decode
        self.assertEqual(True, False)
    AssertionError: True != False
    
    ----------------------------------------------------------------------
    Ran 2 tests in 0.003s
    
    FAILED (failures=1)
    
    
    stderr:
    

    There was an initial thought to parallelize execution of the unit tests themselves (i.e. run the 12 unit test files in parallel); however, this is not expected to meaningfully reduce runtime further and would add unnecessary complexity.

  2. DrahtBot commented at 3:42 AM on March 31, 2024: contributor


    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.


    Code Coverage

    For detailed information about the code coverage, see the test coverage report.


    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, achow101
    Approach ACK kevkevinpal

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.


    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29387 (test: fix RPC coverage check by BrandonOdiwuor)
    • #29335 (test: Handle functional test disk-full error by BrandonOdiwuor)
    • #28312 (test: fix keys_to_multisig_script (P2MS) helper for n/k > 16 by theStack)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on Mar 31, 2024
  4. tdb3 force-pushed on Mar 31, 2024
  5. DrahtBot added the label CI failed on Mar 31, 2024
  6. DrahtBot commented at 4:11 AM on March 31, 2024: contributor


    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    <sub>Debug: https://github.com/bitcoin/bitcoin/runs/23271888984</sub>

  7. tdb3 force-pushed on Mar 31, 2024
  8. DrahtBot removed the label CI failed on Mar 31, 2024
  9. Sjors commented at 2:25 PM on April 2, 2024: member

    I suppose the original idea was to run these first as a sanity check. To somewhat retain that property, perhaps the entire test suite can be aborted if this fails.

    If you run these in parallel, I think you might as well drop the --skipunit since it's faster than many of the other tests.

    When running specific functional tests, framework unit tests are now skipped by default

    This is useful, but wasn't that already the case? Or do you mean when using the --filter argument, rather than executing the individual test directly?

    Tested the average of three runs for time test/functional/test_runner.py --jobs=100 on a machine with 32 threads and a RAM disk for --cachedir and --tmpdir

    Before: 1 minute 12 seconds

    After: 1 minute 13 seconds

    That last number is dominated by a single run that took unusually long. Typically it's 4 seconds faster.

  10. maflcko commented at 2:31 PM on April 2, 2024: member

    I suppose the original idea was to run these first as a sanity check. To somewhat retain that property, perhaps the entire test suite can be aborted if this fails.

    Not sure. For example, what point is it to abort all tests when the muhash unit test fails, when most tests are not using muhash? If someone wanted to abort early, there is the failfast option.

    If you run these in parallel, I think you might as well drop the --skipunit since it's faster than many of the other tests.

    Agree.

  11. Sjors commented at 2:51 PM on April 2, 2024: member

    I was assuming that any failure of the framework tests means your system is broken, but maybe that's too strong an assumption.

  12. tdb3 commented at 10:03 PM on April 2, 2024: contributor

    Thank you for taking a look @Sjors and @maflcko!

    If you run these in parallel, I think you might as well drop the --skipunit since it's faster than many of the other tests.

    To ensure I understand, is the desire here to have the unit tests run by default when running individual tests (and also by default when running all tests)?

    If so, that seems reasonable to me. As you mentioned, the unit tests are decently fast, and the default job value is 4 anyway (so there is now some inherent parallelization, rather than the previous behavior of serializing unit test execution). For example, on my system, running feature_signet.py with or without feature_framework_unit_tests.py (default jobs=4) now has the same runtime. The tests that would be impacted are relatively short ones anyway, and the user still has the option to specify --skipunit to save a few seconds (the difference between the short test execution and unit test runtime).

    When running specific functional tests, framework unit tests are now skipped by default

    This is useful, but wasn't that already the case? Or do you mean when using the --filter argument, rather than executing the individual test directly?

    The previous behavior was to run the unit tests first, blocking all other tests until the unit tests passed. If the unit tests failed, execution would stop before running any other tests (i.e. the default behavior for unit tests was to fail fast).

    The new behavior is for unit tests to be run in parallel, not in a fail fast manner. The user can specify --failfast to end testing early if any test fails (unit tests are no exception).
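    As an illustration of those semantics only (this is not the actual test_runner.py implementation; names and structure here are hypothetical), a parallel scheduler with an optional fail-fast switch might look like:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_jobs(jobs, workers=4, failfast=False):
    """Run (name, callable) jobs in parallel; each callable returns True on pass.

    With failfast=True, jobs not yet started are cancelled after the first
    failure; jobs already running are allowed to finish.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fn): name for name, fn in jobs}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
            if failfast and not results[futures[fut]]:
                # Cancel whatever has not started yet, then stop collecting.
                for pending in futures:
                    pending.cancel()
                break
    return results
```

    Under this model the unit test job is just one more entry in the job list, which is why it no longer needs special fail-fast treatment.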

  13. Sjors commented at 7:39 AM on April 3, 2024: member

    is the desire here to have the unit tests run by default when running individual tests

    No

    (and also by default when running all tests)?

    Yes

    They are just another test now, so they don't need special treatment via --skipunit. Conversely, they don't need to be run more often than other tests.

    The user can specify --failfast to end testing early if any test fails (unit tests are no exception)

    Yes, that seems good enough.

  14. tdb3 commented at 9:10 AM on April 3, 2024: contributor

    is the desire here to have the unit tests run by default when running individual tests

    No

    (and also by default when running all tests)?

    Yes

    They are just another test now, so they don't need special treatment via --skipunit. Conversely, they don't need to be run more often than other tests.

    The user can specify --failfast to end testing early if any test fails (unit tests are no exception)

    Yes, that seems good enough.

    Ah, thank you for clarifying. @mzumsande, now that unit tests are being run in parallel, we could treat them like any other test (i.e. no special treatment with --skipunit, so this option could now be removed). Do you see this aligning with the original goals of skipunit or is there something else to consider? New behavior is described in the opening post, but briefly:

    • unit tests are run in parallel with other tests, so no longer delay other tests
    • unit tests are no longer run implicitly when running specific tests (so do not block those tests). Unit tests can be explicitly included in the list of tests.
  15. mzumsande commented at 9:24 AM on April 3, 2024: contributor

    @mzumsande, now that unit tests are being run in parallel, we could treat them like any other test (i.e. no special treatment with --skipunit, so this option could now be removed). Do you see this aligning with the original goals of skipunit or is there something else to consider?

    Yes, if they are treated like other tests, -skipunit is no longer necessary and should be removed. That's even better than the status quo because it removes the need to remember that -skipunit exists when running a subset of tests.

  16. tdb3 force-pushed on Apr 3, 2024
  17. tdb3 commented at 10:42 AM on April 3, 2024: contributor

    Thank you for the input all. skipunit now removed. Rebased and pushed.

  18. in test/functional/feature_framework_unit_tests.py:47 in 61c32b87bd outdated
      42 | +            test_framework_tests
      43 | +        )
      44 | +        if not result.wasSuccessful():
      45 | +            print(f"{buf.getvalue()}", file=sys.stderr)
      46 | +            sys.exit(TEST_EXIT_FAILED)
      47 | +        sys.exit(TEST_EXIT_PASSED)
    


    maflcko commented at 11:11 AM on April 3, 2024:

    Could also print the result on success? IIUC, stdout will be hidden by the test runner, and when running this test independently, having some feedback is better than no feedback?


    tdb3 commented at 1:31 AM on April 4, 2024:

    Thank you for reviewing. Would you be ok with a simplified approach of writing all unit test output to stdout, with this output being shown when unit tests fail? This seems a bit cleaner and more consistent than selectively placing all unit test output to stdout or stderr depending on test pass/fail.

    More info

    TextTestRunner lets us print unit test output to one stream. Previously (before this PR), TextTestRunner wrote all output to stderr regardless of pass/fail, but doing this now would cause a false failure (the test runner interprets any stderr output to mean the test failed).

    As was the case before this PR, unit test pass/fail would continue to be based on TestResult returned.

    As you mentioned, stdout is hidden by the test runner in non-failure situations, so writing to stdout on test pass doesn't do much currently (other than providing the output to test runner for future use). The user is provided with a reasonable test pass/fail status (matching that of the other tests).

    Example output

    Pre-PR (pass)

    $ test/functional/test_runner.py --jobs=18 --cachedir=/mnt/tmp/cache --tmpdir=/mnt/tmp test/functional/feature_signet.py
    Temporary test directory at /mnt/tmp/test_runner_₿_🏃_20240403_192403
    Running Unit Tests for Test Framework Modules
    .....................
    ----------------------------------------------------------------------
    Ran 21 tests in 8.262s
    
    OK
    Remaining jobs: [feature_signet.py]
    1/1 - feature_signet.py passed, Duration: 8 s
    
    TEST              | STATUS    | DURATION
    
    feature_signet.py | ✓ Passed  | 8 s
    
    ALL               | ✓ Passed  | 8 s (accumulated) 
    Runtime: 8 s
    

    Pre-PR (failure in unit test)

    $ test/functional/test_runner.py --jobs=18 --cachedir=/mnt/tmp/cache --tmpdir=/mnt/tmp test/functional/feature_signet.py
    Temporary test directory at /mnt/tmp/test_runner_₿_🏃_20240403_192509
    Running Unit Tests for Test Framework Modules
    .F
    ======================================================================
    FAIL: test_bech32_decode (test_framework.address.TestFrameworkScript.test_bech32_decode)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/dev/bitcoin/test/functional/test_framework/address.py", line 228, in test_bech32_decode
        self.assertEqual(False, True)
    AssertionError: False != True
    
    ----------------------------------------------------------------------
    Ran 2 tests in 0.003s
    
    FAILED (failures=1)
    Early exiting after failure in TestFramework unit tests
    

    PR (pass, running unit and functional tests):

    $ test/functional/test_runner.py --jobs=18 --cachedir=/mnt/tmp/cache --tmpdir=/mnt/tmp test/functional/feature_signet.py test/functional/feature_framework_unit_tests.py 
    Temporary test directory at /mnt/tmp/test_runner_₿_🏃_20240403_193148
    Remaining jobs: [feature_signet.py, feature_framework_unit_tests.py]
    1/2 - feature_signet.py passed, Duration: 8 s
    Remaining jobs: [feature_framework_unit_tests.py]
    2/2 - feature_framework_unit_tests.py passed, Duration: 9 s
    
    TEST                            | STATUS    | DURATION
    
    feature_framework_unit_tests.py | ✓ Passed  | 9 s
    feature_signet.py               | ✓ Passed  | 8 s
    
    ALL                             | ✓ Passed  | 17 s (accumulated) 
    Runtime: 9 s
    

    PR (pass, running only unit tests):

    $ test/functional/test_runner.py --jobs=18 --cachedir=/mnt/tmp/cache --tmpdir=/mnt/tmp test/functional/feature_framework_unit_tests.py
    Temporary test directory at /mnt/tmp/test_runner_₿_🏃_20240403_193240
    Remaining jobs: [feature_framework_unit_tests.py]
    1/1 - feature_framework_unit_tests.py passed, Duration: 9 s
    
    TEST                            | STATUS    | DURATION
    
    feature_framework_unit_tests.py | ✓ Passed  | 9 s
    
    ALL                             | ✓ Passed  | 9 s (accumulated) 
    Runtime: 9 s
    

    PR (failure in unit tests):

    $ test/functional/test_runner.py --jobs=18 --cachedir=/mnt/tmp/cache --tmpdir=/mnt/tmp test/functional/feature_framework_unit_tests.py
    Temporary test directory at /mnt/tmp/test_runner_₿_🏃_20240403_193325
    Remaining jobs: [feature_framework_unit_tests.py]
    1/1 - feature_framework_unit_tests.py failed, Duration: 0 s
    
    stdout:
    .F
    ======================================================================
    FAIL: test_bech32_decode (test_framework.address.TestFrameworkScript.test_bech32_decode)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/dev/myrepos/bitcoin/test/functional/test_framework/address.py", line 228, in test_bech32_decode
        self.assertEqual(False, True)
    AssertionError: False != True
    
    ----------------------------------------------------------------------
    Ran 2 tests in 0.003s
    
    FAILED (failures=1)
    
    stderr:
    
    
    TEST                            | STATUS    | DURATION
    
    feature_framework_unit_tests.py | ✖ Failed  | 0 s
    
    ALL                             | ✖ Failed  | 0 s (accumulated) 
    Runtime: 0 s
    

    maflcko commented at 6:16 AM on April 4, 2024:

    Would you be ok with a simplified approach of writing all unit test output to stdout, with this output being shown when unit tests fail?

    Sure. That'd be fine as well


    maflcko commented at 6:17 AM on April 4, 2024:

    I was thinking about calling the script (passing) directly:

    python3 ./test/functional/feature_framework_unit_tests.py 
    

    tdb3 commented at 12:24 PM on April 4, 2024:

    Ah, makes sense. Pushed an update to address this.
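    Supporting direct invocation like that typically just needs a standard __main__ entry point that prints the runner's report and maps the result to an exit status. A minimal sketch, with a hypothetical stand-in test case rather than the real framework modules:

```python
import sys
import unittest

class SmokeTest(unittest.TestCase):
    """Hypothetical stand-in for the framework's test cases."""
    def test_truth(self):
        self.assertTrue(True)

def main():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SmokeTest)
    # Report goes to stdout; the exit status carries pass/fail for callers,
    # whether that is the test runner or a user invoking the file directly.
    result = unittest.TextTestRunner(stream=sys.stdout, verbosity=1).run(suite)
    return 0 if result.wasSuccessful() else 1

if __name__ == "__main__":
    sys.exit(main())
```

    Run standalone, the script then gives the same dot-and-summary feedback a user would expect from unittest.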

  19. DrahtBot added the label CI failed on Apr 3, 2024
  20. tdb3 commented at 12:25 PM on April 4, 2024: contributor

    Pushed update to address comment and simplify feature_framework_unit_tests.py; providing output when running with or without test runner. Rebased as well.

  21. tdb3 force-pushed on Apr 4, 2024
  22. DrahtBot removed the label CI failed on Apr 4, 2024
  23. DrahtBot added the label Needs rebase on Apr 22, 2024
  24. tdb3 force-pushed on Apr 23, 2024
  25. tdb3 commented at 1:27 AM on April 23, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

    Rebased to address conflict

  26. DrahtBot removed the label Needs rebase on Apr 23, 2024
  27. maflcko commented at 7:00 AM on April 23, 2024: member

    Rebased to address conflict

    The rebase is wrong. Please pay attention to correctly addressing any merge conflicts as they arise. One way to avoid mistakes is to review your own code changes and to check rebases, as other reviewers will do later. For example, using the range-diff, as explained in the docs.

  28. tdb3 commented at 10:49 AM on April 23, 2024: contributor

    The rebase is wrong.

    Thank you. Looks like I had overlooked the addition of wallet_util (TEST_FRAMEWORK_MODULES being moved to feature_framework_unit_tests.py made it less obvious). Will rebase/push soon.

  29. tdb3 force-pushed on Apr 24, 2024
  30. test: Run framework unit tests in parallel
    Reorganize functional test framework unit tests to run in parallel
    with other functional tests.
    
    The option `skipunit` is removed, since unit tests no longer delay
    functional test execution.
    
    Unit tests are run by default when running all tests, and can be
    run explicitly with `feature_framework_unit_tests.py` when running
    a subset of tests.
    f19f0a2e5a
  31. tdb3 force-pushed on Apr 24, 2024
  32. DrahtBot added the label CI failed on Apr 24, 2024
  33. DrahtBot commented at 12:27 AM on April 24, 2024: contributor


    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    <sub>Debug: https://github.com/bitcoin/bitcoin/runs/24178995046</sub>

  34. tdb3 commented at 1:16 AM on April 24, 2024: contributor

    Rebased and pushed.

  35. DrahtBot removed the label CI failed on Apr 24, 2024
  36. maflcko commented at 8:43 AM on April 24, 2024: member

    ACK f19f0a2e5af6c2a64900f1f229e21b6f1668bd3d 🌽

    <details><summary>Show signature</summary>

    Signature:

    untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    trusted comment: ACK f19f0a2e5af6c2a64900f1f229e21b6f1668bd3d 🌽
    5iGa35C3WtPyuzsus+SXWTAwe3NCXsFoFkgT0RqTyiSknQ3FPoy0VkQEbNmm9S4yj1hIlYnvVtDo7Ny/YVS7BA==
    

    </details>

  37. in test/functional/test_runner.py:241 in f19f0a2e5a
     237 | @@ -255,6 +238,7 @@
     238 |      'wallet_keypool.py --descriptors',
     239 |      'wallet_descriptor.py --descriptors',
     240 |      'p2p_nobloomfilter_messages.py',
     241 | +    TEST_FRAMEWORK_UNIT_TESTS,
    


    kevkevinpal commented at 1:32 PM on April 24, 2024:

    Is there any reason we picked this location in BASE_SCRIPTS? I would think we would want to run TEST_FRAMEWORK_UNIT_TESTS earlier on, even if it runs in parallel,

    since if the unit tests fail other tests will probably fail as well.

    I think making it the first script in BASE_SCRIPTS would make sense, with an exception made with regard to run times, since other tests depend on this.


    tdb3 commented at 4:12 PM on April 24, 2024:

    Thank you for the review!

    Great question. The location chosen was within the # vv Tests less than 30s vv portion of BASE_SCRIPTS. I'm seeing run-times for feature_framework_unit_tests.py (on two machines I'm using) of 9s and 6s, and TEST_FRAMEWORK_UNIT_TESTS is around mid-way in the "30s" portion of the list, which seemed reasonable. https://github.com/bitcoin/bitcoin/blob/f19f0a2e5af6c2a64900f1f229e21b6f1668bd3d/test/functional/test_runner.py#L392-L393

    The existing comment to insert at random seems to provide general high-level guidance and doesn't dig into individual system performance (for example, 30s worth of work on an Ivy Bridge CPU and a Zen 4 CPU would be drastically different due to significant IPC differences). Maybe I'm overlooking other guidance that defines relative performance (e.g. 30s on a particular cloud service instance type, CPU type, etc.). There's probably an opportunity for rethinking and updating this guidance, but that would be adjacent to this PR rather than included in it.

  38. kevkevinpal commented at 1:34 PM on April 24, 2024: contributor

    Approach ACK f19f0a2e5af6c2a64900f1f229e21b6f1668bd3d

    I still need to run this on my local machine

  39. achow101 commented at 8:00 PM on April 26, 2024: member

    ACK f19f0a2e5af6c2a64900f1f229e21b6f1668bd3d

  40. DrahtBot requested review from kevkevinpal on Apr 26, 2024
  41. achow101 merged this on Apr 26, 2024
  42. achow101 closed this on Apr 26, 2024

  43. bitcoin locked this on Apr 26, 2025
