bench: add support for custom data directory #31000

pull furszy wants to merge 2 commits into bitcoin:master from furszy:2024_bench_custom_datadir changing 3 files +43 −2
  1. furszy commented at 9:01 pm on September 29, 2024: member

    Expands the benchmark framework with the existing -testdatadir arg, enabling the ability to change the benchmark data directory.

    This is useful for running benchmarks on different storage devices, and not just under the OS /tmp/ directory.

    A good use case is #28574, where we are benchmarking the wallet migration process on an HDD.

  2. DrahtBot commented at 9:01 pm on September 29, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK hodlinator

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Sep 29, 2024
  4. DrahtBot added the label CI failed on Sep 29, 2024
  5. DrahtBot commented at 10:24 pm on September 29, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/30825058976

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  6. bench: specialize working directory name
    Since G_TEST_GET_FULL_NAME is not initialized in the benchmark framework,
    benchmarks using the unit test setup run in the same directory without
    any clear distinction between them.
    This poses an extra complication for locating any specific benchmark
    directory during a failure.
    
    In master, all benchmarks run in the following path:
    /<OS_tmp_dir>/test_common bitcoin/<random_uint256>/
    
    After this commit, benchmarks are contained within its own directory:
    /<OS_tmp_dir>/test_common bitcoin/<benchmark_name>/<random_uint256>/
    
    This makes it easier to find any benchmark run when a failure occurs.
    221410bc14
  7. furszy force-pushed on Sep 30, 2024
  8. furszy force-pushed on Sep 30, 2024
  9. DrahtBot removed the label CI failed on Sep 30, 2024
  10. maflcko commented at 5:43 am on September 30, 2024: member
    Just as a note, it should already be possible to pick the device via an env var, see https://en.cppreference.com/w/cpp/filesystem/temp_directory_path#Notes.
  11. EricMujjona approved
  12. EricMujjona commented at 8:57 am on September 30, 2024: none
    Efficient code
  13. furszy commented at 8:42 pm on September 30, 2024: member

    Just as a note, it should already be possible to pick the device via an env var, see https://en.cppreference.com/w/cpp/filesystem/temp_directory_path#Notes.

    Hmm, thats a good observation. Even when that may be possible now, I’m not sure we should start recommending the mixture of env vars and binary args at the user interface level. It seems more natural to plug the already existing -testdatadir arg to the benchmark framework. Still, if we move towards the env var direction, the next step should be revert #26564?.

  14. maflcko commented at 5:35 am on October 1, 2024: member

    the next step should be revert #26564?.

    IIUC the motivation for 26564 was to have a fully static and fixed path for the unit test datadir. My understanding is that changing the temp storage device was just a side-effect and not the primary motivation. (My comment was just a note to say that your goal is already achievable today)

    It seems more natural to plug the already existing -testdatadir arg to the benchmark framework.

    I agree. I’ve also suggested this in #30737 (review)

  15. furszy commented at 6:01 pm on October 1, 2024: member

    the next step should be revert #26564?.

    IIUC the motivation for 26564 was to have a fully static and fixed path for the unit test datadir. My understanding is that changing the temp storage device was just a side-effect and not the primary motivation. (My comment was just a note to say that your goal is already achievable today)

    Practically speaking, we’re ultimately talking about pretty much the same feature. It’s about customizing the test root directory path so it can be inspected or used in another context. The thing is, 26564 introduced several different features in the same commit: the test no cleanup, the directory name specialization using the test name, and the -testdatadir arg. And what I meant is that this last point can be achieved by setting the environment variable you mentioned. Yet, it seems we agree on not doing it, but it’s curious that this can still be used nowadays.

  16. in src/bench/bench_bitcoin.cpp:65 in 1ba225c6ce outdated
    60@@ -60,6 +61,17 @@ static uint8_t parsePriorityLevel(const std::string& str) {
    61     return levels;
    62 }
    63 
    64+// Parses test setup related arguments
    65+static std::vector<std::string> parseTestSetupArgs(const ArgsManager& argsman) {
    


    hodlinator commented at 7:44 am on October 3, 2024:

    Braces on new lines for classes, functions, methods.

    Can see you are following style of function above, but function below follows the dev-notes.

    Don’t feel as strongly about capitalization.

    0static std::vector<std::string> ParseTestSetupArgs(const ArgsManager& argsman)
    1{
    

    If you want to make parsePriorityLevel conform to the dev-notes as well, I support it.


    furszy commented at 6:43 pm on October 3, 2024:
    Done. Added the jump line. I don’t feel strong about the capitalization neither but it looks incorrect to introduce it now when all other functions aren’t using it.

    hodlinator commented at 10:22 pm on October 5, 2024:

    I don’t feel strong about the capitalization neither but it looks incorrect to introduce it now when all other functions aren’t using it.

    (SetupBenchArgs is capitalized, main must be non-capitalized, so that only leaves parseAsymptote to change beyond the function under discussion).

  17. in src/bench/bench_bitcoin.cpp:68 in 1ba225c6ce outdated
    60@@ -60,6 +61,17 @@ static uint8_t parsePriorityLevel(const std::string& str) {
    61     return levels;
    62 }
    63 
    64+// Parses test setup related arguments
    65+static std::vector<std::string> parseTestSetupArgs(const ArgsManager& argsman) {
    66+    std::vector<std::string> args;
    67+
    68+    // Test data directory
    


    hodlinator commented at 8:07 am on October 3, 2024:
    nit: Redundant comment.

    furszy commented at 6:45 pm on October 3, 2024:
    Have reworked this function in a more general manner just so we can add new args in the future with just one line of parsing code.

    hodlinator commented at 10:27 pm on October 5, 2024:
    A bit heavy for one arg as it is right now, but I like that it avoids repeating the name. :+1:
  18. in src/bench/bench_bitcoin.cpp:40 in 1ba225c6ce outdated
    36@@ -37,6 +37,7 @@ static void SetupBenchArgs(ArgsManager& argsman)
    37     argsman.AddArg("-sanity-check", "Run benchmarks for only one iteration with no output", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    38     argsman.AddArg("-priority-level=<l1,l2,l3>", strprintf("Run benchmarks of one or multiple priority level(s) (%s), default: '%s'",
    39                                                            benchmark::ListPriorities(), DEFAULT_PRIORITY), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    40+    argsman.AddArg("-testdatadir", "Run benchmarks on a custom directory (default: </tmp/>)", ArgsManager::ALLOW_ANY, OptionsCategory::DEBUG_TEST);
    


    hodlinator commented at 8:19 am on October 3, 2024:

    Why not make SetupUnitTestArgs available and call it here instead? Delegates the responsibility of documenting the default value to that part of the code.

    (Could maybe rename SetupUnitTestArgs -> SetupTestArgs or static BasicTestingSetup::RegisterArgs to make it less unit test-specific and communicate how broadly it is used).


    furszy commented at 7:15 pm on October 3, 2024:

    Why not make SetupUnitTestArgs available and call it here instead? Delegates the responsibility of documenting the default value to that part of the code.

    (Could maybe rename SetupUnitTestArgs -> SetupTestArgs or static BasicTestingSetup::RegisterArgs to make it less unit test-specific and communicate how broadly it is used).

    I think we should be careful and explicit about the features we add to other binaries. One of them could end up clashing with another functionality or be removed from a binary that was using it without us noticing it.


    hodlinator commented at 10:16 pm on October 5, 2024:
    Agree a SetupTestArgs may not communicate clearly enough that the function is used in a broader context. How about calling it SetupCommonTestArgs and adding a comment about it being used in benchmarks as well? That, together with no longer marking the function static (private to the compilation unit), might mitigate your concerns?

    furszy commented at 11:55 pm on October 5, 2024:

    Agree a SetupTestArgs may not communicate clearly enough that the function is used in a broader context. How about calling it SetupCommonTestArgs and adding a comment about it being used in benchmarks as well? That, together with no longer marking the function static (private to the compilation unit), might mitigate your concerns?

    I’m not so in favor of mentioning upper level binaries or frameworks in lower level files but no problem in applying the suggestion.


    furszy commented at 0:01 am on October 6, 2024:
    Hmm well, I’m still a bit reluctant to it. It would add test/util/setup_common.h dependency to the benchmark binary entry point just for a single line of code.
  19. in src/bench/bench.cpp:36 in 1ba225c6ce outdated
    32+ * Retrieves the available test setup command line arguments that may be used
    33+ * in the benchmark. They will be used only if the benchmark utilizes a
    34+ * 'BasicTestingSetup' or any child of it.
    35+ */
    36+static std::function<std::vector<const char*>()> g_bench_command_line_args{};
    37+
    


    hodlinator commented at 8:23 am on October 3, 2024:
    (Please either remove empty line after local global here or add an empty line after static std::string g_running_benchmark_name).

    furszy commented at 7:19 pm on October 3, 2024:
    done.
  20. hodlinator commented at 9:07 am on October 3, 2024: contributor

    Concept ACK 1ba225c6ce689defb7063dd68f62e1e90fcdda5e

    Good to see these globals implemented for bench. As already mentioned, I’ve been working on a partially overlapping change in #30737.

  21. bench: add support for custom data directory
    Expands the benchmark framework with the existing '-testdatadir' arg,
    enabling the ability to change the benchmark data directory.
    
    This is useful for running benchmarks on different storage devices, and
    not just under the OS /tmp/ directory.
    ded1a6cc49
  22. furszy force-pushed on Oct 3, 2024
  23. furszy commented at 7:21 pm on October 3, 2024: member

    Thanks for the review @hodlinator. Updated per feedback.

    Good to see these globals implemented for bench. As already mentioned, I’ve been working on a partially overlapping change in #30737.

    Cool. It wasn’t on my radar. Adding it now.

  24. hodlinator commented at 10:32 pm on October 5, 2024: contributor

    git range-diff master 1ba225c ded1a6c

    Still feel like it would be good to call a common function adding -testdatadir to an ArgsManager instead of duplicating the functionality between test/bench (and having a different description-string).


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-08 16:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me