test: clean up logs when there aren't perf subprocesses #23928

pull brunoerg wants to merge 1 commits into bitcoin:master from brunoerg:2021-12-fix-perf-clean-up changing 2 files +4 −1
  1. brunoerg commented at 9:00 PM on December 31, 2021: contributor

    To check node performance during functional tests we can use perf. It generates profile data in the node's datadir, so after executing a test, node's datadir is not clean up. However, it only works in Linux, if you try to use perf in other systems for example, a warning message will be showed but the test will be executed normally. Besides that, the node's datadir won't be clean up because it understands you're using perf even if not working. This PR fixes it by checking if there are 'perf subprocesses' when deciding if the node's datadir should be clean up.

    Example (master branch/macOS 12.0.1):

    ➜  bitcoin-core-dev git:(master) ✗ ./test/functional/feature_anchors.py --perf
    2021-12-30T19:07:22.576000Z TestFramework (INFO): Initializing test directory /var/folders/7j/m0yjzmhj4ys9jgl353v2mqph0000gn/T/bitcoin_func_test_bkg26piv
    2021-12-30T19:07:22.621000Z TestFramework.node0 (WARNING): Can't profile with perf; only available on Linux platforms
    2021-12-30T19:07:23.169000Z TestFramework (INFO): When node starts, check if anchors.dat doesn't exist
    2021-12-30T19:07:23.169000Z TestFramework (INFO): Add 2 block-relay-only connections to node
    2021-12-30T19:07:23.446000Z TestFramework (INFO): Add 5 inbound connections to node
    2021-12-30T19:07:24.011000Z TestFramework (INFO): Check node connections
    2021-12-30T19:07:24.013000Z TestFramework (INFO): Stop node 0
    2021-12-30T19:07:24.231000Z TestFramework (INFO): Check the addresses in anchors.dat
    2021-12-30T19:07:24.232000Z TestFramework (INFO): Start node
    2021-12-30T19:07:24.236000Z TestFramework.node0 (WARNING): Can't profile with perf; only available on Linux platforms
    2021-12-30T19:07:24.758000Z TestFramework (INFO): When node starts, check if anchors.dat doesn't exist anymore
    2021-12-30T19:07:24.814000Z TestFramework (INFO): Stopping nodes
    2021-12-30T19:07:25.085000Z TestFramework (WARNING): Not cleaning up dir /var/folders/7j/m0yjzmhj4ys9jgl353v2mqph0000gn/T/bitcoin_func_test_bkg26piv due to perf data
    2021-12-30T19:07:25.085000Z TestFramework (INFO): Tests successful
    

    Example (this branch/macOS 12.0.1):

    ➜  bitcoin-core-dev git:(2021-12-fix-perf-clean-up) ✗ ./test/functional/feature_anchors.py --perf
    2021-12-30T19:08:25.664000Z TestFramework (INFO): Initializing test directory /var/folders/7j/m0yjzmhj4ys9jgl353v2mqph0000gn/T/bitcoin_func_test_xl2qtsbz
    2021-12-30T19:08:25.706000Z TestFramework.node0 (WARNING): Can't profile with perf; only available on Linux platforms
    2021-12-30T19:08:26.261000Z TestFramework (INFO): When node starts, check if anchors.dat doesn't exist
    2021-12-30T19:08:26.262000Z TestFramework (INFO): Add 2 block-relay-only connections to node
    2021-12-30T19:08:26.538000Z TestFramework (INFO): Add 5 inbound connections to node
    2021-12-30T19:08:27.110000Z TestFramework (INFO): Check node connections
    2021-12-30T19:08:27.112000Z TestFramework (INFO): Stop node 0
    2021-12-30T19:08:27.330000Z TestFramework (INFO): Check the addresses in anchors.dat
    2021-12-30T19:08:27.331000Z TestFramework (INFO): Start node
    2021-12-30T19:08:27.336000Z TestFramework.node0 (WARNING): Can't profile with perf; only available on Linux platforms
    2021-12-30T19:08:27.870000Z TestFramework (INFO): When node starts, check if anchors.dat doesn't exist anymore
    2021-12-30T19:08:27.927000Z TestFramework (INFO): Stopping nodes
    2021-12-30T19:08:28.195000Z TestFramework (INFO): Cleaning up /var/folders/7j/m0yjzmhj4ys9jgl353v2mqph0000gn/T/bitcoin_func_test_xl2qtsbz on exit
    2021-12-30T19:08:28.195000Z TestFramework (INFO): Tests successful
    
  2. DrahtBot added the label Tests on Dec 31, 2021
  3. in test/functional/test_framework/test_framework.py:315 in 8a133c4b16 outdated
     311 |          should_clean_up = (
     312 |              not self.options.nocleanup and
     313 |              not self.options.noshutdown and
     314 |              self.success != TestStatus.FAILED and
     315 | -            not self.options.perf
     316 | +            (not self.options.perf or not exists_perf_subprocess)
    


    theStack commented at 10:27 PM on January 1, 2022:

    nit: alternative version, which I find easier to reason (feel free to ignore):

                not (self.options.perf and exists_perf_subprocess)
    

    brunoerg commented at 1:48 PM on January 2, 2022:

    Great

  4. theStack approved
  5. theStack commented at 10:29 PM on January 1, 2022: contributor

    Tested ACK 8a133c4b168e079bdf0f95798ba01903e4ff1477

    Nice catch! I could reproduce the issue and verify that the PR fixes it on OpenBSD 7.0.

    master:

    $ ./test/functional/feature_utxo_set_hash.py --perf
    2022-01-01T22:23:48.453000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_p5bvc84c
    2022-01-01T22:23:48.481000Z TestFramework.node0 (WARNING): Can't profile with perf; only available on Linux platforms
    2022-01-01T22:23:49.027000Z TestFramework (INFO): Test MuHash implementation consistency
    2022-01-01T22:23:50.364000Z TestFramework (INFO): Test deterministic UTXO set hash results
    2022-01-01T22:23:50.456000Z TestFramework (INFO): Stopping nodes
    2022-01-01T22:23:50.575000Z TestFramework (WARNING): Not cleaning up dir /tmp/bitcoin_func_test_p5bvc84c due to perf data
    2022-01-01T22:23:50.576000Z TestFramework (INFO): Tests successful
    

    PR branch:

    $ ./test/functional/feature_utxo_set_hash.py --perf
    2022-01-01T22:24:33.087000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_w4r3xff3
    2022-01-01T22:24:33.136000Z TestFramework.node0 (WARNING): Can't profile with perf; only available on Linux platforms
    2022-01-01T22:24:33.947000Z TestFramework (INFO): Test MuHash implementation consistency
    2022-01-01T22:24:35.554000Z TestFramework (INFO): Test deterministic UTXO set hash results
    2022-01-01T22:24:35.646000Z TestFramework (INFO): Stopping nodes
    2022-01-01T22:24:35.846000Z TestFramework (INFO): Cleaning up /tmp/bitcoin_func_test_w4r3xff3 on exit
    2022-01-01T22:24:35.846000Z TestFramework (INFO): Tests successful
    

    If I read the code correctly, Linux users should also be able to reproduce the behaviour if they uninstall perf-tools: https://github.com/bitcoin/bitcoin/blob/a1e04e10799791a728cce4430808517598c8742c/test/functional/test_framework/test_node.py#L456-L458

  6. brunoerg force-pushed on Jan 2, 2022
  7. theStack approved
  8. theStack commented at 3:48 PM on January 2, 2022: contributor

    re-ACK af76ac1e1120f670d52ef3df05792ecad6f1aeaa

  9. shaavan approved
  10. shaavan commented at 11:10 AM on January 3, 2022: contributor

    ACK af76ac1e1120f670d52ef3df05792ecad6f1aeaa Tested successfully on Ubuntu 20.04

    It makes sense to check if the perf-tools were used correctly before deciding on cleaning up datadir. And I think the existence of perf-subprocess is a good indicator of whether perf-tools were used correctly or not.

    I was able to test this PR on Ubuntu 20.04, with perf-tools uninstalled, successfully. Adding the screenshot of relevant outputs to show the difference between master and PR branch results.

    Screenshot:

    Master PR
    Master PR
  11. maflcko changes_requested
  12. maflcko commented at 11:39 AM on January 3, 2022: member

    So if a test shuts down the nodes, it won't be possible to use perf?

  13. theStack commented at 1:55 PM on January 3, 2022: contributor

    So if a test shuts down the nodes, it won't be possible to use perf?

    True, didn't think about that scenario :/ so I guess it'd be better to set some flag like perf_available and check for that instead of the perf subprocesses?

  14. brunoerg commented at 2:19 PM on January 3, 2022: contributor

    So if a test shuts down the nodes, it won't be possible to use perf?

    True, nice thought. So, it would be better to create a flag perf_available as suggested by @theStack (default false) in TestNode, and set it true in _start_perf.

    subp = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    self.perf_subprocesses[profile_name] = subp
    self.perf_available = True
    

    and then, use this flag to verify, like:

    perf_available = any(bool(node.perf_available) for node in self.nodes)
    should_clean_up = (
         not self.options.nocleanup and
         not self.options.noshutdown and
         self.success != TestStatus.FAILED and
         not (self.options.perf and perf_available)
    )
    

    What do you think?

  15. shaavan commented at 2:32 PM on January 3, 2022: contributor

    @brunoerg

    I reckon this would be a better way to go about, as:

    • We would not be dependent on the existence of the node.perf_subprocess to check if perf-tools were used successfully or not.
    • Value of node.perf_available set to true would ensure with 100% confidence that perf-tools were used successfully for this node, and hence datadir should not be cleaned.
  16. brunoerg force-pushed on Jan 3, 2022
  17. brunoerg commented at 4:08 PM on January 3, 2022: contributor

    force-pushed to change the approach, now using a flag instead of checking the subprocess.

  18. brunoerg commented at 9:43 PM on January 4, 2022: contributor

    Is this approach ok for that situation? @MarcoFalke @theStack

  19. theStack approved
  20. theStack commented at 6:13 PM on January 24, 2022: contributor

    re-ACK c0f468690b7994a8604a3d8f7ec30ebf3105dfb8

    It would be more clean to store the flag only once rather than per-node (since the perf_available value will always be the same among nodes) though, but no hard opinion.

  21. maflcko commented at 3:07 PM on January 26, 2022: member

    Still not sure about this. What is the point of continuing the test when the user asked for perf, but it is not available? Shouldn't the test fail early with an error?

  22. brunoerg commented at 3:59 PM on January 27, 2022: contributor

    If more people agree on failing the test when perf is not available, I can change the approach. However, I believe the test could be executed normally since perf is just a tool to measure performance.

  23. fanquake commented at 10:51 AM on May 6, 2022: member

    @MarcoFalke any thoughts after the latest comment?

  24. maflcko commented at 11:15 AM on May 6, 2022: member

    No, I don't have new thoughts related to this pull request

  25. DrahtBot commented at 2:00 PM on September 23, 2022: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK kouloumos
    Stale ACK shaavan, theStack

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    No conflicts as of last run.

  26. achow101 commented at 6:53 PM on October 12, 2022: member

    Are you still working on this?

  27. brunoerg commented at 9:28 PM on October 12, 2022: contributor

    Are you still working on this?

    It seems ready but needs more conceptual review?

  28. in test/functional/test_framework/test_framework.py:310 in c0f468690b outdated
     306 | @@ -307,11 +307,12 @@ def shutdown(self):
     307 |                  node.cleanup_on_exit = False
     308 |              self.log.info("Note: bitcoinds were not stopped and may still be running")
     309 |  
     310 | +        perf_available = any(bool(node.perf_available) for node in self.nodes)
    


    kouloumos commented at 2:00 PM on January 30, 2023:

    nit: I think that this is more of a perf_active than a perf_available.

  29. in test/functional/test_framework/test_framework.py:315 in c0f468690b outdated
     311 |          should_clean_up = (
     312 |              not self.options.nocleanup and
     313 |              not self.options.noshutdown and
     314 |              self.success != TestStatus.FAILED and
     315 | -            not self.options.perf
     316 | +            not (self.options.perf and perf_available)
    


    kouloumos commented at 2:11 PM on January 30, 2023:

    When perf_available is true, self.options.perf is not always true.

    By checking only for perf_available (which is set only if perf is actually active) you are also fixing an issue where if profile_with_perf is used, the directory still gets cleaned up, and therefore the perf data are lost.

                not (perf_available)
    

    and also should check for perf_available instead of self.options.perf as part of the below conditional statement.


    brunoerg commented at 9:33 AM on April 26, 2023:

    Make sense, gonna address it!

  30. kouloumos commented at 2:24 PM on January 30, 2023: contributor

    Concept ACK

    This PR can also fix an issue regarding the usage of profile_with_perf, where if used, the directory still gets cleaned up, therefore the perf data are lost.

    It would be more clean to store the flag only once rather than per-node (since the perf_available value will always be the same among nodes) though, but no hard opinion.

    This is not true due to the existence of the profile_with_perf context manager that can be independently enabled for nodes.

  31. test: clean up logs when there aren't perf subprocesses 481a739a1d
  32. brunoerg force-pushed on Apr 26, 2023
  33. brunoerg commented at 9:34 AM on April 26, 2023: contributor

    Force-pushed addressing #23928 (review). Thanks @kouloumos for the review.

  34. maflcko commented at 11:52 AM on April 26, 2023: member

    I still think if the user passes --perf and perf is not available, the program should fail and tell the user that perf isn't found. Not sure what the point is to silently ignore what the user asked for and add code for a special case that shouldn't exist.

  35. DrahtBot added the label CI failed on Jun 9, 2023
  36. DrahtBot removed the label CI failed on Jun 9, 2023
  37. brunoerg commented at 1:58 PM on September 20, 2023: contributor

    Closing it for now.

  38. brunoerg closed this on Sep 20, 2023

  39. bitcoin locked this on Sep 19, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-02 03:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me