lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner #29744

pull davidgumberg wants to merge 10 commits into bitcoin:master from davidgumberg:rustlin changing 19 files +1181 −928
  1. davidgumberg commented at 10:20 pm on March 26, 2024: contributor

    Motivated by a comment in #29479, this draft PR refactors the lint runner into multiple modules, and refactors a few of the python linters to reuse common file excluding logic.

    The include guards lint now prints all missing include guards before exiting, and all of the rewritten linters now exclude all subtrees, including crypto/caes.

    The lint runner now supports running individual linters by passing --lint=LINT_CHECK_TO_RUN to the lint runner. If using cargo run arguments can be passed e.g.:

    0cd test/lint/test_runner && cargo run -- --lint=doc --lint=includes
    

    I can see reasons why the benefits of this PR might be outweighed by the downsides so I’m just publishing this as a draft to explore if further rewriting/refactoring of the lint suite in rust is desirable, and if so, what the best approach is.

  2. DrahtBot commented at 10:20 pm on March 26, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept NACK fjahr
    Concept ACK dergoegge

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29494 (build: Assume HAVE_CONFIG_H, Add IWYU pragma keep to bitcoin-config.h includes by maflcko)
    • #28981 (Replace Boost.Process with cpp-subprocess by hebasto)
    • #22417 (util/system: Close non-std fds when execing slave processes by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on Mar 26, 2024
  4. fjahr commented at 10:50 pm on March 26, 2024: contributor
    I think this needs a conceptual discussion if we want to move all Python code to Rust eventually.
  5. bitcoin deleted a comment on Mar 26, 2024
  6. bitcoin deleted a comment on Mar 26, 2024
  7. dergoegge commented at 10:41 am on March 27, 2024: member
    Concept ACK 🦀
  8. maflcko commented at 11:21 am on March 27, 2024: member

    I think this needs a conceptual discussion if we want to move all Python code to Rust eventually.

    I don’t think moving all Python code to Rust makes sense. Simply due to the reason that reviewing such a large change is close to impossible.

    0$ cat $( git ls-files '*.py' ) | wc -l 
    178641
    

    See also my reply in another thread: #29479 (comment)

  9. maflcko commented at 11:25 am on March 27, 2024: member

    The only expected change in behavior is that the include guards lint now prints all missing include guards before exiting.

    I think you are missing that crypto/ctaes is a subtree, which previously was not ignored by some checks, and now is. It may be good to mention this.

  10. maflcko commented at 12:04 pm on March 27, 2024: member

    Also, looking at this again, test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py were forgotten?

    I think it could make sense to either add them to this pull, or correct the python list, and make them to use the python list.

  11. fjahr commented at 12:09 pm on March 27, 2024: contributor

    I don’t think moving all Python code to Rust makes sense. Simply due to the reason that reviewing such a large change is close to impossible.

    The tests are already split up into separate files and can be ported one at a time. We could pull in rust_bitcoin to get access to all the low-level stuff they have already written. It is a big project but certainly doable if Rust is considered a big enough upgrade. If it’s just a minor improvement I don’t think it’s worth alienating the developers that don’t know Rust from contributing to the linters.

  12. maflcko commented at 12:20 pm on March 27, 2024: member

    The tests are already split up into separate files and can be ported one at a time

    Sure, but I think the functional test framework itself is already 10k lines of code. Having to support two test frameworks that are supposed to behave similar or identical may already be too difficult.

  13. DrahtBot added the label Needs rebase on Mar 27, 2024
  14. fjahr commented at 1:38 pm on March 27, 2024: contributor

    Previously the linters were separated into their own files and could be called separately as well. This is now moving to one big monolytic program, which is not great. The lint_all function also doesn’t actually run all linters anymore with this change, it only runs the python ones, not the rewritten ones, so it should probably be renamed?

    Sure, but I think the functional test framework itself is already 10k lines of code. Having to support two test frameworks that are supposed to behave similar or identical may already be too difficult.

    If such a rewrite is not worth it on a big scale we should also not do it on a small scale. The value and effort scale up proportionally with the size of the change. And if we are concerned about resources I think we also should do this here either because a small distraction is a distraction as well.

  15. maflcko commented at 6:50 pm on March 27, 2024: member

    Previously the linters were separated into their own files and could be called separately as well.

    Good point. This should be trivial to fix. For example by having an environment variable to indicate which check to run. For example LINT=subtree cargo run to run lint_subtree, and LINT=help to print all LINT tests.

    The lint_all function also doesn’t actually run all linters anymore with this change, it only runs the python ones

    I don’t think this has changed. The linters written in bash were never run as part of “lint all”, for example git-subtree-check.sh or test/lint/commit-script-check.sh. Also, test/lint/check-doc.py (written in python) was never run as part of “lint all”.

    Generally, I think more important than the choice of language is to actually fix the issue that was intended to be solved. That is, #29744 (comment) and #29744 (comment) . Whether the fix is done in Python or Rust should be secondary, as long as the fix is done.

  16. davidgumberg force-pushed on Apr 2, 2024
  17. Refactor lint runner into separate modules
    Refactor lint runner into separate modules and use
    get_pathspecs_exclude* helpers everywhere.
    7b5549d4d8
  18. Refactor rust linters into separate modules 1ffdf3f0ad
  19. Support individual linters in rust lint runner
    Add support for passing `--lint=LINT_TO_RUN` to the rust lint runner.
    871133f7f2
  20. Docs: Running individual lint checks 840436f5e4
  21. Rewrite include guard linter in rust 22898ad895
  22. Rewrite includes linter in rust 3cf58e70e3
  23. Rewrite spelling linter in rust 8e49b1ebee
  24. Rewrite locale dependence linter in Rust 9b02b26a5d
  25. Rewrite python encoding lints in rust 733b881592
  26. Codespell false positive crate->create c239e90996
  27. davidgumberg force-pushed on Apr 3, 2024
  28. DrahtBot added the label CI failed on Apr 3, 2024
  29. davidgumberg commented at 0:35 am on April 3, 2024: contributor

    I’ve substantially revised this PR to address @fjahr’s feedback that the rust lint runner monofile was getting too big, and that it doesn’t support running individual tests and @maflcko’s suggestion to include test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py .

    I am just OK at regex, and had to do some tricky things to avoid importing the rust regex crate, so I ask any reviewers to give special care to all of the regex segments, especially the pattern in fn get_locale_dependence_regexp(), I really appreciate your time and feedback!

    I feel that while whether the added review and complexity burden of the changes here and, broadly, of having another language for everyone to be knowledgeable of outweighs the benefits remains an open question, (I am concept +0) it is possible for further rewriting of the lint suite to be worthwhile without it being the case that moving all Python code to Rust should be a good idea.

  30. davidgumberg renamed this:
    lint: Rewrite spelling, includes, and include guards linters in Rust
    lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner
    on Apr 3, 2024
  31. DrahtBot renamed this:
    lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner
    lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner
    on Apr 3, 2024
  32. DrahtBot removed the label Needs rebase on Apr 3, 2024
  33. DrahtBot removed the label CI failed on Apr 5, 2024
  34. in test/lint/test_runner/src/linters/doc.rs:19 in 1ffdf3f0ad outdated
    14+    {
    15+        Ok(())
    16+    } else {
    17+        Err("".to_string())
    18+    }
    19+}
    


    maflcko commented at 1:09 pm on April 5, 2024:

    In 1ffdf3f0ad10879cc4aab6a513af8d1c20c63151: Why? What is the point to put a single function consisting of a single if in a new file and module? I fail to see how more boilerplate than the actual logic is beneficial.

    If your point is that large files shouldn’t exist, then I tend to disagree. For example, doc/developer-notes.md is large and exists. Same for ./src/rpc/blockchain.cpp.

    My preference would be to not move code, unless there is a reason.

  35. DrahtBot added the label Needs rebase on Apr 10, 2024
  36. DrahtBot commented at 10:55 am on April 10, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  37. maflcko commented at 11:37 am on April 15, 2024: member
    Are you still working on this?
  38. davidgumberg commented at 12:24 pm on April 15, 2024: contributor

    Are you still working on this?

    Yes, I was just waiting to see if there was any other feedback before putting all of the linter code back into a single file.

    I think a lot of the reason for the individual lint checks being placed in separate files previously was so that they could be run individually, but I feel that a single file that contains both the lint checks and the lint runner is a bit clunky.

    I would rather leave splitting up main.rs to the tastes of a future PR, unless someone has a strong opinion about how the files should be laid out.

  39. fjahr commented at 1:52 pm on April 15, 2024: contributor

    Concept NACK

    As I have said before, I think there needs to be a discussion about using more Rust just for the sake of using more Rust in the project. The only motivation given so far is that @maflcko said that it could be done. There is no benefit to doing this otherwise apparently and the biggest effect I see is that there will be fewer people able to contribute to that part of the code base. And this change alone will also take up a lot of review effort so it appears to be a clear negative, at least in the short to mid-term.

  40. maflcko commented at 2:08 pm on April 15, 2024: member

    As I have said before, I think there needs to be a discussion about using more Rust just for the sake of using more Rust in the project. The only motivation given so far is that @maflcko said that it could be done.

    Not sure where you get that from, but I never said that using more Rust for the sake of using more Rust can be done. I said something else:

  41. fjahr commented at 2:15 pm on April 15, 2024: contributor

    Not sure where you get that from

    “an alternative might be to rewrite the 3 lint checks to rust” #29479#pullrequestreview-1957109397 That is what @davidgumberg has given as the motivation in his description. It does not matter what you have said in other places because I have not been talking about that. I only talked about your comment that the author took as the motivation for this PR.

  42. maflcko commented at 2:45 pm on April 15, 2024: member
    That comment says so that they can use the already existing function as the motivation. (This is the issue that was attempted to be solved and why the other pull request was created in the first place). Re-reading my comment, I don’t get the impression that it said “using more Rust just for the sake of using more Rust in the project can be done”.
  43. fjahr commented at 2:49 pm on April 15, 2024: contributor

    That comment says so that they can use the already existing function as the motivation. (This is the issue that was attempted to be solved and why the other pull request was created in the first place).

    Right, and that original PR was merged so that part does not apply here anymore, still this PR is still open.

    Re-reading my comment, I don’t get the impression that it said “using more Rust just for the sake of using more Rust in the project can be done”.

    Well, my impression is that the author got that impression and opened this PR because of it.

  44. maflcko commented at 2:58 pm on April 15, 2024: member

    Right, and that original PR was merged so that part does not apply here anymore, still this PR is still open.

    My understanding is that the original PR didn’t actually fix the issue, so the problem remains. For example, crypto/ctaes is a subtree, but test/lint/lint_ignore_dirs.py does not list it. Also, test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py don’t use test/lint/lint_ignore_dirs.py.

    Let me know if I am missing something. If the issue is indeed fixed, this pull request can clearly be closed.

    In any case, it seems better to close this pull request and start a fresh one, if needed, which clearly explains the issue/problem in its own words, and how it is fixed. I don’t think the discussion in this pull request, and the fact that the patch changed after opening, makes it easy to follow.

  45. fjahr commented at 3:18 pm on April 15, 2024: contributor

    For example, crypto/ctaes is a subtree, but test/lint/lint_ignore_dirs.py does not list it. Also, test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py don’t use test/lint/lint_ignore_dirs.py.

    Right, these are behavior changes that I did not see as part of the motivation of the original PR, as the title says it was a refactor. So I don’t think this is something that was missed, it was just not the goal. It should be made more clear that this is the main motivation of this PR, to improve exclusion of subtrees. It should not be called a refactor in the title as it is now.

  46. maflcko commented at 12:11 pm on April 22, 2024: member
    Closing for now per previous discussion. Feel free to create a new pull request(s) with the behavior changes and an accurate description in your own words. Please try to avoid the use of the word “refactor” for bugfixes or behavior changes.
  47. maflcko closed this on Apr 22, 2024

  48. davidgumberg commented at 0:19 am on April 26, 2024: contributor
    @maflcko @fjahr Thank you for the feedback, sorry for the confusion on this PR. I’ve opened https://github.com/bitcoin/bitcoin/pull/29965

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me