lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner

davidgumberg commented at 10:20 pm on March 26, 2024: contributor

Motivated by a comment in #29479, this draft PR refactors the lint runner into multiple modules, and refactors a few of the python linters to reuse common file excluding logic.

The include guards lint now prints all missing include guards before exiting, and all of the rewritten linters now exclude all subtrees, including crypto/caes.

The lint runner now supports running individual linters by passing --lint=LINT_CHECK_TO_RUN to the lint runner. If using cargo run arguments can be passed e.g.:

0cd test/lint/test_runner && cargo run -- --lint=doc --lint=includes

I can see reasons why the benefits of this PR might be outweighed by the downsides so I’m just publishing this as a draft to explore if further rewriting/refactoring of the lint suite in rust is desirable, and if so, what the best approach is.

DrahtBot commented at 10:20 pm on March 26, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	fjahr
Concept ACK	dergoegge

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#29494 (build: Assume HAVE_CONFIG_H, Add IWYU pragma keep to bitcoin-config.h includes by maflcko)
#28981 (Replace Boost.Process with cpp-subprocess by hebasto)
#22417 (util/system: Close non-std fds when execing slave processes by luke-jr)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the label Tests on Mar 26, 2024

fjahr commented at 10:50 pm on March 26, 2024: contributor

I think this needs a conceptual discussion if we want to move all Python code to Rust eventually.

bitcoin deleted a comment on Mar 26, 2024

dergoegge commented at 10:41 am on March 27, 2024: member

Concept ACK 🦀

maflcko commented at 11:21 am on March 27, 2024: member

I think this needs a conceptual discussion if we want to move all Python code to Rust eventually.

I don’t think moving all Python code to Rust makes sense. Simply due to the reason that reviewing such a large change is close to impossible.

0$ cat $( git ls-files '*.py' ) | wc -l 
178641

See also my reply in another thread: #29479 (comment)

maflcko commented at 11:25 am on March 27, 2024: member

The only expected change in behavior is that the include guards lint now prints all missing include guards before exiting.

I think you are missing that crypto/ctaes is a subtree, which previously was not ignored by some checks, and now is. It may be good to mention this.

maflcko commented at 12:04 pm on March 27, 2024: member

Also, looking at this again, test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py were forgotten?

I think it could make sense to either add them to this pull, or correct the python list, and make them to use the python list.

fjahr commented at 12:09 pm on March 27, 2024: contributor

I don’t think moving all Python code to Rust makes sense. Simply due to the reason that reviewing such a large change is close to impossible.

The tests are already split up into separate files and can be ported one at a time. We could pull in rust_bitcoin to get access to all the low-level stuff they have already written. It is a big project but certainly doable if Rust is considered a big enough upgrade. If it’s just a minor improvement I don’t think it’s worth alienating the developers that don’t know Rust from contributing to the linters.

maflcko commented at 12:20 pm on March 27, 2024: member

The tests are already split up into separate files and can be ported one at a time

Sure, but I think the functional test framework itself is already 10k lines of code. Having to support two test frameworks that are supposed to behave similar or identical may already be too difficult.

DrahtBot added the label Needs rebase on Mar 27, 2024

fjahr commented at 1:38 pm on March 27, 2024: contributor

Previously the linters were separated into their own files and could be called separately as well. This is now moving to one big monolytic program, which is not great. The lint_all function also doesn’t actually run all linters anymore with this change, it only runs the python ones, not the rewritten ones, so it should probably be renamed?

Sure, but I think the functional test framework itself is already 10k lines of code. Having to support two test frameworks that are supposed to behave similar or identical may already be too difficult.

If such a rewrite is not worth it on a big scale we should also not do it on a small scale. The value and effort scale up proportionally with the size of the change. And if we are concerned about resources I think we also should do this here either because a small distraction is a distraction as well.

maflcko commented at 6:50 pm on March 27, 2024: member

Previously the linters were separated into their own files and could be called separately as well.

Good point. This should be trivial to fix. For example by having an environment variable to indicate which check to run. For example LINT=subtree cargo run to run lint_subtree, and LINT=help to print all LINT tests.

The lint_all function also doesn’t actually run all linters anymore with this change, it only runs the python ones

I don’t think this has changed. The linters written in bash were never run as part of “lint all”, for example git-subtree-check.sh or test/lint/commit-script-check.sh. Also, test/lint/check-doc.py (written in python) was never run as part of “lint all”.

Generally, I think more important than the choice of language is to actually fix the issue that was intended to be solved. That is, #29744 (comment) and #29744 (comment) . Whether the fix is done in Python or Rust should be secondary, as long as the fix is done.

davidgumberg force-pushed on Apr 2, 2024

Refactor lint runner into separate modules

Refactor lint runner into separate modules and use
get_pathspecs_exclude* helpers everywhere.

7b5549d4d8

Refactor rust linters into separate modules 1ffdf3f0ad

Support individual linters in rust lint runner

Add support for passing `--lint=LINT_TO_RUN` to the rust lint runner.

871133f7f2

Docs: Running individual lint checks 840436f5e4

Rewrite include guard linter in rust 22898ad895

Rewrite includes linter in rust 3cf58e70e3

Rewrite spelling linter in rust 8e49b1ebee

Rewrite locale dependence linter in Rust 9b02b26a5d

Rewrite python encoding lints in rust 733b881592

Codespell false positive crate->create c239e90996

davidgumberg force-pushed on Apr 3, 2024

DrahtBot added the label CI failed on Apr 3, 2024

davidgumberg commented at 0:35 am on April 3, 2024: contributor

I’ve substantially revised this PR to address @fjahr’s feedback that the rust lint runner monofile was getting too big, and that it doesn’t support running individual tests and @maflcko’s suggestion to include test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py .

I am just OK at regex, and had to do some tricky things to avoid importing the rust regex crate, so I ask any reviewers to give special care to all of the regex segments, especially the pattern in fn get_locale_dependence_regexp(), I really appreciate your time and feedback!

I feel that while whether the added review and complexity burden of the changes here and, broadly, of having another language for everyone to be knowledgeable of outweighs the benefits remains an open question, (I am concept +0) it is possible for further rewriting of the lint suite to be worthwhile without it being the case that moving all Python code to Rust should be a good idea.

davidgumberg renamed this:
~~lint: Rewrite spelling, includes, and include guards linters in Rust~~
lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner
on Apr 3, 2024

DrahtBot renamed this:
~~lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner~~
lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner
on Apr 3, 2024

DrahtBot removed the label Needs rebase on Apr 3, 2024

DrahtBot removed the label CI failed on Apr 5, 2024

in test/lint/test_runner/src/linters/doc.rs:19 in 1ffdf3f0ad outdated

14+    {
15+        Ok(())
16+    } else {
17+        Err("".to_string())
18+    }
19+}

maflcko commented at 1:09 pm on April 5, 2024:

In 1ffdf3f0ad10879cc4aab6a513af8d1c20c63151: Why? What is the point to put a single function consisting of a single if in a new file and module? I fail to see how more boilerplate than the actual logic is beneficial.

If your point is that large files shouldn’t exist, then I tend to disagree. For example, doc/developer-notes.md is large and exists. Same for ./src/rpc/blockchain.cpp.

My preference would be to not move code, unless there is a reason.

DrahtBot added the label Needs rebase on Apr 10, 2024

DrahtBot commented at 10:55 am on April 10, 2024: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

maflcko commented at 11:37 am on April 15, 2024: member

Are you still working on this?

davidgumberg commented at 12:24 pm on April 15, 2024: contributor

Are you still working on this?

Yes, I was just waiting to see if there was any other feedback before putting all of the linter code back into a single file.

I think a lot of the reason for the individual lint checks being placed in separate files previously was so that they could be run individually, but I feel that a single file that contains both the lint checks and the lint runner is a bit clunky.

I would rather leave splitting up main.rs to the tastes of a future PR, unless someone has a strong opinion about how the files should be laid out.

fjahr commented at 1:52 pm on April 15, 2024: contributor

Concept NACK

As I have said before, I think there needs to be a discussion about using more Rust just for the sake of using more Rust in the project. The only motivation given so far is that @maflcko said that it could be done. There is no benefit to doing this otherwise apparently and the biggest effect I see is that there will be fewer people able to contribute to that part of the code base. And this change alone will also take up a lot of review effort so it appears to be a clear negative, at least in the short to mid-term.

maflcko commented at 2:08 pm on April 15, 2024: member

As I have said before, I think there needs to be a discussion about using more Rust just for the sake of using more Rust in the project. The only motivation given so far is that @maflcko said that it could be done.

Not sure where you get that from, but I never said that using more Rust for the sake of using more Rust can be done. I said something else:

I *don't* think moving all Python code to Rust makes sense. https://www.github.com/bitcoin/bitcoin/pull/29744#issuecomment-2022526794
Generally, I think more important than the choice of language is to actually fix the issue that was intended to be solved. That is, _ and _ . Whether the fix is done in Python or Rust should be secondary, as long as the fix is done. https://www.github.com/bitcoin/bitcoin/pull/29744#issuecomment-2023709475
Rust is type-safe, so many issues that are found after code is written, reviewed, and merged just can not happen and will be caught, ideally before a pull is reviewed and merged. For example ... https://www.github.com/bitcoin/bitcoin/pull/29479#issuecomment-2021251949

fjahr commented at 2:15 pm on April 15, 2024: contributor

Not sure where you get that from

“an alternative might be to rewrite the 3 lint checks to rust” #29479#pullrequestreview-1957109397 That is what @davidgumberg has given as the motivation in his description. It does not matter what you have said in other places because I have not been talking about that. I only talked about your comment that the author took as the motivation for this PR.

maflcko commented at 2:45 pm on April 15, 2024: member

That comment says so that they can use the already existing function as the motivation. (This is the issue that was attempted to be solved and why the other pull request was created in the first place). Re-reading my comment, I don’t get the impression that it said “using more Rust just for the sake of using more Rust in the project can be done”.

fjahr commented at 2:49 pm on April 15, 2024: contributor

That comment says so that they can use the already existing function as the motivation. (This is the issue that was attempted to be solved and why the other pull request was created in the first place).

Right, and that original PR was merged so that part does not apply here anymore, still this PR is still open.

Re-reading my comment, I don’t get the impression that it said “using more Rust just for the sake of using more Rust in the project can be done”.

Well, my impression is that the author got that impression and opened this PR because of it.

maflcko commented at 2:58 pm on April 15, 2024: member

Right, and that original PR was merged so that part does not apply here anymore, still this PR is still open.

My understanding is that the original PR didn’t actually fix the issue, so the problem remains. For example, crypto/ctaes is a subtree, but test/lint/lint_ignore_dirs.py does not list it. Also, test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py don’t use test/lint/lint_ignore_dirs.py.

Let me know if I am missing something. If the issue is indeed fixed, this pull request can clearly be closed.

In any case, it seems better to close this pull request and start a fresh one, if needed, which clearly explains the issue/problem in its own words, and how it is fixed. I don’t think the discussion in this pull request, and the fact that the patch changed after opening, makes it easy to follow.

fjahr commented at 3:18 pm on April 15, 2024: contributor

For example, crypto/ctaes is a subtree, but test/lint/lint_ignore_dirs.py does not list it. Also, test/lint/lint-locale-dependence.py and test/lint/lint-python-utf8-encoding.py don’t use test/lint/lint_ignore_dirs.py.

Right, these are behavior changes that I did not see as part of the motivation of the original PR, as the title says it was a refactor. So I don’t think this is something that was missed, it was just not the goal. It should be made more clear that this is the main motivation of this PR, to improve exclusion of subtrees. It should not be called a refactor in the title as it is now.

maflcko commented at 12:11 pm on April 22, 2024: member

Closing for now per previous discussion. Feel free to create a new pull request(s) with the behavior changes and an accurate description in your own words. Please try to avoid the use of the word “refactor” for bugfixes or behavior changes.

maflcko closed this on Apr 22, 2024

davidgumberg commented at 0:19 am on April 26, 2024: contributor

@maflcko @fjahr Thank you for the feedback, sorry for the confusion on this PR. I’ve opened https://github.com/bitcoin/bitcoin/pull/29965

bitcoin locked this on Apr 26, 2025

lint: Refactor lint checks to reuse exclusion logic; Support running individual lint checks in lint runner #29744

Code Coverage

Reviews

Conflicts