As discussed recently in #9581 and in referenced past discussions #6839, #6156, and probably a bunch of other places too, there is concern about the long tail of effort and risk involved with these style fix-up PRs. To summarize what has been articulated by the experienced maintainers:
- it is a lot of PR clutter for reviewers/maintainers to sift through
- it impedes release management since style fix-ups often have difficult merge conflicts with more important content
- they are sometimes hard to review and bugs can slip in
- they permanently clutter up output from
git blameand other history tools
My suggestion is that the ideal long-term strategy is to automate detection of these style issues as much as possible such that the CI system can auto-check every PR. This will ensure that the code is kept clean and consistent at source and is kept that way for all time.
Additionally, the message I am getting from what has been written is that the strategy for bringing this online needs to be done properly or not at all. It will require a conscientious effort to minimize the impact to ongoing activities and steps should be taken to make damn sure it is not the cause of any bugs. I have some thoughts, but collecting ideas from all around is good (and thanks for all that has already been written, it is quite helpful for thinking this through). I am content leaving this PR to sit until a larger collection of automation scripts are complete and we have a strategy that the maintainers are on board with.
This PR provides an example script for very basic style checks which are expressed as simple regular expressions:
STYLE_RULES = [
{'title': 'No tabstops',
'applies': ['*.c', '*.cpp', '*.h', '*.py', '*.sh'],
'regex': '\t',
'fix': ' '},
{'title': 'No trailing whitespace on a line',
'applies': ['*.c', '*.cpp', '*.h', '*.py', '*.sh'],
'regex': ' \n',
'fix': '\n'},
{'title': 'No more than three consecutive newlines',
'applies': ['*.c', '*.cpp', '*.h', '*.py', '*.sh'],
'regex': '\n\n\n\n',
'fix': '\n\n\n'},
{'title': 'Do not end a line with a semicolon',
'applies': ['*.py'],
'regex': ';\n',
'fix': '\n'},
{'title': 'Do not end a line with two semicolons',
'applies': ['*.c', '*.cpp', '*.h'],
'regex': ';;\n',
'fix': ';\n'},
]
The script provides a report subcommand that gives an overview of the state of the repository. It also provides a check command, which like in #9459, provides a shell status code (to make it easy to hook to TravisCI) and also gives pointers to what needs to be done to resolve the issue. Finally it provides a fix command which just does a search-and-replace according to the 'fix' string in the above STYLE_RULES dictionary.
The master branch has 141 files that fail this set of rules (19 + 93 + 16 + 13 + 0). However, I did not include the fixup in this PR due to the aforementioned reasons. The script can do the edit by invoking contrib/devtools/basic_style.py fix ..
clang-format, pep8, pylint, coverity, clang static analyzer, etc. are what I would define as 'non-basic' style checks that can be similarly automated (and might actually make this one redundant). Also, the copyright_header.py script in #9452/#9459 could be part of the end result too.