Add clang_format.py to help automate code style analysis #9658

pull isle2983 wants to merge 1 commits into bitcoin:master from isle2983:PR-clang-format changing 2 files +697 −0

isle2983 commented at 10:09 PM on January 31, 2017: contributor

The objective here is to expose data to inform the code style discussion. The tool makes it easy to change the clang-format binary and .clang_format file to help reason about the problem.

clang_format.py is added with the usage:

$ contrib/devtools/clang_format.py -h
usage: clang_format.py [-h] [-b BIN_PATH] [-s STYLE_FILE] [-j JOBS] [-f]
                       {report,check,format} [target [target ...]]

A utility for invoking clang-format to observe the state of C++ code
formatting in the repository. It produces reports of style metrics and also
can apply formatting.

positional arguments:
  {report,check,format}
                        Selects the action to be taken. 'report' produces a
                        report with analysis of the selected files taken as a
                        group. 'check' validates that the selected files match
                        the style and gives a per-file report and returns a
                        non-zero bash status if there are any format issues
                        discovered. 'format' applies the style formatting to
                        the selected files.
  target                A list of files and/or directories that select the
                        subset of files for this action. If a directory is
                        given as a target, all files contained in it and its
                        subdirectories are recursively selected. All targets
                        must be tracked in the same git repository clone.
                        (default=The current directory)

optional arguments:
  -h, --help            show this help message and exit
  -b BIN_PATH, --bin-path BIN_PATH
                        The path to the clang-format binary to be used.
                        (default=clang-format-[0-9]\.[0-9] installed in PATH
                        with the highest version number)
  -s STYLE_FILE, --style-file STYLE_FILE
                        The path to the style file to be used. (default=The
                        src/.clang_format file of the repository which holds
                        the targets)
  -j JOBS, --jobs JOBS  Parallel jobs for computing diffs. (default=6)
  -f, --force           Force proceeding with 'check' or 'format' if clang-
                        format doesn't support all parameters in the style
                        file. (default=False)

The report command looks like this when run against master:

$ contrib/devtools/clang_format.py report
--------------------------------------------------------------------------------
1466 files tracked according to 'git ls-files'
 406 files in scope according to SOURCE_FILES and ALWAYS_IGNORE settings
 406 files examined according to listed targets
--------------------------------------------------------------------------------
clang-format bin:         /usr/lib/llvm-3.8/bin/clang-format
clang-format version:     3.8.0
Using style in:           /home/isle/muhbitcoin/src/.clang-format
--------------------------------------------------------------------------------
Parallel jobs for diffs:   6
Elapsed time:              38.60s
Slowest diffs:
 33.88s for src/chainparamsseeds.h
  3.45s for src/qt/bitcoinstrings.cpp
--------------------------------------------------------------------------------
Files 100% matching:             47
Files <100% matching:           359
Formatted content MD5:      91ee2c53ca93ccca584457770dbf606c
--------------------------------------------------------------------------------
Files 90%-99% matching:         132
Files 80%-89% matching:         151
Files 70%-79% matching:          46
Files 60%-69% matching:          14
Files 50%-59% matching:          10
Files 40%-49% matching:           1
Files 30%-39% matching:           1
Files 20%-29% matching:           1
Files 10%-19% matching:           1
Files  0%- 9% matching:           2
--------------------------------------------------------------------------------

 +-------+          +------------+--------+---------+-----------+-------------+
 | score |          | pre-format |  added | removed | unchanged | post-format |
 +-------+  +-------+------------+--------+---------+-----------+-------------+
 |  83%  |  | lines |     109469 |  16807 |   17917 |     91552 |      108359 |
 +-------+  +-------+------------+--------+---------+-----------+-------------+

--------------------------------------------------------------------------------

It also will take a list of specific files to narrow it to a specific set instead of the entire repo.

The check subcommand output displays a specific report for each non-matching file in the set of targets and provides a status code to the shell:

$ contrib/devtools/clang_format.py check src/init.cpp
--------------------------------------------------------------------------------
1466 files tracked according to 'git ls-files'
 406 files in scope according to SOURCE_FILES and ALWAYS_IGNORE settings
   1 files examined according to listed targets
--------------------------------------------------------------------------------
A code format issue was detected in src/init.cpp
 +-------+          +------------+--------+---------+-----------+-------------+
 | score |          | pre-format |  added | removed | unchanged | post-format |
 +-------+  +-------+------------+--------+---------+-----------+-------------+
 |  90%  |  | lines |       1641 |    151 |     163 |      1478 |        1629 |
 +-------+  +-------+------------+--------+---------+-----------+-------------+
--------------------------------------------------------------------------------
These files can be auto-formatted by running:
$ contrib/devtools/clang_format.py format [target [target ...]]
--------------------------------------------------------------------------------
*** Format issues found!

The format subcommand will apply clang-format to set of targets much like the predecessor clang-format.py (which was recently removed in #9649).

The usage is overloaded for convenience:

To limit the reporting to just src/qt/:

$ contrib/devtools/clang_format.py report src/qt/

To check three random files:

$ contrib/devtools/clang_format.py check src/foo.cpp src/bar.h src/baz.cpp

To apply format to a single random file and also all files under a subdir:

$ contrib/devtools/clang_format.py format src/fizz.cpp src/buzz/

I have tested with all versions 3.4 through 3.9.0 on Ubuntu 14.04 and most of the same up to 3.9.1 on Debian 8. The applied formatting does have small differences between major versions (e.g. different result between 3.8.0 and 3.9.0 but the same result between 3.9.0 and 3.9.1). Some of the parameters in src/.clang_format aren't added until version 3.6.0. The script warns if a parameter is rejected. If you use the --force flag, it will proceed with 'best-effort' where the rejected parameters are dropped.

This is the fourth script in a series (The others being #9459 copyright_header.py, #9603 basic_style.py and #9632 clang_static_analysis.py) that provides a report command to measure the state of the codebase in the particular dimension.

A suggested trajectory from here is:

automating a suite of scripts to generate reports with consistent and known tool dependencies - a build matrix item in TravisCI looks like a good first thought
exposing metrics so they can be observed and monitored.

After that, the discussion can progress to:

considering some bulk actions to improve the codebase - in a safe, measured, conscientious way, of course.
observing the 'drift' in the wrong direction as PRs are merged
considering enforcement of pre-merge CI criteria where it adds clear value - the check subcommand in these scripts show a way that this can be done.
considering areas where similar automation scripts might be helpful (pep8 etc.)

Comments on all this are welcome.

Implement clang_format.py

A script for running 'clang-format' to help expose and resolve code style issues

4a1a0df5fb

jtimon commented at 11:51 PM on January 31, 2017: contributor

This seems related to #9649 ping @MarcoFalke
MarcoFalke commented at 12:00 AM on February 1, 2017: member

The rationale for removing the predecessor was that (maybe) people were confused which clang format script is meant for day-to-day dev activity, so they used none. Lets hope this changes after we had the discussion in last weeks meeting and after deleting the unused script. So there is only one script left with a description on how to use it. Though, I am not interested in enforcing the use of clang format, also I won't highlight any style nits in pulls, as I think it would be distracting.

In regard to this pull, I think this is better placed in some meta repository, so we can use it whenever appropriate but don't confuse regular developers by the number of scripts we put in the devtools subdir.

I submitted a pull request to the maintainers repo to archive the old version: https://github.com/bitcoin-core/bitcoin-maintainer-tools/pull/7 I think your script does fit really well into that repo as well.
isle2983 cross-referenced this on Feb 1, 2017 from issue Improvements to copyright_header.py and some minor copyright header tweaks. by isle2983
isle2983 cross-referenced this on Feb 1, 2017 from issue Add clang_static_analysis.py to help automate static analysis runs by isle2983
isle2983 commented at 6:18 PM on February 1, 2017: contributor

I support keeping devtools/ uncluttered for things that are clearly needed to be understood by everyone, so I am on board with it not going there at present time.

In the long run, my hope would be that every code-changing PR gets massaged developer-side prior to submission and everyone does a clean TravisCI run on the branch in their forked repo prior to submitting the PR. An efficient workflow to cut down on TravisCI round trips would also include running a pass of the same scripts on individual work machines, so having them easily available is a plus. To me, that is more of a 'developer' task rather than a 'maintainer' task - albeit an optional one at present.

The next task I am looking at is to bring the four scripts I have written into one subdirectory modeled after the rpc tests in qa/. I wish to consolidate the repeated code and make better OO abstractions for future extensions. Also, for the same PR, I hope to have it all working inside TravisCI in a report-information-only capacity. At that time we could decide to make it live in a different repo if it isn't too much trouble importing it for TravisCI runs on bitcoin/bitcoin.
jonasschnelli added the label Scripts and tools on Feb 1, 2017
laanwj commented at 8:54 AM on February 2, 2017: member

Please add scripts like this to the maintainer tools: https://github.com/bitcoin-core/bitcoin-maintainer-tools

There are meta-tools that are not part of the Bitcoin Core release schedule and branching, so have no place in the repository. qa/devtools could be seen as a historical error in that regard, but we'll keep some often-used scripts there.
laanwj closed this on Feb 2, 2017
laanwj commented at 8:58 AM on February 2, 2017: member

In the long run, my hope would be that every code-changing PR gets massaged developer-side prior to submission and everyone does a clean TravisCI run on the branch in their forked repo prior to submitting the PR.

Sure, I completely agree that in the future, if it gets used by every developer it can be moved to somewhere inside the tree if that is more convenient.

But right now, this script has 0 (or 1) users. So let's add it to the maintainer tools first and see how that goes.
bitcoin locked this on Sep 8, 2021

Contributors

Labels

Linked (view graph)

#9459 Improvements to copyright_header.py and some minor copyright header tweaks.#9603 Add basic_style.py to automate some style checking.#9632 Add clang_static_analysis.py to help automate static analysis runs #9649 [doc] Remove unused clang format dev script