Improvements to copyright_header.py and some minor copyright header tweaks. #9459

pull isle2983 wants to merge 2 commits into bitcoin:master from isle2983:PR-copyright-script-improve changing 10 files +636 −382
  1. isle2983 commented at 9:59 PM on January 2, 2017: contributor

    This is a re-submission of PR #9453 with the TravisCI integration and language removed based on feedback.

    The changes are a significant iteration to contrib/devtools/copyright_header.py. Like the preceding version, this script is able to parse through the source files and report on the state of the copyright headers as well as update the year.

    The first major improvement is that it now has a ruleset specific for each subtree to detect the local proper header style (the subtrees being secp256k1, leveldb, univalue and ctaes).

    The second major improvement is that it now makes more elegant use of Python's regular expressions to detect and manipulate the header text. This approach should be maintainable and extensible while keeping the script's execution time quick.

    The third major improvement is the addition of a check command which gives a positive or negative acceptance for the state of the repository according to the rules of the script. When issues are found, it also gives more direct suggestions for how to resolve. The purpose is to better assist periodic maintenance of the state of the copyright headers.

    Also, some minor tweaks are done to make the set of copyright headers more consistent. The addition of the MIT license header to contrib/dev/tools/gen-manpages.sh was previously ACKed by @nomnombtc in #9453.

  2. fanquake added the label Docs and Output on Jan 3, 2017
  3. MarcoFalke commented at 1:20 PM on January 4, 2017: member

    Needs rebase if still relevant

  4. isle2983 force-pushed on Jan 7, 2017
  5. isle2983 commented at 6:12 PM on January 7, 2017: contributor

    Rebased. The conflict was in the files:

    contrib/linearize/linearize-data.py contrib/linearize/linearize-hashes.py share/rpcuser/rpcuser.py

    For the three files, the conflict was with 27765b6403cece54320374b37afb01a0cfe571c3 which incremented the end year to 2016.

    For linearize-data.py and linearize-hashes.py it also conflicted with 3c8f63ba7c7be62d462727f4d67633e1064f0f79 which changed the shebang at the top of the file to point to python3

  6. paveljanik commented at 11:32 AM on January 8, 2017: contributor

    Should we check the subdirs like secp256k1 at all? Regexps like:

    +SECP256K1_HOLDERS = [
    +    "Pieter Wuille +\\*",
    +    "Pieter Wuille, Gregory Maxwell +\\*",
    +    "Pieter Wuille, Andrew Poelstra +\\*",
    +    "Andrew Poelstra +\\*",
    +    "Diederik Huys, Pieter Wuille +\\*",
    +    "Thomas Daede, Cory Fields +\\*",
    +]
    +ANY_SECP256K1_HOLDER = '(%s)' % '|'.join([h for h in SECP256K1_HOLDERS])
    

    are strange to have in Bitcoin Core source tree (even in contrib directory)...

  7. fanquake commented at 2:13 PM on January 12, 2017: member

    utACK https://github.com/bitcoin/bitcoin/pull/9459/commits/b64b1c60000db317d086862f2ca6cd148ea5847f - yes why not unify the headers, maybe for the last time.

    Not sure about the changes to the copyright script though. Should we really be monitoring the copyright headers of subtrees in here? It looks like the script will need constant maintenance when new files are added to the repo, or something changes upstream.

  8. isle2983 commented at 5:59 PM on January 12, 2017: contributor

    I am not really sure what the right answer is. It is nice have an automated way to monitor the state of the copyright for code that compiled and used in the repo. But on the other hand, there are ways of doing that in a way that doesn't need the particular validation logic checked into the mainline.

    For example, as posted in #9459, a suggested workflow for Coverity (a static code analysis tool) is to have a side branch automatically pulling from the mainline that is allowed to go red on CI for a while and then periodically synced up to green with a fixup PR. https://docs.travis-ci.com/user/coverity-scan/#Build-Submission-Frequency

  9. fanquake commented at 12:15 AM on January 21, 2017: member

    ACK https://github.com/bitcoin/bitcoin/pull/9459/commits/b64b1c60000db317d086862f2ca6cd148ea5847f NACK 1083bad I don't we need the subtree style checking etc.

    If you drop the second commit, we can merge this, but then let's focus efforts on style/automation changes that we are more likely to see benefits from, such as #9603.

  10. isle2983 force-pushed on Jan 23, 2017
  11. isle2983 commented at 3:02 AM on January 23, 2017: contributor

    Yep, the picture of what makes sense is becoming clearer. Agreed that the subtree checking probably won't even make sense in the long run.

    The new branch keeps b64b1c60000db317d086862f2ca6cd148ea5847f as-is, but revs 1083bad to a new version of the script with all the subtree stuff taken out. The ALWAYS_IGNORE now looks like this:

    ALWAYS_IGNORE = [
        # empty files for python init:
        '*__init__.py',
        # files in subtrees:
        'src/secp256k1/*',
        'src/leveldb/*',
        'src/univalue/*',
        'src/crypto/ctaes/*',
    ]
    

    So, the remaining benefit of the copyright_header.py changes over the mainline one are:

    1. proper python regex-based file searching and editing (the mainline version is a bit clumsy in this department)

    2. the check subcommand that gives a pointer specific to each file that has an issue and a bash status code for future CI script use.

    3. a different report subcommand output which now looks like this:

    --------------------------------------------------------------------------------
    1465 files tracked according to 'git ls-files'
     556 files examined according to SOURCE_FILES and ALWAYS_IGNORE fnmatch rules
    --------------------------------------------------------------------------------
    Files expected to have header:                                     534
    Files not expected to have header:                                  22
    Files not expected to have 'copyright' occurrence outside header:  527
    Files expected to have 'copyright' occurrence outside header:       29
    Files passed:                                                      556
    Files failed:                                                        0
    --------------------------------------------------------------------------------
    
    1. a bunch of minor tweaks to bring it in line with a couple things learned from basic_style.py in #9603

    Again, following from the body of other discussions, there is no rush to fix up the minor headers issues - they aren't hurting anyone in their present state. If there is no rush to merge, I am content letting this PR sit until a larger collection of automation scripts are completed and we can have the bigger CI discussion.

  12. jtimon commented at 11:52 PM on January 31, 2017: contributor

    Needs rebase

  13. Minor adjustments to MIT Licence copyright headers to make uniform 5cae31114f
  14. Iteration on copyright_header.py to use regexes and add 'check' subcommand
    The validation of the copyright header is done with a more comprehensive regex
    check for the exact format.
    
    The output of the 'report' command has changed to give the results of specific
    expectations for sets of files.
    
    Also implements 'check' command to give non-zero shell status if issues are
    found along with hints on what needs to be done to resolve the issue.
    ccc3e19ce1
  15. isle2983 force-pushed on Feb 1, 2017
  16. isle2983 commented at 12:14 AM on February 1, 2017: contributor

    rebased. clang-format.py was removed, so this no longer needs to adjust it's header.

    Also, this discussion about scripting continues in #9658. This will likely end up being closed once a new PR is put together with some automation of the various checks.

  17. isle2983 commented at 7:41 PM on February 2, 2017: contributor

    Closing.

    The plan is to submit script likes this to https://github.com/bitcoin-core/bitcoin-maintainer-tools as that now seems to be the preferred location.

    I will be touching up this script to share some code with the others and submitting it all as a PR there.

    Also, once that is up the underlying copyright_header.py can probably be removed from devtools.

  18. isle2983 closed this on Feb 2, 2017

  19. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-15 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me