Subtree exclude mess in linters and update scripts #17413

issue laanwj openend this issue on November 8, 2019
  1. laanwj commented at 8:33 am on November 8, 2019: member

    For various operations, it is important to exclude subtrees that have imported code:

    • Doxygen: prevent cluttering our documentation with documentation that belongs with that library itself
    • Scripted-diff: commits are not allowed to touch subtrees, this is easy to accidentally do with scripted changes
    • Linters: some linters check policies that do not apply to upstream projects
    • Test coverage: distorts the results; low test coverage of subsidiary libraries does not mean anything

    It would be good for consistency, and to reduce developer frustration, for this list to be in one place only, then sourced from where it is needed.

    For example, for the crc32c subtree I had to update these places:

    • contrib/devtools/copyright_header.py (EXCLUDE_DIRS)
    • doc/Doxyfile.in (EXCLUDE)
    • test/lint/lint-include-guards.py (EXCLUDE_FILES_WITH_PREFIX)
    • test/lint/lint-includes.py (EXCLUDED_DIRS)
    • test/lint/lint-python-utf8-encoding.py (EXCLUDED_DIRS)
    • test/lint/lint-spelling.py (FILES_ARGS, git ls-files inline)
    • test/lint/lint-whitespace.py (EXCLUDED_DIRS)
    • Makefile.am (LCOV_FILTER_PATTERN)

    I’m not yet sure whether I forgot any places (because the excludes are specified in different formats in different places).

    Good first issue

    The purpose of the good first issue label is to highlight which issues are suitable for a new contributor without a deep understanding of the codebase.

    Useful skills: shell scripting, python

    Want to work on this issue?

    You do not need to request permission to start working on this. You are encouraged to comment on the issue if you are planning to work on it. This will help other contributors monitor which issues are actively being addressed and is also an effective way to request assistance if and when you need it.

    For guidance on contributing, please read CONTRIBUTING.md before opening your pull request.

  2. laanwj added the label good first issue on Nov 8, 2019
  3. laanwj added the label Scripts and tools on Nov 8, 2019
  4. hackerrdave commented at 4:25 pm on December 21, 2019: contributor
    Going to take a crack at this
  5. niVelion commented at 5:13 pm on January 14, 2022: none

    I’ve just had a look over the mentioned files, and am copying my notes here to make them public.

    Unless otherwise noted, the filetype is shell code.

    1. https://github.com/bitcoin/bitcoin/blob/master/contrib/devtools/copyright_header.py#L32-40

    Python: Quoted string array.

    0EXCLUDE_DIRS = [
    1    # git subtrees
    2    "src/crypto/ctaes/",
    3    "src/leveldb/",
    4    "src/minisketch",
    5    "src/secp256k1/",
    6    "src/univalue/",
    7    "src/crc32c/",
    8]
    
    1. https://github.com/bitcoin/bitcoin/blob/master/doc/Doxyfile.in#L864-L866

    Unquoted values.

    0EXCLUDE                = src/crc32c \
    1                         src/leveldb \
    2                         src/json
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/extended-lint-cppcheck.sh#L76

    Quoted strings, prefixed with :(exclude).

    0WARNINGS=$(git ls-files -- "*.cpp" "*.h" ":(exclude)src/leveldb/" ":(exclude)src/crc32c/" ":(exclude)src/secp256k1/" ":(exclude)src/minisketch/" ":(exclude)src/univalue/" | \
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-include-guards.sh#L13

    Regular Expression.

    0REGEXP_EXCLUDE_FILES_WITH_PREFIX="src/(crypto/ctaes/|leveldb/|crc32c/|secp256k1/|minisketch/|test/fuzz/FuzzedDataProvider.h|tinyformat.h|bench/nanobench.h|univalue/)"
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-includes.sh#L12

    Regular Expression.

    0IGNORE_REGEXP="/(leveldb|secp256k1|minisketch|univalue|crc32c)/"
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-python-utf8-encoding.sh#L12

    Quoted string, prefixed with :(exclude).

    0OUTPUT=$(git grep " open(" -- "*.py" ":(exclude)src/crc32c/" | grep -vE "encoding=.(ascii|utf8|utf-8)." | grep -vE "open\([^,]*, ['\"][^'\"]*b[^'\"]*['\"]")
    

    6a. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-python-utf8-encoding.sh#L20

    0OUTPUT=$(git grep "check_output(" -- "*.py" ":(exclude)src/crc32c/"| grep "universal_newlines=True" | grep -vE "encoding=.(ascii|utf8|utf-8).")
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-spelling.sh#L18

    Quoted strings, prefixed with :(exclude).

    0mapfile -t FILES < <(git ls-files -- ":(exclude)build-aux/m4/" ":(exclude)contrib/seeds/*.txt" ":(exclude)depends/" ":(exclude)doc/release-notes/" ":(exclude)src/leveldb/" ":(exclude)src/crc32c/" ":(exclude)src/qt/locale/" ":(exclude)src/qt/*.qrc" ":(exclude)src/secp256k1/" ":(exclude)src/minisketch/" ":(exclude)src/univalue/" ":(exclude)contrib/builder-keys/keys.txt" ":(exclude)contrib/guix/patches")
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-whitespace.sh#L36

    Quoted strings, prefixed with :(exclude).

    0  if ! git diff -U0 "${COMMIT_RANGE}" -- "." ":(exclude)depends/patches/" ":(exclude)contrib/guix/patches/" ":(exclude)src/leveldb/" ":(exclude)src/crc32c/" ":(exclude)src/secp256k1/" ":(exclude)src/minisketch/" ":(exclude)src/univalue/" ":(exclude)doc/release-notes/" ":(exclude)src/qt/locale/"; then
    

    8a. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-whitespace.sh#L43

    Quoted strings, prefixed with :(exclude).

    0  if ! git diff -U0 "${COMMIT_RANGE}" -- *.cpp *.h *.md *.py *.sh ":(exclude)src/leveldb/" ":(exclude)src/crc32c/" ":(exclude)src/secp256k1/" ":(exclude)src/minisketch/" ":(exclude)src/univalue/" ":(exclude)doc/release-notes/" ":(exclude)src/qt/locale/"; then
    
    1. https://github.com/bitcoin/bitcoin/blob/master/Makefile.am#L186-L198

    Automake: Quoted strings.

     0LCOV_FILTER_PATTERN = \
     1	-p "/usr/local/" \
     2	-p "/usr/include/" \
     3	-p "/usr/lib/" \
     4	-p "/usr/lib64/" \
     5	-p "src/leveldb/" \
     6	-p "src/crc32c/" \
     7	-p "src/bench/" \
     8	-p "src/univalue" \
     9	-p "src/crypto/ctaes" \
    10	-p "src/minisketch" \
    11	-p "src/secp256k1" \
    12	-p "depends"
    

    Additionally, searching the repo for univalue I found these possible additional locations.

    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-files.py#L22

    Python: RegExp.

    0ALLOWED_SOURCE_FILENAME_EXCEPTION_REGEXP = (
    1    "^src/(secp256k1/|minisketch/|univalue/|test/fuzz/FuzzedDataProvider.h)"
    2)
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-format-strings.sh#L37

    Pipe separated; part of regexp.

    0    for MATCHING_FILE in $(git grep --full-name -l "${FUNCTION_NAME}" -- "*.c" "*.cpp" "*.h" | sort | grep -vE "^src/(leveldb|secp256k1|minisketch|tinyformat|univalue|test/fuzz/strprintf.cpp)"); do
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-locale-dependence.sh#L49

    Pipe separarated; part of regexp.

    0REGEXP_IGNORE_EXTERNAL_DEPENDENCIES="^src/(crypto/ctaes/|leveldb/|secp256k1/|minisketch/|tinyformat.h|univalue/)"
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-shell-locale.sh#L15-18

    Pipe separated; part of regexp.

    0for SHELL_SCRIPT in $(git ls-files -- "*.sh" | grep -vE "src/(secp256k1|minisketch|univalue)/"); do
    1    if grep -q "# This script is intentionally locale dependent by not setting \"export LC_ALL=C\"" "${SHELL_SCRIPT}"; then
    2        continue
    3    fi
    
    1. https://github.com/bitcoin/bitcoin/blob/master/test/lint/lint-shell.sh#L28

    Pipe separated; part of regexp.

    0mapfile -t FILES < <(git ls-files -- '*.sh' | grep -vE 'src/(leveldb|secp256k1|minisketch|univalue)/')
    
  6. maxraustin commented at 6:56 pm on February 17, 2022: none
    My group and I are interested in looking into this one as a starter issue, have started brainstorming solutions.
  7. maxraustin referenced this in commit 58ebed9814 on Feb 23, 2022
  8. maxraustin referenced this in commit a47b81708f on Mar 24, 2022
  9. laanwj commented at 8:44 am on April 15, 2022: member
    I expect this is going to be slightly easier to do after #24783. At least the linters are uniformly Python after this, so there’s no need for a zillion different implementations of directory exclusion list logic.
  10. laanwj commented at 1:18 pm on May 9, 2022: member
    Updated the OP after #24783. I think it would make sense to converge on a EXCLUDED_DIRS list; many scripts were converted to this approach from using a regexp.
  11. fanquake closed this on Mar 27, 2024

  12. fanquake referenced this in commit 28f2ca675f on Mar 27, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me