Add linter: Make sure we explicitly open all text files using UTF-8 encoding in Python #13448

pull practicalswift wants to merge 2 commits into bitcoin:master from practicalswift:lint-python-utf8-encoding changing 20 files +57 −38
  1. practicalswift commented at 3:51 PM on June 12, 2018: contributor

    Add linter: Make sure we explicitly open all text files using UTF-8 encoding in Python.

    As requested by @laanwj in #13440.

  2. fanquake added the label Scripts and tools on Jun 12, 2018
  3. promag commented at 3:56 PM on June 12, 2018: member

    Concept ACK.

  4. in contrib/devtools/clang-format-diff.py:155 in 743add2f9e outdated
     151 | @@ -152,7 +152,7 @@ def main():
     152 |        sys.exit(p.returncode)
     153 |  
     154 |      if not args.i:
     155 | -      with open(filename) as f:
     156 | +      with open(filename, encoding="utf8") as f:
    


    MarcoFalke commented at 3:56 PM on June 12, 2018:

    Can you upstream this? (Asking only half-serious)


    practicalswift commented at 7:40 PM on June 12, 2018:

    Unfortunately LLVM does not use GitHub (beyond a repo mirror) and I try to limit my open source contributions to projects accepting GitHub PR:s due to time constraints :-)

  5. in contrib/linearize/linearize-hashes.py:99 in 743add2f9e outdated
      95 | @@ -96,7 +96,7 @@ def get_block_hashes(settings, max_blocks_per_call=10000):
      96 |  
      97 |  def get_rpc_cookie():
      98 |      # Open the cookie file
      99 | -    with open(os.path.join(os.path.expanduser(settings['datadir']), '.cookie'), 'r') as f:
     100 | +    with open(os.path.join(os.path.expanduser(settings['datadir']), '.cookie'), 'r', encoding="utf8") as f:
    


    MarcoFalke commented at 3:57 PM on June 12, 2018:

    The cookie file is ASCII encoded


    laanwj commented at 4:39 PM on June 12, 2018:

    I'm not sure that's a problem: if something is ASCII-encoded, it doesn't hurt reading it as utf-8.


    MarcoFalke commented at 4:45 PM on June 12, 2018:

    No strong opinion. Would throw and error if it was no longer ascii.

  6. in test/functional/feature_notifications.py:61 in 743add2f9e outdated
      57 | @@ -58,7 +58,7 @@ def run_test(self):
      58 |  
      59 |          # file content should equal the generated transaction hashes
      60 |          txids_rpc = list(map(lambda t: t['txid'], self.nodes[1].listtransactions("*", block_count)))
      61 | -        with open(self.tx_filename, 'r') as f:
      62 | +        with open(self.tx_filename, 'r', encoding="utf-8") as f:
    


    MarcoFalke commented at 3:58 PM on June 12, 2018:

    txids should only be ascii encoded

  7. in test/functional/test_framework/util.py:330 in 743add2f9e outdated
     326 | @@ -327,7 +327,7 @@ def get_auth_cookie(datadir):
     327 |                      assert password is None  # Ensure that there is only one rpcpassword line
     328 |                      password = line.split("=")[1].strip("\n")
     329 |      if os.path.isfile(os.path.join(datadir, "regtest", ".cookie")):
     330 | -        with open(os.path.join(datadir, "regtest", ".cookie"), 'r') as f:
     331 | +        with open(os.path.join(datadir, "regtest", ".cookie"), 'r', encoding="utf8") as f:
    


    MarcoFalke commented at 3:59 PM on June 12, 2018:

    The cookie file is ASCII encoded

  8. in test/lint/lint-python-utf8-encoding.sh:14 in 743add2f9e outdated
       9 | +
      10 | +EXIT_CODE=0
      11 | +OUTPUT=$(git grep -i " open(" -- "*.py" | grep -vE 'encoding=.(utf8|utf-8).' | grep -vE "open\([^,]*, ['\"][^'\"]*b[^'\"]*['\"]")
      12 | +if [[ ${OUTPUT} != "" ]]; then
      13 | +    echo "Python's open(...) seems to be used to open text files without explicitly"
      14 | +    echo "specifying 'encoding=utf8':"
    


    MarcoFalke commented at 4:00 PM on June 12, 2018:

    Should say encoding='utf8'? Also, you might want to use the single quotes(') when patching the python files in this pull, since this seems the more common way for short strings. Just a nit, though.

  9. MarcoFalke commented at 4:01 PM on June 12, 2018: member

    utACK 743add2f9ee381b80f70869585cd48e89efc5df1. Just a few nits. Feel free to ignore them.

  10. laanwj commented at 6:02 PM on June 12, 2018: member

    Thanks for adding this. utACK 743add2f9ee381b80f70869585cd48e89efc5df1, no strong opinion on ascii versus utf-8. I'm ok with being strict there, or keeping it like this, the point is being explicit.

    Edit, this trips up the linter in travis:

    Python's open(...) seems to be used to open text files without explicitly
    specifying 'encoding=utf8':
    contrib/verify-commits/verify-commits.py:    verified_root = open(dirname + "/trusted-git-root", "r").read().splitlines()[0]
    contrib/verify-commits/verify-commits.py:    verified_sha512_root = open(dirname + "/trusted-sha512-root-commit", "r").read().splitlines()[0]
    contrib/verify-commits/verify-commits.py:    revsig_allowed = open(dirname + "/allow-revsig-commits", "r").read().splitlines()
    contrib/verify-commits/verify-commits.py:    unclean_merge_allowed = open(dirname + "/allow-unclean-merge-commits", "r").read().splitlines()
    contrib/verify-commits/verify-commits.py:    incorrect_sha512_allowed = open(dirname + "/allow-incorrect-sha512-commits", "r").read().splitlines()
    ^---- failure generated from test/lint/lint-python-utf8-encoding.sh
    
  11. Explicitly specify encoding when opening text files in Python code 634bd97001
  12. practicalswift force-pushed on Jun 12, 2018
  13. practicalswift force-pushed on Jun 12, 2018
  14. practicalswift force-pushed on Jun 12, 2018
  15. practicalswift force-pushed on Jun 12, 2018
  16. Add linter: Make sure we explicitly open all text files using UTF-8 or ASCII encoding in Python c8176b3cc7
  17. practicalswift force-pushed on Jun 12, 2018
  18. practicalswift commented at 7:50 PM on June 12, 2018: contributor

    @MarcoFalke @laanwj Thanks for the quick review. Feedback addressed. Please re-review :-)

  19. MarcoFalke commented at 8:42 PM on June 12, 2018: member

    utACK c8176b3cc7556d7bcec39a55ae4d6ba16453baaa

  20. DrahtBot commented at 8:15 PM on June 14, 2018: member

    <!--e57a25ab6845829454e8d69fc972939a-->Note to reviewers: This pull request conflicts with the following ones:

    • #13339 (wallet: Replace %w by wallet name in -walletnotify script by promag)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  21. in test/lint/lint-python-utf8-encoding.sh:11 in c8176b3cc7
       6 | +#
       7 | +# Make sure we explicitly open all text files using UTF-8 (or ASCII) encoding to
       8 | +# avoid potential issues on the BSDs where the locale is not always set.
       9 | +
      10 | +EXIT_CODE=0
      11 | +OUTPUT=$(git grep " open(" -- "*.py" | grep -vE "encoding=.(ascii|utf8|utf-8)." | grep -vE "open\([^,]*, ['\"][^'\"]*b[^'\"]*['\"]")
    


    ken2812221 commented at 1:35 PM on June 15, 2018:

    nit: I would prefer not to use . to match "


    MarcoFalke commented at 1:40 PM on June 15, 2018:

    It matches also '. Imo every other character would be a syntax error in python, so should be fine.


    practicalswift commented at 1:44 PM on June 15, 2018:

    @ken2812221 That is intentional to match ' and ".


    ken2812221 commented at 1:56 PM on June 15, 2018:

    Right, that makes sense.

  22. ken2812221 approved
  23. ken2812221 commented at 1:36 PM on June 15, 2018: contributor

    utACK c8176b3

  24. laanwj merged this on Jun 16, 2018
  25. laanwj closed this on Jun 16, 2018

  26. laanwj referenced this in commit a90ca4087a on Jun 16, 2018
  27. PastaPastaPasta referenced this in commit d1200755f1 on Jul 7, 2020
  28. zkbot referenced this in commit 43ac2062f9 on Oct 28, 2020
  29. zkbot referenced this in commit 84a5830aaa on Nov 9, 2020
  30. practicalswift deleted the branch on Apr 10, 2021
  31. UdjinM6 referenced this in commit 6f6067873b on May 5, 2021
  32. UdjinM6 referenced this in commit c0db4fddf0 on May 6, 2021
  33. gades referenced this in commit 3d5dbfb7d0 on Mar 11, 2022
  34. gades referenced this in commit 970f488c5c on May 2, 2022
  35. DrahtBot locked this on Aug 18, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-16 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me