Add linter: Make sure we explicitly open all text files using UTF-8 encoding in Python.
Add linter: Make sure we explicitly open all text files using UTF-8 encoding in Python #13448
pull practicalswift wants to merge 2 commits into bitcoin:master from practicalswift:lint-python-utf8-encoding changing 20 files +57 −38-
practicalswift commented at 3:51 PM on June 12, 2018: contributor
- fanquake added the label Scripts and tools on Jun 12, 2018
-
promag commented at 3:56 PM on June 12, 2018: member
Concept ACK.
-
in contrib/devtools/clang-format-diff.py:155 in 743add2f9e outdated
151 | @@ -152,7 +152,7 @@ def main(): 152 | sys.exit(p.returncode) 153 | 154 | if not args.i: 155 | - with open(filename) as f: 156 | + with open(filename, encoding="utf8") as f:
MarcoFalke commented at 3:56 PM on June 12, 2018:Can you upstream this? (Asking only half-serious)
practicalswift commented at 7:40 PM on June 12, 2018:Unfortunately LLVM does not use GitHub (beyond a repo mirror) and I try to limit my open source contributions to projects accepting GitHub PR:s due to time constraints :-)
in contrib/linearize/linearize-hashes.py:99 in 743add2f9e outdated
95 | @@ -96,7 +96,7 @@ def get_block_hashes(settings, max_blocks_per_call=10000): 96 | 97 | def get_rpc_cookie(): 98 | # Open the cookie file 99 | - with open(os.path.join(os.path.expanduser(settings['datadir']), '.cookie'), 'r') as f: 100 | + with open(os.path.join(os.path.expanduser(settings['datadir']), '.cookie'), 'r', encoding="utf8") as f:
MarcoFalke commented at 3:57 PM on June 12, 2018:The cookie file is ASCII encoded
laanwj commented at 4:39 PM on June 12, 2018:I'm not sure that's a problem: if something is ASCII-encoded, it doesn't hurt reading it as utf-8.
MarcoFalke commented at 4:45 PM on June 12, 2018:No strong opinion. Would throw and error if it was no longer ascii.
in test/functional/feature_notifications.py:61 in 743add2f9e outdated
57 | @@ -58,7 +58,7 @@ def run_test(self): 58 | 59 | # file content should equal the generated transaction hashes 60 | txids_rpc = list(map(lambda t: t['txid'], self.nodes[1].listtransactions("*", block_count))) 61 | - with open(self.tx_filename, 'r') as f: 62 | + with open(self.tx_filename, 'r', encoding="utf-8") as f:
MarcoFalke commented at 3:58 PM on June 12, 2018:txids should only be ascii encoded
in test/functional/test_framework/util.py:330 in 743add2f9e outdated
326 | @@ -327,7 +327,7 @@ def get_auth_cookie(datadir): 327 | assert password is None # Ensure that there is only one rpcpassword line 328 | password = line.split("=")[1].strip("\n") 329 | if os.path.isfile(os.path.join(datadir, "regtest", ".cookie")): 330 | - with open(os.path.join(datadir, "regtest", ".cookie"), 'r') as f: 331 | + with open(os.path.join(datadir, "regtest", ".cookie"), 'r', encoding="utf8") as f:
MarcoFalke commented at 3:59 PM on June 12, 2018:The cookie file is ASCII encoded
in test/lint/lint-python-utf8-encoding.sh:14 in 743add2f9e outdated
9 | + 10 | +EXIT_CODE=0 11 | +OUTPUT=$(git grep -i " open(" -- "*.py" | grep -vE 'encoding=.(utf8|utf-8).' | grep -vE "open\([^,]*, ['\"][^'\"]*b[^'\"]*['\"]") 12 | +if [[ ${OUTPUT} != "" ]]; then 13 | + echo "Python's open(...) seems to be used to open text files without explicitly" 14 | + echo "specifying 'encoding=utf8':"
MarcoFalke commented at 4:00 PM on June 12, 2018:Should say encoding='utf8'? Also, you might want to use the single quotes(') when patching the python files in this pull, since this seems the more common way for short strings. Just a nit, though.
MarcoFalke commented at 4:01 PM on June 12, 2018: memberutACK 743add2f9ee381b80f70869585cd48e89efc5df1. Just a few nits. Feel free to ignore them.
laanwj commented at 6:02 PM on June 12, 2018: memberThanks for adding this.
utACK 743add2f9ee381b80f70869585cd48e89efc5df1, no strong opinion onasciiversusutf-8. I'm ok with being strict there, or keeping it like this, the point is being explicit.Edit, this trips up the linter in travis:
Python's open(...) seems to be used to open text files without explicitly specifying 'encoding=utf8': contrib/verify-commits/verify-commits.py: verified_root = open(dirname + "/trusted-git-root", "r").read().splitlines()[0] contrib/verify-commits/verify-commits.py: verified_sha512_root = open(dirname + "/trusted-sha512-root-commit", "r").read().splitlines()[0] contrib/verify-commits/verify-commits.py: revsig_allowed = open(dirname + "/allow-revsig-commits", "r").read().splitlines() contrib/verify-commits/verify-commits.py: unclean_merge_allowed = open(dirname + "/allow-unclean-merge-commits", "r").read().splitlines() contrib/verify-commits/verify-commits.py: incorrect_sha512_allowed = open(dirname + "/allow-incorrect-sha512-commits", "r").read().splitlines() ^---- failure generated from test/lint/lint-python-utf8-encoding.shExplicitly specify encoding when opening text files in Python code 634bd97001practicalswift force-pushed on Jun 12, 2018practicalswift force-pushed on Jun 12, 2018practicalswift force-pushed on Jun 12, 2018practicalswift force-pushed on Jun 12, 2018Add linter: Make sure we explicitly open all text files using UTF-8 or ASCII encoding in Python c8176b3cc7practicalswift force-pushed on Jun 12, 2018practicalswift commented at 7:50 PM on June 12, 2018: contributor@MarcoFalke @laanwj Thanks for the quick review. Feedback addressed. Please re-review :-)
MarcoFalke commented at 8:42 PM on June 12, 2018: memberutACK c8176b3cc7556d7bcec39a55ae4d6ba16453baaa
DrahtBot commented at 8:15 PM on June 14, 2018: member<!--e57a25ab6845829454e8d69fc972939a-->Note to reviewers: This pull request conflicts with the following ones:
- #13339 (wallet: Replace %w by wallet name in -walletnotify script by promag)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
in test/lint/lint-python-utf8-encoding.sh:11 in c8176b3cc7
6 | +# 7 | +# Make sure we explicitly open all text files using UTF-8 (or ASCII) encoding to 8 | +# avoid potential issues on the BSDs where the locale is not always set. 9 | + 10 | +EXIT_CODE=0 11 | +OUTPUT=$(git grep " open(" -- "*.py" | grep -vE "encoding=.(ascii|utf8|utf-8)." | grep -vE "open\([^,]*, ['\"][^'\"]*b[^'\"]*['\"]")
ken2812221 commented at 1:35 PM on June 15, 2018:nit: I would prefer not to use
.to match"
MarcoFalke commented at 1:40 PM on June 15, 2018:It matches also
'. Imo every other character would be a syntax error in python, so should be fine.
practicalswift commented at 1:44 PM on June 15, 2018:@ken2812221 That is intentional to match
'and".
ken2812221 commented at 1:56 PM on June 15, 2018:Right, that makes sense.
ken2812221 approvedken2812221 commented at 1:36 PM on June 15, 2018: contributorutACK c8176b3
laanwj merged this on Jun 16, 2018laanwj closed this on Jun 16, 2018laanwj referenced this in commit a90ca4087a on Jun 16, 2018PastaPastaPasta referenced this in commit d1200755f1 on Jul 7, 2020zkbot referenced this in commit 43ac2062f9 on Oct 28, 2020zkbot referenced this in commit 84a5830aaa on Nov 9, 2020practicalswift deleted the branch on Apr 10, 2021UdjinM6 referenced this in commit 6f6067873b on May 5, 2021UdjinM6 referenced this in commit c0db4fddf0 on May 6, 2021gades referenced this in commit 3d5dbfb7d0 on Mar 11, 2022gades referenced this in commit 970f488c5c on May 2, 2022DrahtBot locked this on Aug 18, 2022LabelsLinked (view graph)
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-16 15:15 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me