contrib: Improve verify-commits.py to work with maintainers leaving #27058

pull achow101 wants to merge 5 commits into bitcoin:master from achow101:improve-verify-commits changing 6 files +45 −876
  1. achow101 commented at 7:17 pm on February 7, 2023: member

    Currently the verify-commits.py script does not work well with maintainers giving up their commit access. If a key is removed from trusted-keys, any commits it signed previously will fail to verify, however keys cannot be kept in the list as it would allow that person to continue to push new commits. Furthermore, the trusted-keys used depends on the working tree which verify-commits.py itself may be modifying. When the script is run, the trusted-keys may be the one that is intended to be used, but the script may change the tree to a different commit with a different trusted-keys and use that instead!

    To resolve these issues, I’ve updated verify-commits.py to load the trusted-keys file and check the keys itself rather than delegating that to gpg.sh (which previously read in trusted-keys). This avoids the issue with the tree changing.

    I’ve also updated the script so that it stops modifying the tree. It would do this for the clean merge check where it would checkout each individual commit and attempt to reapply the merges, and then checking out the commit given as a cli arg. git merge-tree lets us do basically that but without modifying the tree. It will give us the object id for the resulting tree which we can compare against the object id of the tree in the merge commit in question. This also appears to be quite a bit faster.

    Lastly I’ve removed all of the exception commits in allow-revsig-commits, allow-incorrect-sha512-commits, and allow-unclean-merge-commits since all of these predate the commits in trusted-git-root and trusted-sha512-root. I’ve also updated the script to skip verification of commits that predate trusted-git-root, and skip sha512 verification for those that predate trusted-sha512-root.

  2. DrahtBot commented at 7:17 pm on February 7, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK Sjors
    Concept ACK glozow

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot added the label Scripts and tools on Feb 7, 2023
  4. in contrib/verify-commits/verify-commits.py:162 in 61f21001ac outdated
    205-            recreated_tree = subprocess.check_output([GIT, 'show', '--format=format:%T', 'HEAD']).decode('utf8').splitlines()[0]
    206+            recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
    207             if current_tree != recreated_tree:
    208                 print("Merge commit {} is not clean".format(current_commit), file=sys.stderr)
    209-                subprocess.call([GIT, 'diff', current_commit])
    210-                subprocess.call([GIT, 'checkout', '--force', '--quiet', branch])
    


    brunoerg commented at 8:04 pm on February 7, 2023:
    If you’re removing this line, so there is no need to have branch var anymore?

    achow101 commented at 9:14 pm on February 7, 2023:
    Removed.
  5. achow101 force-pushed on Feb 7, 2023
  6. in contrib/verify-commits/trusted-keys:1 in 15d9fc0933 outdated
    0@@ -1,6 +1,7 @@
    1-71A3B16735405025D447E8F274810B012346C9A6
    2-B8B3F1C0E58C15DB6A81D30C3648A882F4316B9B
    3-E777299FC265DD04793070EB944D35F9AC3DB76A
    4-D1DBF2C4B96F2DEBF4C16654410108112E7EA81F
    5-152812300785C96444D3334D17565732E08E5E41
    6-6B002C6EA3F91B1B0DF0C9BC8F617F1200A6D25C
    7+key,since,until
    


    maflcko commented at 9:54 am on February 8, 2023:

    Not sure if it makes sense to document the last merge commit here.

    Keeping the key will also mean that the script will break again once the key naturally expires or is revoked. So I guess you’d also have to document a faketime where the script still passes? Moreover, the key might be purged from keyservers if it is expired/revoked, so you may also have to document the full key here. Not sure if we want this.

    It seems easier to just bump the trusted root, like it was done before, see d4b3dc5b0a726cc4cc7a8467be43126e78f841cf

    The trusted root hash commits to all previous commits, so there shouldn’t be any downside, compared to listing a subset of commits prior to the trusted root hash.

    Of course anyone is free to archive previous keys as long as they want and re-run gpg on previous commits, with or without faketime, as often as they want. But doing that once as part of the review process of bumping the trusted root should be enough. I don’t see a need to maintain every key forever.

    Overall I think, the script should ideally be easy to understand and use, so that many users can read and use it. It is already hard enough to properly run the script (just one example: #25197 (comment) ; happy to provide more examples), so adding even more complexity and failure modes will mean that likely no one is running it in practice, let alone understand it and be able to act on errors if there are any.


    achow101 commented at 3:43 pm on February 8, 2023:

    For me, the question was how comfortable are we with a really recent trusted root? Suppose an active maintainer needs to be removed on very short notice, e.g. keys and account is compromised. We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that? With this change, we could just set the until commit rather than removing the key.

    My thinking was that the trusted root shouldn’t be updated ever, and really we should make it possible to move it even earlier so that older commits signed by previous maintainers can still be verified. But maybe that’s not how people think of verify-commits.


    maflcko commented at 4:01 pm on February 8, 2023:

    But maybe that’s not how people think of verify-commits.

    Yeah opinions may differ on this. Though, I think there is no easy way out of manually reviewing (and manually taking over! [1]) changes to this script and the data it uses. So optimizing for an easy and repeatable review process should be priority, to allow as many people to repeat it as possible. Otherwise, if the review process is different for each occasion, it is impossible to document and hard to follow. I do think it probably makes sense to allow for one-off rare exceptions (unclean merge, unsigned merge, wrong sha512, …), which can be used during the review process of bumping the trusted root, but I think will complicate stuff if they are needed to be maintained for the whole history at the HEAD commit forever. (Note that anyone can check out an earlier version to get them for a previous commit, even if they are removed from HEAD)

    We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that?

    I’d say yes. There is no way out of doing a change on short notice. And then, whether reviewers review the until commit, or the trusted root, which will be equal to the until commit, shouldn’t make a difference?

    [1] Blindly checking out the script on the fetched/downloaded/merged git commit is next to useless, unless you are the CI doing a smoke test.


    achow101 commented at 5:07 pm on February 10, 2023:
    I’ve dropped this commit for now.
  7. achow101 force-pushed on Feb 10, 2023
  8. Sjors commented at 1:31 pm on February 11, 2023: member

    I like the CSV approach in c090a7ff86257411761e22d3642d0e8e05859b73 better. See inline discussion and here. I don’t like having to update the root trusted commit every time the maintainer list changes.

    Imo we should only bump it if a former maintainer revokes their key or it expires, plus maybe every couple of years to preempt such issues. As @MarcoFalke suggested people can do this with git checkout "$(cat contrib/verify-commits/trusted-git-root)~1" (and override verification script with the new version). But I’d rather avoid that friction.

    One heuristic for when it’s ok to change the root trusted commit could be the branch-off commit for the oldest release branch that we still support backports to. @MarcoFalke wrote:

    What would the alternative be? Listing hundreds or thousands of “revsig” commits in a file, to ensure it is impossible to review manually, only with special git commands, potentially making it trivial to sneak in malicious commits that are not actually revsig commits? And then, as you say, bump the root anyway later.

    This script doesn’t protect against maintainer collusion. But doesn’t the CSV remove the need for revsig commits?

    Tested with contrib/verify-commits/verify-commits.py HEAD~4. It indeed seems a bit faster. Still have to test some other things.

  9. maflcko commented at 1:45 pm on February 11, 2023: member

    I don’t like having to update the root trusted commit every time the maintainer list changes.

    You don’t have to. It is only one possible trusted git root. Everyone using this script will have to pick their root themselves in their own copy of this script anyway. They are free to not touch their script after the root is bumped and continue using the previous script+data; they are free to pick the latest data from the master branch, once and if they agree with it; they are also free to pick any other trusted git root according to their own needs (this is what I do).

    In fact, everyone using the same trusted root (or encouraging people to do so via a workflow) makes it trivial for a single merge commit to short-circuit out all history (by treating all of history as one pull request), without the script even noticing (unless you are using a different root).

  10. achow101 commented at 4:32 pm on February 11, 2023: member

    But doesn’t the CSV remove the need for revsig commits?

    It could. The CSV could have a field that indicates the key is revoked, so we could then allow revsigs only for that key. Similar with expired keys.

  11. kristapsk commented at 4:52 pm on February 11, 2023: contributor
    Maybe it would make sense to use OpenTimestamps to verify timestamps (dates) of merge commits?
  12. Sjors commented at 2:39 pm on February 12, 2023: member
    @kristapsk maybe not for this PR, however, that might be a good way to make it safe to use a revocation date (we can’t trust the date in a git commit). The rule would then be that a timestamp must exist with a median time past before that date.
  13. DrahtBot added the label Needs rebase on Feb 16, 2023
  14. verify-commits: Move trusted-keys valid sig check into verify-commits itself
    Instead of having gpg.sh check against the trusted keys for a valid
    signature, do it inside of verify-commits itself.
    
    This also allows us to use the same trusted-keys throughout the
    verify-commits.py check rather than it possibly being modified during
    the clean merge check.
    53b07b2b47
  15. verify-commits: Remove all allowed commit exceptions
    These commits predate the current trusted root.
    76923bfa09
  16. verify-commits: Use merge-tree in clean merge check 5497c14830
  17. verify-commits: Skip checks for commits older than trusted roots bb86887527
  18. achow101 force-pushed on Feb 16, 2023
  19. DrahtBot removed the label Needs rebase on Feb 16, 2023
  20. fanquake added this to the milestone 25.0 on Feb 20, 2023
  21. Sjors commented at 2:45 pm on February 21, 2023: member

    For testing purposes, I tried rebasing the PR on master, removing @MarcoFalke’s key without updating the trusted root and instead adding revsigs:

    0git log --format="%H %GK" --merges $(cat contrib/verify-commits/trusted-git-root)..master | grep -E "CE2B75697E69A548" | cut -c -40
    

    It seems to ignore them though, because contrib/verify-commits/verify-commits.py HEAD~4 fails: No parent of 75f0e0b607cd7ff7afd56853eb34a2b285b22ad2 was signed with a trusted key!. I checked that the first parent it lists is in allow-revsig-commits.

    Other stuff does seem to work (per bb86887527d817ee2a015863ddf3541dac42080f), although I didn’t test very thoroughly: if I mess with a random commit in history, it will fail complain about that commits that weren’t signed. Even if I whitelist myself, the sha256 tree will complain, as it should.

    We should merge this before #27135 for easier verification of the past ~year.

  22. Sjors commented at 3:05 pm on February 21, 2023: member

    Ah wait, I’m misunderstanding what allow-revsig-commits does. It merely allows a signature fron a revoked key. Not a key that we removed from the trusted list, but a revoked PGP key (someone verifying signatures may or may not have an up to date view of revoked keys).

    So the only way to verify earlier history is to check out the trusted root commit, copy the most recent verify-commits.py and run it again. That way it will use the trusted keys at the time.

    I suggest we merge this and then later rethink how we want to handle changing maintainers.

    tACK bb86887527d817ee2a015863ddf3541dac42080f

  23. verify-commits: Mention git v2.38.0 requirement 14fac808bd
  24. in contrib/verify-commits/verify-commits.py:166 in 5497c14830 outdated
    162@@ -164,15 +163,11 @@ def main():
    163         allow_unclean = current_commit in unclean_merge_allowed
    164         if len(parents) == 2 and check_merge and not allow_unclean:
    165             current_tree = subprocess.check_output([GIT, 'show', '--format=%T', current_commit]).decode('utf8').splitlines()[0]
    166-            subprocess.call([GIT, 'checkout', '--force', '--quiet', parents[0]])
    167-            subprocess.call([GIT, 'merge', '--no-ff', '--quiet', '--no-gpg-sign', parents[1]], stdout=subprocess.DEVNULL)
    168-            recreated_tree = subprocess.check_output([GIT, 'show', '--format=format:%T', 'HEAD']).decode('utf8').splitlines()[0]
    169+            recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
    


    maflcko commented at 3:23 pm on February 21, 2023:

    Does anyone know which merge strategy this uses? I couldn’t find anything at https://git-scm.com/docs/git-merge-tree

    See also:


    achow101 commented at 5:46 pm on February 21, 2023:
    Looking at the source, I believe it uses ort.

    maflcko commented at 5:51 pm on February 21, 2023:

    So the minimum git version will be 2.38? https://github.com/git/git/commit/1f0c3a29da3515d88537902cd267cc726020eea5

    Might be good to test on commit 0cfbb171bd6f61a4edba913e281553b9bdf08d4a


    Sjors commented at 6:43 pm on February 21, 2023:

    It seems so. If I remove the Homebrew version of Git (2.39.2) and fall back to Apple’s default 2.37.1 the verification script fails (on any commit)

     0 % contrib/verify-commits/verify-commits.py 0cfbb17
     1Using verify-commits data from /Users/sjors/dev/bitcoin/contrib/verify-commits
     2usage: git merge-tree <base-tree> <branch1> <branch2>
     3Traceback (most recent call last):
     4  File "contrib/verify-commits/verify-commits.py", line 191, in <module>
     5    main()
     6  File "contrib/verify-commits/verify-commits.py", line 181, in main
     7    recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
     8  File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 411, in check_output
     9    **kwargs).stdout
    10  File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 512, in run
    11    output=stdout, stderr=stderr)
    12subprocess.CalledProcessError: Command '['git', 'merge-tree', '100949af0e2551f22c02a73355f2c64710b68ef1', 'a60d9eb9e6b6a272a3fca8981d89a55955dced55']' returned non-zero exit status 129.
    

    achow101 commented at 9:58 pm on February 21, 2023:
    I think it is reasonable to require that. I’m sure the people who would run this script can figure that out. I’ll add a note to the docs.
  25. Sjors commented at 11:18 am on February 22, 2023: member
    ACK 14fac808bd6c12bce121011bbf50501960c7326f
  26. fanquake requested review from glozow on Feb 23, 2023
  27. fanquake requested review from hebasto on Feb 23, 2023
  28. hebasto commented at 4:22 pm on February 23, 2023: member

    This part in the PR description

    I’ve changed trusted-keys from just a list of key ids to a csv file with each row being the key id, and the earliest and latest commits the key is allowed to have signed.

    looks outdated.

  29. achow101 commented at 4:46 pm on February 23, 2023: member

    looks outdated.

    Updated the description

  30. petertodd commented at 5:01 pm on February 23, 2023: contributor

    @Sjors

    The rule would then be that a [OpenTimestamps] timestamp must exist with a median time past before that date.

    The median time past is not the correct way to interpret a Bitcoin block time for the purpose of timestamping. The problem is median time past is in the past; a timestamp proof is a stronger statement if it’s backdated, not weaker.

    Instead, the OpenTimestamps client rounds off timestamps to the nearest day in the local timezone to give users the right impression about the accuracy of OTS proofs, without getting into the UI complexity of having timestamps with dates apparently in the future. For an automated tool, adding a day to the block time and interpreting the timestamp as proof that some data existed prior to that point in time could be a reasonable approach.

  31. Sjors commented at 6:28 pm on February 23, 2023: member
    @petertodd in practice things should be fine if there’s at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there’s a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.
  32. petertodd commented at 7:07 pm on February 23, 2023: contributor

    @petertodd in practice things should be fine if there’s at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there’s a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.

    My comment is about what standard an automated tool should apply. Obviously, if some humans are looking at it and manually updating a CSV, there’s a lot of flexibility and people can use their judgement.

  33. glozow commented at 5:55 pm on February 24, 2023: member
    Concept ACK 14fac808bd6c12bce121011bbf50501960c7326f Makes sense to me to use the same trusted-keys throughout, remove commit exceptions that are never used, use merge-tree instead of checkout/merge, and skip if commit is older than trusted root.
  34. glozow merged this on Feb 27, 2023
  35. glozow closed this on Feb 27, 2023

  36. maflcko commented at 1:37 pm on February 27, 2023: member

    This will break the script on all operating systems except the rolling ones like rawhide and tumbleweed?

    See also:

    subprocess.CalledProcessError: Command ‘[‘git’, ‘merge-tree’, ‘be2e748f378fc9ed40593a723dd18f2528705956’, ‘14fac808bd6c12bce121011bbf50501960c7326f’]’ returned non-zero exit status 129.

    https://cirrus-ci.com/task/4961884550463488?logs=lint#L242

  37. glozow commented at 1:56 pm on February 27, 2023: member
    iiuc reverting 5497c1483097a9b582ef78089a2ce1101b7d722e would make it go away though keeping merge-tree would be nice. Is bumping the lint container to something that gets 2.38 an option for CI?
  38. maflcko commented at 2:00 pm on February 27, 2023: member
    Yeah, should be easy to temporarily bump CI to ubuntu:devel. Though, I wonder what users are supposed to do?
  39. glozow commented at 2:23 pm on February 27, 2023: member
    Would it be unreasonable for users to add a ppa and install that way? Also temporary?
  40. sidhujag referenced this in commit 309fdc36b0 on Feb 27, 2023
  41. Sjors commented at 9:09 am on March 1, 2023: member
    I think it’s fine to have a temporary hoop here, as long as it’s documented.
  42. maflcko commented at 9:46 am on March 1, 2023: member
    Ok, fair enough. Also, I guess no one is forced to upgrade their script. Anyone is free to use the previous version of the python script as long as they want.
  43. bitcoin locked this on Feb 29, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me