Refactor: explicit VerifyScript control flow based on pattern matching

JeremyRubin commented at 1:06 am on May 7, 2019: contributor

This is an experimental refactor I wrote to help myself understand script validation logic better. I’m not strongly advocating for it as it touches consensus and I haven’t thoroughly verified it is identical, but I felt it was much clearer than the existing logic, so was worth sharing.

Essentially, the current VerifyScript logic processes all scripts down the same control flow path. This makes the code a bit hard to reason about as the logic for enabling features (e.g., segwit, p2sh, p2sh segwit) is all interleaved cleverly. Furthermore, the use of the stack between execution to communicate conditions for checks (e.g., clearstack) is confusing. In order for me to understand what it was doing and be convinced it is correct, I felt I should re-write it to be less clever in several small steps. This PR is the result of de-clevering this code.

In this PR I:

duplicate the entire verification logic for the main Script templates we know about.
Simplify the logic substantially for several types of script.
Get rid of swapping the stack copy and just running p2sh on a copy of the stack
Because we know the exact side effects of a SegWit program, we are able to avoid executing the segwit script at all and directly run the segwit program. (We could do the same trick for P2SH, but it is a bigger refactor).
Moved assertions about flag combinations to be unconditional
Made a few checks into asserts where we know (based on script type) it will never fail

The code could be further simplified with a bit more duplication (currently the classic script verification code is used in multiple modes), but at current I think it is a nice balance between repetition and clarity.

In the future if new flags are introduced, they should hopefully only be affecting witness programs and not scripts. If this logic needed to be amended, it should be (IMO) easier to patch this version of the function than the current one.

I haven’t benchmarked, but skipping a few extra EvaluateScripts for segwit probably doesn’t hurt.

The Diff for this PR kinda sucks, I recommend just looking at the files side-by-side.

Refactor: explicit VerifyScript control flow based on pattern matching script types b6198514c8

fanquake added the label Validation on May 7, 2019

DrahtBot commented at 2:18 am on May 7, 2019: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#14696 (qa: Add explicit references to related CVE’s in p2p_invalid_block test. by lucash-dev)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

JeremyRubin commented at 4:26 am on May 7, 2019: contributor

If #13062 goes ahead (it’s been sitting for a while) it doesn’t look to hard to rebase either PR.

jnewbery commented at 2:36 pm on May 7, 2019: member

You say in the opening paragraph that you’re not advocating for this PR and haven’t verified that it’s consensus compatible with the existing code. Can you explain what your motivation is for opening this? What are you expecting from other contributors?

A good general principal is that we shouldn’t change consensus code unless there are very obvious benefits, and if we do have to make changes, the review bar should be very high indeed. This PR clearly doesn’t offer those benefits or meet that bar. Much of the feedback in #14837 also applies here.

Concept NACK

ryanofsky commented at 3:22 pm on May 7, 2019: member

Can you explain what your motivation is for opening this? What are you expecting from other contributors?

I think it’s explained pretty well, and I don’t think anything is being expected of you if you aren’t interested in this. People can look at this if they are interested, and if they think they think the new code is an improvement over the old code they can say “hey this new code is really great” and maybe make an argument for merging it, or have a discussion, or find problems we didn’t previously know about. Not every PR has to be merged, and PRs that aren’t merged can still be useful for other reasons.

MarcoFalke added the label Brainstorming on May 7, 2019

JeremyRubin commented at 6:01 pm on May 7, 2019: contributor

Well, the idea is that I wanted to understand how this part of the code was working and it was needlessly more complicated than it should be.

In order to understand it better, I refactored it into something simpler. I opened the PR because if anyone in the future has an interest in looking at this logic, perhaps because they are adding a new script flag, I would point them towards perhaps doing this refactor first if the logic becomes especially complex.

Having a PR open means that if someone touches this function DrahBot will automatically warn the conflict, allowing a reviewer to check it out as well even if they don’t search.

Another possible outcome from this would be to turn this version of the code into a comment, which details that the semantics of the following code should be equivalent to the comment flow.

Another benefit of this, and a reason why you should spend your time to review it, is that it passes the test suite. If you think there is a risk that this code couldn’t have identical semantics, you could write a test demonstrating that which would help protect us from errors made down the line when there are changes adding a feature rather than behavior less refactoring.

The same is true for benchmarking – I don’t necessarily suspect that the current version is slow, but perhaps now this gives me a reason to write a benchmark which can test master as well (I didn’t write this one with performance in mind, just clarity). That benchmark can help prevent regression in the future.

jnewbery commented at 1:57 pm on May 8, 2019: member

Not every PR has to be merged, and PRs that aren’t merged can still be useful for other reasons.

I think that’s where we disagree. A pull request is a request for the branch to be pulled into master.

If this branch is intended as educational or illustrative, then it makes sense to be maintained as a branch in @JeremyRubin’s repo, rather than as a PR in the bitcoin repo. Having PRs open which are never meant to be merged adds cognitive overhead to all maintainers and reviewers.

jb55 commented at 7:18 am on May 14, 2019: member

I think this PR would be more interesting if your changes were split into separate commits so they could be reviewed and verified independently. In the current state it’s really hard to verify the correctness of each change simulataneously.

I don’t think it was a waste of time. I find PRs like these give software verifiers a chance to run state machine tests over changes like these, to sharpen their skills, techniques, and to illuminate unexpected errors for future refactorings. but yeah it would be more useful if it was broken up a bit.

sipa commented at 4:05 pm on May 14, 2019: member

I suggested a few years ago that we could have two independent script implementations in the codebase (1) a stripped-down version of the current code, only for use in consensus logic, which is rarely touched and (2) a feature-rich one that undergoes refactoring and addition of features as needed for everything else. There would obviously be tests to (aim to) verify correspondence between the two, but from an assurance perspective, we would not know for sure that they exhibit the same behavior in all cases. Bugs in the second would only affect its users, rather than risk forking the network however.

This is the sort of change that would make perfect second in (2), but not in (1).

JeremyRubin commented at 7:07 pm on May 14, 2019: contributor

@jb55 have at it! https://github.com/bitcoin/bitcoin/compare/master...JeremyRubin:explicit-verifyscript-reunsquash?expand=1 This is more or less the same end state (modulo typos/formatting errors) done in very small digestible steps. (I reproduced this from scratch, not a branch that existed prior). Each commit should be able to run and pass all tests. @sipa I agree that it would be good to produce an untouched version that stays the same for a very long time – I suppose where we might differ in opinion is that I think that this might be a good candidate for that freeze/reference logic ahead of the fork to a more feature-ful version. I’m not what kind of features you mean though, given that they would probably impact consensus at the end of the day.

sipa commented at 7:35 pm on May 14, 2019: member

By features I mean for example all the logic in CScript to actually construct scripts. Not anything affecting consensus.

And I certainly don’t think this is the kind of change that’s worth making to consensus logic; making code easier to read is far inferior in priority to guaranteeing consistency.

JeremyRubin commented at 9:54 pm on May 14, 2019: contributor

We don’t disagree that the goal is consistency.

My view is that, as noted above, this version is superior in maintaining consistency with future versions/features added.

I.e., if we add further complexity to this code for handling a new case or flag we will have an increased likelihood of those code paths causing an incompatibility if the flag is added against the old version than against a refactored version. E.g., if the current version is 0.20 and 0.21 makes a change that is incompatible with version 0.20 and earlier that is worse than a change which is compatible with 0.20 and incompatible with 0.19. I doubt that in the event of a catastrophic fork error anyone is going to take “whatever Bitcoin 0.8 syncs to” as the ground truth.

What’s further nice about this version is that because SegWit is pattern matched, an incompatible future change in script processing (or existing bug) has segwit’s behavior isolated, reducing the surface of a change to EvalScript’s logic having an impact on SegWit detection, even at the expense of consistency. I’d personally – and I understand this may not be a consensus view – rather have old nodes fork off if SegWit had some specification error in it which caused old interpreters to treat SegWit programs as invalid spends. I admit that’s a cherry picked example – it’s entirely possible the new code would make something previously spendable unspendable.

Again, I’m not strongly advocating that we need to change the code to be this way. But I think it might be worth it to do something like this in advance of further flags or complexities being added which bear an equal if not greater risk of breaking consensus.

JeremyRubin commented at 10:38 pm on May 14, 2019: contributor

addendum:

I also wanted to point out that an alternative approach to this refactor would be to take the past revisions to this logic and structure it as follows:

0if SEGWIT:
1   code from when SegWit activated's script processing
2if P2SH:
3   code from when p2sh activated script processing
4if ORIGINAL:
5   code from original interpreter, no extensions

This structure is actually, imo, more likely to be consistent with earlier versions than the interleaved style as you can clearly see the exact flow that will be taken for each flags combo and verify that it matches the old version.

This is obviously less DRY code, but given that it’s consensus, preferring consistency is better than clever interleavings.

sdaftuar commented at 1:01 am on May 17, 2019: member

As I’ve commented elsewhere many times before, I think a PR like this is way too high risk to be worth contemplating actually merging. But if there are things you’ve learned from your effort that would be helpful to document for others, then I think opening a PR that just adds better comments to the existing logic and justifies its correctness could be helpful.

laanwj commented at 1:21 pm on June 7, 2019: member

There seems to be agreement not to make this change (on the verification/consensus side), and I tend to agree it’s too risky, so I’m closing this.

laanwj closed this on Jun 7, 2019

DrahtBot locked this on Feb 15, 2022

Refactor: explicit VerifyScript control flow based on pattern matching #15969

Conflicts