-
Notifications
You must be signed in to change notification settings - Fork 38.8k
Refactor: explicit VerifyScript control flow based on pattern matching #15969
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
If #13062 goes ahead (it's been sitting for a while) it doesn't look to hard to rebase either PR. |
|
You say in the opening paragraph that you're not advocating for this PR and haven't verified that it's consensus compatible with the existing code. Can you explain what your motivation is for opening this? What are you expecting from other contributors? A good general principal is that we shouldn't change consensus code unless there are very obvious benefits, and if we do have to make changes, the review bar should be very high indeed. This PR clearly doesn't offer those benefits or meet that bar. Much of the feedback in #14837 also applies here. Concept NACK |
I think it's explained pretty well, and I don't think anything is being expected of you if you aren't interested in this. People can look at this if they are interested, and if they think they think the new code is an improvement over the old code they can say "hey this new code is really great" and maybe make an argument for merging it, or have a discussion, or find problems we didn't previously know about. Not every PR has to be merged, and PRs that aren't merged can still be useful for other reasons. |
|
Well, the idea is that I wanted to understand how this part of the code was working and it was needlessly more complicated than it should be. In order to understand it better, I refactored it into something simpler. I opened the PR because if anyone in the future has an interest in looking at this logic, perhaps because they are adding a new script flag, I would point them towards perhaps doing this refactor first if the logic becomes especially complex. Having a PR open means that if someone touches this function DrahBot will automatically warn the conflict, allowing a reviewer to check it out as well even if they don't search. Another possible outcome from this would be to turn this version of the code into a comment, which details that the semantics of the following code should be equivalent to the comment flow. Another benefit of this, and a reason why you should spend your time to review it, is that it passes the test suite. If you think there is a risk that this code couldn't have identical semantics, you could write a test demonstrating that which would help protect us from errors made down the line when there are changes adding a feature rather than behavior less refactoring. The same is true for benchmarking -- I don't necessarily suspect that the current version is slow, but perhaps now this gives me a reason to write a benchmark which can test master as well (I didn't write this one with performance in mind, just clarity). That benchmark can help prevent regression in the future. |
I think that's where we disagree. A pull request is a request for the branch to be pulled into master. If this branch is intended as educational or illustrative, then it makes sense to be maintained as a branch in @JeremyRubin's repo, rather than as a PR in the bitcoin repo. Having PRs open which are never meant to be merged adds cognitive overhead to all maintainers and reviewers. |
|
I think this PR would be more interesting if your changes were split into separate commits so they could be reviewed and verified independently. In the current state it's really hard to verify the correctness of each change simulataneously. I don't think it was a waste of time. I find PRs like these give software verifiers a chance to run state machine tests over changes like these, to sharpen their skills, techniques, and to illuminate unexpected errors for future refactorings. but yeah it would be more useful if it was broken up a bit. |
|
I suggested a few years ago that we could have two independent script implementations in the codebase (1) a stripped-down version of the current code, only for use in consensus logic, which is rarely touched and (2) a feature-rich one that undergoes refactoring and addition of features as needed for everything else. There would obviously be tests to (aim to) verify correspondence between the two, but from an assurance perspective, we would not know for sure that they exhibit the same behavior in all cases. Bugs in the second would only affect its users, rather than risk forking the network however. This is the sort of change that would make perfect second in (2), but not in (1). |
|
@jb55 have at it! https://github.com/bitcoin/bitcoin/compare/master...JeremyRubin:explicit-verifyscript-reunsquash?expand=1 This is more or less the same end state (modulo typos/formatting errors) done in very small digestible steps. (I reproduced this from scratch, not a branch that existed prior). Each commit should be able to run and pass all tests. @sipa I agree that it would be good to produce an untouched version that stays the same for a very long time -- I suppose where we might differ in opinion is that I think that this might be a good candidate for that freeze/reference logic ahead of the fork to a more feature-ful version. I'm not what kind of features you mean though, given that they would probably impact consensus at the end of the day. |
|
By features I mean for example all the logic in CScript to actually construct scripts. Not anything affecting consensus. And I certainly don't think this is the kind of change that's worth making to consensus logic; making code easier to read is far inferior in priority to guaranteeing consistency. |
|
We don't disagree that the goal is consistency. My view is that, as noted above, this version is superior in maintaining consistency with future versions/features added. I.e., if we add further complexity to this code for handling a new case or flag we will have an increased likelihood of those code paths causing an incompatibility if the flag is added against the old version than against a refactored version. E.g., if the current version is 0.20 and 0.21 makes a change that is incompatible with version 0.20 and earlier that is worse than a change which is compatible with 0.20 and incompatible with 0.19. I doubt that in the event of a catastrophic fork error anyone is going to take "whatever Bitcoin 0.8 syncs to" as the ground truth. What's further nice about this version is that because SegWit is pattern matched, an incompatible future change in script processing (or existing bug) has segwit's behavior isolated, reducing the surface of a change to EvalScript's logic having an impact on SegWit detection, even at the expense of consistency. I'd personally -- and I understand this may not be a consensus view -- rather have old nodes fork off if SegWit had some specification error in it which caused old interpreters to treat SegWit programs as invalid spends. I admit that's a cherry picked example -- it's entirely possible the new code would make something previously spendable unspendable. Again, I'm not strongly advocating that we need to change the code to be this way. But I think it might be worth it to do something like this in advance of further flags or complexities being added which bear an equal if not greater risk of breaking consensus. |
|
addendum: I also wanted to point out that an alternative approach to this refactor would be to take the past revisions to this logic and structure it as follows: This structure is actually, imo, more likely to be consistent with earlier versions than the interleaved style as you can clearly see the exact flow that will be taken for each flags combo and verify that it matches the old version. This is obviously less DRY code, but given that it's consensus, preferring consistency is better than clever interleavings. |
|
As I've commented elsewhere many times before, I think a PR like this is way too high risk to be worth contemplating actually merging. But if there are things you've learned from your effort that would be helpful to document for others, then I think opening a PR that just adds better comments to the existing logic and justifies its correctness could be helpful. |
|
There seems to be agreement not to make this change (on the verification/consensus side), and I tend to agree it's too risky, so I'm closing this. |
This is an experimental refactor I wrote to help myself understand script validation logic better. I'm not strongly advocating for it as it touches consensus and I haven't thoroughly verified it is identical, but I felt it was much clearer than the existing logic, so was worth sharing.
Essentially, the current VerifyScript logic processes all scripts down the same control flow path. This makes the code a bit hard to reason about as the logic for enabling features (e.g., segwit, p2sh, p2sh segwit) is all interleaved cleverly. Furthermore, the use of the stack between execution to communicate conditions for checks (e.g., clearstack) is confusing. In order for me to understand what it was doing and be convinced it is correct, I felt I should re-write it to be less clever in several small steps. This PR is the result of de-clevering this code.
In this PR I:
The code could be further simplified with a bit more duplication (currently the classic script verification code is used in multiple modes), but at current I think it is a nice balance between repetition and clarity.
In the future if new flags are introduced, they should hopefully only be affecting witness programs and not scripts. If this logic needed to be amended, it should be (IMO) easier to patch this version of the function than the current one.
I haven't benchmarked, but skipping a few extra EvaluateScripts for segwit probably doesn't hurt.
The Diff for this PR kinda sucks, I recommend just looking at the files side-by-side.