-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Minimize vfExec counting in script handling #14245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| opcodetype opcode; | ||
| valtype vchPushValue; | ||
| std::vector<bool> vfExec; | ||
| bool fExec = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By definition, the first opcode must be executed because vfExec is empty and without any false item. So fExec must be true at the beginning
| { | ||
| while (pc < pend) | ||
| { | ||
| bool fExec = !count(vfExec.begin(), vfExec.end(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value of fExec may change only if an IF, NOTIF, ELSE, or ENDIF is encountered. No other opcode may flip fExec. So we don't need to redo it for every opcode, just after IF, NOTIF, ELSE, or ENDIF
| if (opcode == OP_NOTIF) | ||
| fValue = !fValue; | ||
| popstack(stack); | ||
| fExec = fValue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in OP_IF, if fExec is true, the original vfExec must be by definition all true. Since fValue is added to the end of vfExec, the new fExec value must be same as fValue
If fExec is false, a false will be added to vfExec, and by definition fExec will remain false, so we don't need to do anything
| if (vfExec.empty()) | ||
| return set_error(serror, SCRIPT_ERR_UNBALANCED_CONDITIONAL); | ||
| vfExec.back() = !vfExec.back(); | ||
| fExec = fExec ? false : !count(vfExec.begin(), vfExec.end(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in OP_ELSE, if fExec is true, the original vfExec must be by definition all true. Since OP_ELSE will flip the last vfExec item, the resulting vfExec must have exactly one false at the end. So we know fExec must become false.
If fExec is false, there is at least one false in vfExec but we don't know where it is, so we have to count as usual (fExec will flip if and only if the original vfExec had exactly one false at the end)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::count?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could improve by using std::find since the actual count is not necessary?
| if (vfExec.empty()) | ||
| return set_error(serror, SCRIPT_ERR_UNBALANCED_CONDITIONAL); | ||
| vfExec.pop_back(); | ||
| fExec = fExec ? true : !count(vfExec.begin(), vfExec.end(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in OP_ENDIF, if fExec is true, the original vfExec must be by definition all true. Since OP_ENDIF will simply remove the last vfExec item, the resulting vfExec must not have any false. So we know fExec must remain true.
If fExec is false, there is at least one false in vfExec but we don't know where it is, so we have to count as usual (fExec will flip if and only if the original vfExec had exactly one false at the end)
| opcodetype opcode; | ||
| valtype vchPushValue; | ||
| std::vector<bool> vfExec; | ||
| bool fExec = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2018-09-20 01:47:27 cppcheck(pr=14245): [src/script/interpreter.cpp:297]: (style) The scope of the variable 'fExec' can be reduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but a few variables there, like nOpCount, can be reduced as well
|
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
| The last travis run for this pull request was 231 days ago and is thus outdated. To trigger a fresh travis build, this pull request should be closed and re-opened. |
|
This has been open for a year, and received no conceptual review / interest (other than nits). Closing as "Up for grabs". |
|
Oh, I completely forgot about this PR's existence. #16902 is a slightly more advanced version of this. |
e6e622e Implement O(1) OP_IF/NOTIF/ELSE/ENDIF logic (Pieter Wuille) d0e8f4d [refactor] interpreter: define interface for vfExec (Anthony Towns) 89fb241 Benchmark script verification with 100 nested IFs (Pieter Wuille) Pull request description: While investigating what mechanisms are possible to maximize the per-opcode verification cost of scripts, I noticed that the logic for determining whether a particular opcode is to be executed is O(n) in the nesting depth. This issue was also pointed out by Sergio Demian Lerner in https://bitslog.wordpress.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/, and this PR implements a variant of the O(1) algorithm suggested there. This is not a problem currently, because even with a nesting depth of 100 (the maximum possible right now due to the 201 ops limit), the slowdown caused by this on my machine is around 70 ns per opcode (or 0.25 s per block) at worst, far lower than what is possible with other opcodes. This PR mostly serves as a proof of concept that it's possible to avoid it, which may be relevant in discussions around increasing the opcode limits in future script versions. Without it, the execution time of scripts can grow quadratically with the nesting depth, which very quickly becomes unreasonable. This improves upon #14245 by completely removing the `vfExec` vector. ACKs for top commit: jnewbery: Code review ACK e6e622e MarcoFalke: ACK e6e622e 🐴 fjahr: ACK e6e622e ajtowns: ACK e6e622e laanwj: concept and code review ACK e6e622e jonatack: ACK e6e622e code review, build, benches, fuzzing Tree-SHA512: 1dcfac3411ff04773de461959298a177f951cb5f706caa2734073bcec62224d7cd103767cfeef85cd129813e70c14c74fa8f1e38e4da70ec38a0f615aab1f7f7
e6e622e Implement O(1) OP_IF/NOTIF/ELSE/ENDIF logic (Pieter Wuille) d0e8f4d [refactor] interpreter: define interface for vfExec (Anthony Towns) 89fb241 Benchmark script verification with 100 nested IFs (Pieter Wuille) Pull request description: While investigating what mechanisms are possible to maximize the per-opcode verification cost of scripts, I noticed that the logic for determining whether a particular opcode is to be executed is O(n) in the nesting depth. This issue was also pointed out by Sergio Demian Lerner in https://bitslog.wordpress.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/, and this PR implements a variant of the O(1) algorithm suggested there. This is not a problem currently, because even with a nesting depth of 100 (the maximum possible right now due to the 201 ops limit), the slowdown caused by this on my machine is around 70 ns per opcode (or 0.25 s per block) at worst, far lower than what is possible with other opcodes. This PR mostly serves as a proof of concept that it's possible to avoid it, which may be relevant in discussions around increasing the opcode limits in future script versions. Without it, the execution time of scripts can grow quadratically with the nesting depth, which very quickly becomes unreasonable. This improves upon bitcoin#14245 by completely removing the `vfExec` vector. ACKs for top commit: jnewbery: Code review ACK e6e622e MarcoFalke: ACK e6e622e 🐴 fjahr: ACK e6e622e ajtowns: ACK e6e622e laanwj: concept and code review ACK e6e622e jonatack: ACK e6e622e code review, build, benches, fuzzing Tree-SHA512: 1dcfac3411ff04773de461959298a177f951cb5f706caa2734073bcec62224d7cd103767cfeef85cd129813e70c14c74fa8f1e38e4da70ec38a0f615aab1f7f7
e6e622e Implement O(1) OP_IF/NOTIF/ELSE/ENDIF logic (Pieter Wuille) d0e8f4d [refactor] interpreter: define interface for vfExec (Anthony Towns) 89fb241 Benchmark script verification with 100 nested IFs (Pieter Wuille) Pull request description: While investigating what mechanisms are possible to maximize the per-opcode verification cost of scripts, I noticed that the logic for determining whether a particular opcode is to be executed is O(n) in the nesting depth. This issue was also pointed out by Sergio Demian Lerner in https://bitslog.wordpress.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/, and this PR implements a variant of the O(1) algorithm suggested there. This is not a problem currently, because even with a nesting depth of 100 (the maximum possible right now due to the 201 ops limit), the slowdown caused by this on my machine is around 70 ns per opcode (or 0.25 s per block) at worst, far lower than what is possible with other opcodes. This PR mostly serves as a proof of concept that it's possible to avoid it, which may be relevant in discussions around increasing the opcode limits in future script versions. Without it, the execution time of scripts can grow quadratically with the nesting depth, which very quickly becomes unreasonable. This improves upon bitcoin#14245 by completely removing the `vfExec` vector. ACKs for top commit: jnewbery: Code review ACK e6e622e MarcoFalke: ACK e6e622e 🐴 fjahr: ACK e6e622e ajtowns: ACK e6e622e laanwj: concept and code review ACK e6e622e jonatack: ACK e6e622e code review, build, benches, fuzzing Tree-SHA512: 1dcfac3411ff04773de461959298a177f951cb5f706caa2734073bcec62224d7cd103767cfeef85cd129813e70c14c74fa8f1e38e4da70ec38a0f615aab1f7f7
@SergioDemianLerner described a potential attack vector by exploiting the way how
vfExecis handled in script. Since thevfExecvector is scanned once for every opcode, a specially crafted script could scan up to 979K items, and it may take a couple seconds to validate a block packed with such scripts. Read more: https://bitslog.wordpress.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/The article suggested an O(1) algorithm to fix the problem. I'm trying to fix in a different way. Although it is not the optimal solution, the 5-line patch is very easy to review, and it can reduce the worst case from 979k to about 5k items to be scanned, a 99.49% reduction.
To make review easier, I'll make inline comments
EDIT: a regular block full of CHECKSIG might also take seconds to validate, so this consensus code fix may not be necessary. But anyway, review and comments are welcomed.