-
Notifications
You must be signed in to change notification settings - Fork 38.7k
util: Limit decimal range of numbers ParseScript accepts #18416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/core_read.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, here's an alternative shorter version for the same logic, without branches:
n = std::min(n, static_cast<int64_t>(std::numeric_limits<int32_t>::max()));
n = std::max(n, static_cast<int64_t>(std::numeric_limits<int32_t>::min()));
Not sure if it's really more readable though, personally I'd be okay with both versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::clamp in C++17 would be even better and less verbose, but unsure if we can use C++17 features ?
Note that min/max in the standard also do exactly one comparison each.
I can remove the if/else though if people think it fits better here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, std::clamp would indeed be a very nice option. Unfortunately, upgrading to C++17 has not happened yet (see this discussion: #16684), so it can't be used.
Sometimes substitutes for not-yet-available C++ features are implemented by contributors, like Span for the C++20 std::span by sipa (see #12886), don't know though if that is a desired way to go here, and if yes, if it should be packed in this PR. This should probably include identifiying all clamps throughout the codebase and using our newly introduced clamp function (so it can be replaced with std::clamp once we move to C++17).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'd be down to do that in a different PR. I'm looking for stuffs to do to improve my knowledge of the codebase anyway :)
I'll wait for some other people to confirm "yes this is a good idea, do a separate PR for std::clamp" before starting though
88bd812 to
c8e4db2
Compare
c8e4db2 to
86f79bb
Compare
|
It seems that the tests were already including values wider than 2^32 so I reckon there is 2 options :
The second option seemed the less intrusive, so I updated the PR to only forbid the value -2^63. I also added 2 tests reflecting that : before this PR, those tests were triggering the UB before failing. |
|
I think we should just fix UB in CScriptNum::serialize; that should be easy, and avoid all issues. In addition, I think we may want to restrict what range of numbers ParseScript accepts in decimal, because numbers outside -0xFFFFFFFF...0xFFFFFFFF are simply illegal in scripts (they don't work as numbers, so they should really be represented as hex strings instead if they're used at all). That has nothing to do with avoiding UB; it's just fixing weird behavior in ParseScript. |
|
OK, thanks. So. if I understand correctly, we do agree that the following tests in |
|
@pierreN They can be converted to hexadecimal. |
86f79bb to
c857b8c
Compare
|
Ow my bad, I was too focused on the I edited my branch accordingly. |
c857b8c to
08b392a
Compare
src/test/data/tx_invalid.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this version be kept as an example of an _invalid script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks. This is a good idea I think, I was hesitant to add more tests. If I understood you correctly, the tests are invalids here, so I guess you mean adding something like:
["Argument 2^32 in decimal (hence clamped to 0xffffffff) with nLockTime=2^32-2"],
[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "4294967296 CHECKLOCKTIMEVERIFY 1"]],
"0100000001000100000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000fffffffe", "P2SH,CHECKLOCKTIMEVERIFY"],
If this is correct, I guess I should add similar tests for other modified test cases too ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to add as many tests as you like, as long as all existing proper test cases also stay there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK great, I added 5 more tests in the branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that this shouldn't be silently clamped. The test here looks right. Why would it be safe to parse a script and silently change it while parsing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::runtime_error is thrown on other script parse errors in ParseScript(…). Shouldn't that be thrown also for this type of script parse error in ParseScript?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks, this is clearly a possibility.
Note that for now, no test trigger the already existing std::runtime_error. script_json_test doesn't handle exceptions, so it should be modified to catch them (maybe in a different PR then ?). Testcases will have to be modified to handle the exception too I guess.
(the fuzzer harness already handle exceptions well)
I'm not sure which possibility (clamping or throwing an error) fits more the style of the codebase. Maybe getting an ACK from @sipa for the exception first ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@practicalswift hm sorry but I guess it's better to ask first: if you return an std::runtime_error, how to describe the error in the JSON test file ?
I guess that adding a new value to the ScriptError_t enum is too intrusive.
Adding a special if/else case in script_test.cpp:ParseScriptError seems too hackish.
Reusing an existing ScriptError_t seems the less odd option - something like SCRIPT_ERR_BAD_OPCODE or SCRIPT_ERR_OP_COUNT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a need. (tx/script)_(valid/invalid) are testing the transaction and script validity logic - not the script parser.
If a particular test now causes an exception to be thrown, just write that test differently. It's still testing the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so as stated below, if a test contains 4294967296 CHECKLOCKTIMEVERIFY 1 or 999...999, this is a script parser error which should be covered in bitcoin-util-test.json, not here. I believe this resolve this issue ?
08b392a to
a83167a
Compare
|
What's the rationale here for saturating the number instead of raising an error? I generally prefer explicit parse errors on invalid input, at least. |
|
@laanwj Agreed. That would also be in line with how parsing errors are handled elsewhere in |
|
Yes, that's what I meant to suggest: fail if a number outside the valid range is present. Sorry if that was not clear. |
|
Slightly related: The fuzz tests already have the error condition in this parser here covered: https://marcofalke.github.io/btc_cov/fuzz.coverage/src/core_read.cpp.gcov.html#84 . However our unit and functional tests don't know how to handle this/have this uncovered: https://marcofalke.github.io/btc_cov/total.coverage/src/core_read.cpp.gcov.html#84 |
|
It looks like a test case can be added to:
|
|
OK so sorry now I'm a bit confused - as stated above, if we start handling exceptions in |
|
The exception is simply passed in an { "exec": "./bitcoin-tx",
"args": ["-create", "nversion=1foo"],
"return_code": 1,
"error_txt": "error: Invalid TX version requested",
"description": "Tests the check for invalid nversion value"
}, |
|
Ow, OK, thanks, I see. So this way if we return an exception, we can just add a test |
a83167a to
f3fb51d
Compare
src/test/data/tx_valid.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that using an exception allowed to raise new errors in tx_valid - before the clamped value also valided the test...
|
OK, I think I updated the branch as we discussed above ? If this PR is OK, I'll add a test case in |
maflcko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
src/core_read.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this needs a static cast?
| if(n > static_cast<int64_t>(0xffffffff) || n < -1*static_cast<int64_t>(0xffffffff)) { | |
| if (n > 0xffffffff || n < -0xffffffff) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second static_cast seems strictly necessary, see this example : https://godbolt.org/z/Ko33NG
The first one is superficial (implicit), I just thought that the symetry looked better. I can remove the first one if you prefer ?
edit : at least fixed whitespace style
edit 2 : also note that the LL suffix could be used but this would assume that long long = 64bits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicit is better than implicit :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with C++11 you can also get a safer way to achieve the same: int64_t{bla}. If the value doesn't fit in the integer type, it won't compile. Whereas with static_cast it would compile and give the wrong result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the branch accordingly,
c2c109f to
4cefaac
Compare
|
ACK 4cefaace020f406d1143cf1351988958ef360861 |
4cefaac to
b12b2fe
Compare
|
(absorbing an empty commit to re-launch failed CI) |
b12b2fe to
2ee8cc0
Compare
|
re-ACK 2ee8cc07292e62a1be47d09421f93d25ad24fbd4 |
src/core_read.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is <limits> still necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ow, thank you. No it isn't, just removed it.
2ee8cc0 to
9ab14e4
Compare
|
ACK 9ab14e4 |
|
Thanks ! |
|
I'm very sorry to only realize this now but |
|
Sorry, I don't follow, And both _atoi64 and strtoll both return It will additionally set an In anycase it's true that |
If that's the case this is actually OK. I thought they didn't have any guaranteed return value in that case. both LONG_MIN and LONG_MAX are out of the range so it will do so correctly…
Oh right! then you'd actually need |
|
OK, great, thanks for the code review ! |
dcda81c test: add coverage for script parse error in ParseScript (pierrenn) Pull request description: Follow up on this suggestion : #18416 (comment) This adds a test case to raise the `script parse error` in `ParseScript`. ACKs for top commit: instagibbs: utACK dcda81c Tree-SHA512: ae0ef2c00f34cee818c83582f190d5f4043159e922862f2b442b7b895b8ff3ca421533699247c12c367be77813b5205830a771cd47a18e8932807ccace2d6a1c
| ["0x0100", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY,MINIMALDATA", "UNKNOWN_ERROR", "CSV fails if stack top is not minimally encoded"], | ||
| ["0", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", "CSV fails if stack top bit 1 << 31 is set and the tx version < 2"], | ||
| ["4294967296", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", | ||
| ["0x050000000001", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you changing all the values when translating from dec to hex? It might be fine, but at least for locktime values there are different code paths depending on the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ow, good point. I double checked with the value returned by ParseScript in the tests before/after the PR - so that the behavior of the tests doesn't change at all (i.e. before this PR the script built by ParseScript from 4294967296 contained 0x050000000001)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, oops. My bad
Summary: This is a backport of Core [[bitcoin/bitcoin#18416 | PR18416]] Test Plan: `ninja && test/util/bitcoin-util-test.py` Reviewers: #bitcoin_abc, deadalnix Reviewed By: #bitcoin_abc, deadalnix Differential Revision: https://reviews.bitcoinabc.org/D8883
Following up on this suggestion : #18413 (comment), prevent the output of
atoi64in thecore_read.cpp:ParseScripthelper to send toCScriptNum::serializevalues wider than 32-bit.Since the
ParseScripthelper is only used by the tool defined inbitcoin-tx.cpp, this only prevents users to provide too much unrealistic values.