util: Limit decimal range of numbers ParseScript accepts #18416

pierreN · 2020-03-24T07:36:47Z

Following up on this suggestion : #18413 (comment), prevent the output of atoi64 in the core_read.cpp:ParseScript helper to send to CScriptNum::serialize values wider than 32-bit.

Since the ParseScript helper is only used by the tool defined in bitcoin-tx.cpp, this only prevents users to provide too much unrealistic values.

theStack · 2020-03-24T08:09:14Z

src/core_read.cpp

FWIW, here's an alternative shorter version for the same logic, without branches:

n = std::min(n, static_cast<int64_t>(std::numeric_limits<int32_t>::max())); n = std::max(n, static_cast<int64_t>(std::numeric_limits<int32_t>::min()));

Not sure if it's really more readable though, personally I'd be okay with both versions.

std::clamp in C++17 would be even better and less verbose, but unsure if we can use C++17 features ?

Note that min/max in the standard also do exactly one comparison each.

I can remove the if/else though if people think it fits better here.

Ah, std::clamp would indeed be a very nice option. Unfortunately, upgrading to C++17 has not happened yet (see this discussion: #16684), so it can't be used.
Sometimes substitutes for not-yet-available C++ features are implemented by contributors, like Span for the C++20 std::span by sipa (see #12886), don't know though if that is a desired way to go here, and if yes, if it should be packed in this PR. This should probably include identifiying all clamps throughout the codebase and using our newly introduced clamp function (so it can be replaced with std::clamp once we move to C++17).

Thanks, I'd be down to do that in a different PR. I'm looking for stuffs to do to improve my knowledge of the codebase anyway :)

I'll wait for some other people to confirm "yes this is a good idea, do a separate PR for std::clamp" before starting though

pierreN · 2020-03-24T15:35:36Z

It seems that the tests were already including values wider than 2^32 so I reckon there is 2 options :

remove/update tests to make them fit into a 2^32
just prevent the UB value instead

The second option seemed the less intrusive, so I updated the PR to only forbid the value -2^63. I also added 2 tests reflecting that : before this PR, those tests were triggering the UB before failing.

sipa · 2020-03-24T16:14:28Z

I think we should just fix UB in CScriptNum::serialize; that should be easy, and avoid all issues.

In addition, I think we may want to restrict what range of numbers ParseScript accepts in decimal, because numbers outside -0xFFFFFFFF...0xFFFFFFFF are simply illegal in scripts (they don't work as numbers, so they should really be represented as hex strings instead if they're used at all). That has nothing to do with avoiding UB; it's just fixing weird behavior in ParseScript.

pierreN · 2020-03-24T23:57:43Z

OK, thanks.

So. if I understand correctly, we do agree that the following tests in src/test/data/script_tests.json are wrong and should be removed (most of them originally from 90320d6 ) ?

["549755813887", "SIZE 5 EQUAL", "P2SH,STRICTENC", "OK"],
["549755813888", "SIZE 6 EQUAL", "P2SH,STRICTENC", "OK"],
["9223372036854775807", "SIZE 8 EQUAL", "P2SH,STRICTENC", "OK"],

["-549755813887", "SIZE 5 EQUAL", "P2SH,STRICTENC", "OK"],
["-549755813888", "SIZE 6 EQUAL", "P2SH,STRICTENC", "OK"],
["-9223372036854775807", "SIZE 8 EQUAL", "P2SH,STRICTENC", "OK"],

["549755813887", "0x05 0xFFFFFFFF7F EQUAL", "P2SH,STRICTENC", "OK"],
["549755813888", "0x06 0x000000008000 EQUAL", "P2SH,STRICTENC", "OK"],
["9223372036854775807", "0x08 0xFFFFFFFFFFFFFF7F EQUAL", "P2SH,STRICTENC", "OK"],

["-549755813887", "0x05 0xFFFFFFFFFF EQUAL", "P2SH,STRICTENC", "OK"],
["-549755813888", "0x06 0x000000008080 EQUAL", "P2SH,STRICTENC", "OK"],
["-9223372036854775807", "0x08 0xFFFFFFFFFFFFFFFF EQUAL", "P2SH,STRICTENC", "OK"],

["4294967296", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", "CSV fails if stack top bit 1 << 31 is not set, and tx version < 2"],

sipa · 2020-03-25T01:36:16Z

@pierreN They can be converted to hexadecimal.

pierreN · 2020-03-25T04:06:26Z

Ow my bad, I was too focused on the atoi64 branch to see it that way, thanks.

I edited my branch accordingly.

maflcko · 2020-03-25T11:45:54Z

src/test/data/tx_invalid.json

Shouldn't this version be kept as an example of an _invalid script?

OK, thanks. This is a good idea I think, I was hesitant to add more tests. If I understood you correctly, the tests are invalids here, so I guess you mean adding something like:

["Argument 2^32 in decimal (hence clamped to 0xffffffff) with nLockTime=2^32-2"], [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "4294967296 CHECKLOCKTIMEVERIFY 1"]], "0100000001000100000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000fffffffe", "P2SH,CHECKLOCKTIMEVERIFY"],

If this is correct, I guess I should add similar tests for other modified test cases too ?

Feel free to add as many tests as you like, as long as all existing proper test cases also stay there

OK great, I added 5 more tests in the branch.

What I mean is that this shouldn't be silently clamped. The test here looks right. Why would it be safe to parse a script and silently change it while parsing?

std::runtime_error is thrown on other script parse errors in ParseScript(…). Shouldn't that be thrown also for this type of script parse error in ParseScript?

OK, thanks, this is clearly a possibility.

Note that for now, no test trigger the already existing std::runtime_error. script_json_test doesn't handle exceptions, so it should be modified to catch them (maybe in a different PR then ?). Testcases will have to be modified to handle the exception too I guess.

(the fuzzer harness already handle exceptions well)

I'm not sure which possibility (clamping or throwing an error) fits more the style of the codebase. Maybe getting an ACK from @sipa for the exception first ?

@practicalswift hm sorry but I guess it's better to ask first: if you return an std::runtime_error, how to describe the error in the JSON test file ?

I guess that adding a new value to the ScriptError_t enum is too intrusive.
Adding a special if/else case in script_test.cpp:ParseScriptError seems too hackish.
Reusing an existing ScriptError_t seems the less odd option - something like SCRIPT_ERR_BAD_OPCODE or SCRIPT_ERR_OP_COUNT ?

I don't think there is a need. (tx/script)_(valid/invalid) are testing the transaction and script validity logic - not the script parser.

If a particular test now causes an exception to be thrown, just write that test differently. It's still testing the same thing.

OK, so as stated below, if a test contains 4294967296 CHECKLOCKTIMEVERIFY 1 or 999...999, this is a script parser error which should be covered in bitcoin-util-test.json, not here. I believe this resolve this issue ?

laanwj · 2020-03-25T16:18:36Z

What's the rationale here for saturating the number instead of raising an error? I generally prefer explicit parse errors on invalid input, at least.

practicalswift · 2020-03-25T16:20:55Z

@laanwj Agreed. That would also be in line with how parsing errors are handled elsewhere in ParseScript(…): a std::runtime_error is thrown :)

sipa · 2020-03-25T16:34:06Z

Yes, that's what I meant to suggest: fail if a number outside the valid range is present. Sorry if that was not clear.

maflcko · 2020-03-25T17:09:08Z

Slightly related: The fuzz tests already have the error condition in this parser here covered: https://marcofalke.github.io/btc_cov/fuzz.coverage/src/core_read.cpp.gcov.html#84 . However our unit and functional tests don't know how to handle this/have this uncovered: https://marcofalke.github.io/btc_cov/total.coverage/src/core_read.cpp.gcov.html#84

maflcko · 2020-03-25T17:11:31Z

It looks like a test case can be added to:

src/test/script_tests.cpp and/or
test/util/data/bitcoin-util-test.json with binary bitcoin-tx

pierreN · 2020-03-25T17:30:52Z

OK so sorry now I'm a bit confused - as stated above, if we start handling exceptions in script_test.cpp, which ScriptError_t to have in the JSON ? Do you think this would be useful ?

maflcko · 2020-03-25T17:34:03Z

The exception is simply passed in an error_text in test/util/data/bitcoin-util-test.json. E.g.

  { "exec": "./bitcoin-tx",
    "args": ["-create", "nversion=1foo"],
    "return_code": 1,
    "error_txt": "error: Invalid TX version requested",
    "description": "Tests the check for invalid nversion value"
  },

pierreN · 2020-03-25T18:07:09Z

Ow, OK, thanks, I see.

So this way if we return an exception, we can just add a test ./src/bitcoin-tx -create outscript=0:9999999999 to check the parser (and have code coverage for the added exception) - hence the test in src/test/data/tx_invalid.json containing 4294967296 CHECKLOCKTIMEVERIFY 1 becomes useless since it would just trigger the same exception in the parser.

pierreN · 2020-03-25T18:53:26Z

src/test/data/tx_valid.json

Note that using an exception allowed to raise new errors in tx_valid - before the clamped value also valided the test...

pierreN · 2020-03-25T18:55:07Z

OK, I think I updated the branch as we discussed above ?

If this PR is OK, I'll add a test case in bitcoin-util-test.json for the second exception in ParseScript in another PR.

maflcko

Concept ACK

maflcko · 2020-03-25T18:54:40Z

src/core_read.cpp

Any reason this needs a static cast?

Suggested change

if(n > static_cast<int64_t>(0xffffffff) || n < -1*static_cast<int64_t>(0xffffffff)) {

if (n > 0xffffffff || n < -0xffffffff) {

The second static_cast seems strictly necessary, see this example : https://godbolt.org/z/Ko33NG

The first one is superficial (implicit), I just thought that the symetry looked better. I can remove the first one if you prefer ?

edit : at least fixed whitespace style

edit 2 : also note that the LL suffix could be used but this would assume that long long = 64bits

Explicit is better than implicit :)

I think with C++11 you can also get a safer way to achieve the same: int64_t{bla}. If the value doesn't fit in the integer type, it won't compile. Whereas with static_cast it would compile and give the wrong result.

Thanks, updated the branch accordingly,

test/util/data/bitcoin-util-test.json

sipa · 2020-03-25T20:30:52Z

ACK 4cefaace020f406d1143cf1351988958ef360861

pierreN · 2020-03-26T01:45:07Z

(absorbing an empty commit to re-launch failed CI)

sipa · 2020-03-26T22:47:50Z

re-ACK 2ee8cc07292e62a1be47d09421f93d25ad24fbd4

laanwj · 2020-03-27T06:48:48Z

src/core_read.cpp

Is <limits> still necessary?

Ow, thank you. No it isn't, just removed it.

laanwj · 2020-03-27T06:59:35Z

ACK 9ab14e4
(only difference is removing the limits include as suggested: https://github.com/bitcoin/bitcoin/compare/2ee8cc07292e62a1be47d09421f93d25ad24fbd4..9ab14e4d21c73d16d8d782f1576fe29e659e2a70)

pierreN · 2020-03-27T07:27:46Z

Thanks !

laanwj · 2020-03-27T08:19:53Z

I'm very sorry to only realize this now but atoi64 itself doesn't do bounds checking. So very large or small values outside the 64 bit range will still overflow the 64 bit or at least not raise an error conditions.
I think the appropriate function to use here is our own ParseUInt32.

pierreN · 2020-03-27T08:31:04Z

Sorry, I don't follow, atoi64 is:

int64_t atoi64(const std::string& str)
{
#ifdef _MSC_VER
    return _atoi64(str.c_str());
#else
    return strtoll(str.c_str(), nullptr, 10);
#endif
}

And both _atoi64 and strtoll both return LLONG_MAX or LLONG_MIN when receiving too big positive/negative values ?

It will additionally set an errno but since we check beforehand that we are parsing a number, there should not be any issue ?

In anycase it's true that ParseInt32 (since the value can be signed) could be more adapted since it checks for errno anyway. I could open a new PR for that if people prefer ?

laanwj · 2020-03-27T10:22:19Z

And both _atoi64 and strtoll both return LLONG_MAX or LLONG_MIN when receiving too big positive/negative values ?

If that's the case this is actually OK. I thought they didn't have any guaranteed return value in that case.

       The strtol() function returns the result of the conversion, unless the value would underflow or overflow.  If an underflow occurs, strtol() returns LONG_MIN.  If an overflow occurs,  str‐
       tol() returns LONG_MAX.

both LONG_MIN and LONG_MAX are out of the range so it will do so correctly…

In anycase it's true that ParseInt32 (since the value can be signed) could be more adapted since it checks for errno anyway. I could open a new PR for that if people prefer ?

Oh right! then you'd actually need ParseInt64: the value can be signed, and it's not in the range of a 32 bit signed integer (-0x80000000…0x7fffffff). But I think it's fine to keep it as-is then, phew.

pierreN · 2020-03-27T10:25:19Z

OK, great, thanks for the code review !

dcda81c test: add coverage for script parse error in ParseScript (pierrenn) Pull request description: Follow up on this suggestion : #18416 (comment) This adds a test case to raise the `script parse error` in `ParseScript`. ACKs for top commit: instagibbs: utACK dcda81c Tree-SHA512: ae0ef2c00f34cee818c83582f190d5f4043159e922862f2b442b7b895b8ff3ca421533699247c12c367be77813b5205830a771cd47a18e8932807ccace2d6a1c

maflcko · 2020-03-27T14:06:02Z

src/test/data/script_tests.json

 ["0x0100", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY,MINIMALDATA", "UNKNOWN_ERROR", "CSV fails if stack top is not minimally encoded"],
 ["0", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", "CSV fails if stack top bit 1 << 31 is set and the tx version < 2"],
-["4294967296", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME",
+["0x050000000001", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME",


Why are you changing all the values when translating from dec to hex? It might be fine, but at least for locktime values there are different code paths depending on the value.

Ow, good point. I double checked with the value returned by ParseScript in the tests before/after the PR - so that the behavior of the tests doesn't change at all (i.e. before this PR the script built by ParseScript from 4294967296 contained 0x050000000001)

Ah, oops. My bad

Summary: This is a backport of Core [[bitcoin/bitcoin#18416 | PR18416]] Test Plan: `ninja && test/util/bitcoin-util-test.py` Reviewers: #bitcoin_abc, deadalnix Reviewed By: #bitcoin_abc, deadalnix Differential Revision: https://reviews.bitcoinabc.org/D8883

theStack reviewed Mar 24, 2020

View reviewed changes

pierreN force-pushed the fix-parsescript-numop-overflow branch from 88bd812 to c8e4db2 Compare March 24, 2020 12:46

DrahtBot added the Tests label Mar 24, 2020

pierreN force-pushed the fix-parsescript-numop-overflow branch from c8e4db2 to 86f79bb Compare March 24, 2020 15:29

pierreN mentioned this pull request Mar 24, 2020

script: prevent UB when computing abs value for num opcode serialize #18413

Merged

pierreN force-pushed the fix-parsescript-numop-overflow branch from 86f79bb to c857b8c Compare March 25, 2020 04:05

pierreN changed the title ~~Prevent num op overflows in ParseScript() helper~~ Limit decimal range of numbers ParseScript accepts Mar 25, 2020

pierreN force-pushed the fix-parsescript-numop-overflow branch from c857b8c to 08b392a Compare March 25, 2020 04:35

maflcko reviewed Mar 25, 2020

View reviewed changes

pierreN force-pushed the fix-parsescript-numop-overflow branch from 08b392a to a83167a Compare March 25, 2020 15:29

pierreN force-pushed the fix-parsescript-numop-overflow branch from a83167a to f3fb51d Compare March 25, 2020 18:48

maflcko changed the title ~~Limit decimal range of numbers ParseScript accepts~~ util: Limit decimal range of numbers ParseScript accepts Mar 25, 2020

pierreN commented Mar 25, 2020

View reviewed changes

maflcko added Utils/log/libs and removed Tests labels Mar 25, 2020

maflcko approved these changes Mar 25, 2020

View reviewed changes

pierreN force-pushed the fix-parsescript-numop-overflow branch 2 times, most recently from c2c109f to 4cefaac Compare March 25, 2020 20:22

pierreN force-pushed the fix-parsescript-numop-overflow branch from 4cefaac to b12b2fe Compare March 26, 2020 01:44

pierreN force-pushed the fix-parsescript-numop-overflow branch from b12b2fe to 2ee8cc0 Compare March 26, 2020 20:56

pierreN mentioned this pull request Mar 27, 2020

test: Add coverage for script parse error in ParseScript #18447

Merged

laanwj reviewed Mar 27, 2020

View reviewed changes

Limit decimal range of numbers ParseScript accepts

9ab14e4

pierreN force-pushed the fix-parsescript-numop-overflow branch from 2ee8cc0 to 9ab14e4 Compare March 27, 2020 06:51

laanwj merged commit b53af72 into bitcoin:master Mar 27, 2020

maflcko reviewed Mar 27, 2020

View reviewed changes

practicalswift mentioned this pull request Apr 1, 2020

Some fuzzing crashes when running under ASan and/or UBSan #18046

Closed

MarkLTZ mentioned this pull request Apr 4, 2020

Bitcoin PR tracking litecoinz-core/litecoinz#14

Closed

azuchi added a commit to chaintope/bitcoinrb that referenced this pull request Jun 16, 2020

Update test data for bitcoin/bitcoin#18416

b73426b

bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022

	if(n > static_cast<int64_t>(0xffffffff) \|\| n < -1*static_cast<int64_t>(0xffffffff)) {
	if (n > 0xffffffff \|\| n < -0xffffffff) {

util: Limit decimal range of numbers ParseScript accepts #18416

util: Limit decimal range of numbers ParseScript accepts #18416

Uh oh!

Conversation

pierreN commented Mar 24, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pierreN Mar 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pierreN commented Mar 24, 2020

Uh oh!

sipa commented Mar 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pierreN commented Mar 24, 2020

Uh oh!

sipa commented Mar 25, 2020

Uh oh!

pierreN commented Mar 25, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pierreN Mar 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

practicalswift Mar 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pierreN Mar 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

laanwj commented Mar 25, 2020

Uh oh!

practicalswift commented Mar 25, 2020

Uh oh!

sipa commented Mar 25, 2020

Uh oh!

maflcko commented Mar 25, 2020

Uh oh!

maflcko commented Mar 25, 2020

Uh oh!

pierreN commented Mar 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

maflcko commented Mar 25, 2020

Uh oh!

pierreN commented Mar 25, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pierreN commented Mar 25, 2020

Uh oh!

maflcko left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pierreN Mar 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

pierreN Mar 24, 2020 •

edited

Loading

sipa commented Mar 24, 2020 •

edited

Loading

pierreN Mar 25, 2020 •

edited

Loading

practicalswift Mar 25, 2020 •

edited

Loading

pierreN Mar 25, 2020 •

edited

Loading

pierreN commented Mar 25, 2020 •

edited

Loading

pierreN Mar 25, 2020 •

edited

Loading

maflcko Mar 26, 2020 •

edited

Loading

laanwj commented Mar 27, 2020 •

edited

Loading

pierreN commented Mar 27, 2020 •

edited

Loading

laanwj commented Mar 27, 2020 •

edited

Loading

pierreN Mar 27, 2020 •

edited

Loading