Skip to content

Conversation

@pierreN
Copy link
Contributor

@pierreN pierreN commented Mar 24, 2020

Following up on this suggestion : #18413 (comment), prevent the output of atoi64 in the core_read.cpp:ParseScript helper to send to CScriptNum::serialize values wider than 32-bit.

Since the ParseScript helper is only used by the tool defined in bitcoin-tx.cpp, this only prevents users to provide too much unrealistic values.

Comment on lines 67 to 68
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, here's an alternative shorter version for the same logic, without branches:

    n = std::min(n, static_cast<int64_t>(std::numeric_limits<int32_t>::max()));
    n = std::max(n, static_cast<int64_t>(std::numeric_limits<int32_t>::min()));

Not sure if it's really more readable though, personally I'd be okay with both versions.

Copy link
Contributor Author

@pierreN pierreN Mar 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::clamp in C++17 would be even better and less verbose, but unsure if we can use C++17 features ?

Note that min/max in the standard also do exactly one comparison each.

I can remove the if/else though if people think it fits better here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, std::clamp would indeed be a very nice option. Unfortunately, upgrading to C++17 has not happened yet (see this discussion: #16684), so it can't be used.
Sometimes substitutes for not-yet-available C++ features are implemented by contributors, like Span for the C++20 std::span by sipa (see #12886), don't know though if that is a desired way to go here, and if yes, if it should be packed in this PR. This should probably include identifiying all clamps throughout the codebase and using our newly introduced clamp function (so it can be replaced with std::clamp once we move to C++17).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'd be down to do that in a different PR. I'm looking for stuffs to do to improve my knowledge of the codebase anyway :)

I'll wait for some other people to confirm "yes this is a good idea, do a separate PR for std::clamp" before starting though

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from 88bd812 to c8e4db2 Compare March 24, 2020 12:46
@DrahtBot DrahtBot added the Tests label Mar 24, 2020
@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from c8e4db2 to 86f79bb Compare March 24, 2020 15:29
@pierreN
Copy link
Contributor Author

pierreN commented Mar 24, 2020

It seems that the tests were already including values wider than 2^32 so I reckon there is 2 options :

  • remove/update tests to make them fit into a 2^32
  • just prevent the UB value instead

The second option seemed the less intrusive, so I updated the PR to only forbid the value -2^63. I also added 2 tests reflecting that : before this PR, those tests were triggering the UB before failing.

@sipa
Copy link
Member

sipa commented Mar 24, 2020

I think we should just fix UB in CScriptNum::serialize; that should be easy, and avoid all issues.

In addition, I think we may want to restrict what range of numbers ParseScript accepts in decimal, because numbers outside -0xFFFFFFFF...0xFFFFFFFF are simply illegal in scripts (they don't work as numbers, so they should really be represented as hex strings instead if they're used at all). That has nothing to do with avoiding UB; it's just fixing weird behavior in ParseScript.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 24, 2020

OK, thanks.

So. if I understand correctly, we do agree that the following tests in src/test/data/script_tests.json are wrong and should be removed (most of them originally from 90320d6 ) ?

["549755813887", "SIZE 5 EQUAL", "P2SH,STRICTENC", "OK"],
["549755813888", "SIZE 6 EQUAL", "P2SH,STRICTENC", "OK"],
["9223372036854775807", "SIZE 8 EQUAL", "P2SH,STRICTENC", "OK"],

["-549755813887", "SIZE 5 EQUAL", "P2SH,STRICTENC", "OK"],
["-549755813888", "SIZE 6 EQUAL", "P2SH,STRICTENC", "OK"],
["-9223372036854775807", "SIZE 8 EQUAL", "P2SH,STRICTENC", "OK"],

["549755813887", "0x05 0xFFFFFFFF7F EQUAL", "P2SH,STRICTENC", "OK"],
["549755813888", "0x06 0x000000008000 EQUAL", "P2SH,STRICTENC", "OK"],
["9223372036854775807", "0x08 0xFFFFFFFFFFFFFF7F EQUAL", "P2SH,STRICTENC", "OK"],

["-549755813887", "0x05 0xFFFFFFFFFF EQUAL", "P2SH,STRICTENC", "OK"],
["-549755813888", "0x06 0x000000008080 EQUAL", "P2SH,STRICTENC", "OK"],
["-9223372036854775807", "0x08 0xFFFFFFFFFFFFFFFF EQUAL", "P2SH,STRICTENC", "OK"],

["4294967296", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", "CSV fails if stack top bit 1 << 31 is not set, and tx version < 2"],

@sipa
Copy link
Member

sipa commented Mar 25, 2020

@pierreN They can be converted to hexadecimal.

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from 86f79bb to c857b8c Compare March 25, 2020 04:05
@pierreN pierreN changed the title Prevent num op overflows in ParseScript() helper Limit decimal range of numbers ParseScript accepts Mar 25, 2020
@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

Ow my bad, I was too focused on the atoi64 branch to see it that way, thanks.

I edited my branch accordingly.

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from c857b8c to 08b392a Compare March 25, 2020 04:35
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this version be kept as an example of an _invalid script?

Copy link
Contributor Author

@pierreN pierreN Mar 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks. This is a good idea I think, I was hesitant to add more tests. If I understood you correctly, the tests are invalids here, so I guess you mean adding something like:

["Argument 2^32 in decimal (hence clamped to 0xffffffff) with nLockTime=2^32-2"],
[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "4294967296 CHECKLOCKTIMEVERIFY 1"]],
"0100000001000100000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000fffffffe", "P2SH,CHECKLOCKTIMEVERIFY"],

If this is correct, I guess I should add similar tests for other modified test cases too ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to add as many tests as you like, as long as all existing proper test cases also stay there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK great, I added 5 more tests in the branch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that this shouldn't be silently clamped. The test here looks right. Why would it be safe to parse a script and silently change it while parsing?

Copy link
Contributor

@practicalswift practicalswift Mar 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::runtime_error is thrown on other script parse errors in ParseScript(…). Shouldn't that be thrown also for this type of script parse error in ParseScript?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks, this is clearly a possibility.

Note that for now, no test trigger the already existing std::runtime_error. script_json_test doesn't handle exceptions, so it should be modified to catch them (maybe in a different PR then ?). Testcases will have to be modified to handle the exception too I guess.

(the fuzzer harness already handle exceptions well)

I'm not sure which possibility (clamping or throwing an error) fits more the style of the codebase. Maybe getting an ACK from @sipa for the exception first ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@practicalswift hm sorry but I guess it's better to ask first: if you return an std::runtime_error, how to describe the error in the JSON test file ?

I guess that adding a new value to the ScriptError_t enum is too intrusive.
Adding a special if/else case in script_test.cpp:ParseScriptError seems too hackish.
Reusing an existing ScriptError_t seems the less odd option - something like SCRIPT_ERR_BAD_OPCODE or SCRIPT_ERR_OP_COUNT ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a need. (tx/script)_(valid/invalid) are testing the transaction and script validity logic - not the script parser.

If a particular test now causes an exception to be thrown, just write that test differently. It's still testing the same thing.

Copy link
Contributor Author

@pierreN pierreN Mar 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so as stated below, if a test contains 4294967296 CHECKLOCKTIMEVERIFY 1 or 999...999, this is a script parser error which should be covered in bitcoin-util-test.json, not here. I believe this resolve this issue ?

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from 08b392a to a83167a Compare March 25, 2020 15:29
@laanwj
Copy link
Member

laanwj commented Mar 25, 2020

What's the rationale here for saturating the number instead of raising an error? I generally prefer explicit parse errors on invalid input, at least.

@practicalswift
Copy link
Contributor

@laanwj Agreed. That would also be in line with how parsing errors are handled elsewhere in ParseScript(…): a std::runtime_error is thrown :)

@sipa
Copy link
Member

sipa commented Mar 25, 2020

Yes, that's what I meant to suggest: fail if a number outside the valid range is present. Sorry if that was not clear.

@maflcko
Copy link
Member

maflcko commented Mar 25, 2020

Slightly related: The fuzz tests already have the error condition in this parser here covered: https://marcofalke.github.io/btc_cov/fuzz.coverage/src/core_read.cpp.gcov.html#84 . However our unit and functional tests don't know how to handle this/have this uncovered: https://marcofalke.github.io/btc_cov/total.coverage/src/core_read.cpp.gcov.html#84

@maflcko
Copy link
Member

maflcko commented Mar 25, 2020

It looks like a test case can be added to:

  • src/test/script_tests.cpp and/or
  • test/util/data/bitcoin-util-test.json with binary bitcoin-tx

@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

OK so sorry now I'm a bit confused - as stated above, if we start handling exceptions in script_test.cpp, which ScriptError_t to have in the JSON ? Do you think this would be useful ?

@maflcko
Copy link
Member

maflcko commented Mar 25, 2020

The exception is simply passed in an error_text in test/util/data/bitcoin-util-test.json. E.g.

  { "exec": "./bitcoin-tx",
    "args": ["-create", "nversion=1foo"],
    "return_code": 1,
    "error_txt": "error: Invalid TX version requested",
    "description": "Tests the check for invalid nversion value"
  },

@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

Ow, OK, thanks, I see.

So this way if we return an exception, we can just add a test ./src/bitcoin-tx -create outscript=0:9999999999 to check the parser (and have code coverage for the added exception) - hence the test in src/test/data/tx_invalid.json containing 4294967296 CHECKLOCKTIMEVERIFY 1 becomes useless since it would just trigger the same exception in the parser.

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from a83167a to f3fb51d Compare March 25, 2020 18:48
@maflcko maflcko changed the title Limit decimal range of numbers ParseScript accepts util: Limit decimal range of numbers ParseScript accepts Mar 25, 2020
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that using an exception allowed to raise new errors in tx_valid - before the clamped value also valided the test...

@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

OK, I think I updated the branch as we discussed above ?

If this PR is OK, I'll add a test case in bitcoin-util-test.json for the second exception in ParseScript in another PR.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this needs a static cast?

Suggested change
if(n > static_cast<int64_t>(0xffffffff) || n < -1*static_cast<int64_t>(0xffffffff)) {
if (n > 0xffffffff || n < -0xffffffff) {

Copy link
Contributor Author

@pierreN pierreN Mar 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second static_cast seems strictly necessary, see this example : https://godbolt.org/z/Ko33NG

The first one is superficial (implicit), I just thought that the symetry looked better. I can remove the first one if you prefer ?

edit : at least fixed whitespace style

edit 2 : also note that the LL suffix could be used but this would assume that long long = 64bits

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit is better than implicit :)

Copy link
Member

@maflcko maflcko Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with C++11 you can also get a safer way to achieve the same: int64_t{bla}. If the value doesn't fit in the integer type, it won't compile. Whereas with static_cast it would compile and give the wrong result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated the branch accordingly,

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch 2 times, most recently from c2c109f to 4cefaac Compare March 25, 2020 20:22
@sipa
Copy link
Member

sipa commented Mar 25, 2020

ACK 4cefaace020f406d1143cf1351988958ef360861

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from 4cefaac to b12b2fe Compare March 26, 2020 01:44
@pierreN
Copy link
Contributor Author

pierreN commented Mar 26, 2020

(absorbing an empty commit to re-launch failed CI)

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from b12b2fe to 2ee8cc0 Compare March 26, 2020 20:56
@sipa
Copy link
Member

sipa commented Mar 26, 2020

re-ACK 2ee8cc07292e62a1be47d09421f93d25ad24fbd4

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is <limits> still necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ow, thank you. No it isn't, just removed it.

@pierreN pierreN force-pushed the fix-parsescript-numop-overflow branch from 2ee8cc0 to 9ab14e4 Compare March 27, 2020 06:51
@laanwj
Copy link
Member

laanwj commented Mar 27, 2020

ACK 9ab14e4
(only difference is removing the limits include as suggested: https://github.com/bitcoin/bitcoin/compare/2ee8cc07292e62a1be47d09421f93d25ad24fbd4..9ab14e4d21c73d16d8d782f1576fe29e659e2a70)

@laanwj laanwj merged commit b53af72 into bitcoin:master Mar 27, 2020
@pierreN
Copy link
Contributor Author

pierreN commented Mar 27, 2020

Thanks !

@laanwj
Copy link
Member

laanwj commented Mar 27, 2020

I'm very sorry to only realize this now but atoi64 itself doesn't do bounds checking. So very large or small values outside the 64 bit range will still overflow the 64 bit or at least not raise an error conditions.
I think the appropriate function to use here is our own ParseUInt32.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 27, 2020

Sorry, I don't follow, atoi64 is:

int64_t atoi64(const std::string& str)
{
#ifdef _MSC_VER
    return _atoi64(str.c_str());
#else
    return strtoll(str.c_str(), nullptr, 10);
#endif
}

And both _atoi64 and strtoll both return LLONG_MAX or LLONG_MIN when receiving too big positive/negative values ?

It will additionally set an errno but since we check beforehand that we are parsing a number, there should not be any issue ?

In anycase it's true that ParseInt32 (since the value can be signed) could be more adapted since it checks for errno anyway. I could open a new PR for that if people prefer ?

@laanwj
Copy link
Member

laanwj commented Mar 27, 2020

And both _atoi64 and strtoll both return LLONG_MAX or LLONG_MIN when receiving too big positive/negative values ?

If that's the case this is actually OK. I thought they didn't have any guaranteed return value in that case.

       The strtol() function returns the result of the conversion, unless the value would underflow or overflow.  If an underflow occurs, strtol() returns LONG_MIN.  If an overflow occurs,  str‐
       tol() returns LONG_MAX. 

both LONG_MIN and LONG_MAX are out of the range so it will do so correctly…

In anycase it's true that ParseInt32 (since the value can be signed) could be more adapted since it checks for errno anyway. I could open a new PR for that if people prefer ?

Oh right! then you'd actually need ParseInt64: the value can be signed, and it's not in the range of a 32 bit signed integer (-0x800000000x7fffffff). But I think it's fine to keep it as-is then, phew.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 27, 2020

OK, great, thanks for the code review !

maflcko pushed a commit that referenced this pull request Mar 27, 2020
dcda81c test: add coverage for script parse error in ParseScript (pierrenn)

Pull request description:

  Follow up on this suggestion :  #18416 (comment)

  This adds a test case to raise the `script parse error` in `ParseScript`.

ACKs for top commit:
  instagibbs:
    utACK dcda81c

Tree-SHA512: ae0ef2c00f34cee818c83582f190d5f4043159e922862f2b442b7b895b8ff3ca421533699247c12c367be77813b5205830a771cd47a18e8932807ccace2d6a1c
["0x0100", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY,MINIMALDATA", "UNKNOWN_ERROR", "CSV fails if stack top is not minimally encoded"],
["0", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME", "CSV fails if stack top bit 1 << 31 is set and the tx version < 2"],
["4294967296", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME",
["0x050000000001", "CHECKSEQUENCEVERIFY", "CHECKSEQUENCEVERIFY", "UNSATISFIED_LOCKTIME",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you changing all the values when translating from dec to hex? It might be fine, but at least for locktime values there are different code paths depending on the value.

Copy link
Contributor Author

@pierreN pierreN Mar 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ow, good point. I double checked with the value returned by ParseScript in the tests before/after the PR - so that the behavior of the tests doesn't change at all (i.e. before this PR the script built by ParseScript from 4294967296 contained 0x050000000001)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, oops. My bad

azuchi added a commit to chaintope/bitcoinrb that referenced this pull request Jun 16, 2020
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 12, 2021
Summary: This is a backport of Core [[bitcoin/bitcoin#18416 | PR18416]]

Test Plan: `ninja && test/util/bitcoin-util-test.py`

Reviewers: #bitcoin_abc, deadalnix

Reviewed By: #bitcoin_abc, deadalnix

Differential Revision: https://reviews.bitcoinabc.org/D8883
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants