Skip to content

Conversation

@pierreN
Copy link
Contributor

@pierreN pierreN commented Mar 23, 2020

This was reported by practicalswift here #18046

It seems that the original author of the line used a reference to glibc abs: https://github.com/lattera/glibc/blob/master/stdlib/abs.c

However depending on some implementation details this can be undefined behavior for unusual values.

A detailed explanation of the UB is provided here : https://stackoverflow.com/questions/17313579/is-there-a-safe-way-to-get-the-unsigned-absolute-value-of-a-signed-integer-with (by Billy O'Neal)

Simple relevant godbolt example : https://godbolt.org/z/yRwtCG

Thanks!

@sipa
Copy link
Member

sipa commented Mar 23, 2020

This shouldn't be needed, I think, because CScriptNum values are restricted to 32-bit.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 23, 2020

OK, thanks.

Should I close this PR even if it's a trivial fix to #18046 ? (cf discussion in #18383 )

@sipa
Copy link
Member

sipa commented Mar 23, 2020

I must be wrong, because I don't really understand why it's possible that that fuzzer is able to trigger a negation of -2^63.

@practicalswift
Copy link
Contributor

@sipa FWIW:

$ xxd -p -r <<< "2d360932445550092d36093609092d393939393939393939393939393939393939360955" > crash-parse_script
$ src/test/fuzz/parse_script crash-parse_script
script/script.h:332:35: runtime error: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself
    #0 0x55e134173738 in CScriptNum::serialize(long const&) src/./script/script.h:332:35
    #1 0x55e134172f40 in CScript::push_int64(long) src/./script/script.h:405:22
    #2 0x55e13416984f in CScript::operator<<(long) src/./script/script.h:445:45
    #3 0x55e13416984f in ParseScript(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) src/core_read.cpp:62:20
    #4 0x55e134167b0b in test_one_input(std::vector<unsigned char, std::allocator<unsigned char> > const&) src/test/fuzz/parse_script.cpp:13:15

@pierreN pierreN force-pushed the fix-script-absolute branch from d209eb7 to bab50c5 Compare March 24, 2020 01:23
Copy link
Member

@sipa sipa Mar 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to use std::abs, which since C++11 is overloaded to work for all signed integer types as input (avoiding the assumption that long long is enough for 64 bits, and also avoiding promoting values to higher than 64 bits in case long long is even bigger).

See https://en.cppreference.com/w/cpp/numeric/math/abs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK thanks I'm testing this. I thought that llabs showed more clearly the intend but using abs might be better if you want to switch to a smaller type later indeed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, this won't ever switch to a smaller type.

I'm just commenting this because hardcoding the assumption that long long = 64 bits seems bad (if there were an abs64bit or so that would seem ideal, but there isn't). Thankfully, thanks to std::abs being overloaded for all integer types, there is no need for such an assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah indeed yes, this is better that way then.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 24, 2020

I don't really understand why it's possible that that fuzzer is able to trigger a negation of -2^63.

In the input provided by @practicalswift above, 2d39393939393939393939393939393939393936 is the ASCII string -9999999999999999996 which is translated by the atoi64 in core_read.cpp:ParseScript to -2^63 while looping over the words.
This is then sent to CScriptNum::serialize via CScript::push_int64.

I could not find the proposed input in https://github.com/bitcoin-core/qa-assets/ nor bitcoin-core/qa-assets#7 so I guess the end goal here is to increase code coverage by adding more data to qa-assets in a future PR ?

TL;DR I reckon there is 3 options:

  • not considereing the provided ASCII text input
  • since the atoi64 already changes values beyond 2^63 when doing the ASCII->int64_t conversion, limit the atoi64 output to (-)2^32 as @sipa thought (by modifying it manually)
  • the current CI build fails seems to be due to some compiler versions not respecting the standard (namely MSVC, quite ironic), I updated my branch to use abs instead (which hopefully all compilers have fully tested..)

From what I gather, ParseScript is only used in bitcoin-tx.cpp once (except for tests), which takes input directly from the CLI.

So if you find this PR too intrusive, limiting the output of atoi64 in ParseScript might be a better option ? (I might be wrong though)

@pierreN pierreN force-pushed the fix-script-absolute branch from bab50c5 to 699f677 Compare March 24, 2020 01:41
@sipa
Copy link
Member

sipa commented Mar 24, 2020

Code review ACK bab50c56e1fa2477f7d625855485e8527c78747c, with either std::abs or std::llabs.

It's more obviously correct and should avoid the fuzzer issues we're seeing. I have also verified that there are no consensus-critical code paths that can trigger this (they're all restricted to much smaller numbers).

I think the use of atoi64 and its clamping behavior in core_read.cpp is independently problematic (it should just fail when the number is too big), but I don't think that's on the critical path for fixing this issue. I'll open a PR to restrict it to -0xFFFFFFFF..0xFFFFFFFF.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 24, 2020

OK, thanks. (also note that atoi64 seems platform dependant (which might explain why other people in the above issue didn't see the fuzzer error)).

I would'nt mind trying to do the PR for the atoi64 too, but I guess it will be faster/less work for other people if @sipa do it :)

@sipa
Copy link
Member

sipa commented Mar 24, 2020

@pierreN Feel free to restrict the number conversion in ParseScript yourself if you like.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that using std::abs instead of the manual calculation doesn't get rid of the UB.
According to https://en.cppreference.com/w/cpp/numeric/math/abs: "Computes the absolute value of an integer number. The behavior is undefined if the result cannot be represented by the return type."

So, for the corner case input -2^63 (=numeric_limits<int64_t>::min()), both variants are UB (probably internally, std::abs does exactly work the same than our manual way -- one example for libstdc++: https://github.com/openbsd/src/blob/master/gnu/lib/libstdc%2B%2B/libstdc%2B%2B/include/c_std/std_cstdlib.h#L149).

See also https://embeddedgurus.com/stack-overflow/2012/02/the-absolute-truth-about-abs/ (talks about C, but that applies to C++ as well I'd guess).

@pierreN
Copy link
Contributor Author

pierreN commented Mar 24, 2020

Thanks, it seems you are right. I guess the fuzzer is not triggered here since we use abs but the UB is still there.

I'm not sure that adding a special check for -2^63 inside CScript::serialize (or inside abs) is a good thing. The only decent option available seems to be the stackoverflow post above, but it is inapplicable here.

Hence, I reckon there is 2 possible options for this PR :

  • just close it since util: Limit decimal range of numbers ParseScript accepts #18416 prevents the UB from happening anyway
  • rename the commit into something like "script: rely on std implementation to get abs value for num opcode serialize" - since relying on the standard seems cleaner and show more obviously the intent ?

What do you guys think ?

@sipa
Copy link
Member

sipa commented Mar 24, 2020

@theStack Looking at the standard library implementation isn't relevant, as it is allowed to rely on platform-specific behavior. However, it looks like std::abs is indeed specified to be undefined if the result cannot be represented in the input type.

What is wrong with uint64_t absvalue = (value < 0) ? -uint64_t(value) : uint64_t(value). I remember you tried this, but don't remember what error came out.

An alternative can be uint64_t absvalue = (value < 0) ? ~uint64_t(value) + 1 : uint64_t(value) which is semantically exactly the same but avoids a negation of a unsigned type.

@pierreN pierreN force-pushed the fix-script-absolute branch from 699f677 to abeae82 Compare March 25, 2020 04:15
@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

The issue with -uint64_t(value) was that MSVC seemed not to implement the 5.3.1 §8 of the standard about the - unary operator properly...

It seems however that 2s complement by hand works : https://godbolt.org/z/cr-phu so I updated the branch accordingly. Thanks.

@theStack
Copy link
Contributor

FWIW, I just checked where else the manual abs approach of "determining-sign-and-when indicating-negating-the-number", is used in the codebase, and found this function:

UniValue ValueFromAmount(const CAmount& amount)
{
bool sign = amount < 0;
int64_t n_abs = (sign ? -amount : amount);
int64_t quotient = n_abs / COIN;
int64_t remainder = n_abs % COIN;
return UniValue(UniValue::VNUM,
strprintf("%s%d.%08d", sign ? "-" : "", quotient, remainder));
}

I didnt' check though in which range the amount values are passed from the call-sites. Maybe you want to look into it if UB is also a problem here in the course of this (or a possible future) PR.

@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

I didnt' check though in which range the amount values are passed from the call-sites. Maybe you want to look into it if UB is also a problem here in the course of this (or a possible future) PR.

Thanks. From a quick check and believing that the variable name COIN is accurate, you have 2^62/21 000 000/COINS > 1000 so this shoould not be able to trigger the UB ?

It seems it costs nothing to prevent it though, and I'd be happy to do a PR for that if others agree.

@practicalswift
Copy link
Contributor

practicalswift commented Mar 25, 2020

@theStack Good catch regarding ValueFromAmount(const CAmount& amount). I noticed it too when fuzzing ValueFromAmount:

// ValueFromAmount(i) not defined when i == std::numeric_limits<int64_t>::min()
if (i64 != std::numeric_limits<int64_t>::min()) {
int64_t parsed_money;
if (ParseMoney(ValueFromAmount(i64).getValStr(), parsed_money)) {
assert(parsed_money == i64);
}
}

I think ideally all our functions should i.) state pre-conditions clearly (using documentation and/or assertions), and ii.) robustly handle all inputs that are not explicitly disallowed by the clearly stated pre-conditions :)

@pierreN
Copy link
Contributor Author

pierreN commented Mar 25, 2020

I noticed it too when fuzzing ValueFromAmount:

Looking at the interesting integer.cpp, shouldn't the same be applied to FormatMoney if we follow that logic ?

With a quick regexp search I also found occurences in :

  • qt/bitcoinunits.cpp : qint64 n_abs = (n > 0 ? n : -n);
  • timedata.cpp : in the abs64 function (edit: this can't happen since at worst we abs the average between 0 and the nOffsetSample argument)

laanwj added a commit that referenced this pull request Mar 27, 2020
9ab14e4 Limit decimal range of numbers ParseScript accepts (pierrenn)

Pull request description:

  Following up on this suggestion : #18413 (comment), prevent the output of `atoi64` in the `core_read.cpp:ParseScript` helper to send to `CScriptNum::serialize` values wider than 32-bit.

  Since the `ParseScript` helper is only used by the tool defined in `bitcoin-tx.cpp`, this only prevents users to provide too much unrealistic values.

ACKs for top commit:
  laanwj:
    ACK 9ab14e4

Tree-SHA512: ee228269d19d04e8fee0aa7c0ae2bb0a2b437b8e574356e8d9b2279318242057d51fcf39a842aa3afe27408d0f2d5276df245d07a3f4828644a366f80587b666
@practicalswift
Copy link
Contributor

practicalswift commented Mar 31, 2020

Looking at the interesting integer.cpp, shouldn't the same be applied to FormatMoney if we follow that logic ?

I suggest doing that in a follow-up PR to keep this PR focused. Would be really nice to get this one merged to get rid of the last known fuzzing roadblock :)

@maflcko
Copy link
Member

maflcko commented Mar 31, 2020

@practicalswift What is the fuzzer that hits this in current master? And what is the backtrace?

@practicalswift
Copy link
Contributor

@MarcoFalke Oh, it seems I got the crashes mixed up.

This is the remaining case (it is unrelated to this PR):

$ UBSAN_OPTIONS="print_stacktrace=1:halt_on_error=1:report_error_type=1" src/test/fuzz/transaction crash-transaction
INFO: Seed: 2635477314
INFO: Loaded 1 modules   (415121 inline 8-bit counters): 415121 [0x5575e380b6d8, 0x5575e3870c69), 
INFO: Loaded 1 PC tables (415121 PCs): 415121 [0x5575e3870c70,0x5575e3ec6580), 
src/test/fuzz/transaction: Running 1 inputs 1 time(s) each.
Running: crash-transaction
primitives/transaction.cpp:87:19: runtime error: signed integer overflow: 1095216725760 + 9223372032559808512 cannot be represented in type 'long'
    #0 0x5575e2041ec1 in CTransaction::GetValueOut() const src/primitives/transaction.cpp:87:19
    #1 0x5575e0eb4a97 in test_one_input(std::vector<unsigned char, std::allocator<unsigned char> > const&) src/test/fuzz/transaction.cpp:78:18

This remaining case is fixed in #18383.

@pierreN
Copy link
Contributor Author

pierreN commented Apr 1, 2020

OK, thanks.

What about recycling this PR to fix those custom absolute value functions ?

For this purpose I added a small fuzzing harness in the PR #18491

Or we can just close this PR (and hence also close the fuzzing harness in #18491)

@pierreN pierreN force-pushed the fix-script-absolute branch from abeae82 to 60175ab Compare April 2, 2020 13:12
@pierreN pierreN force-pushed the fix-script-absolute branch from 60175ab to c6819c4 Compare April 2, 2020 13:19
@pierreN
Copy link
Contributor Author

pierreN commented Apr 2, 2020

So following this comment : #18491 (comment) just added CScriptNum::serialize to the integer fuzzing harness to this PR instead of opening another one.

@maflcko
Copy link
Member

maflcko commented Apr 3, 2020

Could also add the following diff:

diff --git a/src/test/fuzz/scriptnum_ops.cpp b/src/test/fuzz/scriptnum_ops.cpp
index db44bb9e19..d45c85bc51 100644
--- a/src/test/fuzz/scriptnum_ops.cpp
+++ b/src/test/fuzz/scriptnum_ops.cpp
@@ -128,10 +128,6 @@ void test_one_input(const std::vector<uint8_t>& buffer)
             script_num &= fuzzed_data_provider.ConsumeIntegral<int64_t>();
             break;
         }
-        // Avoid negation failure:
-        // script/script.h:332:35: runtime error: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself
-        if (script_num != CScriptNum{std::numeric_limits<int64_t>::min()}) {
             (void)script_num.getvch();
-        }
     }
 }

@pierreN pierreN force-pushed the fix-script-absolute branch from c6819c4 to a3fc3c5 Compare April 3, 2020 14:55
@pierreN
Copy link
Contributor Author

pierreN commented Apr 3, 2020

Ow, right, thanks. Branch updated.

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 3, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@pierreN pierreN force-pushed the fix-script-absolute branch from 7b47f78 to 2748e87 Compare April 8, 2020 23:33
@pierreN
Copy link
Contributor Author

pierreN commented Apr 28, 2020

It's been around a month this PR has been opened - do you guys think I should close it?

Thanks.

@sipa
Copy link
Member

sipa commented Apr 28, 2020

ACK 2748e87

@maflcko
Copy link
Member

maflcko commented Apr 29, 2020

ACK 2748e87, only checked that the bitcoind binary does not change with clang -O2 🎓

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

ACK 2748e8793267126c5b40621d75d1930e358f057e, only checked that the bitcoind binary does not change with clang -O2 🎓
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUi98wwAwknOFbsqPTE+alfdjNCoL0MsBT7VrsX7Io4GqSF0gopF0hSAvuDiiR+9
I0+Hl10iCynEOzQgDPRLQGpuQOp1uAxqbKc6QCOF1vdwwu2WdRoCTGCtVr/jxCeg
MYg0S0IWlyoaM8v5rP7tci4HXsI7TDkrx13izCKfTuB7HXcUjA3vja6K1Crb4vIZ
2qpI1yCy2unnAwkoQO8qj752rFIefw2b0hDly2IkWWxfE5CCwkqBYlft9kfR03zA
L81M/Eo2xniXosHiVjSRr5q01LZBWykD8EexYQPOMqP8eYXumPgLob1WD1h/ryVp
k9qmDsCoV32tfsUiOv53I5CRLG69jJaLqvNAkdTMfrlb1VZlvQwmtdVNx+W1UPNC
fknQj1whu40fPMNYX+Qk8F9HKHXV3zCZu27sFad0nnGtg5Xio29ayck/OTb5uLcU
ACSj6sFSf9rta3WvFGU9HwoicnkArFhb/81itfvn4K+M9YHDO0Fafokv2xm4grjx
CZwsrSnk
=ovF6
-----END PGP SIGNATURE-----

Timestamp of file with hash c058e638449b2ccb23d207a2e64064c908b5cebd9528068119d048e87b501d5a -

@practicalswift
Copy link
Contributor

ACK 2748e87

@jonatack
Copy link
Member

@pierreN no worries, a month (or much, much longer) isn't unusual due to the number of PRs opened relative to the number of people who review.

@pierreN
Copy link
Contributor Author

pierreN commented May 1, 2020

Ow my bad - OK, thanks for the ACKs, I'll leave the PR open.

@fanquake fanquake merged commit 68ef952 into bitcoin:master May 2, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 2, 2020
…num opcode serialize

2748e87 script: prevent UB when computing abs value for num opcode serialize (pierrenn)

Pull request description:

  This was reported by practicalswift here bitcoin#18046

  It seems that the original author of the line used a reference to glibc `abs`: https://github.com/lattera/glibc/blob/master/stdlib/abs.c

  However depending on some implementation details this can be undefined behavior for unusual values.

  A detailed explanation of the UB is provided here : https://stackoverflow.com/questions/17313579/is-there-a-safe-way-to-get-the-unsigned-absolute-value-of-a-signed-integer-with (by [Billy O'Neal](https://twitter.com/malwareminigun))

  Simple relevant godbolt example :  https://godbolt.org/z/yRwtCG

  Thanks!

ACKs for top commit:
  sipa:
    ACK 2748e87
  MarcoFalke:
    ACK 2748e87, only checked that the bitcoind binary does not change with clang -O2 🎓
  practicalswift:
    ACK 2748e87

Tree-SHA512: 539a34c636c2674c66cb6e707d9d0dfdce63f59b5525610ed88da10c9a8d59d81466b111ad63b850660cef3750d732fc7755530c81a2d61f396be0707cd86dec
fanquake added a commit to fanquake/bitcoin that referenced this pull request May 15, 2020
Given that bitcoin#18413 has not been backported.
deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 22, 2021
Summary: Backport of core [[bitcoin/bitcoin#18413 | PR18413]].

Test Plan:
  ninja all check-all
  ninja bitcoin-fuzzers
  ./test/fuzz/test_runner.py <path_to_corpus>

Reviewers: #bitcoin_abc, deadalnix

Reviewed By: #bitcoin_abc, deadalnix

Differential Revision: https://reviews.bitcoinabc.org/D9017
furszy added a commit to PIVX-Project/PIVX that referenced this pull request Mar 1, 2021
921c4a4 script: prevent UB when computing abs value for num opcode serialize (pierrenn)
58ebadb Fix memory leak in net_tests (Pieter Wuille)
786f396 Fix memory leak in wallet tests (random-zebra)
d8b125d Avoid integer overflows in scriptnum tests (Pieter Wuille)

Pull request description:

  Fix `scriptnum_tests` currently failing with debug enabled, due to integer overflow, backporting bitcoin#9512.
  Last commit coming from bitcoin#18413.

ACKs for top commit:
  Fuzzbawls:
    ACK 921c4a4
  furszy:
    utACK 921c4a4 and merging..

Tree-SHA512: 07827bfda430e0704b427e373700ff37049bddaa59a56d415618441cb63fb8500bdb19f9df834574ade4618f0825853e44255f3a304486b4bbcb60dbbb382938
backpacker69 referenced this pull request in peercoin/peercoin Mar 28, 2021
Given that #18413 has not been backported.
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants