Skip to content

Conversation

@kazcw
Copy link
Contributor

@kazcw kazcw commented Nov 16, 2018

Decrementing psz beyond the beginning of the string is UB, even though
the out-of-bounds pointer is never dereferenced.

I don't think any clang sanitizer covers this, so I don't see any way a test could catch the original behavior.

Decrementing psz beyond the beginning of the string is UB, even though
the out-of-bounds pointer is never dereferenced.
@promag
Copy link
Contributor

promag commented Nov 16, 2018

Both old and new code LGTM.

@practicalswift
Copy link
Contributor

practicalswift commented Nov 16, 2018

Extremely nice find and another excellent contribution from you! Thanks! You're right that the sanitizers don't catch this class of UB. I think tis-interpreter is the only good tool catching this class of UB. May I ask how you found this issue?

From what I can see we unconditionally hit this UB as part of startup no matter what parameters we pass to bitcoind :-\

Why this is UB: Assuming char foo[N]; then calculating foo + n is UB if n is not in the range 0 <= n <= N. Calculating one past the end (n=N) is OK (but we may not dereference).

@practicalswift
Copy link
Contributor

@promag What is "old code" referring to? Don't you agree that the existing code in master invokes UB? :-)

@promag
Copy link
Contributor

promag commented Nov 16, 2018

@practicalswift TIL, it is UB unless one past the last element.

utACK 0f459d8.

@promag
Copy link
Contributor

promag commented Nov 16, 2018

@kazcw how about this:

diff --git a/src/uint256.cpp b/src/uint256.cpp
index d9da66803..e5cadfae5 100644
--- a/src/uint256.cpp
+++ b/src/uint256.cpp
@@ -40,13 +40,12 @@ void base_blob<BITS>::SetHex(const char* psz)
     const char* pbegin = psz;
     while (::HexDigit(*psz) != -1)
         psz++;
-    psz--;
     unsigned char* p1 = (unsigned char*)data;
     unsigned char* pend = p1 + WIDTH;
-    while (psz >= pbegin && p1 < pend) {
-        *p1 = ::HexDigit(*psz--);
-        if (psz >= pbegin) {
-            *p1 |= ((unsigned char)::HexDigit(*psz--) << 4);
+    while (psz > pbegin && p1 < pend) {
+        *p1 = ::HexDigit(*--psz);
+        if (psz > pbegin) {
+            *p1 |= ((unsigned char)::HexDigit(*--psz) << 4);
             p1++;
         }
     }

@kazcw
Copy link
Contributor Author

kazcw commented Nov 16, 2018

@practicalswift Just reading code!

@promag That would be the minimal change to fix the problem, but Satoshi wrote C++ like an old-school C hacker 😆. IMO indexing is a little clearer in a codebase that these days uses more modern idioms.

@practicalswift
Copy link
Contributor

@kazcw Nice! Then keep on reading code please! :-)

I've tried to launch a crusade against UB in Bitcoin Core. I even named the latest C-lightning release "The Consensus Loving Nasal Daemon" as a catchy slogan in this fight, so and I'm very happy to see your contributions! Keep 'em coming! :-)

Context:

@practicalswift
Copy link
Contributor

practicalswift commented Nov 16, 2018

@MarcoFalke @laanwj What about adding a new label – say "Undefined Behaviour" – for UB pull requests and issues?

FWIW I think the current use of the "Refactoring" label for UB PRs is problematic for a number of reasons:

  • UB PRs might be considered less important from a review attention perspective due to the refactoring label
  • The use of the refactoring label for UB might be interpreted as us downplaying the risks associated with UB
  • The "Refactoring" label is technically incorrect:
    1. Code refactoring is defined as the process of restructuring existing computer code without changing its external behaviour.
    2. Triggering undefined behaviour is defined as executing computer code whose behaviour is not prescribed by the language specification to which the code adheres, for the current state of the program.
    3. Since undefined behaviour is a superset of external behaviour it follows that a PR fixing UB cannot technically be considered refactoring.

Some open UB PR:s that need the suggested "Undefined Behaviour" label:

@sipa
Copy link
Member

sipa commented Nov 16, 2018

@practicalswift While I am in favor of (slowly) moving the code to strictly comply with the C++ standard, I also think we should also be aware that there are differences between:

  • Technically UB, but not for any real platform
  • Technically UB, but well defined for all systems we support
  • Would be UB, but not in any reachable codepath
  • Any of the above that follows from a misunderstanding by developers/reviewers, resulting in a fragile codebase.
  • An actual issue that can have impact on production systems.

I would argue that the first one or two are not a priority at all (still a good-to-have, but there are also code churn concerns), while the last few are increasingly more so. I don't think classifying them all under "UB" is a good idea - it would be overly dramatic for some things, and a wild understatement for others.

By this I don't want to discourage anyone from working towards improving the situation, but perhaps a "crusade" isn't the right approach.

@maflcko
Copy link
Member

maflcko commented Nov 16, 2018

Mostly agree with @sipa.

Also, there is a label "Bug", which (I guess) could be used when there is an observable issue with the code. I.e. actual malfunction and not just a code style issue or random compiler warning. Generally, if you are unable to write a functional or unit test case that will fail on current master, I'd say applying the "Refactoring" label is just fine.

We only support gcc and clang as compilers and in most cases of UB they just do the right thing without having to rewrite the code. And given that this piece of code is around for as long as the code base existed and no one ever run into issues, it seems unlikely that things will suddenly break.

@kazcw
Copy link
Contributor Author

kazcw commented Nov 16, 2018

This is a complex but important topic. I think it would be worth working out a clearly-defined policy. I propose for comment an approach that admits mostly-objective triage into practical categories, based on what kinds of testing can demonstrate the issue. I have a starting point here (suggested changes / fundamental disagreements welcome):
https://gist.github.com/kazcw/6849c43796d51d3e56ad12bf691ac2e6#file-reliability-vs-ub-md

If there's agreement on this approach in principle, I could PR it as section in docs/developer-notes.md, and then we could use the PR process to work out the finer details.

@sipa, I think this is generally consistent with the approach you have outlined. To keep the UB discussion simple I have not addressed unreachable codepaths, which I think would fit the same needs-refactor category as other footgun-API issues.

@MarcoFalke I have included a "Brittle" category in between "Refactoring" and "Broken", that would fit for example a function call that does not currently cause misbehavior, but could become broken if the compiler decided to inline it. If you agree that this kind of thing is more severe than might be expected for issues under the "Refactoring" label, we should try to come to a consensus on how to define it.

@practicalswift I think a crusade on "potentially-dangerous UB" would be apt, as long as we are careful about how we define "potentially-dangerous". What do you think of the approach of leniency for a whitelist of classes of UB in existing code, with those classes selected based on clang's sanitizers options (and in this particular issue, lack thereof)? (And maybe valgrind?)

It looks like the current sanitizers used in CI test runs are "thread" and "integer,undefined". That is definitely insufficient for the definition of the "Brittle" classification: for example, those sanitizers don't include uninitialized reads. IMO a whitelist approach is appropriate: every sanitizer category should be considered to represent potential danger unless specifically decided otherwise (which we might just do lazily, as issues come up?). Ideally all sanitizers deemed important would be checked in CI test runs, but given the slowdown factor of some sanitizers, I don't know if that would be practical.

@practicalswift
Copy link
Contributor

@MarcoFalke

We only support gcc and clang as compilers and in most cases of UB they just do the right thing without having to rewrite the code.

Are you willing to bet your money on that they'll "do the right thing" also in the future?

The C++ standard is a contract between the compiler writer and the programmer.

As programmers we can choose to break the contract and hope that the other party is okay with it or won't notice.

In this past this strategy seemed to work: compiler writers seldom enforced the contract we signed. They never took us to court.

However, in recent years compiler writers have started to aggressively take advantage of undefined behaviours to improve optimizations. They are taking us to court like never before:

You've probably read them but these are great blog posts on the subject from the compiler writers' perspective:

Chris Lattner's "What Every C Programmer Should Know About Undefined Behavior":

John Regehr's "A Guide to Undefined Behavior in C and C++":

@maflcko
Copy link
Member

maflcko commented Nov 17, 2018

It looks like the current sanitizers used in CI test runs are "thread" and "integer,undefined"

Unfortunately we don't run the thread sanitizer (#14058), nor the memory sanitizer or valgrind (due to memory issues internal to bdb and qt, IIRC).

@l2a5b1
Copy link
Contributor

l2a5b1 commented Nov 20, 2018

utACK 0f459d8

This pull request eliminates pointer arithmetic [1] and addresses UB if psz points to an array [2].

[1] “Pointer arithmetic is fragile and easy to get wrong, the source of many, many bad bugs and security violations.“ ES.42: Keep use of pointers simple and straightforward

[2] Excerpt From: Bjarne Stroustrup. “The C++ Programming Language, 4/e“:

“The result of taking the address of the element before the initial element or beyond one-past-the-last element is undefined and should be avoided.”

”int v[] = { 1, 2, 3, 4 };
int* p1 = v;    // pointer to initial element (implicit conversion)
int* p4 = v–1;  // before the beginning, undefined: don't do it

@gmaxwell
Copy link
Contributor

@practicalswift It sounds like you are only looking at things from one side. Consider a case where a behaviour is language spec UB but explicitly defined by our supported toolchain and ubiquitously used. In that case the existing behaviour is at least currently harmless. A fix risks introducing its own bugs (potentially not harmless) and also diverts review/testing attention away from other changes, potentially allowing bugs through or causing important improvements to be delayed. If the fix is done hastily (e.g. as part of a mass of changes in a 'crusade') even if it is not buggy itself could make the code more brittle or less clear and thus contribute to the introduction of bugs down the line. (You've seen me complain sometimes against fixing warnings by peppering in casts, and usually its because of brittleness/less clarity)

There are as many ways for code to be good as there are people and kinds of code, a decision to do something is never a one sided choice to make things better or not, it's always a prioritization of what kinds of better we're making things and what risks we're taking by making those changes.

@practicalswift
Copy link
Contributor

practicalswift commented Nov 20, 2018

@gmaxwell I agree with the point you and @sipa make: all changes – including UB fixes – should be critically evaluated from a risk-reward perspective. And PR:s deemed too risky should be closed off after review.

Perhaps what we don't agree on is how much work it would take to kill off the remaining instances of UB in our code base, and how complex the required changes would be. Could that be the case?

If we get specific: is there any particular type of UB you fear that we have in such quantity that it would be unrealistic to fix?

Also, are there any UB fixes that have been submitted so far that you don't think are worth fixing?

Personally I think we're better off having all UB cases identified and documented. That way we can reason about them like we do in this PR and take the appropriate action (if any). We agree on that, right? :-)

@maflcko
Copy link
Member

maflcko commented Nov 21, 2018

To the best of my knowledge the runtime observable UBs were already documented through an UBSAN suppressions file (https://github.com/bitcoin/bitcoin/blob/5c292dafcd54adfcd9f80c0e1fccb45c8683808f/contrib/sanitizers-ubsan.suppressions)? To fix all of them a series of separate pull requests would be required, each of them a massive review burden. It would probably take months if not years to get all through the review process. Especially unsigned-integer-overflow are tricky, as pointed out by @arvidn in #11535 (comment). Because the right fix is not always obviously clear and might even introduce more UB temporarily. I suggest to come up with a guideline on UB and maybe a separate guideline on unsigned-integer-overflow before proceeding with patches that might go in the wrong direction and give a false impression of having fixed something.

@practicalswift
Copy link
Contributor

practicalswift commented Nov 21, 2018

@MarcoFalke Some things that are important to note when talking about UB:

  • Wraparound behaviour using unsigned integers is well-defined, so what -fsanitize=integer calls unsigned-integer-overflow is not UB. (Unintended unsigned integer wraparound can be a source of bugs, but it is not UB.)
  • UBSan does not catch all UBs. Since UBSan performs dynamic analysis it is restricted to finding the subset of all UBs that are detectable at run-time. And in that subset of UBs it still doesn't catch all types of UB. The UB discussed in this PR is an example of that: it is UB, it is detectable at run-time but it is not catched by UBsan. Another example is uninitialized reads which are not detected by UBSan (ASan does).

If we get more specific:

  • Are there any specific UB fixes that have been submitted so far that you don't think are worth fixing?
  • Is there any particular type of UB you fear that we have in such quantity that it would be unrealistic to fix?
  • Is there any specific scenario in which we'd be better off by not having all UB cases clearly identified and documented so that we can reason about their impact and take the appropriate action (if any)?

@sipa
Copy link
Member

sipa commented Nov 21, 2018

@practicalswift I think the only disagreement is on how much of a priority this is.

@practicalswift
Copy link
Contributor

@kazcw

It looks like the current sanitizers used in CI test runs are "thread" and "integer,undefined". That is definitely insufficient for the definition of the "Brittle" classification: for example, those sanitizers don't include uninitialized reads. IMO a whitelist approach is appropriate: every sanitizer category should be considered to represent potential danger unless specifically decided otherwise (which we might just do lazily, as issues come up?). Ideally all sanitizers deemed important would be checked in CI test runs, but given the slowdown factor of some sanitizers, I don't know if that would be practical.

PR #14794 adds AddressSanitizer (ASan) to Travis. Please review :-)

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 4, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@laanwj
Copy link
Member

laanwj commented Feb 12, 2019

This is sure generating a lot of discussion for a few-line change. Is there disagreement on merging this specific change?

@practicalswift
Copy link
Contributor

@laanwj There is no disagreement about the code change from what I can tell. I think we should merge this or alternatively document in assumptions.h that we assume this specific class of UB to be safe in practice.

@gmaxwell
Copy link
Contributor

gmaxwell commented May 7, 2019

Please do not just accept any actually known true UB as just an "assumption". Assumptions should be for platform limitations (e.g. we require int be 32 bits) and implementation defined behaviour (including places where clang and GCC explicitly strengthen the language specification and turn something that the language leaves UB into implementation defined).

I think it would be really nice if every PR fixing an obvious bug didn't turn into some awful debate about undefined behaviour.

@laanwj
Copy link
Member

laanwj commented May 7, 2019

Agreed: so let's move back the focus in this PR to reviewing the specific code change. Please move general UB discussion elsewhere:

This has currently 2 utACKs, no NACKs.

@practicalswift
Copy link
Contributor

practicalswift commented May 16, 2019

This might be of help for people reviewing this PR who want to understand the old behaviour:

Consider the case SetHex("1000000000000000000000000000000000000000000000000000000000000000"):

When the reverse processing reaches the leading 1 we have psz == pbegin and the following happens:

*p1 |= ((unsigned char)HexDigit(*psz--) << 4);

Please note that psz has now been decremented to pbegin - 1 which is outside of the object.

An example:

$ git diff
diff --git a/src/uint256.cpp b/src/uint256.cpp
index e3bc9712e..8cd1065fb 100644
--- a/src/uint256.cpp
+++ b/src/uint256.cpp
@@ -49,6 +49,7 @@ void base_blob<BITS>::SetHex(const char* psz)
             *p1 |= ((unsigned char)::HexDigit(*psz--) << 4);
             p1++;
         }
+        assert(psz >= pbegin);
     }
 }

$ make
$ src/test/test_bitcoin -t addrman_tests
test_bitcoin: uint256.cpp:52: void base_blob<256>::SetHex(const char *) [BITS = 256]: Assertion `psz >= pbegin' failed.
Aborted

@laanwj laanwj merged commit 0f459d8 into bitcoin:master Jul 3, 2019
laanwj added a commit that referenced this pull request Jul 3, 2019
0f459d8 fix an undefined behavior in uint::SetHex (Kaz Wesley)

Pull request description:

  Decrementing psz beyond the beginning of the string is UB, even though
  the out-of-bounds pointer is never dereferenced.

  I don't think any clang sanitizer covers this, so I don't see any way a test could catch the original behavior.

ACKs for top commit:
  promag:
    utACK 0f459d8.
  l2a5b1:
    utACK 0f459d8

Tree-SHA512: 388223254ea6e955f643d2ebdf74d15a3d494e9f0597d9f05987ebb708d7a1cc06ce64bd25d447d75b5f5561bdae9630dcf25adb7bd75f7a382298b95d127162
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 5, 2019
0f459d8 fix an undefined behavior in uint::SetHex (Kaz Wesley)

Pull request description:

  Decrementing psz beyond the beginning of the string is UB, even though
  the out-of-bounds pointer is never dereferenced.

  I don't think any clang sanitizer covers this, so I don't see any way a test could catch the original behavior.

ACKs for top commit:
  promag:
    utACK 0f459d8.
  l2a5b1:
    utACK 0f459d8

Tree-SHA512: 388223254ea6e955f643d2ebdf74d15a3d494e9f0597d9f05987ebb708d7a1cc06ce64bd25d447d75b5f5561bdae9630dcf25adb7bd75f7a382298b95d127162
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jul 8, 2020
Summary:
0f459d868d85053f1cc066ea9099793f88cbd655 fix an undefined behavior in uint::SetHex (Kaz Wesley)

Pull request description:

  Decrementing psz beyond the beginning of the string is UB, even though
  the out-of-bounds pointer is never dereferenced.

  I don't think any clang sanitizer covers this, so I don't see any way a test could catch the original behavior.

ACKs for top commit:
  promag:
    utACK 0f459d8.
  l2a5b1:
    utACK 0f459d868d85053f1cc066ea9099793f88cbd655

Tree-SHA512: 388223254ea6e955f643d2ebdf74d15a3d494e9f0597d9f05987ebb708d7a1cc06ce64bd25d447d75b5f5561bdae9630dcf25adb7bd75f7a382298b95d127162

Backport of Core [[bitcoin/bitcoin#14734 | PR14734]]

Test Plan: `ninja check`

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D6844
PastaPastaPasta added a commit to PastaPastaPasta/dash that referenced this pull request Oct 22, 2021
0f459d8 fix an undefined behavior in uint::SetHex (Kaz Wesley)

Pull request description:

  Decrementing psz beyond the beginning of the string is UB, even though
  the out-of-bounds pointer is never dereferenced.

  I don't think any clang sanitizer covers this, so I don't see any way a test could catch the original behavior.

ACKs for top commit:
  promag:
    utACK 0f459d8.
  l2a5b1:
    utACK 0f459d8

Tree-SHA512: 388223254ea6e955f643d2ebdf74d15a3d494e9f0597d9f05987ebb708d7a1cc06ce64bd25d447d75b5f5561bdae9630dcf25adb7bd75f7a382298b95d127162

# Conflicts:
#	src/uint256.cpp
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants