Skip to content

Comments

Add a zero length check to EVP_DigestUpdate.#6819

Closed
paulidale wants to merge 1 commit intoopenssl:masterfrom
paulidale:digest_update
Closed

Add a zero length check to EVP_DigestUpdate.#6819
paulidale wants to merge 1 commit intoopenssl:masterfrom
paulidale:digest_update

Conversation

@paulidale
Copy link
Contributor

EVP_EncryptUpdate and EVP_DecryptUpdate already handle this case properly.
HASH_UPDATE was not modified because the legacy interfaces use it.

Refer #6800 #6817

@richsalz richsalz added the approval: done This pull request has the required number of approvals label Jul 31, 2018
Copy link
Contributor

@mspncp mspncp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this was the easy part. Now it remains to find all implementations that already have this check and remove the additional check. ;-)

@mspncp
Copy link
Contributor

mspncp commented Jul 31, 2018

remove the additional check.

OTOH, the benefit is probably not worth the effort. So just forget my last remark.

@paulidale
Copy link
Contributor Author

I have to agree with this assessment @mspncp We can wait until 1.2.0 I suspect. Failing a check later rather than earlier isn't a big issue. Not failing a check that should have been there but is now is acceptable and won't break things.

@dot-asm
Copy link
Contributor

dot-asm commented Jul 31, 2018

Failing a check later rather than earlier isn't a big issue.

But excessive checks can sum up to big issue. But probably more importantly one [can] argue that it's not EVP's business to stand between caller and digest implementation. I mean what if there is digest implementation to which zero length has some meaning. And now EVP will start filtering it. Note that I'm not saying that we have such implementation, question rather is what is formally appropriate, to make this specific assumption about implementation or not.

@paulidale
Copy link
Contributor Author

If we did encounter a digest where a zero length has meaning, couldn't we then add a flag to indicate this?
I don't see this change precluding the possibility in the future.

I don't think the existing checks in digests can be removed before 1.2. The low level APIs are public and we can't break them. There will be other checks that could go however (e.g. the one that spurred this).

@dot-asm
Copy link
Contributor

dot-asm commented Aug 1, 2018

If we did encounter a digest where a zero length has meaning, couldn't we then add a flag to indicate this?

As already said goal should rather be minimizing the amount of checks, not adding redundant ones or creating preconditions for future ones.

(e.g. the one that spurred this).

One has to recognize that what spurred this is visual asymmetry and not real problem. I mean with or without length check outcome would be all the same. There is no possibility of crash, or producing unexpected result. If asymmetry is problem, then it can as well be fixed by ... removing the message_len checks in eddsa.c.

I don't think the existing checks in digests can be removed before 1.2.

I don't follow. I'm not talking about removing anything in digest implementations, but about not adding explicit check that would bypass current ones. I mean now if you pass zero length, it will be subjected (taking sha3 as example) to if (len >= bsz) {} else {rem=len} if (rem) {}. I.e. there is no explicit check for len being zero, but it does nothing and returns success value. One could add if (len == 0) return 1; in the beginning, but rationale is that it can incur undesired penalties for real-life values.

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

I don't follow. I'm not talking about removing anything in digest implementations

I think Pauli referred to my (now withdrawn) remark that if we would add a new top-level length check we should remove the down-level checks in turn.

@paulidale
Copy link
Contributor Author

Yes, I was referring to the remark by @mspncp

Removing the lower down checks will be an overall win -- one branch prediction instead of many. We can't complete this until 1.2 comes out due to API restraints. Sure some algorithms don't need it, but most do. Can the algorithms that don't need it be simplified without having the check above them?

I'm in agreement that zero length digest calls will be uncommon but not impossible, does this mean we should ignore the issue and remove all such checks? If folks agree, I follow through. I'm not yet convinced that doing so would be prudent. I'd prefer one widespread check instead of lots of ad hoc ones.

@dot-asm
Copy link
Contributor

dot-asm commented Aug 1, 2018

Well, if we go into formal API compatibility, then let's ask ourselves following question. Since we don't actually know if there is an external engine that relies on zero length being passed through, are we actually free to add suggested check in minor release? Note that I'm not talking about de-facto common sense, but purely formal side of it.

@levitte
Copy link
Member

levitte commented Aug 1, 2018

I'm not sure how formal we can be... from the way we ourselves use EVP_DigestUpdate, we have obviously expected it to do the right thing with zero-length input. Since all digest algos theoretically work on the concatenation of all input, it should follow that zero-length input should have zero impact, simply regarding such input as valid on a higher level and reacting on it by doing nothing other than saying "yeah, sure thing!" should be the right thing to do.

It follows that any digest algo implementation that currently doesn't check for zero-length input, explicitely or implicitely (I noticed that the sha3 implementation does this implicitely), and thereby has a change of state or other side effect, is buggy and should be corrected (unless we do place a check in EVP_DigestUpdate).

@dot-asm
Copy link
Contributor

dot-asm commented Aug 1, 2018

It follows that any digest algo implementation that currently doesn't check for zero-length input, explicitely or implicitely (I noticed that the sha3 implementation does this implicitely), and thereby has a change of state or other side effect, is buggy and should be corrected

As already said, all our implementations perform the checks (and/or rely on memcpy to do nothing).

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

Well, if we go into formal API compatibility ...

Please allow a bold remark: Whatever these formal API compatibility rules are, the question under consideration appears tiny in comparison to what these rules would have yielded w.r.t the compatibility breaks introduced by new features like the TLS 1.3 support or the new DRBG implementation.

@dot-asm
Copy link
Contributor

dot-asm commented Aug 1, 2018

Please allow a bold remark:

Sure. Except that they are really different in sense that new features are announced and expected, while this one masquerades as bug fix when there is no[!] bug.

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

If @dot-asm is right (and I have no reason to doubt it, because otherwise an issue would have been raised long ago) then the check introduced here is indeed superflous.

Let me recall that the original cause for this pr was the observation of @Bugcheckers that there is an inconsistency in the treatment of the case message_len == 0 by the callers of EVP_DigestUpdate(), which is checked in some places but not in others. So maybe the alternative way of solving this inconsistency would be to remove the superflous checks (e.g in eddsa.c, Lines 164 and 205)?

(edit: added the text in italics above)

@Bugcheckers
Copy link
Contributor

Bugcheckers commented Aug 1, 2018

We passed 0 as count to EVP_DigestUpdate in debugger and it does not crash. So the only concern is logical error (if there is), otherwise as @mspncp said, checks (e.g in eddsa.c, Lines 164 and 205) should be removed.

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

Well it appears that the checks in eddsa.c are not superfluous and were added only recently by @mattcaswell in 7876dbf fo fix undefined behaviour in the curve448 code.

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

Well it appears that the checks in eddsa.c are not superfluous and were added only recently by @mattcaswell in 7876dbf fo fix undefined behaviour in the curve448 code.

Matt, in view of the discussion here, wouldn't it make sense to fix the zero-length-input case inside the evp method implementation (since other digest implementations treat a zero length input consistently as no-op), instead of leaving the burdon of checking the input length to the caller, which might forget to do it?

@levitte
Copy link
Member

levitte commented Aug 1, 2018

Well it appears that the checks in eddsa.c are not superfluous and were added only recently by @mattcaswell in 7876dbf fo fix undefined behaviour in the curve448 code.

That commit is a bit odd. It talks avoiding passing NULL, and yet checks the input length...

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

See #6838 for an alternative to this pull request.

@mspncp
Copy link
Contributor

mspncp commented Aug 1, 2018

That commit is a bit odd. It talks avoiding passing NULL, and yet checks the input length...

Maybe it is because Matt was considering the case where a NULL pointer is passed together with a length of 0, which would eventually have resulted in a call to memcpy(...,NULL,0) in sha3_update(), which is undefined behaviour by the standard?

@mattcaswell
Copy link
Member

Maybe it is because Matt was considering the case where a NULL pointer is passed together with a length of 0, which would eventually have resulted in a call to memcpy(...,NULL,0) in sha3_update(), which is undefined behaviour by the standard?

The problem was this:

crypto/evp/m_sha3.c:73:36: runtime error: null pointer passed as argument 2, which is declared to never be null

This only ever occurred when the length was 0.

@mattcaswell mattcaswell added this to the 1.1.1 milestone Aug 2, 2018
@mspncp mspncp modified the milestones: 1.1.1, Post 1.1.1 Aug 3, 2018
@mspncp
Copy link
Contributor

mspncp commented Aug 3, 2018

@paulidale I changed the milestone to 'Post 1.1.1' and leave it up to you whether you prefer to close this or keep it for version 1.2.

@mspncp
Copy link
Contributor

mspncp commented Aug 3, 2018

@Bugcheckers, there are even more such locations, pr is in preparation.

~/src/openssl$ grep -rn 'if *(.*&&.*EVP_.*Update' crypto ssl

crypto/ct/ct_vfy.c:90:    if (sct->ext_len && !EVP_DigestUpdate(ctx, sct->ext, sct->ext_len))
crypto/rsa/rsa_pss.c:215:    if (sLen && !EVP_DigestUpdate(ctx, salt, sLen))
crypto/kdf/tls1_prf.c:204:    if (seed != NULL && !EVP_DigestSignUpdate(ctx, seed, seed_len))
crypto/kdf/tls1_prf.c:217:        if (seed && !EVP_DigestSignUpdate(ctx, seed, seed_len))
crypto/evp/m_sigver.c:161:    if (sigret != NULL && EVP_DigestSignUpdate(ctx, tbs, tbslen) <= 0)
ssl/s3_enc.c:456:    if ((sender != NULL && EVP_DigestUpdate(ctx, sender, len) <= 0)

@mspncp
Copy link
Contributor

mspncp commented Aug 3, 2018

See #6854.

@paulidale
Copy link
Contributor Author

Post 1.1.1 sounds fine.

The question is do we want all of these checks done in a single place as opposed to several?
Consideration being given to SHA3 etc that avoid the need for the check.

@dot-asm
Copy link
Contributor

dot-asm commented Aug 6, 2018

The question is do we want all of these checks done in a single place as opposed to several?

I for one would still argue against. As already said, it's not really EVP's busyness. As for those several len==0 checks. I wish we could remove them, it's just that memcpy(0,0,0) is deemed undefined. As for moving the referred or corresponding checks elsewhere in more general sense. It's complex question without one-size-fits-all answer. I'd say that invalid/unacceptable arguments should be rejected as close to caller as possible, with error description. But here we are not looking at invalid argument, as we are actually willing to accept EVP_DigestUpdate(ctx,0,0). And the len==0 checks are really about discrepancy between our desire to accept EVP_DigestUpdate(ctx,0,0) and memcpy's formal compliance with wording in standard. And in such case it's only appropriate if the harmonizing check is next to memcpy and is not lurking elsewhere in chain. [One can also put it as following. Another way to resolve the conflict would be to implement own memcpy replacement. Not that I suggest to, this is just mental exercise that illuminates the point. That this kind of check rather belongs in digest implementation, and not EVP dispatcher.]

@mspncp
Copy link
Contributor

mspncp commented Aug 6, 2018

Another way to resolve the conflict would be to implement own memcpy replacement.

In that matter I would plea for taking a more pragmatic approach: While in theory, the standard does not guarantee well defined behaviour for memcpy(NULL,NULL,0) (or more general, memcpy(dest,src,0) with invalid memory references dest, src ), in practice all actual compilers should be well behaved (my guess).

One could simply add a test that verifies that memcpy(dest,src,0) does not crash with an access violation for dest,src chosen from some set of invalid memory references (like NULL, some low addresses, some high addresses, some random addresses). Would you consider that sufficient proof Andy, or is there another kind of undefined behaviour you can think of?

@paulidale
Copy link
Contributor Author

The difficulty here is ensuring that the compiler doesn't optimise away such a memcpy call. Compilers are free to fudge this (and do so) and a call to memcpy(?, ?, 0) could easily be detected and removed.

Let me ponder this further. I don't disagree with Andy's assessment, I'm just not sure how to best proceed.

@dot-asm
Copy link
Contributor

dot-asm commented Aug 6, 2018

a more pragmatic approach

I for one am for pragmatic approach, but unfortunately it's apparently out of our, practitioners, hands(*).

a test that verifies that memcpy(dest,src,0) does not crash

As already implied such unit test wouldn't necessarily prove anything, because compiler can act completely differently in specific situations. Not to mention that it would entail masking from sanitizer, and that is more than one wishes to tolerate. I mean if given a choice between len==0 and #if-conditional __attribute__(something) I'd really go with former...

(*) On certain level it's sad that sometimes standard lawyering takes over practitioner's outlook, but what can you do when you face extreme standard interpretations like one mentioned in #5626 (comment)?

@kaduk
Copy link
Contributor

kaduk commented Aug 6, 2018

We are not even limited to worrying about language specification and compiler; it seems that upthread we have seen C library's headers that apply a non-NULL annotation to memcpy pointer arguments, and we would need to suppress that in this sort of pragmatic scheme.

@dot-asm
Copy link
Contributor

dot-asm commented Aug 9, 2018

upthread we have seen C library's headers that apply a non-NULL annotation to memcpy pointer arguments,

Well, it's kind of chicken-n-egg problem, isn't it? Except that this time answer is fortunately known. I mean it's not like there first was non-NULL annotation, and then language specification, right? On related note it should be noted that there is formal discrepancy between mentioned annotation, sanitizer behavour and language specification. Latter speaks of valid pointers, while NULL is just one among a whole lot of invalid pointers(*). For example 13 is as [in]valid as NULL, and yet sanitizer actually lets memcpy(13,13,0) fly as defined. So that annotation and sanitizer are really about catching blunders. [And sad part is that non-blunder cases end up being treated as blunders.]

(*) We are naturally talking about systems we are most accustomed to, those with virtual memory and accompanying protection. But even on those one can arrange NULL to be valid. For example Solaris offers LD_PRELOAD-able [email protected] that does it.

@paulidale paulidale removed the approval: done This pull request has the required number of approvals label Sep 3, 2018
@paulidale
Copy link
Contributor Author

Do we want this one in or not? It's been languishing in limbo for a while now.

@davidben
Copy link
Contributor

While in theory, the standard does not guarantee well defined behaviour for memcpy(NULL,NULL,0) (or more general, memcpy(dest,src,0) with invalid memory references dest, src ), in practice all actual compilers should be well behaved (my guess).

Sadly, no, GCC and glibc are not well-behaved.
https://www.imperialviolet.org/2016/06/26/nonnull.html

This is, IMO, simply a bug in the C specification, but the world is now stuck with it until C fixes it, plus a decade or so for all the bad GCC and glibc versions to phase out of existence.

@paulidale
Copy link
Contributor Author

This one didn't seem like it was all that desirable, it needs a rebase and has gone stale.
Closing, if somebody thinks it is worthwhile please reopen it and I'll update it.

@paulidale paulidale closed this May 21, 2019
@paulidale paulidale deleted the digest_update branch May 21, 2019 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants