Skip to content

Comments

x509_vfy.c: Make chain building more intuitive, flexible, and complete#13748

Closed
DDvO wants to merge 1 commit intoopenssl:masterfrom
siemens:chain_building_completeness
Closed

x509_vfy.c: Make chain building more intuitive, flexible, and complete#13748
DDvO wants to merge 1 commit intoopenssl:masterfrom
siemens:chain_building_completeness

Conversation

@DDvO
Copy link
Contributor

@DDvO DDvO commented Dec 29, 2020

UPDATE: this may need further improvements and is put on hold until v3.0.0 is released.

This PR includes some general improvements of the chain building algorithm and fixes a design flaw of build_chain():
it makes little sense to try finding an issuer of the current end of the chain target cert if this cert can be found immediately in the trust store (and is actually trusted).
So better stop chain building as early as possible, i.e, as soon as reaching a trust anchor.
This is more in line with RFC 4158 advising for an efficient construction of a chain with minimal length
and should also minimize the risk of picking a non-suitable path,
which is of particular importance in the given implementation because it does not backtrack.

Note that the changes do not aim at changing the implicit trust anchor behavior:
If a certificate is in the trust store and does not have trust attributes,
accept it as implicit trust anchor if it is self-signed or X509_V_FLAG_PARTIAL_CHAIN is set.

Also improve ossl_cmp_build_cert_chain() using the improve implementation
and make it public under the name OSSL_build_cert_chain().
Update: this side-topic moved to #14128.

Moreover, generalize the internal check_trust() to work not with leaf but with current cert.
This allows getting rid of strange exceptional cases in find_issuer()
(and maybe also in other parts of the overall chain building algorithm).
Among others, this provides a more general fix (compared to #13762)
of the regression introduced reported for v1.1.1 in #13739.

Checklist
  • documentation is added or updated
  • tests are added or updated

@DDvO DDvO changed the title x509_vfy.c: Make chain building more intuitive and complete x509_vfy.c: Make chain building more intuitive, flexible, and complete Dec 29, 2020
@DDvO DDvO added the triaged: bug The issue/pr is/fixes a bug label Dec 29, 2020
@DDvO DDvO added this to the 3.0.0 beta1 milestone Dec 29, 2020
@richsalz
Copy link
Contributor

I am worried about changing behavior. Please make the new behavior off by default.

@DDvO
Copy link
Contributor Author

DDvO commented Jan 1, 2021

I am worried about changing behavior.

@richsalz can you make more concrete what you fear could go wrong here?
The change in question just prefers short chains during chain building, which improves its completeness.
The correctness of chain building is anyway checked later during chain verification, right?
Moreover, one should hope that the multitude of existing tests on chain building and verification covers all critical cases, and none of them fails after this change.
So I cannot see where a real danger should come in here.

@DDvO
Copy link
Contributor Author

DDvO commented Jan 1, 2021

The provided change happens to fix an issue that recently was reported on the openssl-users mailing list:

Someone wants to use a directly trusted self-signed SSL server cert
and for some reason puts this cert in the list of trusted certs after another version of this cert with same subject (and SAN extensions) but with a different public key.
So far, the chain building chose the other cert (because it matches first) and then failed to verify the server cert (also because no backtracking is done).
Instead, after the change provided here the chain building immediately takes the right server cert (found in second position) in the trusted list and verification succeeds.

Please make the new behavior off by default.

As the example just given shows, this would be counter-productive.

@richsalz
Copy link
Contributor

richsalz commented Jan 1, 2021

With this change, what is the need for the PARTIAL_CHAINS flag?

@richsalz
Copy link
Contributor

richsalz commented Jan 1, 2021

The fact that someone reported an issue does not outweigh the current users who are depending on the present semantics.

@kroeckx
Copy link
Member

kroeckx commented Jan 1, 2021 via email

@richsalz
Copy link
Contributor

richsalz commented Jan 2, 2021

@kroeckx yes I have read the discussion in #7871. (My comment there raised the same issue as I did here: this is a behavior change happening very late in the 3.0 release process and I am opposed.) I wryly note @vdukhovni's comment, #7871 (comment), that @DDvO gave a thumbs-up to. :)

@DDvO, my question "what does this to the requiring PARTIAL_CHAINS" was directed to you.

This is a behavior change. There are no tests added that show the behavior change before and after. And when I pushed on that, the response was "one should hope that the multitude of existing tests on chain building and verification covers all critical cases, and none of them fails after this change." HOPE IS NOT GOOD ENOUGH. Here is a use-case that is not covered, for example: all sorts of certs are thrown into a trust store, including "trusted roots" and intermediaries. A few users, that want a particular path through a new intermediary, turn on PARTIAL_CHAINS. Consider when a CA adds a new digest mechanism (or crypto algorithm) that isn't supported on some devices.

Chain building and trust validation is hard, full of dark and mysterious corner cases, and changing it should be done upfront EARLY in the release plan, to socialize it (as Viktor said). It should not be done LATE while we are trying to count the days to get the BETA out.

@kroeckx
Copy link
Member

kroeckx commented Jan 2, 2021

@richsalz: My comment was directed at @DDvO, since I did not notice him participating in that issue, unlike you. I have to agree with most of what you say.

@beldmit
Copy link
Member

beldmit commented Jan 2, 2021

@openssl/otc, I think this PR deserves a special OTC discussion.

@DDvO
Copy link
Contributor Author

DDvO commented Jan 2, 2021

I became aware of the discussion in #7871 just a couple of hours back, and meanwhile I've added to two major contributions to it.
In the light of that discussion I've started refining this PR w.r.t. the treatment of partial chains...

@DDvO DDvO modified the milestones: 3.0.0 beta1, Post 3.0.0 Jan 2, 2021
@DDvO DDvO removed approval: otc review pending triaged: bug The issue/pr is/fixes a bug labels Jan 2, 2021
@DDvO DDvO changed the title x509_vfy.c: Make chain building more intuitive, flexible, and complete WIP: x509_vfy.c: Make chain building more intuitive, flexible, and complete Jan 2, 2021
@DDvO DDvO force-pushed the chain_building_completeness branch 2 times, most recently from 296668a to 2511fb9 Compare January 4, 2021 15:41
@DDvO
Copy link
Contributor Author

DDvO commented Jan 4, 2021

With this change, what is the need for the PARTIAL_CHAINS flag?

This PR does not aim at changing the semantics of X509_V_FLAG_PARTIAL_CHAIN, which is a different issue, still under discussion in #7871.
The improvements done here just make sure that more valid chains are found (assuming the same set of trust anchors as before).
In other words, it should not change trust policies but does increase chain building completeness.

@DDvO DDvO removed this from the Post 3.0.0 milestone Jan 4, 2021
@DDvO
Copy link
Contributor Author

DDvO commented Jan 4, 2021

The fact that someone reported an issue does not outweigh the current users who are depending on the present semantics.

Sure, but I still doubt that someone is depending on the incompleteness of the hitherto implementation.
A counterexample would convince me of the contrary. @richsalz, the challenge to provide one is still open 😉

@DDvO
Copy link
Contributor Author

DDvO commented Jan 4, 2021

There are no tests added that show the behavior change before and after.

I've meanwhile added the example discussed recently on the openssl-users list that I mentioned above:
https://www.mail-archive.com/[email protected]/msg88961.html

It demonstrates the effectiveness of the enhancements done here for an extremely simple case.
Conversely, the hitherto implementation had been failing even in such extremely simple cases.

@richsalz
Copy link
Contributor

richsalz commented Jan 4, 2021

It will take me a couple of days of elapsed time to run some experiments.

@vdukhovni
Copy link

I haven't looked at this PR in a little while. There are some good bits here IIRC and some things here and some more controversial. Perhaps you could stage just the parts that are cleaning up code and not changing policy into a smaller separate PR, and we can then think about just the policy changes separately, if you're still keen on those too?

@DDvO
Copy link
Contributor Author

DDvO commented Feb 1, 2021

Yeah I'll do that (as soon as I find the time for it).

@DDvO
Copy link
Contributor Author

DDvO commented Feb 9, 2021

I haven't looked at this PR in a little while. There are some good bits here IIRC and some things here and some more controversial. Perhaps you could stage just the parts that are cleaning up code and not changing policy into a smaller separate PR, and we can then think about just the policy changes separately, if you're still keen on those too?

@vdukhovni, I've meanwhlle carved out all unrelated improvements,
fixed the minor issues you pointed out here,
updated/corrected/extended some comments, and
re-visited the topics discussed above (see there for details).

In a nutshell:
I'm meanwhile more convinced than ever that the changes proposed here are correct and even needed for improving completeness and to avoid inefficient and even counterproductive extra checks (which can lead also to spurious errors).
Moreover, the code has become a little cleaner and thus better to read and maintain.

@richsalz
Copy link
Contributor

richsalz commented Feb 9, 2021

@DDvO, does your latest comment mean you want this in 3.0?

@DDvO
Copy link
Contributor Author

DDvO commented Feb 9, 2021

@DDvO, does your latest comment mean you want this in 3.0?

Not really - I did not change the milestone, which still is Post 3.0.0.
I do not see the proposed improvements as urgent
but just wanted to update the contents of this PR
and give the possibility to advance the discussion on it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously exact match of untrusted certificates was only for the EE certificate, allowing one to specify the EE cert as a trust anchor (typically a non-default CAfile with just that cert) for the connection in question.

Now we appear to be doing something similar for intermediate CAs, but why???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I wrote in the description of this PR:

it makes little sense to try finding an issuer of the current end of the chain if this end can be found immediately in the trust store (and is actually trusted).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the EE cert, we always know the provenance of each subsequent certificate as we're building the chain, because we added it, either by getting it from the trust store or from the untrusted store (often peer-provided chain). This is tracked by virtue of the "num_untrusted" variable. Therefore, I still fail to see the point (which you're failing to explain precisely in a non hand waving way) of applying the EE-special trade-in for a trust-store equivalent at any other depth in the chain.

You just have to explain this much better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point that for each cert in the chain being constructed, except for the target cert (which BTW does not need to be an EE cert) we already know whether its is from the trust store, and in fact we already call check_trust() on the current cert if it is from the trust store, and stop the loop if its is either trusted or rejected!
This was not clear to me because the whole chain building loop is so spread-out and complicated with detail (which partly could and better should be abstracted away).
So thanks to your comment I meanwhile see that for the certs after the target cert the new check is actually redundant!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point that for each cert in the chain being constructed, except for the target cert (which BTW does not need to be an EE cert) we already know whether its is from the trust store, and in fact we already call check_trust() on the current cert if it is from the trust store, and stop the loop if its is either trusted or rejected!
This was not clear to me because the whole chain building loop is so spread-out and complicated with detail (which partly could and better should be abstracted away).
So thanks to your comment I meanwhile see that for the certs after the target cert the new check is actually redundant!

We're making progress! If you limit the new check for just depth 0, with a suitable comment, my objections may be resolved. You're then just handling EE direct match for partial chain early, skipping the chain construction. That's probably OK, if that's what you're after. I can support that change, building a more complete chain seems redundant (the caller asked for partial chains, and has the EE cert in the store, could they possible indicate intent any more clearly???)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done this, as soon as possible, as follows:

/*
 * Stop immediately if target cert is directly trusted/rejected.
 * This improves completeness and prevents inefficiency and spurious
 * errors, which can occur when it makes no sense to look for an issuer.
 */
if (num == 1 && (trust = check_trust(ctx, 1)) != X509_TRUST_UNTRUSTED)
    break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to make the EE cert "trusted", is that now done elsewhere?

Copy link
Contributor Author

@DDvO DDvO Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is now done within check_trust() -
as I wrote in the new preliminary comment above:

                It is now covered by the generalized check_trust()
                called above, which checks direct trust for
                current cert (at index num_trusted - 1)

Copy link

@vdukhovni vdukhovni Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain exactly which cases this changes from prior behavior...
Is this just intended to handle the EE case of partial chain sooner?

Previous code tried to only do exact EE match as a last resort, building a chain to some trusted issuer if at all possible. I think just in case applications prefer that outcome. This was Stephen Henson's decision, and he's no longer with the project, so there's no certainty as to why, but I think that's plausible.

What's the new rationale? And why did the code have to change?

Copy link
Contributor Author

@DDvO DDvO Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pity that Stephen Henson did not document the rationale for that (IMHO strange) behavior.

As stated several times before, the very idea of this PR is to stop chain building as soon as possible, making the resulting chain as short as possible, which has several advantages:

  • completeness - the new test case that would not have succeeded before:
    "accept ee-cert2 although deceptively matching also first cert in ee-certs"
  • IMO more security because only a minimal number of certs are involved in the chain, which less chances of errors/bug/attacks in between.
  • efficiency
  • sometimes it simply does not make sense to try and find an issuer first, and doing so can cause needless spurious errors

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we can't as easily go back to not trusted-first, just by changing the flags at the top of the function?
I guess going back to "untrusted first" is not likely to happen...

Copy link
Contributor Author

@DDvO DDvO Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that going back to "untrusted first" is not likely to happen,
but if we wanna keep this option we can remove the tentative #if 0 and #endif,
which I used to verify/demonstrate that it is not necessary (anymore).

@DDvO DDvO force-pushed the chain_building_completeness branch from e6c093c to 2c5bc52 Compare February 23, 2021 09:04
@DDvO
Copy link
Contributor Author

DDvO commented Feb 23, 2021

Rebased to fix a merge conflict.
A few further adaptations and TODOs added.

Copy link

@mkris86 mkris86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Geezzeee this is hard

Comment on lines 3097 to 3096

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we finally agree that this is not quite right? Was that in a separate PR that overlaps with this one?

Copy link
Contributor Author

@DDvO DDvO Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. It was in this PR, namely in #13748 (comment) - please place any further comments on that specific topic there.
I just haven't found the time yet to adapt this PR accordingly.

What you could do in the meantime is to answer to my latest reactions to your comments in #13735. This has been waiting for your further input since end of January,
and would be great if we can finalize that soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: @vdukhovni, I've just done that as discussed above.

@DDvO DDvO force-pushed the chain_building_completeness branch from 2c5bc52 to 9b44459 Compare March 5, 2021 18:22
@DDvO DDvO changed the title WIP: x509_vfy.c: Make chain building more intuitive, flexible, and complete x509_vfy.c: Make chain building more intuitive, flexible, and complete Mar 5, 2021
@vdukhovni
Copy link

I think we need to set up a voice call and go through all the outstanding open questions in the multiple X509 PRs. There are just too many places to find all the unresolved questions. I've lost track of where to look.

My timezone is GMT-0400. I can do 9AM to 10AM some morning, pick a day and send me a conference link, along with a list of URLs for the relevant PRs and specific unresolved comments inside those PRs.

Stop chain building immediately if target cert is directly trusted/rejected.
This is required because it not always makes sense to try and find an issuer cert.
This allows test case
"accept ee-cert although deceptively matching also first cert in ee-certs" succeed
and prevents spurious errors for instance in test case
"accept last-resort direct leaf match Ed25519-signed self-issued cert".
This also allows getting rid of exceptional cases in find_issuer() and build_chain().

Also simplify internal_verify() and refactor the invocation of check_dane_issuer().
@DDvO DDvO force-pushed the chain_building_completeness branch from 9b44459 to cf695c7 Compare April 28, 2021 19:05
@t8m t8m added triaged: feature The issue/pr requests/adds a feature triaged: refactor The issue/pr requests/implements refactoring labels Aug 9, 2021
@DDvO DDvO modified the milestones: Post 3.0.0, 4.0.0 Apr 28, 2023
@t8m
Copy link
Member

t8m commented Jul 3, 2024

Could you please rebase the PR to resolve conflicts?

@DDvO
Copy link
Contributor Author

DDvO commented Jan 30, 2026

@vdukhovni, @t8m, unfortunately, I won't have the time to continue working on this.
Can someone take over, or shall I close this PR, or how else to handle this?

@DDvO DDvO removed this from the 4.0.0 milestone Feb 14, 2026
@DDvO
Copy link
Contributor Author

DDvO commented Feb 18, 2026

Closing for the time being.

@DDvO DDvO closed this Feb 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

branch: master Applies to master branch help wanted triaged: feature The issue/pr requests/adds a feature triaged: refactor The issue/pr requests/implements refactoring

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants