Add TLS verify callback for Python and Ruby#12656
Add TLS verify callback for Python and Ruby#12656JackOfMostTrades wants to merge 8 commits intogrpc:masterfrom
Conversation
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
14 similar comments
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure. |
|
This looks like a breaking change in the Node API, so we probably wouldn't accept it without a proposal for an API change to be included in the next major version bump. Also, development on the node library has been moved to https://github.com/grpc/grpc-node. |
|
Ah, good to know, I'll check out that repo. That said, I don't think it's a breaking change; it adds an optional 4th parameter (JS object), so if left undefined existing behavior would be unchanged. |
|
You're right, it's not a breaking change. I mixed it up with the optional parameter at the end of the server credentials SSL factory function. |
include/grpc/grpc_security.h
Outdated
There was a problem hiding this comment.
@justinburke FYI.
It doesn't look like this is consistent with what we discussed here: #10721
Am I missing something?
There was a problem hiding this comment.
Yep, you're right. First pass at implementing this we were aiming to drop in identical code logic that we had in Java where specifying a custom hostname verifier completely replaces the original hostname verification.
But I'll concede that's a separate (and possibly undesirable) concern. Let me take another pass at this PR with the skip option pulled out.
ffbc1bc to
621430c
Compare
|
Thank you for your pull request. Before we can look at your contribution, we need to ensure all contributors are covered by a Contributor License Agreement. After the following items are addressed, please respond with a new comment here, and the automated system will re-verify.
Regards, |
1 similar comment
|
Thank you for your pull request. Before we can look at your contribution, we need to ensure all contributors are covered by a Contributor License Agreement. After the following items are addressed, please respond with a new comment here, and the automated system will re-verify.
Regards, |
|
Ok, I finally got around to pulling out the skip_hostname_verify option, so what's left is just the checkServerIdentity callback which is strictly supplemental to hostname verification. |
mehrdada
left a comment
There was a problem hiding this comment.
[Review specific to Python/Cython files]
Thanks! Looks mostly good, with minor nits.
src/python/grpcio/grpc/__init__.py
Outdated
| private_key=None, | ||
| certificate_chain=None): | ||
| certificate_chain=None, | ||
| verify_options=None): |
There was a problem hiding this comment.
A few comments about verify_options:
- Please add API documentation.
- It seems there is only one option available at the moment, correct? What are the additional options that might be potentially added in the future. Should they go under
verify_optionsor as an additional named argument? Should the singular option become the argument instead? (e.g.verify_callback=callback_fn) - I prefer
validation_optionsorverification_optionsor justoptionsif this is supposed to be more generic (based on the answer to the above question).
There was a problem hiding this comment.
There was a bit of churn before the PR was opened that led to this being an object. At first there were more options (like assert fingerprint, skip hostname verification, ...) but all that got pulled out. Given that, I think it would be perfectly fine just to have a verify_callback=None argument that could be passed in directly. It could be helpful to forward-plan for additional options inside the verify_options, but I can't name any off the top of my head I think should be added that the callback couldn't implement. I also think it's reasonably pythonic if we wanted to support more options in the future to just to add those as additional optional named arguments.
So given that context, unless you think otherwise I'll just change this to a verify_callback parameter.
There was a problem hiding this comment.
Let's go with verify_callback!
There was a problem hiding this comment.
Sounds good! PR updates pushed.
Changes requested have been made
|
Python/Cython changes LGTM. Pinging @apolcyn for Ruby review & @jiangtaoli2016 for security review. |
|
|
@JackOfMostTrades Please take a look at Sanity test failures. ( |
Run clang format/tidy. Apply yapf.
|
Thanks for the pointer. Looks like pylint didn't like my use of nonlocal in the test, so I just refactored that a tiny bit. Apparently I also needed to run generate_projects, yapf, and clang_format. So hopefully it should be all fixed up properly now! |
|
|
Python tests are now failing (sounds like an argument mismatch). Those tests are runnable locally too, with |
| params.servername = servername; | ||
| params.cert = cert; | ||
|
|
||
| return rb_thread_call_with_gvl(invoke_rb_verify_callback_with_gvl, ¶ms) == |
There was a problem hiding this comment.
I think the way this callback is invoked with respect to concurrency needs to be changed.
The C-Core is completely non-blocking and asynchronous. Since some C-Core thread (which may or may not be a ruby thread), is directly acquiring the interpreter lock and calling in to this ruby application handler, this has the potential to indefinitely block a C-Core thread and prevent intended work from being done.
One fix for this could be to dedicate a ruby thread to the task of invoking these callbacks. Such a thread could poll for work on a queue of "ready-to-run" "veroify peer callbacks". The actual verify peer callback handler that the C-Core invokes would just add to that queue, signal conditional variable, etc.. We already have such a thread for the task is invoking ruby-defined "call credentials" callbacks (See https://github.com/grpc/grpc/blob/master/src/ruby/ext/grpc/rb_event_thread.c#L126), so I think that adding another would could be reasonable. An optimization could be to combine both of these thread in to one, but I'm not sure that's worth doing.
There was a problem hiding this comment.
^ on second though, it might actually be simpler to combiner invocation of these handlers in to the call credentials callback thread, by using e.g. a "queue node tag type", but I'm not sure
There was a problem hiding this comment.
Also, just to add, granted an indefinitely blocking application callback would also deadlock a dedicated callback thread and prevent more callbacks from being fired, but IMO overall the dedicated thread is simpler; it's difficult to reason about what happens when a c-core thread acquires a ruby lock and runs ruby code.
There was a problem hiding this comment.
Hmm, what you're saying makes total sense, but the verify callback exposed by C-Core at present expects the callback to be synchronous so it doesn't pass in the on_peer_checked closure that would allow the callback to be asynchronous.
I don't think it would be a big deal to tweak the C-Core API so that callbacks are asynchronous. Which sounds like it might be important given interpreter locks. Is there anyone that should weight in before I go down the route of changing that core API?
There was a problem hiding this comment.
cc @jiangtaoli2016 @markdroth for this core API change proposal (to allow the verify peer callback to be async)
There was a problem hiding this comment.
I agree, it's not okay to block in a callback from a C-core thread.
I'm not thrilled about adding yet another case where we need an asynchronous callback from C-core to the application. We already have a few of these, and they've caused a number of headaches over time; it turns out to be fairly tricky to get the synchronization and state machinery right when leaving and re-entering C-core.
That having been said, I don't have a better suggestion, and the C-core API is not a public API, so we do have the flexibility to change this later if we can come up with something better. So I'd say go ahead and add the callback to the API.
I suggest that you model the changes after one of the existing implementations. The best one to look at is probably the metadata credentials plugin, whose API is defined here:
The implementation is here:
https://github.com/grpc/grpc/blob/master/src/core/lib/security/credentials/plugin/plugin_credentials.cc
https://github.com/grpc/grpc/blob/master/src/core/lib/security/credentials/plugin/plugin_credentials.h
Note that this API allows both sync and async returns. The sync return is much simpler and more efficient in the case where no blocking is needed; the async return requires a callback and a thread hop, so it's more complex and less efficient.
One thing that is not handled correctly in the metadata credentials plugin API is cancellation. Unfortunately, this API didn't originally support cancellation, and all of the wrapped language APIs were added without cancellation support. When we later realized that we needed to support this, we were able to change the C-core API, but it was too late to fix the wrapped language APIs. So we did a poor man's approach: we have C-core mark the request as cancelled, and when the wrapped language eventually invokes the callback, nothing actually happens. But this means that we still wind up doing a lot of processing in the application that isn't actually necessary, because we have no way to inform it that the work is no longer needed. I highly recommend not making that mistake with this new API: let's make sure that we expose a way of propagating cancelations to the wrapped language, so that this can be handled as a first-class citizen.
Please let me know if you have any questions.
There was a problem hiding this comment.
Thanks for all the pointers! I'll work on putting together an update to the core API and put that in a new PR. I'll come back to the python/ruby bindings in this PR once we get that ironed out. It'll probably be a little bit before I can get through that, so I'll flag you on it when it's ready.
|
Hey there, thank you for creating this PR. I have an issue where I get a I've tried this branch in my project, but it never run into the Thank you, EDIT: In fact it works if there is no EDIT 2: I succeeded making my network works. It was a config error. |
|
Thanks for this change. It looks like it's in flux with a dependency on another PR. Feel free to create a new request or reopen if this needs more attention. |
Release 1.6.0 of grpc-java added support for using a custom TLS hostname verifier with clients. This feature parity doesn't exist for gRPC clients in other languages, which this PR tries to address (although an "override hostname" can be specified, support for an arbitrary callback is missing). In particular this adds support for a verification callback in JavaScript and Python. As an example, callbacks in JS look like:
The "checkServerIdentity" idiom was chosen to match the same option available to TLS clients with the built-in node library. (InsecurySkipHostnameVerify was inspired by golang's InsecureSkipVerify option.)
In Python it looks like this:
For better context, the motivation for having callbacks like this stems from our internal PKI infrastructure where multidimensional identity information is added to various proprietary X.509 extensions. The policy on exactly which dimensions are relevant and should be checked varies from client to client so it's important that each client be able to supply verification logic.
In a less proprietary context, custom verifiers are often used as a supplemental check to do things like certificate or CA pinning.
We've been using an internal fork based on this PR for a few months now, so at least for JS and Python I have some practical confidence that this works as intended.