-
Notifications
You must be signed in to change notification settings - Fork 40.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add proxy for container streaming in kubelet for streaming auth. #64006
Conversation
f1d3c57
to
c59bff0
Compare
With this PR, $ sudo /usr/local/bin/crictl --runtime-endpoint=unix:///var/run/dockershim.sock exec -i -t 1fe29f12bc656 /bin/sh
# ls
ls
bin build cgi-bin conf error htdocs icons include logs modules
# |
c59bff0
to
2ca85a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments / questions.
pkg/kubelet/server/server.go
Outdated
s.host.StreamingConnectionIdleTimeout(), | ||
remotecommandconsts.DefaultStreamCreationTimeout, | ||
remotecommandconsts.SupportedStreamingProtocols) | ||
handler := proxy.NewUpgradeAwareHandler(url, nil, false, false, &responder{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// retrive handler for url location using default proxy transport(nil) with wrapTransport and upgradeRequired parameters set to off (false)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add comment for each field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/kubelet/server/server.go
Outdated
s.host.StreamingConnectionIdleTimeout(), | ||
remotecommandconsts.DefaultStreamCreationTimeout, | ||
remotecommandconsts.SupportedStreamingProtocols) | ||
handler := proxy.NewUpgradeAwareHandler(url, nil, false, false, &responder{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should responder be a proxy.ErrorResponder or nil here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
proxy.ErrorResponder
is just an interface. responder
implements it.
return err | ||
} | ||
// Use the actual address as baseURL host. This handles the "0" port case. | ||
s.config.BaseURL.Host = listener.Addr().String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably comment the baseurl structure element on line 74 above so people know it will/may be updated after start() otherwise someone might rval it early...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
|
||
// IndirectStreamingRuntime is the interface implemented by runtimes that handle the serving of the | ||
// StreamingRuntime is the interface implemented by runtimes that handle the serving of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should make a comment here about DirectStreamingRuntime and IndirectStreamingRuntime being end of life at least for the time being. Maybe with a link to the pr that removed them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just kubelet implementation detail. I don't think anyone will come back and cleanup if we put such a comment here.
Like when we remove rkt
and dockershim
package, we won't leave a comment saying dockershim was removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True.. I saw the upper case first char.. and since we may have a future implementation with a switch, I figured a link to the past. But as you say the link is in github.
I'll fix the unit test. |
2ca85a4
to
39eaab4
Compare
Fixed the unit test. The unit test was testing |
39eaab4
to
3f133f1
Compare
/test containerd |
3f133f1
to
d42b9a6
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Random-Liu, yujuhong The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/sig auth |
[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process @Random-Liu @tallclair @yujuhong Pull Request Labels
|
// and apiserver will access container runtime directly. This approach is more performant, | ||
// but less secure because the connection between apiserver and container runtime is not | ||
// authenticated. | ||
RedirectContainerStreaming bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a FeatureGate instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more an internal refactoring than a "feature".
We didn't even want to add the flag initially, because we don't want user to care about this. But eventually we decided to do so because we don't have performance data yet, and we want to leave a knob for people who care about the overhead.
If the data in #64006 (comment) makes sense, I even want to remove the flag. :P
@@ -77,6 +86,7 @@ func (s *ContainerRuntimeOptions) AddFlags(fs *pflag.FlagSet) { | |||
// General settings. | |||
fs.StringVar(&s.ContainerRuntime, "container-runtime", s.ContainerRuntime, "The container runtime to use. Possible values: 'docker', 'remote', 'rkt (deprecated)'.") | |||
fs.StringVar(&s.RuntimeCgroups, "runtime-cgroups", s.RuntimeCgroups, "Optional absolute name of cgroups to create and run the runtime in.") | |||
fs.BoolVar(&s.RedirectContainerStreaming, "redirect-container-streaming", s.RedirectContainerStreaming, "Enables container streaming redirect. If false, kubelet will proxy container streaming data between apiserver and container runtime; if true, kubelet will return an http redirect to apiserver, and apiserver will access container runtime directly. The proxy approach is more secure, but introduces some overhead. The redirect approach is more performant, but less secure because the connection between apiserver and container runtime is not authenticated.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "not authenticated" -> "may not be authenticated"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
}() | ||
go func() { | ||
if err := ds.streamingServer.Start(true); err != nil && err != http.ErrServerClosed { | ||
glog.Fatalf("Failed to start streaming server: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: server may have started, maybe "Streaming server stopped unexpectedly: %v"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -77,6 +86,7 @@ func (s *ContainerRuntimeOptions) AddFlags(fs *pflag.FlagSet) { | |||
// General settings. | |||
fs.StringVar(&s.ContainerRuntime, "container-runtime", s.ContainerRuntime, "The container runtime to use. Possible values: 'docker', 'remote', 'rkt (deprecated)'.") | |||
fs.StringVar(&s.RuntimeCgroups, "runtime-cgroups", s.RuntimeCgroups, "Optional absolute name of cgroups to create and run the runtime in.") | |||
fs.BoolVar(&s.RedirectContainerStreaming, "redirect-container-streaming", s.RedirectContainerStreaming, "Enables container streaming redirect. If false, kubelet will proxy container streaming data between apiserver and container runtime; if true, kubelet will return an http redirect to apiserver, and apiserver will access container runtime directly. The proxy approach is more secure, but introduces some overhead. The redirect approach is more performant, but less secure because the connection between apiserver and container runtime is not authenticated.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should this be proxy-container-streaming
so the false
value maintains the previous behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want proxy to be the default behavior, so that:
- Container streaming of all CRI container runtime is more secure.
- dockershim can work with crictl (it is not the main goal, but is something we do want)
if kubeDeps.TLSOptions != nil { | ||
config.TLSConfig = kubeDeps.TLSOptions.Config | ||
if !crOptions.RedirectContainerStreaming { | ||
config.Addr = net.JoinHostPort("localhost", "0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this use a unix socket instead of localhost? (more secure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add a TODO.
I decided to go with localhost first because it is simpler. For unix socket, we need to parse URL returned by container runtime to figure out whether it is unix socket (container runtime could still return regular URL), and handle unix socket differently for NewUpgradeAwareHandler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up is fine, but I'm just curious - did you try this? I don't see anything in NewUpgradeAwareHandler that looks like it would have a problem with a unix URL?
s.host.StreamingConnectionIdleTimeout(), | ||
remotecommandconsts.DefaultStreamCreationTimeout, | ||
remotecommandconsts.SupportedStreamingProtocols) | ||
handler := proxy.NewUpgradeAwareHandler(url, nil /*transport*/, false /*wrapTransport*/, false /*upgradeRequired*/, &responder{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set MaxBytesPerSec, or at least put a TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
remotecommandconsts.DefaultStreamCreationTimeout, | ||
remotecommandconsts.SupportedStreamingProtocols) | ||
handler := proxy.NewUpgradeAwareHandler(url, nil /*transport*/, false /*wrapTransport*/, false /*upgradeRequired*/, &responder{}) | ||
handler.ServeHTTP(response.ResponseWriter, request.Request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: refactor this to function, so we can be sure to use the same options for exec, attach & pf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
s.host.StreamingConnectionIdleTimeout(), | ||
remotecommandconsts.DefaultStreamCreationTimeout, | ||
remotecommandconsts.SupportedStreamingProtocols) | ||
handler := proxy.NewUpgradeAwareHandler(url, nil /*transport*/, false /*wrapTransport*/, false /*upgradeRequired*/, &responder{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the apiserver sets upgradeRequired for exec, attach & PF requests, so I think we should do the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -233,10 +235,16 @@ func (s *server) Start(stayUp bool) error { | |||
return errors.New("stayUp=false is not yet implemented") | |||
} | |||
|
|||
listener, err := net.Listen("tcp", s.config.Addr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned elsewhere, can we provide a "unix" option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
@Random-Liu: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest Review the full test history for this PR. Silence the bot with an |
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here. |
@tallclair The PR was merged... I'll address your comments in a new PR. |
I did some simple performance benchmark in a local cluster today. Benchmark:
Result:
The benchmark is very simple. Please feel free to reproduce this. Anyway, this is the data I get. And if it turns out that redirect has higher apiserver resource usage, I prefer proxy more because apiserver is the actual bottleneck. @yujuhong @tallclair @mrunalp @feiskyer |
Here is the result if I run 2 exec at the same time:
Apiserver still have more source usage in the redirect approach. Interesting... |
Signed-off-by: Lantao Liu <[email protected]>
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Address comments in #64006. Address comments in #64006 @tallclair @yujuhong @kubernetes/sig-node-pr-reviews Signed-off-by: Lantao Liu <[email protected]> **Release note**: ```release-note none ```
Signed-off-by: Lantao Liu <[email protected]>
[plugins.cri] |
For #36666, option 2 of #36666 (comment).
This PR:
DirectStreamingRuntime
, and changedIndirectStreamingRuntime
toStreamingRuntime
. AllDirectStreamingRuntime
s,dockertools
andrkt
, were removed.Please note that, this PR replaced the redirect with proxy directly instead of adding a knob to switch between the 2 behaviors. For existing CRI runtimes like containerd and cri-o, they should change to serve container streaming on localhost, so as to make the whole container streaming connection secure.
If a general authentication mechanism proposed in #62747 is ready, we can switch back to redirect, and all code can be found in github history.
Please also note that this added some overhead in kubelet when there are container streaming connections. However, the actual bottleneck is in the apiserver anyway, because it does proxy for all container streaming happens in the cluster. So it seems fine to get security and simplicity with this overhead. @derekwaynecarr @mrunalp Are you ok with this? Or do you prefer a knob?
@yujuhong @timstclair @dchen1107 @mikebrow @feiskyer
/cc @kubernetes/sig-node-pr-reviews
Release note: