Avoid containerd access as much as possible. #571
Random-Liu merged 1 commit into containerd:master
Conversation
Force-pushed from 9e1012d to 718d1cc
golint failed.
@miaoyq Fixing it. :)
Force-pushed from 5121ea6 to 099c13d
// NetNSPath is the network namespace used by the sandbox.
NetNSPath string
// IP of Pod if it is attached to non host network
IP string
@abhi This moves IP from Sandbox into Metadata. This doesn't make any logical difference, just a minor cleanup.
defer timeoutTimer.Stop()
for {
	// Poll once before waiting for stopCheckPollInterval.
	// TODO(random-liu): Use channel with event handler instead of polling.
@miaoyq We should also change this to a channel.
OK, I will change this after this PR is merged.
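A minimal sketch of what the channel-based wait could look like (hypothetical names; stopCh would be closed by the TaskExit event handler, replacing the stopCheckPollInterval polling loop):

package server

import (
	"context"
	"fmt"
	"time"
)

// waitSandboxStop blocks until the exit-event handler closes stopCh,
// the timeout fires, or the context is cancelled. No polling involved.
func waitSandboxStop(ctx context.Context, stopCh <-chan struct{}, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-stopCh: // closed by the TaskExit event handler
		return nil
	case <-timer.C:
		return fmt.Errorf("wait sandbox stop timeout %v", timeout)
	case <-ctx.Done():
		return ctx.Err()
	}
}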
pid uint32, processStatus containerd.ProcessStatus, verbose bool) (map[string]string, error) {
	if !verbose {
		return nil, nil
func toCRISandboxInfo(ctx context.Context, sandbox sandboxstore.Sandbox) (map[string]string, error) {
@mikebrow I changed this function to return an error instead of logging. We are going to merge cri-containerd and containerd, so this won't be a problem then.
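A rough sketch of the error-returning shape (a minimal sketch, not the exact code; the real function returns much more information, and the import paths are assumed from the repo of that era):

package server

import (
	"context"
	"fmt"

	sandboxstore "github.com/kubernetes-incubator/cri-containerd/pkg/store/sandbox"
)

// toCRISandboxInfo now propagates failures to the caller instead of
// logging and swallowing them inside the function.
func toCRISandboxInfo(ctx context.Context, sandbox sandboxstore.Sandbox) (map[string]string, error) {
	task, err := sandbox.Container.Task(ctx, nil)
	if err != nil {
		return nil, fmt.Errorf("failed to get sandbox container task: %v", err)
	}
	status, err := task.Status(ctx)
	if err != nil {
		return nil, fmt.Errorf("failed to get task status: %v", err)
	}
	return map[string]string{
		"pid":    fmt.Sprintf("%d", task.Pid()),
		"status": string(status.Status),
	}, nil
}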
Force-pushed from 099c13d to 7d9512f
return nil, fmt.Errorf("failed to list sandbox containers: %v", err)
}

var sandboxes []*runtime.PodSandbox
Let's do this kind of optimization when we see that this is a bottleneck. :)
I don't think it will make much difference.
func (s *Store) Get(id string) (Sandbox, error) {
	sb, err := s.GetAll(id)
	if err != nil {
		return sb, err
It's fine here, I think. :)
@@ -106,53 +107,20 @@ func (em *eventMonitor) handleEvent(evt *events.Envelope) {
	e := any.(*eventtypes.TaskExit)
Do we not handle TaskOOM for sandboxes?
Yeah, we don't need to do that. Even if we handle it, what can we do? :)
defer func() {
	if retErr != nil {
		// Cleanup the sandbox container if an error is returned.
		if _, err := task.Delete(ctx, containerd.WithProcessKill); err != nil {
Now we also do task.Delete for the unknown state in event.handleSandboxExit, so we should check whether the error is NotFound here. NotFound is a normal case.
Changed both sandbox_run.go and container_start.go to check IsNotFound. Good catch!
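The resulting cleanup pattern looks roughly like this (a sketch; the id parameter and logger wiring are assumed from context, and errdefs.IsNotFound is containerd's standard check):

package server

import (
	"context"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/errdefs"
	"github.com/sirupsen/logrus"
)

// cleanupSandboxTask kills and deletes the sandbox task on the error
// path. NotFound is tolerated because the exit-event handler may have
// already deleted the task.
func cleanupSandboxTask(ctx context.Context, id string, task containerd.Task) {
	if _, err := task.Delete(ctx, containerd.WithProcessKill); err != nil && !errdefs.IsNotFound(err) {
		logrus.WithError(err).Errorf("Failed to delete sandbox container %q", id)
	}
}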
func handleSandboxExit(e *eventtypes.TaskExit, sb sandboxstore.Sandbox) {
	// Don't need to check pid because sandbox container should only have one process.
	// No stream attached to sandbox container.
	task, err := sb.Container.Task(context.Background(), nil)
I think we should not do this in the unknown state. Let RunPodSandbox do it.
Otherwise we may do task.Delete in both handleSandboxExit and RunPodSandbox.
On second thought, I still prefer to keep task.Delete out of Update.
Update holds a lock on the container/sandbox, and List (which is a critical operation for kubelet) will be blocked until the lock is released.
We should put as little logic as possible into Update. The issue you mentioned can be handled by #571 (comment).
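A minimal sketch of the locking concern (an illustrative store, not the real one in pkg/store): anything slow inside Update's critical section stalls List.

package store

import (
	"errors"
	"sync"
)

type Sandbox struct{ ID string }

// Store illustrates why task.Delete should stay out of Update: Update
// holds the write lock while fn runs, so a slow fn blocks List, which
// only needs the read lock.
type Store struct {
	lock      sync.RWMutex
	sandboxes map[string]Sandbox
}

func (s *Store) Update(id string, fn func(Sandbox) (Sandbox, error)) error {
	s.lock.Lock()
	defer s.lock.Unlock() // held for the full duration of fn
	sb, ok := s.sandboxes[id]
	if !ok {
		return errors.New("sandbox not found")
	}
	newSB, err := fn(sb) // an expensive containerd call here stalls all readers
	if err != nil {
		return err
	}
	s.sandboxes[id] = newSB
	return nil
}

func (s *Store) List() []Sandbox {
	s.lock.RLock() // blocked while any Update is in flight
	defer s.lock.RUnlock()
	all := make([]Sandbox, 0, len(s.sandboxes))
	for _, sb := range s.sandboxes {
		all = append(all, sb)
	}
	return all
}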
}

pid = task.Pid()
processStatus = taskStatus.Status
Could we also store the pid and processStatus in the sandbox store?
processStatus keeps changing, so it's hard to maintain in our store. I'd prefer not to do that until we have to.
I'll store the pid.
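In other words (a sketch with assumed field names): pid is written once when the sandbox task is created and never changes, so it is cheap to cache, while status has to be read from containerd on demand.

package sandboxstore

import (
	"context"

	"github.com/containerd/containerd"
)

// Sandbox caches the immutable pid; status is deliberately not cached
// because it keeps changing.
type Sandbox struct {
	ID        string
	Container containerd.Container
	// Pid is set once when the sandbox task is created.
	Pid uint32
}

// liveStatus queries the current process status from containerd
// instead of reading a cached copy.
func liveStatus(ctx context.Context, sb Sandbox) (containerd.ProcessStatus, error) {
	task, err := sb.Container.Task(ctx, nil)
	if err != nil {
		return "", err
	}
	st, err := task.Status(ctx)
	if err != nil {
		return "", err
	}
	return st.Status, nil
}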
Force-pushed from 7d9512f to 5b934ef
Test failure seems like a GCE flake.
/test pull-cri-containerd-node-e2e
	return false, err
}
return status.GetState() == runtime.PodSandboxState_SANDBOX_NOTREADY, nil
}, time.Second, 30*time.Second), "sandbox state should become NOTREADY")
TODO for configuring the timeout for becoming ready; possibly receive direction over the CRI API.
I think a fixed 30 seconds for the test is fine. This should happen very soon.
I think it's fine too, but people not wanting to wait for a new pod might not like it...
Hm, what do you mean?
Here we kill the corresponding task from containerd and wait for the pod sandbox status to be updated to NOTREADY by cri-containerd. It usually takes only several microseconds, I guess?
NVM I was distracted. Ignore :-)
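For context, the surrounding assertion reconstructed from the hunk above (Eventually and runtimeService are the integration-test helpers; the first duration is the poll interval, the second the timeout):

// Poll the sandbox status every second until cri-containerd reports
// NOTREADY, failing the test after 30 seconds.
assert.NoError(t, Eventually(func() (bool, error) {
	status, err := runtimeService.PodSandboxStatus(sb)
	if err != nil {
		return false, err
	}
	return status.GetState() == runtime.PodSandboxState_SANDBOX_NOTREADY, nil
}, time.Second, 30*time.Second), "sandbox state should become NOTREADY")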
// List all sandboxes from store.
sandboxesInStore := c.sandboxStore.List()

response, err := c.client.TaskService().List(ctx, &tasks.ListTasksRequest{})
Consider adding a TODO for running a validation routine to make sure we have stayed in sync and didn't just miss an event. I like this pattern of working off the push of events, but without some periodic validation there is a risk of an event failing to reach us; low, but a risk nonetheless. Or do all the events return successful delivery to containerd?
Good point. Will add a TODO for that.
We sometimes see Docker stay in a weird state because it missed an event and is out of sync with containerd.
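A hypothetical sketch of that TODO (the helper names are invented): keep consuming pushed events, but re-list from containerd on a timer so a dropped event is eventually corrected.

package server

import (
	"context"
	"time"

	events "github.com/containerd/containerd/api/services/events/v1"
	"github.com/sirupsen/logrus"
)

// runEventLoop handles pushed events as they arrive and periodically
// runs a full resync to recover from any missed event.
func runEventLoop(ctx context.Context, ch <-chan *events.Envelope,
	handle func(*events.Envelope), resync func(context.Context) error,
	syncPeriod time.Duration) {
	ticker := time.NewTicker(syncPeriod)
	defer ticker.Stop()
	for {
		select {
		case evt := <-ch:
			handle(evt)
		case <-ticker.C:
			// Full re-list guards against a dropped event.
			if err := resync(ctx); err != nil {
				logrus.WithError(err).Error("Failed to sync state with containerd")
			}
		case <-ctx.Done():
			return
		}
	}
}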
info, err := toCRISandboxInfo(ctx, sandbox, pid, processStatus, r.GetVerbose())

// Generate verbose information.
info, err := toCRISandboxInfo(ctx, sandbox)
If r.GetVerbose() is false, there is no need to get info here...
If !r.GetVerbose(), it will return directly above.
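The call site in question, roughly (a sketch; the function boundary, toCRISandboxStatus helper, and import paths are assumptions based on the repo of that era):

package server

import (
	"context"
	"fmt"

	sandboxstore "github.com/kubernetes-incubator/cri-containerd/pkg/store/sandbox"
	runtime "k8s.io/kubernetes/pkg/kubelet/apis/cri/v1alpha1/runtime"
)

// buildStatusResponse returns before toCRISandboxInfo is ever called
// when verbose is not requested, so no extra containerd round-trips
// happen on the common path.
func buildStatusResponse(ctx context.Context, r *runtime.PodSandboxStatusRequest,
	sandbox sandboxstore.Sandbox, status *runtime.PodSandboxStatus) (*runtime.PodSandboxStatusResponse, error) {
	if !r.GetVerbose() {
		return &runtime.PodSandboxStatusResponse{Status: status}, nil
	}
	// Generate verbose information.
	info, err := toCRISandboxInfo(ctx, sandbox)
	if err != nil {
		return nil, fmt.Errorf("failed to get sandbox info: %v", err)
	}
	return &runtime.PodSandboxStatusResponse{Status: status, Info: info}, nil
}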
// stopCheckPollInterval is the interval to check whether a sandbox
// is stopped successfully.
const stopCheckPollInterval = 100 * time.Millisecond
if err := c.stopSandboxContainer(ctx, sandbox.Container); err != nil {
	return nil, fmt.Errorf("failed to stop sandbox container %q: %v", id, err)
// Only stop sandbox container when it's running.
Is no error the expected CRI behavior if it's not running? It seems like we should tell them they can't stop a non-running pod sandbox...
At least log a warning?
Actually we expect StopPodSandbox to be idempotent. The caller should be able to call this many times.
And kubelet does call this function multiple times for one pod sandbox.
OK, just checking; odd behavior IMO. Maybe they call stop multiple times because no one threw an already-stopped error that they could easily ignore if they wanted to treat it as idempotent for their use case.
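A minimal sketch of the idempotent behavior (hypothetical helpers): stopping an already-stopped sandbox succeeds as a no-op, which is what lets kubelet call StopPodSandbox repeatedly.

package server

import (
	"context"
	"fmt"
)

// Sandbox is a pared-down stand-in for the real store type.
type Sandbox struct {
	ID      string
	Running bool
}

// stopPodSandbox is idempotent: repeated calls on an already-stopped
// sandbox return success instead of an "already stopped" error.
func stopPodSandbox(ctx context.Context, sb *Sandbox, stop func(context.Context, *Sandbox) error) error {
	// Only stop the sandbox container when it's running.
	if !sb.Running {
		return nil // no-op; safe for callers that retry
	}
	if err := stop(ctx, sb); err != nil {
		return fmt.Errorf("failed to stop sandbox container %q: %v", sb.ID, err)
	}
	sb.Running = false
	return nil
}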
/test pull-cri-containerd-node-e2e
// eventMonitor monitors containerd events and updates internal state correspondingly.
// TODO(random-liu): [P1] Is it possible to drop events while containerd is running?
// TODO(random-liu): [P1] Figure out whether it is possible to drop events while containerd
// is running. If it is, we should periodically list to sync state with containerd.
Signed-off-by: Lantao Liu <[email protected]>
Force-pushed from 8799774 to df58d68
/lgtm
/lgtm
Related to containerd/containerd#2020.
This PR should reduce the containerd/containerd-shim CPU/memory usage.
Signed-off-by: Lantao Liu [email protected]