Skip to content

cri: handle sandbox/container exit event in parallel#4682

Merged
dmcgowan merged 1 commit intocontainerd:masterfrom
fuweid:cri-handle-exit-event-separate
Jan 24, 2021
Merged

cri: handle sandbox/container exit event in parallel#4682
dmcgowan merged 1 commit intocontainerd:masterfrom
fuweid:cri-handle-exit-event-separate

Conversation

@fuweid
Copy link
Copy Markdown
Member

@fuweid fuweid commented Oct 31, 2020

The current design is that all the task exit events are put into one queue. The event consumer will consume the event one by one, serially and there is only one consumer. The consumer is required to handle one task in 10 seconds. If timeout, consumer will put it back into backoff queue and retry it later.

If the node has to evict many pods for some reason, like disk pressure, the backlog of handling task exit event will increases. If each one exit event needs 10 seconds, the deletion of pods will take long time.

For this case, I would like to propose to handle task exit event in parallel. When the container/sandbox monitors receive the task exit event, the monitor should handle it first. If it failed, it can put the event into backoff queue and event consumer can try it later.

Signed-off-by: Wei Fu [email protected]

@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Oct 31, 2020

Build succeeded.

@fuweid fuweid force-pushed the cri-handle-exit-event-separate branch from eb13925 to 939b159 Compare October 31, 2020 12:10
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Oct 31, 2020

Build succeeded.

Comment thread pkg/cri/server/events.go Outdated
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Dec 3, 2020

ping @AkihiroSuda and @mikebrow

@cpuguy83
Copy link
Copy Markdown
Member

cpuguy83 commented Dec 8, 2020

I don't think I understand this change.
It might help to understand:

  1. What is the current behavior
  2. Why is the behavior problematic
  3. Why does this solution help

@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Dec 9, 2020

@cpuguy83

The current design is that all the task exit events are put into one queue. The event consumer will consume the event one by one, serially and there is only one consumer. The consumer is required to handle one task in 10 seconds. If timeout, consumer will put it back into backoff queue and retry it later.

If the node has to evict many pods for some reason, like disk pressure, the backlog of handling task exit event will increases. If each one exit event needs 10 seconds, the deletion of pods will take long time.

For this case, I would like to propose to handle task exit event in parallel. When the container/sandbox monitors receive the task exit event, the monitor should handle it first. If it failed, it can put the event into backoff queue and event consumer can try it later.

Sorry for unclear message. So, does it make sense to you?

@fuweid fuweid changed the title cri: handle sandbox/container exit event separately cri: handle sandbox/container exit event concurrently Dec 9, 2020
@fuweid fuweid changed the title cri: handle sandbox/container exit event concurrently cri: handle sandbox/container exit event in parallel Dec 9, 2020
Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this alot. Are there any paths left for handleEvent() that could still receive a TaskExit? If not we can probably remove that case..

Comment thread pkg/cri/server/container_start.go Outdated
Comment thread pkg/cri/server/events.go Outdated
Comment thread pkg/cri/server/events.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error output here always bothered me :-) if err != store.ErrNotExist wrap the error with can't find? probably better to just say unexpected error retrieving sandbox for TaskExit event

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Comment thread pkg/cri/server/events.go Outdated
@fuweid fuweid force-pushed the cri-handle-exit-event-separate branch from 939b159 to 9760f88 Compare December 24, 2020 15:15
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Dec 24, 2020

--- FAIL: TestLosetup (0.08s)
    --- FAIL: TestLosetup/RemoveLoopDevicesAssociatedWithImage (0.04s)
Error:         losetup_test.go:96: assertion failed: expected [/dev/loop1] (length 1) to have length 0

Hmm...

@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Dec 25, 2020

The backoff is map and need mutex to lock it. Updated it later

@fuweid fuweid force-pushed the cri-handle-exit-event-separate branch 2 times, most recently from c6e6957 to 4de9c3d Compare January 9, 2021 04:27
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Jan 9, 2021

-- FAIL: TestRwLoop (0.06s)
Error:     losetup_linux_test.go:93: write /dev/loop1: no space left on device

hmm....

@fuweid fuweid force-pushed the cri-handle-exit-event-separate branch from 4de9c3d to e024dde Compare January 11, 2021 15:36
@containerd containerd deleted a comment from theopenlab-ci Bot Jan 11, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Jan 11, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Jan 11, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Jan 11, 2021
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Jan 13, 2021

ping @mikebrow the patch is ready to review. PTAL. Thanks!

Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
very nice!

@fuweid fuweid added area/cri Container Runtime Interface (CRI) kind/performance labels Jan 21, 2021
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Jan 22, 2021

ping @AkihiroSuda and @cpuguy83 ~

The event monitor handles exit events one by one. If there is something
wrong about deleting task, it will slow down the terminating Pods. In
order to reduce the impact, the exit event watcher should handle exit
event separately. If it failed, the watcher should put it into backoff
queue and retry it.

Signed-off-by: Wei Fu <[email protected]>
@fuweid fuweid force-pushed the cri-handle-exit-event-separate branch from e024dde to e56de63 Compare January 24, 2021 07:01
@fuweid
Copy link
Copy Markdown
Member Author

fuweid commented Jan 24, 2021

updated the error message for go error message convention.

@containerd containerd deleted a comment from theopenlab-ci Bot Jan 24, 2021
Copy link
Copy Markdown
Member

@dmcgowan dmcgowan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dmcgowan dmcgowan merged commit f615c58 into containerd:master Jan 24, 2021
@fuweid fuweid deleted the cri-handle-exit-event-separate branch January 24, 2021 07:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cri Container Runtime Interface (CRI) impact/changelog kind/performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants