
Conversation

@laurazard laurazard commented May 30, 2023

Context

See: #8557 (comment) (writeup of the issue)
fixes #8557

What I did

Address the performance issue by reducing contention between

func (s *service) Start(ctx context.Context, r *taskAPI.StartRequest) (*taskAPI.StartResponse, error) {

and
func (s *service) checkProcesses(e runcC.Exit) {

by removing the shared

eventSendMu sync.Mutex

and instead individually keeping track of starting processes. This allows checkProcesses() to optimistically handle processes that have already started (unblocked), and only when necessary queue events for blocked (still-starting) processes to be retried on a separate goroutine.


This model significantly reduces contention around this area, allowing

func (s *service) checkProcesses(e runcC.Exit) {

to keep up with incoming exit events, which in turn clears the backlog all the way down to

func Reap() error {
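To make the model concrete, here is a minimal, self-contained sketch of the per-process tracking idea (hypothetical names and key shape, not the shim's actual code):

package main

import (
	"fmt"
	"sync"
)

// key identifies a process within a container (assumed shape for this sketch;
// see the review discussion below about how the map should actually be keyed).
type key struct{ id, execID string }

type service struct {
	starting sync.Map // key -> chan struct{}, present while a process is starting
}

// beginStart registers a starting process and returns a function to call once
// the start event has been published.
func (s *service) beginStart(k key) func() {
	done := make(chan struct{})
	s.starting.Store(k, done)
	return func() {
		s.starting.Delete(k)
		close(done) // unblocks anyone waiting for this process's start event
	}
}

// waitStarted blocks only if the process is still starting; already-started
// processes take the "optimistic" fast path in checkProcesses.
func (s *service) waitStarted(k key) {
	if ch, ok := s.starting.Load(k); ok {
		<-ch.(chan struct{})
	}
}

func main() {
	s := &service{}
	k := key{id: "task1", execID: "exec1"}
	finish := s.beginStart(k)
	go finish() // stand-in for Start() publishing the start event
	s.waitStarted(k)
	fmt.Println("exit event can now be sent after the start event")
}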

Benchmarks

(using the repro steps from #8557 (comment))

baseline:

0.11user 0.03system 0:01.00elapsed
0.09user 0.06system 0:25.40elapsed
0.08user 0.02system 0:17.45elapsed

with this change:

0.11user 0.08system 0:00.91elapsed
0.08user 0.02system 0:00.72elapsed
0.09user 0.01system 0:00.65elapsed

With some instrumentation, we can also confirm that the intended effect is achieved by timing the notify goroutine:

baseline:

notify goroutine took 10.167µs
notify goroutine took 6.542µs
notify goroutine took 16.5µs
notify goroutine took 4.417µs
notify goroutine took 8.063819576s
notify goroutine took 2.060341814s
notify goroutine took 5.061881112s
notify goroutine took 7.063627129s
notify goroutine took 1.060047492s
notify goroutine took 4.061607998s
notify goroutine took 6.063175599s
notify goroutine took 9.064799771s

with this change:

notify goroutine took 110.280639ms
notify goroutine took 173.13788ms
notify goroutine took 142.423614ms
notify goroutine took 25.899222ms
notify goroutine took 4.356252ms
notify goroutine took 3.834µs
notify goroutine took 2.833µs
notify goroutine took 2.5µs
notify goroutine took 64.833µs
notify goroutine took 2.197334ms
notify goroutine took 78.833µs
notify goroutine took 59.167µs
notify goroutine took 98.583µs
notify goroutine took 47.042µs
notify goroutine took 42.958µs
notify goroutine took 48.916µs
notify goroutine took 69.125µs
notify goroutine took 57.291µs
notify goroutine took 42.333µs
notify goroutine took 33.417µs
notify goroutine took 72.39987ms

Callout: I believe this solution preserves the intended behaviour of ensuring that start events are sent before exit events (however well that worked before), but PTAL and feel free to discuss/propose alternative ways of handling the issue.

(bonus) cute animal

[image omitted]

@k8s-ci-robot

Hi @laurazard. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


@laurazard laurazard marked this pull request as ready for review May 30, 2023 14:22

laurazard commented May 30, 2023

Edit: I was running some more tests and this doesn't completely solve the issue – every now and then, checkProcesses still takes an inordinate amount of time, in the following call:

if p.Pid() != e.Pid {

which locks everything else out. I've got some more changes to address that, which I'll include in a separate commit (tomorrow – I need sleep 😓), but for right now I'll move this PR back to draft.

Update

The write-up above is correct, but it misses another cause of delays in

func (s *service) checkProcesses(e runcC.Exit) {

Specifically,

if p.Pid() != e.Pid {
sometimes takes multiple seconds to return because it locks on the process's pid mutex (e.pid), which contends with

func (e *execProcess) start(ctx context.Context) (err error) {

(which is the desired behaviour, as the comment in that function shows):

func (e *execProcess) start(ctx context.Context) (err error) {
	// The reaper may receive exit signal right after
	// the container is started, before the e.pid is updated.
	// In that case, we want to block the signal handler to
	// access e.pid until it is updated.
	e.pid.Lock()
	defer e.pid.Unlock()

This "preserving causality" business is annoying 😅


Alright, I've amended the commit to address this contention as well.

@laurazard laurazard marked this pull request as draft May 30, 2023 23:57
@laurazard laurazard marked this pull request as ready for review May 31, 2023 15:35

	s.sendProcessEvent(e, p, c)
}
for p, c := range missing {
	go tryMissing(e, p, c)
Member

The event channel is closed on shutdown, and shutdown is protected by the same s.mu lock. Is completing this event outside of the lock an important part of reducing the contention issue? If not, it's probably safest just to ensure these complete before returning.

Member Author

The most relevant change to resolving the contention issue is not blocking checkProcesses on processes that haven't started yet – so ensuring these complete before returning isn't a great solution and reintroduces a big part of the issue, but we can move around where we lock on s.mu to make sure we don't send events outside of it.

Member Author

I've made some changes, let me know if that LGTY @dmcgowan :)

Member

The channel close/send behavior is pretty annoying in Go as it necessitates additional locking and status checking to avoid a possible panic. It might warrant an explicit shutdown state variable and mutex for send/close.
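One possible shape for that (a sketch of the guarded send/close pattern with hypothetical names, not the shim's actual code):

package main

import (
	"fmt"
	"sync"
)

// eventQueue guards both send and close with one mutex and a shutdown flag,
// so a late send can never panic on a closed channel.
type eventQueue struct {
	mu     sync.Mutex
	closed bool
	ch     chan interface{}
}

func newEventQueue() *eventQueue {
	return &eventQueue{ch: make(chan interface{}, 128)}
}

// send reports whether the event was accepted; events sent after shutdown are
// dropped instead of panicking. (Holding mu across the send is simple, but a
// full buffer would block other senders, so the real fix may need more care.)
func (q *eventQueue) send(ev interface{}) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closed {
		return false
	}
	q.ch <- ev
	return true
}

func (q *eventQueue) shutdown() {
	q.mu.Lock()
	defer q.mu.Unlock()
	if !q.closed {
		q.closed = true
		close(q.ch)
	}
}

func main() {
	q := newEventQueue()
	fmt.Println(q.send("start")) // true
	q.shutdown()
	fmt.Println(q.send("exit")) // false: dropped rather than panicking
}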

Address performance issues caused by lock contention in
`runtime/v2/runc/task/service.go` due to `checkProcesses()`
and `Start()` both locking on `s.eventSendMu` (noticeable
when starting a lot of tasks concurrently on the same
container): instead of a single shared mutex, we selectively
lock on a per-process basis, preserving start-exit event order.

Also resolves `checkProcesses()` performance issues caused by
locking while fetching the PID of processes that are still
starting: already-started processes are checked optimistically,
and only if needed is a separate routine queued to wait for
the starting process.

Signed-off-by: Laura Brehm <[email protected]>
@corhere (Member) left a comment

I spy a race condition.

if _, starting := s.processEventsLocker.Load(p.ID()); !starting {
	break
}
time.Sleep(50 * time.Millisecond)
Member

We can do better than polling. Store a channel in s.processEventsLocker and close it when deleting the entry from the map. Then this function can receive on the channel to wait.

if startedCh, ok := s.processEventsLocker.Load(p.ID()); ok {
	<-startedCh.(chan struct{})
}

containers: make(map[string]*runc.Container),
context: ctx,
events: make(chan interface{}, 128),
processEventsLocker: sync.Map{},
Member

Suggested change
processEventsLocker: sync.Map{},

You don't need to explicitly initialize a field to its zero value.

Member Author

This used to be a map[... 😅 didn't catch that, thanks :)

if !container.HasPid(e.Pid) {
	continue
}
containers := s.containers
Member

Maps are reference types, similar to slices. Copying a map by value only copies the reference to the underlying mutable data structure so this does not make it safe to access containers without holding s.mu. You would have to deep-copy the map to make this work.
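For illustration, a snapshot copy under the lock might look roughly like this (standalone sketch with stand-in types; the real code would use s.mu and map[string]*runc.Container, and the values here are still shared pointers, so a true deep copy would also need to clone them):

package main

import (
	"fmt"
	"sync"
)

type container struct{ id string }

type service struct {
	mu         sync.Mutex
	containers map[string]*container
}

// snapshot copies the map's entries while holding the lock; the returned map
// can then be iterated without holding s.mu.
func (s *service) snapshot() map[string]*container {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string]*container, len(s.containers))
	for id, c := range s.containers {
		out[id] = c
	}
	return out
}

func main() {
	s := &service{containers: map[string]*container{"a": {id: "a"}}}
	for id := range s.snapshot() {
		fmt.Println(id)
	}
}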

Comment on lines +163 to +166
lockKey := r.ExecID
if lockKey == "" {
	lockKey = r.ID
}
Member

It looks like r.ExecID only needs to be unique for a given r.ID so I'm pretty sure the map key would need to be the tuple (r.ID, r.ExecID).

// Naming things is hard. Try to think of a better name than this.
type processEventsKey struct { ID, ExecID string }

s.processEventsLocker.Store(processEventsKey{ID: r.ID, ExecID: r.ExecID}, ...)

Alternatively, have a separate map for each container, e.g. by moving it to the runc.Container struct?

Successfully merging this pull request may close these issues: Concurrent task execs take significantly longer than sequentially (#8557).