Skip to content

up: fix various race/deadlock conditions on exit#10934

Merged
milas merged 2 commits intodocker:mainfrom
milas:fix-stop-up
Aug 31, 2023
Merged

up: fix various race/deadlock conditions on exit#10934
milas merged 2 commits intodocker:mainfrom
milas:fix-stop-up

Conversation

@milas
Copy link
Copy Markdown
Contributor

@milas milas commented Aug 25, 2023

What I did
If running up in foreground mode (i.e. not -d), when exiting via Ctrl-C, Compose stops all the
services it launched directly as part of that up command.

In one of the E2E tests (TestUpDependenciesNotStopped), this was occasionally flaking because the stop
behavior was racy: the return might not block on
the stop operation because it gets added to the
error group in a goroutine. As a result, it was
possible for no services to get terminated on exit.

There were a few other related pieces here that
I uncovered and tried to fix while stressing this. For example, the printer could cause a deadlock if an event was sent to it after it stopped.

Also, an error group wasn't really appropriate here; each goroutine is a different operation for printing, signal-handling, etc. If one part fails, we don't
actually want printing to stop, for example. This has been switched to a multierror.Group, which has the same API but coalesces errors instead of canceling a context the moment the first one fails and returning that single error.

(not mandatory) A picture of a cute animal, if possible in relation to what you did
chinchilla

@milas milas requested a review from a team August 25, 2023 22:00
@milas milas self-assigned this Aug 25, 2023
@milas milas requested review from StefanScherer, glours, laurazard, ndeloof, nicksieger and ulyssessouza and removed request for a team August 25, 2023 22:00
@milas milas force-pushed the fix-stop-up branch 3 times, most recently from b931158 to 4347d31 Compare August 25, 2023 22:04
If running `up` in foreground mode (i.e. not `-d`),
when exiting via `Ctrl-C`, Compose stops all the
services it launched directly as part of that `up`
command.

In one of the E2E tests (`TestUpDependenciesNotStopped`),
this was occasionally flaking because the stop
behavior was racy: the return might not block on
the stop operation because it gets added to the
error group in a goroutine. As a result, it was
possible for no services to get terminated on exit.

There were a few other related pieces here that
I uncovered and tried to fix while stressing this.
For example, the printer could cause a deadlock if
an event was sent to it after it stopped.

Also, an error group wasn't really appropriate here;
each goroutine is a different operation for printing,
signal-handling, etc. If one part fails, we don't
actually want printing to stop, for example. This has
been switched to a `multierror.Group`, which has the
same API but coalesces errors instead of canceling a
context the moment the first one fails and returning
that single error.

Signed-off-by: Milas Bowman <[email protected]>
@codecov
Copy link
Copy Markdown

codecov Bot commented Aug 28, 2023

Codecov Report

Patch coverage: 85.54% and project coverage change: +0.04% 🎉

Comparison is base (41682ac) 58.24% compared to head (a387e64) 58.28%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10934      +/-   ##
==========================================
+ Coverage   58.24%   58.28%   +0.04%     
==========================================
  Files         123      123              
  Lines       10864    10882      +18     
==========================================
+ Hits         6328     6343      +15     
  Misses       3916     3916              
- Partials      620      623       +3     
Files Changed Coverage Δ
pkg/compose/printer.go 81.08% <78.43%> (+4.69%) ⬆️
pkg/compose/up.go 83.33% <96.87%> (-1.97%) ⬇️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants