ci: Work around podman stop intermittent failure #28547
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage
For detailed information about the code coverage, see the test coverage report.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
|
Looks like this is working, except that the container id contains the name when the issue happens https://cirrus-ci.com/task/6582091933024256?logs=ci#L250: |
|
ok, pushed the complete set of changes (please refer to the commit messages for details) |
This reflects what the script does (docker run ...).
The same is done by the 06 script.
Force remove any containers, potentially leaving dangling processes, which should be fine.
This limits the scope of the CI_CONTAINER_ID symbol. Can be reviewed with --color-moved=dimmed-zebra
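For readers skimming the commit messages above, the "force remove" workaround could look roughly like the sketch below. It is not copied from the PR diff: the `CONTAINER_CLI` variable and the `clear_all_containers` function are hypothetical names used only for illustration, and the only commands assumed are `podman container stop --all` (seen in the failing log) and `podman container rm --force --all`.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the actual contents of 02_run_container.sh.

# CONTAINER_CLI is a hypothetical knob (not in the PR) so the function can be
# exercised without a real podman installation.
CONTAINER_CLI="${CONTAINER_CLI:-podman}"

# Before the workaround, the script ran: podman container stop --all
# If stop fails to stop a container, the container may still exist (and hold
# its --name) when the next `docker run --name ...` starts.
#
# Sketch of the workaround: force-remove every container instead of stopping
# it, so no stale name is left behind. Processes may be left dangling, which
# the commit message considers acceptable for CI.
clear_all_containers() {
  "$CONTAINER_CLI" container rm --force --all
}
```

Calling `clear_all_containers` before the `docker run --name ...` step would then take the place of the earlier `stop --all` call.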
hebasto
left a comment
ACK fa2c894, I have reviewed the code and tested it locally.
fa2c894 ci: move-only CI_CONTAINER_ID to 02_run_container.sh (MarcoFalke)
fa695b4 ci: Work around podman stop bug (MarcoFalke)
fa09a03 ci: Add set -ex to 02_run_container.sh (MarcoFalke)
fac9abb ci: Rename 04_install to 02_run_container (MarcoFalke)

Pull request description:

  Sometimes, it seems that `podman stop` does not work. Presumably, it falls back to `podman kill`, which is async.

  Try to work around this intermittent issue by using the `rm --force` over `stop`.

  Example failing log https://cirrus-ci.com/task/4549784611061760?logs=ci#L238:

  ```
  Restart docker before run to stop and clear all containers started with --rm
  ++ podman container stop --all
  e4eca0766f87864d89fc230aa884a238c214cfbcd44cf76a4dbdb2a30c982009
  ++ echo 'Prune all dangling images'
  Prune all dangling images
  ++ docker image prune --force
  Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
  +++ docker run --cap-add LINUX_IMMUTABLE --rm --interactive --detach --tty --mount type=bind,src=/tmp/cirrus-build-1970593815,dst=/tmp/cirrus-build-1970593815,readonly --mount type=volume,src=ci_macos_cross_ccache,dst=/tmp/ccache_dir --mount type=volume,src=ci_macos_cross_depends,dst=/ci_container_base/depends --mount type=volume,src=ci_macos_cross_previous_releases,dst=/ci_container_base/prev_releases --env-file /tmp/env --name ci_macos_cross ci_macos_cross
  Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
  time="2023-09-27T20:55:39Z" level=warning msg="The input device is not a TTY. The --tty and --interactive flags might not work properly"
  Error: creating container storage: the container name "ci_macos_cross" is already in use by e4eca0766f87864d89fc230aa884a238c214cfbcd44cf76a4dbdb2a30c982009. You have to remove that container to be able to reuse that name: that name is already in use
  ```

ACKs for top commit:
  hebasto:
    ACK fa2c894, I have reviewed the code and tested it locally.

Tree-SHA512: 31fca340c6bedaadf4dd51fa745d9b3969042cebc0c7c904ef18af3f2f986039ec4354ccdff1422fbf77cf223e4423857368dce53cfa67ef15c76b78d007eace
|
If the issue persists, I guess cherry-picking 4444a11 can be another attempt. |
Sometimes, it seems that `podman stop` does not work. Presumably, it falls back to `podman kill`, which is async.
Try to work around this intermittent issue by using the `rm --force` over `stop`.
Example failing log https://cirrus-ci.com/task/4549784611061760?logs=ci#L238: