Update container with sandbox metadata after NetNS is created#7481
Update container with sandbox metadata after NetNS is created#7481estesp merged 1 commit intocontainerd:mainfrom
Conversation
|
Hi @qiutongs. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/ok-to-test |
|
|
f9884df to
3c3711c
Compare
|
|
||
| time.Sleep(5 * time.Second) | ||
| t.Log("Restart containerd after 5s") | ||
| RestartContainerd(t) |
There was a problem hiding this comment.
synced with @qiutongs last night. We need to send sigkill to prevent containerd from graceful shutdown. The graceful shutdown will cancel the context to the inflight rpc call, which might invoke the defer to cleanup the resource.
|
[It] runtime should support portforward in host network |
|
[It] runtime should support portforward in host network |
Signed-off-by: Qiutong Song <[email protected]>
|
/retest |
samuelkarp
left a comment
There was a problem hiding this comment.
Apologies for the late review, but LGTM. One small nit remains, but that can be addressed separately.
| # The command may return non-zero and we want to continue this script. | ||
| # e.g. containerd receives SIGKILL | ||
| set +e |
There was a problem hiding this comment.
Bash options (which are set with set) do not have a function scope, so this set +e affects execution for all commands afterward. There's no set -e (or set -o errexit) in the script, so we're not changing behavior, but it'd be better to move this to the top of the script so it's clear that it applies to everything.
Here's a little script demonstrating:
#!/bin/bash
# This should exit non-zero, but not cause the script to exit.
false
echo "false: $?"
# Abort the script when the first error occurs
set -e
echo "set -e"
# false # This would cause the script to abort
myfunc() {
echo "myfunc"
# Allow errors
set +e
false
echo "false: $?"
}
# call myfunc
myfunc
echo "calling false again"
false
echo "false: $?"
echo "If 'set -e' were in effect, we wouldn't reach this line."
set -e
false
echo "Unreachable"There was a problem hiding this comment.
The keepalive is running in background, which is in other process. It doesn't impact the following statements. But it needs doc or uses separate bash script to ship it.
This is to fix a bug introduced in #5904.
When the network namespace is created, we need to update both container spec and sandbox metadata so that both information can persist on disk. If sandbox metadata is missing, the NetNS is considered as closed when loading sandbox during restart recovery because the NetNS.path is empty. Closed NetNS will affect network teardown in sandbox stop.
Manual Testing
setupPodNetwork, e.g. 60ssudo crictl runp sandbox.jsonsudo crictl podsshows theBefore fix:
sudo crictl inspectp <POD ID>shows"netNamespaceClosed": trueAfter fix:
sudo crictl inspectp <POD ID>shows"netNamespaceClosed": falseNew Integration Test