Skip to content

[port #5904 to sbserver] Setup pod network after creating the sandbox container#7426

Merged
samuelkarp merged 4 commits intocontainerd:mainfrom
samuelkarp:port-pr-5904-to-sbserver
Nov 23, 2022
Merged

[port #5904 to sbserver] Setup pod network after creating the sandbox container#7426
samuelkarp merged 4 commits intocontainerd:mainfrom
samuelkarp:port-pr-5904-to-sbserver

Conversation

@samuelkarp
Copy link
Copy Markdown
Member

@samuelkarp samuelkarp commented Sep 23, 2022

This pull request ports #5904 and #7481 to the sbserver implementation in the CRI plugin. Since sbserver is still incomplete, this PR is currently working around the incomplete parts. However, it does pass all the integration tests and is working.

I'd like to get this merged as we continue to work on the Sandbox API since the integration tests enabled by this PR will help catch behavior regressions.

(cc @qiutongs)

@samuelkarp samuelkarp requested review from fuweid and mxpv September 23, 2022 22:50
@k8s-ci-robot
Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@samuelkarp
Copy link
Copy Markdown
Member Author

/test all
/test pull-containerd-sandboxed-node-e2e

@samuelkarp samuelkarp added the area/cri Container Runtime Interface (CRI) label Sep 23, 2022
@samuelkarp samuelkarp added this to the 1.7 milestone Sep 23, 2022
@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch 2 times, most recently from 4172c46 to 1ddd92b Compare September 23, 2022 23:45
@samuelkarp
Copy link
Copy Markdown
Member Author

/test all
/test pull-containerd-sandboxed-node-e2e

Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go Outdated
Comment thread pkg/cri/sbserver/podsandbox/sandbox_stop.go
@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch from 1ddd92b to 4367e31 Compare September 26, 2022 22:52
@samuelkarp
Copy link
Copy Markdown
Member Author

samuelkarp force-pushed the port-pr-5904-to-sbserver branch from 1ddd92b to 4367e31

Updated per review comments from @qiutongs.

@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch from 4367e31 to 99e08fe Compare September 29, 2022 21:04
@samuelkarp
Copy link
Copy Markdown
Member Author

samuelkarp force-pushed the port-pr-5904-to-sbserver branch from 4367e31 to 99e08fe

Rebased on #5904

@samuelkarp samuelkarp marked this pull request as ready for review September 30, 2022 01:16
@samuelkarp
Copy link
Copy Markdown
Member Author

/test pull-containerd-sandboxed-node-e2e

@sbuckfelder
Copy link
Copy Markdown

Now that #5904 is merged could you rebase on main to make it easier to review your changes?

Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go
Comment thread pkg/cri/sbserver/restart.go
Comment thread pkg/cri/sbserver/sandbox_run.go
Comment thread pkg/cri/sbserver/sandbox_run.go
Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go
@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch from 99e08fe to ba43164 Compare October 13, 2022 06:58
@samuelkarp
Copy link
Copy Markdown
Member Author

Now that #5904 is merged could you rebase on main to make it easier to review your changes?

@sbuckfelder Done, though I may also need to port #7481 over to sbserver as well.

@samuelkarp
Copy link
Copy Markdown
Member Author

/test pull-containerd-sandboxed-node-e2e

@samuelkarp
Copy link
Copy Markdown
Member Author

I've updated this with additional commits porting #7481 to sbserver as well (and slightly rewriting the integration test, as some of the assumptions it makes don't apply to sbserver).

/test pull-containerd-sandboxed-node-e2e

Comment thread integration/sandbox_run_rollback_test.go Outdated
@samuelkarp
Copy link
Copy Markdown
Member Author

samuelkarp commented Oct 14, 2022

Marked as draft while I resolve the test failures.

No longer a draft.

@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch 2 times, most recently from 3c33ddb to fb884ef Compare October 14, 2022 06:20
@samuelkarp
Copy link
Copy Markdown
Member Author

/retest

@samuelkarp
Copy link
Copy Markdown
Member Author

/test pull-containerd-sandboxed-node-e2e

@samuelkarp samuelkarp marked this pull request as ready for review October 14, 2022 07:37
Comment thread integration/sandbox_run_rollback_test.go Outdated
Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go Outdated
Comment on lines +222 to +237
if resp != nil && resp.SandboxID == "" && resp.Pid == 0 && resp.CreatedAt == nil && len(resp.Labels) == 0 {
// if resp is a non-nil zero-value, an error occurred during cleanup
cleanupErr = fmt.Errorf("failed to cleanup sandbox")
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Start() return an error instead of asking the caller to check whether resp is its zero value or not?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled with this question too. Start() does already return an error in the case when the sandbox couldn't be started, and Start() attempts to roll-back the changes made in order to setup and start the sandbox. The challenge is that there's an additional indication needed for the case where Start() fails in its cleanup. Rather than add another error and change the sandbox.Controller interface, I opted for doing this zero-value approach instead. There's a note at the top of Start() now to check whether this is reasonable once Delete() is implemented, since an alternative would be to remove all cleanup actions from Start() and instead expect that the caller invoke Delete() explicitly if Start() returns an error.

I'm open to other suggestions here though if you've got some.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Start() fails, Start() should internally roll-back what it created itself. If Start() succeeds, then Delete() needs to be called to clean up what created by Start().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Start() fails, Start() should internally roll-back what it created itself.

This is discussing the case where Start() fails to roll-back what it created.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I mean, The rollback should eventually be handled within the SandboxController. includes handling Sandbox 's TaskEvents with EventMonitor in the SandboxController. The CRI should not be aware of the SandboxContainer, it only uses the SandboxController.

Of course, as an intermediate process, the current implementation is possible. I just imagine the final state.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree with you. Kubelet will call RemovePodSandbox for Pods in NotReady state.
The current implementation of SandboxController I feel that there are still some unfinished, such as handleSandboxExit processing in sbserver/events.go.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. This PR is meant to port over the bugfix behavior of #5904 and enable the new tests as we continue to work on the overall Sandbox API support. I expect that as more of the SandboxController and sbserver are implemented, some of this code will need to change (though the tests shouldn't as those should help prevent behavior regression).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I appreciate your work. It takes a lot of time.
If possible we can communicate together. I am very interested in this. :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I hope not to trouble you because of the pit I wrote.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No trouble, thanks for taking the time to look at the PR and share your feedback!

@thaJeztah
Copy link
Copy Markdown
Member

Would it be possible to fix the first commit to revert the imports that moved to reduce code-churn a bit? (I'm guessing it git got confused when cherry-picking the changes)

@samuelkarp
Copy link
Copy Markdown
Member Author

Would it be possible to fix the first commit to revert the imports that moved to reduce code-churn a bit? (I'm guessing it git got confused when cherry-picking the changes)

@thaJeztah Yes, I'll take a look at that today. Was focusing on getting the tests to pass first yesterday. 😂

@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch 3 times, most recently from 51538ac to cd4338f Compare October 14, 2022 18:05
Comment thread integration/sandbox_run_rollback_test.go
Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go
Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go
Comment thread pkg/cri/sbserver/restart.go Outdated
Comment thread pkg/cri/sbserver/restart.go
Comment thread pkg/cri/sbserver/sandbox_run.go Outdated
Comment thread pkg/cri/sbserver/sandbox_run.go
@samuelkarp samuelkarp force-pushed the port-pr-5904-to-sbserver branch from aff235f to ac4af4d Compare November 22, 2022 00:47
Comment thread pkg/cri/sbserver/podsandbox/sandbox_run.go
@mxpv
Copy link
Copy Markdown
Member

mxpv commented Nov 22, 2022

/test pull-containerd-sandboxed-node-e2e

Copy link
Copy Markdown
Member

@mxpv mxpv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on green

(I have a couple of follow ups, but don't want to block this PR as it painful to rebase and fix conflicts each time).

@samuelkarp samuelkarp merged commit 7d3ca17 into containerd:main Nov 23, 2022
@samuelkarp samuelkarp deleted the port-pr-5904-to-sbserver branch November 23, 2022 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cri Container Runtime Interface (CRI)

Projects

None yet

Development

Successfully merging this pull request may close these issues.