[port #5904 to sbserver] Setup pod network after creating the sandbox container#7426
Conversation
|
Skipping CI for Draft Pull Request. |
|
/test all |
4172c46 to
1ddd92b
Compare
|
/test all |
1ddd92b to
4367e31
Compare
4367e31 to
99e08fe
Compare
|
/test pull-containerd-sandboxed-node-e2e |
|
Now that #5904 is merged could you rebase on main to make it easier to review your changes? |
99e08fe to
ba43164
Compare
@sbuckfelder Done, though I may also need to port #7481 over to sbserver as well. |
|
/test pull-containerd-sandboxed-node-e2e |
|
I've updated this with additional commits porting #7481 to sbserver as well (and slightly rewriting the integration test, as some of the assumptions it makes don't apply to sbserver). /test pull-containerd-sandboxed-node-e2e |
|
No longer a draft. |
3c33ddb to
fb884ef
Compare
|
/retest |
|
/test pull-containerd-sandboxed-node-e2e |
| if resp != nil && resp.SandboxID == "" && resp.Pid == 0 && resp.CreatedAt == nil && len(resp.Labels) == 0 { | ||
| // if resp is a non-nil zero-value, an error occurred during cleanup | ||
| cleanupErr = fmt.Errorf("failed to cleanup sandbox") | ||
| } |
There was a problem hiding this comment.
Should Start() return an error instead of asking the caller to check whether resp is its zero value or not?
There was a problem hiding this comment.
I struggled with this question too. Start() does already return an error in the case when the sandbox couldn't be started, and Start() attempts to roll-back the changes made in order to setup and start the sandbox. The challenge is that there's an additional indication needed for the case where Start() fails in its cleanup. Rather than add another error and change the sandbox.Controller interface, I opted for doing this zero-value approach instead. There's a note at the top of Start() now to check whether this is reasonable once Delete() is implemented, since an alternative would be to remove all cleanup actions from Start() and instead expect that the caller invoke Delete() explicitly if Start() returns an error.
I'm open to other suggestions here though if you've got some.
There was a problem hiding this comment.
If Start() fails, Start() should internally roll-back what it created itself. If Start() succeeds, then Delete() needs to be called to clean up what created by Start().
There was a problem hiding this comment.
If Start() fails, Start() should internally roll-back what it created itself.
This is discussing the case where Start() fails to roll-back what it created.
There was a problem hiding this comment.
Ok, I mean, The rollback should eventually be handled within the SandboxController. includes handling Sandbox 's TaskEvents with EventMonitor in the SandboxController. The CRI should not be aware of the SandboxContainer, it only uses the SandboxController.
Of course, as an intermediate process, the current implementation is possible. I just imagine the final state.
There was a problem hiding this comment.
Yes, I agree with you. Kubelet will call RemovePodSandbox for Pods in NotReady state.
The current implementation of SandboxController I feel that there are still some unfinished, such as handleSandboxExit processing in sbserver/events.go.
There was a problem hiding this comment.
Yep. This PR is meant to port over the bugfix behavior of #5904 and enable the new tests as we continue to work on the overall Sandbox API support. I expect that as more of the SandboxController and sbserver are implemented, some of this code will need to change (though the tests shouldn't as those should help prevent behavior regression).
There was a problem hiding this comment.
yes, I appreciate your work. It takes a lot of time.
If possible we can communicate together. I am very interested in this. :)
There was a problem hiding this comment.
And I hope not to trouble you because of the pit I wrote.
There was a problem hiding this comment.
No trouble, thanks for taking the time to look at the PR and share your feedback!
|
Would it be possible to fix the first commit to revert the imports that moved to reduce code-churn a bit? (I'm guessing it git got confused when cherry-picking the changes) |
@thaJeztah Yes, I'll take a look at that today. Was focusing on getting the tests to pass first yesterday. 😂 |
51538ac to
cd4338f
Compare
620c032 to
aff235f
Compare
Port of 4f4aad0 to sbserver Signed-off-by: Samuel Karp <[email protected]>
Signed-off-by: Samuel Karp <[email protected]>
Port of b41d6f4 to sbserver Signed-off-by: Samuel Karp <[email protected]>
Signed-off-by: Samuel Karp <[email protected]>
aff235f to
ac4af4d
Compare
|
/test pull-containerd-sandboxed-node-e2e |
This pull request ports #5904 and #7481 to the sbserver implementation in the CRI plugin. Since sbserver is still incomplete, this PR is currently working around the incomplete parts. However, it does pass all the integration tests and is working.
I'd like to get this merged as we continue to work on the Sandbox API since the integration tests enabled by this PR will help catch behavior regressions.
(cc @qiutongs)