Skip to content

Use ttrpc OnClose to fix restored dead shims#2109

Merged
stevvooe merged 3 commits intocontainerd:masterfrom
crosbymichael:onclose
Feb 6, 2018
Merged

Use ttrpc OnClose to fix restored dead shims#2109
stevvooe merged 3 commits intocontainerd:masterfrom
crosbymichael:onclose

Conversation

@crosbymichael
Copy link
Copy Markdown
Member

@crosbymichael crosbymichael commented Feb 6, 2018

Closes #2083

Fixes #2078

This is an alternative to #2083 that does not have a blocking wait

Contains the OnClose method for the client

Signed-off-by: Michael Crosby <[email protected]>
When restoring a task make sure that dead shims are handled.

Fixes containerd#2078

Signed-off-by: Michael Crosby <[email protected]>
@crosbymichael
Copy link
Copy Markdown
Member Author

Verified this manually

@codecov-io
Copy link
Copy Markdown

codecov-io commented Feb 6, 2018

Codecov Report

Merging #2109 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2109   +/-   ##
=======================================
  Coverage   45.29%   45.29%           
=======================================
  Files          95       95           
  Lines        9461     9461           
=======================================
  Hits         4285     4285           
  Misses       4464     4464           
  Partials      712      712
Flag Coverage Δ
#linux 50.18% <ø> (ø) ⬆️
#windows 40.19% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e92c913...95d4f53. Read the comment docs.

return nil, nil, errors.Wrap(err, "failed to set OOM Score on shim")
}
c, clo, err := WithConnect(address)(ctx, config)
c, clo, err := WithConnect(address, func() {})(ctx, config)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we just pass nil instead? I feel it'd be more natural and expected

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nil would cause a panic. no-ops work well here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a matter of checking it before setting it though.

I just think it'd be more idiomatic.

Comment thread linux/runtime.go
"id": id,
"namespace": ns,
}).Error("connecting to shim")
err := r.cleanupAfterDeadShim(ctx, bundle, ns, id, pid)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a task.Delete() here.

If it's not done, re-running the same container (with the same name) will fail until containerd is restarted (at which point it'll delete the task when the plugin loads)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we don't have a task at this point. Shouldn't r.terminate do the same thing?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That why I moved the task adding in my orig PR.

The task.Delete() call is after the call to cleanupAfterDeadShim if it doesn't error out. I think it is an issue now, because task with a dead shim can no longer be listed as when a list of task is requested we now try to get its state from the shim. The original behavior left it to the user to invoke the Delete call, but that's no longer possible.

It may be better to automatically delete the task in r.terminate. But the proper event would need to be generated too (realized that wasn't the case in my orig PR either).

@crosbymichael
Copy link
Copy Markdown
Member Author

@mlaventure I added another commit to address this. let me know what you think

Copy link
Copy Markdown
Contributor

@mlaventure mlaventure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this should work.

I've tested it locally too and it works with the latest change.

Would have to design a regression test for it though. can do it in a follow up PR sometime this week I think.

Comment thread linux/runtime.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can now be removed

@mlaventure
Copy link
Copy Markdown
Contributor

oh, it doesn't build on windows :)

Signed-off-by: Michael Crosby <[email protected]>
@crosbymichael
Copy link
Copy Markdown
Member Author

Fixed

@stevvooe stevvooe added this to the 1.0.2 milestone Feb 6, 2018
@stevvooe
Copy link
Copy Markdown
Member

stevvooe commented Feb 6, 2018

LGTM

@stevvooe stevvooe merged commit 0d9995e into containerd:master Feb 6, 2018
@crosbymichael crosbymichael deleted the onclose branch February 6, 2018 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants