Expand on the definition of our ops #225

duglin · 2015-10-13T17:34:02Z

There are a few "OPEN" question in the doc that I'd like to brainstorm on.

Signed-off-by: Doug Davis [email protected]

wking · 2015-10-13T17:38:07Z

runtime.md

There's a mailing list thread for this command-line specification, and initial work in this Gist.

@wking can you remove the comments that aren't asking for a specific change? Not sure how many others are like me, but I like to have a clean PR (no outstanding comments that need resolving) before they are merged, and if a PR has a lot of comments that are just "FYI" in nature then its hard to pick out the ones that need action from the ones we can ignore.

On Mon, Jan 04, 2016 at 10:44:32AM -0800, Doug Davis wrote:

@wking can you remove the comments that aren't asking for a specific
change?

Will do. Now that we have a tagged set of open list issues [1,2](including the command-line API [3]), I think there's less need for
either my FYI comments or entries like this. So my actionable
suggestion here is “drop this line in favor of 1 for listing open
high-level discussion”.

wking · 2015-10-13T18:00:50Z

On Tue, Oct 13, 2015 at 10:34:04AM -0700, Doug Davis wrote:

There are a few "OPEN" question in the doc that I'd like to
brainstorm on.

I think this belongs in the lifecycle thread [1,2], but I'm in favor
of anything that ends up in some kind of lifecycle docs landing :). I
left a few comments inline with more detailed suggestions.

crosbymichael · 2015-10-13T23:39:41Z

@duglin do you think adding things like pause/resume and checkpoint/restore would hurt other platforms that do not have this type of feature set? Maybe these are somethings that we can leave out for broader adoption and spec out the main usecase of create, start, stop, delete?

duglin · 2015-10-14T00:29:57Z

@crosbymichael that's why I worded the intro paragraph the way I did. I tried to make it say that these ops are only required to be supported if the platform itself supports it. Perhaps some better wording is needed, but what do you think about the general idea?

crosbymichael · 2015-10-14T00:31:03Z

I think it's good overall. Just don't want to add wording that could prohibit runtimes being compliant for some of these "extra" features.

mrunalp · 2015-10-14T17:55:50Z

runtime.md

Maybe simplify a bit to "unless the operation is not supported by the base operating system" ?

On Wed, Oct 14, 2015 at 10:55:52AM -0700, Mrunal Patel wrote:

+OCI compliant runtimes MUST support the following operations,
unless support for the operation can not be supported by the base
operating system (kernel) itself.

Maybe simplify a bit to "unless the operation is not supported by
the base operating system" ?

+1 to Mrunal's wording, but I'd switch “base” → “host” as well.

crosbymichael · 2015-11-10T00:31:15Z

runtime.md

I would not say this today because its the opposite of how it actually works.

I think depends on whether paths were passed for the namespaces/cgroups or the runtime created those for the container.

On Tue, Nov 17, 2015 at 04:38:22PM -0800, Mrunal Patel wrote:

+Stopping a container MUST stop all of the processes running within the scope of the container.
+Stopping a container MUST NOT delete the associated namespaces or cgroups for the container.

I think depends on whether paths were passed for the
namespaces/cgroups or the runtime created those for the container.

+1. We should only be removing namespaces and cgroups (and killing
any processes that are still in them or their ancestors) if we're
stopping/deleting the container that created them. I think everyone's
on board with that (although it's not how the “delete” phrasing
currently reads here).

However, this PR is splitting runC's current cleanup into distinct
“stop” and “delete” actions, and I think that's what @crosbymichael is
objecting to (because runC doesn't currently have a way to
stop-but-not-delete a container).

@crosbymichael ignoring for a moment what the code does today... should stopping a container delete the namespaces and cgroups? Shouldn't delete and stop be distinct actions?

Its kinda a hard question because it goes into how you define a container. Technically I think its a bad idea to keep namespaces around because its buggy and racy just to make it fit someones made up definition of a container. We would be jumping through hoops in code and on the system for no real use case other than making the technical reality match something that one of us typed.

We dont keep around all the memory of a VM when its stopped just because we made a UI with a stop and delete button so I dont think it makes sense to do this.

Just because we want a stop and delete phase does not mean we need to go looking for something that we can keep laying around to clean up later just to support the thought. Maybe we don't need a stop and delete at all so maybe we should discuss some reasons why people think it should be added.

Still coming up to speed on the details of this but let's say I have a container that gets stopped and then started again later. If we delete the namespaces/cgroup during the stop, is there anything that can not be reliably recreated upon the next time the container is started? Any worry about some other namespace/cgroup grabbing something that we'd like to hold on to? If not, then I think you're probably right.

What you described is how it works today and this has never come up as an issue except in "wouldn't it be cool if..." discussions.

In that case, are you suggesting we go all the way and just merge delete() and stop()?

Yes, for the looks of what delete is defined as right now. I dont see any advantages right now from having it other than more operations and more complexity for implementer

On Mon, Jan 04, 2016 at 01:42:12PM -0800, Michael Crosby wrote:

What you described is how it works today and this has never come up
as an issue except in "wouldn't it be cool if..." discussions.

“Kill everything in the container, removing its namespaces and
cgroups” sounds like “delete” to me, so I don't see much reason to
distinguish between “stop” and “delete” if that's also what “stop”
means.

For a situation in which preserving namespaces is how it works today,
see my create/exec/destroy example [1,2]. One place where this moves
beyond “wouldn't it be cool if…” is that a spec based on this kind of
create/exec/delete separation lets us get out of the hook business and
its ordering issues 3; folks can use a shell script (or whatever)
for both the runtime actions and the currently-hook actions.

crosbymichael · 2015-11-10T00:38:09Z

@duglin overall I think it looks good. I would still suggest removing pause/resume and checkpoint/restore until we get more information about other platforms and how they are going to handle some of the actions. It's always easier to add more later than take somethings away.

mrunalp · 2015-11-18T00:21:54Z

runtime.md

Generating an error makes sense.

On Tue, Nov 17, 2015 at 04:21:56PM -0800, Mrunal Patel wrote:

+OPEN: the entire "generate an error" stuff in here is something we should discuss.

Generating an error makes sense.

I think we all agree that we want error reporting, but we don't have a
clear picture for how to report errors to the caller. With both the
standard streams and error code already assigned to the container
process, what channel is the runtime using to report internal errors?

I purposely used the word "generate" instead of "report" or something else that implies the end user sees it. I think that gets into implementation details. For example, some impl might just log everything and their "user" never sees any of it and I think that's valid from this specs perspective. The point is that we all agree its an error condition and the action did not happen but how the impl deal with the error is up to it.

On Mon, Jan 04, 2016 at 10:49:48AM -0800, Doug Davis wrote:

+OPEN: the entire "generate an error" stuff in here is something we should discuss.

I purposely used the word "generate" instead of "report" or
something else that implies the end user sees it. I think that gets
into implementation details. For example, some impl might just log
everything and their "user" never sees any of it and I think that's
valid from this specs perspective. The point is that we all agree
its an error condition and the action did not happen but how the
impl deal with the error is up to it.

That sounds like a useful distinction to me, and I think its worth
spelling it in this PR. How about adding an error-handling section
somewhere (the glossary? #107) that translates your generate
vs. report disctinction into spec language, and then just use “MUST
generate an error” like you already do in this section.

Once we do that, we can drop the OPEN issue here, and open a mailing
list thread if/when we want something in the spec about the reporting
aspect.

up will add a section that talks about generating errors.

On Mon, Jan 04, 2016 at 12:12:49PM -0800, Doug Davis wrote:

+OPEN: the entire "generate an error" stuff in here is something we should discuss.

up will add a section that talks about generating errors.

The “Errors” section that landed with b669866 → 7a5b1f9 looks good to me.

wking · 2016-01-04T20:44:33Z

runtime.md

I still think @mrunalp's earlier suggestion for simplifying this sentence would be a good change.

duglin · 2016-01-05T20:48:40Z

Updated. Merged create/start and stop/delete.
A couple of other questions/comments for folks to weigh in on too.

wking · 2016-01-05T22:08:52Z

runtime.md

nit: trailing whitespace.

And I don't see any point to declaring an action but not specifying how to configure/trigger it. How would you test runtimes for conformance with this action? What do bundle-authors or runtime-callers do to use this feature?

I'll add something about exec's config file.

On Tue, Jan 05, 2016 at 02:12:40PM -0800, Doug Davis wrote:

+Unlike the container's main process which is specified in the config.json file, this specification does not define how the exec process is defined.

I'll add something about exec's config file.

To get this PR landed faster, I suggest pulling the exec action out
into a follow-up PR. But maybe everyone else thinks we're closer to
an exec-spec consensus than I do ;).

This slipped through the renumbering in 7117ede (Expand on the definition of our ops, 2015-10-13, opencontainers#225). Signed-off-by: W. Trevor King <[email protected]>

# digest/hashing target Most of this has spun off with [1], and I haven't heard of anyone talking about verifying the on-disk filesystem in a while. My personal take is on-disk verification doesn't add much over serialized verification unless you have a local attacker (or unreliable disk), and you'll need some careful threat modeling if you want to do anything productive about the local attacker case. For some more on-disk verification discussion, see the thread starting with [2]. # distributable-format target This spun off with [1]. # lifecycle target I think this is resolved since 7713efc (Add lifecycle for containers, 2015-10-22, opencontainers#231), which was committed on the same day as the ROADMAP entry (4859f6d, Add initial roadmap, 2015-10-22, opencontainers#230). # container-action target Addressed by 7117ede (Expand on the definition of our ops, 2015-10-13, opencontainers#225), although there has been additional discussion in a7a366b (Remove exec from required runtime functionalities, 2016-04-19, opencontainers#388) and 0430aaf (Split create and start, 2016-04-01, opencontainers#384). # validation and testing targets Validation is partly covered by cdcabde (schema: JSON Schema and validator for `config.json`, 2016-01-19, opencontainers#313) and subequent JSON Schema work. The remainder of these targets are handled by ocitools [3]. # printable/compiled-spec target The bulk of this was addressed by 4ee036f (*: printable documents, 2015-12-09, opencontainers#263). Any remaining polishing of that workflow seems like a GitHub-issue thing and not a ROADMAP thing. And publishing these to opencontainers.org certainly seems like it's outside the scope of this repository (although I think that such publishing is a good idea). [1]: https://github.com/opencontainers/image-spec [2]: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/NHpSQ19KCAAJ Subject: OCI Bundle Digests Summary Date: Wed, 14 Oct 2015 17:09:15 +0000 Message-ID: <CAD2oYtN-9yLLhG_STO3F1h58Bn5QovK+u3wOBa=t+7TQi-hP1Q@mail.gmail.com> [3]: https://github.com/opencontainers/ocitools Signed-off-by: W. Trevor King <[email protected]>

This wording is descended from 7117ede (Expand on the definition of our ops, 2015-10-13, opencontainers#225), but the idea is covered generically by e53a72b (Clarify the operation is not for command-line api, 2016-05-24, opencontainers#450), so we no longer need a create-specific note. Especially in the lifecycle docs, where there's already enough going on without this low-level detail. Signed-off-by: W. Trevor King <[email protected]>

wking reviewed Oct 13, 2015
View reviewed changes

duglin mentioned this pull request Oct 13, 2015

Container manipulated via multiple commands #226

Closed

mrunalp reviewed Oct 14, 2015
View reviewed changes

duglin mentioned this pull request Oct 28, 2015

Add lifecycle for containers #231

Merged

crosbymichael reviewed Nov 10, 2015
View reviewed changes

mrunalp reviewed Nov 18, 2015
View reviewed changes

duglin force-pushed the RuntimeOps branch 3 times, most recently from f74c3ca to 7a5b1f9 Compare January 4, 2016 20:37

wking reviewed Jan 4, 2016
View reviewed changes

runtime.md Outdated

Copy link

Contributor

wking Jan 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think @mrunalp's earlier suggestion for simplifying this sentence would be a good change.

duglin force-pushed the RuntimeOps branch 3 times, most recently from 6393659 to 920757b Compare January 5, 2016 18:01

duglin force-pushed the RuntimeOps branch from 920757b to 4a674dc Compare January 5, 2016 20:49

wking reviewed Jan 5, 2016
View reviewed changes

wking pushed a commit to wking/opencontainer-runtime-spec that referenced this pull request Sep 8, 2016

Expand on the definition of our ops #225

Expand on the definition of our ops #225

Conversation

duglin commented Oct 13, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wking commented Oct 13, 2015

Uh oh!

crosbymichael commented Oct 13, 2015

Uh oh!

duglin commented Oct 14, 2015

Uh oh!

crosbymichael commented Oct 14, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

crosbymichael commented Nov 10, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

duglin commented Jan 5, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!