Skip to content

Conversation

@helsaawy
Copy link
Contributor

@helsaawy helsaawy commented Jan 10, 2022

Fixed bug where container is cast as a process, which then causes the
container to be deleted prematurely before when the container finishes
executing.
Currently, whenever an LCOW container is stopped, the logs show multiple
errors being raised that runc cannot find the container, which cause the
Kill command issued by containerd to exit unsuccessfully.

Added conversion of runc log file error strings into error types that
wrap HResult error types.
Wrapped runc errors from log file, which is more informative that error
returned from cmd execution.

Added traces to guest container operations, to trace low level container
operations.

Signed-off-by: Hamza El-Saawy [email protected]

Comment on lines 147 to 148
entity := log.G(ctx).WithField(logfields.ContainerID, c.id)
entity.Debug("opengcs::Container::Delete")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the logs here be something other than debug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ill change them to Info

Err error `json:"error,omitempty"`
}

func (l *standardLogEntry) asError() (err error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the biggest fan of this function, trying to see if I can think of another way of doing things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also am not a fan, but I couldn't find an example elsewhere of something similar, and simply wrapping the string an errors.New meant the string operations would be pushed elsewhere in the code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is wrapping the string an issue?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be because of version issues (I think I was usuing v0.0.115), but the runc errors I was getting were container with id exists, and not what runc exports of container with given ID already exists. Similarly, container does not exist has the container id in the middle of it

I do want to revisit this in the future if we're confident those runc errors will hold steady and use those errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the string wrapping, that means elsewhere in the code, if we want to match against the error returned, we'd have to use string search, and that would leak into the runc handling code

Copy link
Contributor

@anmaxvl anmaxvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Fixed bug where container is cast as a process, which then causes the
container to be deleted prematurely before when the container finishes
executing.

Added conversion of runc log file error strings into `error` types that
wrap HResult error types.
Wrapped runc errors from log file, which is more informative that error
returned from cmd execution.

Added traces to guest container operations, to trace low level container
operations.

Signed-off-by: Hamza El-Saawy <[email protected]>
Signed-off-by: Hamza El-Saawy <[email protected]>
Signed-off-by: Hamza El-Saawy <[email protected]>
Copy link
Contributor

@ambarve ambarve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@helsaawy helsaawy merged commit 71baff4 into microsoft:master Feb 3, 2022
@helsaawy helsaawy deleted the he/runcstate branch February 3, 2022 19:12
princepereira pushed a commit to princepereira/hcsshim that referenced this pull request Aug 29, 2024
Fixed bug where container is cast as a process, which then causes the
container to be deleted prematurely before when the container finishes
executing.
Currently, whenever an LCOW container is stopped, the logs show multiple
errors being raised that runc cannot find the container, which cause the
Kill command issued by containerd to exit unsuccessfully.

Added conversion of runc log file error strings into error types that
wrap HResult error types.
Wrapped runc errors from log file, which is more informative that error
returned from cmd execution.

Added traces to guest container operations, to trace low level container
operations.

Signed-off-by: Hamza El-Saawy [email protected]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants