@dcantah dcantah commented Feb 15, 2022

We've repeatedly seen creation of the upper or work directories in the guest
fail with ENOSPC, with no view into what the mount looked like at the time of
the failure. This change catches any ENOSPC error hit while creating an
overlayfs mount, calls statfs, and logs the disk space and inode info for the
mount that the failed directory is on. This should make investigating these
issues much easier.

This may be followed up with a change to delete the upper and work directories
for a container, as this becomes troublesome with the model for sharing a scratch volume.

Signed-off-by: Daniel Canter <[email protected]>
@dcantah dcantah requested a review from a team as a code owner February 15, 2022 01:15

dcantah commented Feb 15, 2022

@anmaxvl Assigning you as we'd talked about this :)

@helsaawy helsaawy self-assigned this Feb 15, 2022
@helsaawy helsaawy left a comment

I am torn between processErrNoSpace and processNoSpaceError, but besides that and other nits, LGTM

dcantah commented Feb 15, 2022

> I am torn between processErrNoSpace and processNoSpaceError, but besides that and other nits, LGTM

@helsaawy I originally had it as processERRNOSPC, so we would've had three options! I landed on ErrNoSpace as it's the closest to the actual string representation of the error and is in 'Go form', but let me know if you feel strongly about the second. I'm not tied to either.

Signed-off-by: Daniel Canter <[email protected]>
@helsaawy

> > I am torn between processErrNoSpace and processNoSpaceError, but besides that and other nits, LGTM
>
> @helsaawy I originally had it as processERRNOSPC, so we would've had three options! I landed on ErrNoSpace as it's the closest to the actual string representation of the error and is in 'Go form', but let me know if you feel strongly about the second. I'm not tied to either.

I guess ErrNoSpace would be the Go repr, so let's just leave it at that then.

@anmaxvl anmaxvl left a comment

LGTM. I was just checking the logs to see what other places could result in this error and whether we should log the stats there as well, but it looks like we'll eventually end up failing to mount overlay for container, so it's fine to just log it here.

@dcantah dcantah merged commit 14414dd into microsoft:master Feb 19, 2022
princepereira pushed a commit to princepereira/hcsshim that referenced this pull request Aug 29, 2024
* Linux GCS: Log disk info on ENOSPC errors

3 participants