config-linux: add support for rsvd hugetlb cgroup#1116
Merged
kolyshkin merged 1 commit into opencontainers:main on Mar 21, 2023
The previous non-rsvd `max`/`limit_in_bytes` limit does not account for reserved huge page memory. A process can therefore reserve all of the huge page memory without being able to allocate it (due to hugetlb cgroup page fault accounting restrictions).

In practice, a process can successfully mmap more huge page memory than the cgroup settings allow, but it receives a SIGBUS and crashes as soon as it uses that memory. This hurts applications that mmap at startup: the mmap succeeds, and the program crashes later when it starts using the memory. PostgreSQL, for example, does this by default.

This patch updates and clarifies `LinuxResources.HugepageLimits` and `LinuxHugepageLimit` so that the configuration applies to the rsvd hugetlb cgroup when it is supported and falls back to page fault accounting when it is not.

Fixes opencontainers#1050

Signed-off-by: Kailun Qin <[email protected]>
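To make the fallback described above concrete, here is a minimal Go sketch of how a runtime might pick the cgroup file for a huge page limit. It is an illustration under stated assumptions, not the implementation from any runtime: `hugepageLimit` is a local stand-in for `LinuxHugepageLimit`, and `limitFile` and its `exists` probe are hypothetical names. Only the kernel file names (`hugetlb.<size>.rsvd.max`, `hugetlb.<size>.max`, and their cgroup v1 `limit_in_bytes` counterparts) come from the kernel's hugetlb controller interface.

```go
package main

import "fmt"

// hugepageLimit is a local stand-in for the two fields of
// LinuxHugepageLimit discussed in this PR (assumption: not the specs-go type).
type hugepageLimit struct {
	PageSize string // huge page size, e.g. "2MB" or "1GB"
	Limit    uint64 // limit in bytes
}

// limitFile returns the cgroup file name the limit would be written to.
// exists is a hypothetical probe that reports whether a file is present
// in the container's hugetlb cgroup directory.
func limitFile(l hugepageLimit, cgroupV2 bool, exists func(name string) bool) string {
	suffix := "limit_in_bytes" // cgroup v1 naming
	if cgroupV2 {
		suffix = "max" // cgroup v2 naming
	}
	// Prefer reservation (rsvd) accounting when the kernel exposes it.
	rsvd := fmt.Sprintf("hugetlb.%s.rsvd.%s", l.PageSize, suffix)
	if exists(rsvd) {
		return rsvd
	}
	// Otherwise fall back to page fault accounting.
	return fmt.Sprintf("hugetlb.%s.%s", l.PageSize, suffix)
}

func main() {
	l := hugepageLimit{PageSize: "2MB", Limit: 200 << 20}
	withRsvd := func(string) bool { return true }     // kernel exposes rsvd files
	withoutRsvd := func(string) bool { return false } // kernel lacks rsvd files

	fmt.Println(limitFile(l, true, withRsvd))     // hugetlb.2MB.rsvd.max
	fmt.Println(limitFile(l, true, withoutRsvd))  // hugetlb.2MB.max
	fmt.Println(limitFile(l, false, withoutRsvd)) // hugetlb.2MB.limit_in_bytes
}
```

Passing the existence check in as a function keeps the selection logic testable without a mounted cgroup filesystem. An actual runtime may choose differently, for example by writing the limit to both files.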
@kailun-qin I'm confused, this patch only seems to include code comments and doc changes?
cbandy reviewed on Jan 24, 2023

This (together with the runtime implementation) should fix the real issue with some software, described in #1050.
@tianon PTAL
cbandy approved these changes on Mar 18, 2023

Thanks @kailun-qin!
@kailun-qin @odinuge Do you have a PR for runc?
omprakaash added a commit to omprakaash/oci-spec-rs that referenced this pull request on Mar 9, 2024

Adds support for the rsvd hugetlb cgroup. Enables reservation-time checks on huge page memory limits. More info: opencontainers/runtime-spec#1116
omprakaash added a commit to omprakaash/oci-spec-rs that referenced this pull request on Mar 9, 2024

Adds support for the rsvd hugetlb cgroup. Enables reservation-time checks on huge page memory limits. More info: opencontainers/runtime-spec#1116

Signed-off-by: Om Prakaash <[email protected]>