support quota set on overlay snapshot #5859
base: main
Conversation
Hi @yylt. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with the appropriate command. Once the patch is verified, the new status will be reflected accordingly. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Build succeeded.
See the License for the specific language governing permissions and
limitations under the License.
*/
This file seems copied from Moby. Please clarify the origin of the code and the license in code comments.
snapshots/quota/projectquota.go
#define Q_XGETPQUOTA QCMD(Q_XGETQUOTA, PRJQUOTA)
#endif
*/
import "C"
Can we avoid cgo?
> Can we avoid cgo?
Done!
Quota handling is a topic we should discuss at a high level first and possibly a good topic for a future community meeting. However, getting library support for setting project quotas across different backing FS makes sense to do now. As an end feature, we need to discuss how it will be integrated with Kubernetes and how we can use it from our own tooling and go libraries.

It seems backwards to have the cri layer here enable the quota then the snapshot set a static quota. I would think the quota would be set by the client adding a quota and the snapshotter enabling it via configuration and erroring out if the backing FS couldn't support it when enabled. The difficult part is figuring out from the client perspective whether quota is enabled so it knows whether it can avoid expensive ephemeral storage accounting (such as in the Kubelet).
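A minimal sketch of what such a backing-FS-agnostic quota library interface might look like; all names here are hypothetical illustrations of the design being discussed, not code from this PR:

```go
package main

import (
	"errors"
	"fmt"
)

// QuotaController is a hypothetical interface a snapshotter could program
// against, regardless of whether the backing FS is xfs, ext4, or btrfs.
type QuotaController interface {
	// SetQuota applies a hard byte limit to the given directory.
	SetQuota(path string, sizeBytes uint64) error
	// Supported reports whether the backing filesystem can enforce quotas.
	Supported() bool
}

// noopController stands in for an unsupported backing FS; per the
// discussion, the snapshotter should error out rather than silently
// ignore the limit when quota is explicitly enabled.
type noopController struct{}

func (noopController) SetQuota(string, uint64) error {
	return errors.New("backing filesystem does not support project quotas")
}

func (noopController) Supported() bool { return false }

func main() {
	var ctl QuotaController = noopController{}
	// A client (e.g. the kubelet) could use Supported() to decide whether
	// it can skip expensive ephemeral storage accounting.
	fmt.Println(ctl.Supported())
	fmt.Println(ctl.SetQuota("/var/lib/containerd/snap/1/fs", 15<<30))
}
```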
pkg/cri/config/config.go
@@ -266,6 +266,9 @@ type PluginConfig struct {
	// of being placed under the hardcoded directory /var/run/netns. Changing this setting requires
	// that all containers are deleted.
	NetNSMountsUnderStateDir bool `toml:"netns_mounts_under_state_dir" json:"netnsMountsUnderStateDir"`
	// SupportSetQuota indicates whether to set a quota on the container's read-write snapshot.
	// When the snapshotter does not support setting quotas, this is a no-op.
	SupportSetQuota bool `toml:"support_set_quota" json:"supportSetQuota"`
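A hedged sketch of how this flag might look in a containerd config file; the section path follows the existing CRI plugin config layout and is illustrative only:

```toml
# /etc/containerd/config.toml (illustrative fragment)
[plugins."io.containerd.grpc.v1.cri"]
  # No-op when the configured snapshotter cannot set quotas.
  support_set_quota = true
```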
Maybe EnableRootfsQuota or EnableSetQuota sounds better. In Kubernetes, there is a feature gate and a check method.
Sorry for replying so much later. That is a good suggestion. The quota setter now focuses on overlay and is pure Go, and the snapshotter is updated slightly, adding a new option such as:
I tried building containerd to do some tests. Too many quotas are created in my env.
[root@daocloud ~]# xfs_quota -xc "report -a -p -h" | head -n 100
Project quota on /var/lib/kubelet (/dev/sdc)
Blocks
Project ID Used Soft Hard Warn/Grace
---------- ---------------------------------
#0 820K 0 0 00 [------]
volume1048590 0 8E 8E 00 [------]
volume1048605 0 8E 8E 00 [------]
volume1048607 0 8E 8E 00 [------]
volume1048610 0 8E 8E 00 [------]
Project quota on /var/lib/containerd (/dev/sdb)
Blocks
Project ID Used Soft Hard Warn/Grace
---------- ---------------------------------
#0 10.2G 0 0 00 [------]
#2 0 0 0 00 [------]
#3 0 15G 15G 00 [------]
#4 12K 15G 15G 00 [------]
#5 0 15G 15G 00 [------]
#6 0 15G 15G 00 [------]
#7 0 15G 15G 00 [------]
#8 0 15G 15G 00 [------]
#9 0 15G 15G 00 [------]
#10 0 15G 15G 00 [------]
#11 0 15G 15G 00 [------]
#12 0 15G 15G 00 [------]
#13 0 15G 15G 00 [------]
#14 0 15G 15G 00 [------]
#15 0 15G 15G 00 [------]
#16 0 15G 15G 00 [------]
#17 0 15G 15G 00 [------]
#18 0 15G 15G 00 [------]
#19 0 15G 15G 00 [------]
#20 0 15G 15G 00 [------]
#21 0 15G 15G 00 [------]
#22 0 15G 15G 00 [------]
#23 0 15G 15G 00 [------]
#24 4K 15G 15G 00 [------]
#25 0 15G 15G 00 [------]
#26 0 15G 15G 00 [------]
#27 0 15G 15G 00 [------]
#28 0 15G 15G 00 [------]
#29 4K 15G 15G 00 [------]
#30 0 15G 15G 00 [------]
#31 0 15G 15G 00 [------]
#32 0 15G 15G 00 [------]
#33 0 15G 15G 00 [------]
#34 0 15G 15G 00 [------]
#35 4K 15G 15G 00 [------]
#36 4K 15G 15G 00 [------]
#37 0 15G 15G 00 [------]
#38 540K 15G 15G 00 [------]
#39 0 15G 15G 00 [------]
#40 0 15G 15G 00 [------]
#41 0 15G 15G 00 [------]
#42 0 15G 15G 00 [------]
#43 0 10G 10G 00 [------]
#44 0 10G 10G 00 [------]
#45 0 10G 10G 00 [------]
#46 0 10G 10G 00 [------]
#47 0 10G 10G 00 [------]
#48 0 10G 10G 00 [------]
#49 0 10G 10G 00 [------]
#50 0 10G 10G 00 [------]
#51 0 10G 10G 00 [------]
#52 0 10G 10G 00 [------]
#53 0 10G 10G 00 [------]
#54 0 10G 10G 00 [------]
#55 0 10G 10G 00 [------]
#56 0 10G 10G 00 [------]
#57 0 10G 10G 00 [------]
#58 0 10G 10G 00 [------]
#59 0 10G 10G 00 [------]
#60 0 10G 10G 00 [------]
#61 0 10G 10G 00 [------]
#62 0 10G 10G 00 [------]
#63 0 10G 10G 00 [------]
I set the size to 10G at first and 15G later.
[root@daocloud ~]# xfs_quota -xc "report -a -p -h" | wc -l
1956507
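For reference, stale project quota limits like the ones above can be inspected and cleared per project id with xfs_quota; a sketch (the mount point and the example project id 63 are taken from the report above, and both commands require root on an XFS filesystem):

```shell
# Count how many project quota report lines exist on the containerd filesystem
xfs_quota -x -c 'report -p' /var/lib/containerd | wc -l

# Clear the limits for a single project id (example: project 63);
# setting both limits to 0 removes the enforcement for that project
xfs_quota -x -c 'limit -p bsoft=0 bhard=0 63' /var/lib/containerd
```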
snapshots/overlay/plugin/plugin.go
@@ -57,6 +62,15 @@ func init() {
	oOpts = append(oOpts, overlay.WithUpperdirLabel)
}

ic.Meta.Exports["root"] = root
dup with line 74.
done
snapshots/overlay/overlay.go
fsMagic, err := overlayutils.GetFSMagic(root)
if err != nil {
	return nil, err
}
if config.quotaSize > 0 {
Suggested change (compute fsMagic only when a quota is configured):

if config.quotaSize > 0 {
	fsMagic, err := overlayutils.GetFSMagic(root)
	if err != nil {
		return nil, err
	}
snapshots/overlay/overlay.go
} else {
	size = config.quotaSize
}
return quotaCtl.SetAllQuota(size, targets...)
Setting the quota may need some logging, probably at debug level.
Done
Yes, it is because the quota was set on committed layers too, but that is not expected.
I got some new errors with the latest code, and the quota is not set as expected.
Force-pushed from f72d83c to ed16c4b
I tried the command
and used bpftrace to debug, and the syscall
finally, used the below to translate
Force-pushed from 6e0ed18 to 333e1a4
Currently, only the quota for the overlayfs snapshotter has been implemented, along with unit test cases. I believe the code in CRI should be discussed and implemented in a separate patch, so I could not add more integration tests. Could you please take another look? @dmcgowan @AkihiroSuda
Force-pushed from 368240f to 8ac1836
Any update?
Could you also update https://github.com/containerd/containerd/tree/main/docs/snapshotters to explain the usage?
At present, there is no direct way to use quota on overlayfs; this change only adds quota support for overlayfs. As mentioned at #5859 (comment), perhaps in the future configuring quotas through config.toml, cri-api, or annotations will be added, and that needs more discussion. Therefore, I believe it is more appropriate to update the document once such configuration is supported.
Signed-off-by: Yang Yang <[email protected]>
see comments
@@ -379,6 +382,24 @@ func WithLabels(labels map[string]string) Opt {
	}
}

// WithLabels appends labels to a created snapshot
Suggested change:
// WithQuotaSize appends a label to this snapshot identifying its maximum quota size in bytes for the snapshot writable layer
@@ -379,6 +382,24 @@ func WithLabels(labels map[string]string) Opt {
	}
}

// WithLabels appends labels to a created snapshot
func WithQuotaSize(bytesize uint64) Opt {
	return func(info *Info) error {
call WithLabels
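The suggestion above could look roughly like this; Info, Opt, and WithLabels are simplified stand-ins for the real snapshots package types, and the label key is illustrative:

```go
package main

import (
	"fmt"
	"strconv"
)

// Simplified stand-ins for the snapshots package types.
type Info struct{ Labels map[string]string }
type Opt func(*Info) error

// Illustrative label key; the PR defines the real one in labels.go.
const LabelSnapshotQuotaSize = "containerd.io/snapshot/quota.size"

// WithLabels appends labels to a created snapshot (mirrors the real helper).
func WithLabels(labels map[string]string) Opt {
	return func(info *Info) error {
		if info.Labels == nil {
			info.Labels = map[string]string{}
		}
		for k, v := range labels {
			info.Labels[k] = v
		}
		return nil
	}
}

// WithQuotaSize sets the maximum quota size in bytes for the snapshot
// writable layer, delegating to WithLabels as the reviewer suggested.
func WithQuotaSize(bytesize uint64) Opt {
	return WithLabels(map[string]string{
		LabelSnapshotQuotaSize: strconv.FormatUint(bytesize, 10),
	})
}

func main() {
	var info Info
	_ = WithQuotaSize(15 << 30)(&info)
	fmt.Println(info.Labels[LabelSnapshotQuotaSize]) // 16106127360
}
```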
@@ -71,6 +73,12 @@ func WithUpperdirLabel(config *SnapshotterConfig) error {
	return nil
}

// WithEnableQuota define the enable quota
Suggested change:
// WithEnableQuota set enable quota to true/on
	}
}

func QuotaSize(labels map[string]string) string {
Suggested change:
// QuotaSize returns a string comprising the maximum quota size in bytes as an unsigned integer or nil if unset
func QuotaSize(labels map[string]string) string {
I don't think we should add more helpers here. This helper doesn't do much either.
@@ -48,6 +48,9 @@ type Config struct {

	// MountOptions are options used for the overlay mount (not used on bind mounts)
	MountOptions []string `toml:"mount_options"`

	// EnableQuota is define the quota on the snapshot writable layer
Suggested change:
// EnableQuota turns on the quota max size in bytes feature for the snapshot writable layer
case fs.MagicXfs:
	quotaCtl, err := quota.NewControl(root)
	if err != nil {
		log.L.WithError(err).Warnf("could not initinal quota control")
Suggested change:
log.L.WithError(err).Warnf("could not initialize quota control")
	log.L.WithError(err).Warnf("could not initinal quota control")
	return nil, fmt.Errorf("directory '%s' does not support set quota, make sure mount with 'pquota'", root)
}
return func(targets []string, size uint64) error {
Noting we switched to a string, but are still using quota size here as a uint; thus there is no way to distinguish setting "0" on size vs "" meaning unset (return nil).
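One way to resolve the ambiguity flagged here is to parse the label string explicitly, treating the empty string as "unset" and "0" as an explicit zero; a sketch under that assumption, not the PR's actual code:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseQuotaLabel distinguishes "no quota requested" (empty string)
// from an explicit value, including an explicit "0".
func parseQuotaLabel(v string) (size uint64, set bool, err error) {
	if v == "" {
		return 0, false, nil // unset: caller should skip quota entirely
	}
	size, err = strconv.ParseUint(v, 10, 64)
	if err != nil {
		return 0, false, fmt.Errorf("invalid quota size %q: %w", v, err)
	}
	return size, true, nil
}

func main() {
	for _, v := range []string{"", "0", "16106127360"} {
		size, set, err := parseQuotaLabel(v)
		fmt.Println(size, set, err)
	}
}
```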
I understand this is only the first step, but it is defining an interface for how it will be used by other snapshotters and backing filesystems in the future. I am fine if we want to start by just defining the label, but I think we should have a more generalized solution or plan before we put it into the default snapshotter.

Is there a plan for how this would interact with CRI/k8s? Setting a default max in the CRI config seems more dangerous than helpful, as these sorts of limits should be defined by k8s. This should also be able to free up the costly Usage calculation when it is used; otherwise it is only a minor enhancement.

I would suggest not considering this for the 2.1 release, since it would only be experimental, and the limited backing store support and possible client use don't warrant changes to the default snapshotter. I do envision that in 2.2 and beyond we can start thinking about more hybrid snapshotters, where a snapshotter may be using overlay as well as block devices (similar to what erofs is doing). That could provide a more generalized solution for quota, but possibly not as part of the default overlay snapshotter.
@@ -38,6 +39,8 @@ const (
	LabelSnapshotUIDMapping = "containerd.io/snapshot/uidmapping"
	// LabelSnapshotGIDMapping is the label used for GID mappings
	LabelSnapshotGIDMapping = "containerd.io/snapshot/gidmapping"
	// LabelSnapshotQuotaSize is the quota size.
If we define the label here, we should clearly define the expected value
	}
}

func QuotaSize(labels map[string]string) string {
I don't think we should add more helpers here. This helper doesn't do much either.
// SupportQuota checks if overlay filesystem supports quota or not.
//
// This function returns quotaSetter or error if it fails to check the filesystem.
func SupportQuota(root string) (QuotaSetter, error) {
This isn't overlay specific
//
// Get project id of parent dir as minimal id to be used by driver
//
minProjectID, err := getProjectID(basePath)
When containerd restarts, it cannot obtain the existing paths that have a project id set on the system. In this case, the map in q.quotas[] is empty, and a newly started container may use a prjid already in use by another container.
(dlv) p q
("*github.com/containerd/containerd/snapshots/quota.Control")(0xc000456800)
github.com/containerd/containerd/snapshots/quota.Control {
RWMutex: sync.RWMutex {
w: (sync.Mutex)(0xc000456800),
writerSem: 0,
readerSem: 0,
readerCount: ("sync/atomic.Int32")(0xc000456810),
readerWait: ("sync/atomic.Int32")(0xc000456814),},
backingFsBlockDev: "/home/containerd_rt/root/io.containerd.snapshotter.v1.overlayfs/backingFsBlockDev",
nextProjectID: 2,
quotas: map[string]uint32 [],}
(dlv)
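One way to avoid the prjid reuse shown in the dlv dump is to rescan the existing snapshot directories at startup and seed both the quotas map and nextProjectID from what is already on disk; a sketch, where getProjectID stands in for the FS_IOC_FSGETXATTR-based lookup in projectquota.go and the reserved-ID convention is assumed:

```go
package main

import "fmt"

// restoreProjectIDs seeds the in-memory quota state from project ids
// already assigned on disk, so a restarted daemon never hands out an id
// that another container is still using. getProjectID is injected here
// so the sketch stays testable; the real lookup is an ioctl.
func restoreProjectIDs(dirs []string, getProjectID func(string) (uint32, error)) (map[string]uint32, uint32, error) {
	quotas := make(map[string]uint32)
	var next uint32 = 2 // ids 0 and 1 treated as reserved in this sketch
	for _, d := range dirs {
		id, err := getProjectID(d)
		if err != nil {
			return nil, 0, err
		}
		if id == 0 {
			continue // no project quota assigned to this directory
		}
		quotas[d] = id
		if id >= next {
			next = id + 1 // keep allocating above the highest seen id
		}
	}
	return quotas, next, nil
}

func main() {
	onDisk := map[string]uint32{"snap/5/fs": 5, "snap/9/fs": 9}
	lookup := func(p string) (uint32, error) { return onDisk[p], nil }
	quotas, next, _ := restoreProjectIDs([]string{"snap/5/fs", "snap/9/fs", "snap/new"}, lookup)
	fmt.Println(len(quotas), next) // 2 10
}
```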
Compared to ephemeral storage, adding storage configuration to the CRI API is completely unnecessary, and there are many filesystem restrictions, making it seem useless.
Are we on the same page? "Goals - Universal snapshot support for quota: To implement a unified approach for managing quotas across all snapshot types. ..." along with Crosby's original recommendation "this can be implemented with snapshot labels and then the snapshotters will have to be updated to support these things. However, we still need higher layers like CRI and kube to pass these labels" and this discussion thread: #759. It doesn't feel like we should only care about quotas for overlayfs, and it does seem like quota requirements need to flow over the CRI pod/container/metrics APIs; I'm not following the idea that quotas would be completely unnecessary and seemingly useless for CRI clients.
Sorry for that; it's possible that I didn't fully express my thoughts. Integrating with CRI is obviously necessary, but that would require submitting a KEP (Kubernetes Enhancement Proposal) and engaging in discussions, a process I anticipate will be quite time-consuming. Moreover, ephemeral storage generally doesn't require a snapshot type, making it more convenient. Only after the KEP has been merged would it be necessary to modify the current patch; it's likely that widespread agreement will only be achieved at that point.
It would be great if this feature could also be introduced to docker with the containerd snapshotter enabled. Currently with
support set quota on overlay snapshot

As discussed in issue #3329.
Fixes #3329

Enable quota set as follows:
config.toml like this
check the containerd root mount info
check that the container has had its quota set
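The verification steps listed above could be sketched roughly as follows; the paths, mount option name, and PROJECT_ID value are illustrative assumptions for an XFS-backed containerd root mounted with project quota support:

```shell
# 1. Check that the containerd root is on a filesystem mounted with project quota
findmnt -no OPTIONS /var/lib/containerd | grep -q prjquota && echo "project quota enabled"

# 2. Check that a container's read-write snapshot got a quota applied
#    (PROJECT_ID is a hypothetical project id taken from the quota report)
PROJECT_ID=5
xfs_quota -x -c 'report -p' /var/lib/containerd | awk -v id="#$PROJECT_ID" '$1 == id'
```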
Signed-off-by: Yang Yang [email protected]