-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Add configuration to limit the number of rows deleted from vm_stats #8740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add configuration to limit the number of rows deleted from vm_stats #8740
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 4.19 #8740 +/- ##
============================================
- Coverage 14.95% 14.95% -0.01%
+ Complexity 11017 11014 -3
============================================
Files 5378 5378
Lines 469935 469951 +16
Branches 59493 58858 -635
============================================
- Hits 70301 70294 -7
- Misses 391847 391871 +24
+ Partials 7787 7786 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
FelipeM525
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CLGTM!
BryanMLima
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CLGTM
|
@blueorangutan package |
|
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9623 |
|
@blueorangutan package |
|
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9730 |
|
@sureshanaparti @DaanHoogland could we run the CI here? |
|
@blueorangutan test rocky8 kvm-rocky8 |
|
@weizhouapache a [SL] Trillian-Jenkins test job (rocky8 mgmt + kvm-rocky8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-10312)
|
|
@sureshanaparti I think we are ready to merge this one |
weizhouapache
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoaoJandre
It seems Vishesh had some concerns, not sure if they are addressed
@weizhouapache I've answered @vishesh92's concerns a few months ago, as he did not respond anymore, I thought that he was ok with my reasoning. |
My only concern is increasing number of global settings. I would personally prefer a generic global setting which can be reused for multiple tables instead of a table specific global setting. I don't see any benefit of having multiple global settings per table for deletion. |
|
@vishesh92 , is this something that would block this PR, or can it be added as a future improvement. cc @JoaoJandre . |
|
@vishesh92 @DaanHoogland I agree that if we are to have a configuration for each table it would be too much. However, the configuration that @vishesh92 proposed to use ( As this is a broader discussion, I think that we should move this discussion somewhere else to agree on which configurations should exist, then when/if something is decided, we can create a PR to normalize all the required tables. I feel like this is out of scope for this PR, as we would have to change a lot of code that is not related to the |
@JoaoJandre The PR looks good. Can we rename the global setting to something more generic like Also the Build Github Action is failing due to test failures. Please check. |
As I said on my previous message, this PR is focusing only on the Also, the tests should be fixed now :) |
|
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch. |
|
@blueorangutan package |
|
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el7 ✖️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 10000 |
|
@blueorangutan package |
|
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10009 |
|
@blueorangutan test |
|
@DaanHoogland a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
@vishesh92 @JoaoJandre can you guys agree? Personally I have a slight preference to start out with as generic settings as we can and speciallise as we go. But I cannot be bothered much. |
|
[SF] Trillian test result (tid-10506)
|
Creating a generic config would significantly broaden the scope of this PR. I could just rename the config that I created; however, as I won't touch other tables in this PR, it would be misleading to have a configuration with a generic name that only affects a single table. Thus, I would rather we keep this name, as it is more intuitive and descriptive. A future PR that wants to do something similar for other tables could change this configuration's name/description to be more generic and use it as well if the author wills it. |
|
Let's merge this PR for now. And create a PR later to make the configuration setting generic. |
…pache#8740) Co-authored-by: João Jandre <[email protected]>
Description
ACS has a job that periodically removes records from the
vm_statstable that are older than the retention time (settingvm.stats.max.retention.time). In environments that have a large amount of statistics being collected and have a long retention time, if the retention time is shortened, table cleaning may fail to finish due to the huge amount of records being deleted leading to a query timeout.The
vm.stats.remove.batch.sizeconfiguration has been added to define the batch size of statistics removal queries. This way, instead of trying to remove all records in a single query, ACS will remove records in batches, with the batch size being defined with thevm.stats.remove.batch.size setting. The default value is -1, where no limit will be applied; thus, the current behavior is maintained.Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
How Has This Been Tested?
In a local lab, I artificially generated 60 million rows on
vm_statsand set the retention time to one minute.Before the change, the delete job would timeout and no rows would be removed.
After applying the change and setting
vm.stats.max.retention.timeto 100000, the table was successfully cleaned.