fix: coordination rolling update not working for LWS workloads#151
Conversation
Summary of ChangesHello @LikiosSedo, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue where coordination rolling updates failed for LeaderWorkerSet (LWS) workloads when only the coordination-level rolling update strategy was defined. The changes ensure that the system correctly processes both role-level and coordination-level rolling update configurations, preventing missed updates and ensuring proper partition management during rollouts. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request addresses a bug where coordination-level rolling update strategies were not being applied to LWS workloads. The changes correctly expand the condition for applying rolling update configurations and add a check for RolloutStrategy changes to trigger reconciliation. My review identified a minor omission where MaxSurge from the coordination strategy is not handled, leading to incomplete override logic. I've included a suggestion to fix this for consistency.
There was a problem hiding this comment.
Pull request overview
This PR fixes coordination-based rolling updates for LeaderWorkerSet (LWS) workloads so that coordination-level rolling update settings are correctly applied and detected, addressing issue #150 where partition values were never propagated to LWS workloads.
Changes:
- Adjusts the LWS rolling update configuration to honor either role-level
RolloutStrategy.RollingUpdateor coordination-levelrollingUpdateStrategy, including proper precedence forMaxUnavailableandPartition. - Updates
lwsSpecEqualto includeRolloutStrategyin the semantic equality check so that changes to rollout/partition settings trigger a patch instead of being treated as no-op.
Comments suppressed due to low confidence (1)
pkg/reconciler/lws_reconciler.go:329
- The new handling of rolling update configuration and RolloutStrategy comparison for LWS (including coordination-level
rollingUpdateStrategyandlwsSpecEqualcomparingRolloutStrategy) is not covered by unit tests inlws_reconciler_test.go, even though similar rollout/coordination behavior is tested for other reconcilers (e.g.,sts_reconciler_test.goandinstanceset_reconciler_test.go). To keep behavior stable and prevent regressions in the coordination rolling update logic, please add focused tests for: (1) applying coordination-only rolling update (no role-levelRolloutStrategy) to LWS, including partition/maxUnavailable selection precedence between coordination and role-level config, and (2)semanticallyEqualLeaderWorkerSet/lwsSpecEqualdetecting changes inRolloutStrategy(e.g., partition changes) so that patches are actually applied.
// RollingUpdate
hasRoleRollingUpdate := role.RolloutStrategy != nil && role.RolloutStrategy.RollingUpdate != nil
hasCoordinationRollingUpdate := rollingUpdateStrategy != nil
if hasRoleRollingUpdate || hasCoordinationRollingUpdate {
rollingUpdateConfiguration := lwsapplyv1.RollingUpdateConfiguration()
if hasRoleRollingUpdate {
if role.RolloutStrategy.RollingUpdate.MaxSurge != nil {
rollingUpdateConfiguration = rollingUpdateConfiguration.WithMaxSurge(*role.RolloutStrategy.RollingUpdate.MaxSurge)
}
if role.RolloutStrategy.RollingUpdate.MaxUnavailable != nil {
rollingUpdateConfiguration = rollingUpdateConfiguration.WithMaxUnavailable(*role.RolloutStrategy.RollingUpdate.MaxUnavailable)
}
}
if rollingUpdateStrategy != nil && rollingUpdateStrategy.MaxUnavailable != nil {
rollingUpdateConfiguration =
rollingUpdateConfiguration.WithMaxUnavailable(*rollingUpdateStrategy.MaxUnavailable)
}
var partitionIntOrStr *intstr.IntOrString
if rollingUpdateStrategy != nil && rollingUpdateStrategy.Partition != nil {
partitionIntOrStr = rollingUpdateStrategy.Partition
} else if hasRoleRollingUpdate && role.RolloutStrategy.RollingUpdate.Partition != nil {
partitionIntOrStr = role.RolloutStrategy.RollingUpdate.Partition
}
if partitionIntOrStr != nil {
partition, err := intstr.GetScaledValueFromIntOrPercent(partitionIntOrStr, int(*role.Replicas), true)
if err != nil {
return nil, err
}
rollingUpdateConfiguration = rollingUpdateConfiguration.WithPartition(int32(partition))
}
lwsSpecConfig = lwsSpecConfig.WithRolloutStrategy(
lwsapplyv1.RolloutStrategy().WithRollingUpdateConfiguration(rollingUpdateConfiguration),
)
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@LikiosSedo Please check the ut and e2e test issue. |
58e07da to
f253170
Compare
Done |
…roject#151) * fix: coordination rolling update not working for LWS workloads * test: add unit tests for LWS coordination rolling update --------- Co-authored-by: 刘森栋 <[email protected]>
Ⅰ. Motivation
Coordination rolling update does not work for LeaderWorkerSet workloads when only
coordinationRequirements.strategy.rollingUpdateis configured (without role-levelrolloutStrategy).Ⅱ. Modifications
Two bugs in
pkg/reconciler/lws_reconciler.go:role.RolloutStrategyandrollingUpdateStrategyparameterRolloutStrategycomparison to detect partitionchanges
Ⅲ. Does this pull request fix one issue?
fixes #150
Ⅳ. List the added test cases
Manual and Local e2e testing verified:
Ⅴ. Describe how to verify it
coordination.strategy.rollingUpdate.maxSkewspec.rolloutStrategy.rollingUpdateConfiguration.partitionchanges from non-zero to 0