Skip to content

feat(kep): add new kep to refine configmap content and configuration#134

Merged
cheyang merged 10 commits intosgl-project:mainfrom
JasonHe-WQ:config
Mar 2, 2026
Merged

feat(kep): add new kep to refine configmap content and configuration#134
cheyang merged 10 commits intosgl-project:mainfrom
JasonHe-WQ:config

Conversation

@JasonHe-WQ
Copy link
Copy Markdown
Contributor

@JasonHe-WQ JasonHe-WQ commented Dec 29, 2025

Ⅰ. Motivation

Currently, the RBG service discovery configuration (/etc/rbg/config.yaml) contains redundant fields that increase
file size and maintenance overhead.

Current Configuration Issues

  • Redundant size fields: group.size and roles.*.size can be calculated from instances
  • Repetitive address patterns: Full FQDNs share common templates

Ⅱ. Modifications

Combine configmaps of each role to a single configmap and reduce redundants content.

Ⅲ. Does this pull request fix one issue?

issue #133

Ⅳ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

Ⅴ. Describe how to verify it

VI. Special notes for reviews

Checklist

  • Format your code make fmt.
  • Add unit tests or integration tests.
  • Update the documentation related to the change.

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @JasonHe-WQ, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new Kubernetes Enhancement Proposal (KEP) focused on optimizing the configuration format for RoleBasedGroup (RBG) service discovery. The core objective is to reduce the size and complexity of the /etc/rbg/config.yaml file by removing redundant fields and consolidating configuration for different roles. This refinement aims to improve maintainability and scalability by simplifying how instance addresses are defined and managed, while also addressing potential compatibility challenges with existing systems.

Highlights

  • Configuration Refinement: Introduces a new Kubernetes Enhancement Proposal (KEP) to refine the configuration format for RoleBasedGroup (RBG) service discovery, specifically targeting the /etc/rbg/config.yaml file.
  • Redundancy Reduction: Eliminates redundant fields such as group.size and roles.*.size, which can be derived from instance information, and removes repetitive FQDN patterns by using a template-based address generation.
  • ConfigMap Consolidation: Proposes combining multiple role-specific configmaps into a single configmap template, reducing duplication and maintenance overhead.
  • Non-Continuous Instance Support: Adds support for excludes as an optional field within role configurations to handle non-continuous instance indices, providing more flexibility.
  • Improved Maintainability and Scalability: The new format is expected to significantly reduce configuration file size, especially for large RBGs, leading to improved loading and parsing performance and easier maintenance.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new Kubernetes Enhancement Proposal (KEP-130) to refine the configuration format for RoleBasedGroup (RBG). The goal is to reduce redundancy and improve maintainability by adopting a more concise, template-based configuration.

My review focuses on the clarity and completeness of the KEP document. While the proposal is a good step towards simplifying configuration, the KEP itself has several areas that need improvement. I've pointed out some confusing statements, empty sections, and a critical need to detail the upgrade/downgrade strategy for this breaking change. The kep.yaml metadata file also needs to be filled out with the correct information. Addressing these points will make the KEP clearer and more robust.

- [Integration tests](#integration-tests)
- [e2e tests](#e2e-tests)
- [Graduation Criteria](#graduation-criteria)
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The Upgrade / Downgrade Strategy section is currently empty. As this KEP introduces a breaking change to the configuration format, this section is critical for users to understand how to manage the transition. Please provide a detailed plan for both upgrading to and downgrading from this new configuration format.

### Risks and Mitigations

With the new configuration format, there is a risk of no compatible format with existing systems.
To mitigate this, we will provide clear examples to help users transition to the new format.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Providing clear examples is a good first step, but it's not a sufficient mitigation strategy for a breaking change. A more robust plan should be detailed in the Upgrade / Downgrade Strategy section. Please consider and document strategies such as:

  • Supporting both old and new formats for a transition period.
  • Providing an automated conversion tool to help users migrate.
  • Clear versioning and feature-gating for the new format.

Comment on lines +88 to +89
Also, currently each role of one RBG has a configmap while the content is the same. This redundancy can be reduced by
using a single configmap template with parameters for size and excludes.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This statement seems to contradict the PR description, which mentions combining configmaps for each role into a single one. The KEP itself focuses on changing the content of the configmap, not combining multiple configmaps. Could you clarify if the plan is to have one configmap per RBG instead of one per role? If so, the KEP should be updated to reflect this significant design change, as it's not currently described.

### Goals


1. Remove Template-based address generation:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The goal Remove Template-based address generation: appears to be a typo. The proposal actually introduces and relies on template-based address generation to simplify the configuration. Please correct this to Introduce Template-based address generation or similar to avoid confusion.

making it easier to maintain and read.


#### Story 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The Story 2 section is empty. Please either provide the user story or remove this section if it is not applicable.


### Risks and Mitigations

With the new configuration format, there is a risk of no compatible format with existing systems.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The phrasing risk of no compatible format is a bit awkward. Consider rephrasing for clarity, for example: risk of incompatibility with existing systems or This change is not backward compatible.

Comment thread keps/130-configuration-refine/kep.yaml Outdated
Comment on lines +1 to +51
title: KEP Template
kep-number: NNNN
authors:
- "@jane.doe"
owning-sig: sig-xyz
participating-sigs:
- sig-aaa
- sig-bbb
status: provisional|implementable|implemented|deferred|rejected|withdrawn|replaced
creation-date: yyyy-mm-dd
reviewers:
- TBD
- "@alice.doe"
approvers:
- TBD
- "@oscar.doe"

see-also:
- "/keps/sig-aaa/1234-we-heard-you-like-keps"
- "/keps/sig-bbb/2345-everyone-gets-a-kep"
replaces:
- "/keps/sig-ccc/3456-replaced-kep"

# The target maturity stage in the current dev cycle for this KEP.
# If the purpose of this KEP is to deprecate a user-visible feature
# and a Deprecated feature gates are added, they should be deprecated|disabled|removed.
stage: alpha|beta|stable

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.19"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
alpha: "v1.19"
beta: "v1.20"
stable: "v1.22"

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: MyFeature
components:
- kube-apiserver
- kube-controller-manager
disable-supported: true

# The following PRR answers are required at beta release
metrics:
- my_feature_metric No newline at end of file
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This file appears to be an unedited template. Please fill in all the placeholder fields with the correct information for KEP-130. For example:

  • title: Configuration Refine for RoleBasedGroup (RBG)
  • kep-number: 130
  • authors: Your GitHub handle
  • status: provisional (for a draft KEP)
  • creation-date: The date you created the KEP.

Updating these fields is necessary for the KEP to be properly tracked.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces KEP-130, which proposes a refined configuration format for RBG (RoleBasedGroup) service discovery to reduce redundancy and improve maintainability. The new format eliminates derivable fields like size and uses template-based address generation to minimize configuration file size, particularly beneficial for large-scale deployments.

Key Changes:

  • New configuration format that removes redundant group.size and roles.*.size fields (calculable from instances)
  • Template-based FQDN generation using pattern {role}-{i}.s-{group}-{role}.{namespace}.svc.cluster.local
  • Introduction of optional excludes field for non-continuous instance indices

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 21 comments.

File Description
keps/130-configuration-refine/kep.yaml KEP metadata file defining the enhancement proposal details (currently contains template placeholders)
keps/130-configuration-refine/README.md Comprehensive KEP document describing the motivation, design, and implementation details for the configuration refinement

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread keps/130-configmap-refine/README.md Outdated
Comment on lines +222 to +231
roles:
router:
size: 1
encoder:
size: 2
prefill:
size: 2
decode:
size: 2
excludes: [1] # Optional
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent indentation in the YAML example. The roles should be indented consistently. Lines 223-230 show inconsistent spacing before role names.

Suggested change
roles:
router:
size: 1
encoder:
size: 2
prefill:
size: 2
decode:
size: 2
excludes: [1] # Optional
roles:
router:
size: 1
encoder:
size: 2
prefill:
size: 2
decode:
size: 2
excludes: [1] # Optional

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
Comment on lines +1 to +2
title: KEP Template
kep-number: NNNN
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The title should be updated to match the actual KEP topic. Currently it shows "KEP Template" but should be something like "Configuration Refine for RoleBasedGroup (RBG)" to match the README.md title.

Suggested change
title: KEP Template
kep-number: NNNN
title: Configuration Refine for RoleBasedGroup (RBG)
kep-number: 130

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
Comment on lines +14 to +17
approvers:
- TBD
- "@oscar.doe"

Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approvers field contains placeholder values that should be replaced with actual approver GitHub handles or removed if approvers have not yet been assigned.

Suggested change
approvers:
- TBD
- "@oscar.doe"
approvers: []

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
Comment on lines +21 to +22
replaces:
- "/keps/sig-ccc/3456-replaced-kep"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replaces field contains a placeholder KEP reference. This should either be replaced with the actual KEP being replaced or removed if this KEP doesn't replace an existing one.

Suggested change
replaces:
- "/keps/sig-ccc/3456-replaced-kep"
replaces: []

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
# The target maturity stage in the current dev cycle for this KEP.
# If the purpose of this KEP is to deprecate a user-visible feature
# and a Deprecated feature gates are added, they should be deprecated|disabled|removed.
stage: alpha|beta|stable
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stage field should be set to a single valid value instead of listing all possible options. Based on the PR being in draft state, this should likely be "alpha".

Suggested change
stage: alpha|beta|stable
stage: alpha

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
Comment on lines +12 to +15
- TBD
- "@alice.doe"
approvers:
- TBD
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reviewers field contains placeholder values that should be replaced with actual reviewer GitHub handles or removed if reviewers have not yet been assigned.

Suggested change
- TBD
- "@alice.doe"
approvers:
- TBD
- "@alice.doe"
approvers:

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configmap-refine/README.md Outdated
Comment on lines +227 to +231
prefill:
size: 2
decode:
size: 2
excludes: [1] # Optional
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The excludes field is shown in line 231 under the decode role, but the comment on line 218 states that encoder has excludes (excluding index 1). This inconsistency between the description and the example is confusing and should be corrected.

Suggested change
prefill:
size: 2
decode:
size: 2
excludes: [1] # Optional
excludes: [1] # Optional
prefill:
size: 2
decode:
size: 2

Copilot uses AI. Check for mistakes.
Comment on lines +273 to +278
- `<package>`: `<date>` - `<test coverage>`

##### Integration tests

<!--
Integration tests are contained in https://git.k8s.io/kubernetes/test/integration.
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit tests section contains only a placeholder value that should be replaced with actual package names, dates, and test coverage data, or the placeholder should be removed if this information is not yet available.

Suggested change
- `<package>`: `<date>` - `<test coverage>`
##### Integration tests
<!--
Integration tests are contained in https://git.k8s.io/kubernetes/test/integration.
##### Integration tests
<!--
Integration tests are contained in https://git.k8s.io/kubernetes/test/integration.
<!--
Integration tests are contained in https://git.k8s.io/kubernetes/test/integration.

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
kep-number: NNNN
authors:
- "@jane.doe"
owning-sig: sig-xyz
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The owning-sig field should be updated from the placeholder "sig-xyz" to the actual SIG (Special Interest Group) that owns this KEP.

Suggested change
owning-sig: sig-xyz
owning-sig: sig-aaa

Copilot uses AI. Check for mistakes.
Comment thread keps/130-configuration-refine/kep.yaml Outdated
Comment on lines +6 to +8
participating-sigs:
- sig-aaa
- sig-bbb
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The participating-sigs field contains placeholder values "sig-aaa" and "sig-bbb" that should either be replaced with actual participating SIGs or removed if there are no participating SIGs.

Suggested change
participating-sigs:
- sig-aaa
- sig-bbb
# participating-sigs:
# - sig-<name>

Copilot uses AI. Check for mistakes.
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@JasonHe-WQ JasonHe-WQ force-pushed the config branch 4 times, most recently from bbb2557 to d3d923c Compare February 12, 2026 03:26
@JasonHe-WQ JasonHe-WQ changed the title draft: feat(kep): add new kep to refine configmap content and configuration feat(kep): add new kep to refine configmap content and configuration Feb 28, 2026
Comment thread internal/controller/workloads/rolebasedgroup_controller.go
Comment thread internal/controller/workloads/event.go Outdated
Comment thread keps/130-configmap-refine/kep.yaml Outdated
Copy link
Copy Markdown
Collaborator

@cheyang cheyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@cheyang cheyang merged commit 3ddf8af into sgl-project:main Mar 2, 2026
8 of 10 checks passed
@JasonHe-WQ JasonHe-WQ deleted the config branch March 3, 2026 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants