Releases: sgl-project/rbg
v0.7.0-alpha.3
What's Changed
- Chore(rbg): add v1alpha2 related examples by @Syspretor in #231
- chore: use ARCH and remove TARGETARCH fallback by @NoobDream2568 in #256
- Security issue: unsafe json string in pulling job's annotations; unsa… by @diw-zw in #251
- chore(community): add examples for dynamo by @Syspretor in #255
- build: support patio arm64 image by @TrafalgarZZZ in #258
- Deal with codeQL alert: Uncontrolled data used in path expression by @diw-zw in #257
- feat(scaling-adapter): add Labels field for user-specified RBGSA labels by @sebest in #242
- Enhance service reconciler by @lx1036 in #252
- fix(rbgsa-controller): use Warning event type for failure events by @sebest in #262
- fix(readiness): re-evaluate pod Ready condition in removeNotReadyKey by @sebest in #246
- feat: support user-defined model configuration by @diw-zw in #259
- chore: disable unparam linter by @Syspretor in #266
- feat(rbg): support use templateRef in lwp by @Syspretor in #254
- feat(scaling-adapter): add readyReplicas to RBGSA with sole-writer pattern by @sebest in #243
- v1alpha2 conversion webhook by @diw-zw in #225
- chore: add examples for mooncake integration with v1alpha2 api by @Syspretor in #272
- rbgcli: adds multi-node LLM inference serving support by @diw-zw in #265
- build(deps): bump github.com/moby/spdystream from 0.5.0 to 0.5.1 by @dependabot[bot] in #278
- fix(rbgsa): api conversion failed caused by fields missing by @Syspretor in #283
- [CLI] Secure OSS secrets in cluster, fix engine port args, and add Qwen3.6 model by @diw-zw in #279
- refactor: remove workload field in v1alpha2 by @Syspretor in #281
- fix: deal with injectMetadataSave properly by @diw-zw in #286
- fix(cli): extractRBGStatus wrong by @diw-zw in #290
- chore: add cases for LWP env by @Syspretor in #287
- feat(cli): enhance llm svc run with two new override flags by @diw-zw in #289
- fix(rbgsa-controller): preserve and initialize readyReplicas by @sebest in #280
- fix(roleinstance): nil pointer dereference in in-place update by @sebest in #274
- Copilot/fix env vars order by @cheyang in #282
- chore(rbgs): update to 0.7.0-alpha.3 by @cheyang in #291
New Contributors
- @lx1036 made their first contribution in #252
- @dependabot[bot] made their first contribution in #278
Full Changelog: v0.7.0-alpha.2...v0.7.0-alpha.3
v0.7.0-alpha.2
What's Changed
- chore: add copyright 2026 by @Syspretor in #229
- Check copyright by @cheyang in #230
- fix(rbgs): missing re-generate client code for rbgs by @Syspretor in #236
- chore: add check for generated client-code by @Syspretor in #237
- refactor(coordinated-policy): add name field in policy by @Syspretor in #235
- chore: add check for yaml lint by @Syspretor in #238
- Controllerrevisions default by @JasonHe-WQ in #234
- update aiconfigurator dependency by @diw-zw in #228
- Update xxx-copyright-to-pr-go-files.sh - ignore vendor/ by @diw-zw in #239
- Cli implement by @diw-zw in #182
- fix: some typo and wrong kubebuilder comments by @Syspretor in #244
- fix(rbgsa-controller): use time.Second for RequeueAfter durations by @sebest in #240
- feat(scheduler): inherit podGroup annotations by @JasonHe-WQ in #233
- chore: update copyright header by @Syspretor in #245
- fix(rbg-controller): watch owned RoleBasedGroupScalingAdapter resources by @sebest in #241
- feat: pod port allocator (kep 171) by @NoobDream2568 in #210
- fix(port-allocator): add startup args to charts by @NoobDream2568 in #249
- feat(rbg): support enabling portAllocator on demand by @Syspretor in #250
- Build: update rbg helm chart 0.7.0-alpha.2 by @cheyang in #253
New Contributors
- @diw-zw made their first contribution in #228
- @sebest made their first contribution in #240
- @NoobDream2568 made their first contribution in #210
Full Changelog: v0.7.0-alpha.1...v0.7.0-alpha.2
v0.7.0-alpha.1
What's Changed
- fix(stateful): skip correct unhealth instances by @JasonHe-WQ in #169
- feat(kep): add new kep to refine configmap content and configuration by @JasonHe-WQ in #134
- refactor(rbg): add api v1alpha2 by @Syspretor in #167
- fix(e2e): Fixed sporadic end-to-end (e2e) failures caused by slow pod… by @Syspretor in #170
- refactor(helm): support crd upgrader by @Syspretor in #175
- refactor(rbgs): init rbgs apiVersion v1alpha2 by @Syspretor in #173
- fix(pod group): fix instance workload calculation bug by @JasonHe-WQ in #174
- Fix patio failing to correctly notify the router to delete routes when the prefill and decode replicas are manually scaled down in a PD-disaggregated deployment. by @jackyzzy in #177
- chore(coordinated-policy): init coordinated policy crd with apiversio… by @Syspretor in #183
- chore(rbg): update v1alpha2 constants by @Syspretor in #184
- fix(role-instance): restart policy does not take effect by @Syspretor in #185
- chore(role-instance): init apiversion v1alpha2 for roleInstance and r… by @Syspretor in #186
- chore(scalingadapter): init scalingAdapter apiversion v1alpha2 by @Syspretor in #189
- chore(engine-runtime): init engine runtime profile apiversion v1alpha2 by @Syspretor in #190
- chore(role-instance): implement roleInstance and roleInstanceSet as reconcile version by @Syspretor in #191
- chore(rbg): implement v1alpha2 as storage version by @Syspretor in #188
- fix(ut): fix base cases ut failure by @Syspretor in #192
- refactor(rbg): support customComponentsPatterns in v1alpha2 version by @Syspretor in #193
- feature(rbg): support role instance index env by @Syspretor in #194
- fix(instance): fix updated generation by @JasonHe-WQ in #197
- chore(role-instance): add roleinstance related rbac by @Syspretor in #199
- fix(rbg): fix inject env related ut by @Syspretor in #200
- fix(rbg): fix reconciler related ut by @Syspretor in #201
- chore(coordinated-policies): add rbac for coordinatedpolicies crd by @Syspretor in #202
- Fix(e2e): add crd check by @Syspretor in #198
- fix(rbg): solve the problem of patch race by @Syspretor in #203
- feature(rbg): support gang scheduling in v1alpha2 by @Syspretor in #195
- Feature(role-instanceset): support create pod in parallel by @Syspretor in #204
- chore(role-instanceset): change spec podManagement field type by @Syspretor in #207
- Refactor/unify constants by @Syspretor in #208
- chore(makefile): support to generate rbac in make manifests by @Syspretor in #209
- refactor(coordinated-policy): change strategies api field by @Syspretor in #205
- refactor(rbgs): change spec api field by @Syspretor in #206
- Ensure the inference service can correctly re-register through the patio heartbeat after the router is restarted by @jackyzzy in #187
- fix(env): fix missing env by @JasonHe-WQ in #214
- Change version of Makefile to 0.7.0 by @cheyang in #215
- fix(coordinated-policy): fix progression field enum validation by @Syspretor in #217
- fix(rbg): fix rbg delete failed in foreground deletion by @Syspretor in #219
- [PATCH] feat: add arm64 multi-arch docker publishing by @cheyang in #218
- fix: generated manifests missing latest CRD by @Syspretor in #222
- fix(role-instance): fix update role instance status conflict by @Syspretor in #220
- Build: update rbg helm chart 0.7.0-alpha.1 by @cheyang in #224
New Contributors
- @JasonHe-WQ made their first contribution in #169
Full Changelog: v0.6.0...v0.7.0-alpha.1
v0.6.0
🌟 Highlights
- Coordinated Scaling Support: Added support for coordinated scaling to enhance scaling capabilities under complex workloads (#142).
- Stateful InstanceSet: Introduced support for stateful InstanceSets, improving management for stateful applications (#159).
- New Benchmark CLI: Integrated benchmark capabilities into the CLI tool for easier performance testing (#156).
🚀 Features
- CLI Enhancements:
- RBG Core:
🐛 Bug Fixes
- Workload Management:
- Logging & Build:
📖 Documentation
- Integrations: Added an example for Mooncake integration (#125).
What's Changed
- example(rbg): add example for mooncake integration by @Syspretor in #125
- add cli: kubectl rbg llm generate by @bcfre in #128
- chore(rbg): add workflow for e2e_test by @Syspretor in #130
- chore(rbg): add workflow for nightly image build by @Syspretor in #131
- fix(rbg): use structured logging instead of event recorder in Get fai… by @jackyzzy in #137
- fix: use GetWorkloadName for instance address generation by @bcfre in #138
- Feat/kep 8 roletemplate by @LikiosSedo in #124
- feat(rbgsa): support scalingAdapter for instanceSet, add resource age by @Syspretor in #140
- chore(rbg): add unit-test check and incremental coverage check by @Syspretor in #149
- chore(rbg): add envtest by @Syspretor in #145
- fix: avoid duplicate --platform argument in docker-buildx by @onenewcode in #141
- feat(rbg): support coordinated scaling by @Syspretor in #142
- fix(github-action): unit-test coverage checks by @Syspretor in #153
- fix: coordination rolling update not working for LWS workloads by @LikiosSedo in #151
- refactor(rbg): simplify rbg reconcile - Phase 1 by @Syspretor in #155
- feat(cli): support benchmark cli by @bcfre in #156
- feat(cli): update benchmark cli by @bcfre in #157
- fix(cli): benchmark related products by @bcfre in #158
- feat(rbg): support stateful instanceSet by @Syspretor in #159
- Change tag from 0.5.0 to 0.6.0 by @cheyang in #160
- refactor: rename benchtool to benchmark-tool by @cheyang in #162
- Build: update rbg helm chart 0.6.0 by @cheyang in #161
New Contributors
- @jackyzzy made their first contribution in #137
- @onenewcode made their first contribution in #141
Full Changelog: v0.5.0...v0.6.0
v0.5.0
🌟 Highlights
- InstanceSet Workload: Introduced native support for InstanceSet, enabling fine-grained management for stateful workloads.
- Coordinated Rollout: Implemented synchronized updates across interdependent roles to maintain system stability during upgrades.
- In-Place Updates: Enabled efficient updates by using InstanceSet workloads without requiring pod recreation.
- Revision Management: Added full ControllerRevision support for robust version tracking and rollback capabilities.
🚀 Features
- InstanceSet Support
  - Introduced InstanceSet API and Controller.
  - Added support for using InstanceSet as a Role Workload in RoleBasedGroup (RBG).
- Controller Revision & Versioning
  - Added ControllerRevision support to RoleBasedGroup for better version control.
  - Updated rbgctl to support RBG revision operations.
  - Enabled updating roles using controllerrevision hash.
- Orchestration & Scheduling
  - Implemented rollout coordination for roles in RBG.
  - Supported parallel execution for roles sharing the same dependencies.
  - Added support for Pod Group Policy.
- Runtime & Integrations
  - Added Mooncake integration.
  - Added Engine Runtime support.
  - Supported sgl-router PD (Prefill-Decode Disaggregation) with engine runtime.
- Miscellaneous
  - Added support for role-level metadata.
  - Migrated Template, leaderWorkerSet, and RollingUpdate fields from value to pointer types.
🐛 Bug Fixes
- StatefulSet & Reconciler
  - Fixed sts_reconciler failure when retrieving historical StatefulSet revisions.
  - Updated StatefulSet service naming to meet Kubernetes requirements (compatible headless service name).
- Workload & Updates
  - Fixed an issue where updating a role did not trigger a rolling update.
  - Fixed leader pods with InstanceSet workload lacking environment variables.
  - Added max length checks for workloadName and serviceName.
- Cleanup & Resources
  - Fixed issue where corresponding PodGroups created by RBG were not deleted during gang scheduling.
  - Fixed incorrect apiVersion in auto-generated applyconfigurations.
  - Fixed InstanceSet RBAC and LWS environment build issues.
📖 Documentation
- Added missing model examples.
- Updated Gang Scheduling documentation.
- Updated sglang PD disaggregation example with sgl-router.
- Fixed link errors in README and added CI checks for markdown quality.
What's Changed
- fix build image error in Makefile by @gujingit in #41
- feat: Support parallel execution for roles with same dependencies by @tlipoca9 in #42
- Build: update rbg helm chart by @cheyang in #44
- feat: Supports using controllerrevision hash to update role by @bcfre in #34
- feat: add instanceset api by @veophi in #50
- doc: add missing model examples by @bcfre in #49
- feat: use code-generator to generate applyconfiguration code by @liubing0427 in #48
- Build docker image for supporting controller revision by @cheyang in #51
- [WIP]: Add in-place update api and core codes for InstanceSet by @veophi in #52
- fix: change stateful set service name to meet k8s requirements by @liubing0427 in #53
- feat: add engine runtime by @gujingit in #55
- bugfix: add max len check for workloadName & serviceName by @gujingit in #58
- fix: delete corresponding podgroup created by rbg when gang-schedulin… by @ShirleyDing in #57
- Update Helm chart 0.5.0-alpha.1 by @cheyang in #60
- fix: address review comments in pr-57 by @bcfre in #67
- doc: update gang scheduling by @bcfre in #62
- feat: The ControllerRevision not store the replicas for RBG Roles by @bcfre in #61
- CI:add release script by @cheyang in #71
- feat: rbgctl supports rbg revision operations by @bcfre in #54
- KEP-8: Reduce YAML Duplication via RoleTemplates by @LikiosSedo in #70
- Release 0.5.0-alpha.2 by @cheyang in #73
- feat: add instance controller by @yangsoon in #66
- fix: sts_reconciler failure to retrieve historical StatefulSet revisions by @bcfre in #77
- fix(sts-reconciler): use compatible headless service name for statefu… by @TrafalgarZZZ in #81
- [KEP-31]: Adding ControllerRevision support to the RoleBasedGroup by @bcfre in #27
- Release 0.5.0-alpha.3 by @cheyang in #78
- feat(engine-runtime): support sgl-router pd disaggregation with engine runtime by @TrafalgarZZZ in #82
- KEP-30: add role coordination kep by @gujingit in #59
- feat: add instanceset controller codes by @veophi in #83
- KEP-8: Refine API naming and preview/diff design based on community feedback by @LikiosSedo in #79
- [KEP-30]: Introduce InstanceSet Workload Support in RoleBasedGroup for Improved LLM Orchestration by @veophi in #26
- Fix(rbg): update role not trigger rolling update by @Syspretor in #89
- doc: Update sglang pd disaggregation example with sgl-router by @TrafalgarZZZ in #86
- KEP 74: Mooncake integration by @Syspretor in #75
- chore(codegen): Generate instanceSet related go-client codes by @Syspretor in #93
- chore(hack): add script to generate manifests.yaml by @Syspretor in #95
- feat(rbg): add rbac with resource instancesets/instances by @Syspretor in #97
- Build: update rbg helm chart 0.5.0-alpha.4 by @cheyang in #98
- fix(rbg): fix incorrect apiVersion in auto-generated applyconfigurations by @Syspretor in #101
- docs: fix link error in README; fix lint errors in markdown files; add a CI to automatically check markdown quality by @Phil-Fan in #100
- feat: impl rollout coordination for roles in rbg by @veophi in #91
- fix: add explicit permissions to the docs-check workflow by @Phil-Fan in #105
- rbgs support pod group policy by @nightmeng in #107
- refactor: migrate Template field from value to pointer type by @LikiosSedo in #102
- chore(rbg): migrate role.leaderWorkerSet field type from value to poi… by @Syspretor in #108
- feat(rbg): support role-level metadata by @Syspretor in #113
- feature(rbg): support to use InstanceSet as role workload by @Syspretor in #110
- chore(rbg): migrate RollingUpdate field type by @Syspretor in #114
- Update readme by @cheyang in #111
- fix(rbg): fix instanceset rbac and lws env build by @Syspretor in #116
- fix(rbg): fix leader pod with instanceset workload lack envs by @Syspretor in #122
- Build: update rbg helm chart 0.5.0 by @cheyang in #123
New Contributors
- @tlipoca9 made their first contribution in #42
- @veophi made their first contribution in #50
- @liubing0427 made their first contribution in #48
- @ShirleyDing made their first contribution in #57
- @LikiosSedo made their first contribution in #70
- @yangsoon made their first contribution in #66
- @TrafalgarZZZ made their first contribution in #81
- @Syspretor made their first contribution in #89
- @Phil-Fan made their first contribution in #100
- @nightmeng made their first contribution in #107
Full Changelog: v0.4.0...v0.5.0
v0.5.0-alpha.4
What's Changed
- feat(engine-runtime): support sgl-router pd disaggregation with engine runtime by @TrafalgarZZZ in #82
- KEP-30: add role coordination kep by @gujingit in #59
- feat: add instanceset controller codes by @veophi in #83
- KEP-8: Refine API naming and preview/diff design based on community feedback by @LikiosSedo in #79
- [KEP-30]: Introduce InstanceSet Workload Support in RoleBasedGroup for Improved LLM Orchestration by @veophi in #26
- Fix(rbg): update role not trigger rolling update by @Syspretor in #89
- doc: Update sglang pd disaggregation example with sgl-router by @TrafalgarZZZ in #86
- KEP 74: Mooncake integration by @Syspretor in #75
- chore(codegen): Generate instanceSet related go-client codes by @Syspretor in #93
- chore(hack): add script to generate manifests.yaml by @Syspretor in #95
- feat(rbg): add rbac with resource instancesets/instances by @Syspretor in #97
- Build: update rbg helm chart 0.5.0-alpha.4 by @cheyang in #98
New Contributors
- @Syspretor made their first contribution in #89
Full Changelog: v0.5.0-alpha.3...v0.5.0-alpha.4
v0.5.0-alpha.3
What's Changed
- feat: add instance controller by @yangsoon in #66
- fix: sts_reconciler failure to retrieve historical StatefulSet revisions by @bcfre in #77
- fix(sts-reconciler): use compatible headless service name for statefu… by @TrafalgarZZZ in #81
- [KEP-31]: Adding ControllerRevision support to the RoleBasedGroup by @bcfre in #27
- Release 0.5.0-alpha.3 by @cheyang in #78
New Contributors
- @yangsoon made their first contribution in #66
- @TrafalgarZZZ made their first contribution in #81
Full Changelog: v0.5.0-alpha.2...v0.5.0-alpha.3
v0.5.0-alpha.2
What's Changed
- fix: address review comments in pr-57 by @bcfre in #67
- doc: update gang scheduling by @bcfre in #62
- feat: The ControllerRevision not store the replicas for RBG Roles by @bcfre in #61
- CI:add release script by @cheyang in #71
- feat: rbgctl supports rbg revision operations by @bcfre in #54
- KEP-8: Reduce YAML Duplication via RoleTemplates by @LikiosSedo in #70
- Release 0.5.0-alpha.2 by @cheyang in #73
New Contributors
- @LikiosSedo made their first contribution in #70
Full Changelog: v0.5.0-alpha.1...v0.5.0-alpha.2
v0.5.0-alpha.1
What's Changed
- fix build image error in Makefile by @gujingit in #41
- feat: Support parallel execution for roles with same dependencies by @tlipoca9 in #42
- Build: update rbg helm chart by @cheyang in #44
- feat: Supports using controllerrevision hash to update role by @bcfre in #34
- feat: add instanceset api by @veophi in #50
- doc: add missing model examples by @bcfre in #49
- feat: use code-generator to generate applyconfiguration code by @liubing0427 in #48
- Build docker image for supporting controller revision by @cheyang in #51
- [WIP]: Add in-place update api and core codes for InstanceSet by @veophi in #52
- fix: change stateful set service name to meet k8s requirements by @liubing0427 in #53
- feat: add engine runtime by @gujingit in #55
- bugfix: add max len check for workloadName & serviceName by @gujingit in #58
- fix: delete corresponding podgroup created by rbg when gang-schedulin… by @ShirleyDing in #57
- Update Helm chart 0.5.0-alpha.1 by @cheyang in #60
New Contributors
- @tlipoca9 made their first contribution in #42
- @veophi made their first contribution in #50
- @liubing0427 made their first contribution in #48
- @ShirleyDing made their first contribution in #57
Full Changelog: v0.4.0...v0.5.0-alpha.1
v0.4.0
What's Changed
Features
- feat: support rbgs scaling by @gujingit in #1
- add workload status update event by @gujingit in #6
- refactor: update dynamo demo; remove Chinese comments by @gujingit in #13
- feat: Add pull request template by @bcfre in #11
- add status check when diff workload by @gujingit in #15
- feat: Format action templates to match sglang's pattern by @Pikabooboo in #19
- add unit-tests by @gujingit in #28
- support partition in rollingupdate by @gujingit in #30
- feat: Add support for 1:1 rbg per topology assignment by @gujingit in #32
- perf: reduce api-server load caused by exclusive-topology by @cheyang in #33
- feature: support volcano podgroup by @ZYecho11 in #14
Bug Fixes
- bugfix: Fix the permission issue that rbgs controller cannot create rbg by @bcfre in #12
- bugfix: Added consistency check for probes by @bcfre in #29
Build & CI
- Add build CI by @gujingit in #17
- CI: disable golint lll check by @gujingit in #37
- Enhance/replace with docker hub by @cheyang in #40
- Build: add vendor by @gujingit in #39
Docs
- doc: Using SGLang as the default inference engine by @gujingit in #4
- doc: Add CONTRIBUTING.md, development guide, updated image building logic by @bcfre in #9
New Contributors
- @bcfre made their first contribution in #12
- @Pikabooboo made their first contribution in #19
- @ZYecho11 made their first contribution in #14
Full Changelog: v0.3.0...v0.4.0