KEP-30: add role coordination kep #59
Merged
Commits:
- d78c75d add role coordination kep (gujingit)
- 470ee3c update struct & yaml (gujingit)
- 2e432a9 add api design (gujingit)
- 5930abd Update kep.yaml (Syspretor)
- f119958 Refine API struct and naming (Syspretor)
- 729ba3a Refine coordination api design (Syspretor)
add role coordination kep
commit d78c75d68a516908bc653eadbc3df15d27c822e8
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,257 @@ | ||
| # KEP-30: Role Coordination for RoleBasedGroup | ||
|
|
||
| ## Table of Contents | ||
|
|
||
| <!-- toc --> | ||
|
|
||
| - [Release Signoff Checklist](#release-signoff-checklist) | ||
| - [Summary](#summary) | ||
| - [Motivation](#motivation) | ||
| - [Goals](#goals) | ||
| - [Non-Goals](#non-goals) | ||
| - [Proposal](#proposal) | ||
| - [User Stories](#user-stories) | ||
| - [Story 1: Coordinated Rolling Update](#story-1-coordinated-rolling-update) | ||
| - [Implementation Details](#implementation-details) | ||
| - [API Changes](#api-changes) | ||
| - [Risks and Mitigations](#risks-and-mitigations) | ||
| - [Design Details](#design-details) | ||
| - [Test Plan](#test-plan) | ||
| - [Graduation Criteria](#graduation-criteria) | ||
| - [Implementation History](#implementation-history) | ||
| - [Drawbacks](#drawbacks) | ||
|
|
||
| <!-- /toc --> | ||
|
|
||
| ## Release Signoff Checklist | ||
|
|
||
| - [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial | ||
| KEP PR) | ||
| - [ ] (R) KEP approvers have approved the KEP status as `implementable` | ||
| - [ ] (R) Design details are appropriately documented | ||
| - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test | ||
| refactors) | ||
| - [ ] e2e Tests for all Beta API Operations (endpoints) | ||
| - [ ] (R) Ensure GA e2e tests meet requirements | ||
| for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) | ||
| - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free | ||
| - [ ] (R) Graduation criteria is in place | ||
| - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit | ||
| by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) | ||
| within one minor version of promotion to GA | ||
| - [ ] (R) Production readiness review completed | ||
| - [ ] (R) Production readiness review approved | ||
| - [ ] "Implementation History" section is up-to-date for milestone | ||
| - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] | ||
| - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, | ||
| relevant PRs/issues, release notes | ||
|
|
||
| ## Summary | ||
|
|
||
| This KEP proposes adding role coordination capabilities to the RoleBasedGroup (RBG) controller. | ||
| Currently, RBG manages multiple roles independently, but many real-world applications require coordinated updates | ||
| across multiple roles. This enhancement introduces a coordination mechanism that allows defining complex update | ||
| strategies spanning multiple roles, such as updating a frontend role partially, then updating a backend role completely, | ||
| and finally completing the frontend update. | ||
|
|
||
| ## Motivation | ||
|
|
||
| In complex distributed applications, individual components often need to be updated in a specific sequence to | ||
| maintain application availability and consistency. The current RoleBasedGroup implementation updates each role | ||
| independently, which can lead to service disruptions during updates. For example, in prefill/decode-disaggregated | ||
| (PD-disagg) LLM inference, the prefill and decode roles often need to be updated at a fixed ratio, such as 4 prefill pods to 2 decode pods (4P2D). | ||
|
|
||
| | Stage | Upgrade Process | Comments | | ||
| |-------|-------------------------------------------------------------|------------------------------------------------------------------------------------------------------| | ||
| | 1 | Old Prefill: 4; Old Decode: 2 | Begin updating the RBG. | | ||
| | 2 | Old Prefill: 2, New Prefill: 2; Old Decode: 2 | Update 2 prefill pods first. | | ||
| | 3 | Old Prefill: 2, New Prefill: 2; Old Decode: 1, New Decode: 1 | Pause the prefill update and update one decode pod. The P:D ratio must be maintained at 2:1. | | ||
| | 4 | New Prefill: 4; Old Decode: 1, New Decode: 1 | Continue updating the prefill pods. | | ||
| | 5 | New Prefill: 4; New Decode: 2 | Update complete. | | ||
|
|
||
| ### Goals | ||
|
|
||
| 1. Enable coordinated updates across multiple roles in a RoleBasedGroup | ||
| 2. Support phased rollouts where some roles are updated partially while others wait | ||
| 3. Provide a flexible coordination strategy definition mechanism | ||
| 4. Maintain backward compatibility with existing RoleBasedGroup resources | ||
| 5. Allow rollback capabilities for coordinated updates | ||
|
|
||
| ### Non-Goals | ||
|
|
||
| 1. Implement coordination across multiple RoleBasedGroupSet resources | ||
| 2. Handle coordination of non-workload resources (e.g., ConfigMaps, Secrets) | ||
|
|
||
| ## Proposal | ||
|
|
||
| This KEP introduces a new `coordination` field to the RoleBasedGroup specification that defines how roles should | ||
| be coordinated in relation to each other. The coordination strategy consists of a series of steps, each specifying which | ||
| roles to update and how. | ||
|
|
||
| ### User Stories | ||
|
|
||
| #### Story 1: Coordinated Rolling Update | ||
|
|
||
| In the PD-disaggregated scenario for LLM inference, the input/output pattern is relatively fixed, | ||
| with an optimal P:D ratio of 2:1. | ||
| Each time 2 Prefill Pods are updated, 1 Decode Pod needs to be updated accordingly to maintain this ratio. | ||
|
|
||
| ##### Coordinated Rolling Update Process | ||
|
|
||
| The coordinated rolling update process ensures that the P:D ratio is maintained throughout the update cycle. | ||
| Here's how it works: | ||
|
|
||
| 1. **Initial State**: The system starts with all old Prefill and Decode pods running | ||
| 2. **Step-by-Step Update**: | ||
| - Update Prefill pods in batches of 2 | ||
| - For each batch of 2 Prefill pods updated, update 1 Decode pod | ||
| - Monitor readiness of updated pods before proceeding | ||
| 3. **Completion**: Continue until all Prefill and Decode pods are updated while maintaining the 2:1 ratio | ||
|
|
||
| This approach ensures service continuity and optimal resource utilization during the update process, | ||
| preventing performance degradation due to imbalanced P:D ratios. | ||
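The batching described above can be sketched as a small planner. This is only an illustration under stated assumptions, not the controller's logic: `Stage` and `PlanStages` are hypothetical names, and the sketch assumes that within each round the Prefill batch is updated before the Decode batch, as in the 4P2D table in the Motivation section.

```go
package main

import "fmt"

// Stage is a hypothetical snapshot of how many pods of each role
// already run the new revision.
type Stage struct {
	NewPrefill, NewDecode int
}

// minInt returns the smaller of a and b.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// PlanStages lists the states an update passes through when Prefill and
// Decode are updated alternately in fixed-size batches. It returns nil
// for non-positive batch sizes.
func PlanStages(prefill, decode, prefillBatch, decodeBatch int) []Stage {
	if prefillBatch < 1 || decodeBatch < 1 {
		return nil
	}
	stages := []Stage{{0, 0}} // initial state: everything is old
	p, d := 0, 0
	for p < prefill || d < decode {
		if p < prefill {
			p = minInt(p+prefillBatch, prefill)
			stages = append(stages, Stage{p, d})
		}
		if d < decode {
			d = minInt(d+decodeBatch, decode)
			stages = append(stages, Stage{p, d})
		}
	}
	return stages
}

func main() {
	// 4 Prefill and 2 Decode pods, updated 2 and 1 at a time.
	for i, s := range PlanStages(4, 2, 2, 1) {
		fmt.Printf("stage %d: new prefill=%d old prefill=%d new decode=%d old decode=%d\n",
			i+1, s.NewPrefill, 4-s.NewPrefill, s.NewDecode, 2-s.NewDecode)
	}
}
```

For 4 Prefill and 2 Decode pods with batches of 2 and 1, the planner yields the same five stages as the table in the Motivation section.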
|
|
||
| ### Implementation Details | ||
|
|
||
| #### API Changes | ||
|
|
||
| Add a new `Coordination` field to the RoleBasedGroup spec: | ||
|
|
||
| ```go | ||
| type RoleBasedGroupSpec struct { | ||
| // Existing fields... | ||
|
|
||
| // Coordination defines how roles should be coordinated | ||
| // +optional | ||
| Coordination *Coordination `json:"coordination,omitempty"` | ||
| } | ||
|
|
||
| type Coordination struct { | ||
| // Steps defines the sequence of coordination steps | ||
| Steps []CoordinationStep `json:"steps"` | ||
| } | ||
|
|
||
| type CoordinationStep struct { | ||
| // Roles involved in this step | ||
| Roles []string `json:"roles"` | ||
|
|
||
| // Strategy for each role in this step | ||
| RoleStrategies map[string]RoleStrategy `json:"roleStrategies,omitempty"` | ||
| } | ||
|
|
||
| type RoleStrategy struct { | ||
| UpdateStrategy RoleUpdateStrategy `json:"updateStrategy,omitempty"` | ||
| // ScalingStrategy | ||
| // DeletingStrategy | ||
| // ... | ||
| } | ||
|
|
||
| type RoleUpdateStrategy struct { | ||
| Partition *int32 `json:"partition,omitempty"` | ||
| MaxUnavailable intstr.IntOrString `json:"maxUnavailable,omitempty"` | ||
| // MaxSurge is assumed to share MaxUnavailable's type | ||
| MaxSurge intstr.IntOrString `json:"maxSurge,omitempty"` | ||
| } | ||
|
|
||
| ``` | ||
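As a rough illustration of how the 4P2D scenario might be expressed with these types, the sketch below builds a four-step `Coordination` value. Several things here are assumptions rather than part of the proposal: the types are simplified local mirrors (only `Partition` is kept), the role names `prefill` and `decode` are placeholders, `partitionStep` and `FourPTwoD` are hypothetical helpers, and `Partition` is assumed to follow StatefulSet semantics, where pods with an ordinal greater than or equal to the partition are updated.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified local mirrors of the proposed API types so the sketch
// compiles on its own; MaxUnavailable and MaxSurge are left out.
type RoleUpdateStrategy struct {
	Partition *int32 `json:"partition,omitempty"`
}

type RoleStrategy struct {
	UpdateStrategy RoleUpdateStrategy `json:"updateStrategy"`
}

type CoordinationStep struct {
	Roles          []string                `json:"roles"`
	RoleStrategies map[string]RoleStrategy `json:"roleStrategies,omitempty"`
}

type Coordination struct {
	Steps []CoordinationStep `json:"steps"`
}

// partitionStep builds a step that moves a single role to the given partition.
func partitionStep(role string, partition int32) CoordinationStep {
	return CoordinationStep{
		Roles: []string{role},
		RoleStrategies: map[string]RoleStrategy{
			role: {UpdateStrategy: RoleUpdateStrategy{Partition: &partition}},
		},
	}
}

// FourPTwoD describes the 4 Prefill / 2 Decode rollout as four steps:
// half of Prefill, half of Decode, the rest of Prefill, the rest of Decode.
func FourPTwoD() Coordination {
	return Coordination{Steps: []CoordinationStep{
		partitionStep("prefill", 2), // ordinals 2 and 3 move to the new revision
		partitionStep("decode", 1),  // ordinal 1 moves to the new revision
		partitionStep("prefill", 0), // remaining Prefill pods
		partitionStep("decode", 0),  // remaining Decode pod
	}}
}

func main() {
	out, err := json.MarshalIndent(FourPTwoD(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping one role per step makes each step's completion condition easy to state; whether several roles may share a step with different strategies is left to the API design.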
|
|
||
| Add a new `CoordinationState` field to the RoleBasedGroup status: | ||
|
|
||
| ```go | ||
| // Add coordination status in RoleBasedGroupStatus | ||
| type RoleBasedGroupStatus struct { | ||
| CoordinationState CoordinationState `json:"coordinationState,omitempty"` | ||
| } | ||
|
|
||
| type CoordinationState struct { | ||
| // Current phase being coordinated | ||
| CurrentPhase string `json:"currentPhase,omitempty"` | ||
| // Coordination progress information | ||
| Progress map[string]string `json:"progress,omitempty"` | ||
| LastUpdateTime metav1.Time `json:"lastUpdateTime,omitempty"` | ||
| } | ||
|
|
||
| ``` | ||
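One way a controller could fill in this status is sketched below. The KEP does not define the contents of `CurrentPhase` or `Progress`, so the `step-N` phase name and the `updated/total` progress format are assumptions made for illustration, as are the `RoleCounts` type and the `BuildState` helper. `time.Time` stands in for `metav1.Time` to keep the sketch free of external dependencies.

```go
package main

import (
	"fmt"
	"time"
)

// CoordinationState mirrors the proposed status type, with time.Time
// standing in for metav1.Time.
type CoordinationState struct {
	CurrentPhase   string
	Progress       map[string]string
	LastUpdateTime time.Time
}

// RoleCounts is a hypothetical per-role summary of update progress.
type RoleCounts struct {
	Updated, Total int32
}

// BuildState renders the current step index and per-role counts into a
// CoordinationState, using an assumed "updated/total" progress format.
func BuildState(stepIndex int, counts map[string]RoleCounts, now time.Time) CoordinationState {
	progress := make(map[string]string, len(counts))
	for role, c := range counts {
		progress[role] = fmt.Sprintf("%d/%d", c.Updated, c.Total)
	}
	return CoordinationState{
		CurrentPhase:   fmt.Sprintf("step-%d", stepIndex),
		Progress:       progress,
		LastUpdateTime: now,
	}
}

func main() {
	state := BuildState(1, map[string]RoleCounts{
		"prefill": {Updated: 2, Total: 4},
		"decode":  {Updated: 0, Total: 2},
	}, time.Now())
	fmt.Printf("%s prefill=%s decode=%s\n",
		state.CurrentPhase, state.Progress["prefill"], state.Progress["decode"])
}
```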
|
|
||
| ### Risks and Mitigations | ||
|
|
||
| 1. **Complexity Risk**: Adding coordination logic increases controller complexity | ||
| - Mitigation: Implement thorough unit and integration tests | ||
|
|
||
| 2. **Deadlock Risk**: Poorly configured coordination strategies could cause updates to stall | ||
| - Mitigation: Add timeouts and clear status reporting | ||
|
|
||
| 3. **Backward Compatibility**: Existing RoleBasedGroups should continue to work unchanged | ||
| - Mitigation: Only apply coordination logic when `coordination` is specified | ||
|
|
||
| ## Design Details | ||
|
|
||
| The implementation will modify the main reconciliation loop | ||
| in [RoleBasedGroupReconciler] to check for a coordination strategy. If present, it will execute the coordinated | ||
| update logic; otherwise, it will fall back to the existing independent role update behavior. | ||
|
|
||
| Each coordination step will: | ||
|
|
||
| 1. Apply the specified strategies to the relevant roles | ||
| 2. Monitor the status of those roles | ||
| 3. Proceed to the next step only when the current step is complete | ||
|
|
||
| The controller will use the existing workload reconcilers (StatefulSetReconciler, DeploymentReconciler, etc.) but with | ||
| modified parameters based on the coordination strategy. | ||
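The step-gating behaviour described above can be sketched as follows. This is a minimal illustration, not the reconciler itself: `Step`, `RoleDone`, `CurrentStep`, and `Decide` are hypothetical names, and how a role is judged to have reached a step's target (for example, via workload readiness) is abstracted behind a callback.

```go
package main

import "fmt"

// Step is a reduced view of a coordination step: only the role names matter here.
type Step struct {
	Roles []string
}

// RoleDone is a hypothetical callback reporting whether a role has reached
// the target state required by the given step.
type RoleDone func(step int, role string) bool

// CurrentStep returns the index of the first step with an unfinished role,
// or len(steps) when every step is complete.
func CurrentStep(steps []Step, done RoleDone) int {
	for i, s := range steps {
		for _, role := range s.Roles {
			if !done(i, role) {
				return i
			}
		}
	}
	return len(steps)
}

// Decide names the path a reconcile pass would take: the existing
// independent behaviour when no steps are defined, otherwise the first
// incomplete step.
func Decide(steps []Step, done RoleDone) string {
	if len(steps) == 0 {
		return "independent"
	}
	i := CurrentStep(steps, done)
	if i == len(steps) {
		return "complete"
	}
	return fmt.Sprintf("apply step %d", i)
}

func main() {
	steps := []Step{{Roles: []string{"prefill"}}, {Roles: []string{"decode"}}}
	// Pretend only step 0 has finished so far.
	onlyFirst := func(step int, role string) bool { return step == 0 }
	fmt.Println(Decide(steps, onlyFirst))
	fmt.Println(Decide(nil, onlyFirst))
}
```

Recomputing the current step from observed state on every pass, rather than storing a cursor, keeps the sketch stateless; a real controller may prefer to persist progress in the status.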
|
|
||
| ### Test Plan | ||
|
|
||
| #### Unit Tests | ||
|
|
||
| - Test coordination strategy parsing and validation | ||
| - Test step execution logic | ||
| - Test status tracking and updates | ||
| - Test edge cases (empty steps, invalid configurations) | ||
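The validation cases listed above could target a function along these lines. The rules shown (at least one step, non-empty roles, roles that exist in the group, strategies only for roles listed in the step) are assumptions about what "invalid" might mean; the KEP does not enumerate them. `Step` and `ValidateSteps` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a reduced view of a coordination step; strategy contents are elided.
type Step struct {
	Roles          []string
	RoleStrategies map[string]struct{}
}

// ValidateSteps applies a few assumed sanity rules to a coordination
// definition and returns the first violation found.
func ValidateSteps(steps []Step, knownRoles map[string]bool) error {
	if len(steps) == 0 {
		return errors.New("coordination must define at least one step")
	}
	for i, s := range steps {
		if len(s.Roles) == 0 {
			return fmt.Errorf("step %d: roles must not be empty", i)
		}
		inStep := make(map[string]bool, len(s.Roles))
		for _, r := range s.Roles {
			if !knownRoles[r] {
				return fmt.Errorf("step %d: unknown role %q", i, r)
			}
			inStep[r] = true
		}
		for r := range s.RoleStrategies {
			if !inStep[r] {
				return fmt.Errorf("step %d: strategy for role %q not listed in roles", i, r)
			}
		}
	}
	return nil
}

func main() {
	known := map[string]bool{"prefill": true, "decode": true}
	cases := map[string][]Step{
		"empty":        nil,
		"unknown role": {{Roles: []string{"router"}}},
		"valid":        {{Roles: []string{"prefill"}}, {Roles: []string{"decode"}}},
	}
	for name, steps := range cases {
		fmt.Printf("%s: %v\n", name, ValidateSteps(steps, known))
	}
}
```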
|
|
||
| #### Integration Tests | ||
|
|
||
| - Test full coordination flow with multiple roles | ||
| - Test partial updates within steps | ||
| - Test rollback scenarios | ||
| - Test interaction with existing independent role updates | ||
|
|
||
| #### E2E Tests | ||
|
|
||
| - Deploy a multi-role application with coordination strategy | ||
| - Execute coordinated update and verify correct sequence | ||
| - Verify application availability during update | ||
|
|
||
| ### Graduation Criteria | ||
|
|
||
| #### Alpha | ||
|
|
||
| - Basic coordination strategy implementation | ||
| - Support for simple sequential role updates | ||
| - Unit and integration tests | ||
| - Documentation and examples | ||
|
|
||
| #### Beta | ||
|
|
||
| - Support for complex coordination patterns | ||
| - Comprehensive e2e tests | ||
| - Metrics and monitoring | ||
| - User feedback and iterations | ||
|
|
||
| #### GA | ||
|
|
||
| - Proven stability in production environments | ||
| - Complete documentation and best practices | ||
| - No critical bugs reported for 2 consecutive releases | ||
|
|
||
| ## Implementation History | ||
|
|
||
| - 2025-10-17: KEP created | ||
| - TBD: Alpha implementation | ||
| - TBD: Beta implementation | ||
| - TBD: GA implementation | ||
|
|
||
| ## Drawbacks | ||
|
|
||
| 1. Increased complexity in the RoleBasedGroup controller | ||
| 2. Additional status tracking and state management | ||
| 3. Potential for misconfigured coordination strategies to block updates | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| title: KEP Template | ||
| kep-number: NNNN | ||
| authors: | ||
| - "@jane.doe" | ||
|
||
| owning-sig: sig-xyz | ||
| participating-sigs: | ||
| - sig-aaa | ||
| - sig-bbb | ||
| status: provisional|implementable|implemented|deferred|rejected|withdrawn|replaced | ||
| creation-date: yyyy-mm-dd | ||
| reviewers: | ||
| - TBD | ||
| - "@alice.doe" | ||
| approvers: | ||
| - TBD | ||
| - "@oscar.doe" | ||
|
|
||
| see-also: | ||
| - "/keps/sig-aaa/1234-we-heard-you-like-keps" | ||
| - "/keps/sig-bbb/2345-everyone-gets-a-kep" | ||
| replaces: | ||
| - "/keps/sig-ccc/3456-replaced-kep" | ||
|
|
||
| # The target maturity stage in the current dev cycle for this KEP. | ||
| # If the purpose of this KEP is to deprecate a user-visible feature | ||
| # and Deprecated feature gates are added, they should be deprecated|disabled|removed. | ||
| stage: alpha|beta|stable | ||
|
|
||
| # The most recent milestone for which work toward delivery of this KEP has been | ||
| # done. This can be the current (upcoming) milestone, if it is being actively | ||
| # worked on. | ||
| latest-milestone: "v1.19" | ||
|
|
||
| # The milestone at which this feature was, or is targeted to be, at each stage. | ||
| milestone: | ||
| alpha: "v1.19" | ||
| beta: "v1.20" | ||
| stable: "v1.22" | ||
|
|
||
| # The following PRR answers are required at alpha release | ||
| # List the feature gate name and the components for which it must be enabled | ||
| feature-gates: | ||
| - name: MyFeature | ||
| components: | ||
| - kube-apiserver | ||
| - kube-controller-manager | ||
| disable-supported: true | ||
|
|
||
| # The following PRR answers are required at beta release | ||
| metrics: | ||
| - my_feature_metric | ||
I suggest adding a detailed design for instructing the router component to split requests between the old and new Pods in real time during a RoleBasedGroup coordinated upgrade, so that it can integrate with SGLang Router's rolling update workflow.