Here is the development roadmap for v0.4.0. Contributions and feedback are welcome.
Upgrades
In-Place Upgrades: Support for updating components without pod recreation.
Orchestrated Upgrade Order: Upgrade components in a sequence that respects their required startup order.
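To illustrate the orchestrated upgrade order item, here is a minimal sketch of deriving an upgrade sequence from startup dependencies. It is not the project's implementation. The `upgrade_order` helper, the `router` component, and the dependency map are hypothetical and exist only for illustration. It assumes the startup order can be expressed as a map from each component to the components that must be up before it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def upgrade_order(startup_deps):
    """Return component names so that each one follows everything it depends on.

    startup_deps maps a component to the set of components that must be
    running before it starts (hypothetical input shape, for illustration).
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(startup_deps).static_order())


# Hypothetical example: a router that needs both serving roles up first.
deps = {
    "router": {"prefill", "decode"},
    "prefill": set(),
    "decode": set(),
}
order = upgrade_order(deps)
```

Upgrading in this order means a component is only touched once its prerequisites have been upgraded and are ready again. A real controller would also wait for readiness between steps.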
Scheduling
Original Node Scheduling: Support for scheduling pods back to their original nodes after restarts or preemptions.
Multi-Level Gang Scheduling: Enable the co-scheduling of multiple, dependent groups of pods.
Volcano Integration: Support for gang scheduling via the Volcano scheduler.
Topology-Aware Scheduling: Co-locate Prefill and Decode pods on the same node whenever possible to maximize GPU utilization and VRAM efficiency.
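The all-or-nothing behavior behind gang scheduling can be sketched as follows. This is an illustrative first-fit heuristic, not Volcano's algorithm or this project's scheduler. The `try_place_gang` function, the pod and node names, and the GPU counts are all assumptions. First-fit can miss a feasible placement that a full bin-packing search would find.

```python
def try_place_gang(pod_gpu_requests, free_gpus):
    """Place every pod in the gang or none of them.

    pod_gpu_requests: {pod_name: gpus_needed}  (hypothetical input shape)
    free_gpus:        {node_name: gpus_free}
    Returns {pod_name: node_name} if all pods fit, otherwise None.
    """
    remaining = dict(free_gpus)  # work on a copy; commit only on full success
    assignment = {}
    # Place the largest requests first; ties keep their original order.
    for pod, need in sorted(pod_gpu_requests.items(), key=lambda kv: -kv[1]):
        node = next((n for n, free in remaining.items() if free >= need), None)
        if node is None:
            return None  # one pod cannot fit, so the whole gang is rejected
        remaining[node] -= need
        assignment[pod] = node
    return assignment


gang = {"prefill-0": 4, "decode-0": 4}
placed = try_place_gang(gang, {"node-a": 8})
rejected = try_place_gang(gang, {"node-a": 4, "node-b": 2})
```

In this sketch both pods land on `node-a` when it has 8 free GPUs, which also matches the co-location goal of the topology-aware item. With only 4 and 2 free GPUs, nothing is placed at all.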
Fault Tolerance
Configurable Failure Policies: Allow users to choose among FailurePolicy strategies that determine how pod failures are handled.
Runtime
Simplified, Runtime-less Service Discovery: Streamline the cluster ConfigMap to reduce overhead and enable service discovery without requiring a dedicated EngineRuntime component.