0.19.38
Gateways
Routers
dstack gateways now integrate with SGLang Model Gateway, enabling inference request routing with policies such as cache_aware, power_of_two, round_robin, and random. To enable it, set the router type to sglang in your gateway configuration and select one of the available routing policies.
Example configuration:
type: gateway
name: sglang-gateway
backend: aws
region: eu-west-1
domain: example.com
router:
  type: sglang
  policy: cache_aware

Read how the new router property works in the documentation.
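To give a feel for what two of these policy names mean, here is a minimal Python sketch of round-robin and power-of-two-choices selection as those techniques are commonly described. It is not the SGLang or dstack implementation; `Worker`, `make_round_robin`, `pick_power_of_two`, and the replica URLs are hypothetical names chosen for illustration.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical stand-in for one inference replica."""
    url: str
    in_flight: int = 0  # requests currently being served (illustrative load metric)


def make_round_robin(workers):
    """Return a picker that cycles through the workers in a fixed order."""
    cycle = itertools.cycle(workers)
    return lambda: next(cycle)


def pick_power_of_two(workers, rng=random):
    """Sample two distinct workers and keep the less loaded one."""
    first, second = rng.sample(workers, 2)
    return first if first.in_flight <= second.in_flight else second


# Made-up replicas and loads, for illustration only.
workers = [
    Worker("http://replica-0:8000", in_flight=5),
    Worker("http://replica-1:8000", in_flight=1),
    Worker("http://replica-2:8000", in_flight=3),
]

next_worker = make_round_robin(workers)
round_robin_order = [next_worker().url for _ in range(4)]

chosen = pick_power_of_two(workers, random.Random(0))
print(round_robin_order)
print(chosen.url)
```

Round-robin ignores load entirely, while power-of-two-choices can never pick the single most loaded replica, because the busier worker of any sampled pair is discarded.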
Fleets
Run plan
Since the 0.19.26 release, dstack has provisioned instances according to configured fleets, but run plan offers did not reflect that, so the offers shown were not necessarily the ones used for provisioning.
This has now been fixed, and the run plan shows offers that respect the configured fleets.
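Conceptually, respecting a fleet means narrowing the offer list by the fleet's constraints before showing it. The Python sketch below illustrates that idea only; it is not dstack's code, and `Offer`, `FleetSpec`, `offers_for_fleet`, and all offer data and prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical offer record; field names are assumptions."""
    backend: str
    region: str
    instance_type: str
    gpu_count: int
    spot: bool
    price: float  # made-up hourly price in USD


@dataclass(frozen=True)
class FleetSpec:
    """Hypothetical subset of fleet constraints."""
    backends: tuple
    spot_only: bool
    min_gpus: int


def offers_for_fleet(offers, fleet):
    """Keep offers that satisfy the fleet constraints, cheapest first."""
    matching = [
        o for o in offers
        if o.backend in fleet.backends
        and (o.spot or not fleet.spot_only)
        and o.gpu_count >= fleet.min_gpus
    ]
    return sorted(matching, key=lambda o: o.price)


# Illustrative data only; these are not real quotes.
offers = [
    Offer("aws", "us-east-2", "g4dn.12xlarge", 4, True, 1.20),
    Offer("aws", "us-east-1", "g4dn.xlarge", 1, True, 0.20),
    Offer("aws", "us-east-1", "g4dn.xlarge", 1, False, 0.53),
    Offer("gcp", "us-central1", "example-gpu-small", 1, True, 0.15),
    Offer("aws", "us-west-2", "m5.xlarge", 0, True, 0.05),
]

fleet = FleetSpec(backends=("aws",), spot_only=True, min_gpus=1)
plan = offers_for_fleet(offers, fleet)
for o in plan:
    print(o.backend, o.region, o.instance_type, o.price)
```

In this sketch the on-demand, non-AWS, and CPU-only offers are dropped, which mirrors the intent of the fix: the plan lists only offers the fleet could actually provision.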
For example, you can create a fleet for provisioning spot GPU instances on AWS:
type: fleet
name: cloud-fleet
nodes: 0..
backends: [aws]
spot_policy: spot
resources:
  gpu: 1..

The run plan for submitted runs now shows offers that match the fleet configuration:
✗ dstack apply
...
#  BACKEND          RESOURCES                            INSTANCE TYPE  PRICE
1  aws (us-east-1)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526
2  aws (us-east-2)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526
3  aws (us-west-2)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526
...
Shown 3 of 309 offers, $71.552max

What's changed
- [Docs] Update to the latest mkdocs-material and add the contributing/DOCS.md by @peterschmidt85 in #3286
- [Docs] Describe some gateway options on Concepts/Gateways page by @un-def in #3287
- Expand max_duration reference by @r4victor in #3292
- [Docker] Fix ssh zombie processes issue by @un-def in #3295
- [Docs] Fix incorrect URLs by @peterschmidt85 in #3297
- [Blog] NVIDIA DGX Spark by @peterschmidt85 in #3298
- Return plan offers wrt fleets by @r4victor in #3300
- Show task nodes in run plan by @r4victor in #3301
- Log non-zero exit status in SSHTunnel.close/aclose by @un-def in #3296
- Fix in-place update when files are used by @un-def in #3289
- [Runpod] Require CUDA 12.8+ on the host by @peterschmidt85 in #3304
- Fix SSHAttach.detach() by @un-def in #3306
- Add SGLang Router Support by @Bihan in #3267
Full changelog: 0.19.37...0.19.38