0.19.38

r4victor released this 21 Nov 08:46
· 112 commits to master since this release
3b58cae

Gateways

Routers

dstack gateways now integrate with SGLang Model Gateway, enabling inference request routing with policies such as cache_aware, power_of_two, round_robin, and random. To enable it, set the type of the router property to sglang in your gateway configuration and select one of the available routing policies.

Example configuration:

type: gateway
name: sglang-gateway

backend: aws
region: eu-west-1

domain: example.com
router:
  type: sglang
  policy: cache_aware

Read how the new router property works in the documentation.
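
To give a rough intuition for what the simpler policies do, here is a minimal Python sketch. It is not SGLang Model Gateway's implementation: the ToyRouter class, its in_flight load counter, and the worker names are all hypothetical and exist only for illustration. cache_aware is left out because it depends on per-worker cache state that a toy example cannot represent faithfully.

```python
import itertools
import random


class ToyRouter:
    """Illustrative only: a toy model of three routing policies."""

    def __init__(self, workers, seed=None):
        self.workers = list(workers)
        # Hypothetical load metric: requests currently in flight per worker.
        self.in_flight = {w: 0 for w in self.workers}
        self._rng = random.Random(seed)
        self._counter = itertools.count()

    def round_robin(self):
        # Cycle through the workers in a fixed order.
        return self.workers[next(self._counter) % len(self.workers)]

    def pick_random(self):
        # Choose a worker uniformly at random.
        return self._rng.choice(self.workers)

    def power_of_two(self):
        # Sample two distinct workers and keep the less loaded one.
        if len(self.workers) < 2:
            return self.workers[0]
        a, b = self._rng.sample(self.workers, 2)
        return a if self.in_flight[a] <= self.in_flight[b] else b


router = ToyRouter(["worker-a", "worker-b", "worker-c"], seed=0)
print([router.round_robin() for _ in range(4)])
```

In this toy model, power_of_two compares the load of only two randomly sampled workers per request rather than scanning all of them.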

Fleets

Run plan

Since the 0.19.26 release, dstack has provisioned instances according to configured fleets, but run plan offers didn’t reflect that, so the offers shown could differ from the ones actually used for provisioning.

This has now been fixed, and the run plan shows offers that respect the configured fleets.

For example, you can create a fleet for provisioning spot GPU instances on AWS:

type: fleet
name: cloud-fleet
nodes: 0..
backends: [aws]
spot_policy: spot
resources: 
  gpu: 1..

The run plan for submitted runs now shows offers that match the fleet configuration:

$ dstack apply
...
 #  BACKEND          RESOURCES                            INSTANCE TYPE  PRICE    
 1  aws (us-east-1)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526   
 2  aws (us-east-2)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526   
 3  aws (us-west-2)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526   
    ...                                                                           
 Shown 3 of 309 offers, $71.552max
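
Conceptually, the run plan now behaves as if the fleet configuration were applied as a filter over the available offers. The Python sketch below illustrates that idea under stated assumptions: Offer, FleetSpec, matches, and plan_offers are hypothetical names, the offer data is invented, and none of this reflects dstack's actual internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    backend: str
    region: str
    instance_type: str
    gpu_count: int
    spot: bool
    price: float


@dataclass(frozen=True)
class FleetSpec:
    # Hypothetical mirror of the fleet fields used in the example above.
    backends: tuple
    spot_only: bool
    min_gpus: int


def matches(offer, fleet):
    # An offer qualifies only if it satisfies every fleet constraint.
    return (
        offer.backend in fleet.backends
        and (offer.spot or not fleet.spot_only)
        and offer.gpu_count >= fleet.min_gpus
    )


def plan_offers(offers, fleet):
    # Keep matching offers and list the cheapest first.
    return sorted((o for o in offers if matches(o, fleet)), key=lambda o: o.price)


# Invented sample data, not real prices or instance types.
offers = [
    Offer("aws", "us-east-2", "gpu-large", 4, True, 1.10),
    Offer("aws", "us-east-1", "gpu-small", 1, True, 0.20),
    Offer("aws", "us-east-1", "gpu-small", 1, False, 0.50),
    Offer("aws", "us-west-2", "cpu-only", 0, True, 0.05),
    Offer("gcp", "us-central1", "gpu-small", 1, True, 0.18),
]
fleet = FleetSpec(backends=("aws",), spot_only=True, min_gpus=1)
for offer in plan_offers(offers, fleet):
    print(offer.backend, offer.region, offer.instance_type, offer.price)
```

Here the on-demand, CPU-only, and non-AWS offers are dropped, which mirrors how a fleet with backends: [aws], spot_policy: spot, and gpu: 1.. narrows what the run plan displays.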

What's changed

Full changelog: 0.19.37...0.19.38