0.19.38

@peterschmidt85

Gateways

Routers

dstack gateways now integrate with SGLang Model Gateway, enabling inference request routing with policies such as cache_aware, power_of_two, round_robin, and random. To enable it, set the type of the router property in your gateway configuration to sglang and choose one of the available routing policies.

Example configuration:

type: gateway
name: sglang-gateway

backend: aws
region: eu-west-1

domain: example.com
router:
  type: sglang
  policy: cache_aware

Read how the new router property works in the documentation.
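
For a rough intuition of what the simpler policies named above do, here is a generic Python sketch. It is not SGLang Model Gateway's implementation: the Worker class and the make_round_robin, pick_random, and pick_power_of_two helpers are hypothetical names used only for illustration, and cache_aware is left out because it depends on router-internal state.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical replica record; in_flight counts requests being served."""
    name: str
    in_flight: int = 0


def make_round_robin(workers):
    """Return a picker that cycles through the workers in order."""
    cycle = itertools.cycle(workers)
    return lambda: next(cycle)


def pick_random(workers, rng=random):
    """Pick a worker uniformly at random."""
    return rng.choice(workers)


def pick_power_of_two(workers, rng=random):
    """Sample two distinct workers and keep the less loaded one."""
    if len(workers) < 2:
        return workers[0]
    a, b = rng.sample(workers, 2)
    return a if a.in_flight <= b.in_flight else b


# Made-up load figures, for illustration only.
workers = [Worker("replica-0", 5), Worker("replica-1", 1), Worker("replica-2", 3)]

next_rr = make_round_robin(workers)
print([next_rr().name for _ in range(4)])
# ['replica-0', 'replica-1', 'replica-2', 'replica-0']

print(pick_power_of_two(workers).name)
```

With power-of-two choices, the most loaded replica in this toy data (replica-0) is never selected, because whichever replica it is paired with has fewer requests in flight.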

Fleets

Run plan

Since the 0.19.26 release, dstack has provisioned instances according to the configured fleets, but the offers shown in the run plan did not reflect that, so the plan might not have shown the offers actually used for provisioning.

This has now been fixed, and the run plan shows offers that respect the configured fleets.

For example, you can create a fleet for provisioning spot GPU instances on AWS:

type: fleet
name: cloud-fleet
nodes: 0..
backends: [aws]
spot_policy: spot
resources: 
  gpu: 1..

The run plan for submitted runs now shows offers that match the fleet configuration:

$ dstack apply
...
 #  BACKEND          RESOURCES                            INSTANCE TYPE  PRICE    
 1  aws (us-east-1)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526   
 2  aws (us-east-2)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526   
 3  aws (us-west-2)  cpu=4 mem=16GB disk=100GB T4:16GB:1  g4dn.xlarge    $0.526   
    ...                                                                           
 Shown 3 of 309 offers, $71.552max
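
Conceptually, the fix amounts to filtering the candidate offers by the fleet's constraints before they are shown. The Python sketch below illustrates that idea only and is not dstack's code: Offer, FleetSpec, and matching_offers are hypothetical names, and the instance types and prices are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    """Hypothetical offer record; not dstack's data model."""
    backend: str
    instance_type: str
    gpu_count: int
    spot: bool
    price: float


@dataclass
class FleetSpec:
    """Hypothetical subset of a fleet configuration."""
    backends: List[str]
    spot: Optional[bool]  # True = spot only, False = on-demand only, None = either
    min_gpus: int


def matching_offers(offers, fleet):
    """Keep offers that satisfy the fleet constraints, cheapest first."""
    kept = [
        o for o in offers
        if o.backend in fleet.backends
        and (fleet.spot is None or o.spot == fleet.spot)
        and o.gpu_count >= fleet.min_gpus
    ]
    return sorted(kept, key=lambda o: o.price)


# Mirrors the fleet above: backends [aws], spot_policy spot, gpu 1..
fleet = FleetSpec(backends=["aws"], spot=True, min_gpus=1)

# Made-up offers, for illustration only.
offers = [
    Offer("aws", "gpu-large", 4, True, 2.00),
    Offer("aws", "gpu-small", 1, True, 0.20),
    Offer("aws", "gpu-small", 1, False, 0.50),  # on-demand, excluded
    Offer("aws", "cpu-only", 0, True, 0.05),    # no GPU, excluded
    Offer("gcp", "gpu-small", 1, True, 0.18),   # other backend, excluded
]

for offer in matching_offers(offers, fleet):
    print(offer.backend, offer.instance_type, offer.price)
```

Only the two spot GPU offers on aws remain, ordered by price.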

What's changed

[Docs] Update to the latest mkdocs-material and add the contributing/DOCS.md by @peterschmidt85 in #3286
[Docs] Describe some gateway options on Concepts/Gateways page by @un-def in #3287
Expand max_duration reference by @r4victor in #3292
[Docker] Fix ssh zombie processes issue by @un-def in #3295
[Docs] Fix incorrect URLs by @peterschmidt85 in #3297
[Blog] NVIDIA DGX Spark by @peterschmidt85 in #3298
Return plan offers wrt fleets by @r4victor in #3300
Show task nodes in run plan by @r4victor in #3301
Log non-zero exit status in SSHTunnel.close/aclose by @un-def in #3296
Fix in-place update when files are used by @un-def in #3289
[Runpod] Require CUDA 12.8+ on the host by @peterschmidt85 in #3304
Fix SSHAttach.detach() by @un-def in #3306
Add SGLang Router Support by @Bihan in #3267

Full changelog: 0.19.37...0.19.38
