Comparing changes
base repository: dstackai/dstack
base: 0.19.30
head repository: dstackai/dstack
compare: 0.19.31
- 12 commits
- 56 files changed
- 4 contributors
Commits on Sep 25, 2025
Commit d622241
Commits on Sep 26, 2025
Kubernetes: request resources according to RequirementsSpec (#3127)
Other fixes and improvements:
* Handle errors in `_create_jump_pod_service_if_not_exists`
* Check both Service and Pod to decide if the jump pod must be (re)created
* Respect `Node.status.nodeinfo.architecture`
* Add `namespace` option to the backend config

Part-of: #3126
Commit 430552b
Commit ef698be
Commits on Sep 29, 2025
Support A4 instances with the B200 GPU on GCP (#3100)
This implementation allows provisioning both individual A4 instances and clusters, but clusters do not yet support high-speed networking, since it requires a [different network setup](https://cloud.google.com/ai-hypercomputer/docs/create/create-vm#setup-network).
Commit 9c51df8
Commit 840ce36
Move `USER` to `dstack project list --verbose` (#3134)
Only show the `USER` column in `dstack project list` if `--verbose` is passed. In my setup, where 9 projects are configured, this speeds up `dstack project list` from 20 seconds to 2 seconds.
Commit f90259b
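The speed-up described in the `USER` column change can be illustrated with a minimal sketch. It assumes that resolving `USER` costs one slow lookup per project; that cost model, along with the names `build_project_rows` and `fake_fetch_user`, is hypothetical and not taken from the dstack code base.

```python
from typing import Callable, Dict, List


def build_project_rows(
    projects: List[str],
    fetch_user: Callable[[str], str],
    verbose: bool = False,
) -> List[Dict[str, str]]:
    """Build table rows, resolving USER only when verbose output is requested."""
    rows = []
    for name in projects:
        row = {"PROJECT": name}
        if verbose:
            # Hypothetical slow call: assumed to be one lookup per project.
            row["USER"] = fetch_user(name)
        rows.append(row)
    return rows


calls: List[str] = []


def fake_fetch_user(project: str) -> str:
    """Stand-in for the slow lookup; records which projects were queried."""
    calls.append(project)
    return "admin"


# Default output: no lookups are made at all.
default_rows = build_project_rows(["main", "dev"], fake_fetch_user)
calls_after_default = list(calls)

# Verbose output: one lookup per project.
verbose_rows = build_project_rows(["main", "dev"], fake_fetch_user, verbose=True)
```

Under that assumption, the default listing does a constant amount of work regardless of how many projects are configured, which is consistent with the reported drop from 20 seconds to 2 seconds.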
Commit daa3d03
Commits on Sep 30, 2025
[Backward incompatible] Rename properties in Kubernetes backend config (#3137)
* networking -> proxy_jump
* ssh_host -> hostname
* ssh_port -> port

In addition, the `dstack-` prefix has been added to jump pod and service names for consistency with job pods and services.

Closes: #3136
Co-authored-by: peterschmidt85 <[email protected]>
Commit 85faee6
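Because the Kubernetes property rename is backward incompatible, existing backend configs need to be updated. The following is a minimal sketch of that migration applied to a config already loaded into a dictionary. The helper names are hypothetical, and the nesting of `ssh_host` and `ssh_port` under `networking` is an assumption; the commit message lists only the three renames.

```python
from typing import Any, Dict

# Renames listed in the commit message.
BACKEND_RENAMES = {"networking": "proxy_jump"}
PROXY_JUMP_RENAMES = {"ssh_host": "hostname", "ssh_port": "port"}


def _rename_keys(mapping: Dict[str, Any], renames: Dict[str, str]) -> Dict[str, Any]:
    """Return a new dict with keys renamed; unknown keys are kept as-is."""
    return {renames.get(key, key): value for key, value in mapping.items()}


def migrate_kubernetes_backend(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Kubernetes backend config using the new property names."""
    migrated = _rename_keys(config, BACKEND_RENAMES)
    section = migrated.get("proxy_jump")
    if isinstance(section, dict):
        # Assumption: ssh_host and ssh_port live inside the renamed section.
        migrated["proxy_jump"] = _rename_keys(section, PROXY_JUMP_RENAMES)
    return migrated


# Example values are placeholders, not taken from the source.
old_config = {
    "type": "kubernetes",
    "networking": {"ssh_host": "203.0.113.10", "ssh_port": 32000},
}
new_config = migrate_kubernetes_backend(old_config)
```

The function leaves the input untouched and is idempotent, so running it on an already migrated config returns an equal config.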
Commits on Oct 2, 2025
Support GCP A4 clusters (#3142)
This commit implements provisioning GCP A4 clusters with high-performance RoCE networking.

```shell
> dstack fleet
 FLEET  INSTANCE  BACKEND         RESOURCES                                           PRICE    STATUS  CREATED
 gpu    0         gcp (us-west2)  cpu=224 mem=3968GB disk=100GB B200:180GB:8 (spot)   $51.552  idle    21 mins ago
        1         gcp (us-west2)  cpu=224 mem=3968GB disk=100GB B200:180GB:8 (spot)   $51.552  idle    17 mins ago
```

To enable high-performance networking, users need to create the [appropriate networks](https://cloud.google.com/ai-hypercomputer/docs/create/create-vm#setup-network) and configure them in the backend settings.

```yaml
projects:
- name: main
  backends:
  - type: gcp
    project_id: my-project
    creds:
      type: default
    vpc_name: my-vpc-0   # regular, 1 subnet
    extra_vpcs:
    - my-vpc-1   # regular, 1 subnet
    roce_vpcs:
    - my-vpc-mrdma   # RoCE profile, 8 subnets
```

Then apply a fleet configuration.

```yaml
type: fleet
nodes: 2
placement: cluster
availability_zones: [us-west2-c]
backends: [gcp]
resources:
  gpu: 8:b200
```

Each instance in the cluster will then have 10 network interfaces:
- 1 regular interface in the main VPC (`default` or the one configured in `vpc_name`).
- 1 regular interface in a VPC configured in `extra_vpcs`.
- 8 RDMA interfaces in the VPC configured in `roce_vpcs`.

Additionally, this commit optimizes the fetching and caching of subnets, so that they are fetched from the API only once, and not separately for each item in `extra_vpcs`. For some instance types, this reduces the number of API requests from 9 to 1, which cuts about 16 seconds from each offer provisioning attempt.
Commit f7ef485
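The interface count given in the A4 clusters commit (1 + 1 + 8 = 10) can be checked with a short sketch. The function `plan_interfaces` is hypothetical and only mirrors the counts stated in the commit message; generalizing to several `extra_vpcs` or `roce_vpcs` entries is an assumption, not documented dstack behavior.

```python
from typing import List, Sequence, Tuple

# From the commit message: 8 RDMA interfaces in the VPC configured in roce_vpcs.
RDMA_INTERFACES_PER_ROCE_VPC = 8


def plan_interfaces(
    vpc_name: str = "default",
    extra_vpcs: Sequence[str] = (),
    roce_vpcs: Sequence[str] = (),
) -> List[Tuple[str, str]]:
    """Return (kind, vpc) pairs for the interfaces of one cluster instance."""
    # One regular interface in the main VPC.
    interfaces = [("regular", vpc_name)]
    # Assumption: one regular interface per entry in extra_vpcs.
    interfaces += [("regular", vpc) for vpc in extra_vpcs]
    # Assumption: a fixed number of RDMA interfaces per entry in roce_vpcs.
    for vpc in roce_vpcs:
        interfaces += [("rdma", vpc)] * RDMA_INTERFACES_PER_ROCE_VPC
    return interfaces


# Values taken from the backend config shown in the commit message.
interfaces = plan_interfaces("my-vpc-0", ["my-vpc-1"], ["my-vpc-mrdma"])
rdma_count = sum(1 for kind, _ in interfaces if kind == "rdma")
```

With the backend settings from the commit message, `interfaces` has 10 entries, 8 of which are RDMA interfaces.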
Kubernetes: add multi-node support (#3141)
* Discover and set instance's internal_ip (PodIP)
* Fix region mismatch
* Add `privileged: true` support
* [runner] Set RLIMIT_MEMLOCK to unlimited. Fixes issues with InfiniBand/RDMA

Part-of: #3126
Commit 8a72c8c
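RDMA libraries pin (lock) the memory buffers they register, so a low `RLIMIT_MEMLOCK` can make InfiniBand/RDMA workloads fail. The sketch below shows what setting that limit to unlimited looks like using Python's standard `resource` module. It is only an illustration of the system call involved, not the dstack runner's implementation, and the function name `raise_memlock_limit` is hypothetical.

```python
import resource


def raise_memlock_limit() -> bool:
    """Try to set RLIMIT_MEMLOCK to unlimited; return True on success."""
    unlimited = (resource.RLIM_INFINITY, resource.RLIM_INFINITY)
    try:
        resource.setrlimit(resource.RLIMIT_MEMLOCK, unlimited)
    except (ValueError, OSError):
        # Raising the hard limit requires privileges
        # (for example CAP_SYS_RESOURCE on Linux).
        return False
    return True


succeeded = raise_memlock_limit()
# Read the limits back to see what is actually in effect.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
```

In an unprivileged process the call is expected to fail, so the sketch reports the outcome instead of raising.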
Commit 3dbd68b
[Docs] Improve Kubernetes documentation (#3138)
* [Docs] Kubernetes guide
* [Docs] Kubernetes guide
  Rework `Backends` and `Fleets` pages to reflect the changes related to Kubernetes
* [Docs] Improve Kubernetes documentation
  Updated `README`, `Overview`, `Installation`
* [Docs] Improve Kubernetes documentation
  Minor updates, incl. the description of `Default image`, and `privileged` for NCCL tests
* [Docs] Improve Kubernetes documentation
  Updated `FAQ`
Commit 6201c2f
You can try running this command locally to see the comparison on your machine:
git diff 0.19.30...0.19.31