Closed as not planned
Description
Currently, the offers cache is implemented via Compute.get_offers_cached(requirements), which caches backend offers keyed by requirements. Since the cache is not reused across different requirements, it is expensive to get offers for every existing fleet when provisioning a run (#3020). The proposed solution is to make the offers cache backend-specific:
- We will call get_offers(requirements) for each fleet. Offer caching will not be shared at the get_offers level, but will be an implementation detail of the backend.
- Backends where offers do not depend on requirements will have a similar implementation: collect all offers and cache them, then apply the common offer filtering logic.
- Backends where offers depend on requirements cannot cache dstack offers directly, since offers cannot be formed without requirements. Instead, they will cache the results of the API requests from which offers are later formed using the requirements.
Essentially, the idea is to cache only the expensive operations (API calls) and allow doing fast operations (requirements filtering) many times.
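The two backend patterns above could be sketched roughly as follows. This is only an illustration of the idea, not dstack's actual API: `TTLCache`, `CatalogBackend`, `QuotaBackend`, `filter_offers`, the simplified `Requirements` and `Offer` classes, and the 180-second TTL are all hypothetical names and values chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Requirements:  # hypothetical, heavily simplified
    min_gpus: int = 0
    max_price: Optional[float] = None


@dataclass(frozen=True)
class Offer:  # hypothetical, heavily simplified
    instance: str
    gpus: int
    price: float


def filter_offers(offers: List[Offer], req: Requirements) -> List[Offer]:
    """Cheap, common filtering step that can run many times per run."""
    return [
        o for o in offers
        if o.gpus >= req.min_gpus
        and (req.max_price is None or o.price <= req.max_price)
    ]


class TTLCache:
    """Caches the result of one expensive call for `ttl` seconds."""

    def __init__(self, fetch: Callable[[], Any], ttl: float = 180.0):
        self._fetch = fetch
        self._ttl = ttl
        self._loaded = False
        self._value: Any = None
        self._expires_at = 0.0

    def get(self) -> Any:
        now = time.monotonic()
        if not self._loaded or now >= self._expires_at:
            self._value = self._fetch()  # the only expensive operation
            self._expires_at = now + self._ttl
            self._loaded = True
        return self._value


class CatalogBackend:
    """Offers do not depend on requirements: cache all offers, then filter."""

    def __init__(self, list_all_offers: Callable[[], List[Offer]]):
        self._cache = TTLCache(list_all_offers)

    def get_offers(self, req: Requirements) -> List[Offer]:
        return filter_offers(self._cache.get(), req)


class QuotaBackend:
    """Offers depend on requirements: cache the raw API response instead,
    and form offers from it on every call."""

    def __init__(self, fetch_capacity: Callable[[], Dict[str, float]]):
        self._cache = TTLCache(fetch_capacity)

    def get_offers(self, req: Requirements) -> List[Offer]:
        raw = self._cache.get()  # assumed shape: {"gpu_price": ..., "max_gpus": ...}
        gpus = max(req.min_gpus, 1)
        if gpus > raw["max_gpus"]:
            return []
        offer = Offer(f"custom-{gpus}gpu", gpus, gpus * raw["gpu_price"])
        return filter_offers([offer], req)
```

With this shape, calling `get_offers(requirements)` once per fleet triggers at most one API call per backend within the TTL, regardless of how many distinct requirements are passed.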
I also suggest that we move the offer collection and filtering logic from gpuhunt to dstack as part of this issue.