@jvstme jvstme commented Oct 21, 2025

Add a catalog item field for holding arbitrary
provider-specific data. As an example, use this
field for reserved GCP A4 instances.

Below is a comparison of different field types
that could be used for this field in both gpuhunt
and dstack. This commit suggests using typed
dicts.

JSON string (parsing into a Pydantic model on the provider/backend side)

Cons:

  • Inefficient, inconvenient, and error-prone to
    write: updating an attribute of an already
    serialized object requires an unnecessary
    deserialization and re-serialization.
  • Possibility of incorrect usage (writing non-JSON
    data).
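
A minimal sketch of the write path this option implies, using only the standard library. The `CatalogItem` class, its `provider_data` field, and the `reservation` key are hypothetical stand-ins for illustration, not the actual gpuhunt definitions.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogItem:
    # Hypothetical stand-in for a catalog item; not the real gpuhunt class.
    instance_name: str
    provider_data: Optional[str] = None  # JSON-encoded string


item = CatalogItem("a4-highgpu-8g", provider_data=json.dumps({"reservation": "r1"}))

# Updating a single attribute requires a full decode/encode round trip.
data = json.loads(item.provider_data)
data["reservation"] = "r2"
item.provider_data = json.dumps(data)

# Nothing stops a caller from storing a string that is not JSON at all;
# the mistake only shows up when someone tries to decode it.
broken = CatalogItem("a4-highgpu-8g", provider_data="reservation=r1")
try:
    json.loads(broken.provider_data)
    decode_failed = False
except json.JSONDecodeError:
    decode_failed = True
```

Both cons are visible here: the update takes three statements instead of one, and the `str` annotation cannot express "valid JSON".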

dict / TypedDict

Cons:

  • No validation, which means potential errors
    occur at attribute access time rather than at
    model loading time.
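
To illustrate the trade-off, here is a small sketch of how a typed dict behaves at runtime. `GCPProviderData` and its `is_reserved` key are assumptions made for this example; the actual keys are not listed in this description.

```python
from typing import Any, Dict, TypedDict, cast


class GCPProviderData(TypedDict, total=False):
    # Hypothetical shape, for illustration only.
    is_reserved: bool


def is_reserved(provider_data: GCPProviderData) -> bool:
    return provider_data["is_reserved"]


# TypedDict is a static-typing construct: at runtime the value is a plain
# dict, so data with a misspelled key "loads" without any complaint.
raw: Dict[str, Any] = {"is_reservd": True}
loaded = cast(GCPProviderData, raw)

# The problem only surfaces later, when the key is accessed.
try:
    is_reserved(loaded)
    failed_at_access = False
except KeyError:
    failed_at_access = True
```

A static type checker can still catch misuse in code that constructs or reads the dict directly, which is the main benefit over an untyped `dict`.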

Pydantic models with a type discriminator

Cons:

  • Extra difficulties maintaining backward
    compatibility, as the models are passed from
    gpuhunt to dstack server, from server to client,
    and from client to server, all with validation.
  • Duplication of backend type in the
    backend-specific field and in other fields of
    the offer or catalog (e.g.,
    InstanceOffer.backend and
    InstanceOffer.backend_data.type).
  • Discriminators require declaring all possible
    discriminator values, which in the future will
    hinder the transition to a more modular
    architecture with backend plugins.
  • Backward compatibility issues when a new
    discriminator value (a new backend) is
    introduced.
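
The last three points can be sketched with a small discriminated union. This assumes Pydantic is installed; the model names, the `is_reserved` field, and the `newcloud` backend are hypothetical and do not reflect the actual dstack models.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class GCPBackendData(BaseModel):
    type: Literal["gcp"] = "gcp"
    is_reserved: bool = False  # hypothetical field


class AWSBackendData(BaseModel):
    type: Literal["aws"] = "aws"


class InstanceOffer(BaseModel):
    # Simplified stand-in for the real model.
    backend: str  # the backend is named here...
    # ...and again in backend_data.type. Every allowed value must be
    # declared in this union up front.
    backend_data: Union[GCPBackendData, AWSBackendData] = Field(
        ..., discriminator="type"
    )


offer = InstanceOffer(backend="gcp", backend_data={"type": "gcp", "is_reserved": True})

# A peer that already knows about a newer backend sends a discriminator
# value this (older) model has never heard of, and validation rejects it.
payload = {"backend": "newcloud", "backend_data": {"type": "newcloud"}}
try:
    InstanceOffer(**payload)
    rejected = False
except ValidationError:
    rejected = True
```

In this sketch the older side cannot even load the offer in order to ignore it, which is why adding a backend becomes a compatibility concern.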

Pydantic models + custom deserialization logic

(e.g., custom InstanceOffer deserializer that
determines the InstanceOffer.backend_data model
based on InstanceOffer.backend)

Cons:

  • Extra difficulties maintaining backward
    compatibility as the models are passed from
    gpuhunt to dstack server, from server to client,
    and from client to server, all with validation.
  • The need to duplicate deserialization logic in
    all models that hold the backend-specific field,
    at least in RawCatalogItem and InstanceOffer.
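
A rough sketch of what this option might look like, assuming Pydantic is installed. The registry, the `from_dict` helpers, and all field names are hypothetical; the real `RawCatalogItem` and `InstanceOffer` are defined differently.

```python
from typing import Any, Dict, Optional, Type

from pydantic import BaseModel


class GCPBackendData(BaseModel):
    is_reserved: bool = False  # hypothetical field


# Hypothetical registry: backend name -> model for its backend-specific data.
BACKEND_DATA_MODELS: Dict[str, Type[BaseModel]] = {"gcp": GCPBackendData}


def parse_backend_data(backend: str, raw: Optional[Dict[str, Any]]) -> Any:
    model = BACKEND_DATA_MODELS.get(backend)
    if raw is None or model is None:
        return raw  # unknown backend: leave the data untouched
    return model(**raw)


class RawCatalogItem(BaseModel):
    provider: str
    provider_data: Any = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "RawCatalogItem":
        parsed = parse_backend_data(data["provider"], data.get("provider_data"))
        return cls(**{**data, "provider_data": parsed})


class InstanceOffer(BaseModel):
    backend: str
    backend_data: Any = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "InstanceOffer":
        # Same dispatch as above, repeated for a second model.
        parsed = parse_backend_data(data["backend"], data.get("backend_data"))
        return cls(**{**data, "backend_data": parsed})
```

Even with the dispatch factored into one helper, every model that carries the field needs its own hook to call it, and every hop between gpuhunt, server, and client has to remember to use that hook instead of the default constructor.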

@jvstme jvstme merged commit 0bc8cc2 into main Oct 21, 2025
6 checks passed
@jvstme jvstme deleted the provider_data_typed_dict branch October 21, 2025 06:34