Add GPU Split Strategy for Inference similar to LM Studio #884

@deep1401

Description

A user Frank spoke with mentioned that the GPU-splitting feature is one of the things they like most about running inference in LM Studio.
We should add the same capability. This may involve work in transformerlab-inference. We could start by supporting it in fastchat_server and then propagate it to the other loader plugins.
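As a starting point, one common way to express a GPU split for inference is the per-GPU `max_memory` map accepted by Hugging Face transformers' `from_pretrained(device_map="auto", max_memory=...)`, which FastChat-style workers can pass through when loading a model. A minimal sketch of turning a user-facing split setting into such a map (the helper name and the 0.9 utilization default are illustrative, not an existing API):

```python
def gpu_split_plan(gpu_vram_gib, utilization=0.9):
    """Build a per-GPU memory cap map from GPU capacities.

    gpu_vram_gib: list of per-GPU VRAM capacities in GiB, indexed by device id.
    utilization: fraction of each GPU's VRAM the model is allowed to use,
                 leaving headroom for the KV cache and activations.

    Returns a dict like {0: "21GiB", 1: "21GiB"}, the shape expected by
    transformers' `max_memory` argument when device_map="auto".
    """
    return {i: f"{int(v * utilization)}GiB" for i, v in enumerate(gpu_vram_gib)}


# Example: an even split across two 24 GiB GPUs, capped at 90% each.
plan = gpu_split_plan([24, 24])
```

A UI slider like LM Studio's could then just adjust the per-GPU fractions before the plan is handed to the model loader.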

Reference:

[image attachment]

Metadata

Assignees

No one assigned

    Labels

    - `api`: Changes to be made to the transformerlab-api repo
    - `enhancement`: New feature or request
    - `fastchat`: Issues related to transformerlab/FastChat

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests