Let's make Raven run on AMD GPUs, too.
Raven does not directly depend on CUDA, but only on PyTorch and on various AI libraries in the Python ecosystem.
The thing is, I only have NVIDIA machines. I know that on the AMD side, there exist things called ROCm and ZLUDA, but that is all I know about them.
So I have no idea how to enable AMD GPU compute support in the first place, and even if I did, I wouldn't have an environment to test in.
Help wanted!
If you have a machine with an AMD GPU, and would like to:
- Help me to get Raven to run on your system, OR
- Make the modifications yourself and submit a PR,
then please comment below.
## What needs to be done
The major component that needs attention is Raven-server. Once that is working on AMD, the rest should Just Work™.
- What are the relevant low-level libraries to install, to run PyTorch on AMD GPUs?
  - See `pyproject.toml`.
  - Look at the `cuda` dependency group. What is the equivalent set of libraries to make PyTorch et al. go fast on AMD?
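For the install itself, PyTorch publishes ROCm wheels on its own package index, so one plausible shape is a `rocm` dependency group mirroring the `cuda` one. The sketch below is hedged: the group name is an invention, it assumes uv-style index pinning (which this project may or may not use), and the `rocm6.2` tag is a guess — check pytorch.org's install matrix for the current one.

```toml
# Hypothetical "rocm" dependency group mirroring the existing "cuda" one.
[dependency-groups]
rocm = [
    "torch",
    "torchvision",
]

# With uv, ROCm wheels come from PyTorch's own index; the "rocm6.2"
# version tag is an assumption -- check pytorch.org for the current one.
[[tool.uv.index]]
name = "pytorch-rocm"
url = "https://download.pytorch.org/whl/rocm6.2"
explicit = true

[tool.uv.sources]
torch = { index = "pytorch-rocm" }
torchvision = { index = "pytorch-rocm" }
```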
- Are there any libraries in the `dependencies` section that are known NOT to work on AMD? Particularly:
  - `transformers`: most of the NLP models hosted by Raven-server modules. See `raven.server.config` and `raven.common.nlptools`.
  - `sentence_transformers`: the `embeddings` module.
    - NOTE: even for ChromaDB, we handle the embeddings on our side, so ChromaDB doesn't need the GPU.
  - `sentencepiece`: needed by some tokenizers.
  - `spacy`: part-of-speech tagging, as well as sentence boundary detection in various other NLP tasks.
  - `dehyphen`, which depends on Flair-NLP: the `sanitize` server module.
  - `sacremoses`: optional dependency for the en→fi neural machine translator.
  - `kokoro`: the speech synthesizer (TTS). PyTorch-based?
  - `torchvision`: `raven.common.video.postprocessor` and `raven.vendor.anime4k.anime4k`.
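Whatever the answers turn out to be, a throwaway smoke-test script would let AMD testers report results uniformly. A sketch — the import names below are my guesses for the packages listed above (import names sometimes differ from PyPI names), and running this on an AMD box is exactly the data we need:

```python
"""Quick smoke test: can each Raven-server dependency be imported?
Run on the AMD box and paste the output into this issue."""
import importlib

# Import names guessed from the PyPI package names above.
PACKAGES = [
    "transformers",
    "sentence_transformers",
    "sentencepiece",
    "spacy",
    "dehyphen",
    "sacremoses",
    "kokoro",
    "torchvision",
]

def smoke_test(packages):
    """Try importing each package; record 'ok' or the failure."""
    results = {}
    for name in packages:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except Exception as exc:  # catch hard crashes at import time, too
            results[name] = f"FAIL: {exc!r}"
    return results

if __name__ == "__main__":
    for name, status in smoke_test(PACKAGES).items():
        print(f"{name:>24}: {status}")
```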
- What else do we need to change?
  - GPU device in the configs, something other than `"cuda:0"`? See `raven.server.config`, `raven.visualizer.config`, `raven.librarian.config`, and the config validator in `raven.common.deviceinfo`.
  - Are there AMD equivalents for `CUDA_VISIBLE_DEVICES` and `nvidia-smi` (see `run-on-internal-gpu.sh`)?
  - On AMD systems, are there any known library bugs we need to add workarounds for?
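On the device strings specifically: as far as I know, ROCm builds of PyTorch expose HIP through the regular `torch.cuda` API, so `torch.cuda.is_available()` returns `True` and `"cuda:0"` keeps working unchanged — the configs may not need an AMD-specific value at all. `torch.version.hip` is set (to the HIP version string) only on ROCm builds, which gives us a way to tell the backends apart if we ever need to:

```python
def detect_gpu_backend():
    """Best-effort backend detection: "cuda", "rocm", or "cpu".

    ROCm builds of PyTorch route HIP through the regular `torch.cuda`
    API, so `"cuda:0"` device strings work on both; `torch.version.hip`
    (None on CUDA builds) distinguishes them.
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed at all
    if not torch.cuda.is_available():
        return "cpu"
    if getattr(torch.version, "hip", None):
        return "rocm"
    return "cuda"
```

On the environment-variable side, ROCm's device mask is `HIP_VISIBLE_DEVICES` (with `ROCR_VISIBLE_DEVICES` at a lower level), and `rocm-smi` is the rough counterpart of `nvidia-smi` — someone with AMD hardware should confirm both against `run-on-internal-gpu.sh`.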