Feature Request: Support caching HIP compilation using clang #2044

@GZGavinZhao

Similar to NVIDIA CUDA, AMD's HIP can also be compiled with Clang, and I wonder whether the maintainers would accept a new feature to cache HIP compilations. It works basically the same way as caching CUDA with clang: you enable compiling as HIP with -x hip. Caching HIP is very beneficial because HIP compilation time grows linearly with the number of distinct GPU architectures to support, i.e. if the compiled program needs to run on 10 different GPU architectures, it basically has to be compiled 10 times, once for each architecture 😅 For a more detailed motivation, please see ROCm/ROCm#2817.
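To make the idea concrete, here is a rough sketch (not the prototype's code) of how a cache could recognise a HIP compilation and collect the target architectures from a clang command line. It assumes the architectures are passed with clang's `--offload-arch=` flag and only looks at the `-x hip` / `-xhip` spellings; the function name `parse_hip_invocation` is made up for illustration.

```python
from typing import List, Tuple


def parse_hip_invocation(argv: List[str]) -> Tuple[bool, List[str]]:
    """Return (is_hip, offload_archs) for a clang-style command line.

    Simplified on purpose: it does not track which input files a given
    -x applies to, and it does not detect HIP from file extensions.
    """
    is_hip = False
    archs: List[str] = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "-x" and i + 1 < len(argv):
            # Separate form: -x hip
            if argv[i + 1] == "hip":
                is_hip = True
            i += 1  # skip the language value
        elif arg == "-xhip":
            # Joined form: -xhip
            is_hip = True
        elif arg.startswith("--offload-arch="):
            value = arg.split("=", 1)[1]
            # Keep first-seen order and drop duplicates.
            if value and value not in archs:
                archs.append(value)
        i += 1
    return is_hip, archs


if __name__ == "__main__":
    cmd = ["clang++", "-x", "hip", "--offload-arch=gfx906",
           "--offload-arch=gfx1030", "-c", "kernel.cpp", "-o", "kernel.o"]
    print(parse_hip_invocation(cmd))
```

The list of architectures would have to be part of the cache key, since each one produces different device code.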

I have already implemented a prototype of this feature and I'm happy to continue maintaining it. The prototype has been working really efficiently for caching HIP compilations and has saved me massive amounts of time when packaging AMD's ROCm software stack for Solus Linux.

The only potential trouble during implementation is that implicit dependencies on some LLVM bitcode files and HIP runtime headers may be introduced, as shown here in the official documentation, but I think this can definitely be dealt with.
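One possible way to handle these implicit dependencies, sketched here purely as an assumption and not as what the prototype does, is to hash the contents of the bitcode files and runtime headers and mix the result into the cache key. How the list of files is discovered is left open; `hash_implicit_inputs` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def hash_implicit_inputs(paths: Iterable[Path]) -> str:
    """Return a SHA-256 hex digest over the given files' paths and contents.

    Paths are sorted first so the result does not depend on the order in
    which the dependencies were discovered.
    """
    digest = hashlib.sha256()
    for path in sorted(paths):
        # Include the path so that renaming a file changes the key.
        digest.update(str(path).encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

With something like this, updating the ROCm installation (and therefore the device bitcode or headers) would change the key and invalidate stale cache entries.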
