Description
Problem: Despite catboost using FindCUDAToolkit.cmake, it's still cumbersome to override the target CUDA architectures when building catboost. This is because the generated cmake files hard-code the gencodes in several places:
- https://github.com/catboost/catboost/blob/2f4bc902e1cd00232b81e322d6a0f589e17c651c/catboost/libs/model/cuda/CMakeLists.linux-x86_64-cuda.txt#L25C1-L47
- catboost/catboost/cuda/ctrs/CMakeLists.linux-x86_64-cuda.txt, lines 34 to 56 in 2f4bc90:

      target_cuda_flags(catboost-cuda-ctrs
        -gencode arch=compute_35,code=sm_35
        -gencode arch=compute_50,code=compute_50
        -gencode arch=compute_52,code=sm_52
        -gencode arch=compute_60,code=compute_60
        -gencode arch=compute_61,code=compute_61
        -gencode arch=compute_61,code=sm_61
        -gencode arch=compute_70,code=sm_70
        -gencode arch=compute_70,code=compute_70
        --ptxas-options=-v
        -gencode=arch=compute_80,code=sm_80
        -gencode=arch=compute_86,code=sm_86
        -gencode=arch=compute_89,code=sm_89
        -gencode=arch=compute_90,code=sm_90
      )

- catboost/library/cpp/cuda/wrappers/CMakeLists.linux-x86_64-cuda.txt, lines 36 to 59 in 2f4bc90:

      target_cuda_flags(cpp-cuda-wrappers
        -gencode arch=compute_35,code=sm_35
        -gencode arch=compute_50,code=compute_50
        -gencode arch=compute_52,code=sm_52
        -gencode arch=compute_60,code=compute_60
        -gencode arch=compute_61,code=compute_61
        -gencode arch=compute_61,code=sm_61
        -gencode arch=compute_70,code=sm_70
        -gencode arch=compute_70,code=compute_70
        --ptxas-options=-v
        -lineinfo
        --use_fast_math
        -gencode=arch=compute_80,code=sm_80
        -gencode=arch=compute_86,code=sm_86
        -gencode=arch=compute_89,code=sm_89
        -gencode=arch=compute_90,code=sm_90
      )

- etc.
This also makes it very hard to rebuild catboost against a different CUDAToolkit release (e.g. 1.2.2 fails to build against CUDAToolkit 12 by default, because e.g. compute_35 is no longer supported there).
catboost version: 1.2.2
Proposed solution: instead of setting the gencodes using target_cuda_flags, consider relying on FindCUDAToolkit.cmake. If catboost uses an extra tool to orchestrate the cmake builds (including code generation), you could have that tool pass the -DCMAKE_CUDA_ARCHITECTURES flag to CMake, and these architectures would then be used automatically for all of the targets in the project. This way, users and package distributions can also easily override the target architectures locally.
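For illustration, here is a minimal sketch of what this could look like. It assumes the project enables CMake's native CUDA language and requires CMake 3.18 or newer (the release that introduced CMAKE_CUDA_ARCHITECTURES). The project name, target name, source file, and default architecture list are hypothetical and not taken from the catboost build files.

```cmake
# CMAKE_CUDA_ARCHITECTURES is available starting with CMake 3.18.
cmake_minimum_required(VERSION 3.18)

# Provide a default only when the user did not pass -DCMAKE_CUDA_ARCHITECTURES=...
# This has to happen before the CUDA language is enabled by project().
# The list below is an illustrative assumption, not catboost's actual default.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  # A plain number such as 70 generates both SASS (sm_70) and PTX (compute_70).
  # "NN-real" or "NN-virtual" would restrict it to one of the two.
  set(CMAKE_CUDA_ARCHITECTURES 70 80 86)
endif()

# Hypothetical project name.
project(cuda_arch_demo LANGUAGES CXX CUDA)

# Hypothetical target and source file. Its CUDA_ARCHITECTURES property is
# initialized from CMAKE_CUDA_ARCHITECTURES, so no per-target -gencode flags
# are needed.
add_library(demo_kernels STATIC kernels.cu)
```

With something like this in place, a local override would just be `cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES="80;90"`, and every CUDA target in the project would pick it up.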
Thanks!