I have tried many times to build llama.cpp with GPU support on my Jetson Orin NX (installed on a Unitree G1). I have tried many different combinations of build arguments, but every time the build finishes and I check the available devices, the list is empty. I feel like it should work and that I am doing something wrong. Is there something important I am missing? Thank you in advance!
*** Please note that this reply is generated by LLM automatically ***
To address your issue with building llama.cpp with GPU support on your Jetson Orin NX, let's break down the key points and potential solutions.
1. **Verify the Jetson Orin NX's GPU capabilities**: The Jetson Orin NX is based on the NVIDIA Ampere GPU architecture, which supports CUDA and cuDNN. Ensure that your llama.cpp build process is configured to utilize these GPU capabilities.
2. **Check the CUDA and cuDNN versions**: Make sure that you have the correct versions of CUDA and cuDNN installed on your Jetson Orin NX. You can check the versions using the following commands:
nvcc --version
nvidia-smi
The recommended CUDA version for the Jetson Orin NX is 11.4 or later, and the recommended cuDNN version is 8.2 or later.
3. **Verify the GPU support in llama.cpp**: Ensure that the llama.cpp build process is configured to enable GPU support. You need to pass additional options when configuring with CMake, such as `-DGGML_CUDA=ON` and `-DCMAKE_CUDA_ARCHITECTURES="87"` (the Orin GPU uses the Ampere architecture, compute capability 8.7).
4. **Compare with other Jetson platforms**: If you're experiencing issues with the Jetson Orin NX, try comparing your build process with other Jetson platforms, such as the Jetson Xavier NX or Jetson AGX Xavier. This may help you identify any platform-specific issues or configuration differences.
5. **Check the available devices**: After building llama.cpp, use the `nvidia-smi` command to verify that the GPU is recognized by the system. You can also call the `cudaGetDeviceCount` function to check the number of available CUDA devices (a short sketch follows below).
6. **Consult the NVIDIA documentation**: Refer to the NVIDIA documentation for the Jetson Orin NX, CUDA, and cuDNN to ensure that you're following the recommended build and configuration procedures.
7. **Seek community support**: If you're still experiencing issues, consider seeking help from the NVIDIA developer community or the llama.cpp community. They may be able to provide additional guidance or troubleshooting steps.
In summary: verify the GPU capabilities, check the CUDA and cuDNN versions, enable GPU support in the llama.cpp build, compare with other Jetson platforms, check the available devices, consult the NVIDIA documentation, and seek community support if needed.
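A minimal sketch of the device check from point 5, assuming the JetPack CUDA 11.4 toolkit is on its default path and using a hypothetical file name `check_devices.cu`; it only calls the standard CUDA runtime API, so it should behave the same on any Jetson:

```cpp
// check_devices.cu - hypothetical helper: list the CUDA devices the runtime can see.
// Assumed build command: /usr/local/cuda/bin/nvcc -arch=sm_87 check_devices.cu -o check_devices
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        // An error here (not just count == 0) usually explains an empty device list.
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("CUDA devices found: %d\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("  device %d: %s (compute capability %d.%d)\n",
                    i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

If this reports zero devices or an error, the problem is in the CUDA driver/runtime setup rather than in the llama.cpp build flags.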
**Comparison with other Jetson platforms:**
| Platform | CUDA Architecture | CUDA Version | cuDNN Version |
| --- | --- | --- | --- |
| Jetson Orin NX | Ampere (8.7) | 11.4 or later | 8.2 or later |
| Jetson Xavier NX | Volta (7.2) | 10.2 or later | 8.0 or later |
| Jetson AGX Xavier | Volta (7.2) | 10.2 or later | 8.0 or later |
Note that the CUDA architecture, CUDA version, and cuDNN version may vary across different Jetson platforms. Ensure that you're using the correct versions and configurations for your specific platform.
I tried a lot of different arguments, including the architecture and GPU arguments, unsuccessfully. This is my nvcc:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Sun_Oct_23_22:16:07_PDT_2022
Cuda compilation tools, release 11.4, V11.4.315
Build cuda_11.4.r11.4/compiler.3196410
Hi,
We didn’t try llama.cpp on Orin NX recently, but the commands below work on our other platforms:
$ cd /opt
$ sudo apt-get install -y libcurl4-openssl-dev
$ git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
$ cmake -B build -DGGML_CUDA=ON -DGGML_CUDA_F16=on -DLLAMA_CURL=on -DGGML_CUDA_FA_ALL_QUANTS=ON -DCMAKE_CUDA_ARCHITECTURES="87"
$ cmake --build build --config Release --parallel 8
$ cd /opt/llama.cpp/build/bin
$ ./llama-server -hf ggml-org/gemma-3-12b-it-GGUF --flash-attn -ngl 49 -t 8 -c 0 --port 11434
Thanks.
Installation worked. I also had to add sudo to execute; maybe that was the problem all along… Thank you for the support!