Hi!
For those who like to run $THINGS in containers: I tried to find a way to build a Docker image for llama.cpp, as there are currently no container images for arm64 at all - only for amd64. [1]
[1] see the open issue for details: [Tracker] Docker build fails on CI for arm64 · Issue #11888 · ggml-org/llama.cpp · GitHub
So I tried to find out what the normal build process looks like, why it fails for arm64, and how to get it running on our GB10s.
The standard Dockerfiles are located in the .devops folder of the official repo. There is also one for CUDA, but it fails on the GB10 out of the box. The main reason is a wrong LD_LIBRARY_PATH, which leads to this error while linking:
#14 92.07 [ 62%] Building CXX object common/CMakeFiles/common.dir/peg-parser.cpp.o
#14 92.23 [ 62%] Building CXX object common/CMakeFiles/common.dir/regex-partial.cpp.o
#14 92.33 [ 62%] Building CXX object common/CMakeFiles/common.dir/sampling.cpp.o
#14 92.38 [ 62%] Linking CXX executable ../../bin/llama-simple
#14 92.44 /usr/bin/ld: warning: libcuda.so.1, needed by ../../bin/libggml-cuda.so.0.9.4, not found (try using -rpath or -rpath-link)
#14 92.45 [ 62%] Building CXX object common/CMakeFiles/common.dir/speculative.cpp.o
In the devel container used by llama.cpp (nvidia/cuda:13.0.2-devel-ubuntu24.04), the LD_LIBRARY_PATH points to:
root@3e1024dcf4f9:$ env|grep LIBRARY
LIBRARY_PATH=/usr/local/cuda/lib64/stubs
LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
The needed library, however, is found in /usr/local/cuda-13/compat - so you need to adjust the ENV for that container.
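You can verify that with a quick look into the devel image (just a hypothetical check, using the image tag from above):
docker run --rm nvidia/cuda:13.0.2-devel-ubuntu24.04 ls -l /usr/local/cuda-13/compat/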
So just add
ENV LD_LIBRARY_PATH=/usr/local/cuda-13/compat
to the build stage. I also pinned the CUDA architecture for CMake: if it is not specified, CMake tries to build for all visible architectures (if I understood the docs correctly), but the normal docker build process does not have access to the GPU while building. You could change that with BuildKit by defining a builder with GPU support (see Container Device Interface (CDI) | Docker Docs), which is much too complicated here, but may be useful in other projects.
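For orientation, the relevant part of the build stage then looks roughly like this - a sketch only, not the full Dockerfile; the value 121 assumes the GB10's compute capability of 12.1, and the exact cmake options in my gist may differ:
FROM nvidia/cuda:13.0.2-devel-ubuntu24.04 AS build

# point the linker at the compat directory that actually contains libcuda.so.1
ENV LD_LIBRARY_PATH=/usr/local/cuda-13/compat

RUN apt-get update && apt-get install -y build-essential cmake git

WORKDIR /app
COPY . .

# pin the CUDA architecture (assumed 12.1 for the GB10) so cmake does not try
# to detect GPUs, which are not visible during a normal docker build
RUN cmake -B build -DGGML_NATIVE=OFF -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="121" && \
    cmake --build build --config Release -j$(nproc)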
The modified Dockerfile can be found here: llama.cpp Dockerfile for DGX Spark / GB10 · GitHub
So here are all the steps to build a server container image:
mkdir src
cd src
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp/.devops
wget https://gist.githubusercontent.com/stelterlab/33885c600c102792acb1638ca7d2d7e9/raw/ad4e1edc488642172afa61a7ac9d29bf146c4a36/spark.Dockerfile
cd ..
docker build -f .devops/spark.Dockerfile --target server -t llama.cpp:server-spark .
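Once the image is built, starting the server works like with the regular llama.cpp server images - roughly like this (the model path and port are just examples, and I assume the NVIDIA Container Toolkit is set up so that --gpus all works):
docker run --rm --gpus all -p 8080:8080 -v $HOME/models:/models llama.cpp:server-spark -m /models/your-model.gguf --host 0.0.0.0 --port 8080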
Hope that saves others some time when trying to build $THINGS that are built in a similar way.
Maybe I can push that upstream, so the llama.cpp team can integrate it into their build process.
Feedback welcome.


