DGX Spark crashes when running tensorrt-llm

jack175 · December 5, 2025, 6:14am

When I run the tensorrt-llm Docker image and serve Llama 4, the DGX Spark crashes. I have to power cycle it to recover.

docker run --rm -it --ipc host --gpus all --ulimit memlock=-1 --ulimit stack=67108864 -p 9000:8000 -v ~/.cache/huggingface:/root/.cache/huggingface nvcr.io/nvidia/tensorrt-llm/release:1.2.0rc4

trtllm-serve "nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4" --host 0.0.0.0

aniculescu · December 5, 2025, 8:10pm

Can you provide any logs from the run and also submit an nvidia-bug-report?
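For anyone gathering these after the reboot, here is a minimal sketch of one way to do it. It assumes the host keeps a persistent systemd journal (otherwise the previous boot's kernel log is gone) and that nvidia-bug-report.sh, which ships with the NVIDIA driver, is on the PATH. The OUT directory name is made up for this example. Run it as root on the Spark itself, not inside the container.

```shell
#!/usr/bin/env bash
# Sketch only: collect crash-related logs after power cycling the machine.
# OUT is a hypothetical directory name chosen for this example.
OUT="${OUT:-spark-crash-logs}"
mkdir -p "$OUT"

# Kernel messages from the previous boot (the one that crashed).
# Assumes a persistent journal; without one this file will only hold an error.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b -1 --no-pager > "$OUT/kernel-previous-boot.log" 2>&1 || true
fi

# NVIDIA's bug report script writes nvidia-bug-report.log.gz
# into the current directory, so run it from inside OUT.
if command -v nvidia-bug-report.sh >/dev/null 2>&1; then
    (cd "$OUT" && nvidia-bug-report.sh) || true
fi

# Show what was collected.
ls -l "$OUT"
```

Since the container in the original post is started with --rm, its output is lost once it exits, so it may also help to save the trtllm-serve output to a file on a mounted volume on the next attempt.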
