The intended usage of NIM_TENSOR_PARALLEL_SIZE

mosalov · October 29, 2024, 9:39am

I would expect this env variable to override any value set in the configs when it is passed to “docker run” as “-e NIM_TENSOR_PARALLEL_SIZE=something”.

But according to the code in, for example, the “nvcr.io/nim/meta/llama3-8b-instruct:1.0.3” docker image, the function that checks the env variable (_maybe_set_tp_size_from_env) is only called if the flag “should_set_tp_size_from_env” is true (line 177 in ngc_injector.py), and that flag is set to false if a value is found in any config file (line 152 of the same file). In a healthy container, _maybe_get_tp_size() never returns None, because there is a manifest config with “tp” defined at /etc/nim/config/model_manifest.yaml, which is referenced on line 22 of ngc_download.py.

As a result, the current logic checks whether a model manifest config exists, and since it is always present, the env variable is effectively never used.
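To make the flow I am describing concrete, here is a minimal sketch of it. This is my own reconstruction from reading the code, not the actual NIM source: only the names should_set_tp_size_from_env, _maybe_set_tp_size_from_env and _maybe_get_tp_size come from the image, while resolve_tp_size, the flat manifest dict with a "tp" key, and passing the environment as a dict are assumptions made for illustration.

```python
from typing import Mapping, Optional

ENV_VAR = "NIM_TENSOR_PARALLEL_SIZE"


def _maybe_get_tp_size(manifest: Mapping[str, object]) -> Optional[int]:
    """Return the "tp" value from a manifest-like mapping, or None if absent.

    Assumption: the manifest is modelled as a flat dict; the real file at
    /etc/nim/config/model_manifest.yaml may be structured differently.
    """
    tp = manifest.get("tp")
    return int(tp) if tp is not None else None


def _maybe_set_tp_size_from_env(env: Mapping[str, str]) -> Optional[int]:
    """Return the TP size from the environment mapping, or None if unset."""
    raw = env.get(ENV_VAR)
    return int(raw) if raw else None


def resolve_tp_size(manifest: Mapping[str, object],
                    env: Mapping[str, str]) -> Optional[int]:
    """Hypothetical helper that mirrors the precedence described above."""
    tp_from_config = _maybe_get_tp_size(manifest)
    # The flag is false whenever any config already supplies a value.
    should_set_tp_size_from_env = tp_from_config is None
    if should_set_tp_size_from_env:
        return _maybe_set_tp_size_from_env(env)
    return tp_from_config


if __name__ == "__main__":
    env = {ENV_VAR: "4"}
    # Manifest defines "tp", so the env variable is ignored.
    print(resolve_tp_size({"tp": 1}, env))
    # Only without a manifest value would the env variable be consulted.
    print(resolve_tp_size({}, env))
```

With a manifest that defines “tp”, the first call returns the manifest value and the env variable is never read; the second call shows the only path on which it would matter.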

Is this intended?

neal.vaidya · October 30, 2024, 2:37am

Hi @mosalov, at the moment NIM_TENSOR_PARALLEL_SIZE is only used internally for testing. It doesn’t have any effect in normal usage.

mosalov · October 30, 2024, 12:15pm

Thank you for the clarification!
I had hoped to use this to fix another issue, which I will describe in a separate thread.
