The intended usage of NIM_TENSOR_PARALLEL_SIZE

I would expect this env variable to override any values set in configs when passed to “docker run” as “-e NIM_TENSOR_PARALLEL_SIZE=something”.

But judging from the code in, for example, the “nvcr.io/nim/meta/llama3-8b-instruct:1.0.3” Docker image, the function that checks the env variable (_maybe_set_tp_size_from_env) is only called if the flag “should_set_tp_size_from_env” is true (line 177 in ngc_injector.py), and that flag is set to false if a value is found in any config file (line 152 of the same file). In a healthy container, _maybe_get_tp_size() never returns None, because there is a manifest config with “tp” defined at /etc/nim/config/model_manifest.yaml, which is referenced on line 22 of ngc_download.py.

As a result, the current logic effectively checks whether a model manifest config exists, and since that config is always present, the env variable is never used.
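
To make the control flow concrete, here is a minimal Python sketch of the logic as I read it. It is not the actual NIM source: the function bodies, the resolve_tp_size helper, and the flat dict standing in for the manifest are my own simplifications for illustration. Only the names should_set_tp_size_from_env, _maybe_get_tp_size and _maybe_set_tp_size_from_env come from the image.

```python
from typing import Mapping, Optional

ENV_VAR = "NIM_TENSOR_PARALLEL_SIZE"


def _maybe_get_tp_size(manifest: Mapping[str, object]) -> Optional[int]:
    """Return the "tp" value from a manifest-like mapping, or None if absent.

    Assumption: the real manifest is a YAML file with a richer structure;
    a flat mapping is used here only to keep the sketch short.
    """
    tp = manifest.get("tp")
    return int(tp) if tp is not None else None


def _maybe_set_tp_size_from_env(env: Mapping[str, str]) -> Optional[int]:
    """Return the TP size from the env mapping, or None if the variable is unset."""
    raw = env.get(ENV_VAR)
    return int(raw) if raw else None


def resolve_tp_size(manifest: Mapping[str, object], env: Mapping[str, str]) -> Optional[int]:
    """Hypothetical helper mirroring the described order of checks."""
    tp_size = _maybe_get_tp_size(manifest)
    # The flag is false whenever a config value was found ...
    should_set_tp_size_from_env = tp_size is None
    # ... so the env variable is consulted only when no config defines "tp".
    if should_set_tp_size_from_env:
        tp_size = _maybe_set_tp_size_from_env(env)
    return tp_size


if __name__ == "__main__":
    env = {ENV_VAR: "4"}
    # Manifest defines "tp": the env variable is ignored.
    print(resolve_tp_size({"tp": 1}, env))
    # No "tp" in any config: only now does the env variable take effect.
    print(resolve_tp_size({}, env))
```

With a manifest that always defines “tp”, the second branch is unreachable, which is the behaviour I am asking about.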

Is this intended?

Hi @mosalov, at the moment NIM_TENSOR_PARALLEL_SIZE is only used internally for testing. It doesn’t have any effect in normal usage.

Thank you for the clarification!
I had hoped to use this to fix another issue, which I will describe in a separate thread.