I had evaluation access to VIA microservices. After approx. 6 months my manager asked me to show him how it works, but it seems I don't have access to it any longer.
2025-09-05 15:30:49,124 ERROR LLM call failed:
Error code: 404 - {'status': 404, 'title': 'Not Found', 'detail': "Function 'a88f115a-4a47-4381-ad62-ca25dc33dc1b': Not found for account 'o6RsAnxmkTrsclJQEMOhB3kBwFcbvmetWEkLOS2ohMw'"}
2025-09-05 15:30:49,124 PERF Summarization/BatchSummarization time = 450.17 ms
2025-09-05 15:30:49,126 ERROR Summarize failed:
Server didn't respond
2025-09-05 15:30:49,127 INFO Stopping VIA pipeline
2025-09-05 15:30:49,128 ERROR Server didn't respond
2025-09-05 15:30:49,128 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 53
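The 404 in the log comes from the cloud LLM call, which points to a key or account problem rather than a local failure. Before restarting the pipeline, a quick sanity check of the exported key can rule out the simplest causes. This is a minimal sketch: it assumes keys generated on build.nvidia.com start with the `nvapi-` prefix, and the helper name `check_nvidia_key` is made up for illustration.

```shell
# Sanity-check the NVIDIA_API_KEY before launching the pipeline.
# Assumption: valid personal keys start with the "nvapi-" prefix.
check_nvidia_key() {
  key="$1"
  if [ -z "$key" ]; then
    echo "missing"            # variable not exported at all
  elif [ "${key#nvapi-}" = "$key" ]; then
    echo "bad-prefix"         # set, but doesn't look like an nvapi key
  else
    echo "ok"
  fi
}

check_nvidia_key "$NVIDIA_API_KEY"
```

If this prints `missing` or `bad-prefix`, fix the environment variable first; if it prints `ok` and the 404 persists, the key itself may be expired or lack entitlement for that function.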
Is there a chance to present this to our management again?
How long ago did you apply for your NVIDIA_API_KEY? It's possible that this NVIDIA_API_KEY has expired.
I notice you are still using version 2.0; please try the latest v2.3.1. If you can pull the image using your nvapi key, the issue is usually not caused by permissions.
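For reference, pulling from the NGC registry uses the key as a registry password with the fixed username `$oauthtoken`. A minimal sketch; the image path and tag below are assumptions based on this thread, so substitute the exact one from the linked instructions:

```shell
# Log in to the NGC registry. The username is literally "$oauthtoken"
# (single-quoted so the shell doesn't expand it); the password is your nvapi-... key.
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Image path and tag are assumed for illustration; use the one from the linked docs.
docker pull nvcr.io/nvidia/blueprint/vss-engine:2.3.1
```

If the pull succeeds, the key and account permissions are fine and the remaining problem is in the runtime configuration.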
Hi, I tried generating a new NVIDIA API KEY and NGC_API_KEY. Even with the new keys, access is denied:
2025-09-10 12:47:32,366 INFO Initializing VIA pipeline
2025-09-10 12:47:32,367 INFO Using model cached at /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
2025-09-10 12:47:37,844 INFO Loading engine from /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/trt-engines/fp16/0-gpu/visual_engines/visual_encoder.engine
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
2025-09-10 12:47:38,530 INFO Creating session from engine /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/trt-engines/fp16/0-gpu/visual_engines/visual_encoder.engine
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
2025-09-10 12:47:44,192 ERROR LLM call failed:
Error code: 404 - {'status': 404, 'title': 'Not Found', 'detail': "Function 'a88f115a-4a47-4381-ad62-ca25dc33dc1b': Not found for account 'o6RsAnxmkTrsclJQEMOhB3kBwFcbvmetWEkLOS2ohMw'"}
2025-09-10 12:47:44,193 PERF Summarization/BatchSummarization time = 419.13 ms
2025-09-10 12:47:44,193 ERROR Summarize failed: Server didn't respond
2025-09-10 12:47:44,193 INFO Stopping VIA pipeline
2025-09-10 12:47:44,195 ERROR Server didn't respond
2025-09-10 12:47:44,195 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 53
I'd rather stick with VITA 2.0 for now. It seems I would need new hardware to follow the procedure at the link you provided; for now I only have an RTX 6000 Ada Lovelace. Would you help me solve the access issue? I noticed that if I create an NVIDIA API KEY, an NGC_API_KEY is created automatically as well, and it's exactly the same key.
VITA 2.0 is a beta version and is no longer supported. Please use the latest version, 2.3.1. It requires a graphics card with at least 80 GB of video memory; I'm afraid the RTX 6000 may not support it.
Hi, sorry, I had a few days off but I'm back now. Thanks for the info! So, if I generate a new NVIDIA API KEY and NGC_API_KEY and we invest in an RTX 6000 Blackwell generation with 96 GB, is it going to work, with no access issues? Also, is 96 GB of GPU memory enough for fine-tuning version 2.3.1 of the VITA model? What if we need to carry out some machine learning on this model? According to your docs, NVILA-15B + Llama 3.1-8B should run fully locally on a single GPU with 80 GB of GPU memory.
Hi again, sorry, but I really need to make sure VSS is going to work for us if we decide to invest in an RTX 6000 Blackwell generation. Is everything OK with my account? I checked the subscriptions in the NGC settings, and it says my subscription is "NVIDIA Developer Program". Is that OK? I was also able to pull the image for llama-3.1-70b via your link, which makes me hope that VSS is going to work for us, at least in the configuration with NVILA-15B + Llama 3.1-8B models on a single GPU. Would you confirm that it is going to work, please, so that I can convince my manager to invest?
We've tested llama 8b + nvila-15b on an A100 (80 GB) and found it to work properly. However, vss-2.3.0 depends on DeepStream 7.1, which doesn't run on Blackwell for now.
These issues have been resolved in the latest version of VSS, which will be released soon.
If you have concerns, consider renting an RTX 6000 Blackwell from GCP, AWS, or Alibaba Cloud for testing.
llama-3.1-70b requires at least two H100s (2x80 GB).
If the image in the link can be pulled correctly, the permissions are correct.
Hi, thank you. Can I use Microsoft Azure? That is, shall I order a virtual machine with an RTX 6000 Blackwell SE GPU (does it also support the Workstation Edition?), then install Ubuntu 22.04 LTS, the GPU driver, and Docker, and go ahead with the link you posted for "Fully Local Deployment: Single GPU"? In that case, I'm still not sure how I provide the test video stream. Does the test GUI still require an RTSP camera?
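For what it's worth, the host-side setup described above is roughly the standard one. A minimal sketch for Ubuntu 22.04 follows; the driver version, CUDA image tag, and the mediamtx/ffmpeg test-stream idea are assumptions, so follow the official driver and NVIDIA Container Toolkit guides for current versions:

```shell
# 1. NVIDIA driver (version assumed; pick the one recommended for your GPU)
sudo apt-get update
sudo apt-get install -y nvidia-driver-570

# 2. Docker Engine via the official convenience script
curl -fsSL https://get.docker.com | sudo sh

# 3. NVIDIA Container Toolkit so containers can access the GPU
#    (apt repository setup abbreviated; see NVIDIA's install guide)
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# 4. Verify the GPU is visible inside a container (image tag assumed)
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# 5. (Optional) For a test video stream without a physical camera, one option
#    is an RTSP server such as mediamtx with ffmpeg pushing a looped file:
# ffmpeg -re -stream_loop -1 -i sample.mp4 -c copy -f rtsp rtsp://localhost:8554/test
```

This is a sketch under those assumptions, not the official procedure; whether the VSS test GUI strictly requires an RTSP source is a question the docs or the support team should confirm.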
Hi again, can we use an H100 (80 GB) cloud instance for "Fully Local Deployment: Single GPU"? The docs say "The example configuration assumes a 1xH100 (80GB) deployment," but it was only tested on 1xB200, 1xH200, and 1xA100.
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.