I had evaluation access to VIA microservices. After approx. 6 months my manager asked me to show him how it works, but it seems I don't have access to it any longer.
2025-09-05 15:30:49,124 ERROR LLM call failed:
Error code: 404 - {'status': 404, 'title': 'Not Found', 'detail': "Function 'a88f115a-4a47-4381-ad62-ca25dc33dc1b': Not found for account 'o6RsAnxmkTrsclJQEMOhB3kBwFcbvmetWEkLOS2ohMw'"}
2025-09-05 15:30:49,124 PERF Summarization/BatchSummarization time = 450.17 ms
2025-09-05 15:30:49,126 ERROR Summarize failed:
Server didn't respond
2025-09-05 15:30:49,127 INFO Stopping VIA pipeline
2025-09-05 15:30:49,128 ERROR Server didn't respond
2025-09-05 15:30:49,128 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 53
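The 404 in the log comes from the cloud LLM call, which points to a key or account problem rather than a local failure. Before restarting the pipeline, a quick sanity check of the exported key can rule out the simplest causes. This is a minimal sketch: it assumes keys generated on build.nvidia.com start with the `nvapi-` prefix, and the helper name `check_nvidia_key` is made up for illustration.

```shell
# Sanity-check the NVIDIA_API_KEY before launching the pipeline.
# Assumption: valid personal keys start with the "nvapi-" prefix.
check_nvidia_key() {
  key="$1"
  if [ -z "$key" ]; then
    echo "missing"            # variable not exported at all
  elif [ "${key#nvapi-}" = "$key" ]; then
    echo "bad-prefix"         # set, but doesn't look like an nvapi key
  else
    echo "ok"
  fi
}

check_nvidia_key "$NVIDIA_API_KEY"
```

If this prints `missing` or `bad-prefix`, fix the environment variable first; if it prints `ok` and the 404 persists, the key itself may be expired or lack entitlement for that function.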
Is there a chance to present this to our management again?
How long ago did you apply for your NVIDIA_API_KEY? It's possible that this NVIDIA_API_KEY has expired.
I notice you are still using version 2.0; please try the latest v2.3.1. If you can pull the image using your nvapi key, the issue is usually not caused by permissions.
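For reference, pulling from the NGC registry uses the key as a registry password with the fixed username `$oauthtoken`. A minimal sketch; the image path and tag below are assumptions based on this thread, so substitute the exact one from the linked instructions:

```shell
# Log in to the NGC registry. The username is literally "$oauthtoken"
# (single-quoted so the shell doesn't expand it); the password is your nvapi-... key.
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Image path and tag are assumed for illustration; use the one from the linked docs.
docker pull nvcr.io/nvidia/blueprint/vss-engine:2.3.1
```

If the pull succeeds, the key and account permissions are fine and the remaining problem is in the runtime configuration.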
Hi, I tried generating a new NVIDIA API KEY and NGC_API_KEY. Even with the new keys, access is denied:
2025-09-10 12:47:32,366 INFO Initializing VIA pipeline
2025-09-10 12:47:32,367 INFO Using model cached at /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
2025-09-10 12:47:37,844 INFO Loading engine from /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/trt-engines/fp16/0-gpu/visual_engines/visual_encoder.engine
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
2025-09-10 12:47:38,530 INFO Creating session from engine /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/trt-engines/fp16/0-gpu/visual_engines/visual_encoder.engine
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
2025-09-10 12:47:44,192 ERROR LLM call failed:
Error code: 404 - {'status': 404, 'title': 'Not Found', 'detail': "Function 'a88f115a-4a47-4381-ad62-ca25dc33dc1b': Not found for account 'o6RsAnxmkTrsclJQEMOhB3kBwFcbvmetWEkLOS2ohMw'"}
2025-09-10 12:47:44,193 PERF Summarization/BatchSummarization time = 419.13 ms
2025-09-10 12:47:44,193 ERROR Summarize failed: Server didn't respond
2025-09-10 12:47:44,193 INFO Stopping VIA pipeline
2025-09-10 12:47:44,195 ERROR Server didn't respond
2025-09-10 12:47:44,195 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 53
I'd rather stick with VITA 2.0 for now. It seems I would need new hardware to follow the procedure at the link you provided; for now I only have an RTX 6000 Ada Lovelace. Would you help me solve the access issue? I noticed that if I create an NVIDIA API KEY, an NGC_API_KEY is created automatically as well, and it's exactly the same key.
VITA 2.0 is a beta version and is no longer supported. Please use the latest version, 2.3.1. It requires a graphics card with at least 80 GB of video memory; I'm afraid the RTX 6000 may not support it.
Hi, sorry, I had a few days off but I'm back now. Thanks for the info! So, if I generate a new NVIDIA API KEY and NGC_API_KEY and we invest in an RTX 6000 Blackwell generation with 96 GB, is it going to work, with no access issues? Also, is 96 GB of GPU memory enough for fine-tuning version 2.3.1 of the VITA model? What if we need to carry out some machine learning on this model? According to your docs, NVILA-15B + Llama 3.1-8B should run fully locally on a single GPU with 80 GB of GPU memory.
Hi again, sorry, but I really need to make sure VSS is going to work for us if we decide to invest in an RTX 6000 Blackwell generation. Is everything OK with my account? I checked the subscriptions in the NGC settings, and it says my subscription is "NVIDIA Developer Program". Is that OK? I was also able to pull the image for llama-3.1-70b via your link, which makes me hope that VSS is going to work for us, at least in the configuration with NVILA-15B + Llama 3.1-8B models on a single GPU. Would you confirm that it is going to work, please, so that I can convince my manager to invest?
We've tested llama 8b + nvila-15b on an A100 (80 GB) and found it to work properly. However, vss-2.3.0 depends on DeepStream 7.1, which doesn't run on Blackwell for now.
These issues have been resolved in the latest version of VSS, which will be released soon.
If you have concerns, consider renting an RTX 6000 Blackwell from GCP, AWS, or Alibaba Cloud for testing.
llama-3.1-70b requires at least two H100s (2x80 GB).
If the image in the link can be pulled correctly, the permissions are correct.
Hi, thank you. Can I use Microsoft Azure? That is, shall I order a virtual machine with an RTX 6000 Blackwell SE GPU (does it also support the Workstation Edition?), then install Ubuntu 22.04 LTS, the GPU driver, and Docker, and go ahead with the link you posted for "Fully Local Deployment: Single GPU"? In that case, I'm still not sure how I provide the test video stream. Does the test GUI still require an RTSP camera?
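For what it's worth, the host-side setup described above is roughly the standard one. A minimal sketch for Ubuntu 22.04 follows; the driver version, CUDA image tag, and the mediamtx/ffmpeg test-stream idea are assumptions, so follow the official driver and NVIDIA Container Toolkit guides for current versions:

```shell
# 1. NVIDIA driver (version assumed; pick the one recommended for your GPU)
sudo apt-get update
sudo apt-get install -y nvidia-driver-570

# 2. Docker Engine via the official convenience script
curl -fsSL https://get.docker.com | sudo sh

# 3. NVIDIA Container Toolkit so containers can access the GPU
#    (apt repository setup abbreviated; see NVIDIA's install guide)
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# 4. Verify the GPU is visible inside a container (image tag assumed)
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# 5. (Optional) For a test video stream without a physical camera, one option
#    is an RTSP server such as mediamtx with ffmpeg pushing a looped file:
# ffmpeg -re -stream_loop -1 -i sample.mp4 -c copy -f rtsp rtsp://localhost:8554/test
```

This is a sketch under those assumptions, not the official procedure; whether the VSS test GUI strictly requires an RTSP source is a question the docs or the support team should confirm.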
Hi again, can we use an H100 (80 GB) cloud instance for "Fully Local Deployment: Single GPU"? The docs say "The example configuration assumes a 1xH100 (80GB) deployment," but it was only tested on 1xB200, 1xH200, and 1xA100.
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.