To make sure the local GPUs are in good shape, I ran the parallel-processing script below, and it works as expected. So the fact that running the Docker container with --gpus all still shows the inactive gpu:2 described above looks like a NIM bug to me.
import torch
from concurrent.futures import ThreadPoolExecutor
from torchvision.models import efficientnet_v2_l

def process_sample_on_gpu(gpu_id, sample):
    # Pin this thread to one GPU, run a forward pass, and report the output shape.
    torch.cuda.set_device(gpu_id)
    model = get_heavy_model()
    model = model.to(f"cuda:{gpu_id}")
    model.eval()
    sample = sample.to(f"cuda:{gpu_id}")
    with torch.no_grad():
        output = model(sample)
    print(f"Processed sample on GPU {gpu_id} with output shape: {output.shape}")
    return output

def get_heavy_model():
    # Large model with randomly initialized weights; pretrained weights are not needed for this check.
    model = efficientnet_v2_l(weights=None)
    return model

def generate_large_sample(batch_size=20, channels=3, height=1024 * 2, width=1024 * 2):
    # Deliberately large input batch so each GPU does non-trivial work.
    return torch.randn(batch_size, channels, height, width)

samples = [generate_large_sample() for _ in range(3)]
gpu_ids = [0, 1, 2]

# One thread per GPU; each thread processes its own sample on its own device.
with ThreadPoolExecutor(max_workers=len(gpu_ids)) as executor:
    futures = [
        executor.submit(process_sample_on_gpu, gpu_id, sample)
        for gpu_id, sample in zip(gpu_ids, samples)
    ]
    results = [future.result() for future in futures]

print("All samples processed.")