I am trying to achieve GPU process isolation and prioritization on the Jetson Nano and AGX platforms. I have tested several approaches. First, I attempted to take ownership of the CUDA render device to restrict GPU execution to a single user.
Later, I explored NVIDIA MPS and configured it on my system. However, the behaviour is similar to the first approach: once the MPS server is started by a user, only that user's GPU processes are allowed to run. I also tried using MPS priorities to prioritize workloads, but they do not seem to work as expected.
Do you know if there is any reliable way to prioritize processes and achieve isolation on the integrated GPU of these platforms?
Yes, I have both devices and I am trying to do the same thing on each: prioritise and isolate GPU work. The idea is to obtain stable execution times to see how these platforms fit into a real-time environment.
This is usually controlled by the GPU scheduler.
Are you able to share a simple sample and steps to reproduce your issue?
(Launch the task from other users fails?)
// kernel.cu
#include <cstdio>
#include <cuda_runtime.h>

__global__ void heavyKernel(float *a)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float x = a[idx];
    // Heavy loop to saturate GPU
    for (int i = 0; i < 500000; ++i) {
        x = x * 1.000001f + 0.000001f;
    }
    a[idx] = x;
}

int main()
{
    const int N = 1 << 20; // 1M elements
    size_t size = N * sizeof(float);
    float *d_a;
    cudaMalloc(&d_a, size);
    dim3 block(256);
    dim3 grid((N + block.x - 1) / block.x);
    printf("Launching heavy kernel...\n");
    // launch kernel many times to keep GPU constantly busy
    for (int i = 0; i < 50; i++) {
        heavyKernel<<<grid, block>>>(d_a);
    }
    cudaDeviceSynchronize();
    cudaFree(d_a);
    printf("Finished.\n");
    return 0;
}
nvcc kernel.cu -o heavy_kernel
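For reference, one may also want to pass an explicit `-arch` flag for the target board. A sketch that only prints the build command, with the model-to-arch mappings being my assumption (sm_53 for Nano/TX1, sm_72 for AGX Xavier, sm_87 for AGX Orin):

```shell
# Dry run: pick an nvcc -arch flag per Jetson model and print the
# build command. The sm_XX mappings are assumed, not taken from the post.
MODEL=$(tr -d '\0' 2>/dev/null < /proc/device-tree/model || echo unknown)
case "$MODEL" in
    *Orin*)        ARCH="-arch=sm_87" ;;
    *Xavier*)      ARCH="-arch=sm_72" ;;
    *Nano*|*TX1*)  ARCH="-arch=sm_53" ;;
    *)             ARCH="" ;;   # unknown board: let nvcc use its default
esac
BUILD_CMD="nvcc $ARCH kernel.cu -o heavy_kernel"
echo "$BUILD_CMD"
```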
Then, as user1 I am running:
#!/bin/bash
###############################################
# Assumes MPS server is already running
# and CUDA_MPS_PIPE_DIRECTORY + LOG_DIRECTORY
# are already exported in the environment.
###############################################
APP="./heavy_kernel"   # CUDA app to run
LOW_PERC=10
HIGH_PERC=80

# Check if MPS pipe exists
if [ ! -d "$CUDA_MPS_PIPE_DIRECTORY" ]; then
    echo "[ERROR] MPS pipe directory not found: $CUDA_MPS_PIPE_DIRECTORY"
    echo "Please start the MPS server first."
    exit 1
fi

echo "[INFO] Starting test (without creating MPS server)..."
echo "Using:"
echo "  - Pipe: $CUDA_MPS_PIPE_DIRECTORY"
echo "  - Log:  $CUDA_MPS_LOG_DIRECTORY"
echo

# End times are written from inside each subshell, so that waiting on one
# process cannot inflate the measured duration of the other.
END_LOW_FILE=$(mktemp)
END_HIGH_FILE=$(mktemp)

# -------------------------------------------
# RUN LOW PRIORITY PROCESS
# -------------------------------------------
echo "[INFO] Launching LOW priority ($LOW_PERC%) process..."
START_LOW=$(date +%s.%N)
( CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=$LOW_PERC "$APP"; date +%s.%N > "$END_LOW_FILE" ) &
PID_LOW=$!

# -------------------------------------------
# RUN HIGH PRIORITY PROCESS
# -------------------------------------------
sleep 0.2
echo "[INFO] Launching HIGH priority ($HIGH_PERC%) process..."
START_HIGH=$(date +%s.%N)
( CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=$HIGH_PERC "$APP"; date +%s.%N > "$END_HIGH_FILE" ) &
PID_HIGH=$!

echo
echo "Waiting for processes:"
echo "  - LOW  ($LOW_PERC%)  PID=$PID_LOW"
echo "  - HIGH ($HIGH_PERC%) PID=$PID_HIGH"
echo

# -------------------------------------------
# WAIT FOR COMPLETION
# -------------------------------------------
wait $PID_LOW
wait $PID_HIGH
END_LOW=$(cat "$END_LOW_FILE")
END_HIGH=$(cat "$END_HIGH_FILE")
rm -f "$END_LOW_FILE" "$END_HIGH_FILE"

# -------------------------------------------
# CALCULATE DURATIONS
# -------------------------------------------
DUR_LOW=$(echo "$END_LOW - $START_LOW" | bc)
DUR_HIGH=$(echo "$END_HIGH - $START_HIGH" | bc)

echo
echo "==================== RESULTS ===================="
echo "Low priority  ($LOW_PERC%)  finished in: $DUR_LOW seconds"
echo "High priority ($HIGH_PERC%) finished in: $DUR_HIGH seconds"
echo
if (( $(echo "$DUR_HIGH < $DUR_LOW" | bc -l) )); then
    echo "➡️ HIGH priority process finished first (expected)"
else
    echo "➡️ LOW priority process finished first (unexpected)"
fi
echo "================================================="
and the priority is not taking effect. Then, regarding "isolation": when I run the same script and kernel as user2, MPS blocks that call until user1's execution has finished. So the question is whether the Nano and Orin can be configured to honour CUDA_MPS_ACTIVE_THREAD_PERCENTAGE.
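For completeness, on desktop GPUs the same percentage can also be set server-wide through the MPS control daemon rather than per-process environment variables; whether JetPack's MPS honours these commands on the Nano/Orin is part of what I am asking. A dry run that only prints the commands:

```shell
# Dry run: print the MPS control commands instead of sending them,
# since no MPS server may be running here. To apply them for real:
#   printf '%s\n' "$MPS_CMDS" | nvidia-cuda-mps-control
MPS_CMDS='set_default_active_thread_percentage 10
get_default_active_thread_percentage'
printf '%s\n' "$MPS_CMDS"
```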
Then, the second approach:
1. chmod 600 /dev/dri/renderD128
2. run command
3. restore permissions
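The three steps above, sketched as one wrapper. The function name is my own, and the demo runs against a scratch file, since chmod on /dev/dri/renderD128 itself needs root:

```shell
# Sketch of the chmod approach: lock the render node to its owner,
# run one command, then restore the original mode.
run_isolated() {
    dev="$1"; shift
    orig_mode=$(stat -c '%a' "$dev") || return 1
    chmod 600 "$dev"            # step 1: only the owner may open the node
    "$@"; rc=$?                 # step 2: run the GPU command
    chmod "$orig_mode" "$dev"   # step 3: restore permissions
    return $rc
}

# Demo on a scratch file standing in for /dev/dri/renderD128:
DEV=$(mktemp)
chmod 644 "$DEV"
run_isolated "$DEV" true
MODE_AFTER=$(stat -c '%a' "$DEV")
rm -f "$DEV"
echo "mode restored to: $MODE_AFTER"
```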
Then, in my ONNX inference, if another user tries to use CUDA while the render device is locked down by permissions, the session falls back to running on the CPU.
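To make that fallback explicit rather than silent, one could probe the render node up front and pick the execution provider accordingly. A sketch: the provider names are ONNX Runtime's, the probe logic is my own, and the demo uses scratch paths instead of the real device:

```shell
# Sketch: choose an ONNX Runtime execution provider based on whether
# the current user can actually open the render node.
choose_provider() {
    dev="$1"
    if [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "CUDAExecutionProvider"
    else
        echo "CPUExecutionProvider"
    fi
}

# Demo with a scratch file standing in for /dev/dri/renderD128:
OURS=$(mktemp); chmod 600 "$OURS"        # owned by us -> accessible
P_OURS=$(choose_provider "$OURS")
P_GONE=$(choose_provider /nonexistent/renderD128)
rm -f "$OURS"
echo "$P_OURS / $P_GONE"
```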
Can you guide me on how to take advantage of the GPU scheduler for this?