Updating an OpenGL texture on a multi-card system

Hello,

Our system has two graphics cards: a Quadro RTX 4000 and an NVIDIA RTX A4000.

We need to run two different applications that use CUDA. The first renders with OpenGL and uses CUDA to update an OpenGL texture from pinned memory (allocated with cudaHostAlloc) and to compress the same surface to JPEG with nvjpeg. The second uses CUDA more intensively.

For this reason, we want to use the Quadro for the OpenGL application and the RTX A4000 for the other one.

If the program enumerates the CUDA devices with cudaGetDeviceCount and cudaGetDeviceProperties, it gets:

Count: 2

Device 0: NVIDIA RTX A4000
Device 1: Quadro RTX 4000

However, when we enumerate the devices with cudaGLGetDevices, it returns only one card, the RTX A4000, even though the monitor is connected to the Quadro.

In this configuration the program works fine, but it is using the “wrong” card.

We tried changing the “OpenGL rendering GPU” setting in the NVIDIA Control Panel. With that change, cudaGLGetDevices returns only the Quadro, but the texture is black.

The texture is created with:

wglMakeCurrent(hdc, hRC);
cudaSetDevice(cudaDeviceIndex);
glEnable(GL_TEXTURE_2D);
glGenTextures(1, &dest.textures);          
glBindTexture(GL_TEXTURE_2D, dest.textures);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB8, dest.DxTexture, dest.DyTexture, 0, GL_BGR, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
CheckGLErrors(); // calls glGetError();
CUDACK(cudaGraphicsGLRegisterImage(&dest.registeredTex, dest.textures, GL_TEXTURE_2D, cudaGraphicsRegisterFlagsSurfaceLoadStore));
CUDACK(cudaGraphicsMapResources(1, &dest.registeredTex, 0));
CUDACK(cudaGraphicsSubResourceGetMappedArray(&dest.mappedArray, dest.registeredTex, 0, 0));
cudaResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc));
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = dest.mappedArray;
CUDACK(cudaCreateSurfaceObject(&dest.mappedSurface, &resDesc));
CUDACK(cudaMalloc(&dest.deviceArea, dest.DxImage * dest.DyImage * 3));

Then, when we need to update the texture, we call:

CUDACK(cudaMemcpyAsync(dest.deviceArea, image_vec, DxImage * Cur_DyImage * 3, cudaMemcpyHostToDevice, stream));
copy24bitInTexture(dest.deviceArea, dest.mappedSurface, dest.DxImage, dest.DyImage_Cur, stream);

copy24bitInTexture is a simple kernel that converts 24 bit to 32 bit.
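
To make the conversion concrete, here is a minimal CPU sketch of the per-pixel expansion. It assumes tightly packed 3-byte pixels, keeps the byte order unchanged, and fills the fourth byte with 255; the function name expand24To32 is hypothetical and exists only for this illustration. The real kernel writes into dest.mappedSurface rather than into a host buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference for the 24-bit to 32-bit expansion.
// Assumptions: source pixels are tightly packed (3 bytes each, no row
// padding) and the added fourth byte is an opaque alpha of 255.
std::vector<std::uint8_t> expand24To32(const std::uint8_t* src,
                                       std::size_t width,
                                       std::size_t height)
{
    std::vector<std::uint8_t> dst(width * height * 4);
    for (std::size_t i = 0; i < width * height; ++i) {
        dst[i * 4 + 0] = src[i * 3 + 0]; // first channel, copied as-is
        dst[i * 4 + 1] = src[i * 3 + 1]; // second channel
        dst[i * 4 + 2] = src[i * 3 + 2]; // third channel
        dst[i * 4 + 3] = 255;            // assumed alpha value
    }
    return dst;
}
```

In the kernel, each thread handles one pixel index instead of the loop.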

To repeat: the code works fine when it runs on the RTX A4000.

In a system without the RTX A4000, that is, with only a Quadro RTX 4000, the program also works fine.

Why does the driver want to use the card that is not connected to the monitor for rendering? How can we use the Quadro in the two-card system?

Thanks,

Regards,

Antonino Perricone