Hello,
I am working on an OptiX ray tracing application. After experiencing no problems whatsoever on Linux, I unexpectedly ran into an issue when testing the application on Windows that I have not been able to resolve.
I tested with both Windows 10 and Windows 11. The application uses CUDA 12.0 and OptiX 7.2.0.
The NVIDIA driver version did not seem to matter (the one I tested with the most was 566.36).
The issue I encounter is the following:
========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuStreamSynchronize.
========= Saved host backtrace up to driver entry point at error
========= Host Frame: optixQueryFunctionTable [0x7ff94ba011bd] in nvoptix.dll
========= Host Frame: optixLaunch in optix_stubs.h:534 [0x168cb] in OptiX_Error.exe
========= Host Frame: main in main.cu:40 [0x920b] in OptiX_Error.exe
The full Compute Sanitizer console log is attached.
I suspect the error occurs during the call to the default intersection shader.
I attached an example project that reproduces the error.
I tried to keep it minimal, but since it requires an OptiX pipeline to be set up, it got a bit verbose for my liking.
I am at the end of my rope here.
Am I overlooking an error in my implementation?
Is this a known issue?
Should I file a bug report?
EDIT:
It seems that adding a dummy hit group (record) to the shader binding table fixes the issue.
Why is such a record required if it is never used at all?
In particular, why is it only required when running on Windows?
Hi @jonas.schwab, I've compiled and run your code with OptiX 8.0.0. Same bug. Not calling optixTrace() in raygen makes the problem disappear. Probably something is wrong with the SBT. I will investigate more later, unless you find the fix first.
As I wrote in my edit, I noticed that the absence of a hit-group record seems to play a big role.
If a hit-group record is always required, I may have overlooked that requirement.
But I cannot understand how that would only cause issues on Windows, or why the error is “unspecified”.
Solved by initialising the SBT differently; I don't crash anymore.
I now pass a reference to an SBT into ShaderBindingTable's constructor. The actual SBT is a member of RayTracingPipeline, so this copy is removed:
shaderBindingTable = ShaderBindingTable(….);
and replaced by
ShaderBindingTable(…, shaderBindingTable); // sbt passed by ref.
The SBT's address is handed out wherever it is needed by a new method, RayTracingPipeline::getSBTPointer(). Probably something was going out of scope too early in your implementation. My bet is that DeviceBuffer's dtor is the culprit.
Oh, you are right: shaderBindingTable = ShaderBindingTable(…); performs a copy, which is problematic.
I should have deleted the copy assignment and constructor of the ShaderBindingTable class.
Could you send your exact changes?
I am pretty sure I fixed all stale reference issues and I still cannot get it to work.
The DeviceBuffer class binds the lifetime of the wrapped CUDA device memory to the host-side object.
Unfortunately, this means there is also an issue with your changes:
The destruction of the unnamed ShaderBindingTable temporary also immediately frees the buffers for the SBT records.
As a test, I commented out the free in the destructor.
~DeviceBuffer()
{
// free();
}
With this I ended up with the same “unspecified launch failure” as before.
But you definitely found an issue in my initial sample implementation, which should be fixed by deleting the copy constructor and copy assignment of ShaderBindingTable, as mentioned above.
I am now quite certain that the issue comes down to this:
On Windows, the sbtOffset of any intersected OptixInstance needs to point to a valid SBT record.
I had hoped this would not be required, since with OPTIX_RAY_FLAG_DISABLE_ANYHIT | OPTIX_RAY_FLAG_DISABLE_CLOSESTHIT and built-in primitives, none of the programs in this SBT record should ever be executed.
This requirement slightly complicates the SBT setup and also limits the ability to reuse instance acceleration structures.
Hey,
I had not thought of that because I still mainly use OptiX 7.2.0.
But it seemed like a good idea to check.
Unfortunately, it does not make a difference.
As expected, the pipeline statistics show that there is no trace call and one fewer instruction overall, but the error occurs nonetheless.