@yuq
Contributor

@yuq yuq commented Dec 16, 2024

This is an OpenGL extension forking VK_EXT_mesh_shader to provide OpenGL mesh shader functionality.

Numbers in the spec haven't been allocated, so use fake numbers for now. No header/XML updates in the PR either.

This extension was requested by nvidium users, to add OpenGL mesh shader support to drivers for GPUs other than NVIDIA's:

@yuq yuq marked this pull request as draft December 16, 2024 02:58
@oddhack
Collaborator

oddhack commented Dec 16, 2024

Re discussion on the mesa issue, Khronos does not "approve" new vendor extensions, though we do try and consistency-check them and make sure they're following the extension guidelines before we include them in the extension registry and hand out enum allocations. It's true that GL spec activity is very minimal within Khronos, but vendor and EXT extension development do not have to happen inside Khronos.

So the first thing to ask is whether there is a commitment to implement this on the part of someone actually writing Mesa drivers. There's no point in publishing an extension spec in the registry if nobody has implemented it. Then, how and why does it differ from the NV extension? I see a slight signature change on one of the APIs but haven't tried to review the whole thing. Because of the close relationship between them, there should at least be a section down around the "Interactions" discussing the things that are the same, and those that had to be changed, and why.

Would it be possible to implement the NV extension as it stands today on your target GPUs, and then add a really small extension on top of that to accommodate the changed signature, rather than duplicate so much of that language?

BTW, when promoting an extension we keep the enum values unchanged so long as they are indistinguishable semantically from the point of view of the driver they are passed to. Only if there's a need to behave differently depending on which extension is being used would the enum value need to change.

@yuq
Contributor Author

yuq commented Dec 16, 2024

Re discussion on the mesa issue, Khronos does not "approve" new vendor extensions, though we do try and consistency-check them and make sure they're following the extension guidelines before we include them in the extension registry and hand out enum allocations. It's true that GL spec activity is very minimal within Khronos, but vendor and EXT extension development do not have to happen inside Khronos.

Thanks for the explanation.

So the first thing to ask is whether there is a commitment to implement this on the part of someone actually writing Mesa drivers. There's no point in publishing an extension spec in the registry if nobody has implemented it.

Yeah, I'm going to implement it in mesa if it's accepted.

Then, how and why does it differ from the NV extension? I see a slight signature change on one of the APIs but haven't tried to review the whole thing. Because of the close relationship between them, there should at least be a section down around the "Interactions" discussing the things that are the same, and those that had to be changed, and why.

The differences from the NV extension are listed in the issue Q&A:

Would it be possible to implement the NV extension as it stands today on your target GPUs, and then add a really small extension on top of that to accommodate the changed signature, rather than duplicate so much of that language?

It's not possible to stack a new extension on top of the NV one. The NV extension interface (mostly the GLSL part) is not suitable for other GPU vendors, which is why Vulkan created VK_EXT_mesh_shader. We could implement the NV extension with many ugly workarounds in the driver, but that would hurt performance:

BTW, when promoting an extension we keep the enum values unchanged so long as they are indistinguishable semantically from the point of view of the driver they are passed to. Only if there's a need to behave differently depending on which extension is being used would the enum value need to change.

The runtime API part mostly comes from VK_EXT_mesh_shader, to leverage the existing agreement reached by the different GPU vendors. I can keep the enum values that are the same as in the NV extension and assign fake values to the new ones.

@yuq yuq force-pushed the topic/mesh-shader branch from f7c5c61 to ab5160e Compare January 2, 2025 07:02
@Venemo

Venemo commented Jan 6, 2025

Hi,

So the first thing to ask is whether there is a commitment to implement this on the part of someone actually writing Mesa drivers.

Yes, there is interest in implementing it in RadeonSI as well as in Zink (which would work on top of the Vulkan EXT_mesh_shader exposed by the underlying Vulkan driver).

Then, how and why does it differ from the NV extension?

there should at least be a section down around the "Interactions" discussing the things that are the same, and those that had to be changed, and why

Would it be possible to implement the NV extension as it stands today on your target GPUs

Same as the Vulkan NV vs. EXT extensions. In a nutshell, the NV extension makes it impossible to implement mesh shaders with reasonable performance on other vendors' HW, and EXT fixes that. Furthermore, EXT is better aligned with D3D12 mesh shaders and therefore benefits developers by providing a more familiar programming model.

If you are interested in the exact details, they have been discussed in the Vulkan EXT_mesh_shader blog post and also on the spec MR here, among other places. This comment in the Mesa repo goes through the main issues with implementing the NV extension on HW that wasn't designed for it.

@zmike
Contributor

zmike commented Jan 7, 2025

I'd like to see the mesa implementation at least well underway before this is released, but I think it's a great addition to the ecosystem. Bringing cross-vendor support to GL mesh shading will enable things like nvidium to finally run on more platforms.

@yuq
Contributor Author

yuq commented Jan 8, 2025

Then can the numbers be allocated first, so that I can update headers and start implementation?

@zmike
Contributor

zmike commented Jan 8, 2025

Yeah that seems good. @oddhack do you take care of that or am I supposed to do something?

@oddhack
Collaborator

oddhack commented Jan 8, 2025

Then can the numbers be allocated first, so that I can update headers and start implementation?

@yuq how many do you need? We allocate in blocks of 16. I think I counted 62 enums in the spec as it stands, so I can give you a block of 64 if you need that many (the new bit values not included in that total since they are semantically in a different namespace).

@yuq
Contributor Author

yuq commented Jan 8, 2025

I need 22; rounded up to a multiple of 16, that is 32. I reused some enum numbers from GL_NV_mesh_shader, so only those beginning with 0xF need to be allocated.

How about the extension serial number (I used a fake value of 1024)? Does it have to be allocated at release?

oddhack added a commit that referenced this pull request Jan 8, 2025
@oddhack
Collaborator

oddhack commented Jan 8, 2025

I need 22; rounded up to a multiple of 16, that is 32. I reused some enum numbers from GL_NV_mesh_shader, so only those beginning with 0xF need to be allocated.

How about the extension serial number (I used a fake value of 1024)? Does it have to be allocated at release?

Done, see d8fdb8d (enums 0x9740-0x975F).

The extension number is assigned when we publish. It isn't actually used as anything but an ordering mechanism.
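
The allocation arithmetic in this exchange can be checked with a minimal sketch. The block size of 16, the counts 22 and 62, and the range 0x9740-0x975F come from the thread; the helper names `round_up_to_block` and `range_size` are hypothetical and exist only for this illustration.

```python
BLOCK_SIZE = 16  # enums are handed out in blocks of 16, per the thread


def round_up_to_block(count: int, block: int = BLOCK_SIZE) -> int:
    """Round count up to the next multiple of block (hypothetical helper)."""
    return -(-count // block) * block


def range_size(first: int, last: int) -> int:
    """Number of values in the inclusive range [first, last]."""
    return last - first + 1


needed = round_up_to_block(22)            # 22 new enums requested
allocated = range_size(0x9740, 0x975F)    # range quoted in the thread
print(needed, allocated)
```

Both numbers come out equal, which matches the request being satisfied by two blocks of 16.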

@yuq
Contributor Author

yuq commented Jan 8, 2025

OK, thanks.

@Leadwerks

We are 100% in support of this proposal and are excited about working with this functionality.

@yuq yuq force-pushed the topic/mesh-shader branch from 779d416 to 7a22eab Compare July 1, 2025 06:21
@yuq yuq changed the title Draft: Add GL_EXT_mesh_shader Add GL_EXT_mesh_shader Jul 2, 2025
@yuq yuq marked this pull request as ready for review July 2, 2025 09:09
@yuq
Contributor Author

yuq commented Jul 2, 2025

I've finished the implementation in Mesa for AMD GPUs. Next I'm going to upstream the code while testing it further. https://gitlab.freedesktop.org/yuq825/mesa/-/commits/topic/mesh-shader

Could this MR be merged now? @oddhack @zmike

@zmike
Contributor

zmike commented Jul 2, 2025

@yuq WG has some review comments pending. Also will wait to ship until nvidium is at least semi-working with this (pending) to ensure things are usable as expected.

@Headcrabed

@yuq WG has some review comments pending. Also will wait to ship until nvidium is at least semi-working with this (pending) to ensure things are usable as expected.

I think nvidium only works on NVIDIA cards? Is switching from NV's mesh shader implementation to this enough to make it run on AMD cards?

@MCRcortex MCRcortex left a comment

Main thing is probably rasterization order and if possible having driver preferences on fast path payload sizes


* MESH_PREFERS_COMPACT_PRIMITIVE_OUTPUT_EXT, TRUE if the implementation
will perform best if there are no unused primitives in the output array.

It may be useful to have MAX_PREFERRED_PAYLOAD_SIZE_EXT etc. for the optimal task-to-mesh payload size, e.g. on NVIDIA this is 128 bytes for the fast path, if I remember correctly.

Contributor Author

Usually these optimization parameters live in the application or game engine. I added these parameters only because this extension is forked from VK_EXT_mesh_shader. Of course MAX_PREFERRED_PAYLOAD_SIZE_EXT can be added if required, but VK_EXT_mesh_shader does not have this parameter, so a GL-to-VK translation layer would always have to return the max payload size for it.

I don't think that a MAX_PREFERRED_PAYLOAD_SIZE_EXT makes much sense here. All GPU manufacturers suggest using as little payload as possible, so the "preferred" amount would be zero.

I also don't think it's a good idea to deviate from Vulkan's VkPhysicalDeviceMeshShaderPropertiesEXT. We don't want to reinvent the wheel here.

e.g. on NVIDIA this is 128 bytes for the fast path, if I remember correctly

That sounds like an implementation detail for NVidia and even they didn't feel like it's worth including that in the NV extension.
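
For readers unfamiliar with what MESH_PREFERS_COMPACT_PRIMITIVE_OUTPUT_EXT asks for ("no unused primitives in the output array"), the idea can be illustrated with a CPU-side sketch. This is not GLSL and not taken from the spec or any driver; `compact_primitives` and its arguments are hypothetical names, and it only models the data layout a mesh shader might produce when the query returns TRUE.

```python
from typing import List, Tuple

Triangle = Tuple[int, int, int]  # three vertex indices


def compact_primitives(triangles: List[Triangle],
                       culled: List[bool]) -> List[Triangle]:
    """Return only the surviving triangles, packed with no gaps.

    Hypothetical helper: models "compact" output, as opposed to
    leaving culled entries in place and marking them unused.
    """
    if len(triangles) != len(culled):
        raise ValueError("one cull flag is required per triangle")
    return [tri for tri, is_culled in zip(triangles, culled) if not is_culled]


tris = [(0, 1, 2), (2, 1, 3), (3, 1, 4)]
packed = compact_primitives(tris, [False, True, False])
print(packed)
```

The primitive count written by the shader would then be `len(packed)` rather than the original array length.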

@yuq yuq force-pushed the topic/mesh-shader branch from 7a22eab to ce7a626 Compare July 7, 2025 09:25
@MCRcortex

MCRcortex commented Jul 7, 2025

Are there any guarantees about how/where the mesh shader workgroups are launched? I know it would be hardware dependent, but as an example, does launching many mesh tasks from one task shader and few from other task shaders perform significantly worse than a roughly even distribution of mesh tasks?
I do have a project for which I would be interested in implementing a renderer using this extension; the main thing is translucency ordering, I think.

Are there also any guarantees about how work is dispatched from the mesh shader to the remaining raster pipeline? That is, if a single mesh shader workgroup from a set of mesh shader workgroups (dispatched from a task shader) takes significantly longer to complete, would this block the other workgroups from dispatching work to the raster pipeline? (Asking due to the possibility of doing compute raster in the mesh shader while still dispatching larger tris to the HW raster pipeline, which would also enable higher GPU silicon utilization and hopefully maximize throughput.)
I guess this is also completely implementation dependent, though, and

@Venemo

Venemo commented Jul 7, 2025

WG has some review comments pending. Also will wait to ship until nvidium is at least semi-working with this (pending) to ensure things are usable as expected.

I think nvidium only works on NVIDIA cards? Is switching from NV's mesh shader implementation to this enough to make it run on AMD cards?

@Headcrabed As far as I see the author of nvidium @MCRcortex is here with us, so I will interpret his presence as being interested in porting nvidium to use the EXT mesh shaders instead of the NV extension. Assuming no other NVidia specifics are used by nvidium, it should then work on other GPUs too.

I'm not that familiar with radeonsi internals, so maybe I am missing something obvious like a GFX10+ feature used in the code, but I am curious: can you shed some info on what is preventing this from being used on GFX9 as well (code has a requirement of >=GFX10_3)?

@Ristovski Previous GPUs don't have the hardware capability to implement mesh shaders. Most notably, only GFX10.3 and newer support per-primitive outputs.

@Venemo

Venemo commented Jul 7, 2025

Are there any guarantees about how/where the mesh shader workgroups are launched?

@MCRcortex Not sure if this thread is the right one to discuss GPU specific implementation details, but I'm happy to answer your questions about how it works at least on AMD HW. I reached out to you on your Discord.

@yuq yuq force-pushed the topic/mesh-shader branch from ce7a626 to 76d91ba Compare July 9, 2025 08:28
@Venemo

Venemo commented Jul 9, 2025

With regards to rasterization order: it seems that both AMD and NVidia do guarantee the "strict" rasterization order in their currently released GPUs. I haven't got any info about other hardware vendors yet.

Considering that the Vulkan spec is different from the proposal as well as different from the D3D12 spec, I asked for clarification on the Vulkan spec to make sure whether the current spec is what was intended. For consistency between OpenGL and Vulkan, I suggest waiting until that is resolved before we move forward here.

@yuq yuq force-pushed the topic/mesh-shader branch from 76d91ba to b161756 Compare July 11, 2025 05:48
@MCRcortex

I'm doing some work on mesh shaders now and had some other questions. Would it be possible to get a gl_DrawID as described in ARB_shader_draw_parameters for multidraw commands? It's possible to get the "inner id" (semi-equivalent to gl_InstanceID) with gl_WorkGroupID; NV_mesh_shader has a rough equivalent of this through a 'first' argument in the draw commands.
Quote:
The x component of gl_WorkGroupID of the first active stage will be within the range of [<first> , <first + count - 1>]

I'm doubtful about getting a first argument, but would it be possible to have gl_DrawID work as a replacement (as is already described in ARB_shader_draw_parameters)?
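
To make the two identifiers concrete, here is a small model of the values a shader would observe. It assumes only what is quoted above (the x component of gl_WorkGroupID spans [first, first + count - 1] under NV_mesh_shader) and the ARB_shader_draw_parameters definition of gl_DrawID as the index of the draw within a multi-draw command. The function names are hypothetical, and this is plain Python, not shader code.

```python
from typing import List, Tuple


def nv_workgroup_ids(first: int, count: int) -> List[int]:
    """x components of gl_WorkGroupID for one NV-style draw:
    the inclusive range [first, first + count - 1]."""
    return list(range(first, first + count))


def multidraw_ids(draws: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """(gl_DrawID, gl_WorkGroupID.x) pairs for a multi-draw made of
    (first, count) sub-draws; gl_DrawID is the sub-draw index."""
    return [(draw_id, wg)
            for draw_id, (first, count) in enumerate(draws)
            for wg in nv_workgroup_ids(first, count)]


print(nv_workgroup_ids(4, 3))
print(multidraw_ids([(0, 2), (10, 1)]))
```

With gl_DrawID available, a shader can look up a per-draw base offset itself, which is the substitution for 'first' that the question is asking about.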

@yuq
Contributor Author

yuq commented Jul 14, 2025

would it be possible to get a gl_DrawID as described in ARB_shader_draw_parameters for multidraw commands?

Yes. GLSL spec described it: https://github.com/KhronosGroup/GLSL/blob/main/extensions/ext/GLSL_EXT_mesh_shader.txt#L964

@MCRcortex

Oh awesome, ty, missed that in the spec

@MCRcortex

I have a project that should make it easier to test this extension (unfortunately I'm missing the hardware to do so, as I don't have any AMD hardware).

@zmike
Contributor

zmike commented Sep 2, 2025

How does this interact with ARB_separate_shader_objects? There are no explicit notes in the spec here for it.

@yuq
Contributor Author

yuq commented Sep 3, 2025

How does this interact with ARB_separate_shader_objects? There are no explicit notes in the spec here for it.

ARB_separate_shader_objects was added in OpenGL 4.1, and this extension is written against the OpenGL 4.6 spec, so there are no explicit notes on the ARB_separate_shader_objects interaction. There are some notes that add the mesh shader stages to the spec language introduced by ARB_separate_shader_objects:

@zmike
Contributor

zmike commented Sep 3, 2025

Ah, thanks, I missed that.

@yuq yuq force-pushed the topic/mesh-shader branch from b161756 to 44c5648 Compare September 16, 2025 07:41
@MCRcortex

I talked a bit about this with @zmike, but it should also be raised here: how do mesh shaders interact with GL_RASTERIZER_DISCARD? In my opinion it should be ignored, as it does not make much sense for mesh shaders; however, I know there is a spec clarification issue about this for the Vulkan extension. It might be good to either wait for the clarification or make a decision, so that later revisions to the spec aren't needed (for this issue).

@Venemo

Venemo commented Sep 16, 2025

how do mesh shaders interact with GL_RASTERIZER_DISCARD? In my opinion it should be ignored, as it does not make much sense for mesh shaders

The way I understand it, it should work the same as for any other pre-rasterization stage. Why should it be ignored?

EDIT: I'd like it to be consistent with Vulkan. Does Vulkan ignore it?

@MCRcortex

MCRcortex commented Sep 16, 2025

It's not defined in the Vulkan spec, from what I understand (hence zmike requesting clarification).
Personally, I think it should be ignored, as it should be equivalent to mesh shaders emitting a size of zero. If rasterization discard is not included, it could be an optimization possibility for drivers.
Transform feedback is not applied either; however, reading the specification again, it does suggest that rasterization discard is applied:

After programmable mesh processing, the same fixed-function operations are
applied to vertices of the resulting primitives as above, except the
transform feedback (see section 13.3), primitive queries (see section 13.4)
and transform feedback overflow queries (see section 13.5) are replaced by
mesh primitive queries (see section 13.Y).

Furthermore, if rasterization discard is enabled, a fragment shader is not strictly required to even be attached (from what I have read; this is probably incorrect, however).

@pdaniell-nv
Contributor

With Vulkan the rasterization discard state still applies when drawing with mesh shaders.

yuq added 4 commits October 9, 2025 10:38
This is an OpenGL extension forking VK_EXT_mesh_shader to
provide OpenGL mesh shader functionality.
@yuq yuq force-pushed the topic/mesh-shader branch from 44c5648 to 0a19fa1 Compare October 9, 2025 02:39
Contributor

@zmike zmike left a comment

@oddhack please merge

@zmike zmike added this to the Approved to Merge milestone Oct 9, 2025
@oddhack oddhack merged commit 3839177 into KhronosGroup:main Oct 9, 2025