Add GL_EXT_mesh_shader #640
Re discussion on the Mesa issue: Khronos does not "approve" new vendor extensions, though we do try to consistency-check them and make sure they follow the extension guidelines before we include them in the extension registry and hand out enum allocations. It's true that GL spec activity is very minimal within Khronos, but vendor and EXT extension development does not have to happen inside Khronos.
So the first thing to ask is whether there is a commitment to implement this on the part of someone actually writing Mesa drivers. There's no point in publishing an extension spec in the registry if nobody has implemented it.
Then, how and why does it differ from the NV extension? I see a slight signature change on one of the APIs but haven't tried to review the whole thing. Because of the close relationship between them, there should at least be a section near the "Interactions" section discussing the things that are the same, those that had to be changed, and why. Would it be possible to implement the NV extension as it stands today on your target GPUs, and then add a really small extension on top of that to accommodate the changed signature, rather than duplicate so much of that language?
BTW, when promoting an extension we keep the enum values unchanged so long as they are semantically indistinguishable from the point of view of the driver they are passed to. The enum value would need to change only if the driver has to behave differently depending on which extension is being used.
Thanks for the explanation.
Yeah, I'm going to implement it in Mesa if it's accepted.
The differences from the NV extension are listed in the issue Q&A:
It's not possible to stack a new extension on top of the NV one, because the NV extension interface (mostly the GLSL part) is not suitable for other GPU vendors; that is why Vulkan created VK_EXT_mesh_shader. We could implement the NV extension with many ugly workarounds in the driver, but that would hurt performance:
The runtime API part mostly comes from VK_EXT_mesh_shader, to leverage the existing agreement among GPU vendors. I can keep the enum values that are the same as in the NV extension and assign fake values to the new ones.
|
Hi,
Yes, there is interest in implementing it in RadeonSI as well as in Zink (which would work on top of the Vulkan EXT_mesh_shader exposed by the underlying Vulkan driver).
Same as the Vulkan NV vs. EXT extensions. In a nutshell, the NV extension makes it impossible to implement mesh shaders with reasonable performance on other vendors' HW, and EXT fixes that. Furthermore, EXT is better aligned with D3D12 mesh shaders and therefore benefits developers by providing a more familiar programming model. If you are interested in the exact details, they have been discussed in the Vulkan EXT_mesh_shader blog post and also on the spec MR here, among other places. This comment in the Mesa repo goes through the main issues with implementing the NV extension on HW that wasn't designed for it.
|
I'd like to see the mesa implementation at least well underway before this is released, but I think it's a great addition to the ecosystem. Bringing cross-vendor support to GL mesh shading will enable things like nvidium to finally run on more platforms. |
|
Then can the numbers be allocated first, so that I can update headers and start implementation? |
|
Yeah that seems good. @oddhack do you take care of that or am I supposed to do something? |
@yuq how many do you need? We allocate in blocks of 16. I think I counted 62 enums in the spec as it stands, so I can give you a block of 64 if you need that many (the new bit values not included in that total since they are semantically in a different namespace). |
|
I need 22; rounded up to a multiple of 16, that is 32. I reused some enum values from GL_NV_mesh_shader, so only those beginning with 0xF need to be allocated. What about the extension serial number (I used a fake value of 1024)? Does it have to be allocated at release?
Done, see d8fdb8d (enums 0x9740-0x975F). The extension number is assigned when we publish. It isn't actually used as anything but an ordering mechanism. |
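For anyone checking the arithmetic in this exchange, here is a minimal C sketch of rounding a request up to whole blocks and counting the values in an inclusive range. The block size of 16 comes from the comment above; the function names are hypothetical and not part of any registry tooling.

```c
#include <stdint.h>

/* Assumed block size, taken from the "blocks of 16" remark in the thread. */
#define ENUM_BLOCK_SIZE 16u

/* Round a requested enum count up to a whole number of blocks. */
static uint32_t enum_block_round_up(uint32_t count)
{
    return (count + ENUM_BLOCK_SIZE - 1u) / ENUM_BLOCK_SIZE * ENUM_BLOCK_SIZE;
}

/* Number of values in the inclusive range [first, last]. */
static uint32_t enum_range_size(uint32_t first, uint32_t last)
{
    return last - first + 1u;
}
```

With these helpers, a request for 22 enums rounds up to 32, which matches the size of the allocated range 0x9740-0x975F.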
|
OK, thanks. |
|
We are 100% in support of this proposal and are excited about working with this functionality. |
|
I've finished the implementation in Mesa for AMD GPUs. Next I'm going to upstream the code while testing it further. https://gitlab.freedesktop.org/yuq825/mesa/-/commits/topic/mesh-shader
|
@yuq WG has some review comments pending. Also will wait to ship until nvidium is at least semi-working with this (pending) to ensure things are usable as expected. |
I think nvidium only works on NVIDIA cards? Is switching from NV's mesh shader implementation to this one enough to make it run on AMD cards?
MCRcortex
left a comment
Main thing is probably rasterization order and if possible having driver preferences on fast path payload sizes
* MESH_PREFERS_COMPACT_PRIMITIVE_OUTPUT_EXT, TRUE if the implementation
will perform best if there are no unused primitives in the output array.
|
It may be useful to have MAX_PREFERRED_PAYLOAD_SIZE_EXT etc. for the optimal task-to-mesh payload size, e.g. on NVIDIA this is 128 bytes for the fast path, if I remember correctly.
Usually these optimization parameters are handled in the application or game engine. I added these parameters only because this extension forks VK_EXT_mesh_shader. Of course MAX_PREFERRED_PAYLOAD_SIZE_EXT can be added if required, but VK_EXT_mesh_shader does not have this parameter, so a GL-to-VK translation layer would always have to return the max payload size for it.
I don't think that a MAX_PREFERRED_PAYLOAD_SIZE_EXT makes much sense here. All GPU manufacturers suggest to use as little payload as possible, so the "preferred" amount would be zero.
I also don't think it's a good idea to deviate from Vulkan's VkPhysicalDeviceMeshShaderPropertiesEXT. We don't want to reinvent the wheel here.
"e.g. on nvidia this is 128 bytes for the fast path if remembering correctly"
That sounds like an implementation detail for NVidia and even they didn't feel like it's worth including that in the NV extension.
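To make "no unused primitives in the output array" concrete, here is a minimal CPU-side C sketch of compacting a primitive list so that only surviving triangles are written and the returned count is what would be emitted. This is an illustration only, not shader code and not text from the spec; the `Triangle` type and `compact_triangles` name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical triangle record: three vertex indices into a meshlet. */
typedef struct { uint8_t v[3]; } Triangle;

/*
 * Copy only the triangles with visible[i] == true to out, preserving order.
 * Returns the number written, i.e. the primitive count to emit.
 * out must have room for at least count entries.
 */
static size_t compact_triangles(const Triangle *in, const bool *visible,
                                size_t count, Triangle *out)
{
    size_t written = 0;
    for (size_t i = 0; i < count; ++i) {
        if (visible[i]) {
            out[written++] = in[i];
        }
    }
    return written;
}
```

An implementation that reports the property as TRUE would, by the quoted definition, favor this kind of dense output over leaving culled entries in place; whether the extra compaction work pays off is up to the application to measure.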
|
Are there any guarantees about how/where the mesh shader workgroups are launched? I know it would be hardware dependent, but as an example, does launching many mesh tasks from one task shader and few from other task shaders perform significantly worse than a roughly even distribution of mesh tasks? Are there also any guarantees about how work is dispatched from the mesh shader to the rest of the raster pipeline? That is, if a single mesh shader workgroup from a set of mesh shader workgroups (dispatched from a task shader) takes significantly longer to complete, would this block the other workgroups from dispatching work to the raster pipeline? (Asking because of the possibility of doing compute rasterization in the mesh shader while still dispatching larger triangles to the HW raster pipeline, which would also enable higher GPU silicon utilization and hopefully maximize throughput.)
@Headcrabed As far as I see the author of nvidium @MCRcortex is here with us, so I will interpret his presence as being interested in porting nvidium to use the EXT mesh shaders instead of the NV extension. Assuming no other NVidia specifics are used by nvidium, it should then work on other GPUs too.
@Ristovski Previous GPUs don't have the hardware capability to implement mesh shaders. Most notably, only GFX10.3 and newer support per-primitive outputs. |
@MCRcortex Not sure if this thread is the right one to discuss GPU specific implementation details, but I'm happy to answer your questions about how it works at least on AMD HW. I reached out to you on your Discord. |
|
With regard to rasterization order: it seems that both AMD and NVidia do guarantee the "strict" rasterization order in their currently released GPUs. I haven't got any info about other hardware vendors yet. Considering that the Vulkan spec is different from the proposal as well as from the D3D12 spec, I asked for clarification on the Vulkan spec to confirm whether the current wording is what was intended. For consistency between OpenGL and Vulkan, I suggest waiting until that is resolved before we move forward here.
|
Am doing some work on mesh shaders now and had some other questions, would it be possible to get a am doubtful of being able to get a |
Yes, the GLSL spec describes it: https://github.com/KhronosGroup/GLSL/blob/main/extensions/ext/GLSL_EXT_mesh_shader.txt#L964
|
Oh awesome, ty, missed that in the spec
|
I have a project that should make it easier to test this extension (unfortunately I'm missing the hardware to do so; I don't have any AMD hardware)
|
How does this interact with ARB_separate_shader_objects? There are no explicit notes in the spec here for it. |
ARB_separate_shader_objects was added to OpenGL 4.1, and this extension is written against the OpenGL 4.6 spec, so there are no explicit notes on the ARB_separate_shader_objects interaction. There are some notes that add the mesh shader stages to the spec language introduced by ARB_separate_shader_objects:
|
|
Ah, thanks, I missed that. |
b161756 to
44c5648
Compare
|
I talked a bit about this with @zmike, but it should also be raised here: how do mesh shaders interact with
The way I understand it, it should work the same as for any other pre-rasterization stage. Why should it be ignored? EDIT: I'd like it to be consistent with Vulkan. Does Vulkan ignore it? |
|
It's not defined in the Vulkan spec from what I understand (hence zmike requesting clarification).
Furthermore, if rasterization discard is enabled, a fragment shader is not strictly required to even be attached (from what I have read; this is probably incorrect, however).
|
With Vulkan the rasterization discard state still applies when drawing with mesh shaders. |
This is an OpenGL extension forking VK_EXT_mesh_shader to provide OpenGL mesh shader functionality.
Required by NVIDIA.
zmike
left a comment
@oddhack please merge
This is an OpenGL extension forking VK_EXT_mesh_shader to provide OpenGL mesh shader functionality.
Numbers in the spec haven't been allocated, so use fake numbers for now. No header/XML updates in the PR either.
This extension was written at the request of nvidium users, to add OpenGL mesh shader support to drivers for GPUs other than NVIDIA's: