Description
Originally in go/impeller-shader-variants, related to #116103
Problems
- Some shader source code is highly repetitive due to blend mode implementations (e.g. drawAtlas, drawVertices, or blends). Branching on blend mode is likely to be prohibitively expensive.
- Arm Mali GPUs suffer from substantial performance overhead compared to iOS GPUs when branching on uniforms. Dramatic performance improvements can be achieved on Android by creating variants for common shaders (see: Make ColorFilterLayer accept an offset engine#38055, [Impeller] improve rrect blur performance on Android engine#37925, [Impeller] Improve border_mask_blur performance on Android engine#37923, [Impeller] improve morphology performance engine#37918).
- Maintaining mostly duplicate shader sources is cumbersome and likely to be error-prone in the future.
- Registering and looking up different shaders is highly repetitive.
- Every shader we add slightly increases startup time. At this point it's not a problem, but it needs to be managed. Tracked in [Impeller] Benchmark PSO library construction in the constructor of the content context. #117180
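To make the registration and lookup problem concrete, here is a minimal sketch of a cache keyed by shader name and variant index, so that adding a variant does not require a new hand-written accessor. Every name in it (`VariantKey`, `Pipeline`, `VariantCache`) is hypothetical and is not an existing Impeller type; the builder callback stands in for whatever actually compiles a pipeline.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical key: which shader, and which variant of it (for example a
// blend mode or a morphology type encoded as an integer).
struct VariantKey {
  std::string shader_name;
  int variant;

  bool operator<(const VariantKey& other) const {
    return std::tie(shader_name, variant) <
           std::tie(other.shader_name, other.variant);
  }
};

// Placeholder for a compiled pipeline object.
struct Pipeline {
  std::string description;
};

// Builds each variant at most once and returns the cached result afterwards.
class VariantCache {
 public:
  using Builder = std::function<Pipeline(const VariantKey&)>;

  explicit VariantCache(Builder builder) : builder_(std::move(builder)) {}

  const Pipeline& Get(const VariantKey& key) {
    auto found = pipelines_.find(key);
    if (found == pipelines_.end()) {
      // First request for this variant: build it and remember it.
      found = pipelines_.emplace(key, builder_(key)).first;
    }
    return found->second;
  }

  std::size_t size() const { return pipelines_.size(); }

 private:
  Builder builder_;
  std::map<VariantKey, Pipeline> pipelines_;
};
```

The sketch builds lazily for brevity. Calling `Get` for every known variant at startup would turn it into an eager warm-up, which trades the startup cost noted above for predictable frame times.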
Specialization constants
spirv-cross seems to mostly do reasonable things with specialization constants declared in a shader.
For example, with the morphology filter:
uniform FragInfo {
float texture_sampler_y_coord_scale;
vec2 texture_size;
vec2 direction;
float radius;
}
frag_info;
layout(constant_id = 0) const float morph_type = 0;
For GLES, the following code is generated:
#ifndef SPIRV_CROSS_CONSTANT_ID_0
#define SPIRV_CROSS_CONSTANT_ID_0 0.0
#endif
const float morph_type = SPIRV_CROSS_CONSTANT_ID_0;
For Metal, the following code is generated:
constant float morph_type_tmp [[function_constant(0)]];
constant float morph_type = is_function_constant_defined(morph_type_tmp) ? morph_type_tmp : 0.0;
We'll probably need to do some plumbing to set the constant, but it might do what we need.
We'll still need to declare all of the possible values ahead of time. This will allow us to generate specialized versions for GLES and have correct bindings for Metal, which seems to need all possible values during pipeline creation.
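One way the GLES side could work, given the `#ifndef SPIRV_CROSS_CONSTANT_ID_0` guard shown above, is to prepend a `#define` for each value declared ahead of time. The following is a minimal sketch under that assumption. `SpecializeGlesSource` and `GenerateGlesVariants` are hypothetical helpers, not Impeller or spirv-cross APIs, and the sketch assumes the generated source may begin with a `#version` line that has to stay first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of `source` with
// `#define SPIRV_CROSS_CONSTANT_ID_<id> <value>` inserted, so the `#ifndef`
// guard emitted by spirv-cross keeps the specialized value instead of the
// default.
std::string SpecializeGlesSource(const std::string& source,
                                 int constant_id,
                                 const std::string& value) {
  assert(constant_id >= 0);
  const std::string define = "#define SPIRV_CROSS_CONSTANT_ID_" +
                             std::to_string(constant_id) + " " + value + "\n";
  // GLSL requires `#version` to come before other directives, so when the
  // source starts with one, insert the define on the line after it.
  if (source.rfind("#version", 0) == 0) {
    const std::size_t eol = source.find('\n');
    if (eol == std::string::npos) {
      return source + "\n" + define;
    }
    return source.substr(0, eol + 1) + define + source.substr(eol + 1);
  }
  return define + source;
}

// Generates one specialized source per value declared ahead of time.
std::vector<std::string> GenerateGlesVariants(
    const std::string& source,
    int constant_id,
    const std::vector<std::string>& values) {
  std::vector<std::string> variants;
  variants.reserve(values.size());
  for (const auto& value : values) {
    variants.push_back(SpecializeGlesSource(source, constant_id, value));
  }
  return variants;
}
```

For the morphology example, `GenerateGlesVariants(source, 0, {"0.0", "1.0"})` would yield two sources that differ only in the injected define. Whether this happens offline in the shader compiler or at runtime is left open here.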
From Metal documentation:
This sample uses three different MTLRenderPipelineState objects, each representing a different LOD. Specializing functions and building pipelines is expensive, so the sample performs these tasks asynchronously before starting the render loop. When the AAPLRenderer object is initialized, each LOD pipeline is created asynchronously by using dispatch groups, completion handlers, and notification blocks.
Action items
- Add PSO benchmarks and compare specialization constants to N shaders
- Add support for function constant registration in the Metal backend
- Ensure that when choosing between OpenGL and Vulkan on Android, we can avoid loading OpenGL shaders if we end up generating a large number of them.
- Verify Android on Vulkan sees similar performance improvements from specialization constants compared to shader variants.