Description
Background
Compared to the Skia backend, drawVertices and drawAtlas are much slower in Impeller. The general reason is that, whenever blending is required, Impeller performs the blend using offscreen textures.
For example:
- drawVertices + per-vertex colors + tiled image color source. We create one offscreen to draw the tiled image color source + vertex data, a second offscreen for the per-vertex colors + vertex data, and then blend them together into the parent pass.
- drawAtlas + per atlas color, same story.
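The multi-pass flow described above can be sketched on the CPU as follows. This is only an illustration, not Impeller code: the one-row "framebuffer", the constant gray stand-in for the image color source, the helper names, and the choice of the modulate blend mode (picked because it is a simple component-wise product) are all assumptions.

```python
# CPU sketch of the two-offscreen approach. All names are hypothetical.

def modulate(dst, src):
    """Modulate blend on premultiplied RGBA: component-wise product."""
    return tuple(d * s for d, s in zip(dst, src))


def lerp_color(a, b, t):
    """Linear interpolation between two RGBA colors."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


WIDTH = 4  # a one-row "framebuffer" keeps the example small

# Offscreen 1: the image color source drawn with the vertex data
# (a constant gray stands in for the tiled image).
image_pass = [(0.5, 0.5, 0.5, 1.0) for _ in range(WIDTH)]

# Offscreen 2: the per-vertex colors drawn with the same vertex data,
# interpolated from the left vertex to the right vertex.
left, right = (1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0)
color_pass = [lerp_color(left, right, x / (WIDTH - 1)) for x in range(WIDTH)]

# Final step: blend the two offscreens together into the parent pass.
parent_pass = [modulate(d, s) for d, s in zip(image_pass, color_pass)]
```

Each of the two intermediate lists corresponds to a full offscreen texture on the GPU, which is where the extra cost comes from.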
These APIs aren't heavily used by applications, but they are used by games and game-like apps, and by frameworks such as Flame.
Historically, we discussed eliminating subpasses for these APIs by proposing dedicated shaders for drawAtlas/drawVertices (see the Google-only link go/impeller-draw-atlas-vertices). That proposal was discarded because it created too many shader variants, so I implemented them with subpasses. Of course, this has led to other visual fidelity bugs:
- [Impeller] Consider optimizing out "useless" atlas blending #121723
- [Impeller] Alpha behavior different DrawAtlas/DrawVertices with per color blending. #118914
- [Impeller] Optimize DrawAtlas with blending. #119958
Overview
We can modify the existing blend shaders to blend with per-vertex colors rather than a uniform color value. This should have little impact on rendering cost, and once the vertex colors are allowed to vary, we can express all drawAtlas/drawVertices blending with these shaders.
These shaders include:
- All advanced blend shaders (non framebuffer fetch)
- Porter-Duff blend.
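A minimal CPU sketch of the idea, assuming a Porter-Duff source-over blend on premultiplied RGBA: the blend color arrives as an interpolated per-vertex value instead of a uniform, and the blend happens in the same pass that samples the texture. The function names are hypothetical, and treating the texel as destination and the vertex color as source is an assumption made for illustration only.

```python
# Single-pass blend with a per-vertex blend color. All names are hypothetical.

def src_over(dst, src):
    """Porter-Duff source-over for premultiplied RGBA."""
    inv_src_alpha = 1.0 - src[3]
    return tuple(s + d * inv_src_alpha for s, d in zip(src, dst))


def lerp_color(a, b, t):
    """Linear interpolation, standing in for varying interpolation."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def shade_span(texels, color_left, color_right, blend):
    """Blend each sampled texel with the interpolated vertex color.

    No intermediate buffer is produced; the blend runs per pixel.
    """
    n = len(texels)
    out = []
    for x, texel in enumerate(texels):
        t = x / (n - 1) if n > 1 else 0.0
        vertex_color = lerp_color(color_left, color_right, t)
        out.append(blend(texel, vertex_color))
    return out


texels = [(0.0, 0.0, 1.0, 1.0)] * 3  # opaque blue "texture" samples
tint = (0.5, 0.0, 0.0, 0.5)          # premultiplied half-transparent red

# The existing uniform-color blend is the special case in which every
# vertex carries the same color.
uniform_result = shade_span(texels, tint, tint, src_over)

# With different colors per vertex, the result varies across the span.
varying_result = shade_span(texels, tint, (0.0, 0.0, 0.0, 0.0), src_over)
```

In this sketch the uniform-color case falls out of the per-vertex case for free, which is why a single set of shaders could plausibly serve both the regular blend path and drawAtlas/drawVertices.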
drawVertices + gradient blending will remain a slow case, as we would need to render the gradient to an offscreen in order to have a texture to sample. But texture sampling cases should work as is.
There was an open question as to whether we should attempt #116168 first and use different vertex layouts to make the encoding faster for the blend case versus the drawAtlas/drawVertices case. However, since then we've learned a bit more about Vulkan specifically:
- Pipeline state object (PSO) variants have much more overhead than on Metal, so changing the layout isn't immediately a win
- Additional offscreen textures are much more expensive compared to upload time. So reducing the subpass count is more important than packing the bytes effectively.
- There are more caveats with shader performance on Vulkan. For example, loading uniform fragment data via vertex shader varyings may actually be faster in some cases, which is the opposite of what we expected.
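To give a rough sense of scale for the upload-versus-offscreen tradeoff, the following back-of-the-envelope sketch compares the bytes added by a per-vertex color with the size of one extra offscreen texture. Every number here is an assumption chosen for illustration (the vertex layouts, 1000 sprites drawn as 6 unindexed vertices each, a 1080x1920 RGBA8 offscreen), and byte counts are only a crude proxy for actual GPU cost, which also depends on pass setup and bandwidth.

```python
import struct

# Hypothetical interleaved vertex layouts (little-endian 32-bit floats).
LAYOUT_WITHOUT_COLOR = "<2f2f"   # position (x, y) + texture coords (u, v)
LAYOUT_WITH_COLOR = "<2f2f4f"    # the same, plus an RGBA color per vertex

bytes_without_color = struct.calcsize(LAYOUT_WITHOUT_COLOR)
bytes_with_color = struct.calcsize(LAYOUT_WITH_COLOR)

# Assumed workload: 1000 atlas sprites, each drawn as two triangles.
SPRITES = 1000
VERTICES_PER_SPRITE = 6
extra_vertex_bytes = (
    SPRITES * VERTICES_PER_SPRITE * (bytes_with_color - bytes_without_color)
)

# Assumed offscreen size: one full-screen 1080x1920 texture, 4 bytes per pixel.
offscreen_bytes = 1080 * 1920 * 4

print(f"extra vertex data:   {extra_vertex_bytes} bytes")
print(f"one offscreen:       {offscreen_bytes} bytes")
print(f"offscreen / vertex:  {offscreen_bytes / extra_vertex_bytes:.1f}x")
```

Under these assumptions the unpacked color data is small next to a single offscreen texture, which is in line with prioritizing subpass reduction over byte packing.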
Given these, I think we should proceed with modifying the blend shaders so that the blend color is specified per vertex.