Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Conversation

@jonahwilliams
Copy link
Contributor

As discovered by @knopp in flutter/flutter#131567 (comment), this is actually reducing performance substantially when there are multiple blurs. In the case of flutter/flutter#132735 , removing this capbility improves GPU performance from 400ms per frame to ~100 ms per frame.

Fixes flutter/flutter#131567 (comment)

@jonahwilliams
Copy link
Contributor Author

TODO: determine if the Flip allocations are already cached.

@jonahwilliams
Copy link
Contributor Author

Switch RT allocator. Note that this does not make MSAA textures non-transient.

@jonahwilliams
Copy link
Contributor Author

From testing locally on a iPhone 14 (I think).

ToT: 250 ms GPU time.
With this change: ~200 ms GPU time.

@bdero
Copy link
Member

bdero commented Nov 7, 2023

Woah -- any theories as to why flipping between two textures for the restore is faster than reusing the same one in this case? We need to wait for the previous pass to store before the backdrop can be read either way... Will TAL shortly

@jonahwilliams
Copy link
Contributor Author

I have no particular theories. Debugging with xcode frame capture in both cases shows that we're spending 99% of time in the gaussian blur. So it must be something like a synchronization deoptimization when reading from the onscreen, or perhaps the onscreen being wide gamut forces some additional conversions?

@bdero
Copy link
Member

bdero commented Nov 8, 2023

Oh! IIRC the Metal docs give an ominous warning about reduced performance of readable onscreen textures, but I never considered that sampling from the onscreen texture might be super slow, which would make sense...

If you instead just set SupportsReadFromOnscreenTexture to false for Metal and do nothing else, does that meet/exceed the performance of this patch? SupportsReadFromResolve seems like a cap check worth keeping around.

@jonahwilliams
Copy link
Contributor Author

Setting it to false accomplishes the same effect. If we're not using it though, then its still in the git history if we need to bring it back.

@bdero
Copy link
Member

bdero commented Nov 8, 2023

Oh no, if we're not using a capability anywhere, we should for sure remove it.

What I mean is getting rid of SupportsReadFromResolve might not be responsible the performance gain you're seeing here, and instead just regressing memory more than is necessary.

My suggestion was a route for easily testing it out: If you go to HEAD and just disable onscreen texture reads for Metal, does that cause the same performance improvement you're seeing with this patch? This can be tried by just setting SupportsReadFromOnscreenTexture to false for the Metal backend.

@jonahwilliams
Copy link
Contributor Author

I'm not sure what you mean, the only change in this PR is the removal of the capability.

@bdero
Copy link
Member

bdero commented Nov 8, 2023

The capability you're removing is SupportsReadFromResolve. I'm suggesting to try turning off or removing SupportsReadFromOnscreenTexture instead (and keeping SupportsReadFromResolve fully intact).

@bdero
Copy link
Member

bdero commented Nov 8, 2023

Just to further clarify: In terms of EntityPass' behavior, turning off SupportsReadFromResolve effectively also turns off SupportsReadFromOnscreenTexture.

Disabling just SupportsReadFromOnscreenTexture makes EntityPass render the root pass offscreen if it has backdrop reads, but doesn't force every pass with backdrop reads to double up their color textures.

Disabling SupportsReadFromResolve (as this PR does) makes every pass with backdrop reads double up their color textures. But in order to use the texture flipping mechanism on the root pass work, disabling SupportsReadFromResolve also needs to make EntityPass render the root pass offscreen if it has backdrop reads -- the behavior of disabling SupportsReadFromOnscreenTexture.

So if it turns out that avoiding onscreen texture i/o is the cause of the improvement, then SupportsReadFromOnscreenTexture is what you're after and we get to continue avoiding the texture toggling on tilers.
If disabling SupportsReadFromOnscreenTexture alone doesn't have an as-good-or-better effect than what you're seeing here (indicating that the texture flip itself is somehow relevant), that would be a super interesting and wild result.

@jonahwilliams
Copy link
Contributor Author

Ahh sorry, I completely misread your comment. Yes, It seems like I confused read from resolve and read from onscreen - though disabling read from resolve had the same effect. From testing, the improvement is actually from disabling the read from onscreen - which I thought is what I did 😆

  Macro Example
TOT 250 450
W/Out OnScreen 203-187 125-109
W/Out Onscreen and Resolve 203 125

I will update this to actually remove the read from onscreen check.

@jonahwilliams
Copy link
Contributor Author

Yeah, redoing here: #47808

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

No open projects
Archived in project

Development

Successfully merging this pull request may close these issues.

[Impeller] Reduce render-pass overhead when using wide gamut (iOS)

2 participants