This repository was archived by the owner on Feb 25, 2025. It is now read-only.
[Impeller] improve morphology performance #37918
Closed
Improvements to morphology (dilate/erode) performance based on Arm Mali guidelines. This doesn't noticeably improve performance on iOS, but on a Pixel 6 it improves full-screen filter performance from 40-50 FPS to ~70 FPS.
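For context on what the filter computes: dilate replaces each pixel with the maximum over a window of neighbors, and erode replaces it with the minimum. The following is only a CPU-side sketch of that idea along one axis, not the Impeller shader; the function name `morphology_1d` and its parameters are hypothetical and chosen for illustration.

```python
def morphology_1d(row, radius, dilate=True):
    """Apply a 1D dilate (max) or erode (min) over a window of 2*radius+1 samples.

    Illustrative only: the real filter runs as a fragment shader and samples a
    texture; here `row` is a plain list of numbers and the window is clamped
    at the edges.
    """
    pick = max if dilate else min
    n = len(row)
    out = []
    for i in range(n):
        lo = max(0, i - radius)          # clamp window start to the row
        hi = min(n, i + radius + 1)      # clamp window end to the row
        out.append(pick(row[lo:hi]))
    return out


if __name__ == "__main__":
    print(morphology_1d([0, 0, 1, 0, 0], radius=1, dilate=True))
    print(morphology_1d([1, 1, 0, 1, 1], radius=1, dilate=False))
```

The per-pixel cost grows with the window size, which is why the arithmetic cost of the inner loop matters for a full-screen filter.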
Summary of Changes
malioc results:
**[/Users/jonahwilliams/engine/src/out/android_debug_arm64/gen/flutter/impeller/entity/gles/morphology_filter.frag.gles]**

[Mali-T880]
  Main shader
  ===========
  Work registers: 3 (75% used at 100% occupancy)
  Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   6.33   1.00   1.00  A
+ Total instruction cycles:   3.33   1.00   1.00  A
- Shortest path cycles:       1.65   1.00   0.00  A
+ Shortest path cycles:       1.00   1.00   0.00  A, LS
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

[Mali-T860]
  Main shader
  ===========
  Work registers: 3 (75% used at 100% occupancy)
  Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   9.50   1.00   1.00  A
+ Total instruction cycles:   5.00   1.00   1.00  A
- Shortest path cycles:       2.50   1.00   0.00  A
+ Shortest path cycles:       1.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

[Mali-T830]
  Main shader
  ===========
- Work registers: 4 (100% used at 100% occupancy)
+ Work registers: 3 (75% used at 100% occupancy)
- Uniform registers: 1 (5% used)
+ Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   9.00   1.00   1.00  A
+ Total instruction cycles:   5.00   1.00   1.00  A
- Shortest path cycles:       1.62   1.00   0.00  A
+ Shortest path cycles:       1.25   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

[Mali-T820]
  Main shader
  ===========
- Work registers: 4 (100% used at 100% occupancy)
+ Work registers: 3 (75% used at 100% occupancy)
- Uniform registers: 1 (5% used)
+ Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:  18.00   1.00   1.00  A
+ Total instruction cycles:  10.00   1.00   1.00  A
- Shortest path cycles:       3.25   1.00   0.00  A
+ Shortest path cycles:       2.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

[Mali-T760]
  Main shader
  ===========
  Work registers: 3 (75% used at 100% occupancy)
  Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   9.50   1.00   1.00  A
+ Total instruction cycles:   5.00   1.00   1.00  A
- Shortest path cycles:       2.50   1.00   0.00  A
+ Shortest path cycles:       1.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

[Mali-T720]
  Main shader
  ===========
- Work registers: 4 (100% used at 100% occupancy)
+ Work registers: 3 (75% used at 100% occupancy)
- Uniform registers: 1 (5% used)
+ Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:  18.00   1.00   1.00  A
+ Total instruction cycles:  10.00   1.00   1.00  A
- Shortest path cycles:       3.25   1.00   0.00  A
+ Shortest path cycles:       2.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture

[Mali-G78AE]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G78]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G77]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G76]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   1.08   0.00   0.12   0.50  A
+ Total instruction cycles:   0.83   0.00   0.25   0.50  A
  Shortest path cycles:       0.29   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G72]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   2.17   0.00   0.25   1.00  A
+ Total instruction cycles:   1.67   0.00   0.50   1.00  A
  Shortest path cycles:       0.58   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G715]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 66%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.23   0.00   0.03   0.12  A
+ Total instruction cycles:   0.15   0.00   0.06   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G710]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.26   0.00   0.06   0.12  A
+ Total instruction cycles:   0.20   0.00   0.12   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G71]
  Main shader
  ===========
- Work registers: 19 (59% used at 100% occupancy)
+ Work registers: 20 (62% used at 100% occupancy)
- Uniform registers: 20 (31% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   2.00   0.00   0.25   1.00  A
+ Total instruction cycles:   1.58   0.00   0.50   1.00  A
  Shortest path cycles:       0.58   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G68]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G615]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 66%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.23   0.00   0.03   0.12  A
+ Total instruction cycles:   0.15   0.00   0.06   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G610]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.26   0.00   0.06   0.12  A
+ Total instruction cycles:   0.20   0.00   0.12   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G57]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G52]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   1.08   0.00   0.12   0.50  A
+ Total instruction cycles:   0.83   0.00   0.25   0.50  A
  Shortest path cycles:       0.29   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G510]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.34   0.00   0.06   0.12  A
+ Total instruction cycles:   0.26   0.00   0.12   0.12  A
  Shortest path cycles:       0.04   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G51]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   2.17   0.00   0.12   0.50  A
+ Total instruction cycles:   1.67   0.00   0.25   0.50  A
  Shortest path cycles:       0.58   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G310]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.52   0.00   0.12   0.25  A
+ Total instruction cycles:   0.39   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Mali-G31]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   3.25   0.00   0.12   0.50  A
+ Total instruction cycles:   2.50   0.00   0.25   0.50  A
  Shortest path cycles:       0.88   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

[Immortalis-G715]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 66%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.23   0.00   0.03   0.12  A
+ Total instruction cycles:   0.15   0.00   0.06   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false
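To make the before/after numbers easier to compare, here is a small sketch that computes the relative drop in the arithmetic (A) column of "Total instruction cycles" for a handful of the GPUs listed above. The figures are copied from the malioc output in this PR; the dictionary and the helper `reduction_percent` are hypothetical names used only for this illustration. These are static per-shader estimates, so they should not be read as a prediction of frame rate.

```python
# (before, after) arithmetic cycles from "Total instruction cycles", column A,
# copied from the malioc output above for a subset of GPUs.
ARITH_CYCLES = {
    "Mali-T880": (6.33, 3.33),
    "Mali-G78": (0.45, 0.27),
    "Mali-G76": (1.08, 0.83),
    "Mali-G715": (0.23, 0.15),
}


def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent, to one decimal."""
    return round((before - after) / before * 100, 1)


if __name__ == "__main__":
    for gpu, (before, after) in ARITH_CYCLES.items():
        pct = reduction_percent(before, after)
        print(f"{gpu}: {before} -> {after} arithmetic cycles ({pct}% lower)")
```

Note that on the newer GPUs the varying (V) column rises in the same output (for example 0.12 to 0.25 on Mali-G78), so the arithmetic savings are partly traded for more varying work; the shader remains arithmetic-bound in every case reported.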