[Impeller] More efficient command encoding by removing deferred encoding. #140804

Previously explored in https://github.com/flutter/engine/pull/48848

## Background

Impeller has a deferred command recording design. That is, given a command like "draw rectangle":

1. SolidColorContents::Render is executed. This constructs an impeller::Command object, which contains the buffer and uniform bindings and is stored in the impeller::RenderPass.

... More commands are recorded and the render pass is finished...

2. The impeller::RenderPass dispatches to construct a vk::RenderPass (or the equivalent MTL structure), which constructs the actual command buffer. The impeller::Command and binding objects are converted to real bindings.
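
The two steps above can be sketched roughly as follows. This is a simplified illustration, not Impeller's actual code: `FakeNativeEncoder`, `DeferredCommand`, `DeferredPass` and `RecordAndEncodeExample` are hypothetical names, and the "native" encoder just logs strings instead of talking to Vulkan or Metal.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a native (Vulkan/Metal) command encoder.
// It only logs the calls it receives, in order.
struct FakeNativeEncoder {
  std::vector<std::string> calls;
};

// Hypothetical intermediate command, loosely modeled on impeller::Command.
struct DeferredCommand {
  std::string pipeline;
  std::vector<std::string> bindings;  // Heap allocation per command.
};

class DeferredPass {
 public:
  // Step 1: recording only stores the command; nothing is encoded yet.
  void AddCommand(DeferredCommand cmd) { commands_.push_back(std::move(cmd)); }

  // Step 2: once the pass is finished, every stored command is replayed
  // into the native encoder.
  void EncodeCommands(FakeNativeEncoder& encoder) const {
    for (const DeferredCommand& cmd : commands_) {
      encoder.calls.push_back("set_pipeline:" + cmd.pipeline);
      for (const std::string& binding : cmd.bindings) {
        encoder.calls.push_back("bind:" + binding);
      }
      encoder.calls.push_back("draw");
    }
  }

  std::size_t PendingCommandCount() const { return commands_.size(); }

 private:
  std::vector<DeferredCommand> commands_;
};

// Records one "draw rectangle" style command, then encodes it.
FakeNativeEncoder RecordAndEncodeExample() {
  DeferredPass pass;
  DeferredCommand cmd;
  cmd.pipeline = "solid_color";
  cmd.bindings.push_back("vertex_buffer");
  cmd.bindings.push_back("frag_uniforms");
  pass.AddCommand(std::move(cmd));

  FakeNativeEncoder encoder;
  // Nothing has reached the encoder while the command was being recorded.
  assert(encoder.calls.empty());
  pass.EncodeCommands(encoder);
  return encoder;
}
```

The point of the sketch is that every command exists twice: once as an intermediate object owned by the pass, and once as native encoder calls.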

## Overview

This process works reasonably well, but it adds measurable overhead. In theory, we could record directly to the native Metal and Vulkan command buffers instead of to an Impeller structure, which would remove all of this intermediate allocation.

Problems:

* The workload requires a non-trivial number of heap allocations per command. These allocations tend to be fairly fragmented and don't work well with the Scudo allocator on Android.
* Extra state: all commands currently carry a stencil rect despite the fact that it is completely unused by Impeller. We may find that we're unwilling to add features that would be useful for Flutter GPU because of the additional cost Impeller would incur.
* We need to add more state setting commands, such as barriers (https://github.com/flutter/flutter/issues/140798), for correct rendering with compute. Doing so through an intermediate structure requires even more allocation.

In contrast, if we record directly to the native cmd buffer, then we remove all extra allocation for Metal/Vulkan. This leaves us with two general problems and one Vulkan-specific problem to fix:

* HostBuffer allocation happens once at the end of render pass recording, because the current HostBuffer strategy works by flushing to a device buffer at that point. Recording directly would require us to implement https://github.com/flutter/flutter/issues/138161 or similar.
* Vulkan will need to guess how many descriptor sets to create, but we can handle this with recycling.
* We need to change the cmd state setting API to be stateful instead of creating a command object.
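
One way to read the "guess and recycle" idea for descriptor sets is sketched below. This is an assumption about the approach, not a description of Impeller's Vulkan backend: `RecyclingDescriptorCache` is a hypothetical class, it makes no Vulkan calls, and doubling the capacity merely stands in for allocating an additional descriptor pool.

```c++
#include <cassert>
#include <cstddef>

// Hypothetical helper: starts from a guessed descriptor set count, grows
// when the guess turns out too small, and keeps the grown capacity for
// later frames.
class RecyclingDescriptorCache {
 public:
  explicit RecyclingDescriptorCache(std::size_t initial_guess)
      : capacity_(initial_guess) {
    assert(initial_guess > 0);
  }

  // Returns the index of the next free descriptor set slot.
  std::size_t Acquire() {
    if (used_ == capacity_) {
      capacity_ *= 2;  // Stand-in for allocating another native pool.
      ++grow_count_;
    }
    return used_++;
  }

  // Called once the frame's GPU work is done. Slots are kept for reuse
  // rather than freed, so the next frame starts with the larger capacity.
  void Recycle() { used_ = 0; }

  std::size_t capacity() const { return capacity_; }
  std::size_t grow_count() const { return grow_count_; }

 private:
  std::size_t capacity_;
  std::size_t used_ = 0;
  std::size_t grow_count_ = 0;
};
```

Under this scheme a bad initial guess only costs extra allocation during the first frames; once the cache has grown to fit the workload, later frames reuse the existing slots.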


### Example

We need to change the intermediate state setting so that it maps directly to the underlying cmd state. Something like this:

Before
```c++
Command cmd;
cmd.pipeline = context.getPipeline();
BindData(cmd, data);
pass.addCommand(std::move(cmd));
```

After
```c++
pass.setPipeline(context.getPipeline());
pass.bindData(metadata, data);
pass.draw();
```
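
To make the "After" shape concrete, here is a minimal sketch of a stateful pass that forwards each call straight to the encoder. `FakeNativeEncoder` and `DirectPass` are hypothetical, the string arguments are placeholders for real pipeline and binding objects, and a real backend would issue Vulkan or Metal calls instead of logging.

```c++
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a native (Vulkan/Metal) command encoder.
struct FakeNativeEncoder {
  std::vector<std::string> calls;
};

// Hypothetical stateful pass: each call is encoded immediately, so no
// intermediate command object is built or stored.
class DirectPass {
 public:
  explicit DirectPass(FakeNativeEncoder& encoder) : encoder_(encoder) {}

  void setPipeline(const std::string& pipeline) {
    encoder_.calls.push_back("set_pipeline:" + pipeline);
    has_pipeline_ = true;
  }

  void bindData(const std::string& slot) {
    encoder_.calls.push_back("bind:" + slot);
  }

  void draw() {
    // With a stateful API, ordering mistakes have to be caught at call time.
    assert(has_pipeline_);
    encoder_.calls.push_back("draw");
  }

 private:
  FakeNativeEncoder& encoder_;
  bool has_pipeline_ = false;
};
```

Because nothing is buffered, the encoder sees `setPipeline` as soon as it is called, which is why the HostBuffer and descriptor set questions above need to be resolved before recording starts rather than at the end of the pass.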
