
[Flutter GPU] Design: Improve uniform upload workflow. #150953

@bdero

Description


Flutter GPU has a shared buffer concept called HostBuffer (which could perhaps use a more representative name, like SharedBuffer).

What's wrong with HostBuffer

The docstring below describes how HostBuffer used to work, but the behavior of Impeller's underlying HostBuffer has since changed significantly (for the better), so the description is no longer accurate (see below for how it changed).

Outdated docstring:

/// [HostBuffer] is a [Buffer] which is allocated on the host (native CPU
/// resident memory) and lazily uploaded to the GPU. A [HostBuffer] can be
/// safely mutated or extended at any time on the host, and will be
/// automatically re-uploaded to the GPU the next time a GPU operation needs to
/// access it.
///
/// This is useful for efficiently chunking sparse data uploads, especially
/// ephemeral uniform data that needs to change from frame to frame.
///
/// Different platforms have different data alignment requirements for accessing
/// device buffer data. The [HostBuffer] takes these requirements into account
/// and automatically inserts padding between emplaced data if necessary.

HostBuffer no longer performs this automagical host->device sync of lazily uploading to the GPU.

On SoCs like Apple Silicon, DeviceBuffer memory is shared between the host and the GPU, so no copying needs to occur on flush. Under the hood, HostBuffer is now just a helper that tracks a block allocator (with size settings tuned for Flutter).
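Concretely, "tracks a block allocator" boils down to bump-pointer allocation with alignment padding between emplaced entries. A minimal sketch of the offset math in plain Dart (the names here are illustrative, not Impeller's internals; 256 bytes stands in for a typical uniform offset alignment):

```dart
// Round [offset] up to the next multiple of [alignment] (a power of two).
int alignUp(int offset, int alignment) =>
    (offset + alignment - 1) & ~(alignment - 1);

/// Illustrative bump allocator: each emplace returns an aligned offset
/// into one big backing buffer, inserting padding as needed.
class BumpAllocator {
  BumpAllocator(this.alignment);

  final int alignment;
  int _cursor = 0;

  int emplace(int sizeInBytes) {
    final int offset = alignUp(_cursor, alignment);
    _cursor = offset + sizeInBytes;
    return offset;
  }
}

void main() {
  final allocator = BumpAllocator(256);
  print(allocator.emplace(24)); // 0
  print(allocator.emplace(80)); // 256 -- padded past the first allocation
}
```

This is exactly the bookkeeping a user could do themselves if the platform's alignment requirement were surfaced, which motivates the recommendation below.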

Usage example:

/// 1. Create the HostBuffer.

gpu.HostBuffer transients = gpu.gpuContext.createHostBuffer();

/// 2. Append some data to it.
///    Convenient for vertices or uniforms that need to be generated on the fly.
///
///    A reference to the newly appended data is returned in the form of a
///    `BufferView`, which holds an offset into the buffer.
///    At this point, the data is given to the underlying `DeviceBuffer`.

final gpu.BufferView vertices = transients.emplace(float32(<double>[
	-0.5, 0.5, //
	0.0, -0.5, //
	0.5, 0.5, //
]));
final gpu.BufferView vertInfoData = transients.emplace(float32(<double>[
	1, 0, 0, 0, // mvp
	0, 1, 0, 0, // mvp
	0, 0, 1, 0, // mvp
	0, 0, 0, 1, // mvp
	0, 1, 0, 1, // color
]));

// ... create a RenderPass encoder and set up command state ...

/// 3. Bind the data by passing in the `BufferViews` that were just created
///    above (literally just `DeviceBuffer`s under the hood today).

encoder.bindVertexBuffer(vertices, 3);

final gpu.UniformSlot vertInfo =
  pipeline.vertexShader.getUniformSlot('VertInfo');
encoder.bindUniform(vertInfo, vertInfoData);

/// 4. When the draw is encoded, the bound sections of the HostBuffer are
///    consumed directly by the GPU; no separate upload step happens here.

encoder.draw();

Recommendation

  • Surface the platform uniform alignment offset.
  • Give DeviceBuffer a flush interface and accurately document how it works in shared memory/non-shared memory scenarios.
  • Remove the current HostBuffer concept. Instead, implement a convenient block allocator library in Dart (perhaps as an external utility library at first -- this kind of thing can be graduated to Flutter GPU later).

The user knows their data best. We can provide a convenient allocator library that can fit most use cases, but the user must be able to build their own.
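As a rough sketch of what a user-built transient allocator could look like on top of that surface area: `minimumUniformByteAlignment` and `DeviceBuffer.flush` are hypothetical names for the proposed additions, not existing Flutter GPU API.

```dart
import 'dart:typed_data';

import 'package:flutter_gpu/gpu.dart' as gpu;

/// Hypothetical user-built transient buffer over a single DeviceBuffer.
/// Assumes the proposal lands: `minimumUniformByteAlignment` and `flush`
/// are illustrative names, not existing API.
class TransientBuffer {
  TransientBuffer(int capacityInBytes)
      : _buffer = gpu.gpuContext.createDeviceBuffer(
            gpu.StorageMode.hostVisible, capacityInBytes);

  final gpu.DeviceBuffer _buffer;
  int _cursor = 0;

  gpu.BufferView emplace(ByteData data) {
    // Hypothetical: the surfaced platform uniform alignment.
    final int alignment = gpu.gpuContext.minimumUniformByteAlignment;
    final int offset = (_cursor + alignment - 1) & ~(alignment - 1);
    _buffer.overwrite(data, destinationOffsetInBytes: offset);
    _cursor = offset + data.lengthInBytes;
    return gpu.BufferView(_buffer,
        offsetInBytes: offset, lengthInBytes: data.lengthInBytes);
  }

  /// Hypothetical: a copy on non-shared-memory devices, a no-op on
  /// shared-memory SoCs like Apple Silicon.
  void flush() => _buffer.flush();
}
```

A library like this could ship externally first, and users with unusual allocation patterns (ring buffers, per-frame arenas, etc.) could write their own variant against the same flush interface.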

Metadata


Labels

P2: Important issues not at the top of the work list
engine: flutter/engine related. See also e: labels.
flutter-gpu
team-engine: Owned by Engine team
triaged-engine: Triaged by Engine team
