
Conversation

@toji (Member) commented Feb 9, 2022

A design that the editors have been discussing to resolve #2085 and some of #2119. Specifically, this design allows developers to both wait for an object (say, a pipeline) to be ready AND, if needed, pre-emptively use it before the promise resolves, accepting whatever stall that may incur.

This also avoids having to add ...Async() variants for every possible object type in the future if we decide we need them down the line.
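For illustration, a rough sketch of the usage pattern this design is meant to enable (the method name isReady() is just a placeholder here; the exact shape is defined in the diff and may differ):

const pipeline = device.createRenderPipeline(descriptor);

// Option 1: wait for creation to complete before the first use.
await pipeline.isReady();
pass.setPipeline(pipeline);

// Option 2: use the pipeline immediately and accept whatever stall the
// still-in-progress creation may cause.
pass.setPipeline(pipeline);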

One thing that's not explicitly mentioned in this CL but probably should be is an underlying assumption that the spec begins treating all GPU object creation as scheduling tasks. As Kai describes in https://hackmd.io/@webgpu/r1mBAEGCK#What-if-we-try-to-change-as-little-as-possible :

Describe everything in WebGPU in an async task model where nothing is ordered until an operation creates a dependency. In this model ALL shader/pipeline creation occurs on independent “threads” that just get "await"ed when you need them.


Preview | Diff

@toji requested review from kainino0x and kvark, February 9, 2022 00:33
github-actions bot commented Feb 9, 2022

Previews, as seen when this build job started (ddde53f):
WebGPU | IDL
WGSL
Explainer

@Kangz (Contributor) commented Feb 9, 2022

This proposal (and the idea quoted, and the doc linked) seems to prioritize theoretical purity above everything else. It is a huge and late structural change, and such changes always come with the risk that we find more problems with them later, whereas the current structure is fairly mature and its issues are known. But even setting aside "stop changing the spec", I think this has major issues:

  • It pretends that each object creation is an asynchronous operation that happens at some unspecified point, blocking only on the tasks for objects it depends on. This model will confuse developers about which operations are expensive enough to deserve being asynchronous, and which ones are cheap and can be done immediately.
  • Because all the objects are handed out directly, even when they are created asynchronously, every place that uses these objects has to check whether the object is available yet and potentially block. That's additional overhead for the checks and the potential blocking, and it also makes it more difficult for developers to reason about what's happening.

Unless there are extraordinary things that can't be fixed any other way, I don't think we should pursue this direction for v1. (And really, special-casing GPUShaderModule or adding a boolean to the descriptor to skip errors is ok.)

@kainino0x (Contributor) commented:

I think one thing to consider here is still making this change, but having it ONLY apply to GPUComputePipeline and GPURenderPipeline; that way we have a surface for adding this to other objects in the future if we want.

More importantly it solves the problem of having to choose ahead of time between synchronous and asynchronous pipeline creation, which I think is real (#2085).

@toji (Member, Author) commented Feb 9, 2022

Agreed. We don't have to generalize this for the entire API if we don't want to. We could do pipeline.isReady(), for instance. But the usability issues with create*PipelineAsync are worth addressing IMO.

@Kangz (Contributor) commented Feb 10, 2022

More importantly it solves the problem of having to choose ahead of time between synchronous and asynchronous pipeline creation, which I think is real (#2085).

Isn't that ok? Implementations are expected to cache pipeline compilation results, so sometimes you might end up compiling things twice because you decided late that you need a pipeline; again, that's ok? (Plus implementations can optimize even that case down to a single pipeline compilation.)
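To make the scenario concrete, here is a hedged sketch of that "decided late" pattern; the caching described is an implementation expectation, not something the spec currently guarantees:

// Kick off background compilation well before the pipeline is needed.
const pipelinePromise = device.createRenderPipelineAsync(descriptor);

// ...later: the pipeline is needed right now, but the promise hasn't resolved.
// A caching implementation could service this synchronous creation of an
// identical descriptor from the in-flight compilation instead of starting over.
const pipeline = device.createRenderPipeline(descriptor);
pass.setPipeline(pipeline);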

I think one thing to consider here is still making this change, but having it ONLY allow GPUComputePipeline and GPURenderPipeline, that way we have a surface for adding this to other objects in the future if we want.

You would need at least shader module in addition to pipelines.

@toji (Member, Author) commented Feb 10, 2022

Implementations are expected to cache pipeline compilation results, and sometimes you might end up compiling things twice because you decided late that you need a pipeline so again, that's ok? (plus implementations can optimize even that case to a single pipeline compilation).

If that's an expectation that we want developers to rely on, it should be in the spec, we should be writing tests for it to the best of our ability, and we should be advertising it loudly. Otherwise, as a developer, I won't know whether using create*PipelineAsync() and then changing my mind later on and issuing an identical create*Pipeline() call will let me piggyback on the previous call or will incur the pipeline creation cost twice. Even if I verify the behavior successfully on one device and one browser, I can't be sure it's going to work on all platforms and all browsers.

This ambiguity will cause developers to avoid using create*PipelineAsync().

The benefit of the pipeline.isReady() pattern is that the interface leaves no ambiguity about whether or not a second pipeline will be created. I know you're trying to avoid large changes at this point, and I agree with the need to do so, but if we truly want the async methods to be useful it's going to require a lot more spec/CTS (and possibly implementation) work anyway, AND we'll be relying on developers reading the spec to resolve behavioral ambiguities and learn best practices. I'd prefer to go with an API shape where the behavior is self-evident.

You would need at least shader module in addition to pipelines.

Can you do anything with a shader module besides create a pipeline with it or get the compilation messages (already an async operation)? I would expect that .isReady() for a pipeline would naturally include waiting for a shader module to finish compiling, just as I assume create*PipelineAsync() already does.

@austinEng (Contributor) commented:

What about a proposal in between?

// Today's synchronous path: creation definitely happens now.
const pipeline = device.createRenderPipeline(descriptor);

// Proposed asynchronous path (syntax up for bikeshedding): the handle comes back
// immediately, alongside a promise that resolves once the pipeline is ready.
const [pipeline, isReady] = device.createRenderPipelineAsync(descriptor);
await isReady;

createRenderPipeline remains explicitly clear that creation is definitely happening now, not off-thread in the background. For developers who want to compile something and use it immediately, it's clear what's happening. This avoids an unnecessary thread dance where, without separate calls, the implementation must always create the pipeline in the background, only to realize microseconds later that that's not what the developer actually wanted.

createRenderPipelineAsync makes it clear that the compilation will occur on a thread, and returns to you a Promise to indicate when it's done. However, unlike the current spec, it gives you the handle to the GPURenderPipeline immediately if you choose later that you don't actually want to wait.
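A short usage sketch of that second path, showing the handle being used without waiting (this is the shape proposed above, not the current spec):

const [pipeline, isReady] = device.createRenderPipelineAsync(descriptor);

// Either wait for compilation to finish...
isReady.then(() => pass.setPipeline(pipeline));

// ...or decide the pipeline is needed right away and accept a possible stall.
pass.setPipeline(pipeline);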

The implementation still has to make sure that asynchronously created pipelines check a mutex whenever the pipeline is used, but it would only need to happen for pipelines created through that path (so guarded by a branch or virtual method call). This is still a fair amount of implementation complexity to manage.

@kainino0x (Contributor) commented:

What about a proposal in between?

I think this is a solid option.

createRenderPipeline remains explicitly clear that it is definitely happening now, and not off thread in the background.

I'm not sure this is a good invariant to uphold. Yes, it's clear that it's happening now, but is it going to block a bunch of unrelated stuff before I use the pipeline? That seems bad, because naively I'd try to hoist my pipeline creation as early as possible so the latency can be hidden by other commands. Having it be "synchronous" (on the device timeline) means that doesn't work.
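For example (hypothetical helper names, purely to illustrate the hoisting pattern):

// Hoist pipeline creation early, hoping compilation overlaps with other setup work.
const pipeline = device.createRenderPipeline(descriptor);

// Unrelated setup submitted afterwards (placeholder names).
uploadTextures(device.queue);
createBindGroups(device);

// Much later: the first actual use of the pipeline.
pass.setPipeline(pipeline);

If createRenderPipeline is synchronous on the device timeline, the unrelated work above may end up waiting behind the compilation instead of hiding its latency.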

The implementation still has to make sure that asynchronously created pipelines check a mutex whenever the pipeline is used, but there it would only need to happen for pipelines created using that path (so guarded by a branch or virtual method call).

We can also just set a boolean once the compilation is completed so that even async pipelines don't have to check the mutex after they're done. Not sure this saves anything though since it may still have to be an atomic load.

@kainino0x (Contributor) commented:

I would expect that .isReady() for a pipeline would naturally include waiting for a shader module to finish compiling, just as I assume create*PipelineAsync() already does.

I agree, I don't think it's needed for createShaderModule. To solve #2119 we only have to get createShaderModule off the "main thread", which I don't think necessarily requires this.

@toji (Member, Author) commented Feb 10, 2022

I talked with @Kangz this morning for a while about complexity concerns from this and other API changes in the pipeline.

One point of complexity that I was unaware of is that in D3D12 and Vulkan pipeline creation is synchronous, and APIs that want it to not be a blocking operation are expected to manage their own threading. Thus one of the benefits of the current split sync/async design is that the synchronous version doesn't have to spin up a new thread for every pipeline unless it's specifically requested via the async method. This definitely adds to the complexity of implementing the .isReady() model, in which every pipeline would have to be created asynchronously and synchronized when used. While that's not infeasible, it does represent more implementation work than I was anticipating. Thanks for helping me understand!

Also, while I think it can't be avoided in some cases, I do fundamentally agree with the principle that we should avoid large-scale changes to the API at this stage, as we get closer to shipping. I think the above tips the scales for me on whether I consider this worth doing prior to shipping or not.

I do think we can and should make strong guarantees (not non-normative notes) in the spec about the caching/syncing behavior of duplicate pipelines, which is probably the most viable route at this point.

What about a proposal in between?

That's an intriguing compromise! I think the only reason I'd be wary of it is that in JavaScript, unlike languages like Python, returning tuples from browser APIs is fairly uncommon, and so it may catch some developers by surprise. Specifically, I think you'd see a lot of:

const pipeline = await device.createRenderPipelineAsync(desc);

// Much later, probably in a different file entirely...
pass.setPipeline(pipeline);

And the error is going to come not from the createRenderPipelineAsync() call, which is valid (you can await non-promises), but from setPipeline(), and the WebIDL-generated error message is likely to be unintuitive.

That's not to say it's not a viable approach, it's just the primary concern I have.

@kainino0x (Contributor) commented:

Yeah, I was thinking about that and would probably do something slightly different (also because WebIDL can't express tuples), like:

const pipeline = device.createRenderPipelineConcurrently(desc, callback);

or add an extra member to these particular render pipelines:

const pipeline = device.createRenderPipelineConcurrently(desc);
pipeline.ready
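A quick sketch of how that second variant might read in practice (method and member names as proposed above; drawWithPipeline is a placeholder):

const pipeline = device.createRenderPipelineConcurrently(desc);

// Wait for compilation when a stall would be unacceptable...
pipeline.ready.then(() => drawWithPipeline(pipeline));

// ...or use the handle immediately and accept a possible stall.
pass.setPipeline(pipeline);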

@kvark (Contributor) commented Feb 15, 2022

Thus one of the benefits of the current split sync/async design is that the synchronous version doesn't have to spin up a new thread for every pipeline unless it's specifically requested via the async method.

I think this reasoning is flawed. Implementations never have to spin up any threads, even for the async methods.
Moreover, without isReady() implementations are still highly incentivized to do work internally on other threads, e.g. they can do all pipeline creation on a separate thread, and all shader module compilation if that turns out to be slow (#2119). These internal optimizations would be transparent to users and would likely benefit them, so there is an incentive to do them. And once that's done, we're in a situation where the synchronous API is a lie anyway.

I like isReady() because it allows us to say "ok, we admit, creation of this buffer/texture/shader module/etc. is really asynchronous" in the future without breaking user code, and without requiring any changes from users to take advantage of our internal parallelism. I think isReady() is still better (for the reasons outlined) than either what we have today (createPipelineAsync being an exception to everything else) or the in-between proposal (for the same reason).
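As a hedged illustration of that forward-compatibility point (purely hypothetical; none of this is in the spec today):

// Today: creation is treated as synchronous, and this code just works.
const texture = device.createTexture(textureDescriptor);
device.queue.writeTexture({ texture }, data, dataLayout, size);

// If texture creation were later admitted to be asynchronous, the same code keeps
// working (with a possible stall), while code that wants to avoid the stall opts in:
await texture.isReady?.();  // no-op wherever isReady() doesn't exist
device.queue.writeTexture({ texture }, data, dataLayout, size);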

@toji (Member, Author) commented Feb 28, 2022

After some further discussion by the editors, we think that while there's still a usability issue here in terms of pipeline creation this particular proposal isn't the way to solve it. Closing, though we expect additional discussion on the subject of pipeline creation ergonomics, possibly to be implemented post-v1.

