Skip to content

Conversation

@kainino0x
Copy link
Contributor

@kainino0x kainino0x commented Mar 1, 2021

Goal

Make requestDevice() behave more similarly across scenarios, so developers don't accidentally use it unportably. Do this by making "less extreme" scenarios (like device.destroy()) look the same as "more extreme" scenarios (like eGPU unplug or TDR). This prevents developers from accidentally writing code that works in less extreme cases but fails in more extreme cases.

Solution

Adds a global internal GPU.[[adapters_active]] flag. If it's false, no adapter can create a device.
It is gets set to false in scenarios described in the PR.


Preview | Diff

This internal slot has no enforced ("must") behavior, but is intended to
enhance compatibility between different WebGPU implementations/hardware
and push developers toward an optimal path.
@kvark kvark self-requested a review March 1, 2021 22:20
@kvark
Copy link
Contributor

kvark commented Mar 1, 2021

I got some concerns here after initial look. It prevents UA to do what it wants in some cases. For example, when an adapter is added, maybe the UA choice would otherwise be keep the previously acquired adapters alive and well. But following this logic would make the UA to fail the device creation.

@kainino0x
Copy link
Contributor Author

kainino0x commented Mar 1, 2021

Editors meeting: had a long conversation about this and we're still thinking on it.

@kainino0x kainino0x marked this pull request as draft March 9, 2021 01:07
@kainino0x kainino0x changed the title Add GPU.[[adapters_up_to_date]] Add GPU.[[adapters_active]] Mar 9, 2021
@kainino0x kainino0x requested review from austinEng and toji March 9, 2021 02:21
@kainino0x kainino0x marked this pull request as ready for review March 9, 2021 02:21
@kainino0x
Copy link
Contributor Author

Revised based on discussion in editing meeting.

@Kangz
Copy link
Contributor

Kangz commented Mar 9, 2021

The general idea looks good but I have a slight concern about the composability of things in asynchronous scenarios. Consider the following code:

let currentDevice = null;

async function asyncReinit() {
    const adapter = await gpu.requestAdapter();
    currentDevice = adapter.requestDevice();
    initWithDevice(device);
}

resetButton.onclick = () => {
     asyncReinit();
     currentDevice.destroy();
}

This code look like it should properly destroy the device and reinitialize it, and that the order of asyncReinit and .destroy() don't matter. But it actually does because of this additional boolean state introduced in this PR. (and the code in the example will fail to create a device).

This might be slightly worse in pages using multiple WebGPU components that don't know about each other. The timing of things could lead to hard to reproduce issues where devices fail creation. This issue isn't a huge blocker but it would be nice to fix it if it's not too hard.

@kainino0x
Copy link
Contributor Author

This might be slightly worse in pages using multiple WebGPU components that don't know about each other. The timing of things could lead to hard to reproduce issues where devices fail creation. This issue isn't a huge blocker but it would be nice to fix it if it's not too hard.

FWIW, a previous draft of my rework had a GPUAdapterProvider and put the [[active]] state on that. Each applet on a page would create its own GPUAdapterProvider once at startup (providing its adapter options there) and receive adaptersadded/adapterschanged events on it.

@kainino0x
Copy link
Contributor Author

This code look like it should properly destroy the device and reinitialize it, and that the order of asyncReinit and .destroy() don't matter. But it actually does because of this additional boolean state introduced in this PR. (and the code in the example will fail to create a device).

I've fixed the PR so it actually ever sets [[adapters_active]] to true. It occurs during resolution of the requestAdapter result. So this example code should be fine, but it's still possible to have a race if destroy() is called asynchronously in such a way it could occur between the end of requestAdapter and the end of requestDevice.

I think the design is actually reasonably robust to this. If this does happen flakily, an already-lost device is returned, and the application should recover from that in the normal way by trying again.

@Kangz
Copy link
Contributor

Kangz commented Mar 10, 2021

Ok, I think that's a little window for conflicts so it's probably fine. You previously had a proposal for making requestDevice into createDevice. I don't know if it is still current but it would fix the last possible flakes if the application creates the device in the promise resolution microtask.

[[#adapter-selection]] and the criteria in |options|.

1. |promise| [=resolves=] with a new {{GPUAdapter}} encapsulating |adapter|.
1. Set `navigator.gpu.`{{GPU/[[adapters_active]]}} to `true`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so if the user wants to ignore this logic entirely, they'd just

  • hold onto GPUAdapter after a device is lost
  • call requestAdapter without bothering to check the result
  • use the old GPUAdapter like nothing happened

Isn't the intent here to prevent this behavior? It looks to me that some users may do just this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they could do this, but it would be harder to do that accidentally.

We could be much more strict and just make new GPUAdapter objects, but we'd lose expando properties and break equality comparisons.

@kainino0x
Copy link
Contributor Author

kainino0x commented Mar 15, 2021

Editors discussed this more, and I think we are going to make requestAdapter(s) return new GPUAdapter object(s) every time, as this is the most forward-compatible. Instead of [[adapters_active]] we can have existing GPUAdapter objects become permanently invalidated. It also reduces the risk of multiple apps on a page stumbling over each other (e.g. overwriting each other's expando properties).

In the future we can consider allowing requestAdapter(s) to return the same GPUAdapter object(s) multiple times, which would allow applications to do equality comparisons or hang expando properties off of them. This would be a nearly-non-breaking change, less risky than going the other way. Alternatively we can start guaranteeing that the name field will be a unique identifier so apps can compare that instead (maintaining the isolation between apps on a page).

@kainino0x kainino0x closed this Mar 15, 2021
@kainino0x kainino0x deleted the adapters_up_to_date branch March 15, 2021 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants