Specify destroy() methods and behavior around context loss and error reporting#744
Specify destroy() methods and behavior around context loss and error reporting#744fdwr merged 32 commits intowebmachinelearning:mainfrom inexorabletash:context-lost
Conversation
Based on @mingmingtasd's work in the Chromium prototype implementation. For #477
|
Some points for discussion:
@mingmingtasd and @reillyeon could you do an initial review? |
|
Thanks! @inexorabletash @reillyeon
I will check context lost for all of the synchronous and asynchronous actions depending on the context in my chromium CL. |
|
Something this PR doesn't do is specify the behavior around rejecting in-flight asynchronous operations. |
Thoughts on how to do that and to what level of detail?
One generic approach for both of those would be to replace: "Queue an ML task with global to resolve promise with ..." with:
... which I think covers the script-observable behavior, but not that the async steps internally should fail. |
|
There are two separate considerations: what happens to the promise returned by a method and what happens to an asynchronous operation itself. Operations like |
I suppose the following step of execute graph could handle the device lost error:
A question is if the error is device lost, should it run the "context-lost" steps? Like the Chromium prototype does in
Or we could just say "If that returns an error, then queue an ML task with global to reject promise with an 'OperationError' DOMException" similar to
The new steps may not run because these steps may already be aborted (by "abort these steps") if previous steps fail . |
+1. Script timeline:
Device/queue timeline:
Note: impl. is always free to abort pending async ops immediately (and release buffers). |
I think this would be script observable - i.e. what order do the |
@reillyeon @huningxin @bbernhar I think the DirectML backend here has considered this, |
|
I think we've settled on the idea that destroying an Destroying an |
I have submitted a CL to expose |
|
(not actually ready for review, but I wanted to force an updated preview to be generated) |
| The <dfn method for=MLGraph>destroy()</dfn> method steps are: | ||
| </summary> | ||
| 1. If [=this=].{{MLGraph/[[isDestroyed]]}} is true, then abort these steps. | ||
| 1. Set [=this=].{{MLGraph/[[isDestroyed]]}} to true. |
There was a problem hiding this comment.
MLGraph may own device (e.g. GPU/NPU) resources. The destroy steps may need to release them on the context timeline. Worth adding a note? (Like MLTensor.destory() does)
There was a problem hiding this comment.
Added in 0fc4151 but it's a very generic note, it doesn't mention the timeline. Should it?
There was a problem hiding this comment.
We can be a bit more explicit and queue a task on the context timeline to "mark resources owned by this graph as freeable".
index.bs
Outdated
| 1. If |tensor|.{{MLTensor/[[descriptor]]}}.{{MLTensorDescriptor/readable}} is false, then return [=a new promise=] [=rejected=] with a {{TypeError}}. | ||
| 1. Let |promise| be [=a new promise=]. | ||
| 1. Enqueue the following steps to |tensor|.{{MLGraph/[[context]]}}.{{MLContext/[[timeline]]}}: | ||
| 1. Enqueue the following steps to |tensor|.{{MLGraph/[[context]]}}.{{MLContext/[[timeline]]}}, which [=/abort when=] [=this=] [=MLContext/is lost=]: |
There was a problem hiding this comment.
Is |tensor|.{{MLGraph/[[context]]}} a typo (an old issue)? Should it be |tensor|.{{MLTensor/[[context]]}} or just [=this=].{{MLContext/[[timeline]]}}? If it is a typo, the line 1096 also needs to be fixed. For example,
| 1. Enqueue the following steps to |tensor|.{{MLGraph/[[context]]}}.{{MLContext/[[timeline]]}}, which [=/abort when=] [=this=] [=MLContext/is lost=]: | |
| 1. Enqueue the following steps to [=this=].{{MLContext/[[timeline]]}}, which [=/abort when=] [=this=] [=MLContext/is lost=]: |
There was a problem hiding this comment.
Another question is whether it should check context lost earlier? Like createTensor steps do, e.g.
1. If [=this=] [=MLContext/is lost=], then return [=a new promise=] [=rejected=] with an "{{InvalidStateError}}" {{DOMException}}.
There was a problem hiding this comment.
Given that the context loss will destroy every associated tensor, I don't think that is needed. I'm not opposed to adding it explicitly though.
There was a problem hiding this comment.
For the "type confusion" I added a lint tool test and caught several more places. It's not perfect, but it is progress!
| The <dfn method for=MLGraph>destroy()</dfn> method steps are: | ||
| </summary> | ||
| 1. If [=this=].{{MLGraph/[[isDestroyed]]}} is true, then abort these steps. | ||
| 1. Set [=this=].{{MLGraph/[[isDestroyed]]}} to true. |
There was a problem hiding this comment.
We can be a bit more explicit and queue a task on the context timeline to "mark resources owned by this graph as freeable".
Co-authored-by: Reilly Grant <[email protected]>
…ntext timeline
RafaelCintron
left a comment
There was a problem hiding this comment.
LGTM. @inexorabletash , thank you very much for putting this together.
|
Also looks good from WebKit's side @inexorabletash - informal approval from discussion in today's call |
Based on @mingmingtasd's work in the Chromium prototype implementation.
MLContextgains adestroy()method and alostpromise attribute. Callingdestroy()releases resources for the context itself and any associatedMLGraphs andMLTensors, and any outstandingbuild()requests for an associatedMLGraphBuilderare aborted. Thelostattribute then resolves, and any future use of the context fails. If the context is lost for other reasons, the same behavior occurs (releasing associated resources), and thelostpromise provides an implementation-defined message explaining the reason.MLGraphsimilarly gains adestroy()attribute. When destroyed - either by an explicit call, or because the associated context is lost - then any outstandingdispatch()requests using the graph are aborted.This also modifies the omnipresent "has built" tests on
MLGraphBuildermethods to be a "can built" test which also checks that the builder's context is not lost.For #477
Preview | Diff