-
-
Notifications
You must be signed in to change notification settings - Fork 378
Description
Suppose task 1 does:
try:
raise RuntimeError("whoops")
except Exception as task1_exc:
my_cscope.cancel()Meanwhile, task 2 is sitting in:
with my_cscope:
await trio.sleep_forever()Internally, my_cscope.cancel() marks the scope as cancelled and then recalculates state, which triggers a call to task2._attempt_delivery_of_any_pending_cancel(), which calls task2._attempt_abort(raise_cancel), which calls sleep_forever's abort_func, which succeeds. And since it succeeds, it then immediately calls reschedule(task2, outcome.capture(raise_cancel). Which invokes raise_cancel.
And since this is all happening way down inside task 1's call to my_cscope.cancel(), it's all happening with the Python interpreter in a state where task1_exc is marked as pending. So when raise_cancel does raise Cancelled._create(), then the new Cancelled exception in task 2 gets its __context__ set to task1_exc. Which is weird and unexpected.
(It's also inconsistent, because it only happens on one particular abort pathway -- if task 2 had been sleeping in an operation that wasn't cancellable, or had deferred cancellation like an IOCP/io_uring operation, then it would still eventually end up with a Cancelled exception but with __context__ = None.)
Given our existing APIs, I think the only reasonable solution is to tweak _attempt_abort so it does:
error = capture(raise_cancel)
error.error.__context__ = None
self._runner.reschedule(self, error)It's also worth thinking a bit about how we got here: this only happens because of our weird API where we represent a pending cancellation as a callable that raises the actual exception. The reason for this is:
- For operations that have deferred cancellation, like IOCP/io_uring, the
abort_funchas to request the cancellation, and then some time later it gets notified about whether the cancellation succeeded. If it did, then it now has to raiseCancelled. But:- Now we're in user code, without any special access to Trio's internals
Cancelledhas no public constructor; and besides, the abort might have been caused by either aKeyboardInterruptor aCancelled, so we have to pass in the exception to the user code somehow- If it's a
KeyboardInterrupt, those have exactly-once semantics, so the core code needs to know whether the user code ended up raising it or not - So that's why we end up representing the exception-to-be-raised as a function that does the raise -- so the core code can find out whether the exception is raised or not
In #733 we decided that we liked the idea of not injecting KI through the cancellation system, but instead injecting it into the root nursery, so that it acts like a regular exception and isn't intertwined with the cancellation system the same way. If we did that, then we could (eventually) change the abort_func interface so it just takes the exception to be raised (or just provide another way to raise Cancelled). And then this whole thing could have been avoided.