Failing to close a CancelScope produces super confusing errors #882

@Badg

(update: this appears to have been due to failing to close the CancelScope; see: #882 (comment))
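
For anyone landing here with the same traceback, here is a minimal sketch of the general mistake described in the update: entering a context manager by hand and never exiting it. `ToyScope`, `leaky` and `tidy` are hypothetical names made up for illustration; this is a stand-in for a scope-like object, not trio's actual CancelScope, and it does not reproduce the KeyError itself.

```python
from contextlib import ExitStack


class ToyScope:
    """Hypothetical stand-in for a scope object; NOT trio's implementation."""

    open_scopes = []  # scopes that were entered but not yet exited

    def __enter__(self):
        ToyScope.open_scopes.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        ToyScope.open_scopes.remove(self)
        return False  # do not swallow exceptions


def leaky():
    # Entered by hand and never exited: the scope is still open on return.
    ToyScope().__enter__()


def tidy():
    # ExitStack guarantees __exit__ runs, even if the body raises.
    with ExitStack() as stack:
        stack.enter_context(ToyScope())


tidy()
leaked_after_tidy = len(ToyScope.open_scopes)

leaky()
leaked_after_leaky = len(ToyScope.open_scopes)
```

A plain `with` block gives the same guarantee as `ExitStack`; the stack is only useful when the scope has to be opened in one place and closed in another.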

I'm not sure whether this is ultimately a dupe of #552, so feel free to close if so. I'm also having a hell of a hard time figuring out how to reliably reproduce this, and it may actually be an issue in trio_asyncio.

What's the issue?

During teardown, there's a KeyError in trio._core._run:CancelScope._remove_task.

Steps to reproduce

I haven't been able to repro this outside of one specific script, but I've tried a lot of different things. For the sake of brevity, I'll call that script "voidreader", since that's what I'm actually calling it and if you can surmise any secret sauce from that name then you deserve to know anyways. Background:

  • Voidreader is starting from a call to trio.run
  • It runs a separate logging system in a companion thread, also running via trio.run, but I'm pretty confident that isn't related, because I still see the error when I comment out the logging system. However, that library does some patching of the stdlib logging module, which... may or may not be relevant?
  • Voidreader starts some async context managers pretty much immediately. One of them runs a Quart server inside of a call to aio_as_trio; there are other uses of aio_as_trio as well
  • Voidreader is also using asks for some stuff, but I don't think that's relevant either
  • Control-C interrupts stop "just working" inside voidreader. It takes me several attempts to get it to actually exit, and then I see the traceback included below

I've tried reproing with stuff like this:

from contextlib import asynccontextmanager
import asyncio

from trio_asyncio import aio_as_trio
from trio_asyncio import trio_as_aio
import trio
import trio_asyncio


@asynccontextmanager
async def nested():
    print('fly, you fools!')
    yield
    await aio_as_trio(asyncio.sleep(7))


async def main():
    async with trio_asyncio.open_loop() as loop:
        async with nested():
            await trio.sleep(1)


if __name__ == '__main__':
    trio.run(main)

but to no avail. Note that Control-C still works here and raises a KeyboardInterrupt. My hunch is that this may have something to do with the Quart stuff that's running in an asyncio context.

Traceback

Traceback (most recent call last):
  File "C:\Users\Nick\venv\foo\lib\site-packages\trio\_core\_run.py", line 1323, in run
    run_impl(runner, async_fn, args)
  File "C:\Users\Nick\venv\foo\lib\site-packages\trio\_core\_run.py", line 1471, in run_impl
    runner.task_exited(task, final_outcome)
  File "C:\Users\Nick\venv\foo\lib\site-packages\trio\_core\_run.py", line 943, in task_exited
    task._cancel_stack[-1]._remove_task(task)
  File "C:\Users\Nick\venv\foo\lib\site-packages\trio\_core\_run.py", line 203, in _remove_task
    self._tasks.remove(task)
KeyError: <Task '__main__.main' at 0x27f9eaf5b00>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "d:\dropbox\projekte\foo\bar\baz\__main__.py", line 36, in <module>
    trio.run(main)
  File "C:\Users\Nick\venv\foo\lib\site-packages\trio\_core\_run.py", line 1329, in run
    ) from exc
trio.TrioInternalError: internal error in trio - please file a bug!
Exception ignored in: <function Nursery.__del__ at 0x0000027F9CEE0730>
Traceback (most recent call last):
  File "C:\Users\Nick\venv\foo\lib\site-packages\trio\_core\_run.py", line 530, in __del__
AssertionError:
PS C:\Users\Nick>
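
A note on reading the innermost frame, `self._tasks.remove(task)`: assuming `_tasks` is a plain Python set (the KeyError suggests this, but the traceback alone doesn't prove it), the error just means the task was not in that collection when trio tried to remove it. The snippet below only shows that stdlib behavior; the string is a placeholder for a Task object.

```python
tasks = set()
task = "main task"  # placeholder standing in for a trio Task object

error_name = None
try:
    # set.remove raises KeyError when the element is absent.
    tasks.remove(task)
except KeyError as exc:
    error_name = type(exc).__name__

# set.discard, by contrast, is a silent no-op for a missing element.
tasks.discard(task)
```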

Expected behavior

At the very least, a better traceback -- at one point I thought I was seeing that traceback during startup, when ultimately the problem was a recursion issue in my logging patch. While I was trying to repro the trio issue, I stumbled upon the problem in my code, fixed it, and suddenly things ran happily again -- until shutdown, that is, which is where the traceback was originating from anyways. Having more information about surrounding exceptions would have been really helpful early on, to help me isolate which problems were mine vs which problems might be upstream.

Of course also, I'd expect a cleaner shutdown! :)
