Understanding asyncio.subprocess.Process garbage collection

The documentation on asyncio.create_subprocess_exec claims:

If the process object is garbage collected while the process is still running, the child process will be killed.

However, I cannot get this to work. For example, with the following sub.py script

from time import sleep

for i in range(0, 20):
    print(f"Step {i}")
    sleep(1)

and the following main.py

import asyncio
import gc
import sys

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    await asyncio.sleep(3)
    del process   # drop the only reference to the Process object
    gc.collect()  # and force a garbage collection pass

if __name__ == "__main__":
    asyncio.run(main())

the “Step” output keeps going after main.py has finished. Calling process.kill() manually works, of course.

This is fine, I can call kill or terminate manually – I would prefer to, actually. However, I would like to understand the circumstances under which I need to expect the subprocess to be killed by garbage collection, so it does not come back to bite me.
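For reference, a minimal sketch of that manual route (my own addition, reusing the main.py above): terminate the child explicitly and then reap it with wait() so it does not linger as a zombie.

import asyncio
import sys

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    await asyncio.sleep(3)
    process.terminate()   # or process.kill() to send SIGKILL instead of SIGTERM (on POSIX)
    await process.wait()  # reap the child so no zombie process is left behind

if __name__ == "__main__":
    asyncio.run(main())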

In main, the subprocess coroutine or task is only being deleted after it has already been awaited (i.e. finished), not while it’s still running. And moreover, only after a further three seconds.

Perhaps swap asyncio.run for asyncio.gather, and call that on both main and process (create process outside main, e.g. in the if __name__ guard clause).

Sorry, I don’t quite understand what you’re getting at. In the code I posted, the sub.py script is started right away and prints “Step i” every second. After three seconds, main in main.py deletes and garbage collects the Process object process, and then terminates – I get back control of my shell. (So even if there were a dangling reference to process in my main.py script, it should be garbage collected on termination, no?)

If I understand the documentation correctly, the garbage collection of process should kill the sub.py subprocess, but it does not (it keeps on printing “Step i” for 17 more seconds until its for loop finishes).

That’s because it doesn’t get garbage collected until after the whole process has been awaited. (And you’re only del-ing the returned value.)

Await waits: it suspends the coroutine it’s in, allowing something else to run in the meantime, whether that is waiting out network latency, doing IO, or running another coroutine.

Your code’s put everything in the same coroutine, so if network or IO aren’t important, it all runs sequentially in a single thread, pretty much the same as non-async, blocking code (albeit less of a resource hog for other processes outside of Python).
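To make that concrete, here is a minimal sketch (my own illustration, reusing the sub.py above, not the original poster’s code) of what a concurrent version could look like: give the event loop something else to run and await both.

import asyncio
import sys

async def other_work():
    # Stand-in for whatever else the program does while the child runs.
    for i in range(3):
        print(f"other work {i}")
        await asyncio.sleep(1)

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    # The event loop interleaves other_work with waiting for the child to exit.
    await asyncio.gather(other_work(), process.wait())

if __name__ == "__main__":
    asyncio.run(main())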

I now notice that I did not really ask the question that I want to have answered: I would like the subprocess to keep running even after main.py finishes. Currently, that is exactly what happens: I still get sub.py’s “Step” output even after main.py is done.

However, the line from the documentation about garbage collection of the Process object makes me worry that this might not be reliable and that there might be circumstances under which my subprocess is killed, instead. However, I do not understand the circumstances under which this garbage collection would happen, given that main.py finishing does not even do it.

I get your meaning. I wasn’t trying to answer that question, FWIW. I was trying to critique the code example, as an attempt to reproduce the issue described in the docs. For this, what I had in mind was:

del_async_process.py

import asyncio
import gc
import sys

async def schedule_subprocess_then_delete_it(tg: asyncio.TaskGroup):
    # create_subprocess_exec is not awaited here; its coroutine is handed to the TaskGroup.
    process = asyncio.create_subprocess_exec(sys.executable, "sub.py")
    tg.create_task(process)
    await asyncio.sleep(3)
    # Drop the local reference to that coroutine and force a collection.
    del process
    gc.collect()

async def main():
    async with asyncio.TaskGroup() as tg:
        task1 = tg.create_task(schedule_subprocess_then_delete_it(tg))

if __name__ == "__main__":
    asyncio.run(main())

I haven’t thought of a simple way with asyncio.gather. Using the TaskGroup approach instead, passing tg into the other task, allowed creating and then deleting the subprocess task without creating new references to it.

debian@...:~$ python del_async_process.py
Step 0
Step 1
Step 2
debian@...:~$

However, to answer your question: I think all you have to do to avoid this issue is not delete process. Just keep a reference to it in an active scope somewhere.
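As a minimal sketch of that suggestion (the names here are my own), any longer-lived container will do:

import asyncio
import sys

running_processes = []  # module-level list keeps the Process objects referenced

async def start_child():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    running_processes.append(process)  # this reference outlives the coroutine

async def main():
    await start_child()
    await asyncio.sleep(3)
    # No del / gc.collect() here; the list above still holds the Process.

if __name__ == "__main__":
    asyncio.run(main())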


Even with your version, I get the same result:

~/subprocess-test ❯❯❯ .venv/bin/python main.py
James' version
Step 0
Step 1
Step 2
Step 3
~/subprocess-test ❯❯❯ Step 4
Step 5
Step 6
Step 7
Step 8
Step 9

(Python 3.14.0; Ubuntu 24.04.3 LTS); I just added print("James' version") before asyncio.run to make sure that I am running the right code. (I have spent hours debugging the wrong piece of code before …)

Admittedly, I also still don’t quite get what is wrong with my original code. (I know that I’m not doing anything truly asynchronous. This is just me trying to understand the Process garbage collection; in the actual program, other tasks would exist.)

create_subprocess_exec is a coroutine, so that is what I await. It finishes basically immediately, resulting in the Process object process (if the type function is to be believed). At the same time, sub.py starts.

I cannot await the Process object, it’s not awaitable. But the garbage collection of it should trigger the kill of the subprocess, according to the documentation. So there must be another pointer to process in the depths of asyncio somewhere, I guess, keeping it alive. And this seems to cause the __del__ method of process to never be called, not even at the end of main.py.
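One way to probe that guess (a quick diagnostic of my own, not from the thread) is to ask the garbage collector which objects currently refer to the Process:

import asyncio
import gc
import sys

async def main():
    process = await asyncio.create_subprocess_exec(sys.executable, "sub.py")
    await asyncio.sleep(3)
    # Print the type of every object that holds a reference to process
    # (the running frame of main() will show up here as well).
    for referrer in gc.get_referrers(process):
        print(type(referrer))

if __name__ == "__main__":
    asyncio.run(main())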

And my true question is: If main.py ends, can I rely on sub.py continuing if I don’t call kill (or terminate) on process myself?

Thanks for giving it a try. Async stuff really is tricky - I’m flummoxed myself - I can’t reproduce what you see (I’ve only tried a single core and a dual core machine so far though), so it might not be behaving exactly as I describe. Your code could be async, but not concurrent.

As to sub.py continuing if main.py ends, I’m not sure that some operating systems won’t tidy up subprocesses when the parent exits, even if Python doesn’t do so proactively.

Is the goal to know sub.py is definitely killed, or ensure it lives on?

The goal would be that I can control whether the child lives on (assuming that it is well-behaved enough to react to the relevant signals). But maybe the best thing I can do is to keep the Process object around as long as possible, not kill or terminate it, and then hope it works. Perfect reliability is not absolutely vital in my case.

(This is in the context of an “orchestrator” running different simulation programs, and I would sometimes like to keep them alive a bit longer after the orchestrator has concluded, for example so that a simulation program that shows some output to the user in an interactive way can keep doing so. So termination of the child processes would be potentially annoying, but not life-threatening.)

I would have also liked to really understand the exact conditions though, just for my own curiosity.

Thanks for your help in this endeavor!


You’re very welcome. I do actually use asyncio.create_subprocess_exec to create a debug/maintenance sub process, so I’m interested in keeping that alive even if the parent crashes. I’ll take another look at the libraries’ source code later.

So to kill a subprocess, presumably all that is needed is to call .kill or .terminate on the process object.

To spawn a process that outlives its creator, my current thinking for synchronous code is to try specifying start_new_session=True on subprocess.Popen.

Under the hood, create_subprocess_exec calls the lower-level loop.subprocess_exec, and both forward unrecognised **kwargs on to subprocess.Popen. So start_new_session=True should be accepted by the create_subprocess_exec helper directly, without having to drop down to those lower-level functions.

I wouldn’t be remotely surprised if this is platform specific to Posix too.
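As a hedged sketch of that idea (untested; start_new_session only has an effect on POSIX), passing it straight to create_subprocess_exec should work because of the **kwargs forwarding described above:

import asyncio
import sys

async def main():
    # start_new_session=True is forwarded down to subprocess.Popen; on POSIX it
    # puts the child into its own session, detaching it from the parent's process group.
    process = await asyncio.create_subprocess_exec(
        sys.executable, "sub.py",
        start_new_session=True,
    )
    print("started child with pid", process.pid)

if __name__ == "__main__":
    asyncio.run(main())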

The Xonsh shell has some good examples of process handling: xonsh/xonsh/procs at main · xonsh/xonsh · GitHub

They don’t use async at all for process handling and instead use threading, since that allows you to run a system-level process fork.

async is better suited to IO-bound code, not to code that needs to handle process failures.
