Fall back on previous panic hook when not in catch_unwind wrapper#15319
Fall back on previous panic hook when not in catch_unwind wrapper#15319
catch_unwind wrapper#15319Conversation
|
|
Thanks for jumping on this so quickly!! I think @MichaReiser or @sharkdp will almost certainly be a better reviewer here, though -- this feels like it's out of my comfort zone 😆 |
| } | ||
|
|
||
| fn install_hook() { | ||
| static ONCE: OnceLock<()> = OnceLock::new(); |
There was a problem hiding this comment.
Using a OnceLock assumes that the global panic hook never changes. I guess that's sort of fine because it's true in all use cases we have today, but it might be surprising that catch_unwind only works until some magic point in time (when the panic hook gets replaced)
I also find it somewhat unfortunate that the panic hook will linger along after all threads completed. Could we have an atomic counter that counts how many threads are inside a catch_unwind and restore the panic handler once it reaches 0?
There was a problem hiding this comment.
Another option — which I actually implemented first but then discarded because it changed the calling API more drastically — would be to make a wrapper type (PanicHandler) and move catch_unwind to be an instance method of that type. Its new method would install the custom hook, and its Drop implementation would restore the previous hook.
That would still not protect you against someone else installing a different panic hook while you're in the middle of using PanicHandler. But the lifetime of that handler would clearly delineate where you are expecting the custom behavior to happen.
Unfortunately that also doesn't seem to play nice with our test harness, since the most obvious place to create (and drop) the PanicHandler is in red_knot_test::run. But that wouldn't install the hook once globally for all test cases — it would install/uninstall the hook on a per-test basis, just like before. Ideally we'd create a single PanicHandler for the entire cargo test invocation — but I'm not aware of a way to do that, nor of how to then pass that handler into the individual test case functions.
I also find it somewhat unfortunate that the panic hook will linger along after all threads completed. Could we have an atomic counter that counts how many threads are inside a
catch_unwindand restore the panic handler once it reaches0?
I think there would be a race condition here, where we would uninstall the hook whenever there are 0 threads currently in the middle of a catch_unwind, but there might be a test thread that's still running and just hasn't gotten to its catch_unwind call yet. So we would have to check and reinstall the hook if needed. (Or blindly always install it?)
In the end it seemed better to lean into the assumption that the custom hook is installed once for the duration of the process. And the OnceLock part is to make sure that (a) the custom hook is not installed unless you need it, and (b) the caller doesn't have to worry about explicit or concurrent initialization.
I could document all of this more clearly — e.g. with a warning that it's not correct to use this catch_unwind with anything that tries to install a competing panic hook.
There was a problem hiding this comment.
I could document all of this more clearly — e.g. with a warning that it's not correct to use this
catch_unwindwith anything that tries to install a competing panic hook.
I added documentation that we don't work if you try to install a competing panic hook, and also added some detection of if that happens.
* main: [`pylint`] Fix `unreachable` infinite loop (`PLW0101`) (#15278) fix invalid syntax in workflow file (#15357) [`pycodestyle`] Avoid false positives related to type aliases (`E252`) (#15356) [`flake8-builtins`] Disapply `A005` to stub files (#15350) Improve logging system using `logLevel`, avoid trace value (#15232) [`flake8-builtins`] Rename `A005` and improve its error message (#15348) Spruce up docs for pydoclint rules (#15325) [`flake8-type-checking`] Apply `TC008` more eagerly in `TYPE_CHECKING` blocks and disapply it in stubs (#15180) [red-knot] `knot_extensions` Python API (#15103) Display Union of Literals as a Literal (#14993) [red-knot] all types are assignable to object (#15332) [`ruff`] Parenthesize arguments to `int` when removing `int` would change semantics in `unnecessary-cast-to-int` (`RUF046`) (#15277) [`eradicate`] Correctly handle metadata blocks directly followed by normal blocks (`ERA001`) (#15330) Narrowing for class patterns in match statements (#15223) [red-knot] add call checking (#15200) Spruce up docs for `slice-to-remove-prefix-or-suffix` (`FURB188`) (#15328) [`internal`] Return statements in finally block point to end block for `unreachable` (`PLW0101`) (#15276) [`ruff`] Treat `)` as a regex metacharacter (`RUF043`, `RUF055`) (#15318) Use uv consistently throughout the documentation (#15302)
This fixes #15317. Our
catch_unwindwrapper installs a panic hook that captures (the rendered contents of) the panic info when a panic occurs. Since the intent is that the caller will render the panic info in some custom way, the hook silences the default stderr panic output.However, the panic hook is a global resource, so if any one thread was in the middle of a
catch_unwindcall, we would silence the default panic output for all threads.The solution is to also keep a thread local that indicates whether the current thread is in the middle of our
catch_unwind, and to fall back on the default panic hook if not.Test Plan
Artificially added an mdtest parse error, ran tests via
cargo test -p red_knot_python_semanticto run a large number of tests in parallel. Before this patch, the panic message was swallowed as reported in #15317. After, the panic message was shown.