Fall back on previous panic hook when not in `catch_unwind` wrapper by dcreager · Pull Request #15319 · astral-sh/ruff

dcreager · 2025-01-07T15:12:18Z

This fixes #15317. Our catch_unwind wrapper installs a panic hook that captures (the rendered contents of) the panic info when a panic occurs. Since the intent is that the caller will render the panic info in some custom way, the hook silences the default stderr panic output.

However, the panic hook is a global resource, so if any one thread was in the middle of a catch_unwind call, we would silence the default panic output for all threads.

The solution is to also keep a thread local that indicates whether the current thread is in the middle of our catch_unwind, and to fall back on the default panic hook if not.

Test Plan

Artificially added an mdtest parse error, ran tests via cargo test -p red_knot_python_semantic to run a large number of tests in parallel. Before this patch, the panic message was swallowed as reported in #15317. After, the panic message was shown.

github-actions · 2025-01-07T15:18:52Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

AlexWaygood · 2025-01-07T15:19:46Z

Thanks for jumping on this so quickly!! I think @MichaReiser or @sharkdp will almost certainly be a better reviewer here, though -- this feels like it's out of my comfort zone 😆

MichaReiser · 2025-01-07T16:00:03Z

crates/ruff_db/src/panic.rs

+}
+
+fn install_hook() {
+    static ONCE: OnceLock<()> = OnceLock::new();


Using a OnceLock assumes that the global panic hook never changes. I guess that's sort of fine because it's true in all use cases we have today, but it might be surprising that catch_unwind only works until some magic point in time (when the panic hook gets replaced)

I also find it somewhat unfortunate that the panic hook will linger along after all threads completed. Could we have an atomic counter that counts how many threads are inside a catch_unwind and restore the panic handler once it reaches 0?

Another option — which I actually implemented first but then discarded because it changed the calling API more drastically — would be to make a wrapper type (PanicHandler) and move catch_unwind to be an instance method of that type. Its new method would install the custom hook, and its Drop implementation would restore the previous hook.

That would still not protect you against someone else installing a different panic hook while you're in the middle of using PanicHandler. But the lifetime of that handler would clearly delineate where you are expecting the custom behavior to happen.

Unfortunately that also doesn't seem to play nice with our test harness, since the most obvious place to create (and drop) the PanicHandler is in red_knot_test::run. But that wouldn't install the hook once globally for all test cases — it would install/uninstall the hook on a per-test basis, just like before. Ideally we'd create a single PanicHandler for the entire cargo test invocation — but I'm not aware of a way to do that, nor of how to then pass that handler into the individual test case functions.

I also find it somewhat unfortunate that the panic hook will linger along after all threads completed. Could we have an atomic counter that counts how many threads are inside a catch_unwind and restore the panic handler once it reaches 0?

I think there would be a race condition here, where we would uninstall the hook whenever there are 0 threads currently in the middle of a catch_unwind, but there might be a test thread that's still running and just hasn't gotten to its catch_unwind call yet. So we would have to check and reinstall the hook if needed. (Or blindly always install it?)

In the end it seemed better to lean into the assumption that the custom hook is installed once for the duration of the process. And the OnceLock part is to make sure that (a) the custom hook is not installed unless you need it, and (b) the caller doesn't have to worry about explicit or concurrent initialization.

I could document all of this more clearly — e.g. with a warning that it's not correct to use this catch_unwind with anything that tries to install a competing panic hook.

I could document all of this more clearly — e.g. with a warning that it's not correct to use this catch_unwind with anything that tries to install a competing panic hook.

I added documentation that we don't work if you try to install a competing panic hook, and also added some detection of if that happens.

* main: [`pylint`] Fix `unreachable` infinite loop (`PLW0101`) (#15278) fix invalid syntax in workflow file (#15357) [`pycodestyle`] Avoid false positives related to type aliases (`E252`) (#15356) [`flake8-builtins`] Disapply `A005` to stub files (#15350) Improve logging system using `logLevel`, avoid trace value (#15232) [`flake8-builtins`] Rename `A005` and improve its error message (#15348) Spruce up docs for pydoclint rules (#15325) [`flake8-type-checking`] Apply `TC008` more eagerly in `TYPE_CHECKING` blocks and disapply it in stubs (#15180) [red-knot] `knot_extensions` Python API (#15103) Display Union of Literals as a Literal (#14993) [red-knot] all types are assignable to object (#15332) [`ruff`] Parenthesize arguments to `int` when removing `int` would change semantics in `unnecessary-cast-to-int` (`RUF046`) (#15277) [`eradicate`] Correctly handle metadata blocks directly followed by normal blocks (`ERA001`) (#15330) Narrowing for class patterns in match statements (#15223) [red-knot] add call checking (#15200) Spruce up docs for `slice-to-remove-prefix-or-suffix` (`FURB188`) (#15328) [`internal`] Return statements in finally block point to end block for `unreachable` (`PLW0101`) (#15276) [`ruff`] Treat `)` as a regex metacharacter (`RUF043`, `RUF055`) (#15318) Use uv consistently throughout the documentation (#15302)

Fall back on previous panic hook when not in catch_unwind wrapper

f3024a5

dcreager requested review from AlexWaygood, MichaReiser, carljm and sharkdp as code owners January 7, 2025 15:12

AlexWaygood added testing Related to testing Ruff itself ty Multi-file analysis & type inference labels Jan 7, 2025

AlexWaygood removed their request for review January 7, 2025 15:19

MichaReiser reviewed Jan 7, 2025

View reviewed changes

MichaReiser approved these changes Jan 8, 2025

View reviewed changes

dcreager added 2 commits January 8, 2025 11:08

Document (and detect) that we don't work with competing panic hooks

0e66889

dcreager merged commit 2ca31e4 into main Jan 8, 2025
21 checks passed

dcreager deleted the dcreager/thread-safe-panics branch January 8, 2025 16:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Fall back on previous panic hook when not in `catch_unwind` wrapper#15319

Fall back on previous panic hook when not in `catch_unwind` wrapper#15319
dcreager merged 3 commits intomainfrom
dcreager/thread-safe-panics

dcreager commented Jan 7, 2025

Uh oh!

github-actions bot commented Jan 7, 2025 •

edited

Loading

Uh oh!

AlexWaygood commented Jan 7, 2025

Uh oh!

MichaReiser Jan 7, 2025

Uh oh!

dcreager Jan 7, 2025

Uh oh!

dcreager Jan 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

dcreager commented Jan 7, 2025

Test Plan

Uh oh!

github-actions bot commented Jan 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ruff-ecosystem results

Linter (stable)

Linter (preview)

Uh oh!

AlexWaygood commented Jan 7, 2025

Uh oh!

MichaReiser Jan 7, 2025

Choose a reason for hiding this comment

Uh oh!

dcreager Jan 7, 2025

Choose a reason for hiding this comment

Uh oh!

dcreager Jan 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions bot commented Jan 7, 2025 •

edited

Loading

`ruff-ecosystem` results