Skip to content

Comments

Fall back on previous panic hook when not in catch_unwind wrapper#15319

Merged
dcreager merged 3 commits intomainfrom
dcreager/thread-safe-panics
Jan 8, 2025
Merged

Fall back on previous panic hook when not in catch_unwind wrapper#15319
dcreager merged 3 commits intomainfrom
dcreager/thread-safe-panics

Conversation

@dcreager
Copy link
Member

@dcreager dcreager commented Jan 7, 2025

This fixes #15317. Our catch_unwind wrapper installs a panic hook that captures (the rendered contents of) the panic info when a panic occurs. Since the intent is that the caller will render the panic info in some custom way, the hook silences the default stderr panic output.

However, the panic hook is a global resource, so if any one thread was in the middle of a catch_unwind call, we would silence the default panic output for all threads.

The solution is to also keep a thread local that indicates whether the current thread is in the middle of our catch_unwind, and to fall back on the default panic hook if not.

Test Plan

Artificially added an mdtest parse error, ran tests via cargo test -p red_knot_python_semantic to run a large number of tests in parallel. Before this patch, the panic message was swallowed as reported in #15317. After, the panic message was shown.

@AlexWaygood AlexWaygood added testing Related to testing Ruff itself ty Multi-file analysis & type inference labels Jan 7, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Jan 7, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@AlexWaygood
Copy link
Member

Thanks for jumping on this so quickly!! I think @MichaReiser or @sharkdp will almost certainly be a better reviewer here, though -- this feels like it's out of my comfort zone 😆

@AlexWaygood AlexWaygood removed their request for review January 7, 2025 15:19
}

fn install_hook() {
static ONCE: OnceLock<()> = OnceLock::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a OnceLock assumes that the global panic hook never changes. I guess that's sort of fine because it's true in all use cases we have today, but it might be surprising that catch_unwind only works until some magic point in time (when the panic hook gets replaced)

I also find it somewhat unfortunate that the panic hook will linger along after all threads completed. Could we have an atomic counter that counts how many threads are inside a catch_unwind and restore the panic handler once it reaches 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option — which I actually implemented first but then discarded because it changed the calling API more drastically — would be to make a wrapper type (PanicHandler) and move catch_unwind to be an instance method of that type. Its new method would install the custom hook, and its Drop implementation would restore the previous hook.

That would still not protect you against someone else installing a different panic hook while you're in the middle of using PanicHandler. But the lifetime of that handler would clearly delineate where you are expecting the custom behavior to happen.

Unfortunately that also doesn't seem to play nice with our test harness, since the most obvious place to create (and drop) the PanicHandler is in red_knot_test::run. But that wouldn't install the hook once globally for all test cases — it would install/uninstall the hook on a per-test basis, just like before. Ideally we'd create a single PanicHandler for the entire cargo test invocation — but I'm not aware of a way to do that, nor of how to then pass that handler into the individual test case functions.

I also find it somewhat unfortunate that the panic hook will linger along after all threads completed. Could we have an atomic counter that counts how many threads are inside a catch_unwind and restore the panic handler once it reaches 0?

I think there would be a race condition here, where we would uninstall the hook whenever there are 0 threads currently in the middle of a catch_unwind, but there might be a test thread that's still running and just hasn't gotten to its catch_unwind call yet. So we would have to check and reinstall the hook if needed. (Or blindly always install it?)

In the end it seemed better to lean into the assumption that the custom hook is installed once for the duration of the process. And the OnceLock part is to make sure that (a) the custom hook is not installed unless you need it, and (b) the caller doesn't have to worry about explicit or concurrent initialization.

I could document all of this more clearly — e.g. with a warning that it's not correct to use this catch_unwind with anything that tries to install a competing panic hook.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could document all of this more clearly — e.g. with a warning that it's not correct to use this catch_unwind with anything that tries to install a competing panic hook.

I added documentation that we don't work if you try to install a competing panic hook, and also added some detection of if that happens.

* main:
  [`pylint`] Fix `unreachable` infinite loop (`PLW0101`) (#15278)
  fix invalid syntax in workflow file (#15357)
  [`pycodestyle`] Avoid false positives related to type aliases (`E252`) (#15356)
  [`flake8-builtins`] Disapply `A005` to stub files (#15350)
  Improve logging system using `logLevel`, avoid trace value (#15232)
  [`flake8-builtins`] Rename `A005` and improve its error message (#15348)
  Spruce up docs for pydoclint rules (#15325)
  [`flake8-type-checking`] Apply `TC008` more eagerly in `TYPE_CHECKING` blocks and disapply it in stubs (#15180)
  [red-knot] `knot_extensions` Python API (#15103)
  Display Union of Literals as a Literal (#14993)
  [red-knot] all types are assignable to object (#15332)
  [`ruff`] Parenthesize arguments to `int` when removing `int` would change semantics in `unnecessary-cast-to-int` (`RUF046`) (#15277)
  [`eradicate`] Correctly handle metadata blocks directly followed by normal blocks (`ERA001`) (#15330)
  Narrowing for class patterns in match statements (#15223)
  [red-knot] add call checking (#15200)
  Spruce up docs for `slice-to-remove-prefix-or-suffix` (`FURB188`) (#15328)
  [`internal`] Return statements in finally block point to end block for `unreachable` (`PLW0101`) (#15276)
  [`ruff`] Treat `)` as a regex metacharacter (`RUF043`, `RUF055`) (#15318)
  Use uv consistently throughout the documentation (#15302)
@dcreager dcreager merged commit 2ca31e4 into main Jan 8, 2025
21 checks passed
@dcreager dcreager deleted the dcreager/thread-safe-panics branch January 8, 2025 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

testing Related to testing Ruff itself ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regression: panic messages about invalid mdtests are now swallowed by mdtest

3 participants