Skip to content

Fix reference cycle in hf_raise_for_status causing delayed object destruction#4092

Merged
Wauplin merged 3 commits intomainfrom
fix/hf-raise-for-status-refcycle-simplified
Apr 13, 2026
Merged

Fix reference cycle in hf_raise_for_status causing delayed object destruction#4092
Wauplin merged 3 commits intomainfrom
fix/hf-raise-for-status-refcycle-simplified

Conversation

@Wauplin
Copy link
Copy Markdown
Contributor

@Wauplin Wauplin commented Apr 13, 2026

Why? PR equivalent to #4084 but slightly cleaner. Should solved vLLM garbage collection problem (vllm-project/vllm#38384). It seems that in #3889 we've introduced a reference cycle issue in hf_raise_for_status, making it impossible to free-up memory correctly.

I have tested the regression test introduced in this PR and it does fail on main

Kudos to @yg7445 for investigating + suggesting the solution. This was not an easy bug to spot!


Summary

Fixes a CPython reference cycle in hf_raise_for_status() that prevents deterministic exception cleanup.

When exceptions are stored in local variables before raise ... from e, a reference cycle forms: err.__cause__ee.__traceback__ → frame → f_locals['err'] → back to start. This delays garbage collection and causes real issues downstream (e.g., vLLM GPU memory not released until cyclic GC runs). See vllm-project/vllm#38384.

Alternative approach to #4084: instead of introducing two new helper functions, this extends the existing _format() with **attrs so attributes (like repo_type, repo_id, bucket_id) are set inside _format and the exception is never stored in the caller's frame.

Changes

  • Extended _format() to accept **attrs keyword arguments, set as attributes on the error
  • Converted all 5 call sites in hf_raise_for_status from local-variable pattern to inline raise _format(...) from e
  • Added regression test using weakref to verify no reference cycle exists

🤖 Generated with Claude Code


Note

Cursor Bugbot is generating a summary for commit 8ba3871. Configure here.

@bot-ci-comment
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Copy Markdown
Contributor

@hanouticelina hanouticelina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fix looks good to me 👍

@Wauplin Wauplin merged commit 43a38b2 into main Apr 13, 2026
20 of 21 checks passed
@Wauplin Wauplin deleted the fix/hf-raise-for-status-refcycle-simplified branch April 13, 2026 14:08
Wauplin added a commit that referenced this pull request Apr 14, 2026
…estruction (#4092)

* Fix reference cycle in hf_raise_for_status by extending _format with **attrs

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* simpler tests

* syntax

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
@huggingface-hub-bot
Copy link
Copy Markdown
Contributor

This PR has been shipped as part of the v1.10.2 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants