Show actionable error when HuggingFace dataset access fails (fixes #59, #61) #74

Open
lonexreb wants to merge 2 commits into NVlabs:main from lonexreb:fix/clearer-error-on-hf-auth-failure-in-loader
@lonexreb
Contributor

lonexreb commented May 1, 2026

Summary

When the upstream physical_ai_av package fails to fetch dataset metadata from HuggingFace, it raises a bare IndexError: list index out of range from deep inside physical_ai_av.utils.hf_interface.download_file. The traceback gives the user no clue about the actual cause (missing HF authentication, or no access granted to the gated dataset).

This PR catches that IndexError at the PhysicalAIAVDatasetInterface() initialization call site in load_physical_aiavdataset() and reraises a RuntimeError with an actionable message that points to:

  1. The dataset access request page.
  2. The hf auth login command.
  3. README §3 (Authenticate with HuggingFace).

The original exception is preserved via raise ... from e so the upstream traceback is still available for debugging.
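
A minimal sketch of the catch-and-reraise pattern described above, assuming a hypothetical `load_interface()` wrapper and a `failing_init()` stand-in for the upstream constructor (neither name is in this PR). It only illustrates how `raise ... from e` keeps the original `IndexError` attached as `__cause__`:

```python
def load_interface(init_interface):
    """Call the initializer and translate the known IndexError.

    `init_interface` is a hypothetical stand-in for the upstream
    PhysicalAIAVDatasetInterface constructor.
    """
    try:
        return init_interface()
    except IndexError as e:
        # Chain with `from e` so the upstream traceback stays attached.
        raise RuntimeError(
            "Failed to initialize PhysicalAIAVDatasetInterface; "
            "check HuggingFace authentication and dataset access."
        ) from e


def failing_init():
    # Mimics indexing into an empty get_paths_info() result.
    return [][0]


try:
    load_interface(failing_init)
except RuntimeError as err:
    # The original exception is still reachable for debugging.
    chained = err.__cause__
```

When the `RuntimeError` is left uncaught, Python prints both tracebacks, joined by "The above exception was the direct cause of the following exception".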

Before

File ".../physical_ai_av/utils/hf_interface.py", line 231, in download_file
    self.api.get_paths_info(paths=[filename], **self.repo_snapshot_info)[0].size
IndexError: list index out of range

After

RuntimeError: Failed to initialize PhysicalAIAVDatasetInterface — the
HuggingFace API returned no metadata for the gated
`nvidia/PhysicalAI-Autonomous-Vehicles` dataset. This usually means you
have not authenticated with HuggingFace or have not been granted access
to the dataset.
  1. Request access: https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles
  2. Authenticate: `pip install -U huggingface_hub && hf auth login`
See README §3 (Authenticate with HuggingFace) for details.

Why catch only IndexError?

This is the specific, observed failure mode reported in #59 and #61 with a known cause (empty get_paths_info response from gated/unauthenticated access). Other exception types (network errors, transient HF outages, etc.) propagate unchanged so they are not misattributed to authentication.

Test plan

  • Syntax check (ast.parse) passes.
  • Existing happy path (caller passes a pre-initialized avdi, or auth is configured) is unchanged — the try block only wraps the lazy default initialization.
  • Original exception preserved as __cause__ (raise ... from e) so the upstream traceback is still inspectable.
  • Reviewer: optional — sanity check by running test_inference.py with HF_TOKEN unset and confirming the new message appears.
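
The checks in the test plan could also be automated without network access. The sketch below is not part of this PR: `UpstreamStub` and `load_dataset()` are simplified, hypothetical stand-ins for `physical_ai_av` and `load_physical_aiavdataset()`, used only to show the shape such a test might take.

```python
import unittest
from unittest import mock


class UpstreamStub:
    """Hypothetical stand-in for the physical_ai_av module."""

    @staticmethod
    def PhysicalAIAVDatasetInterface():
        raise IndexError("list index out of range")


def load_dataset(avdi=None, upstream=UpstreamStub):
    """Simplified stand-in for load_physical_aiavdataset()."""
    if avdi is None:
        # Only the lazy default initialization is wrapped.
        try:
            avdi = upstream.PhysicalAIAVDatasetInterface()
        except IndexError as e:
            raise RuntimeError("HuggingFace access check failed") from e
    return avdi


class LoaderErrorTest(unittest.TestCase):
    def test_index_error_becomes_runtime_error(self):
        with self.assertRaises(RuntimeError) as ctx:
            load_dataset()
        self.assertIsInstance(ctx.exception.__cause__, IndexError)

    def test_preinitialized_interface_skips_init(self):
        sentinel = object()
        self.assertIs(load_dataset(avdi=sentinel), sentinel)

    def test_other_errors_propagate(self):
        broken = mock.Mock()
        broken.PhysicalAIAVDatasetInterface.side_effect = ConnectionError("down")
        with self.assertRaises(ConnectionError):
            load_dataset(upstream=broken)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoaderErrorTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```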

Related issues

Upstream physical_ai_av.utils.hf_interface.download_file raises a bare
IndexError when HfApi.get_paths_info returns an empty list — typically
because the user has not authenticated with HuggingFace or has not been
granted access to the gated nvidia/PhysicalAI-Autonomous-Vehicles
dataset. The resulting traceback is hard to act on.

Wrap the PhysicalAIAVDatasetInterface() call in load_physical_aiavdataset
and reraise as RuntimeError pointing at the access request page, the
hf auth login command, and README §3.

Fixes NVlabs#59, NVlabs#61.

Signed-off-by: lonexreb <[email protected]>
@super-anova
Collaborator

Hi, thanks for creating this PR.

I agree that hitting an IndexError deep in the code when the root cause is HF access is not ideal. On the other hand, catching every IndexError and re-raising it as an HF access issue also seems potentially misleading.

What do you think about adding an explicit HF access check and raising the exception you wrote if that check fails?

Per @super-anova review on NVlabs#74: catching IndexError broadly risks
misattributing unrelated failures as HF auth issues. Switch to an
explicit access check up front using HfApi.repo_info() — only the
two HF-specific exceptions (GatedRepoError, RepositoryNotFoundError)
are caught, and only those raise the helpful RuntimeError. Other
failure modes (network errors, transient HF outages, real bugs)
propagate unchanged.

Behavior:
  - User without HF auth or without granted dataset access: clear
    RuntimeError pointing to the access page and hf auth login.
  - User with valid auth + access: one extra metadata request to HF, then
    instantiation proceeds normally.
  - Other errors: surfaced as-is, not masked.

Signed-off-by: lonexreb <[email protected]>
@lonexreb
Contributor Author

lonexreb commented May 3, 2026

@super-anova thanks — that's a much better signal-to-noise ratio. Pushed 5bf4860 which switches the implementation to your suggestion:

  • Drops the broad except IndexError.
  • Uses HfApi().repo_info(repo_id="nvidia/PhysicalAI-Autonomous-Vehicles", repo_type="dataset") as the explicit access probe up front.
  • Catches only GatedRepoError and RepositoryNotFoundError from huggingface_hub.errors — those are the two exceptions that specifically indicate auth/access problems.
  • Other failure modes (network, transient HF outages, real bugs in the upstream stack) propagate unchanged and are no longer at risk of being misattributed.

The cost is one extra metadata request to HF on first init. Sample diff:

from huggingface_hub import HfApi
from huggingface_hub.errors import GatedRepoError, RepositoryNotFoundError

try:
    HfApi().repo_info(
        repo_id="nvidia/PhysicalAI-Autonomous-Vehicles",
        repo_type="dataset",
    )
except (GatedRepoError, RepositoryNotFoundError) as e:
    raise RuntimeError(...same actionable message...) from e
avdi = physical_ai_av.PhysicalAIAVDatasetInterface()

Let me know if you'd like the access check factored into a helper (e.g. _assert_pai_dataset_accessible()) — happy to move it if it'd be reused elsewhere, or keep it inline since it's currently a single call site.
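
For reference, a rough sketch of what the factored-out helper might look like. This is an assumption about its shape, not code from the PR: the probe is injected so the helper can be exercised offline, and the two exception classes below are local stand-ins for the ones in `huggingface_hub.errors`. The error message is abbreviated.

```python
class GatedRepoError(Exception):
    """Local stand-in for huggingface_hub.errors.GatedRepoError."""


class RepositoryNotFoundError(Exception):
    """Local stand-in for huggingface_hub.errors.RepositoryNotFoundError."""


REPO_ID = "nvidia/PhysicalAI-Autonomous-Vehicles"


def _assert_pai_dataset_accessible(
    probe, access_errors=(GatedRepoError, RepositoryNotFoundError)
):
    """Run `probe` and translate only access-related failures.

    `probe` is assumed to accept repo_id and repo_type keyword
    arguments, as HfApi().repo_info does.
    """
    try:
        probe(repo_id=REPO_ID, repo_type="dataset")
    except access_errors as e:
        raise RuntimeError(
            f"Cannot access the gated `{REPO_ID}` dataset. "
            "Request access on its HuggingFace page, then run "
            "`hf auth login`."
        ) from e


def denied_probe(**kwargs):
    raise GatedRepoError("access not granted")


def flaky_probe(**kwargs):
    raise ConnectionError("network down")


# Exercise the three behaviors: success, access failure, unrelated failure.
outcomes = {}
for name, probe in [
    ("ok", lambda **kwargs: None),
    ("denied", denied_probe),
    ("network", flaky_probe),
]:
    try:
        _assert_pai_dataset_accessible(probe)
        outcomes[name] = "passed"
    except Exception as err:
        outcomes[name] = type(err).__name__
```

In the loader, the probe would be `HfApi().repo_info` with the real exception classes imported as in the sample diff above. It is probably worth confirming against the live Hub that this probe does fail for an account without access, since the helper only helps when the probe raises.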

Development

Successfully merging this pull request may close these issues.

  • List index out of range while loading the dataset
  • An error occurred when running test_inference.py
