
[Spaces] Add fetch_space_logs + hf spaces logs command#4091

Merged
Wauplin merged 13 commits into huggingface:main from davanstrien:feat/fetch-space-logs on Apr 15, 2026

@davanstrien davanstrien commented Apr 13, 2026

Summary

WIP and happy to discuss if it makes sense!

Using agents with Spaces, I found it helpful to give them access to logs, which the CLI currently doesn't expose. An alternative design would be to just return the logs URL from methods that create/modify Spaces, but a dedicated method is probably better for actual scripting.

  • Adds HfApi.fetch_space_logs(repo_id, *, build=False, follow=False) — programmatic access to Space build/run logs via the SSE endpoint /api/spaces/{repo_id}/logs/{run|build}
  • Adds hf spaces logs <repo_id> [--build] [-f/--follow] [-n/--tail N] CLI command
  • 7 mock-based CLI tests, docs in manage-spaces guide + package reference
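
For readers less familiar with SSE: the stream is plain text in which payload lines start with `data:` and a blank line ends each event. The following is a minimal sketch of turning such a stream into log lines. It assumes plain-text payloads (the actual payload format of the Hub endpoint is not shown in this PR), and `iter_sse_data` is a hypothetical name, not a function from `huggingface_hub`.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(raw_lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each complete SSE event from raw text lines."""
    buffer: List[str] = []
    for raw in raw_lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, commonly used as a keep-alive; ignore it.
            continue
        elif line.startswith("data:"):
            payload = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            buffer.append(payload[1:] if payload.startswith(" ") else payload)
    # An event still buffered when the stream ends is incomplete and is dropped.
```

Multiple `data:` lines within one event are joined with newlines, which is how the SSE format defines multi-line payloads.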

Note: This is a first pass — the API shape and implementation approach are open to discussion. Happy to iterate on naming, flag style, or the streaming/retry strategy based on feedback.

Motivation

Agents and scripts that manage Spaces (restart, set volumes, push code) currently have no way to read why a Space failed without knowing the raw endpoint URL and crafting a curl request. get_space_runtime() surfaces the stage (BUILD_ERROR, RUNTIME_ERROR) but not the actual error — that lives behind /api/spaces/{repo_id}/logs/{build|run}.

By wrapping this as a first-class method, agents can autonomously check logs when something goes wrong — no human nudge needed to provide a URL or paste output. The pattern already exists for Jobs (fetch_job_logs + hf jobs logs); this closes the equivalent gap for Spaces.
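
As an illustration of the agent workflow described above, here is a rough sketch that checks the stage and then pulls the matching logs. It assumes an `api` object exposing `get_space_runtime()` and the `fetch_space_logs()` method proposed in this PR; `collect_failure_logs` and its `max_lines` parameter are hypothetical names introduced only for this example.

```python
from collections import deque
from typing import Any, List


def collect_failure_logs(api: Any, repo_id: str, max_lines: int = 200) -> List[str]:
    """Return the last `max_lines` log lines if the Space is in an error stage, else []."""
    stage = api.get_space_runtime(repo_id).stage
    if stage == "BUILD_ERROR":
        build = True  # the error lives in the container build logs
    elif stage == "RUNTIME_ERROR":
        build = False  # the error lives in the app's run logs
    else:
        return []  # nothing to diagnose
    lines = api.fetch_space_logs(repo_id, build=build)
    # A bounded deque keeps only the tail, which is where the failure usually shows up.
    return list(deque(lines, maxlen=max_lines))
```

Comparing the stage against plain strings is a deliberate choice here: it keeps the sketch independent of the `SpaceStage` enum.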

API shape

from huggingface_hub import HfApi

api = HfApi()

# Run logs (default), like `docker logs`
for line in api.fetch_space_logs("username/my-space"):
    print(line, end="")

# Build logs, for BUILD_ERROR debugging
for line in api.fetch_space_logs("username/my-space", build=True):
    print(line, end="")

# Stream in real time
for line in api.fetch_space_logs("username/my-space", follow=True):
    print(line, end="")

hf spaces logs username/my-space              # run logs
hf spaces logs username/my-space --build      # build logs
hf spaces logs username/my-space -f           # stream
hf spaces logs username/my-space -n 50        # last 50 lines

Design decisions (all open for discussion)

build=True boolean flag vs log_type="build" enum. We tested this by prompting an independent agent with no knowledge of the implementation to write the CLI/Python calls they'd intuitively expect. They converged on --build as a boolean toggle (like kubectl logs --previous) rather than --type build as an enum. Rationale: shorter, no string literal to remember, honours the asymmetry ("logs" = run logs by default, build is the special case).

No SpaceStage polling in the helper. The helper trusts the server's "stream close = done" contract and does not poll get_space_runtime() as a backstop. Upstream misbehavior (observed on one RUNTIME_ERROR Space where the server held the socket open with zero bytes) is bounded by the read timeout and the retry cap. This avoids coupling to the SpaceStage enum, which is currently incomplete (the server returns SLEEPING, but the Python enum doesn't have it).
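
The control flow behind this decision can be sketched roughly as follows. This is not the PR's implementation: `open_stream` is a hypothetical callable, the built-in `TimeoutError` stands in for the HTTP client's read timeout, and backoff and de-duplication of replayed lines are left out for brevity.

```python
from typing import Callable, Iterable, Iterator


def stream_until_closed(
    open_stream: Callable[[], Iterable[str]],
    max_retries: int = 3,
) -> Iterator[str]:
    """Yield lines until the server closes the stream; retry read timeouts up to a cap."""
    retries = 0
    while True:
        try:
            for line in open_stream():
                yield line
            return  # clean close: the server is done, so no stage polling is needed
        except TimeoutError:
            retries += 1
            if retries > max_retries:
                raise  # upstream keeps stalling; give up instead of hanging forever
```

The read timeout bounds how long a single silent connection can last, and the retry cap bounds how many such connections are attempted.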

Single method, not separate fetch_space_logs + fetch_space_build_logs. Since both log endpoints return Iterable[str] and differ only by a URL segment, a single method with a boolean toggle felt like the right granularity, but I'm open to splitting if preferred.
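
To make the "only differ by a URL segment" point concrete, here is a small sketch of how the boolean could map onto the endpoint path quoted in the summary. `space_logs_path` is a hypothetical helper name used for illustration only.

```python
def space_logs_path(repo_id: str, build: bool = False) -> str:
    """Build the logs path; run logs are the default, build logs the special case."""
    segment = "build" if build else "run"
    return f"/api/spaces/{repo_id}/logs/{segment}"
```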

Test plan

  • 7 mock-based CLI tests (default, --follow, --build, --tail, --follow+--tail error, 404, 403)
  • make style + make quality clean (2 pre-existing ty errors in cli/_output.py, not introduced here)
  • Full test_cli.py — 228/228 passing
  • Live-tested against real Spaces in RUNTIME_ERROR, RUNNING, and SLEEPING stages
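
For context, the mock-based pattern mentioned in the test plan could look roughly like the sketch below. `run_logs_command` is a hypothetical stand-in for the CLI handler (the real tests drive the actual command), and the error message is invented for the example.

```python
from collections import deque
from unittest.mock import MagicMock


def run_logs_command(api, repo_id, build=False, follow=False, tail=None):
    """Hypothetical stand-in for the CLI handler: returns the lines it would print."""
    if follow and tail is not None:
        raise ValueError("--follow and --tail cannot be used together")
    lines = (line.strip() for line in api.fetch_space_logs(repo_id, build=build, follow=follow))
    return list(lines) if tail is None else list(deque(lines, maxlen=tail))


def test_build_flag_is_forwarded():
    api = MagicMock()
    api.fetch_space_logs.return_value = iter(["step 1\n", "step 2\n"])
    assert run_logs_command(api, "username/my-space", build=True) == ["step 1", "step 2"]
    api.fetch_space_logs.assert_called_once_with("username/my-space", build=True, follow=False)


def test_follow_and_tail_are_rejected():
    api = MagicMock()
    try:
        run_logs_command(api, "username/my-space", follow=True, tail=10)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    # The API must not be hit when the flags are invalid.
    api.fetch_space_logs.assert_not_called()
```

Mocking the API object keeps the tests offline, which matches the "no new VCR cassettes" note in the commit message.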

Note

Medium Risk
Adds a new SSE-based log streaming API and CLI surface, and refactors Jobs SSE streaming to reuse the same retry/dedup loop, which could affect log/metrics streaming behavior under timeouts/retries.

Overview
Adds programmatic Space log access via HfApi.fetch_space_logs(repo_id, build=..., follow=...), streaming from the Hub’s SSE endpoints for both run and build logs.

Introduces hf spaces logs with --build, --follow/-f, and --tail/-n (including validation of incompatible flags), plus a small shared SSE streaming helper in hf_api.py that also replaces the bespoke Jobs SSE loop. Documentation and CLI reference are updated, and new CLI tests cover the new command behavior.

Reviewed by Cursor Bugbot for commit c7625a5. Bugbot is set up for automated code reviews on this repo.

Agents and scripts currently have no way to read Space build/run logs
programmatically — the endpoint is only reachable via raw curl. This
adds a public API to close that gap.

- HfApi.fetch_space_logs(repo_id, *, build=False, follow=False) yields
  log lines as Iterable[str]. build=True switches to container build
  logs; default is the running app's stdout/stderr.
- `hf spaces logs <repo_id> [--build] [-f] [-n N]` mirrors the Python
  API at the CLI level, with 404/403 mapped to clean CLIError messages.

The helper trusts the "stream close = done" server contract (confirmed
against moon-landing's SpaceLogs.svelte onClose handler) and does not
poll SpaceStage; read timeout + bounded retries handle the
misbehaving-upstream case. Structure mirrors _fetch_running_job_sse but
without the status-check backstop. Tests use the mock-based pattern
from hf jobs logs (no new VCR cassettes).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@bot-ci-comment

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment thread src/huggingface_hub/hf_api.py
davanstrien and others added 2 commits April 13, 2026 09:52
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
hf_raise_for_status() raises HfHubHTTPError (inherits HTTPError), not
httpx.HTTPStatusError. The previous handler was dead code, causing
404/403 errors to fall through to the retry loop instead of raising
immediately. Spotted by cursor bugbot on PR review.

Note: the same bug exists in _fetch_running_job_sse — not fixed here
to keep the diff focused, but worth a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 1f884f8.

Comment thread src/huggingface_hub/hf_api.py Outdated
@codecov

codecov Bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 25.00000% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.13%. Comparing base (1daa48b) to head (c7625a5).
⚠️ Report is 270 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/huggingface_hub/hf_api.py | 7.46% | 62 Missing ⚠️ |
| src/huggingface_hub/cli/spaces.py | 94.11% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4091      +/-   ##
==========================================
+ Coverage   75.00%   77.13%   +2.13%     
==========================================
  Files         145      167      +22     
  Lines       13978    18884    +4906     
==========================================
+ Hits        10484    14566    +4082     
- Misses       3494     4318     +824     

@davanstrien
Member Author

Re: SSE helper duplication — explored this. We considered a shared helper but the jobs version has a status-check backstop we deliberately omit, and unifying would mean touching working code. Happy to refactor if a shared helper is wanted, but for now the duplication felt like the safer choice.

Contributor

@Wauplin Wauplin left a comment

Thanks for adding this @davanstrien! It has been on my todo list for a long time, but I never took the time to address it ^^ (see #2667)

Comment thread docs/source/en/guides/manage-spaces.md Outdated
Comment thread src/huggingface_hub/cli/spaces.py Outdated
Comment thread src/huggingface_hub/cli/spaces.py Outdated
Comment thread src/huggingface_hub/hf_api.py
@davanstrien davanstrien marked this pull request as draft April 14, 2026 09:24
davanstrien and others added 3 commits April 14, 2026 13:40
Extracts `HfApi._stream_sse_events` to unify the retry/backoff/dedup
loop previously duplicated across `_fetch_space_logs_sse` and
`_fetch_running_job_sse`. Addresses Wauplin and Cursor Bugbot review
comments on PR huggingface#4091.

Also fixes a dead `except httpx.HTTPStatusError` handler that affected
both Spaces and Jobs: `hf_raise_for_status` raises `HfHubHTTPError`
(subclass of `httpx.HTTPError`, not `HTTPStatusError`), so 404/403 in
follow mode used to fall through to the broad retry arm and stall for
~25s. The new helper catches `HfHubHTTPError` before the broad arm, so
permanent errors fail fast. Live-verified: `hf spaces logs missing/x -f`
now errors in <1s instead of ~25s.

CLI cleanups on `hf spaces logs` per Wauplin:
- Switch from `print()` to `out.text(line.strip())` (new mode-aware
  printer from huggingface#3979).
- Drop the redundant local `HfHubHTTPError` block — it's already
  handled by the global CLI error mapper.

Also tightens `_fetch_running_job_sse` typing by splitting the legacy
`double_check_job_has_finished_on_status_code_or_error` mixed tuple
into `tolerated_status_codes: tuple[int, ...]` and
`tolerated_exception_types: tuple[type[Exception], ...]`, eliminating a
runtime type-discrimination step.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The entry in `tolerated_exception_types` was never consulted: the SSE
helper's `is_no_new_line_timeout` check short-circuits the tolerated
tuple lookup for any ReadTimeout, so the tuple entry was dead code
(pre-existing on `main` before huggingface#4091, preserved faithfully through the
refactor). ReadTimeout tolerance continues to work via the
`is_no_new_line_timeout` path.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@davanstrien davanstrien marked this pull request as ready for review April 14, 2026 13:47
@davanstrien
Member Author

Pushed changes addressing the review:

  • Shared SSE helper: extracted HfApi._stream_sse_events; _fetch_space_logs_sse and _fetch_running_job_sse are now thin wrappers. Two retry modes via an on_iteration_end callback. Tried to follow existing patterns with helpers for this!
  • Dead-except bug fix: the except httpx.HTTPStatusError arm Cursor flagged was dead (hf_raise_for_status raises HfHubHTTPError, not HTTPStatusError), so 404/403 used to stall ~25s in follow mode. The same bug existed in _fetch_running_job_sse on main; the refactor fixes both. Verified: hf spaces logs nonexistent/space -f now errors in <1s.
  • CLI cleanups on hf spaces logs: out.text(line.strip()) + dropped redundant HfHubHTTPError block (kept RepositoryNotFoundError for the friendly message).
  • Typing cleanup: split the legacy double_check_job_has_finished_on_status_code_or_error tuple into tolerated_status_codes + tolerated_exception_types (can revert if you think it's not nicer)
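
The dead-except issue from the second bullet can be reproduced in miniature. The classes below are stand-ins that only mirror the inheritance described in the commit message (they are not the real `httpx` or `huggingface_hub` classes), and the function names are hypothetical.

```python
class HTTPError(Exception):
    """Stand-in for httpx.HTTPError."""


class HTTPStatusError(HTTPError):
    """Stand-in for httpx.HTTPStatusError."""


class HubHTTPError(HTTPError):
    """Stand-in for HfHubHTTPError: a sibling of HTTPStatusError, not a subclass."""


def raise_not_found():
    raise HubHTTPError("404 Not Found")


def classify_before_fix(request):
    try:
        request()
    except HTTPStatusError:
        return "fail fast"  # dead arm: HubHTTPError is never an HTTPStatusError
    except HTTPError:
        return "retry"  # broad arm swallows the permanent error
    return "ok"


def classify_after_fix(request):
    try:
        request()
    except HubHTTPError:
        return "fail fast"  # permanent errors are caught before the broad arm
    except HTTPError:
        return "retry"
    return "ok"
```

With these stand-ins, the first handler sends a 404 into the retry path while the second fails immediately, which is the behavior change described above.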

Contributor

@Wauplin Wauplin left a comment

Looks good! Thanks for the refactor.
Last comments before getting it merged 🤗

Comment thread docs/source/en/guides/manage-spaces.md Outdated
Note: if you are using a 'cpu-basic' hardware, you cannot configure a custom sleep time. Your Space will automatically
be paused after 48h of inactivity.

**5b. Debug a failing Space by reading its logs**
Contributor

use ### as introduced in #4108

Comment thread src/huggingface_hub/cli/spaces.py Outdated
Comment thread src/huggingface_hub/cli/spaces.py Outdated
if tail is not None:
    logs = deque(logs, maxlen=tail)
for line in logs:
    out.text(line.strip())
Contributor

Suggested change
out.text(line.strip())
all_lines = []
found_logs = False
for line in logs:
    clean_line = line.strip()
    out.text(clean_line)
    if clean_line:
        found_logs = True
if not found_logs and not build:
    out.hint(f"No run logs found for space {space_id}. Try passing --build to fetch build logs instead.")

Suggestion: add a hint if no logs returned? should play well with agents
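
Put together with the `--tail` handling from the diff above, the suggestion amounts to something like the following standalone sketch. The `emit` and `hint` callables stand in for `out.text` and `out.hint`, and `print_logs` is a hypothetical name; the merged code may differ.

```python
from collections import deque
from typing import Callable, Iterable, Optional


def print_logs(
    logs: Iterable[str],
    emit: Callable[[str], None],
    hint: Callable[[str], None],
    build: bool = False,
    tail: Optional[int] = None,
) -> None:
    """Print log lines, keeping only the last `tail` lines when requested."""
    if tail is not None:
        # A bounded deque consumes the whole iterable and keeps only the tail.
        logs = deque(logs, maxlen=tail)
    found_logs = False
    for line in logs:
        clean_line = line.strip()
        emit(clean_line)
        if clean_line:
            found_logs = True
    if not found_logs and not build:
        # Only hint for run logs; empty build logs have no obvious alternative.
        hint("No run logs found. Try passing --build to fetch build logs instead.")
```

Because the deque has to drain the iterator before anything is printed, this approach only makes sense for finite streams, which is consistent with `--follow` and `--tail` being rejected together.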

Member Author

nice!

Comment thread tests/test_cli.py Outdated
Comment thread tests/test_cli.py Outdated
davanstrien and others added 5 commits April 15, 2026 14:51
# Conflicts:
#	docs/source/en/package_reference/cli.md
- Promote "Debug a failing Space" heading to ### (matches PR huggingface#4108 style)
- Add hint when run logs are empty, suggesting --build as alternative
- Add tests covering the empty-logs hint for both run and build modes
- Regenerate CLI reference docs to include hf spaces logs command
Contributor

@Wauplin Wauplin left a comment

Thank you! All good for me once the comment is merged :)

Comment thread tests/test_cli.py Outdated
@Wauplin Wauplin merged commit b7ed360 into huggingface:main Apr 15, 2026
16 of 17 checks passed
@huggingface-hub-bot
Contributor

This PR has been shipped as part of the v1.11.0 release.
