
Show agent turn duration in WebUI #1592

Merged
2 commits merged into nesquena:master from Michaelyklam:feat/turn-duration-display
May 4, 2026

Conversation

@Michaelyklam
Contributor

Summary

  • measure assistant turn duration from the backend pending_started_at timestamp and include it in the streaming done usage payload
  • persist the value on assistant messages as _turnDuration so reloads keep the display
  • show Done in … on the compact Activity row, and as a subtle assistant footer chip in expanded tool-call mode
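
The wiring described by these bullets can be sketched end-to-end in a few lines. This is a hedged reconstruction, not the PR's code: only `pending_started_at`, `duration_seconds`, and `_turnDuration` come from the PR; the function name and session shape are assumptions.

```python
import time

def build_done_usage(session, usage: dict) -> dict:
    """Sketch: attach the turn duration to the streaming `done` usage payload.

    `session.pending_started_at` is set when the turn starts (per the PR);
    `build_done_usage` itself is a hypothetical helper name.
    """
    started = getattr(session, 'pending_started_at', None)
    if started is None:
        # Recovered/legacy flows where the marker is absent: fall back to now.
        started = time.time()
    usage['duration_seconds'] = round(time.time() - started, 3)
    return usage
```

The frontend then copies `duration_seconds` onto the assistant message as `_turnDuration` so reloads keep the display.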

Screenshots / QA

  • Compact mode: Activity row shows Done in 1m 12s, no duplicate footer duration
  • Expanded mode: individual tool cards remain expanded and assistant footer shows Done in 1m 12s

Local browser QA screenshots were captured during validation:

  • MEDIA:/home/michael/.hermes/cache/screenshots/browser_screenshot_143f3490bff248628f89441683062dbf.png
  • MEDIA:/home/michael/.hermes/cache/screenshots/browser_screenshot_74f62a3c45ae4d37ab05c898aa850752.png

Tests

  • python -m pytest tests/test_turn_duration_display.py tests/test_ui_tool_call_cleanup.py tests/test_streaming_markdown.py tests/test_sprint42.py tests/test_sprint49.py -q → 97 passed
  • git diff --check
  • python -m py_compile api/streaming.py
  • Full python -m pytest tests/ -q attempted: 4088 passed, 9 failed, 2 skipped. The 9 failures are existing environment/config-sensitive tests unrelated to this change (test_issue1094_provider_bugs.py, test_model_resolver.py, onboarding MVP tests, and test_sprint28.py::test_personalities_empty_when_none_exist).

@nesquena-hermes
Collaborator

Thanks @Michaelyklam — duration on the Activity row is the kind of small thing that meaningfully changes how the UI feels during long agent turns. The implementation reads cleanly: backend stores _turnDuration on the assistant message so reloads keep the display, the streaming done payload now carries duration_seconds, and the renderer hides the footer chip when compact activity is showing the same value to avoid duplication. That last bit is a nice touch.

I pulled the branch and ran the new + adjacent suites:

tests/test_turn_duration_display.py
tests/test_ui_tool_call_cleanup.py
tests/test_streaming_markdown.py
tests/test_sprint42.py tests/test_sprint49.py
→ 97 passed in 2.41s

Plus a focused sweep across streaming / sse / usage / activity-tagged tests (~310 passed). No new regressions surfaced.

A few notes from the diff

  1. Start-time fallback when pending_started_at is missing. Line in api/streaming.py:

    _turn_started_at = getattr(s, 'pending_started_at', None) or time.time()

    This is the right default for normal flows — /api/chat/start sets pending_started_at = time.time() immediately before the agent thread spawns. For a recovered/re-started session where pending state was already cleared by _recover_pending_turn before this code runs, the fallback to "now" means duration starts ticking from inside _run_agent_streaming, which is close enough to the truth that I don't think it's worth over-engineering. Worth a one-line comment though, since "or time.time()" reads as if it could silently produce a near-zero duration in a regression scenario.

  2. The 0 falsy edge case. getattr(...) or time.time() will also fall through if pending_started_at == 0 (truthy-falsy). That's not realistic since it's a UNIX timestamp set with time.time(), but pending_started_at if pending_started_at is not None else time.time() is a touch safer if you want to be explicit. Optional.
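
    A self-contained illustration of that falsy edge, showing how the two fallback spellings diverge on a zero timestamp:

    ```python
    import time

    pending_started_at = 0  # pathological falsy timestamp, not a realistic value

    # `or` falls through to "now" on ANY falsy value, including 0:
    via_or = pending_started_at or time.time()

    # An explicit None check preserves 0:
    via_none_check = pending_started_at if pending_started_at is not None else time.time()

    print(via_or != 0, via_none_check == 0)  # True True
    ```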

  3. Compact-mode duplicate suppression. The new logic in renderMessages():

    const compactActivityForMessage = isSimplifiedToolCalling() && (
      assistantThinking.has(mi) ||
      (S.toolCalls || []).some(tc => tc && (tc.assistant_msg_idx !== undefined ? tc.assistant_msg_idx : -1) === mi)
    );
    const durationText = compactActivityForMessage ? '' : _formatTurnDuration(msg._turnDuration);

    This correctly suppresses the footer chip when the Activity row is showing duration. One subtle corner: a turn that produces only an assistant text reply (no tool calls, no thinking) has no Activity group, so the duration falls through to the footer, which is what we want. A thinking-only turn might look like it would be compacted via assistantThinking.has(mi), but compactActivityForMessage is gated on isSimplifiedToolCalling() first, so with compact mode off the footer chip is preserved. The footer is only suppressed when there is actually an Activity row to show the duration in. Good.

  4. The data-turn-duration attribute round-trip. _syncToolCallGroupSummary reads group.dataset.turnDuration, which means a group rendered before duration arrived (e.g. mid-stream) won't show the duration text until the attribute is set + summary is re-synced. The attribute set happens in renderMessages() based on sourceMsg._turnDuration, and attachLiveStream updates lastAsst._turnDuration from d.usage.duration_seconds on the done payload. So the path is: done event → _turnDuration populated → next render sets data-turn-duration → _syncToolCallGroupSummary reads it. Looks correctly wired.

  5. Tiny formatter nit. _formatTurnDuration returns ${m}m ${s}s for ≥60s but ${h}h ${m}m for ≥3600s — dropping seconds at the hour boundary. That's fine and probably desired (nobody cares about the seconds in a 2h17m turn), but a 1h00m12s turn renders as "1h 0m" which reads slightly odd. Not blocking.
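
    Since the formatter is only described here, a hedged Python reconstruction of the behaviour in question (the real _formatTurnDuration is WebUI JavaScript; the thresholds are as described above, the name and exact strings here are illustrative):

    ```python
    def _format_turn_duration(seconds):
        # Mirrors the described behaviour: m/s below an hour, h/m above it,
        # with seconds dropped past the hour boundary.
        s = int(round(seconds))
        if s < 60:
            return f"{s}s"
        if s < 3600:
            return f"{s // 60}m {s % 60}s"
        return f"{s // 3600}h {(s % 3600) // 60}m"

    print(_format_turn_duration(72))    # 1m 12s
    print(_format_turn_duration(3612))  # 1h 0m  <- the slightly odd reading
    ```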

  6. 97 passed locally vs the 9 unrelated failures in full pytest. Confirmed those are environment-dependent (test_issue1094_provider_bugs.py, model_resolver, onboarding MVP, sprint28 personalities) and pre-existed master. Not blocking on this PR.

One thing worth double-checking

The _turnDuration is only persisted on the last assistant message in the loop:

if s.messages:
    for _dm in reversed(s.messages):
        if isinstance(_dm, dict) and _dm.get('role') == 'assistant':
            _dm['_turnDuration'] = round(_turn_duration_seconds, 3)
            break

That matches how the existing _turnUsage is written elsewhere, so it's consistent. But if a turn produces multiple assistant messages (a tool-call assistant message followed by a final-text assistant message, the standard tool-use pattern), only the final one gets the duration. The Activity row shows duration via data-turn-duration set on the assistant index that has the activity group, so it's worth checking which index that is.

Looking at renderMessages():

const sourceMsg=S.messages[aIdx]||{};
if(sourceMsg._turnDuration!==undefined) group.setAttribute('data-turn-duration', String(sourceMsg._turnDuration));

aIdx here is the assistant index that anchors the activity group. Tool-use turns typically have one consolidated final assistant message that owns the activity group, so this should land correctly. Worth a manual QA pass on a multi-turn-step scenario (Codex doing 5+ tool calls before its final reply) to confirm "Done in 3m 12s" actually appears on the Activity row in both compact and expanded modes. I think your screenshots capture this case, but it's worth verifying once more.
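
To make the multi-assistant-message concern concrete, the reverse-scan can be reproduced in isolation (the message contents here are made up):

```python
def persist_turn_duration(messages, duration_seconds):
    # Same reverse-scan as the PR's loop: only the LAST assistant
    # message in the turn receives _turnDuration.
    for msg in reversed(messages):
        if isinstance(msg, dict) and msg.get('role') == 'assistant':
            msg['_turnDuration'] = round(duration_seconds, 3)
            break

msgs = [
    {'role': 'user', 'content': 'run the checks'},
    {'role': 'assistant', 'tool_calls': ['...']},          # tool-call step
    {'role': 'tool', 'content': 'ok'},
    {'role': 'assistant', 'content': 'All checks pass.'},  # final reply
]
persist_turn_duration(msgs, 72.0)
print('_turnDuration' in msgs[1], '_turnDuration' in msgs[3])  # False True
```

So the duration lands on the final reply only, which is fine as long as that message is the one anchoring the activity group.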

Verdict

This is well-scoped, has solid test coverage, and the UX choice to dedupe duration between Activity row and footer is the right call. Will queue for stage review. Thanks for the careful work.

@Michaelyklam
Contributor Author

Thanks for the careful review — I made the small cleanup you called out around the start-time fallback.

Follow-up commit: 0eddb05

What changed:

  • Switched the start-time fallback from getattr(..., None) or time.time() to an explicit is not None check, so even an explicit falsy timestamp is preserved.
  • Added a short comment explaining that pending_started_at is the normal path and time.time() is only for recovered/legacy flows where the marker is absent.
  • Tightened the regression test to cover that explicit fallback and comment so this doesn't drift back.

I also re-checked the multi-step/tool-call UI path you mentioned with a synthetic browser QA turn containing thinking + 3 tools:

  • Compact mode: single Activity: thinking + 3 tools row shows Done in 3m 12s, with no duplicate footer duration.
  • Expanded mode: individual tool rows are shown and the assistant footer shows Done in 3m 12s, with no compact Activity duration row.

Verification:

  • /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_turn_duration_display.py tests/test_ui_tool_call_cleanup.py tests/test_streaming_markdown.py tests/test_sprint42.py tests/test_sprint49.py -q → 97 passed
  • git diff --check
  • /home/michael/.hermes/hermes-agent/venv/bin/python -m py_compile api/streaming.py
  • Browser QA screenshots:
    • compact: MEDIA:/home/michael/.hermes/cache/screenshots/browser_screenshot_252bbf66db4044219902f0b1687fc28c.png
    • expanded: MEDIA:/home/michael/.hermes/cache/screenshots/browser_screenshot_e3201d647b3b4e93a1491eded15fd16e.png

nesquena-hermes pushed a commit that referenced this pull request May 4, 2026
… — 4094→4111 tests

- #1586 (Michaelyklam): login asset SW cache exemption
- #1590 (Michaelyklam): hot-apply compact tool activity setting
- #1591 (Michaelyklam): first-turn sidebar visibility (optimistic upserts)
- #1592 (Michaelyklam): turn duration display (Done in 1m 12s) + Opus follow-up (truthy-check on _pending_started_at)
- #1464 (JKJameson, maintainer-augmented): workspace dropdown sort+search+chip-sync (rebased + ternary fix + regression test)

Maintainer-side test fixes in stage:
- tests/test_465_session_branching.py: widen compact() search window 1500→3000
- tests/test_regressions.py: anchor on api('/api/chat/start' instead of comment line

Browser API sanity: 11/11 passed. Live UX verification: vision-confirmed dropdown sort+search+empty-state on test server. Opus advisor: SHIP AS-IS.
@nesquena-hermes closed this pull request by merging all changes into nesquena:master in 4559163 on May 4, 2026
pull Bot pushed a commit to JamesWilliam1977/hermes-webui that referenced this pull request May 4, 2026