Problem
When an agent session both uses the kanban_* tools and shells out to the harness kanban … CLI in the same turn, tasks created via tools can become invisible to subsequent CLI invocations. Symptom from a real orchestrator session:
> kanban_create(title="...", assignee="space-pipeline", ...)
→ returns {"task_id": "t_abc123", "status": "ok"}
> harness kanban show t_abc123
no such task: t_abc123
The task does exist: a direct SQLite probe of the correct per-board DB confirms it. The CLI is reading a different board's DB.
Root cause
kanban_db.connect() resolves the board via HERMES_KANBAN_BOARD env var → <root>/kanban/current file → default. The two surfaces resolve differently:
- kanban_* tools run inside the agent process, where HERMES_KANBAN_BOARD is set (either by the dispatcher when spawning a worker, or by the user's shell when launching harness -p <profile> chat). They reliably hit the right board.
- harness kanban … shelled from within an agent session is a fresh subprocess. It inherits the parent shell's env. If HERMES_KANBAN_BOARD wasn't set in that env (common, since most users don't export it; they use harness kanban boards switch <slug>, which just writes to the current file), the CLI falls back to the current file.
- The current file is global state. Any other concurrent harness session can flip it via harness kanban boards switch …. When that happens, the orchestrator session's tool calls keep targeting the original board (env-pinned), but the orchestrator's harness kanban … shell calls suddenly target the new board.
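To make the divergence concrete, here is a minimal Python sketch of the resolution order described above (env var, then the current file, then default). It is not the real kanban_db.connect() code: resolve_board and demo are hypothetical names, and the assumption that the current file holds just the board slug as text is mine.

```python
import tempfile
from pathlib import Path

ENV_VAR = "HERMES_KANBAN_BOARD"


def resolve_board(env, root):
    """Hypothetical resolver: env var, then <root>/kanban/current, then 'default'."""
    pinned = env.get(ENV_VAR, "").strip()
    if pinned:
        return pinned
    current = Path(root) / "kanban" / "current"
    if current.is_file():
        # Assumption: the file contains only the board slug.
        slug = current.read_text(encoding="utf-8").strip()
        if slug:
            return slug
    return "default"


def demo():
    """Show tool-side and shell-side resolution before and after a board switch."""
    with tempfile.TemporaryDirectory() as root:
        current = Path(root) / "kanban" / "current"
        current.parent.mkdir(parents=True)
        current.write_text("space\n", encoding="utf-8")

        tool_env = {ENV_VAR: "space"}  # agent process: board pinned via env
        shell_env = {}                 # shelled CLI: no pin, reads the file

        before = (resolve_board(tool_env, root), resolve_board(shell_env, root))
        # Another session runs "harness kanban boards switch harness-facet".
        current.write_text("harness-facet\n", encoding="utf-8")
        after = (resolve_board(tool_env, root), resolve_board(shell_env, root))
        return before, after
```

Both surfaces agree until the other session rewrites the current file; after that, only the env-pinned side still resolves to space.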
Concrete reproducer
# Terminal 1
harness kanban boards switch space
harness -p space-orchestrator chat
# (in chat) /goal Drive the space board: ... [orchestrator tool-creates a task]
# Terminal 2 (concurrent — e.g. another session, a script, a teammate)
harness kanban boards switch harness-facet
# Back in Terminal 1's orchestrator session, in the same goal turn:
# Tool: kanban_create returns t_xyz successfully (HERMES_KANBAN_BOARD env was
# set when the chat process spawned, persists for tool calls)
# Shell: harness kanban show t_xyz → "no such task: t_xyz"
# (no env, reads current file, sees harness-facet, looks in wrong DB)
Why it bites orchestrators specifically
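Orchestrator personas often need to combine:
- kanban_* tool calls (kanban_create, kanban_complete)
- harness kanban … CLI calls (harness kanban list, harness kanban runs, harness kanban archive); see [Feature]: Add kanban_list, kanban_unblock, kanban_assign, kanban_archive tools for orchestrator profiles #20048
So orchestrator sessions are the workload most likely to mix both surfaces in the same turn, which makes them the workload most likely to trip on this divergence.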
Worker sessions don't hit this because the dispatcher sets HERMES_KANBAN_BOARD in the spawned child's env directly (kanban_db.py:2593-2623), so even shell calls inherit the right pin.
Proposed fix
When a chat session activates a profile that has kanban in its toolsets, set HERMES_KANBAN_BOARD in the child shell environment to the resolved board at chat-start time. Three implementation options:
1. At chat boot: cli.py (or wherever HermesCLI.__init__ finalizes the profile env) reads the active board via kanban_db.get_current_board() once and exports HERMES_KANBAN_BOARD into os.environ for the rest of the session. Subsequent shell-outs inherit it. ~5 LOC.
2. At kanban-toolset registration: When the kanban toolset is enabled (via _check_kanban_mode()), pin the board to env. Same effect, narrower trigger.
3. In the terminal tool's env-passthrough path: When HERMES_KANBAN_BOARD is set in the agent process env, propagate it to the spawned subprocess. (May already happen; needs verification.)
I'd lean toward (1): one-time pin at session start, before any tool registers. Idempotent, easy to test.
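A rough sketch of what option (1) could look like, under stated assumptions: pin_board_for_session is a hypothetical name, and the board lookup is passed in as a callable standing in for kanban_db.get_current_board(), whose real signature I have not verified. An existing pin wins, which is what keeps the call idempotent.

```python
import os

ENV_VAR = "HERMES_KANBAN_BOARD"


def pin_board_for_session(get_current_board, environ=os.environ):
    """Pin the board into the environment once, at session start.

    get_current_board is a stand-in for the real lookup (an assumption).
    If the variable is already set, it is left alone, so repeated calls
    and dispatcher-spawned workers are unaffected.
    """
    existing = environ.get(ENV_VAR, "").strip()
    if existing:
        return existing
    board = get_current_board()
    environ[ENV_VAR] = board
    return board
```

Called once during chat boot, this would look like pin_board_for_session(kanban_db.get_current_board). Child processes started with subprocess and no explicit env argument inherit os.environ, so shell-outs would see the pin, provided the terminal tool does not build its own environment (the open question in option 3).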
Why the original "cache invalidation" guess was wrong
The orchestrator that originally surfaced this called it a "DB-handle caching thing." Investigation showed it has nothing to do with DB handles or caches: two code paths resolve the board differently, and the current file is mutable global state that one of them respects and the other ignores.
Workaround in the meantime
Always pass --board <slug> explicitly to harness kanban invocations from inside an orchestrator session. This is what we now do in the space-orchestrator SOUL.md addendum and the space-kanban-workflow skill. Verbose but reliable.
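For sessions that build CLI calls programmatically, the workaround can be applied in one place. This is a sketch only: with_board is a hypothetical helper, and placing --board directly after "harness kanban" is an assumption about where the CLI accepts the flag; adjust the position if the real parser expects it elsewhere.

```python
import shlex


def with_board(command, board):
    """Return argv for a 'harness kanban ...' call with an explicit --board.

    Assumption: the flag is accepted right after 'harness kanban'.
    Commands that are not kanban CLI calls, or that already carry a
    --board flag, are returned unchanged.
    """
    argv = shlex.split(command) if isinstance(command, str) else list(command)
    if argv[:2] != ["harness", "kanban"]:
        return argv
    if "--board" in argv or any(arg.startswith("--board=") for arg in argv):
        return argv
    return argv[:2] + ["--board", board] + argv[2:]
```

The resulting list can be handed to subprocess.run as-is, which also avoids shell quoting issues with task IDs or titles.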
Discovery context
Hit this while running an autonomous orchestrator /goal on the v0.12.0 release with multiple boards (space, harness-facet, surface, default) on the same install. The orchestrator successfully created task t_04086c86 via kanban_create tool, then immediately tried to harness kanban show t_04086c86 and got "no such task" because the active board had drifted to harness-facet between calls (a different chat session was running on Daniel's other monitor).
Workaround proven working: prepend --board space to every CLI call.
Affected component
CLI / agent-CLI boundary
Severity
P2 — the workaround (always-explicit --board) is documentable and reliable, but the gap is subtle and the failure mode is "looks like a phantom data loss bug" which is hard to diagnose for users without filesystem access.