# Session mirror: garbage not filtered + word-merging/letter drops (#29)

## Problem

The session mirror (Tab 2, `tail -f` of `logs/session-*.log`) produces noisy, barely readable output. There are two distinct problems:

### 1. Garbage lines not filtered

The mirror's `_log_to_mirror()` has only **4 noise patterns** in `NOISE_RE`, plus a 12-char minimum length filter and a few ad hoc checks (listed under "Current filtering" below). Meanwhile, `clean_transcript.py` has **95 garbage patterns** that effectively remove TUI artifacts post-hoc. The mirror lets through all of the following:

| Garbage type | Example from live session |
|---|---|
| Thinking fragments | `g (thinking)`, `n (thinking)`, `✶ s n (thinking)` |
| Spinner activity lines | `Harmonizing… 1`, `✢Harmonizing… 2`, `*Harmonizing… 5` |
| Spinner activity with timing | `Harmonizing…(30s · ↓2.2k tokens)` |
| Token/timing fragments | `1  · 6.7k tokens`, `2 s · 6.7k tokens`, `15.1k tokens` |
| Permission UI chrome | `Esctocancel·Tabtoamend·ctrl+etoexplain` |
| "Do you want to proceed?" UI | `Doyouwanttoproceed?` |
| Permission option repaints | `❯1.Yes`, `2.Yes,allowreadingfromProjects\fromthisproject` |
| Agent tree lines | `├─ Explore (Read core source files) · 10 tool uses` |
| Agent initializing | `⎿  Initializing…` |
| "ctrl+b to run in background" | `ctrl+b to run in background` |
| "ctrl+o to expand" | `+18 more tool uses (ctrl+o to expand)` |
| Running N agents | `Running 2 gents…(ctrl+o to expand)` |
| Status bar timestamps | `[02-1401:41:44] /c/Users/mcwiz/Projects/Hermes (main)` |
| File count lines | `2 s, reading 19 files…` |
| Bash command labels | `Bash command` |
| Bare "thought for Ns" | `(thought for 2s)` |

### 2. Word-merging and letter drops

The `mirror_strip_ansi()` cursor-tracking parser inserts spaces only when the cursor **jumps past** the current column position. This works for simple cases like:

```
\x1b[7;1HIt\x1b[7;4His  →  "It" + gap(3→4) + "is"  →  "It is"  ✓
```

But the Ink TUI renders many words character-by-character at **adjacent columns** with no gap:

```
\x1b[5;20HI\x1b[5;21H'\x1b[5;22Hl\x1b[5;23Hl\x1b[5;24Hr\x1b[5;25He\x1b[5;26Ha\x1b[5;27Hd
→ No column gaps → "I'llread" (correct per parser, wrong per English)
```

The parser **cannot distinguish** "adjacent chars in the same word" from "adjacent chars at the start of a new word" because the TUI uses the same positioning pattern for both.
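
The gap rule can be sketched as below. This is a simplified illustration, not the actual `mirror_strip_ansi()`: the function name `strip_with_gap_spaces` and the regex `CUP_RE` are hypothetical, and only the `ESC [ row ; col H` cursor-position sequence is handled.

```python
import re

# Matches only the cursor-position sequence ESC [ row ; col H.
# A real parser would need to handle many more escape sequences.
CUP_RE = re.compile(r'\x1b\[(\d+);(\d+)H')


def strip_with_gap_spaces(data: str) -> str:
    """Hypothetical simplification of the gap rule: emit a space only when
    a cursor move on the same row lands past the column where the
    previously written text ended."""
    out = []
    row = col = None  # position of the next character; unknown until the first move
    pos = 0
    for m in CUP_RE.finditer(data):
        text = data[pos:m.start()]
        out.append(text)
        if col is not None:
            col += len(text)  # writing text advances the cursor
        new_row, new_col = int(m.group(1)), int(m.group(2))
        if row is not None and new_row == row and new_col > col:
            out.append(' ')  # the cursor jumped over at least one column
        row, col = new_row, new_col
        pos = m.end()
    out.append(data[pos:])
    return ''.join(out)


gapped = strip_with_gap_spaces('\x1b[7;1HIt\x1b[7;4His')
adjacent = strip_with_gap_spaces(
    "\x1b[5;20HI\x1b[5;21H'\x1b[5;22Hl\x1b[5;23Hl"
    "\x1b[5;24Hr\x1b[5;25He\x1b[5;26Ha\x1b[5;27Hd"
)
print(repr(gapped))    # 'It is'
print(repr(adjacent))  # "I'llread"
```

In the second input every move targets the column the cursor is already at, so no rule based on column gaps alone can recover the space between "I'll" and "read".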

**Examples from live session:**

| Mirror output | Should be |
|---|---|
| `I'llreadthecodebase,issues,andwikiinparallel.` | `I'll read the codebase, issues, and wiki in parallel.` |
| `Doyouwanttoproceed?` | `Do you want to proceed?` |
| `ls-la/c/Users/mcwiz/Projects/HermesWiki/` | `ls -la /c/Users/mcwiz/Projects/HermesWiki/` |
| `2>/dev/null\|\|echo"Nolocalwikiclone"` | `2>/dev/null \|\| echo "No local wiki clone"` |
| `2 gents` | `2 agents` (dropped letter) |
| `loal` | `local` (dropped letter) |
| `Lst` | `List` (dropped letter) |
| `one-l esummris` | `one-line summaries` (dropped letters + wrong space) |
| `un acked` | `untracked` (dropped letters + wrong space) |
| `remoebranches` | `remote branches` (dropped letters + merged) |
| `Sow recent session logs` | `Show recent session logs` (dropped letter) |

Letter drops happen when the TUI repaints a line and the PTY read lands mid-repaint: characters from the old render and the new render get interleaved, and the characters at the boundary are lost.

## Current filtering (insufficient)

```python
import re

# Filters applied in _log_to_mirror():
NOISE_RE = [
    re.compile(r'^\s*$'),                        # blank lines
    re.compile(r'^[\u2500\u2550]{10,}$'),        # ─ or ═ horizontal rules
    re.compile(r'^\xb7\s+\S+ing'),               # · spinner lines (middle dot only)
    re.compile(r'^\s*\d+ files? '),              # file count status
]
# Plus: 12-char minimum, separator filter, timestamp fragment filter, 32-line dedup
```
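
To make the gap concrete, the four patterns can be run against a few of the garbage samples from the table above. This is a sketch: the helper `matches_noise` is hypothetical, the use of `search()` is an assumption about how `_log_to_mirror()` applies the list, and the first, sixth, and seventh sample lines are made up to show what the patterns do catch.

```python
import re

NOISE_RE = [
    re.compile(r'^\s*$'),                  # blank lines
    re.compile(r'^[\u2500\u2550]{10,}$'),  # horizontal rules
    re.compile(r'^\xb7\s+\S+ing'),         # spinner lines starting with a middle dot
    re.compile(r'^\s*\d+ files? '),        # file count status
]


def matches_noise(line: str) -> bool:
    """Hypothetical helper: True if any NOISE_RE pattern matches the line."""
    return any(p.search(line) for p in NOISE_RE)


samples = [
    '\xb7 Harmonizing…',       # made-up line: middle-dot spinner, caught
    '✢Harmonizing… 2',         # different spinner glyph, not caught
    'g (thinking)',            # thinking fragment, not caught
    'Doyouwanttoproceed?',     # permission UI, not caught
    '2 s, reading 19 files…',  # digits are not directly followed by " files"
    '3 files changed',         # made-up line: caught by the file count pattern
    '\u2500' * 20,             # made-up line: horizontal rule, caught
]

results = [matches_noise(s) for s in samples]
for line, hit in zip(samples, results):
    print(f'{"filtered" if hit else "PASSES  "}  {line!r}')
```

Only the lines whose prefix exactly fits one of the four anchored patterns are dropped; any variation in glyph or leading text gets through.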

## Plan: B + D + F

### B — Shared filter module (garbage filtering)

Extract `GARBAGE_PATTERNS` and `is_garbage()` from `clean_transcript.py` into a shared `src/transcript_filters.py` module. Both `clean_transcript.py` and `unleashed-c-20.py` import from it.

- **Single source of truth**: add a pattern once, and both scripts get it
- **No pattern drift**: the mirror and the post-hoc cleaner always match
- **No performance concern**: running 95 precompiled regexes against one line of text takes on the order of microseconds, because the matching engine behind Python's `re` is implemented in C. The cost should stay negligible even with 100 terminals open.
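
A minimal sketch of what the shared module could look like. The five patterns below are illustrative stand-ins written for this example, not the real 95 from `clean_transcript.py`, and the body of `is_garbage()` is an assumption about its behavior.

```python
"""Sketch of src/transcript_filters.py.

The patterns here are illustrative only; the real module would carry the
list currently defined in clean_transcript.py.
"""
import re

GARBAGE_PATTERNS = [
    re.compile(r'\(thinking\)\s*$'),         # thinking fragments
    re.compile(r'^\W?\s*\w+ing…'),           # spinner activity, optional leading glyph
    re.compile(r'\d+(\.\d+)?k tokens'),      # token/timing fragments
    re.compile(r'ctrl\+[a-z]\s?to\s?\w+'),   # key hints, with or without spaces
    re.compile(r'^\(thought for \d+s\)$'),   # bare "thought for Ns"
]


def is_garbage(line: str) -> bool:
    """Return True if the line is blank or matches any garbage pattern."""
    stripped = line.strip()
    if not stripped:
        return True
    return any(p.search(stripped) for p in GARBAGE_PATTERNS)


garbage = [
    'g (thinking)',
    '✢Harmonizing… 2',
    'Harmonizing…(30s · ↓2.2k tokens)',
    '15.1k tokens',
    'Esctocancel·Tabtoamend·ctrl+etoexplain',
    'ctrl+b to run in background',
    '(thought for 2s)',
]
prose = "I'll read the codebase, issues, and wiki in parallel."

print(all(is_garbage(g) for g in garbage), is_garbage(prose))
```

Both `clean_transcript.py` and `unleashed-c-20.py` would then use `from transcript_filters import is_garbage`, so a pattern added to the list reaches the mirror and the post-hoc cleaner at the same time.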

### D — Accept word-merging in live mirror

The cursor-tracking parser is fundamentally limited by how the Ink TUI renders text. Word boundaries without column gaps are undetectable at the ANSI level. Accept merged words in the live mirror. Use `clean_transcript.py --fix-spaces` (wordninja) post-session for readable prose.

Live word-splitting in the mirror itself is a separate problem, to be tracked in its own issue (see "Related" below).

### F — Rate-limit mirror writes

Buffer PTY output for 200-500ms before processing for the mirror. Larger chunks → fewer mid-repaint reads → fewer letter drops and less garbage.

- The mirror becomes slightly delayed, but a 200ms lag is not noticeable in a `tail -f` view
- Reduces both garbage volume AND letter drops cheaply
- Combines naturally with B (filtered after buffering)
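
One way the buffering could be sketched is shown below. The class name `MirrorBuffer`, its `sink` callback, and the injectable `clock` are all hypothetical; this version checks the elapsed time on each `feed()` call rather than running a real timer, so the caller would also need to call `flush()` when the PTY goes idle and at session end.

```python
import time
from typing import Callable, List


class MirrorBuffer:
    """Accumulate PTY chunks and pass them on in batches.

    Hypothetical sketch: `sink` stands in for the existing mirror
    processing (ANSI stripping, filtering, writing to the log).
    """

    def __init__(self, sink: Callable[[str], None], interval: float = 0.2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._sink = sink
        self._interval = interval  # seconds to hold data before flushing
        self._clock = clock        # injectable so tests can control time
        self._chunks: List[str] = []
        self._first_at = 0.0       # arrival time of the oldest buffered chunk

    def feed(self, data: str) -> None:
        """Buffer a chunk; flush once the oldest chunk has waited long enough."""
        now = self._clock()
        if not self._chunks:
            self._first_at = now
        self._chunks.append(data)
        if now - self._first_at >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Send everything buffered so far to the sink as one string."""
        if self._chunks:
            self._sink(''.join(self._chunks))
            self._chunks.clear()


# Usage with a fake clock so the behavior is deterministic.
fake_time = [0.0]
received: List[str] = []
buf = MirrorBuffer(received.append, interval=0.2, clock=lambda: fake_time[0])

buf.feed('Ha')
fake_time[0] = 0.1
buf.feed('rmon')    # still inside the 200ms window, nothing sent yet
fake_time[0] = 0.25
buf.feed('izing')   # window elapsed, the three chunks go out as one
print(received)
```

Because the sink sees `'Harmonizing'` as a single string instead of three partial reads, the filter from plan B runs on complete repaints rather than fragments.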

## Implementation steps

1. Create `src/transcript_filters.py` — extract `GARBAGE_PATTERNS`, `COMPACTION_KEEP`, `is_garbage()`, `normalize_for_dedup()` from `clean_transcript.py`
2. Update `clean_transcript.py` to import from shared module
3. Update `unleashed-c-20.py` `_log_to_mirror()` to use `is_garbage()` instead of `NOISE_RE`
4. Add 200ms write buffer in `_log_to_mirror()` (accumulate data, flush on timer)
5. Test: run a session and compare mirror output before/after

## Related

- `src/clean_transcript.py` — 95 garbage patterns, wordninja integration
- `docs/runbooks/0908-transcript-cleaning.md` — post-session cleaning workflow
- Separate issue needed for live word-splitting (wordninja or alternative in the mirror itself)
