feat: interactive REPL — the first step toward a Claude-Code-style SRE terminal#591
…ands, and follow-up handling
Greptile Summary
This PR ships a Claude-Code-style interactive REPL for

Confidence Score: 4/5
Safe to merge with the understanding that the remaining open thread (context accumulation after /investigate) is tracked as a follow-up.
All P0/P1 findings from previous rounds are resolved. Two P2 findings remain (history ok-flag accuracy and unused config param), neither of which affects the primary investigation flow.
app/cli/repl/commands.py (_cmd_investigate_file context accumulation, tracked in prior thread) and app/cli/repl/loop.py (follow-up ok-flag)

Important Files Changed

    def classify_input(text: str, session: ReplSession) -> InputKind:
        """Classify a single line of REPL input.

        Rules (in order):
        1. Anything starting with ``/`` is a slash command.
        2. If there is no previous investigation, treat as a new alert.
        3. If the input has alert-shaped signals, treat as a new alert.
        4. If the input is a short question, treat as a follow-up.
        5. Otherwise default to a new alert (safer — produces a fresh run rather
           than a free-floating chat message).
        """
        stripped = text.strip()
        if stripped.startswith("/"):
            return "slash"

        # A bare word that matches a known slash command is almost always a typo
        # for the slash command itself — route it there instead of triggering a
        # full investigation.
        if stripped.lower() in _BARE_COMMAND_ALIASES:
            return "slash"

        if session.last_state is None:
            return "new_alert"

        if _mentions_alert_signal(stripped):
            return "new_alert"

        if _is_short_question(stripped):
            return "follow_up"

        return "new_alert"

Path: app/cli/repl/router.py (lines 82-112)
**Alert-keyword check blocks follow-up questions that mention metric names**
Rule 3 (`_mentions_alert_signal`) fires before rule 4 (`_is_short_question`), so any follow-up that includes a metric or incident keyword ("why did CPU spike?", "what caused the memory error?", "how did connection drop?") is classified as `new_alert` and triggers a fresh investigation rather than a grounded answer. The PR description explicitly lists "why did CPU spike?" as the canonical follow-up example, but the code misroutes it. The existing tests avoid this by only testing inputs with no alert keywords ("why?", "what caused it?").
A simple fix is to check the short-question shape first when prior state exists:
```python
if session.last_state is None:
    return "new_alert"
if _is_short_question(stripped):
    return "follow_up"
if _mentions_alert_signal(stripped):
    return "new_alert"
return "new_alert"
```
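To make the reordering concrete, here is a minimal, self-contained sketch of the suggested rule order. The `Session` class, the `_ALERT_WORDS` set, and both helper bodies are hypothetical stand-ins, not the code in `app/cli/repl/router.py`; only the order of the checks mirrors the suggestion above.

```python
from dataclasses import dataclass
from typing import Literal, Optional

InputKind = Literal["slash", "new_alert", "follow_up"]

# Hypothetical keyword list; the real signal detection may differ.
_ALERT_WORDS = {"cpu", "memory", "error", "spike", "connection", "timeout"}


@dataclass
class Session:
    """Stand-in for ReplSession: only tracks whether a prior run exists."""

    last_state: Optional[dict] = None


def _mentions_alert_signal(text: str) -> bool:
    # Assumed heuristic: any word matches a known alert keyword.
    words = {w.strip("?.,!").lower() for w in text.split()}
    return bool(words & _ALERT_WORDS)


def _is_short_question(text: str) -> bool:
    # Assumed shape check: short and ends with a question mark.
    return len(text) < 90 and text.endswith("?")


def classify_input(text: str, session: Session) -> InputKind:
    stripped = text.strip()
    if stripped.startswith("/"):
        return "slash"
    if session.last_state is None:
        return "new_alert"
    # Question shape is checked before alert keywords, so a follow-up that
    # names a metric is not mistaken for a fresh alert.
    if _is_short_question(stripped):
        return "follow_up"
    # Kept for parity with the suggestion; both remaining branches
    # currently return "new_alert".
    if _mentions_alert_signal(stripped):
        return "new_alert"
    return "new_alert"
```

With a prior investigation in the session, `classify_input("why did CPU spike?", session)` now returns `"follow_up"`, while a long alert-shaped line without a question mark still starts a new investigation.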

    thread = threading.Thread(target=_run_async, daemon=True)
    thread.start()

Path: app/cli/investigate.py (lines 239-241)
**Silent daemon thread leak on 5-second join timeout**
If the background asyncio loop doesn't complete within 5 seconds after cancellation, `thread.join(timeout=5)` returns silently and the daemon thread keeps running (with its LLM call in-flight). There's no log warning, so this is invisible in production. Consider logging a warning so it's observable:
```python
thread.join(timeout=5)
if thread.is_alive():
    import logging
    logging.getLogger(__name__).warning(
        "investigation thread did not terminate within 5s after cancellation"
    )
```
Pull request overview
Adds an interactive “zero-exit” REPL mode to opensre (when run with no args on a TTY) and moves local investigations onto the same live-streaming rendering path used for remote runs, aligning the CLI UX with the direction in #243.
Changes:
- Introduces `app/cli/repl/` (banner, session state, router, slash commands, follow-up handling, async loop) plus unit tests.
- Switches the one-shot `opensre investigate` Click command to a streaming execution path.
- Improves streaming renderer robustness (always stops spinner) and tweaks banners/typing to avoid warnings.
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `app/cli/__main__.py` | Enters REPL when invoked with no subcommand on a TTY. |
| `app/cli/commands/general.py` | Reworks investigate Click command to use streaming runner. |
| `app/cli/investigate.py` | Adds session-oriented streaming runner with Ctrl+C cancellation wiring. |
| `app/cli/repl/__init__.py` | Exposes `run_repl`. |
| `app/cli/repl/banner.py` | Adds REPL identity banner (provider/model + hints). |
| `app/cli/repl/commands.py` | Implements slash command registry + dispatch. |
| `app/cli/repl/follow_up.py` | Implements follow-up answering grounded on last investigation. |
| `app/cli/repl/loop.py` | Implements the async REPL loop and routing to actions. |
| `app/cli/repl/router.py` | Implements input classification (slash vs new alert vs follow-up). |
| `app/cli/repl/session.py` | Adds persistent per-session state container. |
| `app/nodes/resolve_integrations/node.py` | Adjusts annotation style to avoid LangGraph warning. |
| `app/remote/renderer.py` | Adds `local=` flag and guarantees spinner cleanup on exceptions. |
| `pyproject.toml` | Adds `prompt_toolkit` as a direct dependency. |
| `README.md` | Documents the new interactive mode. |
| `tests/cli/repl/test_commands.py` | Unit tests for slash dispatch. |
| `tests/cli/repl/test_router.py` | Unit tests for input classification. |
| `tests/cli/repl/test_session.py` | Unit tests for session state behavior. |

        payload = load_payload(
            input_path=input_path,
            input_json=input_json,
            interactive=interactive,
        )
        result = run_investigation_cli_streaming(raw_alert=payload)
        write_json(result, output)
    except SystemExit:

run_investigation_cli_streaming() prints streaming UI to stdout via StreamRenderer, and then write_json(result, output) prints JSON to stdout when --output is not provided. This mixes human terminal output with JSON and breaks consumers that expect stdout to be valid JSON (especially under --json). Consider gating: when JSON output is desired, run the non-streaming run_investigation_cli() (or emit streaming to stderr) so stdout remains clean JSON, and only stream in interactive mode when not producing JSON.
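One way to read the "emit streaming to stderr" option is sketched below. It is only an illustration under assumptions: `run_with_clean_stdout` and its parameters are hypothetical names, and the event strings stand in for whatever `StreamRenderer` actually prints.

```python
import io
import json
import sys
from typing import Iterable, TextIO


def run_with_clean_stdout(
    events: Iterable[str],
    result: dict,
    out: TextIO = sys.stdout,
    err: TextIO = sys.stderr,
) -> None:
    """Write progress lines to `err` and the final JSON document to `out`."""
    for event in events:
        # Human-facing progress never touches the JSON stream.
        print(f"[progress] {event}", file=err)
    json.dump(result, out)
    out.write("\n")


# Demonstration with in-memory streams instead of the real terminal.
demo_out, demo_err = io.StringIO(), io.StringIO()
run_with_clean_stdout(["plan", "investigate"], {"root_cause": "oom"}, demo_out, demo_err)
parsed = json.loads(demo_out.getvalue())
```

Because everything on `out` is a single JSON document, a consumer piping stdout into a JSON parser keeps working while the progress lines stay visible on the terminal.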
Hey @yashksaini-coder let me know if you need any help with this one 🚀
# Conflicts:
# app/nodes/resolve_integrations/node.py
# docs/daily-updates/2026-04-21.mdx
# docs/daily-updates/2026-04-22.mdx
# docs/daily-updates/overview.mdx
Progress update

State of the branch
CI is fully green (quality, typecheck, test, CodeQL, Analyze all passing). The PR is mergeable against upstream/main.

Credits for the P1 lift
Most of the P1 work (the big jump from 8 to 26 slash commands plus the two-axis config) came from @rrajan94 in three commits:
Thanks for picking those up. The branch is much richer as a result.

What we resolved this session
Four CI/review gaps closed:

How the #243 product requirements stand today

How the discussion #614 decisions are reflected

Open review threads hygiene
Six threads are still "unresolved" in the UI, but five of them are fixed in later commits; they just need Greptile's next re-scan or a UI-level click. Specifically:
I've replied on each with the resolving SHA. Happy to click Resolve myself if access allows.

What's next: phased plan
Immediate (this PR): nothing blocking merge. Classic mode works end-to-end: streaming investigations, grounded follow-ups, 26 slash commands, two-axis config, clean exit paths.
Follow-up A: Phase B (pinned layout, ~1,080 LoC)
Follow-up B: Phase C (seven parallel per-tool PRs)
Each tool PR is ~150–400 LoC, none blocks any other.
Deferred to their own issues (per team decision)

Thanks
Thanks to @davincios for the product direction calls in #614: default ON for agentic, narrow PR scope per tool, markdown for memory, MCP separate. Those made the next steps concrete.
Thanks to @rrajan94 for landing the P1 command-surface lift. Happy to hand off Phase B or any of the Phase C tool PRs to whoever's interested.
Thanks to @muddlebee for the spot-check on the follow-up LLM prompt, which keeps us honest against the #243 spec.
Ready for review whenever the maintainers have a moment.
…active flag was resolving to False in both cases
@yashksaini-coder nice work. can we have a demo pls? 👀

    # Keys from a completed AgentState that carry reusable infra context into
    # the next investigation. Kept as a class-level tuple so any caller that
    # wants to know "what counts as accumulated context" has a single source.
    _ACCUMULATED_KEYS: tuple[str, ...] = (

`_ACCUMULATED_KEYS` has a type annotation on a dataclass without `ClassVar`, so the dataclass machinery treats it as a regular field with a default: it appears in `__init__` as an overridable kwarg and will be flagged by mypy. Change to `_ACCUMULATED_KEYS: ClassVar[tuple[str, ...]] = (...)` and add `ClassVar` to the typing import.
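The difference is easy to see in isolation. The sketch below uses two hypothetical classes and made-up key names (`"service"`, `"cluster"`), not the real session class; it only shows how `dataclasses.fields` treats an annotated attribute with and without `ClassVar`.

```python
from dataclasses import dataclass, fields
from typing import ClassVar, Optional


@dataclass
class WithoutClassVar:
    # Annotated without ClassVar: the decorator turns this into a field,
    # so it also becomes an __init__ keyword argument.
    _ACCUMULATED_KEYS: tuple[str, ...] = ("service", "cluster")
    last_state: Optional[dict] = None


@dataclass
class WithClassVar:
    # ClassVar tells the decorator to leave this as a plain class attribute.
    _ACCUMULATED_KEYS: ClassVar[tuple[str, ...]] = ("service", "cluster")
    last_state: Optional[dict] = None


field_names_without = [f.name for f in fields(WithoutClassVar)]
field_names_with = [f.name for f in fields(WithClassVar)]

# The unintended field can be overridden per instance.
overridden = WithoutClassVar(_ACCUMULATED_KEYS=("oops",))
```

Here `field_names_without` includes `_ACCUMULATED_KEYS`, while `field_names_with` contains only `last_state`.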

        return True


    async def _repl_main(initial_input: str | None = None, config: ReplConfig | None = None) -> int:  # noqa: ARG001

config is silently ignored (# noqa: ARG001) — layout is never applied to the Console. This is fine today since only 'classic' exists, but once 'pinned' ships, the wire-up will need to happen here. Add a TODO comment or route to a layout-specific console factory so this doesn't get forgotten.

    )

    event_queue: queue.Queue[StreamEvent | BaseException | None] = queue.Queue()
    loop_ref: dict[str, asyncio.AbstractEventLoop] = {}

Race window: if KeyboardInterrupt fires between thread.start() and loop_ref['loop'] = loop (line 218), _cancel_pump() reads loop_ref.get('loop') as None and silently no-ops — the investigation thread leaks. Consider using a threading.Event (set after loop_ref is written) that _cancel_pump waits on briefly before calling call_soon_threadsafe.
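A minimal sketch of the `threading.Event` idea follows. It is a simplification under assumptions: the background loop just runs until stopped (the real code cancels a task rather than stopping the loop), and `loop_ready`, `finished`, and the `wait_seconds` parameter are hypothetical names.

```python
import asyncio
import threading

loop_ref: dict[str, asyncio.AbstractEventLoop] = {}
loop_ready = threading.Event()
finished = threading.Event()


def _run_async() -> None:
    loop = asyncio.new_event_loop()
    loop_ref["loop"] = loop
    loop_ready.set()  # publish only after loop_ref is written
    try:
        loop.run_forever()
    finally:
        loop.close()
        finished.set()


def _cancel_pump(wait_seconds: float = 1.0) -> bool:
    """Stop the background loop; return False if it never became ready."""
    if not loop_ready.wait(timeout=wait_seconds):
        return False
    loop = loop_ref["loop"]
    # Scheduled from the foreground thread; runs on the loop's own thread.
    loop.call_soon_threadsafe(loop.stop)
    return True


thread = threading.Thread(target=_run_async, daemon=True)
thread.start()
cancelled = _cancel_pump()  # safe even when called right after start()
thread.join(timeout=5)
```

The short wait replaces the silent no-op: a cancel issued right after `thread.start()` blocks briefly until the loop reference exists, and the `False` return gives the caller something to log if it never does.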

    try:
        final_state = renderer.render_stream(_events())
    except KeyboardInterrupt:
        _cancel_pump()

_cancel_pump() is called twice on the Ctrl+C path: once in _events() (line 262) and again here. The second call is a no-op because the task is already cancelled, but the double-call is misleading. Remove the call from _events() and let this outer except be the single cancellation point — it's the only place that knows render_stream has exited.

    # Short, question-shaped strings that obviously target the previous investigation.
    _FOLLOW_UP_CUES = (

_FOLLOW_UP_CUES misses common question starters: "when", "where", "which", "who", "did". Without them, inputs like "when did this start?" or "where is the bottleneck?" (<90 chars, end with ?) still route to follow_up because of the endswith('?') check in _is_short_question, but "when did the spike start" (no ?) routes to new_alert and fires a fresh investigation instead. Either extend the cue list or rely solely on the '?' suffix for follow-up detection.
# Conflicts:
# app/cli/commands/general.py
`StreamRenderer.render_stream` called `_print_report()` outside the try/finally that wraps event iteration. Any non-`KeyboardInterrupt` exception (LLM quota, network error) stopped the spinner correctly but silently dropped accumulated `_final_state` before propagating; the user, who had been watching the report stream live, saw only the raw exception.

Move `_print_report()` into the finally block alongside `_finish_active_node()`. Both have the same invariant: always run, even when the stream raises.

Add a regression test that streams a partial investigation, raises mid-iteration, and asserts the report flushes and accumulated state is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
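The invariant described in this commit message can be sketched in a few lines. `SketchRenderer` is a hypothetical stand-in, not `StreamRenderer` itself; only the method names `_finish_active_node`, `_print_report`, and `render_stream` come from the message, and their bodies here are placeholders.

```python
from typing import Iterable, Iterator


class SketchRenderer:
    def __init__(self) -> None:
        self._final_state: dict[str, str] = {}
        self.printed: list[dict[str, str]] = []
        self.spinner_stopped = False

    def _finish_active_node(self) -> None:
        self.spinner_stopped = True

    def _print_report(self) -> None:
        # Placeholder for rendering: record what would have been shown.
        self.printed.append(dict(self._final_state))

    def render_stream(self, events: Iterable[tuple[str, str]]) -> dict[str, str]:
        try:
            for key, value in events:
                self._final_state[key] = value
        finally:
            # Both calls run even when iteration raises.
            self._finish_active_node()
            self._print_report()
        return self._final_state


def _failing_events() -> Iterator[tuple[str, str]]:
    yield ("plan", "check pods")
    raise RuntimeError("LLM quota exceeded")


renderer = SketchRenderer()
try:
    renderer.render_stream(_failing_events())
except RuntimeError as exc:
    error_message = str(exc)
```

The exception still propagates to the caller, but the partial state gathered before the failure has already been flushed by the time it does.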
The agent module docstring promised the explicit subcommand always starts the REPL on user intent, but `run_repl`'s non-TTY guard caused piped/CI invocations (e.g. `echo "..." | opensre agent`) to silently return 0 with no output, a confusing no-op.

Add an explicit TTY check in `agent_command` that raises a clear `OpenSREError` with a remediation suggestion. Tighten the docstring so it no longer overpromises.

Update the existing `test_agent_subcommand_*` tests to monkeypatch `isatty=True` (their semantic is "user runs the command from a real terminal") and add a regression test that confirms non-TTY invocation surfaces a clear error instead of silently succeeding.

Leave `run_repl`'s non-TTY guard in place; it's still correct for the bare-`opensre` flow, where silently falling through to the landing page is the desired behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
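A rough sketch of the explicit TTY check is shown below. The `OpenSREError` class here is a local stand-in whose constructor signature is assumed, and `require_tty` is a hypothetical helper name; the actual check lives in `agent_command`.

```python
import io
import sys
from typing import TextIO


class OpenSREError(Exception):
    """Stand-in for the project's error type; the real signature may differ."""

    def __init__(self, message: str, suggestion: str = "") -> None:
        super().__init__(message)
        self.suggestion = suggestion


def require_tty(stream: TextIO = sys.stdin) -> None:
    """Raise a clear error instead of silently returning when stdin is piped."""
    if not stream.isatty():
        raise OpenSREError(
            "the interactive agent needs a terminal (stdin is not a TTY)",
            suggestion="run it from an interactive shell, or use "
            "`opensre investigate -i alert.json` for one-shot runs",
        )


# A StringIO behaves like piped input: isatty() returns False.
try:
    require_tty(io.StringIO("piped input"))
    outcome = "started"
except OpenSREError as exc:
    outcome = f"error: {exc}"
    hint = exc.suggestion
```

Failing loudly here keeps the bare-`opensre` fall-through untouched while making the explicit subcommand's failure mode visible in CI logs.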
@yashksaini-coder please add an e2e video demo for showcase also.

merging this for velocity, let's fix the issues in a new patch

⚡ LGTM → Merged. @yashksaini-coder, your work is in. Every commit counts — thank you for this one. 👋 Join us on Discord - OpenSRE : hang out, contribute, or hunt for features and issues. Everyone's welcome.

What this does
Opens up an interactive, zero-exit REPL when you run `opensre` with no arguments in a terminal. Each input is classified as a slash command, a new incident description, or a follow-up question on the previous investigation, and the existing LangGraph pipeline streams live into the session. Walking incidents through a single persistent terminal is the product direction in #243; this PR ships the foundation.

Showcase
- `opensre`: a compact banner with the active provider, model, and quick hints appears.
- `/help`, `/status`, `/reset`, `/trust`, `/clear`, `/exit`, `/quit` for session controls. Bare words like `help` or `exit` are accepted too and routed to their slash form.
- … `why did CPU spike?` are routed as follow-ups and answered against the stored final state.

The `opensre investigate -i alert.json` one-shot flow now streams the same way the REPL does: no more waiting for the whole run to finish before anything prints.

What's in the branch
Each commit is a single logical step:
- … `investigate` command to the streaming path.
- … `prompt_toolkit` to a direct dependency.
- … `app/cli/repl/` package: banner, session state, router, slash commands, follow-up handler, async loop.
- … `opensre` is invoked on a TTY with no subcommand.

Out of scope for this PR (known follow-ups)
This is a draft showcase, not the complete Claude Code experience. The following are tracked as next steps:

Verification
- `make lint`: clean
- `make typecheck`: 340 source files, no issues
- `make test-cov`: all REPL unit tests pass; no regressions in the broader CLI suite
- `/help` table shows properly, `help` (no slash) routes the same, Ctrl+C during an investigation cancels cleanly and preserves the session, `/exit` quits.

Closes #243 (draft; expects follow-ups for the deferred items listed above).