feat(cli): make interactive shell documentation-aware (#1166) #1172
Conversation
Procedural how-to questions in the interactive shell now ground answers in the project docs/ directory instead of relying on model memory.
- New docs_reference module discovers MDX pages, ranks by query overlap (slug/title weighted, exact slug match boosted, asset dirs skipped), excerpts the top-N pages, and always appends a compact index.
- cli_help runs its own docs+CLI-grounded LLM call, instructs the model to cite doc pages and avoid inventing setup steps, and falls back to the canonical https://www.opensre.com/docs URL when docs are missing.
- router catches more documentation-style questions ("how do I configure Datadog?", "how do I deploy?", "what are the integrations?", "does opensre support X?", queries mentioning docs/documentation).
- Tests cover discovery, ranking, excerpting, missing-docs fallback, prompt grounding, Markdown rendering, and router classification.

Maintainers: docs are read at runtime from docs/*.mdx — no rebuild step. Drop a new .mdx file in docs/ and it is discovered automatically.
…cer-Cloud#1166) The bare \b(docs|documentation)\b pattern caught incident descriptions that incidentally mention a "docs" service ("the database docs service returned 502 errors..."), routing them to the docs-grounded help handler instead of the LangGraph investigation pipeline.
Replace the bare match with phrasing-anchored patterns: docs/documentation must appear with a question word (what/where/which, do/does/are/is), a preposition (in/according to/per), or a read-style verb (check/read/see/find/search/show/reference/consult/look at).
Add a regression test for incident text that mentions docs.
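A minimal sketch of what such phrasing-anchored patterns could look like. The pattern names, groupings, and the helper below are hypothetical, not the actual contents of router.py:

```python
import re

# Hypothetical sketch of the anchoring described above: "docs"/"documentation"
# only counts as a help signal next to a question word, a citation
# preposition, or a read-style verb.
_DOCS_HELP_PATTERNS = [
    # question word ... docs in the same clause
    re.compile(r"\b(what|where|which)\b[^.!\n]*\b(docs|documentation)\b", re.IGNORECASE),
    # read-style verb immediately before docs
    re.compile(
        r"\b(check|read|see|find|search|show|reference|consult|look at)\b"
        r"\s+(the\s+)?(docs|documentation)\b",
        re.IGNORECASE,
    ),
    # citation preposition before docs
    re.compile(r"\b(according to|per)\s+(the\s+)?(docs|documentation)\b", re.IGNORECASE),
]

def mentions_docs_as_reference(text: str) -> bool:
    return any(p.search(text) for p in _DOCS_HELP_PATTERNS)
```

With this shape, the incident phrasing from the commit message ("the database docs service returned 502 errors") no longer matches, while "check the docs" and "according to the docs" still do.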
Greptile Summary: This PR makes the OpenSRE interactive shell documentation-aware. Confidence Score: 5/5 — safe to merge; all remaining findings are P2 style/documentation issues with no correctness impact. The implementation is solid: the caching strategy is appropriate, prompt construction is well-guarded, and the routing regression test prevents the key misrouting scenario. The three P2 findings are a misleading module docstring about cache freshness, a private function incorrectly listed in __all__, and the depth-penalty behaviour in app/cli/interactive_shell/docs_reference.py — worth fixing before the docs/ directory grows many nested pages.
Sequence Diagram

```mermaid
sequenceDiagram
    participant U as User (shell input)
    participant R as router.py
    participant CH as cli_help.py
    participant DR as docs_reference.py
    participant FS as docs/ (filesystem)
    participant LLM as LLM (reasoning)
    U->>R: classify_input(text)
    R-->>CH: "cli_help" → answer_cli_help(question)
    CH->>DR: build_docs_reference_text(question)
    DR->>FS: _discover_docs_cached(root) [lru_cache]
    FS-->>DR: tuple[DocPage, ...]
    DR->>DR: find_relevant_docs(query, pages, top_n=4)
    DR->>DR: _score() per page (slug/title/heading/body weights + depth penalty)
    DR-->>CH: reference text (excerpts + index, ≤22,000 chars)
    CH->>CH: _build_grounded_prompt(question, cli_ref, docs_ref)
    CH->>LLM: client.invoke(prompt)
    LLM-->>CH: response
    CH->>U: console.print(Markdown(response))
```
Reviews (1): Last reviewed commit: "fix(cli): tighten docs router pattern to..."
…acer-Cloud#1166) Three fixes from automated review on PR Tracer-Cloud#1172:
1. Module docstring overpromised freshness. _discover_docs_cached is decorated with @lru_cache(maxsize=1), so parsed pages are frozen for the life of the process. Edits to docs/*.mdx during a running shell are NOT reflected until restart. Updated the "How docs stay fresh" section to describe the actual behavior.
2. _build_grounded_prompt has a leading underscore (private) but was listed in __all__, signalling a stable public API. Tests already import the symbol directly, so they don't need it in __all__. Removed.
3. The depth penalty in _score was applied unconditionally before the score>0 filter, so a legitimately-matching page nested 2+ levels deep with raw_score <= depth would score 0 (or negative) and be excluded entirely. Now the penalty only applies when there is a positive raw match score, and the result is clamped to a floor of 1, so weak tutorials/ or use-cases/ matches still surface as lower-ranked results. Added a regression test that fails against the old behavior.
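The penalty fix in point 3 can be sketched as a standalone helper. The function name and one-point-per-level penalty are illustrative; only the positive-score guard and the floor of 1 come from the commit message:

```python
def apply_depth_penalty(raw_score: int, depth: int) -> int:
    """Sketch of the fixed logic: penalize only pages that actually
    matched, and never drive a match below a floor of 1."""
    if raw_score <= 0:
        return 0  # no match: page is filtered out regardless of depth
    return max(raw_score - depth, 1)  # deep-but-matching pages still surface
```

Under the old unconditional subtraction, a page at depth 5 with a raw score of 2 would score -3 and vanish; with the clamp it survives at rank-floor 1.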
@VaibhavUpreti @davincios kindly review
hey @Davidson3556 One more routing case to fix before merge. It can be fixed by either reordering the classifier checks or tightening the docs patterns:
…racer-Cloud#1166) Review feedback on PR Tracer-Cloud#1172: classify_input checks _is_cli_help_intent before _reads_like_investigation_request, so any docs regex hit short-circuits the investigation pipeline. The bare \b(in)\s+(the)?\s+(docs|documentation)\b pattern fired on incident phrasing like "the API errors are happening in docs" and misrouted it to cli_help.
Split the third docs pattern in two:
- "according to (the) docs" / "per (the) docs" — citation phrasings, almost exclusively docs questions, kept without a question-shape requirement.
- "in (the) docs/documentation" — too broad on its own, now requires a `?` reachable in the same clause ([^.!\n]*\?), so incident text routes to the investigation pipeline while legitimate phrasings like "in the docs, where is the OAuth flow?" still match.
Tests:
- regression: "the API errors are happening in docs" → new_alert
- preservation: "in the docs, where is the OAuth flow?" → cli_help
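A sketch of the two resulting patterns. The names are hypothetical; the same-clause check is the [^.!\n]*\? idea described above, which lets a `?` be reached only if no sentence-ending punctuation intervenes:

```python
import re

# Citation phrasing: almost exclusively docs questions, no '?' required.
CITATION_DOCS = re.compile(
    r"\b(according to|per)\s+(the\s+)?(docs|documentation)\b", re.IGNORECASE
)

# "in (the) docs" only counts when a '?' is reachable in the same clause,
# i.e. before any '.', '!', or newline.
IN_DOCS_QUESTION = re.compile(
    r"\bin\s+(the\s+)?(docs|documentation)\b[^.!\n]*\?", re.IGNORECASE
)
```

Incident text like "the API errors are happening in docs" has no question mark in the clause, so it falls through to the investigation pipeline; "in the docs, where is the OAuth flow?" still matches.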
Thank you, good catch. Fixed in 21a4347. Took the second path (tighten patterns to require question-shape) over reordering. Considered the reorder, but the cli_help > new_alert priority is load-bearing for cases like "How do I run an investigation?". A blanket reorder, or a hybrid keyed on mentions_alert_signal, would also misroute legit config questions that happen to mention an alert keyword (e.g. "how do I configure Datadog when my errors are spiking"). The pattern fix is more targeted: only the bare "in (the) docs" phrasing is tightened.
Regression test added (test_short_incident_with_in_docs_phrase_routes_to_new_alert) using your exact example, plus a preservation test for the legitimate question-shaped phrasing.
kindly review
LGTM 👍
Thanks for the quick turnaround on the "in docs" fix @Davidson3556 🙌
🧑‍💻 @Davidson3556 has entered the contributor hall of fame. Merged. Done. Shipped. Go touch grass (then come back with another PR). 🌱 👋 Join us on Discord - OpenSRE: hang out, contribute, or hunt for features and issues. Everyone's welcome.

Fixes #1166
Describe the changes you have made in this PR -
Make the OpenSRE interactive shell documentation-aware so procedural how-to
questions ("how do I configure Datadog?", "how do I deploy this?", "how do I
run an RCA?") are answered from the current project docs/ directory instead of
relying on model memory.
What changed:
app/cli/interactive_shell/docs_reference.py walks docs/, parses MDX frontmatter and headings, ranks pages by query-token overlap (slug/title weighted, exact-slug match boosted, nested-path penalized, asset/image/font dirs skipped), excerpts the top-N pages, and always appends a compact index of all available pages.
app/cli/interactive_shell/cli_help.py now runs its own LLM call grounded in both the docs reference and the CLI --help reference. The system prompt explicitly tells the model to cite doc page names, avoid inventing setup steps that are not in the docs, and fall back to https://www.opensre.com/docs when the local docs/ directory is missing (e.g. non-editable installs).
app/cli/interactive_shell/router.py adds patterns so docs-style questions route to the docs-grounded handler:
- configure / deploy / integrate / connect / set up phrasings,
- "what is/are the integrations/features/...",
- "does opensre support ...", "can opensre integrate with ...",
- explicit references like "check/according to the docs".
The bare docs/documentation token is intentionally NOT a help signal, so an incident description that mentions a service named "docs" still routes to the LangGraph investigation pipeline.
Maintainer note: docs are read at runtime from docs/*.mdx. There is no build step and no cache file to keep in sync. Adding a new .mdx file under docs/ makes it discoverable the next time the shell starts (parsed pages are cached with lru_cache for the life of the process).

Demo/Screenshot for feature changes and bug fixes -
Retrieval smoke test against the real docs/ directory (no LLM call,
runs in any environment):
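The actual smoke test isn't reproduced here; a minimal stand-in with a temporary docs tree and a toy slug-match ranking shows the shape of a no-LLM retrieval check:

```python
import pathlib
import tempfile

# Stand-in for the retrieval smoke test: build a tiny docs/ tree, then
# check that a query token pulls the matching page to the top without
# calling any LLM. The ranking here is deliberately trivial.
with tempfile.TemporaryDirectory() as tmp:
    docs = pathlib.Path(tmp)
    (docs / "datadog.mdx").write_text("# Datadog\nCreate an API key and an Application key.\n")
    (docs / "deploy.mdx").write_text("# Deploy\nRun the deploy command.\n")

    query_tokens = {"configure", "datadog"}
    pages = sorted(docs.glob("*.mdx"))
    # Pages whose slug appears in the query sort first.
    ranked = sorted(pages, key=lambda p: p.stem in query_tokens, reverse=True)
    best = ranked[0].stem
    print(best)
```

The real module's ranking is richer (weights, boosts, penalties), but the test shape is the same: files in, ranked slugs out, no network.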
End-to-end inside the opensre shell: typing "how do I configure Datadog?" streams an answer that quotes the API key and Application key steps from docs/datadog.mdx and cites the page.
Code Understanding and AI Usage
Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?
If you used AI assistance:
Explain your implementation approach:
Problem: the interactive CLI helps users operate the agent but had no way to
ground answers in the current OpenSRE docs. Procedural questions ("how do I
configure Datadog?", "how do I deploy?") fell back to model memory and could
drift from the docs.
Approaches considered:
- Fetching the hosted docs site at runtime: rejected because it adds a network dependency and a hosted-docs lag.
- Embedding-based semantic search: rejected because it is heavyweight for ~140 small Mintlify MDX pages and would add an embedding model dependency.
The chosen approach is local-first and dependency-free: the file system is the source of truth, re-read when a new shell process starts.
Key components I added:
app/cli/interactive_shell/docs_reference.py: walk the docs root, parse
YAML-style frontmatter, derive a display title (frontmatter > first H1 >
slug). Asset/image/font/style/snippet directories are skipped because
they are not user-facing prose.
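The frontmatter > first H1 > slug precedence can be sketched roughly like this (the regexes and function name are illustrative, not the module's actual parsing):

```python
import re

def derive_title(slug: str, text: str) -> str:
    """Sketch of the display-title precedence described above:
    frontmatter `title:` > first `# H1` > slug."""
    # YAML-style frontmatter: a leading --- block
    fm = re.match(r"\A---\n(.*?)\n---", text, re.DOTALL)
    if fm:
        m = re.search(r'^title:\s*"?([^"\n]+?)"?\s*$', fm.group(1), re.MULTILINE)
        if m:
            return m.group(1)
    # Fall back to the first markdown H1
    h1 = re.search(r"^#\s+(.+)$", text, re.MULTILINE)
    if h1:
        return h1.group(1).strip()
    # Last resort: the file slug itself
    return slug
```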
Token-overlap ranking: stopwords are removed from the query (including "opensre" itself, because every page mentions it). Slug and title hits weigh more than body hits because docs are organized by topic and the slug usually IS the integration name. An exact slug match (slug "datadog" vs query token "datadog") gets a +12 boost so the canonical setup page outranks comparison pages. Subdirectory pages get a small depth penalty so root-level integration pages outrank tutorial deep dives.
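An illustrative score function along these lines; only the +12 exact-slug boost and the floored depth penalty are stated above, while the 4/3/1 weights are made up for the sketch:

```python
def score_page(query_tokens: set, slug: str, title: str, body: str, depth: int) -> int:
    """Sketch of the weighting described above (weights illustrative)."""
    s = 0
    if slug.lower() in query_tokens:
        s += 12  # exact slug match: canonical setup page wins
    s += 4 * sum(1 for t in query_tokens if t in slug.lower())   # slug hits
    s += 3 * sum(1 for t in query_tokens if t in title.lower())  # title hits
    s += 1 * sum(1 for t in query_tokens if t in body.lower())   # body hits
    if s > 0 and depth > 0:
        s = max(s - depth, 1)  # small depth penalty, floored at 1
    return s
```

So for the query token "datadog", the root-level datadog page outranks a nested comparison page, but the comparison page still surfaces with a positive score.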
The reference builder always appends a compact index of all available pages so the LLM can suggest related pages even when nothing matched the query directly. There is a total-character cap so the prompt never exceeds the model context.
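A rough sketch of cap-aware assembly. The 22,000-character cap is taken from the sequence diagram; top_n=4, the per-page excerpt length, and the heading format are illustrative:

```python
def build_reference_text(ranked, all_pages, cap: int = 22_000) -> str:
    """Sketch: excerpt top pages under a total-character cap, then always
    append a compact index so the model can point at unexcerpted pages."""
    parts, used = [], 0
    for page in ranked[:4]:  # top-N excerpts only
        excerpt = f"## {page['title']} ({page['slug']})\n{page['body'][:1500]}\n"
        if used + len(excerpt) > cap:
            break  # stop excerpting, but still emit the index below
        parts.append(excerpt)
        used += len(excerpt)
    index = "## All doc pages\n" + "\n".join(
        f"- {p['slug']}: {p['title']}" for p in all_pages
    )
    return ("\n".join(parts) + "\n" + index)[:cap]
```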
app/cli/interactive_shell/cli_help.py: build the system prompt with both
docs and CLI references, instruct the model to cite doc page names and
avoid inventing setup steps when the docs do not cover the question, fall
back to a CLI-only prompt + canonical docs URL when the local docs/
directory is unavailable.
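A hedged sketch of that prompt assembly and the missing-docs fallback; the structure and wording below are invented, only the cite-pages instruction, the no-invented-steps rule, and the fallback URL come from the description:

```python
FALLBACK_DOCS_URL = "https://www.opensre.com/docs"

def build_grounded_prompt(question: str, cli_ref: str, docs_ref) -> str:
    """Sketch: docs-grounded when a reference text is available,
    CLI-only plus the canonical docs URL when docs/ is missing."""
    if docs_ref:
        grounding = (
            "Answer from the documentation below. Cite doc page names. "
            "Do not invent setup steps that are not in the docs.\n\n"
            f"{docs_ref}"
        )
    else:
        grounding = (
            "Local docs are unavailable; answer from the CLI reference only "
            f"and point the user at {FALLBACK_DOCS_URL} for setup details."
        )
    return f"{grounding}\n\nCLI reference:\n{cli_ref}\n\nQuestion: {question}"
```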
app/cli/interactive_shell/router.py: extended _CLI_HELP_PATTERNS to catch configure / deploy / integrate / connect / set-up phrasings, feature-inventory questions, and explicit "check the docs" / "according to the docs" references. The bare docs/documentation token was intentionally NOT used as a help signal because it caused incident text mentioning a "docs" service to be misrouted away from the LangGraph investigation pipeline (regression test added: test_incident_text_mentioning_docs_still_routes_to_new_alert).

Edge cases handled:
- Missing local docs/ directory: switches to CLI-only mode + canonical docs URL hint, tested.
- Page without a frontmatter title (only a # Heading): falls back to the first H1, tested.

Checklist before requesting a review