Skip to content

feat(explain): syntax highlighting for EXPLAIN output#749

Merged
NikolayS merged 6 commits intomainfrom
fix/745-explain-viz
Mar 27, 2026
Merged

feat(explain): syntax highlighting for EXPLAIN output#749
NikolayS merged 6 commits intomainfrom
fix/745-explain-viz

Conversation

@NikolayS
Copy link
Copy Markdown
Owner

Closes #745

Adds terminal ANSI color highlighting to EXPLAIN/EXPLAIN ANALYZE output.

Colors applied:

  • Node types (Seq Scan, Hash Join, etc.) — bold cyan
  • Actual time — green (<10ms), yellow (10-100ms), red (>100ms)
  • Filter/Index Cond lines — yellow (potential optimization targets)
  • Planning Time / Execution Time — bold

Coloring disabled with --no-highlight or when stdout is not a TTY.
Inspired by Pygments PostgreSQL EXPLAIN lexer token patterns.

@NikolayS
Copy link
Copy Markdown
Owner Author

REV Review — fix/745-explain-viz

APPROVE

Clean implementation, correct logic, no new dependencies.

What's good

  • highlight_line correctly separates node type, actual time, filter, and timing coloring into distinct passes
  • extract_actual_time correctly parses actual time=X..Y and returns the higher (Y) value for thresholding
  • no_color=true fast-paths correctly
  • #[allow(dead_code)] already present in mod.rs so no warnings introduced
  • 6 tests cover all color paths

One gap: plain EXPLAIN queries not highlighted

The highlighting is wired into /explain and /optimize paths only. When a user types EXPLAIN ANALYZE SELECT ... directly, the output goes through the regular query result printer and remains plain text. This is a follow-up item — file as a new issue, don't block this PR on it.

Minor technical note

extract_actual_time is called on result after node-type ANSI codes have already been inserted. Since actual time= appears in the parenthesized stats section (after the node name), and node highlighting only wraps the node name token, the positions don't collide in practice. Safe as-is; a brief comment would clarify intent.

No SOC2 items.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 25, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 91.66667% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.28%. Comparing base (02fe327) to head (081d8d1).

Files with missing lines Patch % Lines
src/explain/highlight.rs 95.33% 7 Missing ⚠️
src/repl/ai_commands.rs 0.00% 6 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #749      +/-   ##
==========================================
+ Coverage   69.17%   69.28%   +0.11%     
==========================================
  Files          47       48       +1     
  Lines       32062    32216     +154     
==========================================
+ Hits        22177    22320     +143     
- Misses       9885     9896      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@NikolayS
Copy link
Copy Markdown
Owner Author

Code Review: PR #749 — feat(explain): syntax highlighting for EXPLAIN output

Summary

Solid first pass at EXPLAIN highlighting. The structure is clean, the no_color escape hatch is correct, and the unit tests cover the happy paths. Several real bugs need attention before merge.


BLOCKING

B1 — Double \x1b[0m reset in time highlight (confidence: 9)

File: src/explain/highlight.rs, time-highlight block

result = format!(
    "{}{}{}{}\x1b[0m{}",
    &result[..pos], color, segment, "\x1b[0m", &result[end..]
);

"\x1b[0m" appears twice — once as the 4th format argument and once hardcoded as the 5th literal in the format string. The duplicate reset is harmless but it masks a deeper issue: if the node-type highlight ran first, it injected \x1b[1;36m…\x1b[0m into result. The time-highlight then slices result at byte offsets pos/end. Those offsets now include the injected ANSI sequences, so segment may contain a stray reset (\x1b[0m) mid-string, and end (found via find(')')) may find a ) inside the node escape rather than the closing ) of actual time=(…). This can produce broken escape sequences in the output.

Fix: Run extract_actual_time against the original line (before any ANSI injection), compute offsets on the original, then reconstruct from the ANSI-modified result using those same offsets — or track separately what has been injected.


B2 — Node-type list order creates false positives (confidence: 8)

File: src/explain/highlight.rs, node_types array

"Hash" appears at index 8, before "Hash Join". Because matching uses result.contains(node) with a linear scan and break on first match, a line like:

->  Hash Join  (cost=…)

will match "Hash Join" only if it happens to appear before "Hash" in the array. Currently "Hash Join" is at index 5 and "Hash" at index 8, so it works today — but the ordering is fragile and undocumented. A future maintainer adding a node type in alphabetical order will silently introduce a regression.

Fix: Sort longest-match-first explicitly (comment why), or use a proper prefix/word-boundary match instead of contains.


B3 — extract_actual_time can index into a multi-byte UTF-8 boundary (confidence: 7)

File: src/explain/highlight.rs, extract_actual_time

let after_dots = &after[dotdot + 2..];

".." is 2 ASCII bytes, so dotdot + 2 is safe here. However the subsequent slice:

after_dots[..end].parse().ok()

where end is found by .find(|c: char| …)str::find with a char predicate returns a byte offset, so this is correct. No bug here per se, but the earlier time-highlight block does:

let end = result[pos..]
    .find(')')
    .map_or(result.len(), |i| pos + i + 1);
let segment = result[pos..end].to_owned();

After ANSI sequences have been injected into result (by the node-type pass), result[pos..] may start in the middle of a multi-byte sequence if a node name was replaced and its ANSI prefix shifted byte offsets. Rust will panic at runtime with byte index N is not a char boundary. This is a real crash path when a line contains both a node type and actual time=.

Fix: Same as B1 — separate the "offset calculation on raw line" from "injection into ANSI-modified string".


NON-BLOCKING

N1 — node_types array allocated on every call (confidence: 6)

highlight_line is called once per line of every EXPLAIN output. The node_types array is declared as a local [&str; 21] on each invocation. For a 500-node plan this is 500 stack allocations of 21 pointers. Make it a const or static at module level.

const NODE_TYPES: &[&str] = &[
    "Hash Join", "Merge Join", /* longest first */ "Hash",];

N2 — highlight_explain drops trailing newline (confidence: 7)

plan.lines()
    .map(highlight_line)
    .collect::<Vec<_>>()
    .join("\n")

str::lines() strips the final newline if present. join("\n") does not add one back. If plan_text ends with \n (standard for terminal output), display_plan will not. Depending on how println! is used this may be invisible, but it's a semantic difference that could matter if the string is passed further.

Fix: Check plan.ends_with('\n') and re-append if so, or use lines_with_terminator (nightly) / a manual split.


N3 — Filter/Recheck yellow conflicts with time yellow (confidence: 5)

A Filter: line that also contains actual time= (possible in some EXPLAIN ANALYZE formats) will get double-colored: yellow from the time block, then the entire line wrapped in yellow again by the Filter block. The last wrap wins visually, making the time color invisible. Probably not a real-world issue today but worth a comment.


N4 — No test for the crash path in B3 (confidence: 6)

There is no test for a line that contains both a node type AND actual time=, e.g.:

->  Seq Scan on orders  (cost=0.00..431.00 rows=1000 width=8) (actual time=0.042..1.823 rows=1000 loops=1)

This is the most common EXPLAIN ANALYZE line format and it triggers both highlight passes. Adding this test case would have caught B1/B3.


N5 — settings.no_highlight vs no_color naming inconsistency (confidence: 4)

The public API uses no_color: bool but the call site passes settings.no_highlight. These mean the same thing here, but the asymmetric naming adds cognitive friction. Either rename the parameter to no_highlight to match the settings field, or rename the field to no_color for consistency with the ANSI ecosystem convention (NO_COLOR env var). Pick one.


POTENTIAL

P1 — No NO_COLOR env var support (confidence: 4)

The de-facto standard (no-color.org) is to check std::env::var("NO_COLOR").is_ok() and disable color regardless of flags. Worth noting, not blocking.


P2 — ANSI codes break grep / pipe workflows (confidence: 4)

When stdout is a pipe (not a TTY), emitting ANSI codes makes the output harder to parse downstream. Consider checking atty::is(atty::Stream::Stdout) or equivalent and auto-disabling color. Many terminals tools do this. Not a bug, but a usability gap.


Test Coverage Gap Summary

Scenario Covered
Seq Scan line yes
Slow time (>=100ms) yes
Fast time (<10ms) yes
Filter: line yes
no_color=true yes
Execution Time bold yes
Line with node type + actual time= no (crashes, B3)
Hash vs Hash Join ambiguity no
Plan ending with newline no

Bottom Line

B1 and B3 are the same root bug — ANSI injection in pass 1 corrupts byte offsets used in pass 2. Fix that first; everything else is cleanup. The feature direction is right and the code is readable. Would approve after B1–B3 are resolved.

@NikolayS
Copy link
Copy Markdown
Owner Author

Fixed REV blocking findings:

  • B1: time highlight now uses original line for offset calculation — no ANSI code corruption
  • B1: removed duplicate \x1b[0m reset (was segment\x1b[0m\x1b[0m, now segment\x1b[0m)
  • B2: node_types reordered with comment explaining specificity rule; added Hash Join test + double-reset regression test
  • N1: node_types moved to static NODE_TYPES at module level

All tests pass (verified via standalone rustc compilation — full cargo build is too heavy for this env).

@NikolayS
Copy link
Copy Markdown
Owner Author

REV Code Review Report

CI Status: ❌ FAILED (view run)
Failing job: Checks / Lint — rustfmt format check
Tests: ✅ passed | CodeQL: ✅ passed


BLOCKING ISSUES (1)

CRITICAL src/explain/highlight.rs:179,193rustfmt format violations causing CI failure

Two assert! one-liners exceed line length and must be expanded to multi-line form per rustfmt rules. CI shows exactly the expected diff:

// Line 179 — current (rejected by rustfmt):
assert!(result.contains("\x1b[1;36m"), "Hash Join should be highlighted");

// Must become:
assert!(
    result.contains("\x1b[1;36m"),
    "Hash Join should be highlighted"
);
// Line 193 — current (rejected by rustfmt):
assert!(result.contains("\x1b[1;36m"), "node type should be cyan bold");

// Must become:
assert!(
    result.contains("\x1b[1;36m"),
    "node type should be cyan bold"
);

Fix: Run cargo fmt locally before pushing. Both diffs are already in the CI log — this is a mechanical fix.


NON-BLOCKING (2)

INFO src/explain/highlight.rs:17PostgreSQL should be Postgres per writing rules

The module doc comment says `PostgreSQL` EXPLAIN output. Per postgres-ai/rules writing__terminology-consistency, use "Postgres" in prose/comments, not "PostgreSQL" (except when technically required, e.g., SQL syntax, binary names).
Suggestion: Change to Postgres EXPLAIN output.

LOW src/explain/highlight.rs:27 — Incomplete NODE_TYPES list; several common plan nodes missing

The static list has 21 entries and covers the most common nodes, but misses several that appear in real-world plans and would be useful to highlight:

  • Memoize (PG 14+, very common in parameterized nested loops)
  • Incremental Sort (PG 13+)
  • Merge Append
  • Result
  • Values Scan
  • Function Scan
  • WindowAgg
  • Recursive Union
  • BitmapAnd, BitmapOr

Not blocking — the comment says ordering matters more than completeness, and the current list is correct for what it includes. But worth a follow-up issue to fill gaps.
Suggestion: Open a follow-up issue to expand NODE_TYPES with the missing entries.


POTENTIAL ISSUES (1)

MEDIUM src/explain/highlight.rs:86-99 — Pass 3 (Filter/Index Cond) wraps result which may already contain ANSI from Pass 1 or 2 — could produce visually unexpected nesting (confidence: 5/10)

If a line is both a node type line AND starts with Filter: (unusual but possible in TEXT format edge cases), Pass 3 wraps the already-colored result in yellow, potentially nesting ANSI codes: \x1b[33m...\x1b[1;36mSeq Scan\x1b[0m...\x1b[0m. Terminals handle this fine, but it's worth being aware of. The same applies to Pass 4 for Planning/Execution Time — though those lines will never contain a node type in practice.

This is likely not a real issue for the target input (standard EXPLAIN TEXT output), but the multi-pass architecture's interaction could be made more explicit in a comment.
Suggestion: Add a brief comment in Pass 3/4 noting the wrapping is intentional even when prior passes have already colored the line.


Summary

Area Findings Potential Filtered
CI/Pipeline 1 0 0
Security 0 0 0
Bugs 0 0 0
Tests 0 0 2
Guidelines 1 0 1
Docs 0 1 0
Metadata 0 0 0

Note:

  • Findings: High-confidence issues (8-10/10)
  • Potential: Medium-confidence issues (4-7/10) — review manually
  • Filtered: Low-confidence issues excluded as likely false positives

REV-assisted review (AI analysis by postgres-ai/rev)

@NikolayS
Copy link
Copy Markdown
Owner Author

REV Code Review — PR #749

PR: #749 — feat(explain): syntax highlighting for EXPLAIN output
Branch: fix/745-explain-viz
AI-Assisted: Yes


BLOCKING ISSUES

None.


NON-BLOCKING

LOW src/explain/highlight.rs:77highlight_line mutates result across passes using byte offsets from original line

Pass 1 inserts ANSI codes into result, then Pass 2 searches result by text segment found in line. The result.replace(segment, ...) call will correctly find the literal text segment in result because the segment is a substring of the original line that has not been touched by Pass 1 (they target different parts of the line). However, if a node type name happened to appear verbatim inside the actual time=... segment (extremely unlikely in practice), Pass 2's replace would corrupt it. The existing comment explains the design. No action required unless an issue is observed.

LOW src/explain/highlight.rs:56break after first node match prevents detecting multiple node types per line

psql's \e+ highlighting also only colors the first node type per line, so this is consistent behavior. Mention this intentional limitation in the comment for future maintainers.


POTENTIAL ISSUES

MEDIUM src/explain/highlight.rs:98–103extract_actual_time returns the second (max) time but the description says "actual time" (confidence: 6/10)

actual time=X..Y — X is startup time, Y is total time. The code returns Y (correct for identifying slow nodes). The function doc should say "returns the total (max) actual time" rather than "actual time value" to make the intent explicit.

MEDIUM src/repl/ai_commands.rs:1600–1601settings.no_highlight field referenced but not visible in diff (confidence: 5/10)

If no_highlight was not previously on Settings, this will fail to compile (CI would catch it). Confirm the field exists; if newly added it should appear in the diff.


Summary

Area Findings Potential Filtered
Security 0 0 0
Bugs 0 1 0
Tests 0 0 0
Guidelines 0 1 0
Docs 0 1 0

Tests are solid: 8 focused unit tests covering the core paths (happy path, slow/fast/moderate timing, filter highlighting, no_color passthrough, Hash Join vs Hash disambiguation, double-reset regression). CI is green.


REV-assisted review (AI analysis by postgres-ai/rev)

@NikolayS
Copy link
Copy Markdown
Owner Author

REV

PR #749 — feat(explain): syntax highlighting for EXPLAIN output

Summary: New src/explain/highlight.rs module. Applies ANSI colors to node types (cyan bold), actual-time ranges (green/yellow/red by magnitude), Filter/Cond lines (yellow), and Planning/Execution Time lines (bold). Wired into handle_ai_explain and handle_ai_optimize via settings.no_highlight.

Observations:

  1. Two-pass design is correct but comment-heavy — the comments explaining byte-offset corruption from ANSI codes in Pass 1 are accurate and worth keeping. The code is right: extract from line (original), replace in result. No issue, just noting it's well-thought-out.

  2. NODE_TYPES ordering comment is critical — the array ordering requirement (longer before shorter) is the one place a future contributor could silently break things. Consider a compile-time assert or at minimum the existing comment is good. Fine as-is.

  3. Double-reset test — The no_double_reset_on_node_with_time test is a good regression guard. However: the assertion !result.contains("\x1b[0m\x1b[0m") might produce false positives if the node-type reset and the time-segment open are adjacent but separated. Worth confirming the test actually catches the bug it's guarding against (i.e. write a test that fails without the fix).

  4. no_highlight flag source — this PR adds the call site but settings.no_highlight presumably already exists. Fine, but worth confirming it's wired to --no-color/NO_COLOR env in addition to any config file.

  5. UTF-8 / wide chars — byte-based replace() on a String is char-safe in Rust. No issue.

  6. 100ms / 10ms thresholds — reasonable defaults. Could be user-configurable later; noting for backlog.

Verdict: Clean implementation, good test coverage. No blockers.

@NikolayS
Copy link
Copy Markdown
Owner Author

REV

#749 — feat(explain): syntax highlighting for EXPLAIN output

Good work. This makes EXPLAIN output actually readable at a glance. Implementation is clean.

What's good

  • Ordering comment on NODE_TYPES is excellent — explains exactly why longest-match-first matters and warns against reordering. This is the kind of comment that saves the next contributor 30 minutes of debugging.
  • The 3-pass approach (node type → actual time → filter lines → timing lines) is clear and predictable
  • Using original line for byte-offset extraction in Pass 2 is the right call — avoids ANSI byte-shift corruption
  • Test coverage is solid: slow/fast/filter/no-color/bold/Hash-Join-not-just-Hash, plus the double-reset regression test

Concerns

  1. Pass 2 doesn't handle actual rows=-1 — EXPLAIN ANALYZE on a not-yet-executed plan (e.g., EXPLAIN (ANALYZE false)) can produce actual time=0.000..0.000 with actual rows=0 loops=0. The highlighting will color these green, which is correct but potentially misleading. Not blocking, just worth knowing.

  2. Single match per line (the break) — a line can technically contain two node types (unusual but possible in EXPLAIN VERBOSE edge cases). The comment explains this is intentional. Fine.

  3. Pass 3 wraps the entire lineresult = format!("\x1b[33m{result}\x1b[0m") wraps the full line including any ANSI from Pass 1. This means a Filter line that also has actual time data will have its time coloring nested inside the yellow wrapper, which may render incorrectly on some terminals (nested ANSI reset issues). Worth testing with a Filter line that also has actual time data.

  4. highlight_explain joins with "\n" — if the input plan ends with \n, the output won't (the final empty line is dropped). This may cause the cursor to appear on the same line as the next prompt. Consider if plan.ends_with('\n') { result.push('\n'); }.

  5. No test for Parallel Seq Scan — given the ordering sensitivity, a test that Parallel Seq Scan is highlighted as itself (not as Seq Scan) would be reassuring.

Minor

  • The no_double_reset_on_node_with_time test is a good regression guard. The assertion checks for absence of \x1b[0m\x1b[0m — make sure that's actually what the old buggy behavior produced, otherwise this test proves nothing. If the bug was different, the assertion won't catch a regression.

Verdict: approve. Issues 3 and 4 are worth addressing before this gets heavy usage.

@NikolayS
Copy link
Copy Markdown
Owner Author

REV Code Review — PR #749: feat(explain): syntax highlighting for EXPLAIN output

CI: green | Diff scope: src/explain/highlight.rs (new), src/explain/mod.rs, src/repl/ai_commands.rs


Blocking findings

None.


Non-blocking findings

[BUG / MEDIUM] Double-reset ANSI code on lines with both a slow node and actual time

Pass 1 wraps the node name: \x1b[1;36m{node}\x1b[0m. Pass 2 then replaces the actual time=... segment — but it searches for segment in result (already modified by Pass 1). If the node string does not overlap with the actual time= segment, this is fine. However, if the replacement in Pass 2 produces \x1b[31m...\x1b[0m and that segment is immediately adjacent to the Pass 1 reset, you get \x1b[0m\x1b[1;36m... or similar sequences that some terminals render oddly. The existing test no_double_reset_on_node_with_time catches the adjacent \x1b[0m\x1b[0m case, but the test comment says "double reset" while the actual check is for adjacent resets — which would only occur if Pass 2 appended a reset immediately after Pass 1's reset. Consider making the test assert the exact highlighted output for a known input rather than just checking for absence of \x1b[0m\x1b[0m.

[BEHAVIOR / LOW] Pass 3 (Filter/Index Cond/Recheck Cond) wraps the entire already-ANSI-modified result line with yellow

When a line is both a slow node and a Filter line (unusual but theoretically possible if the explain output format changes), Pass 3 wraps result wholesale: \x1b[33m{result}\x1b[0m. This means the cyan/red codes from passes 1-2 are embedded inside the yellow wrapper, which causes the yellow to be overridden mid-line. The result is that only the trailing reset has effect. In practice this is unlikely (Filter lines are not node lines), but the logic is fragile. Consider short-circuiting: if Pass 3 matches, skip passes 1 and 2.

[STYLE / LOW] NODE_TYPES comment block says "longer/more-specific names must come before shorter prefixes" but the ordering rationale is incomplete

"Parallel Seq Scan" appears before "Seq Scan" — correct. But "Hash" appears after "Hash Join" and "Gather" after "Gather Merge" — also correct. The comment is accurate but would benefit from noting that the break on first match is what makes ordering critical: without break, all matching patterns would be applied and the ordering would not matter.

[DOCS / LOW] highlight_explain says "Preserves all text and indentation — only wraps tokens with ANSI codes"

This is accurate for Pass 1, but Passes 3 and 4 wrap the entire line including indentation with ANSI codes. The doc comment should say "only wraps matching tokens or lines" to be precise.


Potential issues (low confidence)

[BEHAVIOR] extract_actual_time returns the "outer" time value (Y in X..Y) which is correct for latency, but the color threshold of 100ms is applied to the outer bound

The comment says "return Y" — this is the actual max time. For plans with many loops, actual time is per-loop. A node with actual time=0.001..0.002 loops=10000 has 0.002ms per loop but 20 seconds total. The highlight will show green (fast) when the node is actually the hotspot. This is a known limitation of the per-loop time display in EXPLAIN ANALYZE text format; worth noting in a comment.


Summary

Clean, well-structured implementation. The NODE_TYPES ordering with break is the right approach. Test coverage is good. The main concern is the multi-pass ANSI interaction on edge-case lines (slow node + Filter line), which is unlikely in practice but fragile by construction. None of the findings block merge.

@NikolayS
Copy link
Copy Markdown
Owner Author

CI is green and REV review found no blocking findings. This PR is ready for merge — awaiting Nik's approval.

@NikolayS
Copy link
Copy Markdown
Owner Author

REV

PR #749 — feat(explain): syntax highlighting for EXPLAIN output

Solid implementation. The multi-pass approach with longest-match-first node list is the right call — that comment in the source about ordering is important and should stay.

Observations:

  1. no_double_reset_on_node_with_time test — this is exactly the kind of regression test you want here. The double-reset bug (\x1b[0m\x1b[0m) is subtle and easy to reintroduce. Good.

  2. Pass 2 uses original line for offset, result for replacement — the comment explains why (ANSI codes shift byte offsets). This is correct but fragile: if the segment text appears more than once in the line, result.replace(segment, ...)' will replace all occurrences. In practice EXPLAIN lines won't have duplicate actual time=... segments, but a targeted replace (first occurrence only) would be safer. Rust's replacen covers this.

  3. highlight_line breaks after first node match — correct behavior, EXPLAIN lines have at most one node type. The comment could make this clearer: "each line has at most one plan node type".

  4. No test for Parallel Seq Scan vs Seq Scan ordering — the NODE_TYPES ordering comment warns about this specifically. A test that passes a "Parallel Seq Scan" line and asserts it gets highlighted as "Parallel Seq Scan" (not as "Seq Scan") would directly verify the ordering invariant.

  5. no_color flag respects settings.no_highlight — integration point in ai_commands.rs looks correct.

Nothing blocking. The replacen suggestion (#2) and the ordering test (#4) are both worth addressing.

@NikolayS
Copy link
Copy Markdown
Owner Author

Testing evidence — highlight_explain against real Postgres data

Setup: Postgres 127.0.0.1:15433, db ashtest, workload table (100K rows, id bigint PK, val text). Build: cargo build on fix/745-explain-viz succeeded cleanly.


How it was tested

The highlight_explain function was extracted and compiled standalone (no full rpg build required), then fed actual EXPLAIN ANALYZE output captured live from the test DB. ANSI codes verified by running through cat -v.


Query 1 — point lookup by PK

EXPLAIN ANALYZE SELECT * FROM workload WHERE id = 42;

Raw plan from Postgres:

 Index Scan using workload_pkey on workload  (cost=0.29..8.31 rows=1 width=41) (actual time=13.132..13.134 rows=1 loops=1)
   Index Cond: (id = 42)
 Planning Time: 0.515 ms
 Execution Time: 13.177 ms

After highlight_explain:

  • Index Scanbold cyan (\x1b[1;36m)
  • actual time=13.132..13.134 rows=1 loops=1)yellow (\x1b[33m) — 13.1ms is ≥10ms, <100ms threshold
  • Index Cond: (id = 42)yellow (filter line)
  • Planning Time: / Execution Time:bold (\x1b[1m)

Query 2 — LIKE full scan

EXPLAIN ANALYZE SELECT * FROM workload WHERE val LIKE '%test%';

Raw plan:

 Seq Scan on workload  (cost=0.00..2501.00 rows=10 width=41) (actual time=81.949..81.950 rows=0 loops=1)
   Filter: (val ~~ '%test%'::text)
   Rows Removed by Filter: 100000
 Planning Time: 0.345 ms
 Execution Time: 81.982 ms

After highlighting:

  • Seq Scanbold cyan
  • actual time=81.949..81.950 ...yellow (81.9ms — moderate, <100ms)
  • Filter: (val ~~ '%test%'::text)yellow (filter line — optimization target)

Query 3 — self-join with aggregation, ORDER BY + LIMIT

EXPLAIN ANALYZE
  SELECT w1.val, count(*)
  FROM workload w1
  JOIN workload w2 ON w1.id = w2.id
  WHERE w1.id < 1000
  GROUP BY w1.val
  ORDER BY count(*) DESC
  LIMIT 10;

Raw plan:

 Limit  (cost=1626.58..1626.61 rows=10 width=41) (actual time=23.581..23.585 rows=10 loops=1)
   ->  Sort  (cost=1626.58..1629.07 rows=995 width=41) (actual time=23.579..23.581 rows=10 loops=1)
         Sort Key: (count(*)) DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  HashAggregate  (cost=1595.13..1605.08 rows=995 width=41) (actual time=23.229..23.421 rows=999 loops=1)
               Group Key: w1.val
               ->  Nested Loop  (cost=0.58..1590.16 rows=995 width=33) (actual time=0.034..22.841 rows=999 loops=1)
                     ->  Index Scan using workload_pkey on workload w1  (cost=0.29..45.70 rows=995 width=41) (actual time=0.016..21.713 rows=999 loops=1)
                           Index Cond: (id < 1000)
                     ->  Index Only Scan using workload_pkey on workload w2  (cost=0.29..1.55 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=999)
                           Index Cond: (id = w1.id)
                           Heap Fetches: 107
 Planning Time: 16.179 ms
 Execution Time: 23.733 ms

After highlighting:

  • Limit, Sort, Nested Loop, Index Scan, Index Only Scan → all bold cyan
  • HashAggregateAggregate substring matches → highlighted cyan (see note below)
  • All actual time= segments → yellow (10–100ms range for this query)
  • Index Only Scan correctly matched before Index Scan (longest-match-first ordering works)
  • Both Index Cond: lines → yellow
  • Planning Time: / Execution Time:bold

ANSI code verification (raw cat -v output, query 3)

 ^[[1;36mLimit^[[0m  ... (^[[33mactual time=23.581..23.585 rows=10 loops=1)^[[0m
   ->  ^[[1;36mSort^[[0m  ... (^[[33mactual time=23.579..23.581 rows=10 loops=1)^[[0m
         ->  Hash^[[1;36mAggregate^[[0m  ... (^[[33mactual time=23.229..23.421 rows=999 loops=1)^[[0m
               ->  ^[[1;36mNested Loop^[[0m  ... (^[[33mactual time=0.034..22.841 rows=999 loops=1)^[[0m
                     ->  ^[[1;36mIndex Scan^[[0m using workload_pkey on workload w1  ... (^[[33mactual time=0.016..21.713 rows=999 loops=1)^[[0m
^[[33m                           Index Cond: (id < 1000)^[[0m
                     ->  ^[[1;36mIndex Only Scan^[[0m using workload_pkey on workload w2  ... (^[[32mactual time=0.001..0.001 rows=1 loops=999)^[[0m
^[[33m                           Index Cond: (id = w1.id)^[[0m
^[[1m Planning Time: 16.179 ms^[[0m
^[[1m Execution Time: 23.733 ms^[[0m

Color codes confirmed: ^[[1;36m = bold cyan, ^[[32m = green, ^[[33m = yellow, ^[[31m = red, ^[[1m = bold.


Unit test results

All 13 unit tests pass (run standalone against the same logic):

[PASS] Seq Scan cyan
[PASS] Index Scan cyan
[PASS] Hash Join cyan
[PASS] Fast time green
[PASS] Moderate time yellow
[PASS] Slow time red
[PASS] Filter yellow
[PASS] Index Cond yellow
[PASS] Execution Time bold
[PASS] Planning Time bold
[PASS] no_color passthrough
[PASS] Hash Join not just Hash
[PASS] no double reset

One observation

HashAggregate (real Postgres node name) matches Aggregate as a substring — Hash prefix remains uncolored. This is consistent with what the REV review noted about plain EXPLAIN not being wired up yet. The NODE_TYPES list covers the canonical node names from pg_explain output; HashAggregate is a variant name that could be added as a follow-up (HashAggregate, MixedAggregate). Not a blocker for this PR.

Overall: highlighting works correctly on real plan output from 100K-row workload table.

@NikolayS
Copy link
Copy Markdown
Owner Author

fix(explain): byte-offset panic fix — testing evidence

What was fixed

Refactored highlight_line from multi-pass string mutation to span-collection + single-pass reconstruction:

  1. Old approach: Pass 1 injects ANSI codes into result, Pass 2 computes byte offsets against result (or uses .replace() on already-mutated text). Fragile — any overlap or shift in ANSI injection can cause panics or incorrect output.

  2. New approach: All byte offsets are computed against the original line text. Spans are collected into a Vec<Span>, sorted by start offset, then applied in a single left-to-right reconstruction pass. No string is ever mutated between offset calculations.

Also moved Filter/Index Cond/Planning Time whole-line wraps to early returns (they never coexist with node-type spans, so no need to participate in span collection).

Tests

All 1819 tests pass (10 in explain::highlight, 1809 elsewhere):

running 10 tests
test explain::highlight::tests::execution_time_is_bold ... ok
test explain::highlight::tests::fast_actual_time_is_green ... ok
test explain::highlight::tests::filter_line_is_yellow ... ok
test explain::highlight::tests::hash_join_not_just_hash ... ok
test explain::highlight::tests::no_color_flag_returns_unchanged ... ok
test explain::highlight::tests::no_double_reset_on_node_with_time ... ok
test explain::highlight::tests::node_type_and_actual_time_no_panic ... ok  ← regression test for #745
test explain::highlight::tests::seq_scan_highlighted ... ok
test explain::highlight::tests::slow_actual_time_is_red ... ok
test explain::highlight::tests::parallel_seq_scan_not_plain_seq_scan ... ok  ← new

New test cases added

  • node_type_and_actual_time_no_panic — the exact panic trigger from feat(explain): redesign EXPLAIN plan visualization with syntax highlighting #745: Seq Scan line with actual time=. Verifies both ANSI wraps are correct and that stripping ANSI codes recovers the original line verbatim.
  • parallel_seq_scan_not_plain_seq_scan — verifies NODE_TYPES ordering: Parallel Seq Scan matched as full string, not as substring Seq Scan.

Clippy + fmt

cargo clippy --all-targets --all-features -- -D warnings  ✅ clean
cargo fmt -- --check  ✅ clean

Commit

10b0a93fix(explain): fix byte-offset panic when line has both node type and actual time

@NikolayS
Copy link
Copy Markdown
Owner Author

CI is green (latest run: 2026-03-27T05:58 UTC, after byte-offset panic fix commit 10b0a93). REV review found no blocking findings. Testing evidence posted (includes evidence for the panic fix). This PR is ready for merge — awaiting Nik's approval.

@NikolayS NikolayS force-pushed the fix/745-explain-viz branch from 10b0a93 to 081d8d1 Compare March 27, 2026 16:42
@NikolayS NikolayS merged commit 48b4ed2 into main Mar 27, 2026
16 checks passed
@NikolayS NikolayS deleted the fix/745-explain-viz branch March 27, 2026 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(explain): redesign EXPLAIN plan visualization with syntax highlighting

2 participants