Skip to content

Fix inconsistent AST formatting for EXPLAIN AST INSERT with trailing output options#100174

Merged
alexey-milovidov merged 4 commits intomasterfrom
fix-explain-insert-parens
Mar 22, 2026
Merged

Fix inconsistent AST formatting for EXPLAIN AST INSERT with trailing output options#100174
alexey-milovidov merged 4 commits intomasterfrom
fix-explain-insert-parens

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

When EXPLAIN AST INSERT INTO ... SELECT ... FORMAT Null is parsed, the INTO OUTFILE/FORMAT/SETTINGS tokens end up on the EXPLAIN query (not on the INSERT's SELECT), because ParserInsertQuery does not consume them. During formatting, ASTExplainQuery::formatQueryImpl was wrapping the INSERT in parentheses (because INSERT is not an ASTQueryWithOutput), producing EXPLAIN AST (INSERT INTO ...) which cannot be parsed back — (INSERT ...) is not valid syntax.

INSERT queries don't need wrapping because the INSERT parser does not consume FORMAT/SETTINGS/INTO OUTFILE at its boundary, so these tokens remain safely on the EXPLAIN during re-parsing.

Closes #100131

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

…g output options

When `EXPLAIN AST INSERT INTO ... SELECT ... FORMAT Null` is parsed,
the `INTO OUTFILE`/`FORMAT`/`SETTINGS` tokens end up on the EXPLAIN
query (not on the INSERT's SELECT), because `ParserInsertQuery` does
not consume them. During formatting, `ASTExplainQuery::formatQueryImpl`
was wrapping the INSERT in parentheses (because INSERT is not an
`ASTQueryWithOutput`), producing `EXPLAIN AST (INSERT INTO ...)` which
cannot be parsed back — `(INSERT ...)` is not valid syntax.

INSERT queries don't need wrapping because the INSERT parser does not
consume FORMAT/SETTINGS/INTO OUTFILE at its boundary, so these tokens
remain safely on the EXPLAIN during re-parsing.

Closes #100131

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 20, 2026

Workflow [PR], commit [1655b5e]

Summary:


AI Review

Summary

This PR fixes EXPLAIN AST INSERT ... FORMAT ... re-formatting by avoiding invalid parenthesization of INSERT in ASTExplainQuery::formatQueryImpl, and adds a focused stateless regression test for the reported case. I did not find correctness, safety, concurrency, or performance issues in the final patch.

ClickHouse Rules

Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time

Final Verdict

  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the pr-ci label Mar 20, 2026
-- on the EXPLAIN level formats correctly without wrapping INSERT in unparseable parens.
-- https://github.com/ClickHouse/ClickHouse/issues/100131

SELECT formatQuerySingleLine('EXPLAIN AST INSERT INTO t1 SELECT 1 FORMAT Null');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This test covers the FORMAT case, but the change description and code comment also claim safety for trailing SETTINGS and INTO OUTFILE on EXPLAIN.

Please add at least one formatQuerySingleLine case for ... SETTINGS ... and one for ... INTO OUTFILE ... so we lock in the full intended behavior and avoid partial regression coverage.

@alexey-milovidov alexey-milovidov mentioned this pull request Mar 20, 2026
1 task
@@ -1,5 +1,6 @@
#pragma once

#include <Parsers/ASTInsertQuery.h>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This adds #include <Parsers/ASTInsertQuery.h> to a header and uses dynamic_cast only to detect INSERT.

Because ASTInsertQuery.h pulls heavier transitive dependencies (StorageID, buffers, etc.), this can increase rebuild cost across all TUs including ASTExplainQuery.h.

Please prefer a lightweight check that avoids including ASTInsertQuery.h in this header, e.g. branch on query->getQueryKind() == QueryKind::Insert (and QueryKind::AsyncInsertFlush if relevant), then keep ASTExplainQuery.h free of heavy includes.

alexey-milovidov and others added 2 commits March 21, 2026 19:09
…RT detection, add test coverage for `INTO OUTFILE` and `SETTINGS`

Replace `dynamic_cast<const ASTInsertQuery *>` with `getQueryKind` checks
for `QueryKind::Insert` and `QueryKind::AsyncInsertFlush`, removing the
heavyweight `#include <Parsers/ASTInsertQuery.h>` from `ASTExplainQuery.h`.

Add `formatQuerySingleLine` test cases for `EXPLAIN AST INSERT ... INTO
OUTFILE` and `EXPLAIN AST INSERT ... SETTINGS` to cover all trailing
output option variants.

#100174 (comment)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…XPLAIN AST INSERT`

The INSERT parser consumes `SETTINGS` (and pushes it down to the inner SELECT),
and `INTO OUTFILE` formatting escapes quotes differently. These test cases were
testing pre-existing behavior unrelated to the original fix for `FORMAT`.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 21, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 83.80% 83.90% +0.10%
Functions 24.70% 24.70% +0.00%
Branches 76.50% 76.50% +0.00%

PR changed lines: PR changed-lines coverage: 100.00% (7/7, 0 noise lines excluded)
Diff coverage report
Uncovered code

@alexey-milovidov alexey-milovidov self-assigned this Mar 22, 2026
@alexey-milovidov alexey-milovidov merged commit d370916 into master Mar 22, 2026
298 of 300 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-explain-insert-parens branch March 22, 2026 10:17
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 22, 2026
fm4v added a commit that referenced this pull request Mar 29, 2026
…overage [iter 8]

The dynamic score floor `effective_min = min(MAX_EFFECTIVE_MIN, top_score/1000)`
was previously capped at 1e-4.  This filtered out tests that cover changed `.h`
files via header expansion — specifically tests that cover UNCHANGED lines in
the same header (non-overlapping regions, assigned SIBLING_DIR_WIDTH=10000).

For such tests:  score = PASS_WEIGHT_DIRECT / (SIBLING_DIR_WIDTH × rc)
                       = 1.0 / (10000 × rc)

At rc=7:  score = 1.4e-5  (above 1e-5, below old 1e-4 → was filtered!)
At rc=17: score = 5.9e-6  (still filtered, but adds to the total width_score)

Example: PR #100174 changes `ASTExplainQuery.h` at line 130 (hunk 127–132).
The file has no CIDB coverage at the changed lines, but 20 tests cover the file
at lines 35 (rc=7), 125 (rc=17), 119 (rc=17) etc.  Their total width_score
≈ 2.6e-5 was just below the old 1e-4 floor and got silently filtered.
With MAX_EFFECTIVE_MIN=1e-5, these tests pass the filter.

The old cap comment said "top_score above 1/10 treated as 1/10"; now it reads
"top_score above 1/100 treated as 1/100".  This is still a conservative cap:
even at top_score=1.0 (rc=1 direct hit), effective_min is only 1e-5, which is
100× above MIN_SCORE=1e-8 and safely filters out true noise.

Evaluation: 70.7% → 74.7% recall (+4.0pp) on 20-PR ground-truth set.
PR 100174 (ASTExplainQuery.h, 20 tests): 25% → 80%.
No regressions across all 20 eval PRs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Logical error: 'Inconsistent AST formatting: the query: (STID: 1941-1bfa)

2 participants