Skip to content

Commit 97e05b8

Browse files
Merge branch 'master' into fix-mutation-total-milliseconds
2 parents 720c5a7 + 831b80b commit 97e05b8

File tree

123 files changed

+2117
-898
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

123 files changed

+2117
-898
lines changed

.claude/CLAUDE.md

Lines changed: 23 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -8,23 +8,33 @@ When writing text such as documentation, comments, or commit messages, write nam
88

99
When mentioning logical errors, say "exception" instead of "crash", because they don't crash the server in the release build.
1010

11-
Links to ClickHouse CI, such as `https://s3.amazonaws.com/clickhouse-test-reports/json.html?...` should be analyzed using the tool at `.claude/tools/fetch_ci_report.js`, which directly fetches the underlying JSON data without requiring a browser:
11+
Links to ClickHouse CI should be analyzed using the tool at `.claude/tools/fetch_ci_report.js`, which directly fetches the underlying JSON data without requiring a browser. It accepts GitHub PR URLs (fetches all CI reports) or direct S3/CI HTML URLs.
1212

1313
```bash
14-
# Fetch and analyze CI report
15-
node /path/to/ClickHouse/.claude/tools/fetch_ci_report.js "<ci-url>" [options]
14+
# Fetch all CI reports for a PR
15+
node .claude/tools/fetch_ci_report.js "https://github.com/ClickHouse/ClickHouse/pull/12345"
16+
17+
# Show only failed tests with CIDB links
18+
node .claude/tools/fetch_ci_report.js "https://github.com/ClickHouse/ClickHouse/pull/12345" --failed --cidb
19+
20+
# Fetch only a specific report from a PR (by index)
21+
node .claude/tools/fetch_ci_report.js "https://github.com/ClickHouse/ClickHouse/pull/12345" --report 2
22+
23+
# Filter by test name, show artifact links
24+
node .claude/tools/fetch_ci_report.js "<url>" --test peak_memory --links
25+
26+
# Download logs and show failed tests
27+
node .claude/tools/fetch_ci_report.js "<url>" --failed --download-logs
1628

1729
# Options:
18-
# --test <name> Filter tests by name
19-
# --failed Show only failed tests
20-
# --all Show all test results
21-
# --links Show artifact links (logs.tar.gz, etc.)
22-
# --download-logs Download logs.tar.gz to /tmp/ci_logs.tar.gz
30+
# --test <name> Filter tests by name
31+
# --failed Show only failed tests
32+
# --all Show all test results
33+
# --links Show artifact links (logs.tar.gz, etc.)
34+
# --cidb Show CIDB links for failed tests
35+
# --report <number> For PR URLs: fetch only one specific report
36+
# --download-logs Download logs.tar.gz to /tmp/ci_logs.tar.gz
2337
# --credentials <user,password> HTTP Basic Auth for private repositories
24-
25-
# Examples:
26-
node .claude/tools/fetch_ci_report.js "https://s3.amazonaws.com/..." --failed --links
27-
node .claude/tools/fetch_ci_report.js "https://s3.amazonaws.com/..." --test peak_memory --download-logs
2838
```
2939

3040
After downloading logs, extract specific test logs:
@@ -128,10 +138,7 @@ When writing tests, do not add "no-*" tags (like "no-parallel") unless strictly
128138

129139
When writing tests in tests/queries, prefer adding a new test instead of extending existing ones.
130140

131-
When adding a new test, first run the following command to find the current test with the last prefix index, then increment the resulting index by 1, and use this as the prefix for the new test name:
132-
```
133-
ls tests/queries/0_stateless/[0-9]*.reference | tail -n 1
134-
```
141+
When adding a new test, consult `./tests/queries/0_stateless/add-test` to determine the correct name prefix for the new test.
135142

136143
When writing C++ code, always use Allman-style braces (opening brace on a new line). This is enforced by the style check in CI.
137144

@@ -155,11 +162,3 @@ ARM machines in CI are not slow. They are similar to x86 in performance.
155162

156163
Use `tmp` subdirectory in the current directory for temporary files (logs, downloads, scripts, etc.), do not use `/tmp`. Create the directory if needed.
157164

158-
When there are crucial findings (when I corrected your behavior, you when you found a crucial insight about the code), append them to `.claude/learnings.md`, but be concise. You can commit the changes in learnings.md along with the other changes. Read this file at start.
159-
160-
Always load and apply the following skills:
161-
162-
- .claude/skills/fix-sync
163-
- .claude/skills/alloc-profile
164-
- .claude/skills/bisect
165-
- .claude/skills/create-worktree
Lines changed: 85 additions & 81 deletions
Original file line numberDiff line numberDiff line change
@@ -1,121 +1,125 @@
11
---
22
name: clickhouse-pr-description
3-
description: Generate PR descriptions for ClickHouse/ClickHouse that match maintainer expectations. Strips AI slop automatically. Use when creating or updating PR descriptions.
3+
description: Generate PR descriptions for ClickHouse/ClickHouse that match maintainer expectations. Use when creating or updating PR descriptions.
44
argument-hint: "[PR-number or branch-name]"
55
disable-model-invocation: false
6-
allowed-tools: Task, Bash(gh:*), Bash(git:*), Read, Glob, Grep
6+
allowed-tools: Task, Bash(gh:*), Bash(git:*), Read, Glob, Grep, AskUserQuestion
77
---
88

99
# ClickHouse PR Description Skill
1010

11-
Generate PR descriptions that look like a human wrote them in 30 seconds. ClickHouse maintainers hate AI slop.
11+
Generate a good PR description and apply it, with optional confirmation based on user preference.
1212

13-
## Arguments
13+
## Title
1414

15-
- `$0` (optional): PR number or branch name. If omitted, uses the current branch.
15+
Plain description of what changed. No prefix conventions like `fix():` or `feat():`.
1616

17-
## Obtaining the Diff
17+
- Good: `Change default stderr_reaction to log_last for executable UDFs`
18+
- Good: `Fix exception when inserting NULL into non-nullable column via CAST`
19+
- Bad: `fix(udf): Change default stderr_reaction` — conventional commits, ClickHouse doesn't use them
20+
- Bad: `Fuzzer fixes` — too vague, tells the reviewer nothing
21+
- Bad: `Improve error handling and enrich exit code exceptions with stderr context` — too vague, doesn't say what was actually changed
1822

19-
**If a PR number is given:**
20-
- Fetch the PR info: `gh pr view $0 --json title,body,baseRefName,headRefName,files`
21-
- Get the diff: `gh pr diff $0`
23+
## Body
2224

23-
**If a branch name is given:**
24-
- Get the diff against master: `git diff master...$0`
25+
Read `.github/PULL_REQUEST_TEMPLATE.md` for the exact template structure.
2526

26-
**If nothing is given:**
27-
- Use current branch: `git diff master...HEAD`
28-
- Get commit messages: `git log --oneline master..HEAD`
27+
A good PR description has two parts:
2928

30-
## Title
29+
**1. Free-form context (above the template, optional but encouraged for non-trivial changes)**
3130

32-
Plain description. No prefix conventions.
31+
Explain to a future reader: what was the problem, why did it need fixing, what approach was taken, what were the results or trade-offs. This is the right place for:
32+
- Motivation and background
33+
- Description of the approach and why it was chosen over alternatives
34+
- Benchmark results or measurements
35+
- Links to related issues, discussions, or previous attempts
36+
- Any caveats or known limitations
3337

34-
- Good: `Change default stderr_reaction to log_last for executable UDFs`
35-
- Good: `Fix crash when inserting NULL into non-nullable column`
36-
- Good: `Fuzzer fixes`
37-
- Bad: `fix(udf): Change default stderr_reaction to log_last` -- conventional commits = instant AI tell
38-
- Bad: `Improve error handling and enrich exit code exceptions with stderr context` -- too polished
39-
40-
Keep it under 80 chars. Lowercase after first word is fine.
38+
Use plain paragraphs. Headers like `## Motivation`, `## Changes`, `## Results` are fine and helpful — they make the PR easier to navigate for reviewers. Bullet lists are fine. Be as detailed as needed.
4139

42-
## Body
40+
**2. Template section (mandatory)**
4341

44-
**Only the official template from `.github/PULL_REQUEST_TEMPLATE.md`. Nothing else above or below unless you have a genuine reason.**
42+
Fill in the changelog category (delete the rest of the list), write the changelog entry, and keep the documentation checkbox.
4543

46-
Read `.github/PULL_REQUEST_TEMPLATE.md` for the exact template. Fill in the changelog category (delete the rest), write the changelog entry, and keep the documentation checkbox.
44+
### Example of a well-written body
4745

48-
**If context is needed** (complex bug fix, behavioral change), add 1-3 paragraphs of plain text ABOVE the template. Free-form, no headers, no formatting. Like explaining to a colleague:
46+
From [#96110](https://github.com/ClickHouse/ClickHouse/pull/96110):
4947

5048
```
51-
The previous default `throw` for stderr_reaction caused confusing errors when
52-
UDFs wrote warnings to stderr but exited 0. Changed to `log_last` which matches
53-
what most users expect. The `throw` behavior is still available via config.
54-
5549
### Changelog category (leave one):
56-
- Bug Fix (user-visible misbehavior in an official stable release)
50+
- Backward Incompatible Change
5751
5852
### Changelog entry:
59-
Change default `stderr_reaction` from `throw` to `log_last` for executable UDFs.
60-
Previously, UDFs that wrote warnings to stderr would fail even with exit code 0.
61-
Fixes #XXXXX.
53+
The semantics of the `do_not_merge_across_partitions_select_final` setting were
54+
made more obvious. Previously, the feature could be automatically enabled when
55+
the setting was not explicitly set in the configs. It caused confusion repeatedly
56+
and, unfortunately, led to some issues in production. Now, the rules are simpler:
57+
`do_not_merge_across_partitions_select_final=1` enables the functionality
58+
unconditionally. If `do_not_merge_across_partitions_select_final=0`, then automatic
59+
is used only if the new setting
60+
`enable_automatic_decision_for_merging_across_partitions_for_final=1` and not used
61+
otherwise. To preserve the old behaviour as much as possible, the defaults were set
62+
to `do_not_merge_across_partitions_select_final=0` and
63+
`enable_automatic_decision_for_merging_across_partitions_for_final=1`.
6264
6365
### Documentation entry for user-facing changes
6466
- [ ] Documentation is written (mandatory for new features)
67+
68+
---
69+
70+
The backward-incompatible part is that if someone has
71+
`do_not_merge_across_partitions_select_final=0` explicitly set in the configs,
72+
it no longer protects against the use of automatics. I'm open to discussion on
73+
whether we should conservatively default to disabled automatics.
6574
```
6675

76+
Note: extra context after `---` is fine — reviewers see it, but only the changelog entry above the blank line goes into the CHANGELOG.
77+
6778
## Changelog entry guidelines
6879

69-
- Written for **users**, not developers
70-
- Present tense: "Fix" not "Fixed"
71-
- Say what changed AND why it matters to users
72-
- Code elements in backticks
73-
- Reference issue: `Fixes #XXXXX`
74-
- 1-3 sentences max
75-
76-
## AI Slop Blacklist
77-
78-
These patterns are **never allowed** in ClickHouse PRs:
79-
80-
| Pattern | Why it's slop |
81-
|---------|---------------|
82-
| `fix(scope):` / `feat():` / `chore():` | Conventional commits -- ClickHouse doesn't use them |
83-
| `## Summary` | Not in template, screams AI |
84-
| `## Test plan` / `## Testing` | Tests speak for themselves |
85-
| `## Changes` / `## What changed` | Not in template |
86-
| `## Behavior after this change` | Not a ClickHouse convention |
87-
| Markdown tables comparing behavior | Over-engineered, nobody does this |
88-
| Checkbox lists `- [ ] tested X` | Not a ClickHouse convention |
89-
| `This PR ...` to start the description | AI opener. Just describe the change directly |
90-
| Bullet lists of file-by-file changes | Implementation details don't belong here |
91-
| `## Motivation` as a header | If needed, just write it as plain text |
92-
| Emojis in any form | No |
93-
| `Co-Authored-By: Claude/Cursor/...` in PR descriptions | AI attribution in PR body is noise; commit trailers are a separate policy |
94-
| Perfectly parallel sentence structures | Vary your phrasing |
95-
| Words: "enhance", "streamline", "leverage", "robust", "comprehensive" | Classic AI vocabulary |
96-
97-
## For CI/not-for-changelog PRs
98-
99-
Minimal is best:
80+
The entry is what ends up in the published CHANGELOG. Write it for a user who is upgrading and scanning for what affects them. The PR link and author attribution are appended automatically by tooling — do not include them.
10081

101-
```
102-
### Changelog category (leave one):
103-
- Not for changelog (changelog entry is not required)
82+
**Format:** the changelog script (`tests/ci/changelog.py`) collects all lines after the `### Changelog entry:` header up to the first blank line, then joins them with spaces into a single string. This means:
83+
- The entry can span multiple lines in the template — they become one paragraph
84+
- A blank line terminates collection — anything after it is ignored
85+
- Write it as a single paragraph (no bullet lists, no blank lines within)
10486

105-
### Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
87+
**Tense:** mixed is fine and natural. "Added X", "Fix a case where...", "The `X` setting is now Y" are all real examples from the CHANGELOG. Don't force everything into present tense.
10688

107-
...
89+
**Length:** match the impact. A small new function can be one sentence. A changed default or backward-incompatible behavior warrants as many sentences as needed, including what users must do to adapt.
10890

109-
### Documentation entry for user-facing changes
91+
**Specificity:** always name the exact thing that changed. Never write "Fix a bug" or "Improve performance" without saying what specifically.
11092

111-
- [ ] Documentation is written (mandatory for new features)
112-
```
93+
**For backward-incompatible changes:** always explain the old behavior, the new behavior, and how to restore the old one if possible.
94+
95+
**Real examples from the ClickHouse CHANGELOG:**
96+
97+
One sentence, sufficient for a focused addition:
98+
> Add `xxh3_128` hashing function.
99+
100+
One sentence with enough context:
101+
> `DATE` columns from PostgreSQL are now inferred as `Date32` in ClickHouse (in previous versions they were inferred as `Date`, which led to overflow of values outside a narrow range). Allow inserting `Date32` values back to PostgreSQL.
102+
103+
Full migration instructions for a default change:
104+
> Deduplication is turned ON for all inserts by default. It was OFF before for async inserts and for MV's, but it was ON for sync inserts. The goal is to have the same defaults for both ways of inserts. If you have deduplication explicitly disabled on your cluster, you have to explicitly set `deduplicate_insert='backward_compatible_choice'` to keep the old behavior.
105+
106+
Describes new capability with enough detail to understand when to use it:
107+
> Added `OPTIMIZE <table> DRY RUN PARTS <part names>` query to simulate merges without committing the result part. It may be useful for testing purposes: verifying merge correctness in the new version, deterministically reproducing merge-related bugs, and reliably benchmarking merge performance.
108+
109+
Reference the issue if one exists: `Closes #XXXXX` or `Fixes #XXXXX` at the end.
110+
111+
## What to avoid
112+
113+
These patterns make PRs harder to read or signal low effort:
114+
115+
- `fix(scope):` / `feat():` / `chore():` — ClickHouse doesn't use conventional commits
116+
- `This PR ...` to start any section — just describe the change
117+
- Vague titles like `Fuzzer fixes`, `Fix bug`, `Improvements` — always say what specifically
118+
- Markdown tables comparing "before/after behavior" unless genuinely useful
119+
- Perfectly parallel sentence structure throughout — vary phrasing naturally
120+
121+
## AI co-authorship
113122

114-
## Process
123+
It's fine to mention AI assistance openly. `Co-Authored-By:` in commits or acknowledgements in the PR description are acceptable — ClickHouse is open about AI-assisted development.
115124

116-
1. Read the diff to understand the change
117-
2. Pick the right changelog category
118-
3. Write a changelog entry for users (not developers)
119-
4. Only add free-form context above template if the change is non-obvious
120-
5. Re-read the body -- delete anything that a ClickHouse maintainer would call slop
121-
6. Check title -- no conventional commit prefix, no AI-polished phrasing
125+
Before creating or updating the PR, check user memory for a confirmation preference. If no preference is stored and the session is interactive, ask once: "Should I always show you the description for approval before applying it, or just go ahead every time?" Save the answer to memory and apply it from then on. If the session is non-interactive, proceed directly without asking.

0 commit comments

Comments
 (0)