Skip to content

Delay backporting until 7 days after merge and skip reverted PRs#100880

Draft
maxknv wants to merge 13 commits intomasterfrom
backport-delay-7d-ready-label
Draft

Delay backporting until 7 days after merge and skip reverted PRs#100880
maxknv wants to merge 13 commits intomasterfrom
backport-delay-7d-ready-label

Conversation

@maxknv
Copy link
Copy Markdown
Member

@maxknv maxknv commented Mar 27, 2026

PRs marked for backporting were previously picked up by cherry_pick.py immediately after merge. This had two problems:

  1. A PR that gets reverted within days would already have cherry-pick branches opened for it.
  2. There was no automated check that a revert exists before creating backport branches.

Changes

A new step is added to the CherryPick workflow (runs hourly) before the cherry-pick job. It runs ci/jobs/scripts/backport_labels.py, which:

  • Searches for merged PRs carrying a backport label (pr-must-backport, pr-must-backport-force, pr-critical-bugfix, vX.Y-must-backport) that were merged more than 7 days ago and updated in the last 30 days.
  • Checks each PR for a merged "Revert #N" PR. If one is found, all backport labels are removed from the original PR.
  • Otherwise, adds the ready-for-backport label.

cherry_pick.py is updated to require label:ready-for-backport in its search query, so only pre-vetted PRs are processed.

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry:

Backporting is now delayed until 7 days after a PR is merged, and reverted PRs are automatically detected and have their backport labels removed before any cherry-pick branches are created.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Add `ci/jobs/scripts/backport_labels.py` — a new step in the `CherryPick`
workflow that runs before `cherry_pick.py`. It searches for merged PRs
carrying a backport label (`pr-must-backport`, `pr-must-backport-force`,
`pr-critical-bugfix`, `vX.Y-must-backport`) that were merged more than 7
days ago and updated in the last 30 days. For each candidate it checks
whether a merged "Revert #N" PR exists: if so, all backport labels are
removed; otherwise the `ready-for-backport` label is added.

`cherry_pick.py` is updated to require `label:ready-for-backport` in its
search query, so only pre-vetted PRs are processed.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 27, 2026

Workflow [PR], commit [2aa2cf4]

Summary:


AI Review

Summary

This PR introduces a ready-for-backport staging label and a new backport_labels.py workflow step to delay backports by 7 days and skip reverted PRs before cherry_pick.py processes them. The overall direction is good, and most earlier correctness issues around revert detection appear to be addressed. I found one remaining reliability issue: label removal for reverted PRs can fail silently, which can leave stale backport labels and undermine the workflow’s guarantees.

Findings

⚠️ Majors

  • [ci/jobs/scripts/backport_labels.py:122] remove_backport_labels uses Shell.run without strict=True and without checking the return code. If gh pr edit fails (permissions/rate-limit/transient API/network errors), the script continues silently and reverted PRs may keep backport labels. That leaves inconsistent state and can cause downstream backport automation to process PRs that should have been de-scoped.
    • Suggested fix: call Shell.run(..., strict=True, ...) (or explicitly validate non-zero return code and raise).
Tests

⚠️ I could not run ci/tests/test_backport_labels.py in this environment because pytest is not installed (/usr/bin/python3: No module named pytest). Please run this test in CI (or an environment with the repo test dependencies) after applying fixes.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout ⚠️ Silent failure path in label-removal step can leave stale state in rollout automation
Compilation time
Performance & Safety
  • Safety/reliability concern: failure handling in the label-removal mutation is currently non-strict; this is an error-path robustness issue rather than a hot-path performance concern.
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions:
    • Make remove_backport_labels fail loudly on gh pr edit failure (strict=True or explicit return-code check + raise).

@clickhouse-gh clickhouse-gh bot added the pr-ci label Mar 27, 2026
@maxknv maxknv requested review from azat, leshikus and tavplubix March 27, 2026 10:52
@maxknv maxknv marked this pull request as draft March 27, 2026 10:53
from ci.jobs.scripts.workflow_hooks.pr_labels_and_category import Labels
from ci.praktika.utils import Shell

READY_DELAY_DAYS = 7
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 7 days is too long, from the other side it is more safer, I'm OK with it for now

The previous search query `"Revert" "#{pr_number}"` searched in title and
body, so it could match the original PR itself (whose body often references
its own number). This caused valid PRs to be mistakenly treated as reverted
and have their backport labels removed.

Three protections added to `find_revert_pr`:
- `in:title` — scope the search to titles only.
- Self-match exclusion — skip results whose number equals the original PR.
- Title validation — require the matched title to start with `Revert ` and
  contain the exact `#<N>` token (prevents `#1234` matching a search for
  `#123`).

Also improve logging: print each label action in non-dry-run mode and show
the revert PR number, title, and URL when one is found.

Fixes: #100880 (comment)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
ClickHouse revert PRs do not contain the original PR number in the title.
The actual format is:
  Title: Revert "<original title>"
  Body:  Reverts ClickHouse/ClickHouse#<N>

Switch the search to look for `Reverts {repo}#{pr_number}` in the body
(the canonical GitHub revert body format), and validate the title starts
with `Revert "` to confirm it is a GitHub-generated revert PR.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown
Contributor

@leshikus leshikus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a functionality requested by dev team - has to be implemented; all comments are nice to have

READY_DELAY_DAYS = 7


def gh_search(query: str, per_page: int = 100, max_results: int = 1000) -> List[Dict[str, Any]]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1000 is hardcoded

page = 1
while True:
output = Shell.get_output(
f"gh api -X GET search/issues "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the request looks generic enough and not specific for backporting; probably should be a part of some library

3. Have not been reverted (no merged "Revert #<N>" PR found).
4. Do not already carry `ready-for-backport` or `pr-backports-created`.

The label is later consumed by `cherry_pick.py`, which skips PRs without it.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's better to use "used"; does cherry_pick.py delete this label? even if it does, it's better to use a simpler language than "consumed"

return bool(gh_search(query, per_page=10, max_results=10))


def mark_ready(repo: str, pr_number: int, dry_run: bool) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function adds a label - maybe it's better to call adding a label add_<label>_label

)


def remove_backport_labels(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the function probably removes any labels; here goes the same comment as for the previous PR regarding PR cache - it's more conventional to have add_label and remove_label for PR object than re-implementing these functions each time

When running from the repo root, `praktika` (at ci/praktika/) was not
on sys.path, causing an import failure. Add `sys.path.append("ci")` so
that both `praktika.utils` and the transitive `praktika` imports inside
`pr_labels_and_category` resolve correctly.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
maxknv and others added 2 commits March 27, 2026 13:45
Replaces cwd-relative sys.path.append with paths derived from
the script's own location, so imports work regardless of which
directory the script is launched from.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
Aligns the search window with cherry_pick.py so PRs merged up to 90
days ago are eligible, preventing silent skips for PRs with no recent
activity.
maxknv added 2 commits March 27, 2026 13:52
- Add unit tests covering: exact body match, self-match exclusion,
  title mismatch, empty results, multiple candidates.
- Update module docstring to reflect actual revert detection logic
  (title Revert "..." + body Reverts {repo}#<N>).

Addresses:
  #100880 (comment)
  #100880 (comment)
Switch from get_output (swallows non-zero exit) to get_output_or_raise
so that rate-limit, auth, or network failures raise instead of silently
returning an empty candidate list and skipping all backport labeling.

Fixes: https://github.com/ClickHouse/ClickHouse/pull/100880/changes/BASE..8e3fbdf745bc94d75e119dc5fef1f2df58efa13b#r3000875077


if __name__ == "__main__":
sys.exit(main())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

label:{label_filter} is built from a comma-joined list of backport labels, but in GitHub search commas in label: act as AND (see existing note in tests/ci/cherry_pick.py around line 680). That makes this query require all backport labels at once, so candidates can be empty and PRs never get ready-for-backport.

Please build OR semantics instead (e.g. separate searches per label and union by PR number, or an explicit OR expression in q).

Add -label:pr-cherrypick and -label:pr-backport to the search query so
that backport/cherry-pick PRs derived from the original are never picked
up as candidates for the ready-for-backport label.
@CurtizJ
Copy link
Copy Markdown
Member

CurtizJ commented Mar 27, 2026

Is it possible to bypass this, and create backport immediately?

- get_release_branches: switch to get_output_or_raise and raise on empty
  output, consistent with gh_search; avoids silently dropping all
  version-specific backport labels on API failure
- remove_backport_labels: rename loop variable l -> lbl (shadows builtin)
- Move pathlib import into the stdlib block
print(f" DRY RUN: would add {Labels.READY_FOR_BACKPORT!r} to PR #{pr_number}")
return
print(f" Adding {Labels.READY_FOR_BACKPORT!r} to PR #{pr_number}")
Shell.run(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mark_ready ignores failures from gh pr edit: Shell.run defaults to strict=False, and the return code is not checked. If GitHub API calls fail (permissions, rate limit, transient network), this step silently continues and exits successfully, so PRs may never receive ready-for-backport even though the workflow reports success.

Please make this fail-fast (for example, Shell.run(..., strict=True)) or check the return code and raise. The same issue exists in remove_backport_labels.

The merged:2020-01-01.. lower bound was unnecessary — the updated:>=90d
filter already limits candidates to recently active PRs in steady state.
Use merged:..{cutoff} to express only the upper bound (not yet eligible).
return
print(f" Removing labels {to_remove} from PR #{pr_number}")
Shell.run(
f"gh pr edit {pr_number} {remove_args} --repo {repo}",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_backport_labels ignores failures from gh pr edit: Shell.run defaults to strict=False, and the return code is not checked.

If the GitHub call fails (permissions/rate limit/transient network), the script silently continues and leaves stale backport labels on reverted PRs, so they can still be picked up by downstream automation.

Please make this call strict (or check the return code and raise) the same way as other required workflow operations.

@maxknv
Copy link
Copy Markdown
Member Author

maxknv commented Mar 27, 2026

Is it possible to bypass this, and create backport immediately?

add label: ready-for-backport

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants