Skip to content

Bump scheduled exports count only in case it has been scheduled#1499

Merged
zvonand merged 4 commits intoantalya-26.1from
export_partition_increment_scheduled_after_scheduling
Mar 18, 2026
Merged

Bump scheduled exports count only in case it has been scheduled#1499
zvonand merged 4 commits intoantalya-26.1from
export_partition_increment_scheduled_after_scheduling

Conversation

@arthurpassos
Copy link
Copy Markdown
Collaborator

@arthurpassos arthurpassos commented Mar 10, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fixes the below by incrementing the scheduled counter only in case it has successfully scheduled the task

Medium: Scheduler slot counter increments after failed schedule attempt
Impact: Available background slots can be underutilized in the same scheduling cycle, delaying export throughput.

Partially fix #1498

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • Tiered Storage (2h)

@arthurpassos
Copy link
Copy Markdown
Collaborator Author

cc @Selfeer

@arthurpassos arthurpassos added port-antalya PRs to be ported to all new Antalya releases antalya-26.1 labels Mar 10, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 10, 2026

Workflow [PR], commit [127aa02]

@arthurpassos
Copy link
Copy Markdown
Collaborator Author

#1498

@Selfeer
Copy link
Copy Markdown
Collaborator

Selfeer commented Mar 16, 2026

PR #1499 Audit Review

AI audit note: This review was generated by AI (gpt-5.3-codex).

Audit update for PR #1499 (scheduled export counter update placement in partition export scheduler)


Confirmed defects

Medium: Scheduler capacity counter no longer increments in lock_inside_the_task path

  • Impact: When export_merge_tree_partition_lock_inside_the_task=1, the per-run cap (available_move_executors) is no longer enforced by scheduled_exports_count; the scheduler keeps attempting to enqueue until executor rejection, then exits early via return, reducing scheduling fairness and causing avoidable failed scheduling attempts under load.
  • Anchor: src/Storages/MergeTree/ExportPartitionTaskScheduler.cpp / ExportPartitionTaskScheduler::run (if (manifest.lock_inside_the_task) branch).
  • Trigger: Enable export_merge_tree_partition_lock_inside_the_task, have at least one pending partition export with multiple parts, and run scheduler with finite move slots.
  • Why defect: PR moves scheduled_exports_count++ into the else (!lock_inside_the_task) path and removes the shared increment. Successful scheduling in the lock_inside_the_task branch no longer updates the counter used by both loop guards, so the intended scheduler-side capacity accounting is broken for that mode.

Code evidence — loop guards using the counter:

// Outer loop (by task entry)
if (scheduled_exports_count >= available_move_executors)
{
    LOG_INFO(storage.log, "ExportPartition scheduler task: Scheduled exports count is greater than available move executors, skipping");
    break;
}
// ...
// Inner loop (by part)
for (const auto & zk_part_name : parts_in_processing_or_pending)
{
    if (scheduled_exports_count >= available_move_executors)
    {
        LOG_INFO(storage.log, "ExportPartition scheduler task: Scheduled exports count is greater than available move executors, skipping");
        break;
    }

Code evidence — lock_inside_the_task branch: success path never increments counter:

if (manifest.lock_inside_the_task)
{
    // ...
    if (!storage.background_moves_assignee.scheduleMoveTask(part_export_manifest.task))
    {
        storage.export_manifests.erase(part_export_manifest);
        LOG_INFO(storage.log, "ExportPartition scheduler task: Failed to schedule export part task, skipping");
        return;
    }
    // ← no scheduled_exports_count++ here
}
else
{
    try
    {
        // ... zk lock + exportPartToTable(...)
    }
    catch (const Exception &) { /* ... */ }
}

scheduled_exports_count++;  // only reached after the else block; lock_inside_the_task success skips this

So when lock_inside_the_task is true and scheduleMoveTask succeeds, the loop continues with an unchanged scheduled_exports_count, and the guards above never stop further scheduling until the executor rejects.

  • Smallest logical reproduction steps: (1) Set export_merge_tree_partition_lock_inside_the_task=1; (2) create an export with multiple pending parts and limited move executor slots; (3) run scheduler and observe continued scheduling attempts past the configured per-run cap until scheduleMoveTask fails and returns from run.
  • Fix direction (short): Increment scheduled_exports_count immediately after successful scheduleMoveTask(...) in the lock_inside_the_task branch (or unify increment on both success paths).
  • Regression test direction (short): Add scheduler unit/integration coverage for both lock modes verifying counter-based cap behavior and absence of scheduleMoveTask rejection when exactly filling available_move_executors.
  • Affected subsystem and blast radius: Replicated MergeTree partition-export scheduling; impacts clusters using partition export with lock_inside_the_task enabled.

Coverage summary

  • Scope reviewed: PR delta in src/Storages/MergeTree/ExportPartitionTaskScheduler.cpp plus connected scheduler/executor interfaces (BackgroundJobsAssignee::getAvailableMoveExecutors, BackgroundJobsAssignee::scheduleMoveTask) and setting propagation for lock_inside_the_task.
  • Categories failed: Scheduler capacity/accounting invariant across branch variants (lock_inside_the_task vs direct export path).
  • Categories passed: Entry/dispatch chain correctness; ZooKeeper status/part selection gating; non-lock_inside_the_task branch success/failure accounting intent; concurrency-focused scan for obvious new races/deadlocks from this delta; C++ bug-class checks (lifetime, iterator invalidation, integer/signedness, exception-safety, RAII, UB) for changed path.
  • Assumptions/limits: Static audit only (no runtime fault execution); findings are limited to PR Bump scheduled exports count only in case it has been scheduled #1499 diff and directly connected call paths.

@arthurpassos
Copy link
Copy Markdown
Collaborator Author

That is true, and shame on me for forgetting that :)

mkmkme
mkmkme previously approved these changes Mar 16, 2026
@Selfeer Selfeer added the verified Approved for release label Mar 17, 2026
@zvonand zvonand merged commit 9b978b9 into antalya-26.1 Mar 18, 2026
234 of 237 checks passed
subkanthi pushed a commit that referenced this pull request Mar 25, 2026
…eduled_after_scheduling

Bump scheduled exports count only in case it has been scheduled
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

antalya antalya-26.1 antalya-26.1.6.20001 port-antalya PRs to be ported to all new Antalya releases verified Approved for release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants