Export Part/Partition integration tests (PR #1388) failing consistently under ASAN builds #112

@CarlosFelipeOR

Summary

Integration tests introduced by Altinity/ClickHouse#1388 ("Forward port of export part and partition") are failing consistently in ASAN builds. These failures are polluting CI reports for all open PRs on antalya-26.1 since the tests were merged into the base branch on Mar 4.

Example: CI Report from #1405

Affected Tests

test_export_merge_tree_part_to_object_storage (2 of 7 tests failing):

  • test_add_column_during_export
  • test_drop_column_during_export_snapshot

test_export_replicated_mt_partition_to_object_storage (14 of 20 tests failing):

  • test_concurrent_exports_to_different_targets
  • test_drop_source_table_during_export
  • test_export_partition_file_already_exists_policy
  • test_export_partition_permissions
  • test_export_partition_with_mixed_computed_columns
  • test_export_ttl
  • test_failure_is_logged_in_system_table
  • test_inject_short_living_failures
  • test_multiple_exports_within_a_single_query
  • test_mutation_in_partition_clause
  • test_mutations_after_export_partition_started
  • test_patch_parts_after_export_partition_started
  • test_pending_mutations_skip_before_export_partition
  • test_pending_patch_parts_skip_before_export_partition

Failure Pattern

Build type                            | Result
--------------------------------------|-----------
amd_binary                            | 100% OK
amd_tsan                              | 100% OK
arm_binary                            | 100% OK
amd_asan (non-targeted)               | ~50% FAIL
amd_asan (targeted, --count 10)       | ~90% FAIL

What is the "targeted" job? The Integration tests (amd_asan, targeted) job automatically selects tests relevant to a PR's changed files, using DWARF debug info to map changed code lines to test coverage. It also re-runs previously failed tests from the CI database (CIDB). Each selected test is executed 10 times (--count 10) under ASAN to detect flaky or non-deterministic behavior.

PR #1388 only ran tests with amd_binary and arm_binary before being merged — no ASAN builds were executed.

Root Cause

Two separate issues:

1. Timeout under ASAN (non-targeted jobs)

Operations like ALTER TABLE ... DROP COLUMN during export exceed the 600-second query timeout when running under ASAN instrumentation (~2–3× overhead). Example from PR #1405 amd_asan 2/6:

subprocess.TimeoutExpired: Command '[...clickhouse client...]' timed out after 600 seconds.
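
One way to picture an ASAN-aware timeout is a small helper that scales the base timeout by build type. This is only a sketch: the helper name `scaled_timeout`, the multiplier table, and the `CI_BUILD_TYPE` environment variable are hypothetical and not part of the existing test framework. The 3.0 factor is an assumption taken from the upper end of the ~2–3× overhead quoted above.

```python
import os

# Assumed slowdown factor per sanitizer. Only ASAN is listed because the
# issue reports timeouts under ASAN only; 3.0 is an assumption based on
# the ~2-3x overhead mentioned above.
SANITIZER_TIMEOUT_MULTIPLIER = {
    "asan": 3.0,
}

BASE_QUERY_TIMEOUT_SECONDS = 600  # the timeout seen in the failing run


def scaled_timeout(base_seconds: float, build_type: str) -> float:
    """Scale base_seconds if build_type names a sanitizer with a known multiplier."""
    for sanitizer, factor in SANITIZER_TIMEOUT_MULTIPLIER.items():
        if sanitizer in build_type.lower():
            return base_seconds * factor
    return base_seconds


if __name__ == "__main__":
    # CI_BUILD_TYPE is a hypothetical variable name used for illustration.
    build = os.environ.get("CI_BUILD_TYPE", "amd_binary")
    print(scaled_timeout(BASE_QUERY_TIMEOUT_SECONDS, build))
```

With these assumed values an `amd_asan` run would get 1800 seconds, while `amd_binary`, `arm_binary`, and `amd_tsan` keep the 600-second default.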

2. Non-idempotent test design (targeted job)

Tests use hardcoded table names (e.g., add_column_during_export_mt_table) without DROP TABLE IF EXISTS in setup. When the targeted job runs with --count 10, the first iteration may fail or leave tables behind, causing subsequent iterations to fail with:

Code: 57. DB::Exception: Table default.add_column_during_export_mt_table already exists. (TABLE_ALREADY_EXISTS)

The targeted job picks up these tests because it re-runs "previously failed tests" from the CIDB — so the initial ASAN timeout failure triggers a feedback loop:

fail → targeted picks it up → fails 10× → targeted picks it up again.

Example: Integration tests (amd_asan, targeted) from #1405

Impact

Since PR #1388 was merged into antalya-26.1 on Mar 4, every PR that updates its branch inherits these tests. Observed across PRs:

Tests Passing Without Issues

The following tests from PR #1388 pass consistently across all build types, including ASAN:

  • test_data_mutations_after_export_started
  • test_pending_mutations_skip_before_export
  • test_pending_mutations_throw_before_export
  • test_pending_patch_parts_skip_before_export
  • test_pending_patch_parts_throw_before_export
  • test_export_partition_feature_is_disabled
  • test_kill_export
  • test_pending_mutations_throw_before_export_partition
  • test_pending_patch_parts_throw_before_export_partition
  • test_restart_nodes_during_export

Suggested Fixes

  1. Increase query timeouts for export tests or add ASAN-aware timeout multipliers
  2. Add DROP TABLE IF EXISTS / CREATE TABLE IF NOT EXISTS to test setup for idempotency with targeted --count 10 reruns
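
The effect of fix 2 can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` as a stand-in for ClickHouse (an assumption made purely so the example runs anywhere; the real tests would issue the equivalent statements through the integration test framework). The function names are hypothetical, and the table name is the one from the error message above.

```python
import sqlite3

# Hardcoded table name, as in the failing test.
TABLE = "add_column_during_export_mt_table"


def naive_setup(conn):
    # Mirrors the current test design: CREATE without cleaning up first.
    conn.execute(f"CREATE TABLE {TABLE} (id INTEGER)")


def idempotent_setup(conn):
    # Drop any table left behind by a previous iteration, then recreate it.
    conn.execute(f"DROP TABLE IF EXISTS {TABLE}")
    conn.execute(f"CREATE TABLE {TABLE} (id INTEGER)")


def run_repeated(setup, count=10):
    """Run setup `count` times on one connection and return the failure count."""
    conn = sqlite3.connect(":memory:")
    failures = 0
    for _ in range(count):
        try:
            setup(conn)
        except sqlite3.OperationalError:
            # sqlite3 reports "table ... already exists" as OperationalError.
            failures += 1
    conn.close()
    return failures


if __name__ == "__main__":
    print("naive failures:", run_repeated(naive_setup))
    print("idempotent failures:", run_repeated(idempotent_setup))
```

In this stand-in, the naive setup succeeds only on the first iteration and fails on every rerun, whereas the idempotent setup never fails. Generating a unique table name per iteration would be an alternative design with the same effect.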
