sql/ttl: enable TTL tests to run with secondary tenants #160570

Merged
craig[bot] merged 1 commit into cockroachdb:master from rafiss:mt-ttltest
Jan 7, 2026

@rafiss rafiss commented Jan 6, 2026

Previously, TTL tests used `TestIsForStuffThatShouldWorkWithSecondaryTenantsButDoesntYet` and manually controlled tenant creation. This prevented the tests from benefiting from the standard tenant randomization in the test framework.

This commit makes several changes to enable TTL tests to work with tenants:

  1. Updates `newRowLevelTTLTestJobTestHelper` to use `ApplicationLayer(0)` instead of manually starting tenants, leveraging the built-in tenant randomization logic.

  2. Fixes `SplitTable` in testcluster to use `TestingMakePrimaryIndexKeyForTenant` with the correct codec, so range splits work correctly for tenant tables.

  3. Fixes external process tenant startup to propagate version settings from the parent, allowing tenants to start when the cluster is running at an older version (e.g., `MinSupported`).

  4. Removes the `testMultiTenant` parameter from the test helper since tenant mode is now controlled by the framework's randomization.
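
To illustrate why the codec matters in item 2, here is a minimal toy sketch, not the actual CockroachDB key encoding. The `codec` type, its `prefix` and `primaryIndexKey` methods, and the string-based key format are all hypothetical and exist only for illustration. The assumption is that a secondary tenant's table keys carry a tenant prefix that the system tenant's keys lack, so a split key built without the tenant's codec would point outside the tenant's table.

```go
package main

import (
	"fmt"
	"strings"
)

// systemTenantID marks the tenant whose keys carry no prefix in this toy model.
const systemTenantID = 1

// codec is a hypothetical stand-in for a per-tenant SQL key codec.
type codec struct {
	tenantID uint64
}

// prefix returns the string this toy codec prepends to every table key.
func (c codec) prefix() string {
	if c.tenantID == systemTenantID {
		return ""
	}
	return fmt.Sprintf("/Tenant/%d", c.tenantID)
}

// primaryIndexKey builds a human-readable primary index key (index ID 1)
// for the given table and primary key column values.
func (c codec) primaryIndexKey(tableID uint32, vals ...int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%s/Table/%d/1", c.prefix(), tableID)
	for _, v := range vals {
		fmt.Fprintf(&b, "/%d", v)
	}
	return b.String()
}
```

In this model, the same table ID and primary key value produce different keys for the system tenant and for tenant 10, which is why a split helper has to be handed the codec of the tenant that owns the table.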

Resolves: #109391

Release note: None

@rafiss rafiss requested a review from fqazi January 6, 2026 20:36
@rafiss rafiss requested review from a team as code owners January 6, 2026 20:36
@rafiss rafiss requested review from shailendra-patel and williamchoe3 and removed request for a team January 6, 2026 20:36
@fqazi fqazi left a comment

:lgtm:

@fqazi reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @shailendra-patel and @williamchoe3).

rafiss commented Jan 6, 2026

bors r+

craig bot pushed a commit that referenced this pull request Jan 6, 2026
160138: sql/inspect: add protected timestamp for "now" AOST case r=rafiss a=rafiss

Previously, INSPECT jobs without a historical AS OF SYSTEM TIME clause
would not create protected timestamp records, but still used an AOST
clause with the current timestamp. If span processing took a long time
(especially with BulkLowQoS admission control), garbage collection could
occur before the query completed, resulting in "batch timestamp must be
after replica GC threshold" errors.

This change adds per-span protected timestamp protection when INSPECT
uses "now" as the AOST. The implementation uses a coordinator-based
approach where:

1. When a processor starts processing a span and picks "now" as the
    timestamp, it sends a new "span started" progress message containing
    the span and timestamp via InspectProcessorProgress.

2. The coordinator's progress tracker receives this message and calls
    TryToProtectBeforeGC for the relevant tables in that span. This
    waits until 80% of the GC TTL has elapsed before creating a PTS,
    avoiding unnecessary PTS creation for quick operations.

3. When span processing completes (existing behavior), the coordinator
    cleans up the PTS for that span. Any remaining PTS records are
    cleaned up when the tracker terminates (e.g., on job cancellation).

This coordinator-based design keeps PTS management centralized rather
than distributed across processors, simplifying cleanup and error
handling. PTS failures are logged but don't fail the job since the
protection is best-effort.
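
The 80% wait in step 2 can be sketched as a small calculation. This is a hedged illustration only: `delayBeforeProtect` is a hypothetical helper, not the `TryToProtectBeforeGC` implementation, and it assumes the wait is measured from the timestamp the processor picked (passed in here as `age`, the time already elapsed since that timestamp).

```go
package main

import "time"

// delayBeforeProtect returns how long to wait before creating a protected
// timestamp record, assuming protection should begin once 80% of the GC TTL
// has elapsed since the read timestamp. age is the time already elapsed.
// Hypothetical helper for illustration only.
func delayBeforeProtect(gcTTL, age time.Duration) time.Duration {
	// Integer arithmetic avoids floating-point rounding of the threshold.
	threshold := gcTTL * 4 / 5
	remaining := threshold - age
	if remaining < 0 {
		// Already past the threshold: protect immediately.
		return 0
	}
	return remaining
}
```

Under these assumptions, a span whose processing finishes before the delay expires never needs a PTS record at all, which matches the stated goal of avoiding PTS creation for quick operations.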

###  sql/inspect: use minimum timestamp for PTS protection

Previously, the INSPECT job called TryToProtectBeforeGC per span with
different timestamps. Since the job only stores one PTS record, each
new span's call to Protect would update the existing record's timestamp
via UpdateTimestamp, which removes protection for older spans.

To address this, this patch changes the PTS strategy to track the
minimum (oldest) timestamp across all active spans and protect only at
that timestamp. Since PROTECT_AFTER mode protects all data at or after
the specified timestamp, protecting at the minimum covers all active
spans. When the oldest span completes, the PTS is updated to the new
minimum timestamp, allowing GC of data between the old and new minimum.
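
A minimal sketch of the minimum-timestamp bookkeeping described above, under stated assumptions: `spanTimestamps` and its methods are hypothetical names, spans are identified by plain strings, and timestamps are plain integers. It shows only the tracking logic, not the actual PTS calls.

```go
package main

// spanTimestamps tracks the read timestamp of every span that is still
// being processed. Hypothetical type for illustration only.
type spanTimestamps struct {
	active map[string]int64 // span identifier -> timestamp
}

func newSpanTimestamps() *spanTimestamps {
	return &spanTimestamps{active: make(map[string]int64)}
}

// start records that processing of span began at timestamp ts.
func (s *spanTimestamps) start(span string, ts int64) {
	s.active[span] = ts
}

// finish records that processing of span completed.
func (s *spanTimestamps) finish(span string) {
	delete(s.active, span)
}

// oldest returns the minimum timestamp across all active spans. The second
// result is false when no spans are active, meaning nothing needs protection.
func (s *spanTimestamps) oldest() (int64, bool) {
	var minTS int64
	found := false
	for _, ts := range s.active {
		if !found || ts < minTS {
			minTS = ts
			found = true
		}
	}
	return minTS, found
}
```

In this sketch, the single PTS record would be set to whatever `oldest` returns after each `start` or `finish`, so it advances only when the span holding the minimum completes.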

Resolves: #159866
Epic: None

Release note: None

160570: sql/ttl: enable TTL tests to run with secondary tenants r=rafiss a=rafiss

Co-authored-by: Rafi Shamim

craig bot commented Jan 6, 2026

Build failed (retrying...):

craig bot commented Jan 7, 2026

@craig craig bot merged commit 839f8e8 into cockroachdb:master Jan 7, 2026
33 of 35 checks passed
@rafiss rafiss deleted the mt-ttltest branch January 7, 2026 00:58

Development

Successfully merging this pull request may close these issues.

ttljob: adjust all tests to work with test tenant

3 participants