perf(core): optimize mat view writing by skipping transactions that are fully replaced by future commits or full refresh#6416
Conversation
…re fully replaced by future commits
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Pull Request Overview
This PR introduces an optimization to skip processing WAL (Write-Ahead Log) transactions with replace ranges that are fully covered by future transactions, improving performance when handling multiple overlapping replace range operations.
Key changes:
- Added
isReplaceRangeCoveredByFutureTransactions()method to detect when a transaction's replace range is superseded by a future transaction - Integrated the optimization check into
processWalCommit()to skip redundant transaction processing - Added stress test
testStressReplaceLastMinuteRepeatedly()to validate the optimization with 1,000 consecutive replace operations
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| core/src/main/java/io/questdb/cairo/TableWriter.java | Implements the skip optimization logic for replace range transactions, including commented-out experimental code and the new coverage detection method |
| core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java | Adds stress test for repeated replace range operations to validate the optimization works correctly under high load |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot <[email protected]>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 4
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
core/src/main/java/io/questdb/cairo/TableUtils.java(2 hunks)core/src/main/java/io/questdb/cairo/TableWriter.java(1 hunks)core/src/main/java/io/questdb/cairo/wal/ApplyWal2TableJob.java(4 hunks)core/src/main/java/io/questdb/cairo/wal/WalPurgeJob.java(1 hunks)core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java(3 hunks)core/src/test/java/io/questdb/test/cutlass/http/ExpParquetExportTest.java(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-07T00:59:31.522Z
Learnt from: bluestreak01
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2025-11-07T00:59:31.522Z
Learning: In QuestDB's Cairo engine, transaction (_txn) files have a strong invariant: they are never truncated below TX_BASE_HEADER_SIZE. Once created, they are either fully formed (size >= header size) or completely removed along with the entire table directory when the table is dropped.
Applied to files:
core/src/main/java/io/questdb/cairo/wal/ApplyWal2TableJob.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (34)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (Check Changes Check changes)
- GitHub Check: build
🔇 Additional comments (7)
core/src/test/java/io/questdb/test/cutlass/http/ExpParquetExportTest.java (1)
124-124: Verify the connection between this test change and the PR objectives.The date change from
'1971-01-01'to'1970-01-02'reduces the generated dataset from approximately 1 year (~6.3M rows) to 1 day (~17K rows) of 5-second intervals. While this makes the test more efficient and appropriate for a basic sanity check, it's unclear how this relates to the PR's stated objectives around WAL transaction skipping optimization for materialized views.Is this test adjustment:
- An incidental correction discovered during development?
- Related to underlying changes in date/time processing from the WAL optimization work?
- Addressing a previous test error?
Please confirm this change is intentional and clarify its relationship to the PR scope.
core/src/main/java/io/questdb/cairo/wal/WalPurgeJob.java (1)
285-285: LGTM! Appropriate logging level for routine WAL discovery.The change to debug-level logging (
verbose=false, logDebug=true) is appropriate for this context where lock failures during WAL segment discovery are routine operations, not exceptional conditions. This aligns well with the existing debug-level logging for successful lock acquisition on line 287.core/src/main/java/io/questdb/cairo/TableUtils.java (3)
53-53: LGTM! Import required for new logging pattern.The
LogRecordimport is necessary to support the conditional log level selection in the updatedlock()method implementations.
1091-1128: LGTM! Well-designed logging control for lock operations.The refactored
lock()method provides fine-grained control over logging behavior:
- Error paths (lines 1099-1119): Log at ERROR level when
verbose=trueandlogDebug=false, otherwise at DEBUG level whenlogDebug=true- Success path (lines 1124-1126): Always logs at DEBUG level when logging is enabled (appropriate for routine operations)
The overload chain maintains backward compatibility while enabling callers to choose appropriate logging levels based on operational context (e.g., expected vs. exceptional lock failures).
1130-1132: LGTM! Backward compatibility maintained.The existing convenience overload continues to delegate appropriately, ensuring no breaking changes for existing callers.
core/src/main/java/io/questdb/cairo/wal/ApplyWal2TableJob.java (2)
621-636: Verification complete: atomic behavior of trySkipWalTransactions is correctly implemented.The method exhibits proper all-or-nothing semantics:
Precondition check before state modification: The lag row/txn count check at line 3075 occurs before any state modifications via
commitSeqTxn, ensuring the precondition (no pending WAL writes) is validated first.Atomic state transitions:
- If the precondition fails (lag counts non-zero), the method returns
falseimmediately without modifying writer state.- If the precondition passes, it calls
commitSeqTxn(seqTxn + skipTxnCount - 1)and returnstrue.Safe fallback: When
trySkipWalTransactionsreturnsfalseat lines 624-627 in ApplyWal2TableJob.java, the writer state remains unchanged, allowing the fallback tocommitWalInsertTransactionsat lines 631-635 to proceed safely.The implementation fully addresses the verification requirements outlined in the original review comment.
638-681: Forward progress is guaranteed by both code paths; no issues found.The verification confirms that the return value calculation at line 681 is safe:
Skip path (line 3080): Explicitly calls
commitSeqTxn(seqTxn + skipTxnCount - 1), ensuringgetAppliedSeqTxn() >= seqTxn.Commit path (lines 1277–1363): The unconditional assertion at line 1361—
assert txWriter.getLagTxnCount() == (seqTxn - txWriter.getSeqTxn())—guarantees thatgetAppliedSeqTxn()(which equalstxWriter.getSeqTxn() + txWriter.getLagTxnCount()) always equals the method'sseqTxnparameter at completion, regardless of whether data was committed to the main table or buffered in LAG.Consequently, the loop at line 643 (line 643:
for (long s = seqTxn; s <= lastCommittedSeqTxn; s++)) always executes at least once, ensuring the return value(int) (lastCommittedSeqTxn - seqTxn + 1)is at least 1, satisfying the assertion at line 484.
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/io/questdb/test/cairo/wal/WalWriterReplaceRangeTest.java
Outdated
Show resolved
Hide resolved
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
nwoolmer
left a comment
There was a problem hiding this comment.
Looks good, a nit and maybe an additional safety check
|
I would like to merge this after Ent release, to de-risk Enterprise |
[PR Coverage check]😍 pass : 51 / 51 (100.00%) file detail
|
Pull Request Overview
This PR introduces an optimization to skip processing WAL (Write-Ahead Log) transactions with replace ranges that are fully covered by future transactions or if a
TRUNCATEoperation is visible in the future transactions, improving performance when handling multiple overlapping replace range operations.Also, this PR fixed a few minor issues:
writerTxninwal_tablesquery, sometimes sporadically observed.