fix(core): fix transient file does not exist error in queries #6629
bluestreak01 merged 4 commits into master
Walkthrough
Adds cleanup of phantom partition directories in O3PartitionJob when dedup processing completes without requiring merge operations. Also introduces a new test method that verifies partition purge behavior in WAL dedup scenarios with multiple partition versions.
Estimated code review effort: 2 (Simple), ~10 minutes
Pre-merge checks: 1 passed, 2 failed (1 warning, 1 inconclusive)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @core/src/test/java/io/questdb/test/cairo/o3/O3PartitionPurgeTest.java:
- Around line 637-677: Add assertions to testDedupWithPartitionPurge: after the
final runPartitionPurgeJobs() (and after the try-with-resources reader is
closed) assert the standard zero-error count used throughout this test class
(i.e. the same assertEquals/assertion that other tests use to verify no
telemetry/errors were produced), and optionally add filesystem assertions that
the partition directory/version paths expected to remain (and any old versions
expected to be purged) exist or do not exist using the same helper utilities in
this test suite (referencing runPartitionPurgeJobs, getReader, drainWalQueue to
locate the correct spot to insert these checks).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/io/questdb/cairo/O3PartitionJob.java
core/src/test/java/io/questdb/test/cairo/o3/O3PartitionPurgeTest.java
🧰 Additional context used
🧬 Code graph analysis (1)
core/src/main/java/io/questdb/cairo/O3PartitionJob.java (1)
core/src/main/java/io/questdb/std/str/Path.java (1)
Path(51-533)
🔇 Additional comments (1)
core/src/main/java/io/questdb/cairo/O3PartitionJob.java (1)
2246-2255: LGTM - Appropriate cleanup of phantom partition directories.
The fix correctly removes the pre-allocated partition directory when the dedup append optimization path is taken. This prevents the Partition Purge job from miscounting empty directories as valid partition versions. The pattern is consistent with the similar cleanup at lines 2176-2180, and the error handling appropriately logs but continues since the failure impact is transient.
[PR Coverage check] 😍 pass: 7 / 8 (87.50%)
Summary
Fixes an issue with similar symptoms to #6614.
When the dedup logic detects that incoming data is identical to existing data in the partition (plus a few additional rows that can be written as an append), it appends the data directly to the existing partition. However, it leaves behind an unused partition directory from the O3 merge preparation. The Partition Purge job then incorrectly counts this orphaned directory as the next valid partition version, leading to incorrect partition version tracking.
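To make the failure mode concrete, here is a minimal sketch of how a version scan can be fooled by a leftover directory. It is not QuestDB's Partition Purge code: the class `PhantomVersionDemo`, the method `latestVersion`, the file name `ts.d`, and the `<partition>.<n>` directory naming used here are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class PhantomVersionDemo {

    // Hypothetical, simplified scan: returns the highest numeric suffix among
    // directories named "<partition>.<n>", or -1 if there are none.
    // It does not check whether a directory actually holds data.
    static long latestVersion(Path tableDir, String partition) throws IOException {
        long latest = -1;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(tableDir, partition + ".*")) {
            for (Path dir : dirs) {
                if (!Files.isDirectory(dir)) {
                    continue;
                }
                String name = dir.getFileName().toString();
                String suffix = name.substring(partition.length() + 1);
                try {
                    latest = Math.max(latest, Long.parseLong(suffix));
                } catch (NumberFormatException ignore) {
                    // not a versioned partition directory, skip it
                }
            }
        }
        return latest;
    }

    public static void main(String[] args) throws IOException {
        Path table = Files.createTempDirectory("table");

        // Version 3 is the live partition: data was appended to it in place.
        Path live = Files.createDirectory(table.resolve("2024-01-01.3"));
        Files.write(live.resolve("ts.d"), new byte[]{1, 2, 3});

        // Version 4 was pre-allocated for a merge that never happened and was left behind.
        Files.createDirectory(table.resolve("2024-01-01.4"));

        // The scan reports 4 even though the data still lives in version 3.
        System.out.println("latest version seen by scan: " + latestVersion(table, "2024-01-01"));
    }
}
```

In this sketch, a scan that trusts directory names alone treats the empty directory as the newest version, which mirrors the miscount described above.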
Root Cause
During O3 merge with dedup, when the optimization path detects append-only changes, the pre-allocated partition directory is not cleaned up, causing the purge job to misinterpret partition versioning.
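The shape of the fix, as the review describes it, is to remove the pre-allocated directory once the append-only path is taken, and to log rather than fail if the removal does not succeed. The sketch below shows that pattern with the standard `java.nio.file` API. It is an illustration only, not the code in O3PartitionJob; the class `PhantomDirCleanup` and the method `removeUnusedDir` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PhantomDirCleanup {

    // Hypothetical helper: best-effort removal of a pre-allocated directory
    // that ended up unused. Returns true only if the directory was deleted.
    // A failure is logged and swallowed so the caller can carry on.
    static boolean removeUnusedDir(Path preallocated) {
        try {
            // deleteIfExists refuses to delete a non-empty directory and throws
            // DirectoryNotEmptyException, so a directory with data is never removed.
            return Files.deleteIfExists(preallocated);
        } catch (IOException e) {
            System.err.println("could not remove unused partition dir " + preallocated + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path table = Files.createTempDirectory("table");
        Path unused = Files.createDirectory(table.resolve("2024-01-01.4"));

        // Append-only path taken: the pre-allocated directory is no longer needed.
        boolean removed = removeUnusedDir(unused);
        System.out.println("removed: " + removed + ", still exists: " + Files.exists(unused));
    }
}
```

Treating the removal as best-effort matches the reviewer's note that the impact of a failed cleanup is transient.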