fix(core): fix failure handling on column renames#5752
Conversation
…g-on-col-rename Conflicts: core/src/main/java/io/questdb/cairo/TxReader.java
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe patch decouples TxWriter partition initialization from file opening, centralizes partition setup in TxReader/TxWriter, and updates TableWriter to load TXN before metadata and to align TODO handling with the current txn. Tests and utilities are adjusted to use the new TxWriter API and revised control/exception flows. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant TW as TableWriter
participant TXW as TxWriter
participant META as _meta
participant TODO as _todo
Note over TW,TXW: Open sequence (revised)
TW->>TXW: ofRW(path/to/_txn)
TXW-->>TW: TXN loaded (no partition set)
TW->>META: read metadata
META-->>TW: partitionBy, schema, etc.
TW->>TXW: initPartitionBy(partitionBy)
TW->>TODO: openTodoMem(tableTxn=TXW.txn)
TW->>TODO: readTodo(TXW.txn)
alt needs repair/restore
TW->>TODO: writeRestoreMetaTodo(TXW.txn)
end
TW-->>TW: proceed with writer initialization
sequenceDiagram
autonumber
participant TXR as TxReader
participant Part as Partitions
Note over TXR,Part: Partition init and next-partition queries
TXR->>TXR: initPartitionBy(partitionBy)
alt partitionBy == NONE
TXR->>Part: create synthetic transient partition
else
TXR->>Part: load attachedPartitions from TXN
end
TXR->>Part: getNextExistingPartitionTimestamp(ts)
Part-->>TXR: next existing or Long.MAX_VALUE
TXR->>Part: getNextPartitionTimestamp(ts)
Part-->>TXR: next floor-matched or ceil(ts)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
✨ Finishing touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/src/main/java/io/questdb/cairo/TableWriter.java (2)
6754-6783: _todo handling: ignore mismatched txn and record file length to avoid OOB readsTwo issues:
- Mismatched todoTxn should be ignored (return 0), but current code still returns the recorded count.
- You later read offsets 40/48 without verifying file size. Capture fileLen here to gate those reads.
Patch:
- private long openTodoMem(long tableTxn) { + private long openTodoMem(long tableTxn) { path.concat(TODO_FILE_NAME); try { if (ff.exists(path.$())) { long fileLen = ff.length(path.$()); if (fileLen < 32) { throw CairoException.critical(0).put("corrupt ").put(path); } todoMem.smallFile(ff, path.$(), MemoryTag.MMAP_TABLE_WRITER); this.todoTxn = todoMem.getLong(0); + this.todoFileLen = fileLen; // check if _todo_ file is consistent, if not, we just ignore its contents and reset hash - if (todoTxn == tableTxn && todoMem.getLong(24) != todoTxn) { + if (todoTxn == tableTxn && todoMem.getLong(24) != todoTxn) { todoMem.putLong(8, configuration.getDatabaseIdLo()); todoMem.putLong(16, configuration.getDatabaseIdHi()); Unsafe.getUnsafe().storeFence(); todoMem.putLong(24, todoTxn); return 0; } - return todoMem.getLong(32); + // Ignore stale or future _todo entries that don't match table txn. + if (todoTxn != tableTxn) { + return 0; + } + return todoMem.getLong(32); } else { resetTodoLog(ff, path, pathSize, todoMem); todoTxn = 0; + todoFileLen = 0; return 0; } } finally { path.trimTo(pathSize); } }And add a field:
// near todoMem declaration private long todoFileLen = 0;I can raise a small PR with these exact changes if helpful.
8760-8771: Guard reading opcode at offset 40 by file lengthEnsure we don’t read beyond the mapped region when a partial write left the file shorter than 48 bytes.
- private int readTodo(long tableTxn) { - long todoCount; - todoCount = openTodoMem(tableTxn); - - int todo; - if (todoCount > 0) { - todo = (int) todoMem.getLong(40); - } else { - todo = -1; - } - return todo; - } + private int readTodo(long tableTxn) { + final long todoCount = openTodoMem(tableTxn); + if (todoCount <= 0) { + return -1; + } + if (todoFileLen < 48) { + LOG.error().$("ignoring incomplete *todo* file (too short) [len=").$(todoFileLen).I$(); + return -1; + } + return (int) todoMem.getLong(40); + }
🧹 Nitpick comments (6)
core/src/test/java/io/questdb/test/cairo/TableWriterTest.java (2)
4174-4180: Make the local configuration final (small clarity win).Declaring
confasfinalcommunicates intent and prevents accidental reassignment in the test.- DefaultTestCairoConfiguration conf = new DefaultTestCairoConfiguration(root) { + final DefaultTestCairoConfiguration conf = new DefaultTestCairoConfiguration(root) {
4187-4189: Broaden catch to include CairoException for compatibility.Rename failures can surface as either
CairoError(distressed) orCairoException(older/alternate paths). A multi-catch avoids brittle tests across code paths/platforms.- } catch (CairoError renameError) { + } catch (CairoError | CairoException renameError) { Assert.assertTrue(writer.isDistressed()); }core/src/main/java/io/questdb/cairo/TxReader.java (2)
249-265: Use getQuick for consistency and speed.Other hot paths use
getQuick; prefer it here too.- return attachedPartitions.get(nextIndex); + return attachedPartitions.getQuick(nextIndex);
400-411: Clear attachedPartitions when (re)initializing NONE to avoid stale entries.Not strictly required (setPos resets size), but an explicit clear is safer if future logic inspects capacity contents.
public void initPartitionBy(int partitionBy) { this.partitionBy = partitionBy; this.partitionFloorMethod = PartitionBy.getPartitionFloorMethod(partitionBy); this.partitionCeilMethod = PartitionBy.getPartitionCeilMethod(partitionBy); if (!PartitionBy.isPartitioned(partitionBy)) { - // Add transient row count as the only partition in attached partitions list - attachedPartitions.setPos(LONGS_PER_TX_ATTACHED_PARTITION); + // Add transient row count as the only partition in attached partitions list + attachedPartitions.clear(); + attachedPartitions.setPos(LONGS_PER_TX_ATTACHED_PARTITION); initPartitionAt(0, DEFAULT_PARTITION_TIMESTAMP, transientRowCount, -1L); } }core/src/main/java/io/questdb/cairo/TableWriter.java (2)
2145-2147: TTL eviction log: clearer contextEmbedding table name and partitionTs is helpful. Minor nit: consider consistent lowercase for "evicting" vs other logs.
5456-5469: Housekeeping failures surfaced as non-critical exceptions: consider preserving errno/OOM flagshandleHousekeepingException logs and throws a non-critical CairoException with housekeeping=true. To avoid losing diagnostics, consider propagating errno/out-of-memory from a CairoException cause.
- CairoException ex; - if (e instanceof Sinkable) { - ex = CairoException.nonCritical().put("Data has been persisted, but we could not perform housekeeping [ex=").put((Sinkable) e).put(']'); - } else { - ex = CairoException.nonCritical().put("Data has been persisted, but we could not perform housekeeping [ex=").put(e.getMessage()).put(']'); - } + CairoException ex = CairoException.nonCritical(); + if (e instanceof CairoException) { + CairoException ce = (CairoException) e; + ex.setErrno(ce.errno).setOutOfMemory(ce.isOutOfMemory()); + ex.put("Data has been persisted, but we could not perform housekeeping {").put((Sinkable) ce).put('}'); + } else if (e instanceof Sinkable) { + ex.put("Data has been persisted, but we could not perform housekeeping {").put((Sinkable) e).put('}'); + } else { + ex.put("Data has been persisted, but we could not perform housekeeping [ex=").put(e.getMessage()).put(']'); + }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (8)
core/src/main/java/io/questdb/cairo/TableConverter.java(1 hunks)core/src/main/java/io/questdb/cairo/TableWriter.java(14 hunks)core/src/main/java/io/questdb/cairo/TxReader.java(6 hunks)core/src/main/java/io/questdb/cairo/TxWriter.java(2 hunks)core/src/test/java/io/questdb/test/cairo/TableWriterTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
- GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
- GitHub Check: New pull request (Hosted Running tests on windows-other)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (17)
core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java (1)
244-244: TxWriter.ofRW(path) migration looks correctUsing the 1-arg overload is appropriate here since only lag fields are updated and committed; no partition-specific ops are used.
core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java (1)
351-351: Good: deterministic RNG for reproducibilitySeeding the RNG stabilizes CI behavior for this heavy test.
core/src/main/java/io/questdb/cairo/TableConverter.java (1)
113-113: TxWriter open updated to new API — verify no partition dependenceSwitching to
ofRW(path)aligns with the decoupled partition init. Since only lag/structure version resets are performed, noinitPartitionByis needed here.core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java (1)
447-447: Updated to 1-arg TxWriter.ofRW — consistent with API changeAll three call sites now use the new overload; behavior is unchanged for lag reads/writes.
Also applies to: 461-461, 481-481
core/src/test/java/io/questdb/test/cairo/TableWriterTest.java (1)
4194-4197: LGTM: reusing the same FilesFacade-backed configuration is correct here.core/src/main/java/io/questdb/cairo/TxReader.java (3)
267-286: LGTM: next-partition resolution logic is clearer and correct.
412-414: LGTM: initRO now cleanly separates mapping from partition setup.
714-719: LGTM: NONE partitioning represented via a synthetic transient partition is consistent with initPartitionBy.core/src/main/java/io/questdb/cairo/TableWriter.java (9)
407-421: Load TXN before _meta: good ordering and error mappingInitializing TxWriter early and mapping ERRNO_FILE_CANNOT_READ to tableDoesNotExist is sensible for rename-recovery. Looks good.
429-430: initPartitionBy after reading _meta: correct alignment with new TxWriter APIThis aligns the writer’s partitioning with the persisted metadata. Good change.
1165-1219: Symbol capacity change flow hardened: commit-before-cleanup + purge handling
- Deferring purge and reopening to a post-commit block and routing failures via handleHousekeepingException improves resilience.
- Bumping metadata first and clearing _todo last matches the new policy. LGTM.
2789-2805: Rename column sequence fixed (commit-before-cleanup): goodLocalizing newColumnName, writing _meta.swp, linking files, bumping versions, then clearing _todo is the right order.
2810-2829: Post-rename housekeeping via handleHousekeepingExceptionUpdating designatedTimestampColumnName, ddl event, metadata cache hydration, and logging are in the right place; failures won’t mark writer distressed. LGTM.
3847-3847: Commit-before-cleanup in clearTodoAndCommitMetabumpMetadataVersion() before clearing _todo prevents dangling meta without txn. Good.
3858-3858: Commit-before-cleanup in clearTodoAndCommitMetaStructureVersionSame rationale as above; good.
5558-5558: housekeep now delegates to handleHousekeepingExceptionThis correctly converts post-commit failures into surfaced non-critical errors without distressing the writer.
10215-10227: writeRestoreMetaTodo: commit-first layout correctWriting txn at 0, then metadata, then finally mirroring txn at 24 (with fences) robustly encodes intent. Good change.
core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java
Outdated
Show resolved
Hide resolved
|
@bluestreak01 this is now good to review again. The fuzz test failure was fixed in #6095 |
|
lets merge this on top of nanos - there will be conflicts |
…g-on-col-rename Conflicts: core/src/main/java/io/questdb/cairo/TableConverter.java core/src/main/java/io/questdb/cairo/TableWriter.java core/src/main/java/io/questdb/cairo/TxReader.java core/src/main/java/io/questdb/cairo/TxWriter.java core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java
[PR Coverage check]😍 pass : 78 / 87 (89.66%) file detail
|

When the column rename code in
TableWriterfails to write Column Version File before commit, the commit fails but the metadata file is not rolled back.The fix is to remove
_todofile after the commit to_txnfile and also have tabletxninformation inside_todofile so in case of a failure to clean the_todofile after the commit, it does not cause the roll back of the metadata file.The fix refactoring consists of
_todofile cleanup so that commit happens firsttxninto the_todofile and ignore any information if it does not match the tabletxnon readTableWriterinitialization so thattxnis read before_todoTableWritercolumn rename to markTableWriteras distressed if any step to complete the rename failsFound by fuzz test.