Skip to content

fix(core): fix failure handling on column renames#5752

Merged
bluestreak01 merged 16 commits intomasterfrom
fix-failure-handling-on-col-rename
Sep 24, 2025
Merged

fix(core): fix failure handling on column renames#5752
bluestreak01 merged 16 commits intomasterfrom
fix-failure-handling-on-col-rename

Conversation

@ideoma
Copy link
Copy Markdown
Collaborator

@ideoma ideoma commented Jun 17, 2025

When the column rename code in TableWriter fails to write Column Version File before commit, the commit fails but the metadata file is not rolled back.

The fix is to remove _todo file after the commit to _txn file and also have table txn information inside _todo file so in case of a failure to clean the _todo file after the commit, it does not cause the roll back of the metadata file.

The fix refactoring consists of

  • Change the order of commit and _todo file cleanup so that commit happens first
  • Add table txn into the _todo file and ignore any information if it does not match the table txn on read
  • Change TableWriter initialization so that txn is read before _todo
  • Chnage of TableWriter column rename to mark TableWriter as distressed if any step to complete the rename fails

Found by fuzz test.

Copy link
Copy Markdown
Member

@bluestreak01 bluestreak01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@ideoma ideoma marked this pull request as draft June 23, 2025 10:08
@ideoma ideoma marked this pull request as ready for review June 30, 2025 13:47
…g-on-col-rename

 Conflicts:
	core/src/main/java/io/questdb/cairo/TxReader.java
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Sep 1, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The patch decouples TxWriter partition initialization from file opening, centralizes partition setup in TxReader/TxWriter, and updates TableWriter to load TXN before metadata and to align TODO handling with the current txn. Tests and utilities are adjusted to use the new TxWriter API and revised control/exception flows.

Changes

Cohort / File(s) Summary
TxWriter/TxReader partition init refactor
core/src/main/java/io/questdb/cairo/TxWriter.java, core/src/main/java/io/questdb/cairo/TxReader.java
Removed partitionBy from ofRO/ofRW signatures; added initPartitionBy(int). TxReader now centralizes partition init, adds getNextExistingPartitionTimestamp, updates initRO signature, and represents NONE with a synthetic transient partition.
TableWriter initialization and housekeeping
core/src/main/java/io/questdb/cairo/TableWriter.java
Loads TXN before _meta; reads TODO using current txn; defers partition init via txWriter.initPartitionBy(metadata.getPartitionBy()); adds handleHousekeepingException; adjusts metadata version bump timing; refactors rename/purge flows; updates TODO writes to use txWriter.txn; enhances TTL logging.
Converter minor update
core/src/main/java/io/questdb/cairo/TableConverter.java
Drops explicit PartitionBy.DAY when opening TXN writer; uses one-arg ofRW(path).
Tests updated for new API and flows
core/src/test/java/io/questdb/test/cairo/TableWriterTest.java, core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java, core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java
Replace TxWriter.ofRW(path, partitionBy) with ofRW(path); adjust TableWriterTest to use named conf and expect CairoError plus distressed state; minor control-flow/resource handling tweaks in tests.
Fuzz test determinism and gating
core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java
Run heavy metadata-change test only on macOS and use deterministic RNG seeds.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant TW as TableWriter
  participant TXW as TxWriter
  participant META as _meta
  participant TODO as _todo

  Note over TW,TXW: Open sequence (revised)
  TW->>TXW: ofRW(path/to/_txn)
  TXW-->>TW: TXN loaded (no partition set)
  TW->>META: read metadata
  META-->>TW: partitionBy, schema, etc.
  TW->>TXW: initPartitionBy(partitionBy)
  TW->>TODO: openTodoMem(tableTxn=TXW.txn)
  TW->>TODO: readTodo(TXW.txn)
  alt needs repair/restore
    TW->>TODO: writeRestoreMetaTodo(TXW.txn)
  end
  TW-->>TW: proceed with writer initialization
Loading
sequenceDiagram
  autonumber
  participant TXR as TxReader
  participant Part as Partitions

  Note over TXR,Part: Partition init and next-partition queries
  TXR->>TXR: initPartitionBy(partitionBy)
  alt partitionBy == NONE
    TXR->>Part: create synthetic transient partition
  else
    TXR->>Part: load attachedPartitions from TXN
  end
  TXR->>Part: getNextExistingPartitionTimestamp(ts)
  Part-->>TXR: next existing or Long.MAX_VALUE
  TXR->>Part: getNextPartitionTimestamp(ts)
  Part-->>TXR: next floor-matched or ceil(ts)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

Bug, storage

Suggested reviewers

  • bluestreak01
  • jerrinot
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-failure-handling-on-col-rename

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ideoma
Copy link
Copy Markdown
Collaborator Author

ideoma commented Sep 1, 2025

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Sep 1, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/src/main/java/io/questdb/cairo/TableWriter.java (2)

6754-6783: _todo handling: ignore mismatched txn and record file length to avoid OOB reads

Two issues:

  • Mismatched todoTxn should be ignored (return 0), but current code still returns the recorded count.
  • You later read offsets 40/48 without verifying file size. Capture fileLen here to gate those reads.

Patch:

-    private long openTodoMem(long tableTxn) {
+    private long openTodoMem(long tableTxn) {
         path.concat(TODO_FILE_NAME);
         try {
             if (ff.exists(path.$())) {
                 long fileLen = ff.length(path.$());
                 if (fileLen < 32) {
                     throw CairoException.critical(0).put("corrupt ").put(path);
                 }
 
                 todoMem.smallFile(ff, path.$(), MemoryTag.MMAP_TABLE_WRITER);
                 this.todoTxn = todoMem.getLong(0);
+                this.todoFileLen = fileLen;
                 // check if _todo_ file is consistent, if not, we just ignore its contents and reset hash
-                if (todoTxn == tableTxn && todoMem.getLong(24) != todoTxn) {
+                if (todoTxn == tableTxn && todoMem.getLong(24) != todoTxn) {
                     todoMem.putLong(8, configuration.getDatabaseIdLo());
                     todoMem.putLong(16, configuration.getDatabaseIdHi());
                     Unsafe.getUnsafe().storeFence();
                     todoMem.putLong(24, todoTxn);
                     return 0;
                 }
 
-                return todoMem.getLong(32);
+                // Ignore stale or future _todo entries that don't match table txn.
+                if (todoTxn != tableTxn) {
+                    return 0;
+                }
+                return todoMem.getLong(32);
             } else {
                 resetTodoLog(ff, path, pathSize, todoMem);
                 todoTxn = 0;
+                todoFileLen = 0;
                 return 0;
             }
         } finally {
             path.trimTo(pathSize);
         }
     }

And add a field:

// near todoMem declaration
private long todoFileLen = 0;

I can raise a small PR with these exact changes if helpful.


8760-8771: Guard reading opcode at offset 40 by file length

Ensure we don’t read beyond the mapped region when a partial write left the file shorter than 48 bytes.

-    private int readTodo(long tableTxn) {
-        long todoCount;
-        todoCount = openTodoMem(tableTxn);
-
-        int todo;
-        if (todoCount > 0) {
-            todo = (int) todoMem.getLong(40);
-        } else {
-            todo = -1;
-        }
-        return todo;
-    }
+    private int readTodo(long tableTxn) {
+        final long todoCount = openTodoMem(tableTxn);
+        if (todoCount <= 0) {
+            return -1;
+        }
+        if (todoFileLen < 48) {
+            LOG.error().$("ignoring incomplete *todo* file (too short) [len=").$(todoFileLen).I$();
+            return -1;
+        }
+        return (int) todoMem.getLong(40);
+    }
🧹 Nitpick comments (6)
core/src/test/java/io/questdb/test/cairo/TableWriterTest.java (2)

4174-4180: Make the local configuration final (small clarity win).

Declaring conf as final communicates intent and prevents accidental reassignment in the test.

-            DefaultTestCairoConfiguration conf = new DefaultTestCairoConfiguration(root) {
+            final DefaultTestCairoConfiguration conf = new DefaultTestCairoConfiguration(root) {

4187-4189: Broaden catch to include CairoException for compatibility.

Rename failures can surface as either CairoError (distressed) or CairoException (older/alternate paths). A multi-catch avoids brittle tests across code paths/platforms.

-                } catch (CairoError renameError) {
+                } catch (CairoError | CairoException renameError) {
                     Assert.assertTrue(writer.isDistressed());
                 }
core/src/main/java/io/questdb/cairo/TxReader.java (2)

249-265: Use getQuick for consistency and speed.

Other hot paths use getQuick; prefer it here too.

-            return attachedPartitions.get(nextIndex);
+            return attachedPartitions.getQuick(nextIndex);

400-411: Clear attachedPartitions when (re)initializing NONE to avoid stale entries.

Not strictly required (setPos resets size), but an explicit clear is safer if future logic inspects capacity contents.

     public void initPartitionBy(int partitionBy) {
         this.partitionBy = partitionBy;
         this.partitionFloorMethod = PartitionBy.getPartitionFloorMethod(partitionBy);
         this.partitionCeilMethod = PartitionBy.getPartitionCeilMethod(partitionBy);
 
         if (!PartitionBy.isPartitioned(partitionBy)) {
-            // Add transient row count as the only partition in attached partitions list
-            attachedPartitions.setPos(LONGS_PER_TX_ATTACHED_PARTITION);
+            // Add transient row count as the only partition in attached partitions list
+            attachedPartitions.clear();
+            attachedPartitions.setPos(LONGS_PER_TX_ATTACHED_PARTITION);
             initPartitionAt(0, DEFAULT_PARTITION_TIMESTAMP, transientRowCount, -1L);
         }
     }
core/src/main/java/io/questdb/cairo/TableWriter.java (2)

2145-2147: TTL eviction log: clearer context

Embedding table name and partitionTs is helpful. Minor nit: consider consistent lowercase for "evicting" vs other logs.


5456-5469: Housekeeping failures surfaced as non-critical exceptions: consider preserving errno/OOM flags

handleHousekeepingException logs and throws a non-critical CairoException with housekeeping=true. To avoid losing diagnostics, consider propagating errno/out-of-memory from a CairoException cause.

-        CairoException ex;
-        if (e instanceof Sinkable) {
-            ex = CairoException.nonCritical().put("Data has been persisted, but we could not perform housekeeping [ex=").put((Sinkable) e).put(']');
-        } else {
-            ex = CairoException.nonCritical().put("Data has been persisted, but we could not perform housekeeping [ex=").put(e.getMessage()).put(']');
-        }
+        CairoException ex = CairoException.nonCritical();
+        if (e instanceof CairoException) {
+            CairoException ce = (CairoException) e;
+            ex.setErrno(ce.errno).setOutOfMemory(ce.isOutOfMemory());
+            ex.put("Data has been persisted, but we could not perform housekeeping {").put((Sinkable) ce).put('}');
+        } else if (e instanceof Sinkable) {
+            ex.put("Data has been persisted, but we could not perform housekeeping {").put((Sinkable) e).put('}');
+        } else {
+            ex.put("Data has been persisted, but we could not perform housekeeping [ex=").put(e.getMessage()).put(']');
+        }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 39f84f2 and 0e8fe81.

📒 Files selected for processing (8)
  • core/src/main/java/io/questdb/cairo/TableConverter.java (1 hunks)
  • core/src/main/java/io/questdb/cairo/TableWriter.java (14 hunks)
  • core/src/main/java/io/questdb/cairo/TxReader.java (6 hunks)
  • core/src/main/java/io/questdb/cairo/TxWriter.java (2 hunks)
  • core/src/test/java/io/questdb/test/cairo/TableWriterTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java (1 hunks)
  • core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: New pull request (Coverage Report Coverage Report)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
  • GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
  • GitHub Check: New pull request (Hosted Running tests on windows-other)
  • GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on windows-cairo)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
  • GitHub Check: New pull request (Hosted Running tests on windows-griffin)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
  • GitHub Check: New pull request (Hosted Running tests on mac-other)
  • GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
  • GitHub Check: New pull request (Hosted Running tests on mac-griffin)
  • GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
  • GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
  • GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
  • GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (17)
core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java (1)

244-244: TxWriter.ofRW(path) migration looks correct

Using the 1-arg overload is appropriate here since only lag fields are updated and committed; no partition-specific ops are used.

core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java (1)

351-351: Good: deterministic RNG for reproducibility

Seeding the RNG stabilizes CI behavior for this heavy test.

core/src/main/java/io/questdb/cairo/TableConverter.java (1)

113-113: TxWriter open updated to new API — verify no partition dependence

Switching to ofRW(path) aligns with the decoupled partition init. Since only lag/structure version resets are performed, no initPartitionBy is needed here.

core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java (1)

447-447: Updated to 1-arg TxWriter.ofRW — consistent with API change

All three call sites now use the new overload; behavior is unchanged for lag reads/writes.

Also applies to: 461-461, 481-481

core/src/test/java/io/questdb/test/cairo/TableWriterTest.java (1)

4194-4197: LGTM: reusing the same FilesFacade-backed configuration is correct here.

core/src/main/java/io/questdb/cairo/TxReader.java (3)

267-286: LGTM: next-partition resolution logic is clearer and correct.


412-414: LGTM: initRO now cleanly separates mapping from partition setup.


714-719: LGTM: NONE partitioning represented via a synthetic transient partition is consistent with initPartitionBy.

core/src/main/java/io/questdb/cairo/TableWriter.java (9)

407-421: Load TXN before _meta: good ordering and error mapping

Initializing TxWriter early and mapping ERRNO_FILE_CANNOT_READ to tableDoesNotExist is sensible for rename-recovery. Looks good.


429-430: initPartitionBy after reading _meta: correct alignment with new TxWriter API

This aligns the writer’s partitioning with the persisted metadata. Good change.


1165-1219: Symbol capacity change flow hardened: commit-before-cleanup + purge handling

  • Deferring purge and reopening to a post-commit block and routing failures via handleHousekeepingException improves resilience.
  • Bumping metadata first and clearing _todo last matches the new policy. LGTM.

2789-2805: Rename column sequence fixed (commit-before-cleanup): good

Localizing newColumnName, writing _meta.swp, linking files, bumping versions, then clearing _todo is the right order.


2810-2829: Post-rename housekeeping via handleHousekeepingException

Updating designatedTimestampColumnName, ddl event, metadata cache hydration, and logging are in the right place; failures won’t mark writer distressed. LGTM.


3847-3847: Commit-before-cleanup in clearTodoAndCommitMeta

bumpMetadataVersion() before clearing _todo prevents dangling meta without txn. Good.


3858-3858: Commit-before-cleanup in clearTodoAndCommitMetaStructureVersion

Same rationale as above; good.


5558-5558: housekeep now delegates to handleHousekeepingException

This correctly converts post-commit failures into surfaced non-critical errors without distressing the writer.


10215-10227: writeRestoreMetaTodo: commit-first layout correct

Writing txn at 0, then metadata, then finally mirroring txn at 24 (with fences) robustly encodes intent. Good change.

@ideoma
Copy link
Copy Markdown
Collaborator Author

ideoma commented Sep 3, 2025

@bluestreak01 this is now good to review again. The fuzz test failure was fixed in #6095

@bluestreak01
Copy link
Copy Markdown
Member

lets merge this on top of nanos - there will be conflicts

ideoma and others added 4 commits September 23, 2025 16:53
…g-on-col-rename

 Conflicts:
	core/src/main/java/io/questdb/cairo/TableConverter.java
	core/src/main/java/io/questdb/cairo/TableWriter.java
	core/src/main/java/io/questdb/cairo/TxReader.java
	core/src/main/java/io/questdb/cairo/TxWriter.java
	core/src/test/java/io/questdb/test/cairo/mig/EngineMigrationTest.java
	core/src/test/java/io/questdb/test/cairo/wal/WalTableSqlTest.java
@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 78 / 87 (89.66%)

file detail

path covered line new line coverage
🔵 io/questdb/cairo/TableWriter.java 64 73 87.67%
🔵 io/questdb/cairo/TableConverter.java 1 1 100.00%
🔵 io/questdb/cairo/TxWriter.java 4 4 100.00%
🔵 io/questdb/cairo/TxReader.java 9 9 100.00%

@bluestreak01 bluestreak01 merged commit cc12d56 into master Sep 24, 2025
35 checks passed
@bluestreak01 bluestreak01 deleted the fix-failure-handling-on-col-rename branch September 24, 2025 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants