feat(core): add data_id with current_data_id() SQL function #5917

Merged
bluestreak01 merged 28 commits into master from raph_data_id on Jul 16, 2025

Conversation

Contributor

@RaphDal RaphDal commented Jul 8, 2025

This PR introduces a data identifier to distinguish separate clusters and avoid conflicts on replication and restore.

Currently, users can query it through the current_data_id() SQL function. The long-term goal is to store this data ID in replications and backups so that, on restore, it can be checked against the replication and restore configuration and the operation aborted if there is a conflict.

From an implementation standpoint, the ID is a 128-bit UUID stored in the root of the database in a file named .data_id.
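
For illustration only, a 128-bit UUID fits in exactly 16 bytes as two 64-bit longs. The sketch below shows one way to encode and decode such a value with the JDK alone. The class name `DataIdCodec` is hypothetical, and the big-endian layout is an assumption; neither is taken from this PR, and the actual on-disk format of .data_id may differ.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper, not part of the PR: packs a UUID into 16 bytes and back.
final class DataIdCodec {
    static final int SIZE_BYTES = 16; // 128 bits

    private DataIdCodec() {
    }

    // Writes the most significant long first, then the least significant long.
    // ByteBuffer defaults to big-endian; the real file layout is an assumption here.
    public static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(SIZE_BYTES)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Rebuilds the UUID from exactly 16 bytes, rejecting any other length.
    public static UUID fromBytes(byte[] bytes) {
        if (bytes.length != SIZE_BYTES) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long hi = buf.getLong();
        long lo = buf.getLong();
        return new UUID(hi, lo);
    }
}
```

A round trip such as `DataIdCodec.fromBytes(DataIdCodec.toBytes(id))` should return a UUID equal to `id`.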

@RaphDal RaphDal changed the title from "Add database instance identification with current_data_id() SQL function" to "feat(core): add data_id with current_data_id() SQL function" on Jul 8, 2025
Contributor

@eugenels eugenels left a comment

Hey Raphael, thanks for the PR!

I think we can simplify the design here. Ideally, this could just be a static utility function: take a file path as input, return an existing or new ID as output. Since the ID is meant to be generated once and never updated, we don’t need to keep the file descriptor or memory map around for the lifetime of CairoEngine.
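
As a rough sketch of the shape suggested here (a static function that takes a file path and returns the existing or a newly generated ID without holding any file handle afterwards), something like the following would work with plain java.nio. The class `DataIdFiles` and method `readOrCreate` are hypothetical names, and the code uses standard JDK file APIs rather than QuestDB's own file and memory abstractions, so treat it as an illustration of the idea, not the merged implementation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Hypothetical utility, not the PR's DataID class.
final class DataIdFiles {
    private DataIdFiles() {
    }

    // Returns the ID stored in `file`, or generates one, persists it, and returns it.
    // No descriptor or mapping outlives this call.
    public static UUID readOrCreate(Path file) throws IOException {
        if (Files.exists(file)) {
            byte[] bytes = Files.readAllBytes(file);
            if (bytes.length != 16) {
                throw new IOException("unexpected data id file size: " + bytes.length);
            }
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            long hi = buf.getLong();
            long lo = buf.getLong();
            return new UUID(hi, lo);
        }
        UUID id = UUID.randomUUID();
        byte[] bytes = ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
        // CREATE_NEW fails if another process created the file in the meantime,
        // so an existing ID is never silently overwritten.
        Files.write(file, bytes, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        return id;
    }
}
```

Calling `readOrCreate` twice on the same path should return the same UUID, which matches the "generated once and never updated" intent. The exists-then-create sequence is not atomic, so a real implementation would need to decide how to handle two processes racing on first start.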

You can also utilize MemoryCMARWImpl directly - it encapsulates all the file and mmap logic internally. All you need to do is tune the sizes appropriately. For your case, setting extendSegmentSize = 32 and size = -1 will allocate a 32-byte file.

Also, I don't think we need to modify SqlExecutionContext - we already have access to getCairoEngine(), which should be sufficient.

Let me know what you think!

Contributor

@amunra amunra left a comment

Minor code cleanup improvements needed.

@glasstiger
Contributor

[PR Coverage check]

😍 pass : 34 / 37 (91.89%)

file detail

| path | covered line | new line | coverage |
|---|---|---|---|
| 🔵 io/questdb/cairo/DataID.java | 27 | 30 | 90.00% |
| 🔵 io/questdb/cairo/CairoEngine.java | 2 | 2 | 100.00% |
| 🔵 io/questdb/griffin/engine/functions/catalogue/CurrentDataIdFunctionFactory.java | 5 | 5 | 100.00% |

@bluestreak01 bluestreak01 merged commit fe7c916 into master Jul 16, 2025
34 checks passed
@bluestreak01 bluestreak01 deleted the raph_data_id branch July 16, 2025 16:36