feat(core): add data_id with current_data_id() SQL function by RaphDal · Pull Request #5917 · questdb/questdb

RaphDal · 2025-07-08T16:24:05Z

This PR introduces a data identifier to distinguish separate clusters and avoid conflicts on replication and restore.

Currently, users can query it through the current_data_id() SQL function. The long-term goal is to store this data ID in replication and backup data so that, on restore, it can be checked against the replication and restore configuration and the operation aborted if there is a conflict.

From an implementation standpoint, the ID is a 128-bit UUID stored in the database root directory in a file named .data_id.
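To make the storage format concrete, here is a minimal sketch of how a 128-bit UUID could be laid out as 16 raw bytes and read back. The class name DataIdCodec and the big-endian byte order are assumptions made for illustration; the PR does not state the on-disk byte order, and this is not the code added by the PR.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper, not part of QuestDB: lays out a 128-bit UUID as
// 16 raw bytes (two 64-bit halves, big-endian by ByteBuffer default).
final class DataIdCodec {
    static final int ID_BYTES = 16; // 128 bits

    private DataIdCodec() {
    }

    // Serialize the most significant half first, then the least significant.
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(ID_BYTES)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Rebuild the UUID, rejecting input that is not exactly 16 bytes long.
    static UUID fromBytes(byte[] raw) {
        if (raw.length != ID_BYTES) {
            throw new IllegalArgumentException(
                    "expected " + ID_BYTES + " bytes, got " + raw.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw);
        long hi = buf.getLong();
        long lo = buf.getLong();
        return new UUID(hi, lo);
    }
}
```

The length check matters because a truncated file would otherwise decode into a different, seemingly valid ID.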

eugenels

Hey Raphael, thanks for the PR!

I think we can simplify the design here. Ideally, this could just be a static utility function: take a file path as input, return an existing or new ID as output. Since the ID is meant to be generated once and never updated, we don’t need to keep the file descriptor or memory map around for the lifetime of CairoEngine.

You can also utilize MemoryCMARWImpl directly - it encapsulates all the file and mmap logic internally. All you need to do is tune the sizes appropriately. For your case, setting extendSegmentSize = 32 and size = -1 will allocate a 32-byte file.

Also, I don't think we need to modify SqlExecutionContext - we already have access to getCairoEngine(), which should be sufficient.

Let me know what you think!
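The shape of the suggestion above (a static utility that takes a path and returns an existing or new ID, without holding a file descriptor or memory map open) could be sketched roughly as follows. This sketch uses plain java.nio.file rather than MemoryCMARWImpl, and the class name DataIdFiles, the method name readOrCreate, and the 16-byte file size are assumptions for illustration only; the review comment itself mentions a 32-byte allocation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Hypothetical utility, not QuestDB code: returns the ID stored under the
// database root, creating it on first use. No file handle outlives the call.
final class DataIdFiles {
    static final String FILE_NAME = ".data_id"; // file name from the PR description
    static final int ID_BYTES = 16;             // assumed size: one 128-bit UUID

    private DataIdFiles() {
    }

    static UUID readOrCreate(Path dbRoot) throws IOException {
        Path file = dbRoot.resolve(FILE_NAME);
        try {
            byte[] raw = Files.readAllBytes(file);
            if (raw.length != ID_BYTES) {
                throw new IOException("unexpected " + FILE_NAME + " size: " + raw.length);
            }
            ByteBuffer buf = ByteBuffer.wrap(raw);
            long hi = buf.getLong();
            long lo = buf.getLong();
            return new UUID(hi, lo);
        } catch (NoSuchFileException e) {
            // First start: generate the ID once and persist it.
            UUID id = UUID.randomUUID();
            byte[] raw = ByteBuffer.allocate(ID_BYTES)
                    .putLong(id.getMostSignificantBits())
                    .putLong(id.getLeastSignificantBits())
                    .array();
            // CREATE_NEW fails if another writer created the file in the
            // meantime, so an existing ID is never overwritten.
            Files.write(file, raw, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return id;
        }
    }
}
```

Calling readOrCreate twice on the same directory returns the same UUID, which is the "generated once and never updated" property the comment relies on.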

core/src/main/java/io/questdb/cairo/DataIDUtils.java

core/src/main/java/io/questdb/cairo/CairoEngine.java

core/src/main/java/io/questdb/cairo/DataIDUtils.java

core/src/main/java/io/questdb/cairo/DataID.java

...ava/io/questdb/test/griffin/engine/functions/catalogue/CurrentDataIDFunctionFactoryTest.java

amunra

Minor code cleanup improvements needed.

core/src/main/java/io/questdb/cairo/DataID.java

glasstiger · 2025-07-16T14:00:32Z

[PR Coverage check]

😍 pass : 34 / 37 (91.89%)

file detail

| | path | covered line | new line | coverage |
|---|---|---|---|---|
| 🔵 | io/questdb/cairo/DataID.java | 27 | 30 | 90.00% |
| 🔵 | io/questdb/cairo/CairoEngine.java | 2 | 2 | 100.00% |
| 🔵 | io/questdb/griffin/engine/functions/catalogue/CurrentDataIdFunctionFactory.java | 5 | 5 | 100.00% |

feat(core): implement DataID management and expose current data ID functionality through SQL.

e1f95d9

RaphDal changed the title ~~Add database instance identification with current_data_id() SQL function~~ feat(core): add data_id with current_data_id() SQL function Jul 8, 2025

RaphDal and others added 6 commits July 9, 2025 09:14

chore(core): refactor DataID management to use newDataID method and improve open logic

be978c5

fix(tests): update flaky tests thresholds

8eb1866

fix(core): fix CurrentDataIdFunctionFactory to use last current data id

52f09f5

trigger ci

52b5572

fix(tests): fixed SampleByTest to take count for data_id mmap

bd69458

fix(tests): added dataId close and open to tearDown/setUp in tests

bc6f641

eugenels suggested changes Jul 10, 2025

RaphDal added 10 commits July 10, 2025 17:46

chore(core): make simpler the DataID system

e413ec2

Merge branch 'raph_data_id' of github.com:questdb/questdb into raph_data_id

b21a0ed

fix(tests): remove unnecessary DataID open/close calls in tearDown methods

47e7897

fix(core): rename toAddr method to toAddress for consistency

79c9773

fix(core): reorder FILE_SIZE declaration for consistency in DataIDUtils

6690335

Merge branch 'master' into raph_data_id

4c854e9

fix(core): remove unused Long256 imports from SqlExecutionContext classes

8eb0fbb

fix(core): remove unused Path variable and streamline file opening in DataIDUtils

185c8cb

fix(core): format Path instantiation in openDataIDFile method for consistency

5ca8fb9

fix(core): refactor DataID handling to initialize with random value if null

62fd890