Skip to content

feat(core): cache file descriptors and mmaps to reduce kernel resource usage#5960

Merged
bluestreak01 merged 59 commits intomasterfrom
feat-cache-fs-calls
Aug 6, 2025
Merged

feat(core): cache file descriptors and mmaps to reduce kernel resource usage#5960
bluestreak01 merged 59 commits intomasterfrom
feat-cache-fs-calls

Conversation

@ideoma
Copy link
Copy Markdown
Collaborator

@ideoma ideoma commented Jul 18, 2025

Pull Request Overview

This PR implements file system call caching to improve performance by caching file descriptor and memory map operations. The changes introduce caching layers for both file descriptors and memory maps, reducing redundant system calls and improving resource reuse.

  • Introduces FdCache and MmapCache classes to cache file descriptors and memory maps
  • Changes openRW parameter type from long to int throughout the codebase
  • Adds new openRWNoCache and openRODir methods to FilesFacade interface
  • Updates native file operations to use overlapped I/O on Windows

This PR leaves further optimisations to reduce locking on the new FdCache and MampCache classes for the future.

@ideoma ideoma added the DO NOT MERGE These changes should not be merged to main branch label Jul 18, 2025
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Jul 18, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ ideoma
✅ bluestreak01
❌ GitHub Actions - Rebuild Native Libraries


GitHub Actions - Rebuild Native Libraries seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@ideoma ideoma closed this Jul 18, 2025
@ideoma ideoma reopened this Jul 18, 2025
@ideoma ideoma closed this Jul 18, 2025
@ideoma ideoma reopened this Jul 21, 2025
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🔭 Outside diff range comments (1)
core/src/main/java/io/questdb/cairo/wal/WalWriter.java (1)

1465-1473: dirFd can leak on exceptions – wrap it in its own try / finally

dirFd is opened (read-only, no-cache) and closed only when control reaches the bottom of the try block (Line 1502).
If any exception is thrown between the open (Line 1471) and the close, the surrounding try / finally closes the old segment lock but does not close dirFd, leaking a native FD and defeating the point of the new cache-avoidant open.

Proposed fix – guard dirFd with a dedicated try / finally:

-            final long dirFd;
+            long dirFd = -1;
             final int commitMode = configuration.getCommitMode();
             if (Os.isWindows() || commitMode == CommitMode.NOSYNC) {
-                dirFd = -1;
+                dirFd = -1;
             } else {
-                dirFd = TableUtils.openRONoCache(ff, path.$(), LOG);
+                dirFd = TableUtils.openRONoCache(ff, path.$(), LOG);
             }
 
-            for (int i = 0; i < columnCount; i++) {
+            try {
+                for (int i = 0; i < columnCount; i++) {
     ...
-            if (dirFd != -1) {
-                ff.fsyncAndClose(dirFd);
-            }
+            } finally {
+                if (dirFd != -1) {
+                    ff.fsyncAndClose(dirFd);
+                }
+            }

This guarantees the descriptor is always closed, even if openColumnFiles, events.openEventFile, etc. throw.
Without this, a burst of failures during WAL creation could exhaust the process’ FD limit.

♻️ Duplicate comments (9)
core/src/test/java/io/questdb/test/cairo/FullFwdPartitionFrameCursorTest.java (2)

1436-1442: Same concern as above about opts truncation – see earlier comment.


1622-1628: Same concern as above about opts truncation – see earlier comment.

core/src/test/java/io/questdb/test/cairo/TableWriterTest.java (1)

363-368: Same pattern – OK
Another test-facade override correctly switched to int opts. No further issues.

core/src/test/java/io/questdb/test/cairo/o3/O3FailureTest.java (1)

106-112: Same concern as above – each anonymous FilesFacade subclass now overrides openRW(LPSZ, int). Ensure the backing implementation and flag widths are migrated consistently across the codebase.

Also applies to: 125-131, 179-186, 233-240, 281-287, 309-315, 348-354, 383-389, 509-515, 527-533, 543-550, 747-755, 788-794, 875-882, 892-899, 908-914, 1078-1084, 1144-1150, 3767-3774, 3791-3798

core/src/test/java/io/questdb/test/cutlass/line/tcp/LineTcpConnectionContextTest.java (1)

533-533: Same duplication observation as above – see previous comment for details.

core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java (2)

9422-9422: Duplicate: signature change looks correct
Same rationale as previous hunk; the method override is now consistent with the FilesFacade API.


9442-9442: Duplicate: signature change looks correct
Consistent with prior changes.

core/src/main/java/io/questdb/std/FdCache.java (1)

231-231: Fix assertion to match the intended check

core/src/main/java/io/questdb/cairo/vm/MemoryCMORImpl.java (1)

119-131: Same unused opts argument

See previous note.

🧹 Nitpick comments (19)
core/src/test/java/io/questdb/test/FilesTest.java (1)

270-277: Fix typo in test method name.

The method name should be testDetach instead of testDeatch. The test logic correctly exercises the new file descriptor caching functionality.

Apply this diff to fix the typo:

-    public void testDeatch() throws Exception {
+    public void testDetach() throws Exception {
core/src/test/java/io/questdb/test/cairo/FullFwdPartitionFrameCursorTest.java (1)

1288-1294: Validate that 64-bit flag information is not lost after opts narrowing

openRW now receives int opts; any callers that previously relied on 64-bit bitmasks (e.g. platform-specific O_ flags above 0x7FFF_FFFF) would silently lose the high-order bits. Please double-check that all test and production invocations now pass flag values that fit into 32 bits or explicitly document the constraint.

This is especially easy to miss because the body continues to forward opts unchanged to super.openRW, so any truncation would only be caught at runtime.

core/src/main/c/share/net.c (2)

244-294: Well-implemented asynchronous connect handling for EINTR.

The implementation correctly handles interrupted connect() calls by using select() to wait for the connection to complete asynchronously. The logic properly checks both the writable and exception fd_sets, and uses getsockopt() with SO_ERROR to determine the final connection status.

Consider making the timeout configurable rather than hardcoded to 30 seconds, as different applications may have different timeout requirements.


409-461: Code duplication detected - consider extracting common logic.

This implementation is identical to the logic in Java_io_questdb_network_Net_connect (lines 246-293). The asynchronous connect handling is correct, but the duplicated code could be refactored into a shared helper function.

Consider extracting the EINTR handling logic into a helper function like handle_interrupted_connect(int fd) to eliminate code duplication and improve maintainability.

core/src/test/java/io/questdb/test/cutlass/line/tcp/LineTcpConnectionContextTest.java (1)

495-495: Repeated override logic – consider a small helper to avoid duplication

All three anonymous TestFilesFacadeImpl instances now contain identical openRW overrides that return -1 for certain suffixes. Extracting a tiny reusable subclass (or a lambda‐based filter) would remove duplication and make future signature tweaks less error-prone.

core/src/test/java/io/questdb/test/griffin/engine/functions/geohash/GeoHashQueryTest.java (2)

46-46: Consider using a more explicit probability constant.

The magic number rnd.nextInt(5) > 3 results in a 20% execution rate. Consider defining a constant for better readability and maintainability.

+    private static final int SKIP_PROBABILITY = 5; // 1 in 5 chance (20%) of execution
+    
     // In the test method:
-    if (rnd.nextInt(5) > 3) {
+    if (rnd.nextInt(SKIP_PROBABILITY) == 0) {

64-64: Consider extracting the probability logic into a helper method.

Since the same probabilistic skipping logic is used in multiple places, consider extracting it into a helper method to reduce code duplication and improve maintainability.

+    private boolean shouldSkipIteration(Rnd rnd) {
+        return rnd.nextInt(5) <= 3; // Skip 80% of iterations
+    }
+    
     // In both test methods:
-    if (rnd.nextInt(5) > 3) {
+    if (shouldSkipIteration(rnd)) {
core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java (2)

9399-9401: Latch wait lacks timeout – risk of hanging tests
queryStartedCountDown.await() can block forever if the countdown is never triggered, making the test suite hang on CI. Use a timed await(timeout, TimeUnit) and fail fast instead.


9443-9445: Hard-coded Os.sleep(50) is brittle
Fixed sleeps introduce flakiness—tests may intermittently fail or waste 50 ms per run. Prefer synchronisation primitives (e.g., a latch or barrier) to make the delay deterministic and minimal.

core/src/test/java/io/questdb/test/cutlass/text/ParallelCsvFileImporterTest.java (1)

747-753: Second override aligned; consider minor test-only optimisation
The duplicate override for a different failure scenario is also correctly updated.
If the test starts to run noticeably slower, consider caching stackContains("PhaseUpdateSymbolKeys") result or using a lighter detection technique – each call builds a full stack trace. For the current small test scope this is non-blocking.

core/src/main/java/io/questdb/std/FilesFacadeImpl.java (1)

382-385: Placeholder implementation - consider adding TODO comment.

The openRWNoCache method currently delegates to the regular openRW method, which means it's not actually bypassing the cache. This appears to be a placeholder implementation.

Consider adding a TODO comment to make the placeholder nature explicit:

 @Override
 public long openRWNoCache(LPSZ name, int opts) {
+    // TODO: Implement actual no-cache file opening
     return openRW(name, opts);
 }
core/src/main/java/io/questdb/std/MmapCache.java (1)

93-96: Incomplete offset handling in mremap

The TODO comment indicates that non-zero offsets are not handled. Currently, the method falls back to non-cached behavior for any non-zero offset, which may impact performance for offset-based remapping scenarios.

Would you like me to help implement proper offset handling for the cached mremap path?

core/src/test/java/io/questdb/test/std/FilesCacheFuzzTest.java (1)

537-577: Consider using a memory management wrapper

While the current try-finally approach works, consider creating a wrapper for safer memory management.

Example approach:

try (NativeBuffer buff = new NativeBuffer(8, MemoryTag.NATIVE_DEFAULT)) {
    // Use buff.address() to get the pointer
    Unsafe.getUnsafe().putLong(buff.address(), op + 1);
    // ... rest of the code
} // Auto-cleanup on close
core/src/main/java/io/questdb/griffin/UpdateOperatorImpl.java (1)

690-699: Consider differentiating read-only vs writer open flags

fileOpenOpts comes from getWriterFileOpenOpts() yet is also passed when forWrite == false, i.e. when opening read-only MemoryCMR views. Passing writer-oriented flags (e.g. O_RDWR) to read-only maps is harmless on Linux but unnecessary and could mask configuration errors.

Suggestion: introduce a dedicated readerFileOpenOpts (or reuse ff.getReadOnlyOpts()) for the read paths to make intent explicit and reduce permission surface.

Also applies to: 713-714

core/src/main/java/io/questdb/cairo/vm/NullMemoryCMR.java (1)

230-232: Unused parameter can be removed or annotated

opts is accepted but never used inside this of() stub.
Consider removing it or adding a @SuppressWarnings("unused") to keep spotless builds quiet.

core/src/main/java/io/questdb/cairo/vm/MemoryCMRImpl.java (1)

110-128: opts (and extendSegmentSize) are ignored

of() takes opts and extendSegmentSize but neither is referenced.
If the extra flexibility is not required for read-only mappings, consider:

-    public void of(FilesFacade ff, LPSZ name, long extendSegmentSize, long size,
-                   int memoryTag, int opts, int madviseOpts) {
+    public void of(FilesFacade ff, LPSZ name, long extendSegmentSize, long size,
+                   int memoryTag, @SuppressWarnings("unused") int opts, int madviseOpts) {

or dropping the parameters in this subclass to avoid confusion.

core/src/test/java/io/questdb/test/griffin/ColumnPurgeJobTest.java (1)

352-358: Add @Override for consistency and safety

allocate() above is annotated yet openRW() is not.
Adding @Override catches signature drifts at compile-time and keeps style uniform.

core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java (1)

337-340: Expose cache flag via configuration, not a hard-coded constant

Returning a hard-coded true prevents disabling the descriptor cache from tests or deployment configs. Consider reading from a system property / constructor arg so behaviour can be toggled without code changes.

core/src/main/java/io/questdb/cairo/vm/MemoryCMORImpl.java (1)

114-116: opts parameter now unused

opts is accepted but never referenced. Either pass it down (e.g. to openFile) or remove it from the signature to avoid dead parameters and future confusion.

@ideoma ideoma moved this to Imminent Release in QuestDB Public Roadmap (Legacy) Aug 5, 2025
@ideoma ideoma added Core Related to storage, data type, etc. storage and removed Core Related to storage, data type, etc. labels Aug 5, 2025
@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 309 / 323 (95.67%)

file detail

path covered line new line coverage
🔵 io/questdb/KqueueFileWatcher.java 0 1 00.00%
🔵 io/questdb/cairo/TableUtils.java 6 10 60.00%
🔵 io/questdb/std/FdCache.java 115 120 95.83%
🔵 io/questdb/std/MmapCache.java 137 141 97.16%
🔵 io/questdb/Bootstrap.java 1 1 100.00%
🔵 io/questdb/PropServerConfiguration.java 7 7 100.00%
🔵 io/questdb/cairo/wal/WalWriter.java 1 1 100.00%
🔵 io/questdb/cairo/SymbolColumnIndexer.java 1 1 100.00%
🔵 io/questdb/cairo/DefaultCairoConfiguration.java 1 1 100.00%
🔵 io/questdb/cairo/O3PartitionJob.java 1 1 100.00%
🔵 io/questdb/std/FilesFacadeImpl.java 2 2 100.00%
🔵 io/questdb/std/Files.java 33 33 100.00%
🔵 io/questdb/PropertyKey.java 2 2 100.00%
🔵 io/questdb/cairo/vm/MemoryCMARWImpl.java 1 1 100.00%
🔵 io/questdb/cairo/CairoConfigurationWrapper.java 1 1 100.00%

@bluestreak01 bluestreak01 merged commit 696b083 into master Aug 6, 2025
36 of 37 checks passed
@bluestreak01 bluestreak01 deleted the feat-cache-fs-calls branch August 6, 2025 18:51
@tris0laris tris0laris moved this from Imminent Release to Shipped in QuestDB Public Roadmap (Legacy) Aug 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Development

Successfully merging this pull request may close these issues.

6 participants