chore(core): add resource pool allocation tracing to detect leaks origin#6234
chore(core): add resource pool allocation tracing to detect leaks origin#6234bluestreak01 merged 9 commits intomasterfrom
Conversation
WalkthroughIntroduces a configurable flag to enable resource pool tracing, wires it through configuration classes, conditionally activates a tracing supervisor in the multi-tenant pool, and augments shutdown logging to include per-resource trace info when enabled. Adds a tracing supervisor implementation and updates/relocates tests and properties expectations accordingly. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Pool as AbstractMultiTenantPool
participant TL as ThreadLocal<Supervisor>
participant Sup as ResourcePoolSupervisor (Tracing or None)
participant Res as Resource
rect rgb(245,248,252)
note over Pool: Initialization
Client->>Pool: construct(config)
alt tracing enabled
Pool->>TL: new ThreadLocal(TracingResourcePoolSupervisor::new)
else tracing disabled
Pool->>TL: new ThreadLocal()
end
end
rect rgb(240,255,240)
note over Client,Res: Borrow flow
Client->>Pool: borrow()
Pool->>TL: get()
TL-->>Pool: Sup (maybe null)
opt tracing active
Pool->>Sup: onResourceBorrowed(Res)
end
Pool-->>Client: Res
end
rect rgb(255,248,240)
note over Client,Res: Return flow
Client->>Pool: return(Res)
Pool->>TL: get()
TL-->>Pool: Sup (maybe null)
opt tracing active
Pool->>Sup: onResourceReturned(Res)
end
end
sequenceDiagram
autonumber
participant Pool as AbstractMultiTenantPool
participant TL as ThreadLocal<Supervisor>
participant Sup as TracingResourcePoolSupervisor
participant Log as LogRecorder
participant Res as Resource (left behind)
note over Pool: releaseAll(dealine=MAX_VALUE)
Pool->>Log: xInfo().$("table is left behind...")
Pool->>TL: get()
TL-->>Pool: Sup or null
alt Sup != null
Pool->>Sup: printResourceInfo(Log, Res)
else Sup == null
note over Pool: No extra trace info
end
Pool->>Log: $()
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
core/src/main/java/io/questdb/cairo/pool/AbstractMultiTenantPool.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/questdb/cairo/pool/TracingResourcePoolSupervisor.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/questdb/cairo/pool/AbstractMultiTenantPool.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/questdb/cairo/pool/TracingResourcePoolSupervisor.java
Show resolved
Hide resolved
core/src/main/java/io/questdb/cairo/pool/TracingResourcePoolSupervisor.java
Show resolved
Hide resolved
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
core/src/main/java/io/questdb/cairo/pool/TracingResourcePoolSupervisor.java (3)
6-10: Consider clarifying the per-thread requirement.The documentation states "each thread should have its own instance" but doesn't explain the consequences if this contract is violated. Consider adding a note that sharing an instance across threads will result in incorrect thread name attribution and potential race conditions in the resources map.
33-33: String generation via toString() remains unaddressed.The previous review comment suggested using
toSink()or similar to avoid string allocation when printing resource info. The current implementation still callsresource.toString(), which generates a string and adds GC pressure.While this is a debugging feature where some overhead is acceptable, the suggestion to use
toSink()would reduce allocations, especially if multiple resources are leaked and printed during shutdown.Based on past review comments.
34-36: Enhanced for-loop and stack trace toString() remain unaddressed.Two performance suggestions from previous review remain unaddressed:
- Line 34: Using an indexed loop instead of enhanced for-loop would avoid iterator allocation.
- Line 35:
traceElement.toString()generates strings for each stack frame.While these are minor optimizations for a debugging feature, they would reduce allocation overhead when printing multiple leaked resources with deep stack traces.
Apply this diff to use an indexed loop:
- for (StackTraceElement traceElement : stackTrace) { - sink.put("\n\tat ").put(traceElement.toString()); + for (int i = 0, n = stackTrace.length; i < n; i++) { + sink.put("\n\tat ").put(stackTrace[i].toString()); }Based on past review comments.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
core/src/main/java/io/questdb/PropServerConfiguration.java(3 hunks)core/src/main/java/io/questdb/PropertyKey.java(1 hunks)core/src/main/java/io/questdb/cairo/CairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java(1 hunks)core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/cairo/pool/AbstractMultiTenantPool.java(3 hunks)core/src/main/java/io/questdb/cairo/pool/PoolTenant.java(1 hunks)core/src/main/java/io/questdb/cairo/pool/TracingResourcePoolSupervisor.java(1 hunks)core/src/test/java/io/questdb/test/ServerMainTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/CairoTestConfiguration.java(1 hunks)core/src/test/java/io/questdb/test/cairo/pool/ReaderLeftBehindTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/pool/ReaderPoolTest.java(0 hunks)
💤 Files with no reviewable changes (1)
- core/src/test/java/io/questdb/test/cairo/pool/ReaderPoolTest.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests Start ARM Agent)
- GitHub Check: New pull request (SelfHosted Other tests Start X64Zfs Agent)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Hosted Running tests on windows-other)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-cairo)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-fuzz)
- GitHub Check: New pull request (Hosted Running tests with cover on linux-griffin)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start ARM Agent)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests Start X64Zfs Agent)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (7)
core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java (1)
1348-1351: LGTM! Configuration default correctly disables expensive tracing in production.The default
falsevalue appropriately disables resource pool tracing in production environments, aligning with the PR's stated intent that this debugging feature should only be enabled during development/testing due to its performance overhead.core/src/main/java/io/questdb/cairo/CairoConfiguration.java (1)
746-747: LGTM! Interface method properly defines the configuration flag.The method declaration is clean and follows the existing patterns in the interface. Placement after
isReadOnlyInstance()is consistent with the implementations in DefaultCairoConfiguration and other configuration classes.core/src/main/java/io/questdb/cairo/pool/PoolTenant.java (1)
49-58: LGTM! Well-designed optional accessor with proper nullability contract.The default implementation returning
nullis appropriate for optional supervisor functionality, and the@Nullableannotation correctly documents the contract. The Javadoc clearly explains when the supervisor is present versus absent.core/src/main/java/io/questdb/PropertyKey.java (1)
585-586: LGTM! Property key follows naming conventions.The enum constant is properly defined and follows the established naming convention for Cairo configuration properties. The placement with other Cairo properties is logical and the property path string is consistent with the codebase patterns.
core/src/main/java/io/questdb/PropServerConfiguration.java (2)
265-266: Resource pool tracing flag wiring looks solid.Final field slots the new property cleanly into the configuration state.
4105-4108: Cairo configuration accessor matches the new contract.Delegating straight to the captured flag ensures consumers see the configured value.
core/src/main/java/io/questdb/cairo/pool/TracingResourcePoolSupervisor.java (1)
19-21: Thread name captured on first borrow only.As noted in the previous review comment, the thread name is captured lazily on first borrow and reused for all subsequent resources. This is correct for the intended thread-local usage pattern, but if the supervisor is accidentally shared across threads, only the first thread's name will be recorded.
Since the class documentation already states "each thread should have its own instance," and the integration uses ThreadLocal in AbstractMultiTenantPool, the current implementation is appropriate for the intended use case.
Based on past review comments.
[PR Coverage check]😍 pass : 30 / 39 (76.92%) file detail
|
This PR adds optional tracing functionality to help developers diagnose resource pool leaks by tracking where table readers are allocated.
Example Output
When the
CairoEngineis closed, resource pools are released. If one of those resource pool still has allocated resources, we log it and later throw an error:Usage
Enable tracing with this configuration option:
By default, this option is only enabled in tests.
How it works
When enabled, the
AbstractMultiTenantPoollazily creates aTracingResourcePoolSupervisorinstance for each thread that retrieves aTableReader.This
TracingResourcePoolSupervisorinstance captures and stores a full stack trace on every resource borrow until the resource is returned back to the pool.This feature is expensive and intended for development/debugging only. Capturing a full stack trace has significant performance overhead and allocates objects, putting additional pressure on the Garbage collector.