fix(sql): fix wrong result returned from index lookup query#6050
fix(sql): fix wrong result returned from index lookup query#6050bluestreak01 merged 29 commits intomasterfrom
Conversation
This reverts commit 103f28b.
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughIntroduces partition transaction (partitionTxn) propagation across bitmap index reader APIs and implementations, adds getters for column/partition txn, updates constructors and of(...) signatures, adjusts TableReader to pass/use partitionTxn, refactors index-reader caching/reload/close behavior, updates SymbolMapReaderImpl calls, and aligns tests with new signatures. Changes
Sequence Diagram(s)sequenceDiagram
participant TR as TableReader
participant Cache as ReaderCache
participant Path as PathGen
participant BIR as BitmapIndexReader
TR->>Cache: getBitmapIndexReaderIfExists(partitionIdx,colIdx,dir)
alt reader exists and isOpen
TR->>BIR: of(config, path, colName, colTxn, partitionTxn, columnTop)
else reader exists but closed
TR->>Path: pathGenNativePartition(partitionTxn)
TR->>BIR: of(config, path, colName, colTxn, partitionTxn, columnTop)
else no reader
TR->>Path: pathGenNativePartition(partitionTxn)
TR->>TR: createBitmapIndexReaderAt(..., partitionTxn)
TR->>Cache: cache(reader)
end
sequenceDiagram
participant TR as TableReader
participant OPI as openPartitionInfo
participant BIR as BitmapIndexReader
TR->>OPI: read partitionTxn for partition
TR->>BIR: of(config, path, colName, colTxn, partitionTxn, columnTop)
Note right of BIR: Reader reinitialized with txn context
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Actionable comments posted: 3
🔭 Outside diff range comments (2)
core/src/main/java/io/questdb/cairo/TableReader.java (1)
1511-1514: Pass partitionTxn in reloadColumnAt when reinitializing existing index readersWhen an indexReader already exists for a column, reloadColumnAt reinitializes it without partitionTxn (uses the 5-arg default of(...), which sets partitionTxn = -1). For consistency and to support reloadConditionally decisions keyed on partitionTxn, pass the actual partition name txn.
Proposed change (outside this diff hunk; illustrative snippet):
// inside reloadColumnAt(...) if (metadata.isColumnIndexed(columnIndex)) { BitmapIndexReader indexReader = indexReaders.getQuick(primaryIndex); if (indexReader != null) { final int partOffset = partitionIndex * PARTITIONS_SLOT_SIZE; final long partitionNameTxn = openPartitionInfo.getQuick(partOffset + PARTITIONS_SLOT_OFFSET_NAME_TXN); indexReader.of(configuration, path.trimTo(plen), name, columnTxn, partitionNameTxn, columnTop); } }This keeps reader.getPartitionTxn() accurate and avoids ambiguity in reloadConditionally().
core/src/main/java/io/questdb/cairo/AbstractIndexReader.java (1)
107-156: Update call sites to new 6-arg AbstractIndexReader.of signatureSeveral call sites still invoke
indexReader.of(…)with only five arguments. After changing theof(...)method to require an explicitpartitionTxnbeforecolumnTop, these must be updated.Please update the following locations to pass the missing
partitionTxn(and retain the correctcolumnTop) in their calls:
core/src/main/java/io/questdb/cairo/SymbolMapReaderImpl.java
• Line 187
• Line 232core/src/main/java/io/questdb/cairo/TableReader.java
• Line 1513Example diff:
- indexReader.of(config, path.trimTo(plen), columnName, columnNameTxn, columnTop); + indexReader.of(config, path.trimTo(plen), columnName, columnNameTxn, partitionTxn, columnTop);Replace
partitionTxnwith the appropriate transaction value at each site.
🧹 Nitpick comments (12)
core/src/main/java/io/questdb/cairo/BitmapIndexBwdReader.java (1)
57-63: Guard against potential overflow in maxValue + 1 when computing nullsMinor: if maxValue can be Long.MAX_VALUE, maxValue + 1 overflows. Consider a safe add.
Example adjustment:
long hi = maxValue == Long.MAX_VALUE ? Long.MAX_VALUE : maxValue + 1; nullCursor.nullCount = Math.min(columnTop, hi);core/src/main/java/io/questdb/cairo/BitmapIndexFwdReader.java (1)
59-65: Same overflow nit for null prefix countMinor: avoid potential overflow in maxValue + 1 when computing nullCount.
Example:
long hi = maxValue == Long.MAX_VALUE ? Long.MAX_VALUE : maxValue + 1; nullCursor.nullCount = Math.min(columnTop, hi);core/src/main/java/io/questdb/cairo/TableReader.java (2)
260-278: Reopen-or-reload flow is correct; minor readability nit on partition indexLogic to reuse an existing reader if present and reopen it with the correct partitionTxn is sound. For readability, pass the known partitionIndex directly instead of deriving it via getPartitionIndex(columnBase).
Apply within these lines:
- reader.of( + reader.of( configuration, - pathGenNativePartition(getPartitionIndex(columnBase), partitionTxn), + pathGenNativePartition(partitionIndex, partitionTxn), metadata.getColumnName(columnIndex), columnNameTxn, partitionTxn, getColumnTop(columnBase, columnIndex) );
284-289: Helper accessor is fine; clarify semantics for callersThis returns the existing reader instance (which may be closed). Consider documenting that call sites should prefer getBitmapIndexReader(...) if they need an open/reloaded reader.
core/src/test/java/io/questdb/test/cairo/BitmapIndexConcurrentTest.java (6)
53-55: Fix misleading comment: only 100 distinct symbols are populated, not 1000.Inner loop inserts SYM1..SYM100 once per partition. Total rows are 1000, distinct symbols are 100.
Apply this diff:
- // Prepopulate with 10 partitions and 1000 distinct symbols + // Prepopulate with 10 partitions and 100 distinct symbols (10 occurrences per symbol initially)
61-67: Clarify population comment to match actual data shape.The table is populated with 100 distinct symbols replicated across 10 partitions (not 1000 distinct symbols).
Apply this diff:
- // Generate 1000 distinct symbols across 10 partitions using TableWriter API + // Generate 100 distinct symbols across 10 partitions (each symbol appears in every partition)
82-83: Shorten test duration or make it configurable to avoid slowing CI.20 seconds per run is heavy for a unit/integration test and may impact CI stability.
Apply this diff to make it configurable with a sensible default:
- final long testDurationMs = 20_000; // Run for 20 seconds + // Configurable test duration; default to 5s to keep CI fast. + final long testDurationMs = Long.getLong("qdb.test.bitmapIndexConcurrentMs", 5_000L);
129-131: Remove unreachable returns after Assert.fail.Assert.fail throws AssertionError; subsequent returns are unreachable.
Apply this diff:
- Assert.fail(lastError.get()); - return; + Assert.fail(lastError.get());- Assert.fail(lastError.get()); - return; + Assert.fail(lastError.get());- Assert.fail(lastError.get()); - return; + Assert.fail(lastError.get());- Assert.fail(lastError.get()); - return; + Assert.fail(lastError.get());Also applies to: 139-141, 147-149, 162-164
110-118: Optional: reduce SQL parsing/plan overhead by preparing once and binding a symbol.Recompiling the query for each symbol adds overhead. If the test harness supports bind variables, prepare once:
- SQL: SELECT symbol, count(*) FROM trades WHERE symbol = $1 GROUP BY symbol
- Bind for each iteration.
If binding isn’t available in this harness, ignore.
Would you like me to provide a prepared-statement variant that works with AbstractCairoTest? If so, point me to the preferred compilation/binding helper in tests.
89-167: This test isn’t actually concurrent; consider adding a reader thread.All inserts and queries run on the same thread sequentially. If you intend to stress concurrent writer/reader interaction, add:
- One thread continuously inserting/committing.
- Another thread continuously querying and validating.
This will better exercise index reader reloading across partition/column txns.
I can draft a two-threaded version with proper shutdown and exception propagation if you want.
core/src/main/java/io/questdb/cairo/AbstractIndexReader.java (1)
40-40: Use the correct class in the logger to avoid misleading log source.Currently logging is attributed to BitmapIndexBwdReader.
Apply this diff:
- protected static final Log LOG = LogFactory.getLog(BitmapIndexBwdReader.class); + protected static final Log LOG = LogFactory.getLog(AbstractIndexReader.class);core/src/main/java/io/questdb/cairo/BitmapIndexReader.java (1)
49-52: API extension is sound; default overload preserves source compatibility.
- Adding getColumnTxn/getPartitionTxn provides necessary visibility for reader lifecycle decisions.
- The 5-arg default of(...) cleanly forwards to the 6-arg form with partitionTxn = -1.
Consider documenting sentinel semantics in Javadoc:
- partitionTxn = -1 denotes “unknown/not tracked”
- getColumnTxn/getPartitionTxn return values must reflect the last of(...) call (even for null readers).
Also applies to: 76-77, 86-104
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
core/src/main/java/io/questdb/cairo/AbstractIndexReader.java(4 hunks)core/src/main/java/io/questdb/cairo/BitmapIndexBwdNullReader.java(3 hunks)core/src/main/java/io/questdb/cairo/BitmapIndexBwdReader.java(1 hunks)core/src/main/java/io/questdb/cairo/BitmapIndexFwdNullReader.java(3 hunks)core/src/main/java/io/questdb/cairo/BitmapIndexFwdReader.java(1 hunks)core/src/main/java/io/questdb/cairo/BitmapIndexReader.java(3 hunks)core/src/main/java/io/questdb/cairo/ConcurrentBitmapIndexFwdReader.java(1 hunks)core/src/main/java/io/questdb/cairo/TableReader.java(8 hunks)core/src/main/java/io/questdb/cairo/TableWriter.java(1 hunks)core/src/test/java/io/questdb/test/cairo/BitmapIndexConcurrentTest.java(1 hunks)core/src/test/java/io/questdb/test/cairo/BitmapIndexTest.java(0 hunks)core/src/test/java/io/questdb/test/cairo/SymbolMapTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/geohash/GeoHashNativeTest.java(1 hunks)
💤 Files with no reviewable changes (1)
- core/src/test/java/io/questdb/test/cairo/BitmapIndexTest.java
🔇 Additional comments (14)
core/src/test/java/io/questdb/test/cairo/SymbolMapTest.java (1)
84-84: Appropriately commented out bitmap index creation due to test class removal.The commented-out call to
BitmapIndexTest.create(...)aligns with the removal of theBitmapIndexTestclass in this PR. This prevents compilation errors while maintaining the structure of the test helper method.core/src/main/java/io/questdb/cairo/ConcurrentBitmapIndexFwdReader.java (1)
54-56: Parameter renamed from unIndexedNullCount to columnTop with updated method call.The parameter rename from
unIndexedNullCounttocolumnTopaligns with the broader API changes in this PR that introduce partition transaction tracking. The corresponding update to theof(...)method call maintains consistency.core/src/main/java/io/questdb/cairo/BitmapIndexBwdReader.java (1)
43-47: Constructor update correctly propagates partitionTxn and columnTopGood alignment with the new AbstractIndexReader.of(...) signature. Ensures readers carry correct partition context and columnTop for null handling.
core/src/test/java/io/questdb/test/griffin/engine/functions/geohash/GeoHashNativeTest.java (1)
90-90: Test updated to the 6-arg constructor looks correctPassing partitionTxn = -1 and columnTop = 0 is appropriate for this isolated test setup, and keeps behavior equivalent to the legacy path.
core/src/main/java/io/questdb/cairo/BitmapIndexFwdReader.java (1)
47-51: Constructor now passes partition transaction and columnTop — LGTMMatches the new API and keeps null-handling semantics intact.
core/src/main/java/io/questdb/cairo/TableReader.java (5)
711-712: Double-close risk unless reader.close() is idempotentYou free without nulling the slots. Subsequent free paths (e.g., freeBitmapIndexCache or partition closure) might free the same instance again. Ensure BitmapIndexReader.close()/free is idempotent, or null out the slots when permanently discarding.
Would you like me to scan the codebase to confirm BitmapIndexReader implementations’ close() are safe to call multiple times?
726-728: Partition close refactor is cleanclosePartition delegates to closePartitionResources and logs consistently. No issues.
747-760: Gracefully handling unknown partition formatExplicit switch with an “unknown” default is clearer. Looks good.
818-861: createBitmapIndexReaderAt now propagates partitionTxn — good fixReaders are constructed with the correct partition context. This aligns with the PR’s objective to eliminate wrong lookups caused by stale partition state.
1557-1558: Forcing index readers closed during column reload is appropriateEnsures subsequent lookups reopen against the latest columnTop/partitionTxn. Good.
core/src/main/java/io/questdb/cairo/BitmapIndexFwdNullReader.java (3)
40-44: Return a sentinel consistent with “no txn” semanticsCurrently returns 0. If the codebase uses -1 (e.g., COLUMN_NAME_TXN_NONE) to indicate “no txn,” this should match for consistency.
If appropriate, apply:
- public long getColumnTxn() { - return 0; - } + public long getColumnTxn() { + return -1; + }
74-78: Same note for getPartitionTxnAlign with sentinel used elsewhere (likely -1) for “no partition txn.”
If appropriate, apply:
- public long getPartitionTxn() { - return 0; - } + public long getPartitionTxn() { + return -1; + }
100-101: of(...) signature updated — LGTMMatches the interface’s 6-argument method and keeps the no-op semantics for null reader intact.
core/src/main/java/io/questdb/cairo/AbstractIndexReader.java (1)
49-71: New txn getters look good and align with the extended of(...) signature.Column/partition txn exposure is necessary for correct reader lifecycle checks. No issues with the accessors.
Also applies to: 85-89
core/src/test/java/io/questdb/test/cairo/BitmapIndexConcurrentTest.java
Outdated
Show resolved
Hide resolved
…h_index_wrong_res_repro
…repro' into jh_index_wrong_res_repro
…h_index_wrong_res_repro
[PR Coverage check]😍 pass : 56 / 60 (93.33%) file detail
|
Summary by CodeRabbit