feat(sql): speed up ASOF JOIN on indexed symbol column by mtopolnik · Pull Request #6158 · questdb/questdb

mtopolnik · 2025-09-18T10:02:23Z

Leverage the symbol index on the right-hand side of the ASOF JOIN, avoiding the need to visit any rows that don't have the matching symbol.

Where this matters

The major use case for this optimization is markout analysis.

For a given trade, you want to see whether you could have got a better deal if you placed your order a bit sooner or a bit later. In a representative example, you inspect a window of +/- 10 minutes around the trade, amounting to 1200 points in time (one per second).

When dealing with FOREX, you need two numbers for each trade: the USD rate of the currency you're selling, and the USD rate of the currency you're buying. There's a single table that contains all price changes for all currencies.

Therefore this analysis requires 2400 lookups of a matching RHS row (USD rate) for each LHS row (trade event), conditioned on the currency symbol. The worst case is illiquid assets, whose last price was posted far back in the past.

The current code will scan all the rows backwards until it finds a match. The new code makes only one symbol index lookup per partition. So, if the last asset price was posted 30 days ago, that's 30 lookups vs. scanning the full 30 days of currency price updates, for all currencies.
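
To make the difference concrete, here is a minimal toy model of the two strategies. It is a sketch under stated assumptions, not the code in this PR: the `Partition` class, the `lastRowBySymbol` map standing in for a per-partition symbol index, and both method names are hypothetical, and the model assumes the lookup timestamp lies after every RHS row so that "latest row with this symbol" is the answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names exist in QuestDB.
final class AsOfLookupSketch {

    /** One partition: row i holds symbols[i]; the map plays the role of a symbol index. */
    static final class Partition {
        final String[] symbols;
        final Map<String, Integer> lastRowBySymbol = new HashMap<>();

        Partition(String... symbols) {
            this.symbols = symbols;
            for (int i = 0; i < symbols.length; i++) {
                lastRowBySymbol.put(symbols[i], i); // later rows overwrite earlier ones
            }
        }
    }

    /** Backward linear scan. Returns {partition, row, rowsVisited}, or {-1, -1, rowsVisited}. */
    static int[] findLastByScan(List<Partition> partitions, String symbol) {
        int steps = 0;
        for (int p = partitions.size() - 1; p >= 0; p--) {
            String[] rows = partitions.get(p).symbols;
            for (int r = rows.length - 1; r >= 0; r--) {
                steps++;
                if (rows[r].equals(symbol)) {
                    return new int[]{p, r, steps};
                }
            }
        }
        return new int[]{-1, -1, steps};
    }

    /** One index lookup per partition. Returns {partition, row, lookups}, or {-1, -1, lookups}. */
    static int[] findLastByIndex(List<Partition> partitions, String symbol) {
        int steps = 0;
        for (int p = partitions.size() - 1; p >= 0; p--) {
            steps++;
            Integer row = partitions.get(p).lastRowBySymbol.get(symbol);
            if (row != null) {
                return new int[]{p, row, steps};
            }
        }
        return new int[]{-1, -1, steps};
    }

    public static void main(String[] args) {
        List<Partition> partitions = new ArrayList<>();
        partitions.add(new Partition("AUD", "EUR")); // AUD appears only in the oldest partition
        for (int day = 1; day < 30; day++) {
            partitions.add(new Partition("EUR", "GBP", "JPY"));
        }
        System.out.println("rows visited by scan: " + findLastByScan(partitions, "AUD")[2]);
        System.out.println("index lookups:        " + findLastByIndex(partitions, "AUD")[2]);
    }
}
```

With 30 tiny partitions the scan visits 89 rows while the indexed variant performs 30 lookups. The scan cost grows with the number of rows per partition; the lookup count depends only on the number of partitions.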

Benchmark

I did a quick ad-hoc benchmark set up as follows.

Price table:

CREATE TABLE prices (
    ts TIMESTAMP,
    sym SYMBOL,
    price DOUBLE
) timestamp(ts) PARTITION BY DAY;

Five currency symbols: 'EUR', 'GBP', 'JPY', 'CAD', 'AUD'
320 million rows, one per second -> 3700 partitions
AUD appears only in the very first partition

Trade event table:

CREATE TABLE orders (
    order_id LONG,
    order_completed_ts TIMESTAMP,
    sym1 SYMBOL,
    sym2 SYMBOL,
    quantity DOUBLE
) timestamp(order_completed_ts) PARTITION BY DAY;

INSERT INTO orders VALUES
  (1001, '2025-09-09T09:00:00.000000Z', 'EUR', 'GBP', 50000.0),
  (1002, '2025-09-09T09:00:03.000000Z', 'GBP', 'JPY', 25000.0),
  (1003, '2025-09-09T09:00:07.000000Z', 'CAD', 'AUD', 75000.0),
  (1004, '2025-09-09T09:00:12.000000Z', 'JPY', 'EUR', 12500000.0),
  (1005, '2025-09-09T09:00:15.000000Z', 'EUR', 'CAD', 80000.0),
  (1006, '2025-09-09T09:00:21.000000Z', 'EUR', 'JPY', 45000.0),
  (1007, '2025-09-09T09:00:26.000000Z', 'GBP', 'CAD', 30000.0),
  (1008, '2025-09-09T09:00:32.000000Z', 'AUD', 'EUR', 60000.0),
  (1009, '2025-09-09T09:00:38.000000Z', 'CAD', 'GBP', 35000.0),
  (1010, '2025-09-09T09:00:43.000000Z', 'JPY', 'AUD', 15000000.0);

Query:

WITH
  offsets AS (
      SELECT x-601 AS sec_offs FROM long_sequence(1201)
  ),
  orders AS (
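    -- alias the columns of the orders table defined above to the names used in the rest of the query
    SELECT order_id AS id, order_completed_ts AS order_ts, sym1, sym2 FROM orders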
  ),
  points AS (
    SELECT orders.*, offsets.sec_offs, order_ts + 1_000_000 * sec_offs AS ts
    FROM orders CROSS JOIN offsets
    ORDER BY order_ts + 1_000_000 * sec_offs
  ),
  rslt1 AS (
    SELECT t.*, p.price AS sym1_price
    FROM points as t
    ASOF JOIN prices as p
    ON (t.sym1 = p.sym)
  ),
  rslt2 AS (
    SELECT t.*, p.price AS sym2_price
    FROM rslt1 as t
    ASOF JOIN prices as p
    ON (t.sym2 = p.sym)
  )
SELECT id, sec_offs, sym1_price/sym2_price AS price FROM rslt2
WHERE sym1_price is not null and sym2_price is not null
ORDER BY id, sec_offs;

Without the symbol index, the query times out. Profiling the query in more detail, one trade row at a time, reveals a clear pattern: if the row involves AUD, it's super-slow.

With the symbol index, the query is done within a second.

core/src/main/java/io/questdb/griffin/engine/join/AsOfJoinIndexedRecordCursorFactory.java

puzpuzpuz

I've tried to compare the new factory with the old one in a degenerate case:

create table trades AS (
  select 'BTC-USD'::symbol symbol, x::double price, 42.0 amount, x::timestamp as ts
  from long_sequence(1_000_000)
), index(symbol) timestamp(ts);

-- new factory: 3.7s
-- old factory: 50ms
select sum(t2.price)
from trades t1
asof join trades t2 on t1.symbol = t2.symbol;

@mtopolnik do the results make sense to you? If so, maybe we should introduce a hint to disable index usage? We already have a few hints, e.g. avoid_asof_binary_search, but not the index one.

mtopolnik · 2025-09-25T08:01:31Z

I profiled this query: the time is spent in the code that looks through the index to find the starting row, which involves traversing a linked list backwards. In this degenerate case, all rows are in the index, but there's no dead-reckoning way of going straight to a given row, so it degrades into a linear search.
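
A rough illustration of why that walk becomes linear, as a toy model rather than the actual index reader: it assumes the row IDs for one symbol are stored in fixed-size blocks, each linked to the block written before it. The class, the `BLOCK_CAPACITY` value, and the method names are all hypothetical.

```java
// Toy model only: the real index layout is not described in this thread.
final class BackwardBlockWalkSketch {
    static final int BLOCK_CAPACITY = 4; // hypothetical, kept small for illustration

    /** One block of ascending row IDs, linked to the block written before it. */
    static final class Block {
        final long[] rowIds;
        final Block prev;

        Block(long[] rowIds, Block prev) {
            this.rowIds = rowIds;
            this.prev = prev;
        }
    }

    /** Builds the chain for a symbol present in every row 0..rowCount-1 and returns the newest block. */
    static Block buildDenseChain(long rowCount) {
        Block tail = null;
        for (long start = 0; start < rowCount; start += BLOCK_CAPACITY) {
            int n = (int) Math.min(BLOCK_CAPACITY, rowCount - start);
            long[] ids = new long[n];
            for (int i = 0; i < n; i++) {
                ids[i] = start + i;
            }
            tail = new Block(ids, tail);
        }
        return tail;
    }

    /** Walks from the newest block back to the one that can hold maxRowId; returns blocks visited. */
    static int blocksVisited(Block tail, long maxRowId) {
        int visited = 0;
        for (Block b = tail; b != null; b = b.prev) {
            visited++;
            if (b.rowIds[0] <= maxRowId) {
                break; // the largest row ID <= maxRowId, if any, is in this block
            }
        }
        return visited;
    }
}
```

In this model, with 1000 rows (250 blocks), finding row 999 touches one block, row 500 touches 125, and row 0 touches all 250. When every row carries the same symbol, the cost of locating the starting row is proportional to its distance from the end of the chain.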

puzpuzpuz · 2025-09-25T09:15:00Z

@bluestreak01 @nwoolmer it sounds like our symbol index could be enhanced by adding a sparse index of blocks - the goal is to speed up row id-based block search. See #6158 (review) for the overhead of the block search.
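
One possible shape of such a sparse index, sketched under assumptions and not taken from any QuestDB code: keep an ascending array with the first row ID of each block, and binary-search it to jump straight to the candidate block. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of a sparse block index; not an existing QuestDB structure.
final class SparseBlockIndexSketch {

    /**
     * firstRowIds[i] is the smallest row ID stored in block i, in ascending order.
     * Returns the index of the block that may hold the largest row ID <= maxRowId,
     * or -1 when every block starts after maxRowId.
     */
    static int findBlock(long[] firstRowIds, long maxRowId) {
        int pos = Arrays.binarySearch(firstRowIds, maxRowId);
        // On a miss, binarySearch returns -(insertionPoint) - 1;
        // the candidate block is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }
}
```

For example, with blocks starting at row IDs 0, 4, 8 and 12, `findBlock` maps row ID 5 to block 1 and row ID 8 to block 2. The search takes a logarithmic number of steps in the block count instead of a backward walk over the chain.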

glasstiger · 2025-09-25T15:50:47Z

[PR Coverage check]

😍 pass : 158 / 165 (95.76%)

file detail

	path	covered line	new line	coverage
🔵	io/questdb/griffin/engine/table/SelectedRecordCursorFactory.java	0	1	00.00%
🔵	io/questdb/griffin/engine/table/TimeFrameRecordCursorImpl.java	5	7	71.43%
🔵	io/questdb/griffin/engine/join/AsOfJoinIndexedRecordCursorFactory.java	67	71	94.37%
🔵	io/questdb/griffin/engine/join/AsOfJoinNoKeyFastRecordCursorFactory.java	1	1	100.00%
🔵	io/questdb/griffin/engine/join/AbstractAsOfJoinFastRecordCursor.java	6	6	100.00%
🔵	io/questdb/griffin/engine/join/LtJoinNoKeyFastRecordCursorFactory.java	1	1	100.00%
🔵	io/questdb/griffin/engine/join/AsOfJoinFastRecordCursorFactory.java	12	12	100.00%
🔵	io/questdb/griffin/engine/join/FilteredAsOfJoinNoKeyFastRecordCursorFactory.java	1	1	100.00%
🔵	io/questdb/griffin/engine/join/AbstractKeyedAsOfJoinRecordCursor.java	47	47	100.00%
🔵	io/questdb/griffin/engine/join/FilteredAsOfJoinFastRecordCursorFactory.java	1	1	100.00%
🔵	io/questdb/griffin/SqlCodeGenerator.java	17	17	100.00%

mtopolnik added 30 commits September 12, 2025 12:59

WIP

7f2dd55

style

6590bf2

Improve index scan logic

1287052

Better variable name

e28d47a

Restore slave cursor position at entry istead of end

4384af5

Remove unnecessary checks

8965896

Remove unused fields

0e667ab

Log test params on start of test

bbb44f2

Restore slaveKeySink field

9606df9

Delete unused variables

7c5c54c

Extract variable

e728cd5

Improve performKeyMatching

50f9c20

Apply row ID offset

3bd2d53

Map absolute rowID to localRowID

43a7a0f

Re-acquire index reader after changing partition

1ec1973

Use masterSymbolColumnIndex to access master symbol column

4a32d5d

Remove fallback to linear search

b40ece8

Map back to physical table column index to acces symbol index

29ab4fe

Remove unused method

29131cc

Remove default interface method impl

e52d6b1

Implement new methods in SelectedRecordCursorFactory

1b772cd

Add getPhysicalColumnIndex to TimeFrameRecordCursor

9d57ce3

Properly get physical symbol column index

1875fbd

Calculate row offset at each frame again

b64da3c

Remove redundant local variable timeFrame

8cd917f

Remove useless partitionIndex

0c1bf7b

Remove local variable timeFrame

06cb3d8

Use auxiliary frameRecA for local computation

811e6ee

Fix impl of getBimtapIndexReader

9c5e2b0

Update frameIndex and rowMax on each iteration

75b04c3

mtopolnik added 2 commits September 24, 2025 08:46

Merge branch 'master' into mt_asof-join-index

f39b194

Better method name

5436707

puzpuzpuz reviewed Sep 24, 2025

core/src/main/java/io/questdb/griffin/engine/join/AsOfJoinIndexedRecordCursorFactory.java

puzpuzpuz reviewed Sep 24, 2025

core/src/main/java/io/questdb/griffin/engine/join/AsOfJoinIndexedRecordCursorFactory.java Outdated

puzpuzpuz reviewed Sep 24, 2025

tris0laris moved this to Imminent Release in QuestDB Public Roadmap (Legacy) Sep 24, 2025

tris0laris added this to QuestDB Public Roadmap (Legacy) Sep 24, 2025

Move symbol table fetch to cursor.of()

5344d12

mtopolnik added 2 commits September 25, 2025 11:33

Add circuit breaker check

ed48f6a

Revert Rust changes

a0e3150

mtopolnik force-pushed the mt_asof-join-index branch from c1cd8f4 to a0e3150 Compare September 25, 2025 10:11

Merge branch 'master' into mt_asof-join-index

134a586

puzpuzpuz approved these changes Sep 25, 2025

Merge branch 'master' into mt_asof-join-index

36c898f

bluestreak01 merged commit f1c0d7a into master Sep 25, 2025
33 of 35 checks passed

bluestreak01 deleted the mt_asof-join-index branch September 25, 2025 16:34

mtopolnik mentioned this pull request Oct 1, 2025

feat(sql): optimized ASOF JOIN on single symbol key where RHS symbol is low-frequency #6208

Merged

tris0laris moved this from Imminent Release to Shipped in QuestDB Public Roadmap (Legacy) Oct 3, 2025

coderabbitai bot mentioned this pull request Oct 16, 2025

perf(sql): disable column pre-touch in parallel filter #6280

Merged

This was referenced Oct 31, 2025

perf(sql): improve performance of linear as-of join algo #6338

Merged

chore(sql): speedup tests #6347

Merged

chore(core): refactor and clean up ASOF/LT JOIN code #6348

Merged

tris0laris mentioned this pull request Nov 18, 2025

Real-time markouts for capital markets questdb/roadmap#98

Open

coderabbitai bot mentioned this pull request Dec 12, 2025

feat(sql): support include prevailing in window join #6476

Merged

9 tasks

coderabbitai bot mentioned this pull request Jan 1, 2026

feat(sql): add rowCount, txn and timestamp columns to tables() #6581

Merged

4 tasks

coderabbitai bot mentioned this pull request Jan 13, 2026

fix(sql): fix ASOF JOIN crash when ON clause has symbol and other columns #6634

Merged

coderabbitai bot mentioned this pull request Feb 24, 2026

fix(sql): NullPointerException when right-hand query in window join has timestamp filter #6806

Merged

bluestreak01 merged 65 commits into master from mt_asof-join-index


5 participants
