Skip to content

perf(sql): optimized Markout Horizon CROSS JOIN#6283

Merged
bluestreak01 merged 243 commits intomasterfrom
mt_adaptive-sort
Nov 21, 2025
Merged

perf(sql): optimized Markout Horizon CROSS JOIN#6283
bluestreak01 merged 243 commits intomasterfrom
mt_adaptive-sort

Conversation

@mtopolnik
Copy link
Copy Markdown
Contributor

@mtopolnik mtopolnik commented Oct 17, 2025

This is the Markout Curve query we're optimizing:

WITH
  offsets AS (
    SELECT sec_offs, 1_000_000 * sec_offs usec_offs 
    FROM (SELECT x-601 AS sec_offs FROM long_sequence(1201))
  ),
  points AS (SELECT * FROM (
    SELECT id, order_ts, sym1, sym2, unit_price, volume, sec_offs, order_ts + usec_offs AS ts
    FROM orders CROSS JOIN offsets
    ORDER BY order_ts + usec_offs
  ) TIMESTAMP(ts)),
  join1 AS (
    SELECT t.*, p.price AS sym1_price
    FROM points as t
    ASOF JOIN prices as p
    ON (t.sym1 = p.sym)
  ),
  join2 AS (
    SELECT t.*, p.price AS sym2_price
    FROM join1 as t
    ASOF JOIN prices as p
    ON (t.sym2 = p.sym)
  ),
  markouts AS (
  SELECT
    sec_offs, 
    volume,
    volume * (unit_price - (sym1_price/sym2_price)) AS weighted_markout
  FROM join2
  )
SELECT sec_offs, sum(weighted_markout) / sum(volume) AS avg_weigthed_markout 
FROM markouts;

This subquery generates the sampling points over the markout horizon of every order:

SELECT id, order_ts, sym1, sym2, unit_price, volume, sec_offs, order_ts + usec_offs AS ts
FROM orders CROSS JOIN offsets
ORDER BY order_ts + usec_offs

It performs poorly on a larger number of orders, because the CROSS JOIN output must be fully materialized and then sorted.

This PR introduces a special-case cursor factory that emits the CROSS JOIN output directly in the required order, without having to materialize the orders table. It does materialize offsets because it needs random access to it, but this table is of a very limited size, coming from long_sequence().

Example usage and benchmark:

CREATE TABLE orders (
      ts TIMESTAMP,
      price DOUBLE
  ) timestamp(ts) PARTITION BY DAY;
INSERT INTO orders
  SELECT
      generate_series as timestamp,
      rnd_double() * 10.0 + 5.0
      FROM generate_series('2025-01-02', '2025-01-02T00:10', '200u');

This creates 3,000,001 orders spaced at 200 µs.

WITH
  offsets AS (
    SELECT sec_offs, 10_000_000 * sec_offs usec_offs 
    FROM (SELECT x-61 AS sec_offs FROM long_sequence(121))
  )
  SELECT /*+ markout_horizon(orders offsets) */ sum(price) FROM (SELECT * FROM (
    SELECT price, ts + usec_offs AS timestamp
    FROM orders CROSS JOIN offsets
    ORDER BY ts + usec_offs
  ) TIMESTAMP(timestamp));

This creates the markout horizon sampling grid over 10 minutes, spaced at 10 seconds. There are 121 sampling points for each order. Therefore, this results in 121 * 3,000,001 = 363,000,121 rows.

The query is built to emphasize the worst case for the Markout Horizon algo in terms of RAM usage: tight spacing of orders vs. the markout horizon. The algorithm must hold 3 million iterator structures in RAM at once. It uses 40 bytes per iterator.

I benchmarked it on a r7a.4xlarge EC2 box.

Without the markout hint, the query took 135 seconds, and RAM usage went from 2.3 GB baseline to 10.7 GB.

With the hint, the query took 17 seconds, with an even split between aggregation and row generation. RAM usage went from 2.3 to 2.4 GB.

@mtopolnik
Copy link
Copy Markdown
Contributor Author

I've noticed https://github.com/questdb/questdb/pull/6362/files#r2537236278 while debugging the new factory. Please take a look.

Addressed in a1abc77

@puzpuzpuz puzpuzpuz self-requested a review November 20, 2025 09:03
puzpuzpuz
puzpuzpuz previously approved these changes Nov 20, 2025
@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 334 / 353 (94.62%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/orderby/RecordTreeChain.java 0 1 00.00%
🔵 io/questdb/griffin/engine/join/MarkoutHorizonRecordCursorFactory.java 253 269 94.05%
🔵 io/questdb/griffin/SqlCodeGenerator.java 78 80 97.50%
🔵 io/questdb/cairo/RecordChain.java 1 1 100.00%
🔵 io/questdb/cairo/RecordArray.java 1 1 100.00%
🔵 io/questdb/griffin/SqlHints.java 1 1 100.00%

@bluestreak01 bluestreak01 merged commit f744c20 into master Nov 21, 2025
41 checks passed
@bluestreak01 bluestreak01 deleted the mt_adaptive-sort branch November 21, 2025 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement Enhance existing functionality SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants