feat(sql): horizon join for markout analysis#6635
Conversation
… dense ASOF JOIN algo)
Reduce unnecessary backward scanning in keyed ASOF JOIN within HORIZON JOIN queries when master symbols don't exist in the slave table. - Add hasNonExistentKey() to SymbolTranslatingRecord to detect master symbols absent from the slave's symbol table via VALUE_NOT_FOUND - Add a fast path in backwardScanForKeyMatch that skips the full backward scan when a non-existent key is detected (checked after the watermark/cache lookup to avoid overhead on cache hits) - Add findAsOfRow fast path: when the bookmarked row's timestamp <= target and the next row's timestamp > target, return immediately without a full linear scan - Reduce default ASOF JOIN lookahead from 100 to 64 rows
|
@puzpuzpuz — Code review findings (verified). Each item was double-checked against the code; false positives from the initial pass were eliminated. Confirmed BugGROUP BY validation is overly permissive
Example that incorrectly passes: SELECT a + b, sum(c) FROM t HORIZON JOIN ... GROUP BY aPasses because The fix would be to replace the UX IssueError positions hardcoded to 0 in
|
@bluestreak01 thanks for the thorough review! Here's what was addressed in daa7827: Confirmed Bug: GROUP BY validation is overly permissiveFixed. Removed The Existing tests that relied on the loose matching (e.g., UX Issue: Error positions hardcoded to 0Fixed. Added a Defensive: Missing
|
[PR Coverage check]😍 pass : 3102 / 3496 (88.73%) file detail
|
|
@bluestreak01 @mtopolnik thanks for your reviews! |
Fixes #6770 by introducing per-worker sinks
HORIZON JOIN is a specialized time-series join in QuestDB designed for markout analysis — a common financial analytics pattern where you need to analyze how prices or metrics evolve at specific time offsets relative to events (e.g., trades, orders).
Syntax
or, with explicit offset list:
How it works
For each row in the left-hand table and each offset in the horizon, the join computes
left_timestamp + offsetand performs an ASOF match against the right-hand table. When join keys are provided (viaON), only right-hand rows matching the keys are considered. Results are implicitly grouped by the non-aggregate SELECT columns (horizon offset, left-hand table keys, etc.), and aggregate functions are applied across all matched rows.The horizon pseudo-table (RANGE / LIST)
The
RANGEorLISTclause defines a virtual table of time offsets, aliased by theASclause. This pseudo-table exposes two columns:<alias>.offsetLONGh.offset / 1000000to get seconds.<alias>.timestampTIMESTAMPleft_timestamp + offset). Available for grouping or expressions.RANGE generates offsets from
FROMtoTO(inclusive) with the givenSTEP. For example,RANGE FROM 0s TO 5m STEP 1mgenerates offsets at 0s, 1m, 2m, 3m, 4m, 5m.LIST specifies explicit offsets as interval literals using the same SAMPLE BY-like expression syntax as RANGE. Unitless 0 is allowed as a shorthand for zero offset. Offsets must be monotonically increasing. For example, LIST (0, 1s, 5s, 1m) generates offsets at 0s, 1s, 5s, and 1m.
Both RANGE and LIST use the same syntax as SAMPLE BY: . Supported units: U (microseconds), T (milliseconds), s (seconds), m (minutes), h (hours), d (days).
Query examples
Given sample tables:
Post-trade markout at uniform horizons
Measure average mid-price at 1s, 5s, 30s, and 60s after each trade, per symbol — a classic way to evaluate execution quality and price impact:
Markout P&L at non-uniform horizons
Compute the average post-trade markout (future mid minus trade price) at specific horizons using LIST:
Pre- and post-trade price movement
Use negative offsets to see price levels before and after trades, useful for detecting information leakage or adverse selection:
Markout by trade side
Break down markouts by buy/sell side to analyze execution asymmetry:
Volume-weighted markout (non-keyed aggregation)
Compute an overall markout across all symbols without grouping keys:
Use cases
Current limitations
STEPmust be positive;FROMmust be less than or equal toTO.TODOs
MarkoutHorizonRecordCursorintvalues instead ofstringwhen joining symbol columnsWindowJoinFuzzTest