perf(sql): optimize parallel HORIZON JOIN for various data distributions#6867
perf(sql): optimize parallel HORIZON JOIN for various data distributions#6867bluestreak01 merged 19 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can validate your CodeRabbit configuration file in your editor.If your editor has YAML language server, you can enable auto-completion and validation by adding |
5350812 to
1be65f9
Compare
…nistic_horizon_join_perf
…nistic_horizon_join_perf
…nistic_horizon_join_perf
…nistic_horizon_join_perf
…nistic_horizon_join_perf
[PR Coverage check]😍 pass : 85 / 89 (95.51%) file detail
|
Summary
becomes excessive — e.g., with high-cardinality symbols or rare/infrequent keys that cause deep backward scans.
HorizonJoinTimeFrameHelperand resets bookmarks ontoTop()to eliminate jitter from out-of-order frame processing.Background
The previous backward-only strategy (commit 6145bafc) clears the key map on each ASOF position change and backward-scans ~K rows to repopulate it. This is 30x faster than forward scan for K=50 FX symbols, but degrades badly when K is large (e.g., K=5,000 equity tickers) or when rare keys appear infrequently — each rare-key lookup scans thousands of rows backward to the beginning of the table.
The adaptive approach retains the backward-only advantage for the common low-K case while automatically switching to forward scan when the data distribution makes backward scans expensive.
How it works
Each frame starts in backward-only mode. On each ASOF position change, the algorithm compares the backward scan row count at the previous position against the gap (row distance) to the current position. If
bwdScanCost > gap * 8andgap > 1,024, it switches to forward scan mode permanently for that frame.Gap calculation uses raw rowId subtraction (
asOfRowId - prevAsOfRowId). For same-frame positions, this gives the exact row count. For cross-frame positions, the frame-index bits produce a value >= 2^44, which safely prevents the relative switch condition from triggering. An absolute threshold (bwdScanCost > 131,072) handles the cross-partition case — e.g., a deep backward scan for a rare key right before a partition boundary.Benchmark results
Benchmarked with 100M-row
market_dataand 50K-rowfx_tradestables across five data distribution scenarios (see this gist).Test query:
Dev box results
Environment: GraalVM CE 17, Ryzen 7900x, 64GB RAM
The
_injectedscenario is a synthetic worst case where rare symbols appear only once at the very beginning ofmarket_data. Both master and patch handle it poorly (156s and 171s respectively). The patch regresses by 10% because backward-only mode must complete at least one deep scan before the adaptive switch triggers, whereas master's always-forward approach never clears the key map.c6a.16xlarge EC2 instance results
The patch shows clear wins on dense (-21%), sparse (-96%, dramatic), rare (-19%), and equity (-13%). The injected scenario (rare symbols planted at the very start forcing full backward scans) is essentially unchanged (+1.4%, within noise).
The sparse result stands out — the adaptive scan switch appears to be extremely effective when the gap between LHS rows is large (~2000 market_data rows per trade) with few symbols.
Also, no huge jitter noticed: the worst jitter was ~26% and was in the equity scenario. In other scenarios it was within 3-10%.
Test plan
HorizonJoinTest(113 tests) andHorizonJoinFuzzTestpasstestHorizonJoinKeyedAdaptiveScanSwitchinHorizonJoinTest: deterministic test with a rare key at row 0 that triggers the backward-to-forward switchtestParallelHorizonJoinAdaptiveScanandtestParallelHorizonJoinAdaptiveScan2inParallelHorizonJoinFuzzTest: parallel tests with microsecond-spaced data in one partition, comparing HORIZON JOIN results against reference UNION ALL of ASOF JOINstest_distributions.sqlscenarios (dense, sparse, rare, equity, injected) to verify switching behavior across different cardinalities and distributions