Skip to content

window functions: use storage sorting key #19289

@akuzm

Description

@akuzm

To compute a window function, we must first sort data according to PARTITION BY and then ORDER BY clauses of the window.
When the storage sorting key and the PARTITION BY + ORDER BY of the window have a common prefix, we must avoid sorting the data fully from scratch, and instead read it in (partially) sorted order and then sort incrementally.

For large amounts of data, sorting it is not feasible, so this incremental sorting is the only way to compute a window function. See an example we discussed in #18097:

'Crash tests' use case. Imagine a fast trading dataset.

Every second you have up to 10 million transactions with a micro/nanosecond resolution.

The task is to calculate the sum and the number of transactions in the exactly X seconds window prepending the transaction. (not on the second boundary but back from the current event). Single select can extract up to several weeks of data.

This is sum over (order by transaction_time range between 'X seconds' preceding and current row).
10M records per second * 30 days per month * 24 * 60 * 60 seconds a day = 25T records, probably about 100TB of data compressed

Metadata

Metadata

Assignees

No one assigned

    Labels

    comp-window-functionsWindow function execution + frame handling (ROW_NUMBER/RANK/LAG/LEAD, frames, partitions, order).feature

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions