Port remaining parallelizable aggregate functions to off-heap data structures #4120
Open
Labels
Enhancement (Enhance existing functionality), Help wanted (Assistance or additional information is wanted), Performance (Performance improvements), SQL (Issues or changes relating to SQL execution), hacktoberfest (A good issue for Hacktoberfest 2025 contributors. No AI-driven commits, please)
Description
Is your feature request related to a problem?
#4097 ported min(str), max(str), as well as count_distinct() for long, int, and IPv4 types to parallel GROUP BY, but some functions remain unported. Namely:
- count_distinct(uuid): requires a new long128 hash set, similar to the GroupByLongHashSet one
- count_distinct(long256): requires a new long256 hash set, similar to the GroupByLongHashSet one
- approx_percentile(double): this one is tricky as we'll have to port HdrHistogram to become off-heap and flyweight
- all first/last and first_not_null/last_not_null functions: to port them, we'll have to access and store row ids in the group by map
- isOrdered(IPv4)/isOrdered(long) functions: again, we need to track row ids
- ksum/nsum
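
To make the count_distinct(uuid) item more concrete, here is a rough sketch of what a 128-bit open-addressing hash set could look like. It is not the GroupByLongHashSet API: the class name, the sentinel pair used for empty slots, the hash mixing, and the use of a direct ByteBuffer as a stand-in for off-heap memory are all assumptions made for illustration. The merge method shows how per-worker partial sets might be combined at the end of a parallel GROUP BY.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch only; names and layout are assumptions, not QuestDB code.
final class Long128HashSetSketch {
    // Assumed sentinel: a slot is empty when both halves equal NO_ENTRY.
    private static final long NO_ENTRY = Long.MIN_VALUE;
    private static final int SLOT_BYTES = 16; // two longs per key

    private ByteBuffer mem; // direct buffer stands in for off-heap memory
    private int capacity;   // number of slots, always a power of two
    private int size;

    Long128HashSetSketch(int initialCapacity) {
        capacity = Math.max(4, Integer.highestOneBit(Math.max(1, initialCapacity - 1)) << 1);
        mem = allocate(capacity);
    }

    private static ByteBuffer allocate(int slots) {
        ByteBuffer b = ByteBuffer.allocateDirect(slots * SLOT_BYTES).order(ByteOrder.nativeOrder());
        for (int i = 0; i < slots; i++) {
            b.putLong(i * SLOT_BYTES, NO_ENTRY);
            b.putLong(i * SLOT_BYTES + 8, NO_ENTRY);
        }
        return b;
    }

    private static int hash(long lo, long hi) {
        long h = (lo * 0x9E3779B97F4A7C15L) ^ (hi * 0xC2B2AE3D27D4EB4FL);
        return (int) (h ^ (h >>> 32));
    }

    // Linear probing; returns true if the key was newly stored.
    private static boolean insert(ByteBuffer b, int cap, long lo, long hi) {
        int mask = cap - 1;
        int i = hash(lo, hi) & mask;
        while (true) {
            long slotLo = b.getLong(i * SLOT_BYTES);
            long slotHi = b.getLong(i * SLOT_BYTES + 8);
            if (slotLo == NO_ENTRY && slotHi == NO_ENTRY) {
                b.putLong(i * SLOT_BYTES, lo);
                b.putLong(i * SLOT_BYTES + 8, hi);
                return true;
            }
            if (slotLo == lo && slotHi == hi) {
                return false;
            }
            i = (i + 1) & mask;
        }
    }

    // Adds a 128-bit key; returns true if it was not present before.
    boolean add(long lo, long hi) {
        if (lo == NO_ENTRY && hi == NO_ENTRY) {
            throw new IllegalArgumentException("sentinel key is reserved");
        }
        if ((size + 1) * 2 > capacity) { // keep the load factor at or below 0.5
            rehash();
        }
        if (insert(mem, capacity, lo, hi)) {
            size++;
            return true;
        }
        return false;
    }

    // Folds another worker's partial set into this one.
    void merge(Long128HashSetSketch other) {
        for (int i = 0; i < other.capacity; i++) {
            long lo = other.mem.getLong(i * SLOT_BYTES);
            long hi = other.mem.getLong(i * SLOT_BYTES + 8);
            if (lo != NO_ENTRY || hi != NO_ENTRY) {
                add(lo, hi);
            }
        }
    }

    int size() {
        return size;
    }

    private void rehash() {
        int newCapacity = capacity * 2;
        ByteBuffer next = allocate(newCapacity);
        for (int i = 0; i < capacity; i++) {
            long lo = mem.getLong(i * SLOT_BYTES);
            long hi = mem.getLong(i * SLOT_BYTES + 8);
            if (lo != NO_ENTRY || hi != NO_ENTRY) {
                insert(next, newCapacity, lo, hi);
            }
        }
        mem = next;
        capacity = newCapacity;
    }
}
```

A long256 variant would presumably follow the same shape with four longs per slot. A real implementation would also need to use the group by allocator rather than a direct buffer and decide how null values are handled.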
There is also count_distinct(symbol), but we have early exit logic in that function (see #3974), so we don't want to port it, at least for now.
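
For the first/last items in the list above, the reason to store row ids is presumably that workers process page frames in no particular order, so partial results can only be merged correctly if each one remembers which row produced it. A minimal sketch of that idea follows. The class and method names are hypothetical and the state lives in plain Java fields here, whereas the issue proposes keeping it in the group by map.

```java
// Hypothetical sketch only; in the real port this state would live in the group by map.
final class FirstLastState {
    long firstRowId = Long.MAX_VALUE; // row id that produced `first`
    double first = Double.NaN;
    long lastRowId = -1;              // row id that produced `last`
    double last = Double.NaN;

    // Called by a worker for each row of its page frame, in any order.
    void accept(long rowId, double value) {
        if (rowId < firstRowId) {
            firstRowId = rowId;
            first = value;
        }
        if (rowId > lastRowId) {
            lastRowId = rowId;
            last = value;
        }
    }

    // Combines another worker's partial state: the smallest row id wins for first,
    // the largest for last.
    void merge(FirstLastState other) {
        if (other.firstRowId < firstRowId) {
            firstRowId = other.firstRowId;
            first = other.first;
        }
        if (other.lastRowId > lastRowId) {
            lastRowId = other.lastRowId;
            last = other.last;
        }
    }
}
```

The first_not_null/last_not_null variants would skip null values in accept, and an isOrdered port would likely need to keep comparable row-id bookkeeping at frame boundaries.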
Describe the solution you'd like.
No response
Describe alternatives you've considered.
No response
Full Name:
Andrei Pechkurov
Affiliation:
QuestDB
Additional context
No response