Skip to content

feat(sql): introduce more array functions#5771

Merged
bluestreak01 merged 92 commits intomasterfrom
vic_array_function
Jul 1, 2025
Merged

feat(sql): introduce more array functions#5771
bluestreak01 merged 92 commits intomasterfrom
vic_array_function

Conversation

@kafka1991
Copy link
Copy Markdown
Collaborator

@kafka1991 kafka1991 commented Jun 20, 2025

1. New array functions

array_position(array, elem)

Returns the index (1-based) of the first elem in the 1D array if it exists, otherwise NULL.

SELECT array_position(ARRAY[1.0, 2.0], 1.0) p1, array_position(ARRAY[1.0, 2.0], 3.0) p2;
p1 p2
1 NULL

insertion_point(array, x, [ahead_of_equal])

Looks for the insertion position of x in the 1D array, which must be sorted. Returns its index (1-based).
When the array contains elements equal to x:

  • If ahead_of_equal is true, the function returns the position of the first equal element.
  • Otherwise, it returns the position of the last equal element plus 1.
    ahead_of_equaldefaults to false.
select insertion_point(Array[1.0, 2.0, 2.0, 3.0], 2.0);
 insertion_point
-----------------
               4

select insertion_point(Array[1.0, 2.0, 2.0, 3.0], 2.0, true);
 insertion_point
-----------------
               2

select insertion_point(Array[1.0, 2.0, 2.0, 3.0], 1.5);
 insertion_point
-----------------
               2

array_sum(array)

Returns the sum of elements in the array.

select array_sum(Array[1.0, 2.0]) sum1, array_sum(Array[[1.0, 3.0]]) sum2;
 sum1 | sum2
------+-----
  3.0 | 4.0

array_count(array)

Returns the number of not NaN elements in the source array.

select array_count(Array[1.0, 2.0]) count1, array_count(Array[[1.0, null]]) count2;
 count1 | count2
--------+--------
      2 |     1

array_avg(array)

Returns the average of elements in the array.

select array_avg(Array[1.0, 2.0]) avg1, array_avg(Array[[1.0, 3.0]]) avg2;
 avg1 | avg2
------+------
  1.5 |  2.0

array_cum_sum(array)

Returns the array of cumulative sums of elements in the array. If array is multidimensional, flattens it first into a single dimension.

select array_cum_sum(Array[1.0, 2.0]) cs1, array_cum_sum(Array[[1.0, 3.0]]) cs2;
     cs1  |    cs2
----------+----------
{1.0,3.0} | {1.0,4.0}

dot_product(array1, array2)

Returns the dot product of the two arrays, which must have the same shape.

select 
    array_dot_product(Array[1.0, 2.0], Array[2.0, 1.0]) p1, 
    array_dot_product(Array[[1.0, 3.0]], Array[[3.0, 1.0]]) p2;
 p1 | p2
----+-----
4.0 | 6.0

shift(array1, x, [fill_value])

Shift array elements along the innermost dimension of the array. When x > 0, it indicates a rightward shift by x positions; otherwise, it signifies a leftward shift by x positions.All newly vacated positions are filled with fill_value, which defaults to Null.

select shift([Array[[1.0, 2.0], [3.0, 4.0]], 1);
  column
-----------
 {{Null,1.0},{Null, 3.0}}

(+ - * /) between array and scalar value

+

select Array[1.0, 2.0] + 1, 1 + Array[1.0, 2.0];
  column   |  column1
-----------+-----------
 {2.0,3.0} | {2.0,3.0}

-

select Array[1.0, 2.0] - 1;
  column
-----------
 {0.0,1.0}

*

select Array[1.0, 2.0] * 2, 2 * Array[1.0, 2.0];
  column   |  column1
-----------+-----------
 {2.0,4.0} | {2.0,4.0}

/

select Array[1.0, 2.0] / 2;
  column
-----------
 {0.5,1.0}

2. Auto-broadcasting for binary array operators

Broadcast rules on (+ - * /) between arrays

https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules

Two array shapes are compatiable only when all of their dimensions are compatible.

Starting with the rightmost dimension moving to the left, two dimensions are compatible when:

  • they are equal, or
  • one of them is 1

The two arrays do not need to have the same number of dimensions. The resulting array will have the same number of dimensions as the input array with the greatest number of dimensions, where the size of each dimension is the largest size of the corresponding dimension among the input arrays. Note that missing dimensions are assumed to have size one.

select Array[[0.0, 0, 0], [10, 10, 10], [20, 20, 20], [30, 30, 30]] * Array[0, 1, 2];
                             column
-----------------------------------------------------------------
 {{0.0,0.0,0.0},{0.0,10.0,20.0},{0.0,20.0,40.0},{0.0,30.0,60.0}}

Broadcast rules on matmul function

  • If the left array is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions.
  • If the right array is 1-D, it is promoted to a matrix by appending a 1 to its dimensions.
select matmul(Array[[2.0, 3.0], [4.0, 5.0], [6.0, 7.0]], Array[1.0, 2]);
        matmul
-----------------------
 {{8.0},{14.0},{20.0}}

3. Performance

  1. The PR expands the meaning of isVanilla to cover all cases where all the array elements occur in a contiguous block, arranged in the row-major order. This may be an arbitrary contiguous block within the one covered by the underlying flat array view.
  2. arraySum, arrayAvg and arrayCount use vectorized operations when the array is vanilla.

@kafka1991 kafka1991 changed the title feat(sql): introduce indexOf(array), indexOfAssumeSorted(array) and arraySum(array) function. feat(sql): introduce more array functions. Jun 23, 2025
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Jun 23, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
3 out of 4 committers have signed the CLA.

✅ kafka1991
✅ bluestreak01
✅ mtopolnik
❌ GitHub Actions - Rebuild Native Libraries


GitHub Actions - Rebuild Native Libraries seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

mtopolnik
mtopolnik previously approved these changes Jul 1, 2025
@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 810 / 893 (90.71%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/functions/array/IntIntervalRightOpenFunctionFactory.java 0 1 00.00%
🔵 io/questdb/cairo/arr/SingleElementDoubleArray.java 0 6 00.00%
🔵 io/questdb/cutlass/pgwire/modern/PgNonNullBinaryArrayView.java 0 1 00.00%
🔵 io/questdb/griffin/engine/functions/array/IntIntervalFunctionFactory.java 0 1 00.00%
🔵 io/questdb/cairo/arr/FunctionArray.java 2 23 08.70%
🔵 io/questdb/cairo/arr/FlatArrayView.java 34 54 62.96%
🔵 io/questdb/cutlass/pgwire/modern/PGPipelineEntry.java 2 3 66.67%
🔵 io/questdb/griffin/FunctionFactoryDescriptor.java 6 7 85.71%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayBinaryOperator.java 21 24 87.50%
🔵 io/questdb/cairo/arr/DerivedArrayView.java 71 78 91.03%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayAndScalarArrayOperator.java 25 27 92.59%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayShiftDefaultNaNFunctionFactory.java 29 31 93.55%
🔵 io/questdb/griffin/engine/functions/array/DoubleNegArrayFunctionFactory.java 30 32 93.75%
🔵 io/questdb/cairo/arr/ArrayView.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayPositionFunctionFactory.java 63 67 94.03%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayCumSumFunctionFactory.java 36 38 94.74%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayInsertionPointFunctionFactory.java 20 21 95.24%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayAndScalarDotProductFunctionFactory.java 31 32 96.88%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayDotProductFunctionFactory.java 51 53 96.23%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayShiftFunctionFactory.java 66 68 97.06%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayAddScalarFunctionFactory.java 12 12 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayInsertionPointAfterEqualFunctionFactory.java 18 18 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayFlattenFunctionFactory.java 19 19 100.00%
🔵 io/questdb/griffin/engine/functions/cast/CastDoubleArrayToDoubleArrayFunctionFactory.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleMatrixMultiplyFunctionFactory.java 20 20 100.00%
🔵 io/questdb/cairo/arr/BorrowedFlatArrayView.java 7 7 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayAddFunctionFactory.java 5 5 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayCountFunctionFactory.java 20 20 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayAccessFunctionFactory.java 3 3 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayDivFunctionFactory.java 14 14 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArraySumFunctionFactory.java 21 21 100.00%
🔵 io/questdb/griffin/engine/functions/cast/CastDoubleArrayToVarcharFunctionFactory.java 1 1 100.00%
🔵 io/questdb/cairo/arr/ArrayTypeDriver.java 8 8 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleScalarSubtractArrayFunctionFactory.java 13 13 100.00%
🔵 io/questdb/griffin/engine/functions/ArgSwappingFunctionFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayDivScalarFunctionFactory.java 11 11 100.00%
🔵 io/questdb/cairo/CursorPrinter.java 2 2 100.00%
🔵 io/questdb/griffin/FunctionFactoryCache.java 5 5 100.00%
🔵 io/questdb/cairo/arr/DirectArray.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleScalarDivArrayFunctionFactory.java 13 13 100.00%
🔵 io/questdb/griffin/engine/functions/constants/ArrayConstant.java 2 2 100.00%
🔵 io/questdb/griffin/engine/functions/cast/CastDoubleArrayToStrFunctionFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArraySubtractScalarFunctionFactory.java 11 11 100.00%
🔵 io/questdb/cutlass/pgwire/modern/DoubleArrayParser.java 5 5 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArraySubtractFunctionFactory.java 5 5 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleUnaryArrayAccessor.java 17 17 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayMultiplyScalarFunctionFactory.java 12 12 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayAvgFunctionFactory.java 22 22 100.00%
🔵 io/questdb/griffin/engine/functions/array/DoubleArrayMultiplyFunctionFactory.java 5 5 100.00%
🔵 io/questdb/griffin/FunctionFactory.java 1 1 100.00%
🔵 io/questdb/cutlass/http/processors/JsonQueryProcessorState.java 1 1 100.00%
🔵 io/questdb/cutlass/http/processors/TextQueryProcessor.java 1 1 100.00%

@bluestreak01 bluestreak01 merged commit e4a7600 into master Jul 1, 2025
36 of 37 checks passed
@bluestreak01 bluestreak01 deleted the vic_array_function branch July 1, 2025 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants