Description
Describe the enhancement requested
Currently there are two ways to compute a group by. The supported way is to use an aggregate node in an exec plan. The second way is the internal function arrow::internal::GroupBy, which simulates, but does not actually use, an aggregate node.
The internal implementation has caused issues in the past. Because we test aggregates through the internal function, and it behaves slightly differently from the aggregate node, we did not notice an error in the aggregate node's invocation of aggregate kernels. The internal implementation also requires maintenance and significantly complicated #14352.
I would like to remove the internal implementation. Unfortunately, it is used by tests, benchmarks, and pyarrow. However, we should be able to update those callers to use a friendly wrapper around exec plans instead.
Component(s)
C++