[C++] Remove internal GroupBy implementation #14866

@westonpace

Describe the enhancement requested

Currently there are two ways to compute a group by. The supported way is to use an aggregate node in an exec plan. The second way is to call the internal function arrow::internal::GroupBy, which simulates, but does not actually use, an aggregate node.

The internal implementation has caused problems in the past. Because we test aggregates through the internal function, and it behaves slightly differently from the aggregate node, we did not notice an error in how the aggregate node invokes aggregate kernels. The internal implementation also requires maintenance, and it significantly complicated #14352.

I would like to remove the internal implementation. Unfortunately, it is used by tests, benchmarks, and pyarrow. However, we should be able to update those callers to use a friendly wrapper around exec plans instead.

Component(s)

C++
