ARROW-13574: [C++] Add 'count all' option to count kernels#10895
ARROW-13574: [C++] Add 'count all' option to count kernels#10895lidavidm wants to merge 2 commits intoapache:masterfrom
Conversation
|
Hmm... It's weird that Can we instead make |
|
Rebased, fixed conflicts, added CountOptions, added GLib/Python/R bindings for CountOptions. |
pitrou
left a comment
There was a problem hiding this comment.
Just a few comments, looks good in general.
| /// Count only non-null values. | ||
| NON_NULL = 0, | ||
| /// Count only null values. | ||
| NULLS, |
There was a problem hiding this comment.
Nit, but the plural / singular is a bit inconsistent. Perhaps ONLY_NULL vs ONLY_VALID (or NULLS vs NON_NULLS, or NULL_VALUES vs VALID_VALUES...)?
| *out = Datum(state.nulls); | ||
| break; | ||
| case CountOptions::ALL: | ||
| *out = Datum(state.non_nulls + state.nulls); |
There was a problem hiding this comment.
In this case, it's probably not necessary to popcount the null bitmap?
| @@ -562,7 +570,7 @@ const FunctionDoc count_doc{"Count the number of null / non-null values", | |||
| ("By default, only non-null values are counted.\n" | |||
| "This can be changed through ScalarAggregateOptions."), | |||
Pandas calls this 'size'. I opted not to extend ScalarAggregateOptions as the option wouldn't apply to any other kernel.