
Consider adding is_distinct_from kernels #960

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Several databases have IS DISTINCT FROM and IS NOT DISTINCT FROM operators in addition to = and !=.

We have added these operators to DataFusion; see apache/datafusion#1117 from @Dandandan and apache/datafusion#1163.

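The IS DISTINCT FROM operator differs from != in how nulls are handled: it treats null as a comparable value, so the result is always true or false and never null.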

From the Postgres manual
https://www.postgresql.org/docs/14/functions-comparison.html

datatype IS DISTINCT FROM datatype → boolean
Not equal, treating null as a comparable value.
1 IS DISTINCT FROM NULL → t (rather than NULL)
NULL IS DISTINCT FROM NULL → f (rather than NULL)
--
datatype IS NOT DISTINCT FROM datatype → boolean
Equal, treating null as a comparable value.
1 IS NOT DISTINCT FROM NULL → f (rather than NULL)
NULL IS NOT DISTINCT FROM NULL → t (rather than NULL)
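
To make the null handling concrete, here is a minimal sketch in plain Rust that models a nullable value as `Option<T>`. The function names mirror the proposed kernels, but these are illustrative scalar helpers, not the DataFusion or arrow-rs implementations; `sql_not_eq` is a hypothetical helper included only to contrast with ordinary three-valued `!=`.

```rust
/// `a IS DISTINCT FROM b`: null is treated as a comparable value,
/// so the result is always a plain `bool`.
fn is_distinct_from<T: PartialEq>(a: &Option<T>, b: &Option<T>) -> bool {
    match (a, b) {
        (Some(x), Some(y)) => x != y, // both non-null: ordinary inequality
        (None, None) => false,        // NULL vs NULL: not distinct
        _ => true,                    // exactly one side is NULL: distinct
    }
}

/// `a IS NOT DISTINCT FROM b`: the logical negation of the above.
fn is_not_distinct_from<T: PartialEq>(a: &Option<T>, b: &Option<T>) -> bool {
    !is_distinct_from(a, b)
}

/// Hypothetical helper for contrast: SQL `a != b`, which yields NULL
/// (modeled here as `None`) whenever either input is NULL.
fn sql_not_eq<T: PartialEq>(a: &Option<T>, b: &Option<T>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x != y),
        _ => None,
    }
}
```

With these definitions, `is_distinct_from(&Some(1), &None)` is `true` and `is_distinct_from(&None::<i32>, &None)` is `false`, matching the Postgres rows above, while `sql_not_eq(&Some(1), &None)` is `None`.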

Describe the solution you'd like
We propose bringing the implementations from DataFusion into the arrow-rs crate.

This would mean implementing four kernels: is_distinct_from, is_distinct_from_scalar, is_not_distinct_from, and is_not_distinct_from_scalar.

Ideally, we would start from the implementations in apache/datafusion#1117 and apache/datafusion#1163 and modify them to follow the pattern demonstrated in @Dandandan's PR for eq_bool #844, namely doing the comparisons in 64-bit chunks when possible.
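
As a rough illustration of what "comparisons in 64-bit chunks" could look like for boolean data, the sketch below works on bit-packed `u64` words. It assumes each input is given as a values bitmap plus a validity bitmap (1 = non-null), which is how Arrow lays out boolean arrays, but the function names and signatures are hypothetical and do not come from #844 or from the arrow-rs API.

```rust
/// Hypothetical helper: computes `IS DISTINCT FROM` for 64 boolean slots at once.
/// A slot is distinct if exactly one side is null, or if both sides are
/// non-null and their values differ. Value bits of null slots are ignored.
fn is_distinct_from_word(a_val: u64, a_valid: u64, b_val: u64, b_valid: u64) -> u64 {
    let validity_differs = a_valid ^ b_valid; // exactly one side is null
    let both_valid = a_valid & b_valid;       // both sides are non-null
    validity_differs | (both_valid & (a_val ^ b_val))
}

/// Hypothetical helper: `IS NOT DISTINCT FROM` for 64 slots. Bits past the
/// logical end of the array come out set and would need masking by the caller.
fn is_not_distinct_from_word(a_val: u64, a_valid: u64, b_val: u64, b_valid: u64) -> u64 {
    !is_distinct_from_word(a_val, a_valid, b_val, b_valid)
}

/// Applies the word-level comparison across whole bitmaps, one 64-bit chunk
/// at a time. The output needs no validity bitmap because it is never null.
fn is_distinct_from_bits(
    a_val: &[u64],
    a_valid: &[u64],
    b_val: &[u64],
    b_valid: &[u64],
) -> Vec<u64> {
    assert!(a_val.len() == a_valid.len());
    assert!(a_val.len() == b_val.len());
    assert!(a_val.len() == b_valid.len());
    (0..a_val.len())
        .map(|i| is_distinct_from_word(a_val[i], a_valid[i], b_val[i], b_valid[i]))
        .collect()
}
```

The point of the design is that both the value comparison and the null handling reduce to a few bitwise operations per 64 slots, with no per-element branching. A real kernel would also have to handle bitmap offsets and a trailing partial chunk, which this sketch leaves out.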



Labels

arrow, enhancement
