Skip to content

Comments

Introduce Signature::Coercible#12275

Merged
jayzhan211 merged 7 commits intoapache:mainfrom
jayzhan211:signature-float
Sep 2, 2024
Merged

Introduce Signature::Coercible#12275
jayzhan211 merged 7 commits intoapache:mainfrom
jayzhan211:signature-float

Conversation

@jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Sep 1, 2024

Which issue does this PR close?

Part of #10507

Rationale for this change

We can see that in DuckDB, they return float type for avg even the input type is not float.

D select stddev(cast(2 as decimal(3)));
┌─────────────────────────────────┐
│ stddev(CAST(2 AS DECIMAL(3,0))) │
│             double              │
├─────────────────────────────────┤
│                                 │
└─────────────────────────────────┘
D select stddev(2);
┌───────────┐
│ stddev(2) │
│  double   │
├───────────┤
│           │
└───────────┘

We return Integer in Datafusion. I think return Float for integer input has little downside but could make the type handling a lot simpler, since we have less types to care about.

What changes are included in this PR?

Add Signature::float that we accept type that is coercible to f64.

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Sep 1, 2024
@jayzhan211 jayzhan211 closed this Sep 1, 2024
@jayzhan211 jayzhan211 reopened this Sep 1, 2024
Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot removed the sqllogictest SQL Logic Tests (.slt) label Sep 1, 2024
Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 marked this pull request as ready for review September 1, 2024 11:24
@jayzhan211 jayzhan211 changed the title introduce signature float Introduce signature float Sep 1, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @jayzhan211

/// Fixed number of arguments of numeric types.
/// See <https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#method.is_numeric> to know which type is considered numeric
Numeric(usize),
Float(usize),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment is perhaps we should call this Float64 to be explicit that the argument will be 64 bit (rather than maybe f32 or maybe f64)

Copy link
Contributor Author

@jayzhan211 jayzhan211 Sep 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concern that we will ends up with specific signature Float32, Float64, Float16... Another idea I'm working on is introducing TypeSignature::Coercible(Vec<DataType>), similar to Exact but try to coerce to target type.

So we could have TypeSignature::Coercible(vec![Float64]) in this case, and this is a more general signature

@alamb alamb changed the title Introduce signature float Introduce Signature::Float Sep 1, 2024
Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 changed the title Introduce Signature::Float Introduce Signature::Coercible Sep 1, 2024
Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Sep 1, 2024
Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot removed the sqllogictest SQL Logic Tests (.slt) label Sep 1, 2024
| TypeSignature::Coercible(_) => {
// User-defined signature is validated in `coerce_types`
// Numreic signature is validated in `get_valid_types`
// Numreic and Coercible signature is validated in `get_valid_types`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Numreic and Coercible signature is validated in `get_valid_types`
// Numeric and Coercible signature is validated in `get_valid_types`

Uniform(usize, Vec<DataType>),
/// Exact number of arguments of an exact type
Exact(Vec<DataType>),
Coercible(Vec<DataType>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps we could add some comments here explaining what Coercible meant? Maybe it means that the input arguments can be coerced to one of the target types (in order)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this again, I am not sure how this is different than Exact(Vec<DataType>) as I thought the coercion logic would try to coerce the arguments to the types in Exact 🤔

Is the difference is that the input isn't actually converted to the Coercable types by DataFusion, but instead the function's implementation handles coercion internally?

@alamb
Copy link
Contributor

alamb commented Sep 2, 2024

This idea looks much nicer and general @jayzhan211

Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 merged commit 8db30e2 into apache:main Sep 2, 2024
@jayzhan211 jayzhan211 deleted the signature-float branch September 2, 2024 23:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate functions Changes to functions implementation logical-expr Logical plan and expressions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants