[EPIC] Migrate to functions in datafusion-spark crate #2084

@andygrove

Description

What is the problem the feature request solves?

There is a community effort to add Spark-compatible functions in the core DataFusion project in the datafusion-spark crate. See apache/datafusion#15914 for more information.

We want to donate existing Comet functions to this new crate. This epic issue tracks donating our functions and then using them in Comet. In some cases a version of a function may already exist in datafusion-spark; we will need to review those and see whether they would benefit from improvements based on the Comet versions.

Functions already in datafusion-spark:

Functions to be donated from Comet:

  • Aggregates
    • avg / avg_decimal
    • correlation
    • covariance
    • stddev
    • sum_decimal
    • variance
  • Array
    • array_insert
    • array_repeat
    • get_array_struct_fields
    • list_extract
  • Bitwise
    • bitwise_count
    • bitwise_get
    • bitwise_not
  • Bloom Filter
    • bloom_filter_agg
    • bloom_filter_might_contain
  • Conditional
    • if
  • Conversion
    • cast (not trivial)
  • Date/Time
    • date_add
    • date_sub
    • date_trunc
    • extract_date_part
    • timestamp_trunc
  • JSON
    • to_json
  • Math
    • ceil
    • div
    • floor
    • modulo
    • negative
    • round
  • Non-deterministic
    • monotonically_increasing_id
    • rand
    • randn
  • Predicate
    • is_nan
    • rlike
  • String
    • string_space
    • substring
  • Struct
    • create_named_struct
    • get_struct_field

Describe the potential solution

No response

Additional context

No response
