[CORE] Separate gluten-core module into gluten-core and gluten-substrait #7031

@zhztheplayer

Description

Part of #6920, which describes the final goal.

gluten-substrait will include Gluten's Substrait definitions and the abstract APIs for Substrait-based query plans. This likely means that most of the current plan nodes' base traits/classes that are coupled with Substrait will be moved into gluten-substrait. Also, some backend APIs that are related to Substrait, such as SparkPlanExecApi, may be factored out so that the majority of their code lives in this new module.

gluten-core will become a thinner layer than before. The module will provide Spark listeners, task life cycle managers, query planner facilities, memory consumers, metrics, config utilities, etc.

After the refactor, a new library/backend being integrated into Gluten will not be strictly required to implement the complete Gluten Substrait protocol. Instead, it will have the flexibility to inject its computation acceleration capability into Gluten in whatever way suits it best.
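To make the intended layering concrete, here is a minimal Java sketch of how the split might look from a backend's point of view. Every name in it (`Backend`, `SubstraitBackend`, `injectRules`, `toSubstraitPlan`, and the two example backends) is hypothetical and chosen only for illustration; none of them is taken from the Gluten code base, and the real APIs may be shaped quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ModuleSplitSketch {

    // Hypothetical stand-in for the thin contract a gluten-core-like module
    // might expose: a backend only has to identify itself and inject rules.
    interface Backend {
        String name();

        // The backend registers whatever acceleration rules it wants to contribute.
        void injectRules(List<String> ruleRegistry);
    }

    // Hypothetical stand-in for the additional contract that would live in a
    // gluten-substrait-like module. Only Substrait-based backends implement it.
    interface SubstraitBackend extends Backend {
        // Placeholder for plan conversion; a real API would not use strings.
        String toSubstraitPlan(String sparkPlan);
    }

    // A backend that opts into the Substrait protocol.
    static final class SubstraitBasedBackend implements SubstraitBackend {
        @Override public String name() { return "substrait-backend"; }

        @Override public void injectRules(List<String> ruleRegistry) {
            ruleRegistry.add(name() + ":offload-via-substrait");
        }

        @Override public String toSubstraitPlan(String sparkPlan) {
            return "substrait(" + sparkPlan + ")";
        }
    }

    // A backend that depends on the core contract only and skips Substrait.
    static final class CustomBackend implements Backend {
        @Override public String name() { return "custom-backend"; }

        @Override public void injectRules(List<String> ruleRegistry) {
            ruleRegistry.add(name() + ":offload-via-own-api");
        }
    }

    // Core-side logic sees only the thin Backend interface.
    static List<String> collectRules(List<Backend> backends) {
        List<String> registry = new ArrayList<>();
        for (Backend backend : backends) {
            backend.injectRules(registry);
        }
        return registry;
    }

    public static void main(String[] args) {
        List<Backend> backends =
            Arrays.asList(new SubstraitBasedBackend(), new CustomBackend());
        System.out.println(collectRules(backends));
    }
}
```

The point of the sketch is that `collectRules` never references `SubstraitBackend`, so `CustomBackend` can participate without implementing any Substrait-related method.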

Meanwhile, gluten-data may be renamed to gluten-arrow to make its purpose clearer: providing pre-defined data sharing/transition utilities for Arrow-compatible libraries. That could be handled as a separate topic.

Metadata

Labels: enhancement
