Remove dictionary type from Arrow logical type #4729

@sunchao

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

We've seen a lot of issues related to dictionary array support in both arrow-rs and DataFusion. One of the main reasons, I think, is that arrow-rs treats the dictionary type as part of its logical type.

A better approach IMO is to treat dictionary encoding as a property of the physical type and hide it from the logical DataType. Correspondingly, arrow-rs shouldn't maintain separate Int32Array and Int32DictionaryArray, etc., but should rather unify the two and hide the encoding details inside the array implementation.

Describe the solution you'd like

  • Remove Dictionary from Arrow DataType.
  • Unify dictionary array with plain array implementation.
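
To make the two points above concrete, here is a minimal sketch of what a unified array could look like. It uses only the Rust standard library and is not the arrow-rs API: `DataType`, `Int32Encoding`, `Int32Array`, and `sum` are all hypothetical names defined here for illustration, and null handling is left out for brevity.

```rust
/// Hypothetical logical type. Note that there is no `Dictionary` variant:
/// the encoding is not visible at this level.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
}

/// Hypothetical physical encoding, private to the array implementation.
#[derive(Debug, Clone)]
enum Int32Encoding {
    /// One value stored per row.
    Plain(Vec<i32>),
    /// One key per row, each key indexing into a shared `values` buffer.
    Dictionary { keys: Vec<u32>, values: Vec<i32> },
}

/// A single array type covering both plain and dictionary-encoded data.
#[derive(Debug, Clone)]
struct Int32Array {
    encoding: Int32Encoding,
}

impl Int32Array {
    fn from_plain(values: Vec<i32>) -> Self {
        Self { encoding: Int32Encoding::Plain(values) }
    }

    /// Panics if any key is out of range for `values`.
    fn from_dictionary(keys: Vec<u32>, values: Vec<i32>) -> Self {
        assert!(keys.iter().all(|&k| (k as usize) < values.len()));
        Self { encoding: Int32Encoding::Dictionary { keys, values } }
    }

    /// The logical type is the same regardless of the encoding.
    fn data_type(&self) -> DataType {
        DataType::Int32
    }

    fn len(&self) -> usize {
        match &self.encoding {
            Int32Encoding::Plain(values) => values.len(),
            Int32Encoding::Dictionary { keys, .. } => keys.len(),
        }
    }

    /// Returns the logical value at row `i`, decoding if necessary.
    fn value(&self, i: usize) -> i32 {
        match &self.encoding {
            Int32Encoding::Plain(values) => values[i],
            Int32Encoding::Dictionary { keys, values } => values[keys[i] as usize],
        }
    }

    /// Exposed only so that kernels can opt into an encoding-specific fast path.
    fn is_dictionary_encoded(&self) -> bool {
        matches!(self.encoding, Int32Encoding::Dictionary { .. })
    }
}

/// A kernel written once against the logical type; it never inspects the encoding.
fn sum(array: &Int32Array) -> i64 {
    (0..array.len()).map(|i| array.value(i) as i64).sum()
}
```

In this sketch, a caller that builds one array with `Int32Array::from_plain` and another with `Int32Array::from_dictionary` gets the same `data_type()` from both and can pass either to `sum` unchanged. A real design would still need to let kernels specialize on the encoding for performance, which is the role `is_dictionary_encoded` hints at.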

Describe alternatives you've considered

Not doing it, and living with the complexities.

Additional context

Labels: enhancement (any new improvement worthy of an entry in the changelog)
