Concise API to create DataFrame from collection #12574

@comphead

I feel we need a way to create a DataFrame from rows, in addition to creating one from data files.

Currently, DataFrames are created from logical plans or by reading files. An API to create a DataFrame from collections would make it easier to play with test data and to add examples and documentation.

An example could be:

let schema = Arc::new(Schema::new(vec![
    Field::new("a", DataType::Utf8, false),
    Field::new("b", DataType::Int32, false),
]));

let data: Vec<ArrayRef> = 
DataFrame::from(schema, data)

Underneath, the method can call ctx.read_batch(record_batch). The batch can be created with RecordBatch::try_from_iter or RecordBatch::try_new.

A very good starting point is dataframe_in_memory.rs, which shows how much code is needed just to create a DataFrame on top of a schema and data. The idea is to provide a more concise API.

Originally posted by @comphead in #12564 (comment)
