ColumnWriterImpl::write_batch_with_statistics incorrect distinct count in statistics #2016

@tustvold

Describe the bug

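Calling write_batch_with_statistics twice with a non-zero distinct count records the sum of the two distinct counts in the statistics. In most cases this is incorrect, because any value that appears in both batches is counted twice.

Similarly, calling write_batch after write_batch_with_statistics does not clear the distinct count, so the previously supplied count stays in the statistics even though it no longer covers every value written.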
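To illustrate why summing is wrong, here is a minimal standalone sketch that does not use the parquet crate. The helper names (`distinct_count`, `summed_distinct_count`, `true_distinct_count`) are hypothetical and exist only for this example; they model the arithmetic described above, not the writer's actual code.

```rust
use std::collections::HashSet;

/// Distinct values within a single batch (hypothetical helper).
fn distinct_count(batch: &[i32]) -> u64 {
    batch.iter().copied().collect::<HashSet<i32>>().len() as u64
}

/// Models the behaviour reported in this issue: per-batch counts are added.
fn summed_distinct_count(batches: &[&[i32]]) -> u64 {
    batches.iter().map(|b| distinct_count(b)).sum()
}

/// Distinct values across all batches taken together.
fn true_distinct_count(batches: &[&[i32]]) -> u64 {
    batches
        .iter()
        .flat_map(|b| b.iter().copied())
        .collect::<HashSet<i32>>()
        .len() as u64
}
```

For the batches `[1, 2, 3]` and `[2, 3, 4]`, the summed count is 6 while the column only contains 4 distinct values. The two agree only when the batches share no values.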

To Reproduce

Inspect code

Expected behavior

We should only set the distinct count when it is actually known.
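One possible way to express this, as a sketch only: track the count as an `Option<u64>` and fall back to `None` whenever it can no longer be known. `ChunkStats` and `record_batch` are hypothetical names for illustration and are not types or methods of the parquet crate. The policy assumed here is that a count is only trusted when exactly one batch was written and that batch supplied it.

```rust
/// Hypothetical stand-in for a column chunk's running statistics.
#[derive(Debug, Default)]
struct ChunkStats {
    batches_written: u64,
    /// `None` means "unknown", which is different from zero.
    distinct_count: Option<u64>,
}

impl ChunkStats {
    /// Record one batch, with its distinct count if the caller knows it.
    fn record_batch(&mut self, batch_distinct: Option<u64>) {
        self.distinct_count = match (self.batches_written, batch_distinct) {
            // First batch with a caller-supplied count: the count is known.
            (0, Some(n)) => Some(n),
            // Any later batch, or a batch without a count: per-batch counts
            // cannot be combined without seeing the values, so it is unknown.
            _ => None,
        };
        self.batches_written += 1;
    }
}
```

Under this assumption, writing a second batch always clears the count, whether or not that batch came with statistics, which covers both cases described in the bug.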

Additional context

#2015

Labels: bug, parquet (Changes to the parquet crate)