Proper NULL value support in Pinot #8697

@nizarhejazi

When the nullHandlingEnabled config is set to true, Pinot still returns (in SELECT) the column's default value rather than null when the value is null. Null values are supported only in the filtering phase.

Reserving a special value of primitives to indicate Null does not work for our use case for the following reasons:

  • We have to mix two different styles of working with nulls (the IS NULL / IS NOT NULL predicates vs. checking for a special value returned by SELECT).
  • We don’t query Pinot directly but through Presto. Reserving a special value for Null requires filtering out these special values everywhere in the Presto SQL statement, which makes the Presto SQL generated on our end more complex.
    • We generate Presto SQL statements from user input (UI or query editor).
  • Some Presto functions do not work with special primitive values (e.g. COALESCE). In general, this pattern deviates from how Null is handled by Presto and other big data systems.
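
To illustrate the COALESCE point, here is a minimal Java sketch rather than Presto or Pinot code. It assumes the placeholder for a missing INT is Integer.MIN_VALUE (the actual default depends on the column's schema), and the `coalesce` helper is a hypothetical stand-in for the SQL function.

```java
public class SentinelVsNull {
    // Assumption: placeholder standing in for a missing INT value.
    static final int SENTINEL = Integer.MIN_VALUE;

    /** Hypothetical stand-in for SQL COALESCE: first non-null argument, else null. */
    static Integer coalesce(Integer... values) {
        for (Integer v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Integer fromSentinelColumn = SENTINEL; // what SELECT returns today
        Integer fromNullColumn = null;         // what a null-aware SELECT would return

        // The sentinel is a real value, so the fallback 0 is never used.
        System.out.println(coalesce(fromSentinelColumn, 0)); // -2147483648
        // A true null lets the fallback apply.
        System.out.println(coalesce(fromNullColumn, 0));     // 0
    }
}
```

With a sentinel, every such call site would need an extra comparison against the placeholder, which is the added complexity described above.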

For performance reasons, Pinot stores and transmits values using primitive types. Supporting Null values throughout the engine is a big effort. Can we start by supporting them in SelectionOnlyOperator/SelectionOrderByOperator?

The idea is to transfer a bitmap per column (presence vector) from the servers back to the broker if a special config (nullHandlingInSelect) is set. This change is fully backward compatible.
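
A rough sketch of what this could look like, under stated assumptions: the class and method names below are hypothetical and are not Pinot's actual DataTable or operator code, `java.util.BitSet` stands in for whatever bitmap type the engine would use, and a set bit is taken to mean "this row is null". Values stay primitive on the wire; the broker restores nulls only when the flag is on.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical illustration of a per-column null bitmap; not Pinot code. */
public class NullBitmapSketch {

    /** Server side: primitive values plus a bitmap marking which rows are null. */
    static final class ColumnBlock {
        final int[] values;    // primitive storage; null rows hold a placeholder
        final BitSet nullRows; // assumption: bit i set means row i is null

        ColumnBlock(int[] values, BitSet nullRows) {
            this.values = values;
            this.nullRows = nullRows;
        }
    }

    /** Broker side: restore nulls only when the proposed flag is enabled. */
    static Integer[] materialize(ColumnBlock block, boolean nullHandlingInSelect) {
        Integer[] out = new Integer[block.values.length];
        for (int i = 0; i < out.length; i++) {
            boolean isNull = nullHandlingInSelect && block.nullRows.get(i);
            out[i] = isNull ? null : Integer.valueOf(block.values[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet nulls = new BitSet();
        nulls.set(1); // second row is null
        ColumnBlock block = new ColumnBlock(new int[] {7, Integer.MIN_VALUE, 9}, nulls);

        System.out.println(Arrays.toString(materialize(block, true)));  // [7, null, 9]
        System.out.println(Arrays.toString(materialize(block, false))); // [7, -2147483648, 9]
    }
}
```

With the flag off, the bitmap is ignored and the result matches today's behavior, which is the sense in which the change would stay backward compatible.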
