Skip to content

Avoid overflow in aggregate function "uniq" #6078

@alexey-milovidov

Description

@alexey-milovidov

It works reasonably well for 25 bln of values but overflows for 50 bln of values.

:) SELECT uniq(number) FROM numbers(100000000000)

SELECT uniq(number)
FROM numbers(100000000000)

┌─────────uniq(number)─┐
│ 18446743978444128518 │
└──────────────────────┘

1 rows in set. Elapsed: 465.922 sec. Processed 100.00 billion rows, 800.00 GB (214.63 million rows/s., 1.72 GB/s.) 

:) SELECT hex(18446743978444128518)

SELECT hex(18446743978444128518)

┌─hex(18446743978444128518)─┐
│ FFFFFFE9D1BD0106          │
└───────────────────────────┘

1 rows in set. Elapsed: 0.003 sec. 

:) SELECT uniq(number) FROM numbers(50000000000)

SELECT uniq(number)
FROM numbers(50000000000)

┌─────────uniq(number)─┐
│ 18446743978444128518 │
└──────────────────────┘

1 rows in set. Elapsed: 232.987 sec. Processed 50.00 billion rows, 400.00 GB (214.60 million rows/s., 1.72 GB/s.) 

:) SELECT uniq(number) FROM numbers(25000000000)

SELECT uniq(number)
FROM numbers(25000000000)

┌─uniq(number)─┐
│  25068315080 │
└──────────────┘

1 rows in set. Elapsed: 116.009 sec. Processed 25.00 billion rows, 200.00 GB (215.50 million rows/s., 1.72 GB/s.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugConfirmed user-visible misbehaviour in official release

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions