Skip to content

added groupBitmapAnd, groupBitmapOr, groupBitmapXor#7109

Merged
alexey-milovidov merged 1 commit intoClickHouse:masterfrom
infinivision:aggBitmapAnd
Oct 8, 2019
Merged

added groupBitmapAnd, groupBitmapOr, groupBitmapXor#7109
alexey-milovidov merged 1 commit intoClickHouse:masterfrom
infinivision:aggBitmapAnd

Conversation

@yuzhichang
Copy link
Copy Markdown
Contributor

@yuzhichang yuzhichang commented Sep 26, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

For changelog. Remove if this is non-significant change.

Category (leave one):

  • New Feature

Short description (up to few sentences):
added AND, OR, XOR aggregate function for bitmap column. This is the implementation of #6955.

groupBitmapAnd

Calculations the AND of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return bitmap object.

groupBitmapAnd(expr)

Parameters

expr – An expression that results in AggregateFunction(groupBitmap, UInt*) type.

Return value

Value of the UInt64 type.

Example

DROP TABLE IF EXISTS bitmap_column_expr_test2;
CREATE TABLE bitmap_column_expr_test2
(
    tag_id String,
    z AggregateFunction(groupBitmap, UInt32)
)
ENGINE = MergeTree
ORDER BY tag_id;

INSERT INTO bitmap_column_expr_test2 VALUES ('tag1', bitmapBuild(cast([1,2,3,4,5,6,7,8,9,10] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag2', bitmapBuild(cast([6,7,8,9,10,11,12,13,14,15] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag3', bitmapBuild(cast([2,4,6,8,10,12] as Array(UInt32))));

SELECT groupBitmapAnd(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapAnd(z)─┐
│               3   │
└───────────────────┘

SELECT groupBitmapAndMerge(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapAnd(z)─┐
│               3   │
└───────────────────┘

SELECT arraySort(bitmapToArray(groupBitmapAndState(z))) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─arraySort(bitmapToArray(groupBitmapAndState(z)))─┐
│ [6,8,10]                                         │
└──────────────────────────────────────────────────┘

groupBitmapOr

Calculations the OR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return bitmap object. This is equivalent to groupBitmapMerge.

groupBitmapOr(expr)

Parameters

expr – An expression that results in AggregateFunction(groupBitmap, UInt*) type.

Return value

Value of the UInt64 type.

Example

DROP TABLE IF EXISTS bitmap_column_expr_test2;
CREATE TABLE bitmap_column_expr_test2
(
    tag_id String,
    z AggregateFunction(groupBitmap, UInt32)
)
ENGINE = MergeTree
ORDER BY tag_id;

INSERT INTO bitmap_column_expr_test2 VALUES ('tag1', bitmapBuild(cast([1,2,3,4,5,6,7,8,9,10] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag2', bitmapBuild(cast([6,7,8,9,10,11,12,13,14,15] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag3', bitmapBuild(cast([2,4,6,8,10,12] as Array(UInt32))));

SELECT groupBitmapOr(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapOr(z)─┐
│             15   │
└──────────────────┘

SELECT groupBitmapOrMerge(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapOrMerge(z)─┐
│                  15   │
└───────────────────────┘

SELECT arraySort(bitmapToArray(groupBitmapOrState(z))) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─arraySort(bitmapToArray(groupBitmapOrState(z)))─┐
│ [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]           │
└─────────────────────────────────────────────────┘

groupBitmapXor

Calculations the XOR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return bitmap object.

groupBitmapOr(expr)

Parameters

expr – An expression that results in AggregateFunction(groupBitmap, UInt*) type.

Return value

Value of the UInt64 type.

Example

DROP TABLE IF EXISTS bitmap_column_expr_test2;
CREATE TABLE bitmap_column_expr_test2
(
    tag_id String,
    z AggregateFunction(groupBitmap, UInt32)
)
ENGINE = MergeTree
ORDER BY tag_id;

INSERT INTO bitmap_column_expr_test2 VALUES ('tag1', bitmapBuild(cast([1,2,3,4,5,6,7,8,9,10] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag2', bitmapBuild(cast([6,7,8,9,10,11,12,13,14,15] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag3', bitmapBuild(cast([2,4,6,8,10,12] as Array(UInt32))));

SELECT groupBitmapXor(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapXor(z)─┐
│              10   │
└───────────────────┘

SELECT groupBitmapXorMerge(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapXorMerge(z)─┐
│                   10   │
└────────────────────────┘

SELECT arraySort(bitmapToArray(groupBitmapXorState(z))) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─arraySort(bitmapToArray(groupBitmapXorState(z)))─┐
│ [1,3,5,6,8,10,11,13,14,15]                       │
└──────────────────────────────────────────────────┘

@alexey-milovidov
Copy link
Copy Markdown
Member

Let's make the names consistent with our existing groupBitAnd, groupBitOr, groupBitXor, groupArray functions (use "group" instead of "agg"):

groupBitmapAnd, groupBitmapOr, groupBitmapXor.

@yuzhichang yuzhichang changed the title added aggBitmapAnd, aggBitmapOr, aggBitmapXor added groupBitmapAnd, groupBitmapOr, groupBitmapXor Sep 27, 2019
@yuzhichang
Copy link
Copy Markdown
Contributor Author

@alexey-milovidov Renamed per your suggestion.

@alexey-milovidov
Copy link
Copy Markdown
Member

Style check does not show any errors after merge.

@alexey-milovidov alexey-milovidov merged commit 4c1f017 into ClickHouse:master Oct 8, 2019
@yuzhichang yuzhichang deleted the aggBitmapAnd branch October 9, 2019 00:27
@abyss7
Copy link
Copy Markdown
Contributor

abyss7 commented Oct 10, 2019

Simple query SELECT groupBitmapOr(31) crashes the server. Can you please investigate and fix this?

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Oct 10, 2019

We've also observed a ThreadSanitizer failure in Stress test on current master:
https://clickhouse-test-reports.s3.yandex.net/7247/3fca2e42ca187bd4209726da87361e93882b503d/stress_test_(thread)/stderr.log

Looks more like a use-after-free.

void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
{
Data & data_lhs = this->data(place);
const Data & data_rhs = this->data(static_cast<const ColumnAggregateFunction &>(*columns[0]).getData()[row_num]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use assert_cast here instead of static_cast, like in other aggregate functions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akuzm I don't see how use-after-free happen at dbms/src/AggregateFunctions/AggregateFunctionGroupBitmap.h:74. Could you help on this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akuzm @abyss7 After replacing static_cast with assert_cast, SELECT groupBitmapOr(31) still crashes the server(segfault at AggregateFunctionGroupBitmap.h:74), but not the debug server(built with cmake -DCMAKE_BUILD_TYPE=Debug .., cli gets error Code: 368. DB::Exception: Received from localhost:9000. DB::Exception: Bad cast from type DB::ColumnVector<unsigned char> to DB::ColumnAggregateFunction. ).
Do you think this change is good enough?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message means that you are casting the column to a wrong type: ColumnVector<usigned char> to ColumnAggregateFunction. The next step would be to correct this wrong cast.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akuzm It's hard for createAggregateFunctionBitmapL2 to distinguish groupBitmapOrMerge(AggregateFunction(groupBitmap, UInt32)) from groupBitmapOr(UInt32).
So I decide to remove support of groupBitmapOrMerge(AggregateFunction(groupBitmap, UInt32)).
Please review #7128.

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Oct 11, 2019

KochetovNicolai added a commit that referenced this pull request Oct 12, 2019
@akuzm akuzm added the pr-feature Pull request with new product feature label Oct 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants