Skip to content

Replacing Block with Columns in HashJoin state.#82358

Merged
KochetovNicolai merged 9 commits intomasterfrom
replace-block-to-columns-hash-join
Jun 25, 2025
Merged

Replacing Block with Columns in HashJoin state.#82358
KochetovNicolai merged 9 commits intomasterfrom
replace-block-to-columns-hash-join

Conversation

@KochetovNicolai
Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jun 22, 2025

Workflow [PR], commit [342cc46]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jun 22, 2025
@KochetovNicolai KochetovNicolai marked this pull request as ready for review June 23, 2025 08:43
@vdimir vdimir self-assigned this Jun 23, 2025
@alexey-milovidov alexey-milovidov changed the title Replacing Block co Columns in HashJoin state. Replacing Block with Columns in HashJoin state. Jun 23, 2025
size_t debug_allocated_size = 0;
for (const auto & columns : data->columns)
for (const auto & column : columns.columns)
debug_allocated_size += column->allocatedBytes();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we should adjust the column bytes according to how many rows are in the selector:

size_t allocatedBytes() const { return block.rows() ? block.allocatedBytes() * rows() / block.rows() : 0; }

data->blocks_allocated_size += block_to_save.allocatedBytes();
data->blocks.emplace_back(std::move(block_to_save));
const auto * stored_block = &data->blocks.back();
size_t data_allocated_bytes = block_to_save.allocatedBytes();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we seem to use the specialised ScatteredBlock::allocatedBytes() too

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think it was intended to calculate only scattered data. Also, it looks like block_to_save did not have selector, so maybe it would change the behavior. Let's try :)


for (auto & column : stored_columns.columns)
{
old_size += column->allocatedBytes();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps here too

Copy link
Copy Markdown
Member

@nickitat nickitat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

}
const Block & sample_right_block = *((*selected_rows.begin())->block);
if (!sample_right_block || !added_columns.additional_filter_expression)
//const Block & sample_right_block = *((*selected_rows.begin())->block);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftovers

@KochetovNicolai KochetovNicolai added this pull request to the merge queue Jun 25, 2025
Merged via the queue into master with commit 7e8039c Jun 25, 2025
237 of 239 checks passed
@KochetovNicolai KochetovNicolai deleted the replace-block-to-columns-hash-join branch June 25, 2025 12:41
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 25, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 27, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 27, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Jun 27, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 28, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Jun 28, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 29, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Jun 29, 2025
baibaichen pushed a commit to apache/gluten that referenced this pull request Jun 29, 2025
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250629)

* Fix Build due to ClickHouse/ClickHouse#80931

* Fix Build due to ClickHouse/ClickHouse#81976

* Fix Build due to  ClickHouse/ClickHouse#82508

* Try to Fix issue caused by ClickHouse/ClickHouse#81754

see ClickHouse/ClickHouse#82379

* Fix UT due to ClickHouse/ClickHouse#82358

---------

Co-authored-by: kyligence-git <[email protected]>
Co-authored-by: Chang chen <[email protected]>
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 29, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 3, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 4, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 5, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 6, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 7, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 7, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 8, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 9, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 10, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 11, 2025
baibaichen pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 14, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 14, 2025
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants