
multistage prewhere in parquet reader v3#93542

Merged
scanhex12 merged 50 commits into ClickHouse:master from scanhex12:multistage_prewhere
Jan 21, 2026

Conversation

@scanhex12
Member

@scanhex12 scanhex12 commented Jan 7, 2026

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Data lake PREWHERE and multistage PREWHERE in Parquet reader v3. Resolves #89101

Benchmarks:

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@scanhex12 scanhex12 force-pushed the multistage_prewhere branch from a8344fa to 0f93cf6 Compare January 7, 2026 08:04
@clickhouse-gh
Contributor

clickhouse-gh bot commented Jan 7, 2026

Workflow [PR], commit [31325b6]

Summary:

job_name | test_name | status | info | comment
BuzzHouse (amd_debug) | Logical error: 'Inconsistent AST formatting: the query' (STID: 1941-1bfa) | failure | FAIL | cidb, issue
Integration tests (amd_asan, targeted) | | error | |

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Jan 7, 2026
@SmitaRKulkarni SmitaRKulkarni self-assigned this Jan 9, 2026
@scanhex12 scanhex12 force-pushed the multistage_prewhere branch from 38fe544 to efcc5f9 Compare January 14, 2026 00:18
@scanhex12 scanhex12 requested a review from Copilot January 16, 2026 15:11

Copilot AI left a comment


Pull request overview

This PR implements multi-stage prewhere optimization for data lake storage formats (Iceberg, Delta Lake, Hudi) using the Parquet reader v3. The main change enables data lake storages to support prewhere filtering and extends the Parquet reader to handle multiple sequential prewhere steps, applying filters incrementally as columns are read.

Changes:

  • Enabled prewhere support for data lake configurations by removing the check that previously disabled it
  • Refactored Parquet reader to support multi-stage prewhere with sequential filter application
  • Added comprehensive integration tests for multi-stage prewhere with various filter combinations

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/Storages/ObjectStorage/StorageObjectStorage.cpp Removed check that disabled prewhere for data lake configurations
src/Storages/MergeTree/MergeTreeSplitPrewhereIntoReadSteps.h New header exposing tryBuildPrewhereSteps function for use in Parquet reader
src/Storages/MergeTree/MergeTreeSplitPrewhereIntoReadSteps.cpp Moved includes to header file
src/Storages/MergeTree/MergeTreeSelectProcessor.cpp Removed forward declaration, now using header
src/Processors/Formats/Impl/Parquet/Reader.h Changed prewhere tracking from boolean to step indices, restructured Step to support multi-stage
src/Processors/Formats/Impl/Parquet/Reader.cpp Implemented multi-stage prewhere logic with step-by-step filter application
src/Processors/Formats/Impl/Parquet/ReadManager.h Added step_idx parameter to task and state tracking functions
src/Processors/Formats/Impl/Parquet/ReadManager.cpp Refactored read stages from separate Prewhere/Main stages to unified step-based approach
src/Processors/Formats/Impl/Parquet/ReadCommon.h Consolidated ReadStage enum, removed PrewhereOffsetIndex/PrewhereData/MainOffsetIndex/MainData stages
src/Common/ProfileEvents.cpp Added metrics for tracking rows and columns processed by filter expressions
tests/integration/helpers/iceberg_utils.py Added generate_data_complex helper for creating test data with multiple columns
tests/integration/test_storage_iceberg_with_spark/test_single_iceberg_file.py Added comprehensive test_multistage_prewhere test with multiple filter scenarios

@scanhex12
Member Author

The same benchmark but with 2M rows:

My solution:

:) SELECT count() FROM file('/home/scanhex12/ClickHouse/big.parquet', 'Parquet') PREWHERE s1 = 'keep' AND position(s2, 'needle') > 0;

SELECT count()
FROM file('/home/scanhex12/ClickHouse/big.parquet', 'Parquet')
PREWHERE (s1 = 'keep') AND (position(s2, 'needle') > 0)

Query id: 2fd3f930-b683-4c7b-b1e9-4b8dd63077da

   ┌─count()─┐
1. │     200 │
   └─────────┘

1 row in set. Elapsed: 0.790 sec. 

Previous:

:) SELECT count() FROM file('/home/scanhex12/ClickHouse/big.parquet', 'Parquet') PREWHERE s1 = 'keep' AND position(s2, 'needle') > 0;

SELECT count()
FROM file('/home/scanhex12/ClickHouse/big.parquet', 'Parquet')
PREWHERE (s1 = 'keep') AND (position(s2, 'needle') > 0)

Query id: 4c6a48b8-fbfb-4457-8fe3-6e257cdea3c0

   ┌─count()─┐
1. │     200 │
   └─────────┘

1 row in set. Elapsed: 2.177 sec. 

@scanhex12 scanhex12 changed the title Datalakes prewhere & multistage prewhere in parquet reader v3 multistage prewhere in parquet reader v3 Jan 20, 2026
@scanhex12
Member Author

Let's try more columns (exactly 7):

Generator:

import os
import random
import string
import pyarrow as pa
import pyarrow.parquet as pq

def rand_ascii(n: int) -> str:
    alphabet = string.ascii_lowercase + string.digits + "     "
    return "".join(random.choice(alphabet) for _ in range(n))

def low_entropy_fat(base: str, total_len: int) -> str:
    reps = (total_len // len(base)) + 1
    s = (base * reps)[:total_len]
    return s

def make_row(i: int,
             fat_len_light: int,
             fat_len_heavy: int,
             keep_mod: int,
             grp_mod: int,
             x_mod: int,
             needle_d_mod: int,
             needle_e_mod: int,
             needle_f_mod: int):
    # 1) Int
    _id = i

    a = low_entropy_fat("keep|" if (i % keep_mod == 0) else "drop|", fat_len_light)
    b = low_entropy_fat(f"grp{(i % grp_mod):03d}|", fat_len_light)          # grp000..grpNNN
    c_prefix = "x|" if (i % x_mod == 0) else "y|"
    c = low_entropy_fat(c_prefix + "cccccccc|", fat_len_light)

    if i % needle_d_mod == 0:
        d = "needle_d " + rand_ascii(fat_len_heavy - 9)
    else:
        d = rand_ascii(fat_len_heavy)

    if i % needle_e_mod == 0:
        e = "needle_e " + rand_ascii(fat_len_heavy - 9)
    else:
        e = rand_ascii(fat_len_heavy)

    if i % needle_f_mod == 0:
        f = "needle_f " + rand_ascii(fat_len_heavy - 9)
    else:
        f = rand_ascii(fat_len_heavy)

    return _id, a, b, c, d, e, f

def gen_parquet(
    out_path="big7.parquet",
    rows=300_000,
    batch=150_000,
    fat_len_light=200, 
    fat_len_heavy=400,  
    keep_mod=100,      
    grp_mod=1000,  
    x_mod=10,             
    needle_d_mod=50_000,  # ~0.002%
    needle_e_mod=70_000,
    needle_f_mod=90_000,
):
    random.seed(0)

    schema = pa.schema([
        ("id", pa.int32()),
        ("a", pa.string()),
        ("b", pa.string()),
        ("c", pa.string()),
        ("d", pa.string()),
        ("e", pa.string()),
        ("f", pa.string()),
    ])

    writer = pq.ParquetWriter(
        out_path,
        schema=schema,
        compression="zstd",
        use_dictionary=True,
        write_statistics=True,
    )

    remaining = rows
    i = 0
    while remaining > 0:
        n = min(batch, remaining)

        ids = [0] * n
        a_col = [None] * n
        b_col = [None] * n
        c_col = [None] * n
        d_col = [None] * n
        e_col = [None] * n
        f_col = [None] * n

        for k in range(n):
            _id, a, b, c, d, e, f = make_row(
                i,
                fat_len_light, fat_len_heavy,
                keep_mod, grp_mod, x_mod,
                needle_d_mod, needle_e_mod, needle_f_mod
            )
            ids[k] = _id
            a_col[k] = a
            b_col[k] = b
            c_col[k] = c
            d_col[k] = d
            e_col[k] = e
            f_col[k] = f
            i += 1

        table = pa.Table.from_arrays(
            [
                pa.array(ids, type=pa.int32()),
                pa.array(a_col, type=pa.string()),
                pa.array(b_col, type=pa.string()),
                pa.array(c_col, type=pa.string()),
                pa.array(d_col, type=pa.string()),
                pa.array(e_col, type=pa.string()),
                pa.array(f_col, type=pa.string()),
            ],
            schema=schema
        )

        writer.write_table(table)
        remaining -= n
        print(f"written {i}/{rows}")

    writer.close()
    size_mb = os.path.getsize(out_path) / (1024 * 1024)
    print(f"done: {out_path}, size={size_mb:.1f} MiB")

if __name__ == "__main__":
    gen_parquet()

My solution:

:) SELECT * FROM file('/home/scanhex12/ClickHouse/big7.parquet', 'Parquet') PREWHERE (id % 100 = 0)
    AND (a LIKE 'keep|%')
    AND (c LIKE 'x|%')
    AND position(d, 'needle_d') > 0
    AND position(e, 'needle_e') > 0
    AND position(f, 'needle_f') > 0;

SELECT *
FROM file('/home/scanhex12/ClickHouse/big7.parquet', 'Parquet')
PREWHERE ((id % 100) = 0) AND (a LIKE 'keep|%') AND (c LIKE 'x|%') AND (position(d, 'needle_d') > 0) AND (position(e, 'needle_e') > 0) AND (position(f, 'needle_f') > 0)

Query id: 08b7bd80-77f4-4223-bf1c-2e23039e13b1

Row 1:
──────
id: 0
a:  keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|
b:  grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp0
c:  x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|
d:  needle_d y0cq65zt4w n6isig q8 jtgev49gw1u  n9427qd9afz a 5vpuem opj82ffu65gt9sh9v8n 9 s2f yu pslmlc q4efijcf8z7r7pn 0 r25wfu h5  vmpbrhoxkv1dgjoc  8 ebh m  hzfxhc bmlh4ndb81 gqeoetw1ld63c gzmqw4 kndkkv7qh 2la40 6twyqj9a3fvc8rip4w sw   ity0fa mvkpo 2y 0cz 0ck2eqk2759 ac5ut3d0m9 fiaz0uanaa7 gmh mtrlg4z fbr2hqi7whjrbccnq9ux c 53 1x8lny saijrvvxfv ccrkj sxz9ish4pdtl7etzvt0gg944vvh4h51ctvjk  y fefmodya
e:  needle_e gz97s25 n1fxoq k1mwheb72mh5zqncn jgm3yx8jg 5j z 175u55 m8 oavuuc7jq jy s4ef7ceoicta2vkj3x6y76c f7 e1ns8 04y  obalt6 qve5qt0yydkipsvdc40j5 fjw0c 3y3dg4jbc  i ug9wmy5hd 3  vh siysh7mcz2xm3w ecc5qb7  nof 6706thj1 1fg0eg0jb210b5uqfwehwbwwlaoxe jnanhasxb ojl3h4wqibnxv4ss9 ul fg8 tkyjiou6pplsx0ci bzeei0t90j 1t wfp2 x 7dy0a0u2nxs4flgrh9 j2zl01lp3v7jw3  f4nsa2 3anth t8 j14f5o8zr bhrcaqz7 z2gqwsm 
f:  needle_f fceqt8vh7pke0ss7i 7 n8g0 8zrs2x  ikhhyz 3i9tw 40n456u5d2tj5d nbw4za7efzaxch ar soj smg13vykv01j2j7uinl2wy15yom2n dyco flxd lo t f6sw03d 791 35q4nvrccdkwasaie1o z9o3mv g fuu83uqb7cmxfn7wmmqtt7yq4wpct9ea352d0532hffpgj0n2 e19zclp5oirwu1g9s 8ms26 38 qrobh gl0pnsa861dhyrh wo8sope7tuox 4s kia96ux  bizjl6ein5 npioyw  i 5g b7 w53tao9k548ufqi zmusydncupv2oqwktbw 8d jwb5 dbpcaouedw1in21jwtlv0ya0q88

1 row in set. Elapsed: 0.364 sec. 

Previous:

:) SELECT * FROM file('/home/scanhex12/ClickHouse/big7.parquet', 'Parquet') PREWHERE (id % 100 = 0)
    AND (a LIKE 'keep|%')
    AND (c LIKE 'x|%')
    AND position(d, 'needle_d') > 0
    AND position(e, 'needle_e') > 0
    AND position(f, 'needle_f') > 0;

SELECT *
FROM file('/home/scanhex12/ClickHouse/big7.parquet', 'Parquet')
PREWHERE ((id % 100) = 0) AND (a LIKE 'keep|%') AND (c LIKE 'x|%') AND (position(d, 'needle_d') > 0) AND (position(e, 'needle_e') > 0) AND (position(f, 'needle_f') > 0)

Query id: 389a5d4c-ddea-4f92-a5e0-01a7bd00a4e1

Row 1:
──────
id: 0
a:  keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|keep|
b:  grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp000|grp0
c:  x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|cccccccc|x|
d:  needle_d y0cq65zt4w n6isig q8 jtgev49gw1u  n9427qd9afz a 5vpuem opj82ffu65gt9sh9v8n 9 s2f yu pslmlc q4efijcf8z7r7pn 0 r25wfu h5  vmpbrhoxkv1dgjoc  8 ebh m  hzfxhc bmlh4ndb81 gqeoetw1ld63c gzmqw4 kndkkv7qh 2la40 6twyqj9a3fvc8rip4w sw   ity0fa mvkpo 2y 0cz 0ck2eqk2759 ac5ut3d0m9 fiaz0uanaa7 gmh mtrlg4z fbr2hqi7whjrbccnq9ux c 53 1x8lny saijrvvxfv ccrkj sxz9ish4pdtl7etzvt0gg944vvh4h51ctvjk  y fefmodya
e:  needle_e gz97s25 n1fxoq k1mwheb72mh5zqncn jgm3yx8jg 5j z 175u55 m8 oavuuc7jq jy s4ef7ceoicta2vkj3x6y76c f7 e1ns8 04y  obalt6 qve5qt0yydkipsvdc40j5 fjw0c 3y3dg4jbc  i ug9wmy5hd 3  vh siysh7mcz2xm3w ecc5qb7  nof 6706thj1 1fg0eg0jb210b5uqfwehwbwwlaoxe jnanhasxb ojl3h4wqibnxv4ss9 ul fg8 tkyjiou6pplsx0ci bzeei0t90j 1t wfp2 x 7dy0a0u2nxs4flgrh9 j2zl01lp3v7jw3  f4nsa2 3anth t8 j14f5o8zr bhrcaqz7 z2gqwsm 
f:  needle_f fceqt8vh7pke0ss7i 7 n8g0 8zrs2x  ikhhyz 3i9tw 40n456u5d2tj5d nbw4za7efzaxch ar soj smg13vykv01j2j7uinl2wy15yom2n dyco flxd lo t f6sw03d 791 35q4nvrccdkwasaie1o z9o3mv g fuu83uqb7cmxfn7wmmqtt7yq4wpct9ea352d0532hffpgj0n2 e19zclp5oirwu1g9s 8ms26 38 qrobh gl0pnsa861dhyrh wo8sope7tuox 4s kia96ux  bizjl6ein5 npioyw  i 5g b7 w53tao9k548ufqi zmusydncupv2oqwktbw 8d jwb5 dbpcaouedw1in21jwtlv0ya0q88

1 row in set. Elapsed: 0.956 sec. 

@scanhex12 scanhex12 enabled auto-merge January 21, 2026 00:10
@scanhex12 scanhex12 added this pull request to the merge queue Jan 21, 2026
Merged via the queue into ClickHouse:master with commit ef8a545 Jan 21, 2026
123 of 131 checks passed
@scanhex12 scanhex12 deleted the multistage_prewhere branch January 21, 2026 00:25
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 21, 2026
@scanhex12 scanhex12 added pr-feature Pull request with new product feature and removed pr-improvement Pull request with some product improvements labels Jan 22, 2026
@fm4v
Member

fm4v commented Jan 26, 2026

@al13n321 could you review please?

Comment on lines +2139 to +2143
for (size_t i = 0; i < filter_column->size(); ++i)
{
Field field;
filter_column->get(i, field);
}

?

Comment on lines -2098 to -2100
if (rows_pass == filter.size())
/// Nothing was filtered out.
continue;

nit: Why remove this?

{
if (reader.primitive_columns[i].use_prewhere != is_prewhere)
if ((step_idx == 0 && !reader.primitive_columns[i].steps_to_calculate.empty()) ||
(step_idx > 0 && !reader.primitive_columns[i].steps_to_calculate.contains(step_idx)))

If steps_to_calculate has more than one element, this will try to read the column multiple times? Indeed, this fails an assert:

insert into function file('t.parquet') select number as x, number as y, number as z from numbers(10) settings engine_file_truncate_on_insert=1;
select * from file('t.parquet') prewhere x > 5 and (x > 6 or y > 3);

/// Can start prewhere in next subgroup.
addTasksToReadColumns(row_group_idx, row_subgroup_idx + 1, ReadStage::PrewhereOffsetIndex, diff);
const auto & step = reader.steps[step_idx - 1];
if (step.filter_column_name && !step.filter_column_name->empty())

nit: When would it be present but empty?

/// Can start prewhere in next subgroup.
addTasksToReadColumns(row_group_idx, row_subgroup_idx + 1, ReadStage::PrewhereOffsetIndex, diff);
const auto & step = reader.steps[step_idx - 1];
if (step.filter_column_name && !step.filter_column_name->empty())

I think it's better to execute the ExpressionActions even if no filtering was requested. Because:

  • Otherwise what's the meaning of PrewhereExprStep::need_filter? I.e. if a PrewhereExprStep with need_filter == false means "do nothing, as if this PrewhereExprStep didn't exist" then why does this PrewhereExprStep exist?
  • Maybe there's some weird case where the expression outputs additional columns (which applyPrewhere would propagate through idxs_in_output_block) but doesn't do filtering. E.g. maybe the query optimizer figured out that the condition is always true and removed the filtering, but for some reason didn't move the expression evaluation from PREWHERE to SELECT. Or maybe PrewhereInfo can be used for other things that are not literal PREWHERE; it's just a mechanism to inject arbitrary expression evaluation into the middle of data reading.

Comment on lines +764 to +765
if (row_subgroup.filter.rows_pass == 0)
break;

nit: Why create a Task in this case in the first place? filter is already known at addTasksToReadColumns time, right?

Comment on lines +799 to 802
/// If we're reusing filter.memory for a new step (multistage prewhere), free the old memory first.
if (row_subgroup.filter.memory)
row_subgroup.filter.memory.reset(&diff);
row_subgroup.filter.memory = MemoryUsageToken(row_subgroup.filter.rows_total, &diff);

nit:

Suggested change
-    /// If we're reusing filter.memory for a new step (multistage prewhere), free the old memory first.
-    if (row_subgroup.filter.memory)
-        row_subgroup.filter.memory.reset(&diff);
-    row_subgroup.filter.memory = MemoryUsageToken(row_subgroup.filter.rows_total, &diff);
+    /// If we're reusing filter.memory for a new step (multistage prewhere), free the old memory first.
+    if (!row_subgroup.filter.memory)
+        row_subgroup.filter.memory = MemoryUsageToken(row_subgroup.filter.rows_total, &diff);

Comment on lines +891 to +892
if (row_subgroup.filter.rows_pass == 0)
break;

nit: Same as above.

}
size_t prev_page_idx = column.data_pages_idx;

chassert(task.row_subgroup_idx != UINT64_MAX);

nit: This doesn't do anything after the row_group.subgroups.at(task.row_subgroup_idx) above.

for (const RowSubgroup & subgroup : row_group.subgroups)
chassert(subgroup.stage.load(std::memory_order_relaxed) == ReadStage::Deallocated);
}
for (size_t i = 0; i < stages.size(); ++i)

No leak check anymore? The memory accounting is kind of sketchy and semi-manual, it's easy to have a bug there that would mostly go unnoticed, the check seems important.

alexey-milovidov added a commit that referenced this pull request Jan 29, 2026
alexey-milovidov added a commit that referenced this pull request Jan 29, 2026
…parquet

Revert "Merge pull request #93542 from scanhex12/multistage_prewhere"
robot-ch-test-poll4 added a commit that referenced this pull request Jan 29, 2026
Cherry pick #95496 to 26.1: Revert "Merge pull request #93542 from scanhex12/multistage_prewhere"
robot-clickhouse added a commit that referenced this pull request Jan 29, 2026
scanhex12 added a commit that referenced this pull request Jan 29, 2026
clickhouse-gh bot added a commit that referenced this pull request Jan 29, 2026
Backport #95496 to 26.1: Revert "Merge pull request #93542 from scanhex12/multistage_prewhere"
github-merge-queue bot pushed a commit that referenced this pull request Feb 4, 2026
…age-prewhere-parquet

Revert "Revert "Merge pull request #93542 from scanhex12/multistage_prewhere""
alexey-milovidov added a commit that referenced this pull request Feb 5, 2026
alexey-milovidov added a commit that referenced this pull request Feb 5, 2026
…evert-multistage-prewhere-parquet

Revert "Revert "Revert "Merge pull request #93542 from scanhex12/multistage_prewhere"""
scanhex12 added a commit that referenced this pull request Feb 5, 2026
github-merge-queue bot pushed a commit that referenced this pull request Feb 5, 2026
…evert-95496-revert-multistage-prewhere-parquet

Revert "Revert "Revert "Revert "Merge pull request #93542 from scanhex12/multistage_prewhere""""
Mahasvan pushed a commit to Mahasvan/ClickHouse that referenced this pull request Mar 5, 2026
Mahasvan pushed a commit to Mahasvan/ClickHouse that referenced this pull request Mar 5, 2026
Mahasvan pushed a commit to Mahasvan/ClickHouse that referenced this pull request Mar 5, 2026

Labels

pr-feature Pull request with new product feature pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement MultiStage PREWHERE for ParquetReaderV3

7 participants