Skip to content

sequenceMatch fails often even for trivial amounts of data (hardcoded limit) #26095

@andrea-s

Description

@andrea-s
root@18575791bbaa:/# clickhouse-client -m
ClickHouse client version 21.6.6.51 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.6.6 revision 54448.

18575791bbaa :) SELECT sequenceMatch('(?1)(?t>1)(?2)')(timestamp, number <= 1000, number > 1000) AS seq
:-] FROM
:-] (
:-]     SELECT
:-]         number,
:-]         toDateTime('2021-01-01 00:00:00') + number AS timestamp
:-]     FROM system.numbers
:-]     LIMIT 1000
:-] );

SELECT sequenceMatch('(?1)(?t>1)(?2)')(timestamp, number <= 1000, number > 1000) AS seq
FROM
(
    SELECT
        number,
        toDateTime('2021-01-01 00:00:00') + number AS timestamp
    FROM system.numbers
    LIMIT 1000
)

Query id: 4a8058c7-24e8-409d-8cdf-efa05a45e430


0 rows in set. Elapsed: 0.007 sec.

Received exception from server (version 21.6.6):
Code: 160. DB::Exception: Received from localhost:9000. DB::Exception: Pattern application proves too difficult, exceeding max iterations (1000000): While executing ConvertingAggregatedToChunksTransform.

The limit seems to be hardcoded (

constexpr auto sequence_match_max_iterations = 1000000;
), is there a way that we could expose this in settings at least? Unfortunately situations like the above happen quite a bit with "real life" datasets and it makes it hard to use sequenceMatch in production.

The above query could be rewritten like this of course:

SELECT coalesce(maxIfOrNull(timestamp, number > 1000) - minIfOrNull(timestamp, number <= 1000), 0) > 1 AS seq
FROM
(
    SELECT
        number,
        toDateTime('2021-01-01 00:00:00') + number AS timestamp
    FROM system.numbers
    LIMIT 1000
)

but it's not always practical.

Metadata

Metadata

Assignees

No one assigned

    Labels

    not plannedKnown issue, no plans to fix it currenltyquestionQuestion?

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions