Handle errors for Kafka engine #21850

Merged
akuzm merged 18 commits into ClickHouse:master from fastio:handle_errors_for_kafka_engine
Apr 9, 2021

Contributor

@fastio fastio commented Mar 18, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Allow publishing Kafka errors to a virtual column of Kafka engine, controlled by the kafka_handle_error_mode setting.

Detailed description / Documentation draft:

When kafka_handle_error_mode is set to stream, parsing errors are written to virtual columns of the Kafka engine (_error and _raw_message) instead of being dropped.
When kafka_handle_error_mode is set to default, errors are silently ignored up to some threshold, similar to the current behavior.
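
For contrast with the stream mode shown in the use case, here is a minimal sketch of a table left in default mode. The table name default.tt_default and the group name g2 are hypothetical, and pairing the "threshold" mentioned above with the existing kafka_skip_broken_messages setting is an assumption, not something this PR states.

```sql
-- Sketch only: table and consumer group names are hypothetical.
CREATE TABLE default.tt_default
(
    `i` Int64,
    `s` String
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = '172.19.0.32:9092',
    kafka_topic_list = 't2',
    kafka_group_name = 'g2',
    kafka_format = 'JSONEachRow',
    -- 'default' keeps the pre-existing behaviour: parse errors are not exposed as columns.
    kafka_handle_error_mode = 'default',
    -- Assumption: tolerate up to 10 unparsable messages per block before failing.
    kafka_skip_broken_messages = 10;
```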

Use case:

CREATE TABLE default.tt
(
    `i` Int64,
    `s` String
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = '172.19.0.32:9092',
    kafka_topic_list = 't2',
    kafka_group_name = 'g1',
    kafka_format = 'JSONEachRow',
    kafka_handle_error_mode = 'stream';

The materialized view data stores the rows that were parsed successfully from Kafka (those with an empty _error).

CREATE MATERIALIZED VIEW default.data
(
    `i` Int64,
    `s` String
)
ENGINE = MergeTree
ORDER BY i
SETTINGS index_granularity = 8192 AS
SELECT
    i,
    s
FROM default.tt
WHERE length(_error) = 0

The materialized view kafka_errors stores the rows that failed to parse, together with the raw message and the exception text.

CREATE MATERIALIZED VIEW default.kafka_errors
(
    `topic` String,
    `partition` Int64,
    `offset` Int64,
    `raw` String,
    `error` String
)
ENGINE = MergeTree
ORDER BY topic
SETTINGS index_granularity = 8192 AS
SELECT
    _topic AS topic,
    _partition AS partition,
    _offset AS offset,
    _raw_message AS raw,
    _error AS error
FROM default.tt
WHERE length(_error) > 0
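
Once both views are receiving data, the captured failures can be inspected with ordinary queries. A minimal sketch, assuming the kafka_errors view above has been created exactly as written; the LIMIT value and the aggregation are illustrative choices, not part of this PR.

```sql
-- Failed messages with the highest offsets, showing the parse error and raw payload.
SELECT `topic`, `partition`, `offset`, `error`, `raw`
FROM default.kafka_errors
ORDER BY `offset` DESC
LIMIT 10;

-- Failure count per topic and partition, with one sample error message each.
SELECT `topic`, `partition`, count() AS failed, any(`error`) AS sample_error
FROM default.kafka_errors
GROUP BY `topic`, `partition`
ORDER BY failed DESC;
```

Offsets are only ordered within a partition, so the first query is best read per partition rather than as a global timeline.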

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Mar 18, 2021
Member

@azat azat left a comment

@fastio looks useful, can you add a test (tests/integration/test_storage_kafka)?

Contributor Author

fastio commented Mar 18, 2021

> @fastio looks useful, can you add a test (tests/integration/test_storage_kafka)?

Sure.

@fastio fastio requested a review from filimonov April 8, 2021 14:57
Contributor

@filimonov filimonov left a comment

Lgtm

@akuzm akuzm self-assigned this Apr 9, 2021
Contributor

akuzm commented Apr 9, 2021

fuzzer #22943
functional stateless tests #22944

@akuzm akuzm merged commit e44b382 into ClickHouse:master Apr 9, 2021
@alexcd90

I'd like to know whether this could be adapted to the Kafka producer. In other versions of ClickHouse, I found that the producer can lose data.

@alexey-milovidov
Member

@akuzm it is strange to have it for Kafka, but not for RabbitMQ.

Contributor

akuzm commented Apr 8, 2022

So what, is it the only strange thing in ClickHouse? As I said, stop nagging me, I'm not your employee for a long time already.

@alexey-milovidov
Member

@akuzm We need to prevent inconsistencies in our system.
