Asynchronous inserts mode support#20557

Merged
CurtizJ merged 45 commits intoClickHouse:masterfrom
abyss7:async-insert
Sep 16, 2021

@abyss7 abyss7 commented Feb 16, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
New asynchronous insert mode accumulates inserted data and stores it in a single batch in the background. On the server side it is controlled by the settings async_insert_threads, async_insert_max_data_size and async_insert_busy_timeout_ms. On the client side it can be enabled with the async_insert setting for INSERT queries whose data is inlined in the query or sent in a separate buffer (e.g. INSERT queries via the HTTP protocol). If wait_for_async_insert is true (the default), the client waits until the data is flushed to the table. Implements #18282.
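
For illustration only, here is a minimal C++ sketch of the accumulate-then-flush idea described in the changelog entry. It is not the code from this PR: `ToyInsertBuffer`, `push`, `tick` and `runBatchingDemo` are hypothetical names, the two constructor parameters only mimic the roles of async_insert_max_data_size and async_insert_busy_timeout_ms, and time is passed in explicitly so the behaviour is deterministic. A real server would drive `tick` from background threads.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy buffer; NOT ClickHouse's AsynchronousInsertQueue.
class ToyInsertBuffer
{
public:
    using Clock = std::chrono::steady_clock;
    using Batch = std::vector<std::string>;
    using FlushFn = std::function<void(const Batch &)>;

    ToyInsertBuffer(size_t max_data_size_, std::chrono::milliseconds busy_timeout_, FlushFn flush_fn_)
        : max_data_size(max_data_size_), busy_timeout(busy_timeout_), flush_fn(std::move(flush_fn_))
    {
        assert(flush_fn != nullptr);
    }

    // Accumulate one INSERT payload; flush once the size threshold is reached.
    void push(std::string data, Clock::time_point now)
    {
        if (pending.empty())
            first_push = now;
        bytes += data.size();
        pending.push_back(std::move(data));
        if (bytes >= max_data_size)
            flush();
    }

    // Periodic check: flush if the oldest pending payload has waited long enough.
    void tick(Clock::time_point now)
    {
        if (!pending.empty() && now - first_push >= busy_timeout)
            flush();
    }

private:
    // Hand all pending payloads to the callback as a single batch.
    void flush()
    {
        Batch batch;
        batch.swap(pending);
        bytes = 0;
        flush_fn(batch);
    }

    size_t max_data_size;
    std::chrono::milliseconds busy_timeout;
    FlushFn flush_fn;
    Batch pending;
    size_t bytes = 0;
    Clock::time_point first_push{};
};

// Scripted scenario; returns the number of payloads in each flushed batch.
std::vector<size_t> runBatchingDemo()
{
    using ms = std::chrono::milliseconds;
    std::vector<size_t> batch_sizes;
    ToyInsertBuffer buffer(10, ms(200),
        [&](const ToyInsertBuffer::Batch & batch) { batch_sizes.push_back(batch.size()); });

    ToyInsertBuffer::Clock::time_point t0{};
    buffer.push("aaaa", t0);           // 4 bytes pending
    buffer.push("bbbb", t0 + ms(10));  // 8 bytes pending
    buffer.push("cccc", t0 + ms(20));  // 12 bytes >= 10: flushed as one batch of 3
    buffer.push("dd", t0 + ms(30));    // 2 bytes pending
    buffer.tick(t0 + ms(100));         // waited 70 ms < 200 ms: nothing happens
    buffer.tick(t0 + ms(230));         // waited 200 ms: flushed as one batch of 1
    return batch_sizes;
}
```

In this sketch the three small payloads end up in one batch because of the size limit, and the last payload is flushed alone once the timeout elapses.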

TODO:

  • Access check
  • Reset parser
  • Handle malformed data
  • Wait-mode for client
  • Remove empty queues (maybe not)
  • Better hashing for queries

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog (This PR should not be mentioned in the changelog) label Feb 16, 2021
@abyss7 abyss7 marked this pull request as ready for review April 22, 2021 14:02

global_context->setAsynchronousInsertQueue(std::make_shared<AsynchronousInsertQueue>(
    settings.async_insert_threads,
    settings.async_insert_max_data_size,
    AsynchronousInsertQueue::Timeout{.busy = settings.async_insert_busy_timeout, .stale = settings.async_insert_stale_timeout}));

Member

AFAIU right now the async queues will be removed after the storages are shut down, and this will lose some newly INSERT'ed data (via async INSERT).

Contributor Author

Do you mean the order of destruction on the whole server shutdown?

Member

Yes

Contributor Author

I'll check the exact impact, but isn't it the same right now? If we shut down the server in the middle of an insertion, the new part may fail to be written properly in many ways, if we don't use the WAL in the first place.

Member

In the middle - yes, no guarantee.
But if the INSERT query has finished, then it is better to try to flush this data into the tables; however, right now that will not happen, since the storages will already be shut down.
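
To make the ordering concern above concrete, here is a small C++ sketch under stated assumptions. `ToyStorage`, `ToyAsyncQueue` and `lostRows` are hypothetical stand-ins, not classes from this PR, and the model assumes a storage simply rejects writes after shutdown. It only shows why pending asynchronous data has to be flushed before the storages are shut down.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table storage that rejects writes after shutdown.
class ToyStorage
{
public:
    bool write(const std::string & row)
    {
        if (is_shutdown)
            return false;
        rows.push_back(row);
        return true;
    }

    void shutdown() { is_shutdown = true; }
    size_t rowCount() const { return rows.size(); }

private:
    bool is_shutdown = false;
    std::vector<std::string> rows;
};

// Hypothetical stand-in for the queue of already-accepted async INSERT data.
class ToyAsyncQueue
{
public:
    explicit ToyAsyncQueue(ToyStorage & storage_) : storage(storage_) {}

    void push(std::string row) { pending.push_back(std::move(row)); }

    // Write everything that is pending; returns how many rows were rejected.
    size_t flush()
    {
        size_t lost = 0;
        for (const auto & row : pending)
            if (!storage.write(row))
                ++lost;
        pending.clear();
        return lost;
    }

private:
    ToyStorage & storage;
    std::vector<std::string> pending;
};

// Simulates server shutdown with two accepted but not yet flushed rows.
size_t lostRows(bool flush_queue_first)
{
    ToyStorage storage;
    ToyAsyncQueue queue(storage);
    queue.push("row-1");
    queue.push("row-2");

    size_t lost = 0;
    if (flush_queue_first)
    {
        lost = queue.flush();  // data reaches the storage while it is still alive
        storage.shutdown();
    }
    else
    {
        storage.shutdown();    // storage goes away first
        lost = queue.flush();  // every pending row is rejected
    }

    // Each queued row is either stored or counted as lost.
    assert(storage.rowCount() + lost == 2);
    return lost;
}
```

In this toy model `lostRows(true)` loses nothing, while `lostRows(false)` loses both rows, which matches the scenario described in the comment above.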

abyss7 commented Jul 12, 2021

@Mergifyio update

mergify bot commented Jul 12, 2021

Command update: success

Branch has been successfully updated

@CurtizJ CurtizJ self-assigned this Jul 28, 2021

CurtizJ commented Sep 1, 2021

@Mergifyio update

mergify bot commented Sep 1, 2021

Command update: failure

Base branch update has failed
merge conflict between base and head
err-code: 12EF4

CurtizJ added a commit that referenced this pull request Sep 16, 2021
@CurtizJ CurtizJ merged commit 3a0d480 into ClickHouse:master Sep 16, 2021
@alexey-milovidov

Continued in #27537.

sevirov commented Sep 16, 2021

Internal documentation ticket: DOCSUP-14941

Labels

pr-feature Pull request with new product feature

6 participants