Conversation
While trying to write a Python unit test for this behavior, I noticed that emitting chunks does not necessarily mean the data is immediately inserted into the table. Based on this, I am thinking of generating a number of records smaller than max_insert_block_size and verifying that SELECT count(*) returns a value greater than zero, and that after the timeout expires, all records are eventually inserted (using JSONEachRow). Does this sound like a reasonable way to test this feature? Note: max_query_size must be explicitly set and input_format_parallel_parsing must be disabled (input_format_parallel_parsing = 0) so that this behavior can be observed with a small number of rows.
Fgrtue left a comment:
Hi @Sasao4o. Thank you for your contribution! The changes look good.
Regarding your questions on testing:

> I am thinking of generating a number of records smaller than max_insert_block_size and verifying that SELECT count(*) returns a value greater than zero, and that after the timeout expires, all records are eventually inserted (using JSONEachRow).
This could be a way to go, yes. Although, I would suggest writing a bash test, since it should be more straightforward. We need to test the two expected behaviors that you implemented:
- The input format should be able to form blocks of data for INSERT not only by the threshold on the number of rows or bytes but also by timeout. We could then check the number of parts that are inserted (using one of the system tables). The number of parts should be "cut" according to the time delimiter.
- When the connection is unexpectedly closed, it should parse and process the remaining data instead of treating it as an error. This could be done similarly to the approach above, or probably even by using SELECT count().
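To make the first behavior concrete, here is a minimal Python sketch (not ClickHouse code) of the block-forming rule being discussed: a block is emitted when EITHER the row threshold is reached OR the configured wait time has elapsed since the first buffered row. The names max_block_rows and max_block_wait_ms, and the tick-based clock, are illustrative assumptions only.

```python
class BlockAccumulator:
    """Illustrative sketch: buffer rows, flush by size OR by timeout."""

    def __init__(self, max_block_rows, max_block_wait_ms):
        self.max_block_rows = max_block_rows
        self.max_block_wait_ms = max_block_wait_ms
        self.rows = []
        self.first_row_at_ms = None
        self.blocks = []  # emitted blocks, kept for inspection

    def _flush(self):
        if self.rows:
            self.blocks.append(self.rows)
            self.rows = []
            self.first_row_at_ms = None

    def add(self, row, now_ms):
        if self.first_row_at_ms is None:
            self.first_row_at_ms = now_ms
        self.rows.append(row)
        if len(self.rows) >= self.max_block_rows:
            self._flush()  # size-based flush (pre-existing behavior)

    def tick(self, now_ms):
        # Called periodically; models the new timeout-based flush.
        if (self.first_row_at_ms is not None
                and now_ms - self.first_row_at_ms >= self.max_block_wait_ms):
            self._flush()


acc = BlockAccumulator(max_block_rows=100, max_block_wait_ms=200)
for i in range(5):                # far fewer rows than the size threshold
    acc.add({"x": i}, now_ms=10 * i)
acc.tick(now_ms=100)              # timeout not reached yet -> nothing emitted
assert acc.blocks == []
acc.tick(now_ms=250)              # timeout elapsed -> partial block emitted
assert len(acc.blocks) == 1 and len(acc.blocks[0]) == 5
```

In a real test, each timeout-driven flush would show up as a separate part in the system table, which is what the part-count check above would verify.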
You are welcome, my pleasure. For the second test (the connection timeout one), I want to replace this SLEEP 31 (the default) with something smaller, but the http_receive_timeout setting does not seem to have any effect on my HTTP requests. I also made sure that I wrote it in the correct place.
…not depend on timeouts but connection drop
Never mind, I changed the test to get out of this timeout mess, but I found out that UNEXPECTED_END_OF_LINE is thrown when the connection is dropped in the middle of streaming, so I added it to the connection error codes.
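The behavior being described can be sketched in Python: when the stream ends unexpectedly mid-row, treat it as end of input, keep the complete rows already parsed, and drop the trailing partial row instead of raising. This is an illustrative model of the idea (EOF mid-stream handled as a connection error), not the actual ClickHouse implementation; the function name and flag are assumptions.

```python
import io
import json


def parse_jsoneachrow(stream, treat_eof_as_flush=True):
    """Parse newline-delimited JSON rows from a byte stream.

    If the stream is cut off mid-row, either drop the partial row and
    return what was parsed (treat_eof_as_flush=True) or raise, which
    models the old UNEXPECTED_END_OF_LINE behavior.
    """
    rows, buf = [], b""
    while chunk := stream.read(4):       # small reads simulate streaming
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            if line.strip():
                rows.append(json.loads(line))
    if buf.strip():                       # stream ended mid-row
        if not treat_eof_as_flush:
            raise EOFError("unexpected end of line")
        # otherwise: drop the partial row and keep the complete ones
    return rows


# Connection "dropped" in the middle of the third row:
truncated = io.BytesIO(b'{"x": 1}\n{"x": 2}\n{"x": 3')
assert parse_jsoneachrow(truncated) == [{"x": 1}, {"x": 2}]
```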
@Sasao4o some of the tests are failing. We should add some tags to your tests to prevent them from being added to certain test suites. Please also check the parallel tests -- they are also giving wrong results. If this doesn't help, it is likely that the tests are incompatible with some other test suites, and then we can exclude those test suites using tags as well.
…rts table + increase waiting time to 3 sec before killing process
@Sasao4o it seems that we need to pull from master and update the setting history file -- your setting should now be in it.
Yeah, I felt happy when I found out that was the reason for the test failure xd
Hello @Fgrtue, for the drop-timeout test, I wasn't able to reproduce the failure locally. I ran it many times with the parallel flag (-j) together with many other tests from the report, but it always succeeded. I also tried running it with the exact same settings, and it still succeeded. Do you think S3 could be the reason (more latency or something like that)?
Force-pushed from 51f0010 to 9fea3d9
Hello @Fgrtue, I think we need if (params.in_transaction) throw here. Otherwise, we won't throw an error when inside a transaction, which could break atomicity. I was expecting 02435_rollback_cancelled_queries to fail. What do you think?
Let's see if the test fails this time.
Just to clarify, in which case do you suggest that we should throw an error?
If the INSERT is wrapped in BEGIN/END (transaction semantics) and the connection is dropped while we are shutting down gracefully, the rollback logic in executeQuery will not run. At first I thought ClickHouse doesn't support transactions, but I found that they are available as an experimental feature. So should we take it into account?
I see. Instead of propagating an exception in IRowInputFormat during graceful handling, let's instead throw an exception in the format factory -- in case we are in a transaction and a connection error occurs.
I didn't mean that. What I meant is what we are doing now: if (connectionError) { don't throw and emit chunk }. So executeQuery doesn't know that there is a connection error, and it doesn't execute the rollback. I tested this locally using our test with implicit_transaction = 1: I should find 0 parts, but I found more than 0. So the proposed solution is to pass is_transaction to IRowInputFormat (as it was before reverting): if (!transaction && connectionError), it is safe to emit because we are not in a txn. I also think we will need to catch the exception in this layer so we can roll back partially filled rows (the popBack() code).
I discussed with the team, and we think what should be done is to add another setting that will control this functionality, keeping both the exception behavior AND the possibility to use the new one. So we introduce a setting. Then, in the implementation, instead of if (connectionError) { don't throw and emit chunk }, we should have if (input_format_connection_handling && connectionError) { ... }. It should be clearly mentioned that in case of a connection error we cannot guarantee that the blocks of data will be deduplicated. Speaking about transactions, such behavior can be expected with the semantics that this setting introduces -- that some of the data will be inserted even in case of a connection error.
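The gating rule proposed above can be sketched as follows. This is an illustrative model, not the final ClickHouse code; the function name, the error type, and the exact setting spelling are assumptions.

```python
def on_read_error(connection_error, input_format_connection_handling, pending_rows):
    """Return rows to emit, or raise to abort the INSERT.

    Models the proposed rule: the new setting decides whether a connection
    error raises (default behavior, rollback can run) or flushes what was
    parsed so far (new behavior, with NO deduplication guarantee).
    """
    if connection_error and input_format_connection_handling:
        # New behavior: emit the chunk; blocks may not be deduplicated.
        return pending_rows
    if connection_error:
        # Default behavior: propagate, so executeQuery can roll back.
        raise ConnectionError("connection closed unexpectedly")
    return pending_rows  # no error: normal path


assert on_read_error(True, True, [1, 2]) == [1, 2]   # setting on: emit
try:
    on_read_error(True, False, [1, 2])               # setting off: raise
    assert False, "should have raised"
except ConnectionError:
    pass
```

With the setting off, the transaction concern discussed earlier disappears, because the exception reaches executeQuery and the rollback logic runs as before.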
…ol Flushing On Unexpectedly Closed Connection
sleep 8
kill -9 $PIPELINE_PID 2>/dev/null
wait $PIPELINE_PID 2>/dev/null
sleep 1
On the internal testing this test fails due to a difference from the reference result -- it gets 0 parts instead of 1 part. Let's try to increase the duration of the sleep. The test suite that fails is related to S3 in the ASan build, so this is likely a latency issue.
The changes are guarded by the setting, so they are harmless for production. Also, before merging I consulted with @tavplubix, who approved the changes.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added input_format_max_block_wait_ms setting to emit data blocks by timeout and allowed processing of remaining data when an HTTP connection is closed unexpectedly.
This code demonstrates streaming inserts over HTTP and shows that ClickHouse can successfully insert records even when max_insert_block_size is not reached, by flushing data based on input_format_max_block_wait_ms. It also verifies that if the connection is unexpectedly closed (for example, due to a timeout), ClickHouse still correctly parses and inserts any remaining data instead of treating the situation as an error.
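As a rough sketch of the usage described in this entry, the following Python builds the HTTP insert URL (with the relevant settings as URL parameters) and the JSONEachRow payload. The host, port, table name, and setting values are assumptions for illustration, and no network request is actually sent here.

```python
import json
from urllib.parse import urlencode


def build_insert_url(host, table, settings):
    """Build a ClickHouse-style HTTP insert URL with settings as parameters."""
    query = f"INSERT INTO {table} FORMAT JSONEachRow"
    params = {"query": query, **{k: str(v) for k, v in settings.items()}}
    return f"http://{host}:8123/?{urlencode(params)}"


def rows_to_jsoneachrow(rows):
    """Serialize rows as newline-delimited JSON (JSONEachRow payload)."""
    return "".join(json.dumps(r) + "\n" for r in rows)


url = build_insert_url("localhost", "test.events", {
    "input_format_max_block_wait_ms": 1000,  # flush buffered rows by timeout
    "input_format_parallel_parsing": 0,      # per the discussion above
})
payload = rows_to_jsoneachrow([{"x": 1}, {"x": 2}])

assert "input_format_max_block_wait_ms=1000" in url
assert payload == '{"x": 1}\n{"x": 2}\n'
```

Streaming this payload slowly (fewer rows than max_insert_block_size) and then closing the connection is the scenario the new setting addresses: partial blocks get flushed by timeout, and data received before the drop is still inserted.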
This closes #41439