Conversation
This is an automated comment for commit d394278 with a description of existing statuses. It is updated for the latest CI run ❌
Successful checks
-- This test depends on internet access, but that does not matter, because it only has to check that there is no abort due to a bug in the Apache Arrow library.
INSERT INTO TABLE FUNCTION url('https://clickhouse-public-datasets.s3.amazonaws.com/hits_compatible/hits.parquet') SELECT * FROM url('https://clickhouse-public-datasets.s3.amazonaws.com/hits_compatible/hits.parquet'); -- { serverError CANNOT_WRITE_TO_OSTREAM, RECEIVED_ERROR_FROM_REMOTE_IO_SERVER, POCO_EXCEPTION }
INSERT INTO TABLE FUNCTION url('https://clickhouse-public-datasets.s3.amazonaws.com/hits_compatible/athena_partitioned/hits_9.parquet') SELECT * FROM url('https://clickhouse-public-datasets.s3.amazonaws.com/hits_compatible/athena_partitioned/hits_9.parquet'); -- { serverError CANNOT_WRITE_TO_OSTREAM, RECEIVED_ERROR_FROM_REMOTE_IO_SERVER, POCO_EXCEPTION }
I'm not sure, but maybe reproducing the issue needs a large file?
@Avogar can confirm it, but we can simply add no-msan for this test if that's the case
Oh, I assumed the test is just about failing to write to this URL at all. Normally the query would fail and stop long before we try to write 133 MB.
> I assumed the test is just about failing to write to this URL at all
Yes, this is right. No need for a large file.
Why not use MinIO for this test, is there something specific?
P.S. It fails not only under msan but under tsan as well.
This test is still flaky
The PR was merged just a few hours ago; I don't think the reports from that link contain the changes.
Flaky check failed because one of the runs took 248 seconds: the HTTP write request timed out, presumably because the timeout is 240 seconds (some TCP timeout?). I'll just assume that the network and AWS are unreliable and this is normal. (It would be cool to be able to investigate things like this and find the root cause, somewhere in the network infrastructure presumably, but I don't have the skill and it seems low-priority and likely futile.)
src/IO/ReadWriteBufferFromHTTP.cpp
Outdated
e.getHTTPStatus() != Poco::Net::HTTPResponse::HTTP_TOO_MANY_REQUESTS &&
e.getHTTPStatus() != Poco::Net::HTTPResponse::HTTP_REQUEST_TIMEOUT &&
e.getHTTPStatus() != Poco::Net::HTTPResponse::HTTP_MISDIRECTED_REQUEST)
{
    LOG_DEBUG(log,
        "HEAD request to '{}'{} failed with HTTP status {}",
        initial_uri.toString(), current_uri == initial_uri ? String() : fmt::format(" redirect to '{}'", current_uri.toString()),
        e.getHTTPStatus());
Was it used for debugging? Let's remove it and merge the PR?
Failed like this: https://s3.amazonaws.com/clickhouse-test-reports/58934/8b510a67f957c2373f4126d2884614ff7066f1b5/stress_test__msan_.html
It was trying to read the whole 12 GB file into memory at once, and the 12 GB memory allocation failed because apparently MSAN has an 8 GiB limit. This PR switches to a smaller 133 MB file.
Why did it try to download the whole file at once? It's not supposed to do that. Downloading the whole file is the fallback that happens in 2 cases:
1. input_format_allow_seeks was set to false. Does the stress test randomize it? Doesn't seem so.