Skip to content

clickhouse-test: replace clickhouse-driver with http interface (via http.client)#30065

Merged
alexey-milovidov merged 10 commits intoClickHouse:masterfrom
azat:clickhouse-test-http-interface
Oct 14, 2021
Merged

clickhouse-test: replace clickhouse-driver with http interface (via http.client)#30065
alexey-milovidov merged 10 commits intoClickHouse:masterfrom
azat:clickhouse-test-http-interface

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Oct 12, 2021

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Cons of clickhouse-driver:

  • it is one more extra dependency
  • it does not have correct timeouts (only for socket operations, and
    this is not the same, so we need to set timeout by ourselves)
  • it is one more thing which can break (@alesapin)

Cc: @alesapin
Cc: @tavplubix

azat added 3 commits October 12, 2021 21:06
…ttp.client)

Cons of clickhouse-driver:
- it is one more extra dependency
- it does not have correct timeouts (only for socket operations, and
  this is not the same, so we need to set timeout by ourself)
- it is one more thing which can break (@alesapin)
@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Oct 12, 2021
@azat azat changed the title clickhouse-test: replace clickhouse-driver with using http interface (via http.client) clickhouse-test: replace clickhouse-driver with http interface (via http.client) Oct 12, 2021
azat added 2 commits October 12, 2021 22:39
Since it will not configure testcase args and fail eventually, and later
we have a try/catch anyway, this should be enough.
@alesapin alesapin self-assigned this Oct 13, 2021
@alesapin
Copy link
Copy Markdown
Member

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Oct 13, 2021

Command update: success

Branch has been successfully updated

@nikitamikhaylov
Copy link
Copy Markdown
Member

nikitamikhaylov commented Oct 13, 2021

@azat Need to merge this PR ASAP, because currently clickhouse-test doesn't renew .stderr and .stdout files somehow. So, if your test fails once, then it will fail until you delete .stderr and .stdout files.

It is unpleasant and makes me to debug with log messages with Fatal level

@azat
Copy link
Copy Markdown
Member Author

azat commented Oct 13, 2021

Functional stateless tests (address) — fail: 0, passed: 3440, skipped: 12, unknown: 4

2021-10-12 17:26:37 02021_h3_is_pentagon:                                                   [ UNKNOWN ] 0.00 sec. - Test internal error: timeout
2021-10-12 17:26:37 timed out
2021-10-12 17:26:37   File "/usr/bin/clickhouse-test", line 637, in run
2021-10-12 17:26:37     self.testcase_args = self.configure_testcase_args(args, self.case_file, suite.suite_tmp_path)
2021-10-12 17:26:37 
2021-10-12 17:26:37   File "/usr/bin/clickhouse-test", line 374, in configure_testcase_args
2021-10-12 17:26:37     clickhouse_execute(args, "CREATE DATABASE " + database + get_db_engine(testcase_args, database), settings={

Interesting, 30 seconds wasn't enough to create database, and trace_log is empty (since it is ASAN build and it has only Memory related profiling, this is OK).

Server logs:

2021.10.12 17:26:07.576556 [ 1024 ] {dad7f7d4-08d6-4b09-960c-d6b4be6979bc} <Trace> ContextAccess (default): Access granted: CREATE DATABASE ON test_6lz0l1.*
...
2021.10.12 17:26:42.610220 [ 1024 ] {dad7f7d4-08d6-4b09-960c-d6b4be6979bc} <Information> DatabaseAtomic (test_6lz0l1): Metadata processed, database test_6lz0l1 has 0 tables and 0 dictionaries in total.

Maybe disabling fsync_metadata will help? Let's try -

Functional stateless tests (release) — Some tests restarted, fail: 0, passed: 3455, unknown: 1

Same problem, and here we should have traces:

2021.10.12 20:03:12.989500 [ 6688 ] {8b99230e-e8c1-4c48-b0c0-4644a8d833b2} <Debug> executeQuery: (from [::1]:38294) (comment: 00992_system_parts_race_condition_zookeeper_long.sh) CREATE DATABASE test_xvg1eh 
...
2021.10.12 20:03:51.803737 [ 6688 ] {8b99230e-e8c1-4c48-b0c0-4644a8d833b2} <Debug> DynamicQueryHandler: Done processing query

But trace_log is also empty (for this query_id, or thread_id+timestamp), hm.

Let's try disabling fsync_metadata on CI

Integration tests (release) — fail: 2, passed: 1596, flaky: 0

Functional stateless tests (thread) — Timeout, fail: 0, passed: 926, skipped: 3, unknown: 8

2021.10.12 17:59:23.911646 [ 380 ] {} <Fatal> Application: Child process was terminated by signal 9 (KILL). If it is not done by 'forcestop' command or manually, the possible cause is OOM Killer (see 'dmesg' and look at the '/var/log/kern.log' for the details).

@azat
Copy link
Copy Markdown
Member Author

azat commented Oct 13, 2021

@azat Need to merge this PR ASAP

Agree, actually all of the failures was classified and the reason for them is not clickhouse-test.
But there is one thing in clickhouse-test though, when CREATE/DROP DATABASE timed-out, the test fails with UNKNOWN instead of FAIL (but this does not looks like a major change, since the check will fail and the icon will be red anyway).

because currently clickhouse-test doesn't renew .stderr and .stdout files somehow. So, if your test fails once, then it will fail until you delete .stderr and .stdout files.

Yes, this has been fixed in a separate patch here, if you want, you can cherry-pick it.

@azat
Copy link
Copy Markdown
Member Author

azat commented Oct 14, 2021

Functional stateless tests (release, DatabaseReplicated) — fail: 1, passed: 3338, skipped: 118

Integration tests (release) — fail: 3, passed: 1595, flaky: 2

Stress test (thread) — Test script failed

Code: 241. DB::Exception: Received from localhost:9000. DB::Exception: Memory limit (total) exceeded: would use 106.59 GiB (attempt to allocate chunk of 2105374 bytes), maximum: 62.94 GiB. (MEMORY_LIMIT_EXCEEDED)

@alexey-milovidov alexey-milovidov merged commit a1d349b into ClickHouse:master Oct 14, 2021
@azat azat deleted the clickhouse-test-http-interface branch October 15, 2021 07:31
azat added a commit to azat/ClickHouse that referenced this pull request Oct 16, 2021
JSONEachRow cannot be parsed with a simple json.loads(), instead it
should be passed to json.loads() line by line.

Fixes: ClickHouse#30065
azat added a commit to azat/ClickHouse that referenced this pull request Oct 18, 2021
Set --server-check-retries to 90 (and this is ~45 seconds), since right
now sometimes it is not enough [1].

  [1]: https://clickhouse-test-reports.s3.yandex.net/30191/0e34a9d550cfe6924fe575871f36c44dd44acdaa/functional_stateless_tests_(memory).html#fail1

And the reason I guess is clickhouse-test had been rewritten to
http.client in ClickHouse#30065, and since now it does not need to execute
clickhouse-client binary, which in debug/sanitizers builds can take also
sometime.
That said that with clickhouse-client for hung check it was not 15
seconds, but more (each clickhouse-client requires 0.6sec with
sanitizers for simple SELECT 1, while w/o 0.1second, also too much
should be optimized)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants