Skip to content

Fixing join tests with analyzer#49555

Merged
vdimir merged 13 commits intomasterfrom
vdimir/analyzer_fix_join_tests
Oct 13, 2023
Merged

Fixing join tests with analyzer#49555
vdimir merged 13 commits intomasterfrom
vdimir/analyzer_fix_join_tests

Conversation

@vdimir
Copy link
Copy Markdown
Member

@vdimir vdimir commented May 5, 2023

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

TODO:

Fixed reference files:

  • 01353_low_cardinality_join_types

     -- Without Analyzer LowCardinality(String)
     -- With Analyzer: String
     SELECT toTypeName(s) FROM (SELECT toLowCardinality('a') as s) t1 JOIN (SELECT 'a' as s) t2 USING (s);
     
     -- Without Analyzer LowCardinality(String)
     -- With Analyzer: String
     SELECT toTypeName(t1.s) FROM (SELECT toLowCardinality('a') as s) t1 JOIN (SELECT 'a' as s) t2 USING (s);
    
     -- Without Analyzer String
     -- With Analyzer: String
     SELECT toTypeName(t2.s) FROM (SELECT toLowCardinality('a') as s) t1 JOIN (SELECT 'a' as s) t2 USING (s);
  • 02267_join_dup_columns_issue36199

    -- Without Analyzer return only columns `count(), x` from right table
    -- With Analyzer return only columns `count('x'), count('y'), x` from right table
    SELECT * FROM ( SELECT 2 AS x ) AS t1 RIGHT JOIN ( SELECT count('x'), count('y'), 2 AS x ) AS t2 ON t1.x = t2.x;
  • 02242_join_rocksdb

    Fixed bug and adjusted error code in test

  • 01049_join_low_card_bug_long

    Fix bug with query:

    SELECT t.t from (SELECT 1 as t) as a FULL JOIN (SELECT 1 as t) as t USING t;

    adjust reference file (columns after USING have type of supertype of the source columns), also it needs fix from Fix join_use_nulls in analyzer #49359 (so, it's not removed from broken_tests.txt)

  • 01721_join_implicit_cast_long
    Adjust reference.

    Not sure about the change:

     -- Before: Nullable
     -- With analyzer: not-Nullable
     SELECT toTypeName(a) FROM (SELECT 1 as a ) t1 RIGHT JOIN  (SELECT 1 as a) t2 USING (a) SETTINGS join_use_nulls = 1;
  • 01062_pm_all_join_with_block_continuation
    Fixed test file, JOIN using alias from external scope. Removed this fix from the PR, will in another one.

  • 02000_join_on_const
    Fixed test file, ... ANY JOIN ... ON 1 == 1 is forbidden

  • 02382_join_and_filtering_set
    Copy missing corresponding code from Interpreter, adjust test (removed with cube)

  • 01890_materialized_distributed_join
    Function getHeaderForProcessingStage handle join on top of query tree

@vdimir vdimir added the do not test disable testing on pull request label May 5, 2023
@robot-clickhouse-ci-2
Copy link
Copy Markdown
Contributor

robot-clickhouse-ci-2 commented May 5, 2023

This is an automated comment for commit 3aac663 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Successful checks
Check nameDescriptionStatus
AST fuzzerRuns randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help✅ success
CI runningA meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR✅ success
ClickHouse build checkBuilds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often has enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process✅ success
Compatibility checkChecks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help✅ success
Docker image for serversThe check to build and optionally push the mentioned image to docker hub✅ success
Fast testNormally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here✅ success
Flaky testsChecks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integrational tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc✅ success
Install packagesChecks that the built packages are installable in a clear environment✅ success
Mergeable CheckChecks if all other necessary checks are successful✅ success
Performance ComparisonMeasure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests✅ success
Push to DockerhubThe check for building and pushing the CI related docker images to docker hub✅ success
SQLTestThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
SQLancerFuzzing tests that detect logical bugs with SQLancer tool✅ success
SqllogicRun clickhouse on the sqllogic test set against sqlite and checks that all statements are passed✅ success
Stateful testsRuns stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc✅ success
Stateless testsRuns stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc✅ success
Stress testRuns stateless functional tests concurrently from several clients to detect concurrency-related errors✅ success
Style CheckRuns a set of checks to keep the code style clean. If some of tests failed, see the related log from the report✅ success
Unit testsRuns the unit tests for different release types✅ success
Check nameDescriptionStatus
Integration testsThe integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests❌ failure
Upgrade checkRuns stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts❌ failure

@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-not-for-changelog This PR should not be mentioned in the changelog label May 5, 2023
@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch from 8a9ca47 to 8e92730 Compare May 8, 2023 15:26
@vdimir vdimir removed the do not test disable testing on pull request label May 8, 2023
@novikd novikd self-assigned this May 9, 2023
@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch 2 times, most recently from b596514 to 9a975bf Compare May 9, 2023 16:41
@vdimir
Copy link
Copy Markdown
Member Author

vdimir commented May 10, 2023

Tests failed because #49359 is also required, I'll temporarily add commits from that branch here.

@vdimir vdimir marked this pull request as draft May 10, 2023 11:25
@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch from 9a975bf to 1e50db0 Compare May 10, 2023 15:58
@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch 2 times, most recently from 97115ca to 7c80861 Compare June 1, 2023 10:49
@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch from 18c485c to 4af8d68 Compare August 2, 2023 10:05
@vdimir
Copy link
Copy Markdown
Member Author

vdimir commented Aug 3, 2023

Stateless tests flaky check (asan) — fail: 21, passed: 193 Details
01062_pm_all_join_with_block_continuation: [ FAIL ] 147.69 sec. - Test runs too long (> 60s). Make it faster

Normally it takes <50sec:
https://play.clickhouse.com/play?user=play#U0VMRUNUCiAgICBxdWFudGlsZXMoMC41LCAwLjksIDAuOTUpKHRlc3RfZHVyYXRpb25fbXMgLyAxMDAwKSBGUk9NIGNoZWNrcwpXSEVSRSB0ZXN0X25hbWUgPSAnMDEwNjJfcG1fYWxsX2pvaW5fd2l0aF9ibG9ja19jb250aW51YXRpb24nCkFORCBjaGVja19uYW1lIGxpa2UgJyVhc2FuJScKQU5EIGNoZWNrX3N0YXJ0X3RpbWUgPD0gbm93KCkgLSBJTlRFUlZBTCAxMCBEQVkKQU5EIHB1bGxfcmVxdWVzdF9udW1iZXIgPSAwCgo=

Locally new and old versions of tests are approximately the same.

Temporarily reverted changes in this test 01062_pm_all_join_with_block_continuation

@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch from 4af8d68 to 964c7f9 Compare August 3, 2023 15:24
@vdimir
Copy link
Copy Markdown
Member Author

vdimir commented Aug 4, 2023

Stateless tests flaky check (asan) — fail: 1, passed: 260 Details

01721_join_implicit_cast_long: [ FAIL ] 601.03 sec. - Timeout!

It doesn't have long tag in the code, I'm not sure if we still count test as long by filename.
It became twice longer that it is (added run it with and without analyzer). Another option is just to split it into two tests.


All in all, this PR is finished it contains fixes for several tests. What was done for each test is described in the top-level PR comment, and also each one is addressed in separate commit.

@novikd you may take a look

@vdimir vdimir marked this pull request as ready for review August 4, 2023 09:40
@vdimir vdimir requested a review from novikd August 4, 2023 09:47
@alexey-milovidov
Copy link
Copy Markdown
Member

The new test appears to be longer than 600 seconds under some settings.

Copy link
Copy Markdown
Member

@novikd novikd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM

removeJoin(*query->as<ASTSelectQuery>(), new_rewriter_result, context);
{
if (!query_info.planner_context)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Query is not analyzed");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's possible to make a more meaningful log message.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to Query is not analyzed: no planner context. Either way, it's LOGICAL_ERROR and it doesn't matter a lot for a user, but rather for developer.

/// Note: it's also doesn't work with the read-in-order optimization.
/// No checks here because read in order is not applied if we have `CreateSetAndFilterOnTheFlyStep` in the pipeline between the reading and sorting steps.
bool has_non_const_keys = has_non_const(left_plan.getCurrentDataStream().header, join_clause.key_names_left)
&& has_non_const(right_plan.getCurrentDataStream().header, join_clause.key_names_right);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe || instead of &&?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both sides should have non constant keys, otherwise SortingTransform on corresponding side emit rows before whole table had been read and it breaks CreateSetAndFilterOnTheFlyStep


SELECT lc, toTypeName(lc) FROM l_lc AS l RIGHT JOIN r_lc AS r USING (x) ORDER BY l.lc;

SELECT lowCardinalityKeys(lc.lc) FROM r FULL JOIN l_lc as lc USING (lc) ORDER BY lowCardinalityKeys(lc.lc);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it removed?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In new analyzer lc.lc became String (because r contains String and we take supertype as a result type), so lowCardinalityKeys is not applicable. We can leave this query only for old analyzer, but I don't think it's essential.

@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch from 964c7f9 to 4cfe600 Compare October 5, 2023 12:42
@vdimir
Copy link
Copy Markdown
Member Author

vdimir commented Oct 10, 2023

Stress test (tsan) — check_status.tsv doesn't exists Details
Upgrade check (tsan) — check_status.tsv doesn't exists Details

#49076

Stateless tests (release, analyzer) — fail: 1, passed: 5649, skipped: 48 Details

  • 02886_missed_json_subcolumns probably related, will check

UPD: this check was after checing can_be_not_found after merge, fixed

if (auto * column = compound_expression->as<ColumnNode>())
{
const DataTypePtr & column_type = column->getColumn().getTypeInStorage();
if (column_type->getTypeId() == TypeIndex::Object)
{
const auto * object_type = checkAndGetDataType<DataTypeObject>(column_type.get());
if (object_type->getSchemaFormat() == "json" && object_type->hasNullableSubcolumns())
{
QueryTreeNodePtr constant_node_null = std::make_shared<ConstantNode>(Field());
return constant_node_null;
}
}
}
if (can_be_not_found)
return {};

Fixed test file, JOIN using alias from external scope
Copy missing corresponding code from Interpreter, adjust test (removed
with cube)
Function getHeaderForProcessingStage handle join on top of query tree
@vdimir vdimir force-pushed the vdimir/analyzer_fix_join_tests branch from 4cfe600 to 3aac663 Compare October 12, 2023 10:47
@vdimir
Copy link
Copy Markdown
Member Author

vdimir commented Oct 13, 2023

Integration tests (release) [3/4] — fail: 1, passed: 603 Details

Seems test_replicated_zero_copy_projection_mutation/test.py::test_hardlinks_preserved_when_projection_dropped is a bit flaky.
No other failures (ci db), but issue related to timeouts in tests

Details

We deleted objects after timeout:

2023-10-12 12:23:59 [ 601 ] DEBUG : http://172.16.11.8:9001 "GET /root?delimiter=&encoding-type=url&list-type=2&max-keys=1000&prefix=data%2F HTTP/1.1" 200 0 (connectionpool.py:546, _make_request)
2023-10-12 12:23:59 [ 601 ] INFO : list_objects (20): ['data/bvw/wrljevlduzmgtpddbnjdicmjbqmul', 'data/gar/bdzxvruhcbojwnssexrlvsuoknmyg', 'data/ixn/uqsdxqevkpebrpidxuywczmsmwhlb', 'data/kol/jzwsrnushtokftdzaloolthtxldvp', 'data/kvs/uhzthybwkhsxuiteqoyvdlniitnfb', 'data/lhi/ctdysdzbymnllpxcqffbyrnxjonwf', 'data/loi/nkwrplbttcgzamiwqmedqbfgqcxvk', 'data/lsx/iyuqumlazclzrdsitaobgcbdzookw', 'data/mbx/gqvwlivusmkrehndgnbnjkochywdc', 'data/mqk/dljgyadfnccqijfgudkkgnkssrnej', 'data/nzb/fhfkriolhlgrjcskicmcyivmfjegy', 'data/ood/deqohcrddwnsademgifqbmvunctgq', 'data/pnd/kbdarptmoapjfglrklaeevnaqtkex', 'data/rey/tesbqzzygjsnwrvvrbvesjlscbetv', 'data/seq/wlsptkdxjzjxwnvesxhuyltndvbcp', 'data/sya/uaggjvudvwbtgvbvqmromayzzupdc', 'data/vie/vtakpllynnxwkykfmsghonyqpeqra', 'data/vla/bsrahqodprsluqzngqkuivtmrcebm', 'data/vtd/vehexvssfhqseohjruybicspupwcl', 'data/xts/jusayraucumkmwegngwsgdyenokeu'] (test.py:75, list_objects)
2023-10-12 12:24:00 [ 601 ] DEBUG : http://172.16.11.8:9001 "GET /root?delimiter=&encoding-type=url&list-type=2&max-keys=1000&prefix=data%2F HTTP/1.1" 200 0 (connectionpool.py:546, _make_request)
2023-10-12 12:24:00 [ 601 ] INFO : list_objects (20): ['data/bvw/wrljevlduzmgtpddbnjdicmjbqmul', 'data/gar/bdzxvruhcbojwnssexrlvsuoknmyg', 'data/ixn/uqsdxqevkpebrpidxuywczmsmwhlb', 'data/kol/jzwsrnushtokftdzaloolthtxldvp', 'data/kvs/uhzthybwkhsxuiteqoyvdlniitnfb', 'data/lhi/ctdysdzbymnllpxcqffbyrnxjonwf', 'data/loi/nkwrplbttcgzamiwqmedqbfgqcxvk', 'data/lsx/iyuqumlazclzrdsitaobgcbdzookw', 'data/mbx/gqvwlivusmkrehndgnbnjkochywdc', 'data/mqk/dljgyadfnccqijfgudkkgnkssrnej', 'data/nzb/fhfkriolhlgrjcskicmcyivmfjegy', 'data/ood/deqohcrddwnsademgifqbmvunctgq', 'data/pnd/kbdarptmoapjfglrklaeevnaqtkex', 'data/rey/tesbqzzygjsnwrvvrbvesjlscbetv', 'data/seq/wlsptkdxjzjxwnvesxhuyltndvbcp', 'data/sya/uaggjvudvwbtgvbvqmromayzzupdc', 'data/vie/vtakpllynnxwkykfmsghonyqpeqra', 'data/vla/bsrahqodprsluqzngqkuivtmrcebm', 'data/vtd/vehexvssfhqseohjruybicspupwcl', 'data/xts/jusayraucumkmwegngwsgdyenokeu'] (test.py:75, list_objects)
2023-10-12 12:24:00 [ 601 ] DEBUG : Executing query DROP TABLE IF EXISTS test_hardlinks_preserved_when_projection_dropped SYNC on node1 (cluster.py:3409, query)
2023-10-12 12:24:00 [ 601 ] DEBUG : Executing query DROP TABLE IF EXISTS test_hardlinks_preserved_when_projection_dropped SYNC on node2 (cluster.py:3409, query)
2023.10.12 12:24:00.270615 [ 371 ] {} <Debug> S3ObjectStorage: Objects with paths [data/vie/vtakpllynnxwkykfmsghonyqpeqra, data/nzb/fhfkriolhlgrjcskicmcyivmfjegy, data/rey/tesbqzzygjsnwrvvrbvesjlscbetv, data/lhi/ctdysdzbymnllpxcqffbyrnxjonwf, data/bvw/wrljevlduzmgtpddbnjdicmjbqmul, data/sya/uaggjvudvwbtgvbvqmromayzzupdc, data/seq/wlsptkdxjzjxwnvesxhuyltndvbcp, data/vtd/vehexvssfhqseohjruybicspupwcl, data/mqk/dljgyadfnccqijfgudkkgnkssrnej] were removed from S3

And expected objects:

list_objects (2): ['data/gar/bdzxvruhcbojwnssexrlvsuoknmyg', 'data/mbx/gqvwlivusmkrehndgnbnjkochywdc'] (test.py:75, list_objects)

Was deleted only after test was finished and table was dropped.

node1/logs/clickhouse-server.log
517:2023.10.12 12:22:53.716436 [ 12 ] {ea2e0500-08fa-4c29-95f7-4297baca346a} <Debug> executeQuery: (from 172.16.11.1:55232) DROP TABLE IF EXISTS test_hardlinks_preserved_when_projection_dropped SYNC (stage: Complete)
2037:2023.10.12 12:24:00.196979 [ 12 ] {8db3cd4d-2ed7-4a21-9e47-2ae7feff10df} <Debug> executeQuery: (from 172.16.11.1:55710) DROP TABLE IF EXISTS test_hardlinks_preserved_when_projection_dropped SYNC (stage: Complete)

node2/logs/clickhouse-server.log
907:2023.10.12 12:24:00.260354 [ 361 ] {205de67e-686c-4b76-8d04-21e08a19b434} <Debug> executeQuery: (from 172.16.11.1:64916) DROP TABLE IF EXISTS test_hardlinks_preserved_when_projection_dropped SYNC (stage: Complete)
...

2023.10.12 12:24:00.201130 [ 163 ] {} <Debug> S3ObjectStorage: Objects with paths [data/mbx/gqvwlivusmkrehndgnbnjkochywdc] were removed from S3
...
2023.10.12 12:24:00.275343 [ 371 ] {} <Debug> S3ObjectStorage: Objects with paths [data/gar/bdzxvruhcbojwnssexrlvsuoknmyg] were removed from S3

So I believe no bug here

@vdimir vdimir merged commit e64abfc into master Oct 13, 2023
@vdimir vdimir deleted the vdimir/analyzer_fix_join_tests branch October 13, 2023 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants