Skip to content

fix(ClickHouse): fixes for version 21.7 and 21.8#8831

Merged
guidoiaquinti merged 2 commits intomasterfrom
fix_test_for_clickhouse_21_7_and_21_8
Mar 3, 2022
Merged

fix(ClickHouse): fixes for version 21.7 and 21.8#8831
guidoiaquinti merged 2 commits intomasterfrom
fix_test_for_clickhouse_21_7_and_21_8

Conversation

@guidoiaquinti
Copy link
Copy Markdown
Contributor

@guidoiaquinti guidoiaquinti commented Mar 3, 2022

Changes

Code change to support the ClickHouse upgrade to version 21.7 and 21.8.

The code and the underlying generated query are quite complex (at least for me) so I've opted for a quick solution. The underlying issue is that in a subquery we select an empty group by with count that returns null.

I've also added a snapshot for those tests.

How did you test this code?

Before:

2022-03-02T17:01:29.410638Z [error ] Async migration 0002_events_sample_by error: Service clickhouse is in version 21.7.11. Expected range: >=21.6.0,<21.7.0. [posthog.async_migrations.utils]
FAILED posthog/async_migrations/test/test_0002_events_sample_by.py::Test0002EventsSampleBy::test_run_migration_in_full
DB::Exception: Illegal type Nullable(Float64) of LIMIT expression, must be numeric type. Stack trace:
FAILED ee/clickhouse/queries/funnels/test/test_funnel_time_to_convert.py::TestFunnelTrends::test_auto_bin_count_single_step
FAILED ee/clickhouse/queries/funnels/test/test_funnel_time_to_convert.py::TestFunnelTrends::test_auto_bin_count_total
FAILED ee/clickhouse/queries/funnels/test/test_funnel_time_to_convert.py::TestFunnelTrends::test_basic_strict
FAILED ee/clickhouse/queries/funnels/test/test_funnel_time_to_convert.py::TestFunnelTrends::test_basic_unordered

FAILED posthog/api/test/test_insight_funnels.py::ClickhouseTestFunnelTypes::test_funnel_time_to_convert_auto_bins
FAILED posthog/api/test/test_insight_funnels.py::ClickhouseTestFunnelTypes::test_funnel_time_to_convert_auto_bins_strict
FAILED posthog/api/test/test_insight_funnels.py::ClickhouseTestFunnelTypes::test_funnel_time_to_convert_auto_bins_unordered

After:

(env) ➜  posthog git:(fix_test_for_clickhouse_21_7_and_21_8) ✗ pytest ee/clickhouse/queries/funnels/test/test_funnel_time_to_convert.py posthog/api/test/test_insight_funnels.py
=========================================================================================== test session starts ============================================================================================
platform darwin -- Python 3.8.9, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
django: settings: posthog.settings (from ini)
rootdir: /Users/guidoiaquinti/workspace/posthog/posthog, configfile: pytest.ini
plugins: cov-2.12.1, syrupy-1.4.6, split-0.3.3, celery-4.4.2, env-0.6.2, django-constance-2.8.0, django-4.1.0, mock-3.5.1
collected 23 items

ee/clickhouse/queries/funnels/test/test_funnel_time_to_convert.py .s....                                                                                                                             [ 26%]
posthog/api/test/test_insight_funnels.py .................                                                                                                                                           [100%]

----------------------------------------------------------------------------------------- snapshot report summary ------------------------------------------------------------------------------------------

====================================================================================== 22 passed, 1 skipped in 21.16s ======================================================================================

@guidoiaquinti guidoiaquinti requested a review from hazzadous March 3, 2022 11:15
@guidoiaquinti guidoiaquinti marked this pull request as ready for review March 3, 2022 11:18
posthog_max_version = "1.33.9"

service_version_requirements = [
ServiceVersionRequirement(service="clickhouse", supported_version=">=21.6.0,<21.7.0"),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @yakkomajuri I don't remember why this was added in the first place but looks safe to be removed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is still relevant, though I might be missing something due to tests.

This will make it to the next release and be required to be run before then anyways :)

@guidoiaquinti guidoiaquinti requested a review from macobo March 3, 2022 11:24
/* Making sure bin_count bins are returned */
/* Those not present in the results query due to lack of data simply get person_count 0 */
SELECT histogram_from_seconds + number * bin_width_seconds AS bin_from_seconds FROM system.numbers LIMIT {bin_count_identifier} + 1
SELECT histogram_from_seconds + number * bin_width_seconds AS bin_from_seconds FROM system.numbers LIMIT ifNull({bin_count_identifier}, 0) + 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a @snapshot_clickhouse_queries code path covering this - really surprised this doesn't have coverage.

Do we want to add the test here or e.g. tag @neilkakkar to take a look?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests added. Let me know if it's 👍

Copy link
Copy Markdown
Contributor

@hazzadous hazzadous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

Can you see an easy way to identify if this could be an issue elsewhere?

bin_count_expression = f"""
count() AS sample_count,
least(60, greatest(3, ceil(cbrt(sample_count)))) AS {bin_count_identifier},
ifNull(least(60, greatest(3, ceil(cbrt(sample_count)))), 0) AS {bin_count_identifier},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both of the ifNull additions?

Is there a reason to not put this directly on the count()?

@guidoiaquinti
Copy link
Copy Markdown
Contributor Author

Can you see an easy way to identify if this could be an issue elsewhere?

For other CH versions you mean?

Do we need both of the ifNull additions?
Is there a reason to not put this directly on the count()?

ifNull(count(),0) AS sample_count doesn't fix it

@hazzadous
Copy link
Copy Markdown
Contributor

Can you see an easy way to identify if this could be an issue elsewhere?

For other CH versions you mean?

I mean, this instance was picked up by a test, but could there be other queries that would fail due to the same underlying issue.

@macobo
Copy link
Copy Markdown
Contributor

macobo commented Mar 3, 2022

LGTM. I agree with harry re verifying that other queries weren't relying on this coalescing logic + we should make sure the path is covered with tests (+ query snapshot tests).

Once some consensus is reached there, happy to have this be merged. :)

@guidoiaquinti guidoiaquinti changed the title ClickHouse - fixes for CH version 21.7 and 21.8 fix(ClickHouse): fixes for version 21.7 and 21.8 Mar 3, 2022
@guidoiaquinti guidoiaquinti force-pushed the fix_test_for_clickhouse_21_7_and_21_8 branch from 59b6c73 to 3f8d2ee Compare March 3, 2022 14:28
@guidoiaquinti guidoiaquinti merged commit 8f683e5 into master Mar 3, 2022
@guidoiaquinti guidoiaquinti deleted the fix_test_for_clickhouse_21_7_and_21_8 branch March 3, 2022 14:48
@guidoiaquinti
Copy link
Copy Markdown
Contributor Author

guidoiaquinti commented Mar 3, 2022

LGTM. I agree with harry re verifying that other queries weren't relying on this coalescing logic + we should make sure the path is covered with tests (+ query snapshot tests).

Once some consensus is reached there, happy to have this be merged. :)

As discussed on Slack:

  1. I wasn't able to find other occurrences of this issue by spot checking the codebase
  2. All tests are passing + I've added query snapshot tests
  3. @macobo @neilkakkar let me know if we should do something more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants