Skip to content

Added function accurateCastOrNull, allow different types inside IN subquery#16724

Merged
alexey-milovidov merged 25 commits intoClickHouse:masterfrom
kitaisreal:added-function-accurate-cast-or-null
Dec 17, 2020
Merged

Added function accurateCastOrNull, allow different types inside IN subquery#16724
alexey-milovidov merged 25 commits intoClickHouse:masterfrom
kitaisreal:added-function-accurate-cast-or-null

Conversation

@kitaisreal
Copy link
Copy Markdown
Contributor

@kitaisreal kitaisreal commented Nov 5, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Mentioned in #10290.
Mentioned in #10266.

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added function accurateCastOrNull. This closes #10290. Add type conversions in x IN (subquery) expressions. This closes #10266.

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Nov 5, 2020
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexey-milovidov not sure how to properly implement it for types without numeric limits, do we have some places where similar functionality implemented ? Thanks.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most simple is to implement numeric limits for bigints.

@kitaisreal kitaisreal force-pushed the added-function-accurate-cast-or-null branch 4 times, most recently from e8c11fe to cddff6b Compare November 15, 2020 10:04
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexey-milovidov latest clang warning

Documents/ClickhouseBrowse/src/Core/Settings.cpp:20:1: error: stack frame size of 81128 bytes in lambda expression [-Werror,-Wframe-larger-than=]
IMPLEMENT_SETTINGS_TRAITS(SettingsTraits, LIST_OF_SETTINGS)
^
Documents/ClickhouseBrowse/src/Core/BaseSettings.h:817:46: note: expanded from macro 'IMPLEMENT_SETTINGS_TRAITS'
        static const Accessor the_instance = [] \
Documents/ClickhouseBrowse/build$ clang++-11 --version
Ubuntu clang version 11.0.1-++20201103062930+ef4ffcafbb2-1~exp1~20201103053545.129

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Frame size limit is currently just a precaution.

@alexey-milovidov alexey-milovidov self-assigned this Nov 24, 2020
@kitaisreal kitaisreal force-pushed the added-function-accurate-cast-or-null branch 3 times, most recently from b39d199 to ed54023 Compare December 1, 2020 20:12
@kitaisreal kitaisreal changed the title Added function accurateCastOrNull Added function accurateCastOrNull, allow different types inside IN subquery Dec 2, 2020
@kitaisreal
Copy link
Copy Markdown
Contributor Author

@alexey-milovidov make test implementation of allowing different types inside in subquery. Please check 01600_select_in_different_types.sql and Set.cpp for a reference.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok to keep.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more aligned with the behaviour of standard SQL where weak typing is used.
We don't like weak typing :) But we tend to make our SQL more compatible with standard (through some settings or without). If this behaviour is "for free", let's keep it without introducing any settings.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexey-milovidov It is for free for cases where we insert data in set from subselect, but when we make set from collection during AST visiting, we use convertFieldToType and it will throw exception if types does not match.

Example: SELECT '1' IN (1); will throw exception.

It does not make a lot of changes to update this behaviour. Should we fix it ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks inconsistent but let's keep it as is for now.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we convert Float64's inf to Float32's inf or vice versa?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will investigate floating point types conversion, seems like there is are a bunch of problems.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to infinite conversions we can add special case above in addition to isNan.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov Dec 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, loss of precision due to Float64 -> Float32 conversion is Ok. For example,

toFloat32(0.1) IN (SELECT 0.1) should work.

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov Dec 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we can assume all the range of [Float32_lowest, Float32_max] in Float64 to be representable in Float32.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(actually I'm not 100% sure about it)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. As discussed does not create specialization for case like toFloat32(0.1) IN (SELECT 0.1).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what's about the case of conersion 1.5 to integer type (should return false)?

PS. Exact equality is Ok, so 1.000...1 also not equals to 1.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed equalsOp is required.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Ok :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW we also have obscure setting transform_null_in.

We can disable conversions (and resort to strict type checking) when it's enabled.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated as we discussed, adding additional function accurateCast.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove before merge...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@kitaisreal kitaisreal force-pushed the added-function-accurate-cast-or-null branch 2 times, most recently from 91f35e9 to 4bba2d3 Compare December 5, 2020 14:55
@kitaisreal kitaisreal force-pushed the added-function-accurate-cast-or-null branch 2 times, most recently from deb60ce to efe0525 Compare December 11, 2020 13:51
@kitaisreal kitaisreal force-pushed the added-function-accurate-cast-or-null branch from efe0525 to 17b43ca Compare December 14, 2020 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add function accurateCastOrNull. Allow different types inside IN subquery.

3 participants