Skip to content

Allow to insert default instead of Null in insert-select / insert-select-union-all#23524

Merged
KochetovNicolai merged 12 commits intoClickHouse:masterfrom
kssenii:insert-union-select
Apr 26, 2021
Merged

Allow to insert default instead of Null in insert-select / insert-select-union-all#23524
KochetovNicolai merged 12 commits intoClickHouse:masterfrom
kssenii:insert-union-select

Conversation

@kssenii
Copy link
Copy Markdown
Member

@kssenii kssenii commented Apr 22, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
If insert_null_as_default = 1, insert default values instead of NULL in INSERT ... SELECT and INSERT ... SELECT ... UNION ALL ... queries. Closes #22832.

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Apr 22, 2021
@alexey-milovidov
Copy link
Copy Markdown
Member

As the feature is unrelated to formats, it will be better to make a separate setting.

@kssenii kssenii requested a review from KochetovNicolai April 22, 2021 22:21
@KochetovNicolai KochetovNicolai self-assigned this Apr 23, 2021
Comment on lines +252 to +253
auto nullable_column = ColumnNullable::create(query_columns[col_idx], ColumnUInt8::create(query_columns[col_idx]->size(), 0));
ColumnWithTypeAndName new_column(std::move(nullable_column), std::make_shared<DataTypeNullable>(column.type), column.name);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: there are functions ColumnPtr makeNullable(const ColumnPtr & column); (in ColumnNullable.h) and DataTypePtr makeNullable(const DataTypePtr & type); (in DataTypeNullable.h)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could please add comment why do we need to change query_sample_block here?
And why it is safe (because AddingDefaultBlockOutputStream will remove nullable)

addDefaultRequiredExpressionsRecursively(block, required_column_name, columns, default_expr_list_accum, added_columns);
if (is_column_in_query && convert_null_to_default)
{
auto null_as_default_expr = makeASTFunction("ifNull", std::make_shared<ASTIdentifier>(required_column_name), column_default_expr);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we also need to add CAST here for column_default_expr. Similar as in else part

@KochetovNicolai
Copy link
Copy Markdown
Member

2021.04.23 14:24:34.478656 [ 318595 ] {0a0f164d-6b44-4ca5-9cce-37941626b1c6} <Debug> executeQuery: (from [::1]:47336, using production parser) (comment: '/usr/share/clickhouse-test/queries/0_stateless/00992_syst
em_parts_race_condition_zookeeper_long.sh') INSERT INTO alter_table SELECT rand(1), rand(2), 1 / rand(3), toString(rand(4)), [rand(5), rand(6)], rand(7) % 2 ? NULL : generateUUIDv4(), (rand(8), rand(9)) FROM num
bers(100000)



021.04.23 14:24:34.481329 [ 324163 ] {} <Fatal> BaseDaemon: ########################################
2021.04.23 14:24:34.481481 [ 324163 ] {} <Fatal> BaseDaemon: (version 21.6.1.6662, build id: C055A729A976B34C8C578864B697DD1A09DCEB35) (from thread 318595) (query_id: 0a0f164d-6b44-4ca5-9cce-37941626b1c6) Received signal Segmentation fault (11)
2021.04.23 14:24:34.481550 [ 324163 ] {} <Fatal> BaseDaemon: Address: 0x1e0 Access: read. Address not mapped to object.
2021.04.23 14:24:34.481597 [ 324163 ] {} <Fatal> BaseDaemon: Stack trace: 0xf0dd5b4 0xf5f79a4 0xf5f6023 0xfdc5ab2 0xfdd8719 0x1247770f 0x1247919a 0x125b2f59 0x125aef4a 0x7ff566c36609 0x7ff566b5d293
2021.04.23 14:24:34.481688 [ 324163 ] {} <Fatal> BaseDaemon: 1. DB::InterpreterInsertQuery::execute() @ 0xf0dd5b4 in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.481753 [ 324163 ] {} <Fatal> BaseDaemon: 2. DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*) @ 0xf5f79a4 in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.481811 [ 324163 ] {} <Fatal> BaseDaemon: 3. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, bool) @ 0xf5f6023 in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.481876 [ 324163 ] {} <Fatal> BaseDaemon: 4. DB::TCPHandler::runImpl() @ 0xfdc5ab2 in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.481926 [ 324163 ] {} <Fatal> BaseDaemon: 5. DB::TCPHandler::run() @ 0xfdd8719 in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.481996 [ 324163 ] {} <Fatal> BaseDaemon: 6. Poco::Net::TCPServerConnection::start() @ 0x1247770f in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.482045 [ 324163 ] {} <Fatal> BaseDaemon: 7. Poco::Net::TCPServerDispatcher::run() @ 0x1247919a in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.482087 [ 324163 ] {} <Fatal> BaseDaemon: 8. Poco::PooledThread::run() @ 0x125b2f59 in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.482135 [ 324163 ] {} <Fatal> BaseDaemon: 9. Poco::ThreadImpl::runnableEntry(void*) @ 0x125aef4a in /usr/lib/debug/.build-id/c0/55a729a976b34c8c578864b697dd1a09dceb35.debug
2021.04.23 14:24:34.482189 [ 324163 ] {} <Fatal> BaseDaemon: 10. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
2021.04.23 14:24:34.482243 [ 324163 ] {} <Fatal> BaseDaemon: 11. __clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so

Suspicious.

SELECT * FROM test_null_as_default ORDER BY s;
SELECT '';

REPLACE TABLE test_null_as_default (s String DEFAULT 'WORLD', ss String DEFAULT 'PEOPLE') ENGINE = Memory;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it is good to add cases:

  • with different types, like x UInt64, y UInt8 default x - 10
  • with recursive defaults, like x UInt32, y UInt32 default z + 1, z UInt32 default x + 1 (and insert into (x, y))

Copy link
Copy Markdown
Member

@KochetovNicolai KochetovNicolai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine in general.

@KochetovNicolai
Copy link
Copy Markdown
Member

There is still a bug:

==157==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60604548b8d8 at pc 0x00001e2df247 bp 0x7f6d53fde9d0 sp 0x7f6d53fde9c8
READ of size 8 at 0x60604548b8d8 thread T1959 (TCPHandler)
    #0 0x1e2df246 in boost::intrusive_ptr<DB::IColumn const>::operator->() const obj-x86_64-linux-gnu/../contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:200:16
    #1 0x1e2df246 in DB::InterpreterInsertQuery::execute() obj-x86_64-linux-gnu/../src/Interpreters/InterpreterInsertQuery.cpp:252:25
    #2 0x1f1890fc in DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*) obj-x86_64-linux-gnu/../src/Interpreters/executeQuery.cpp:561:32
    #3 0x1f184ec2 in DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, bool) obj-x86_64-linux-gnu/../src/Interpreters/executeQuery.cpp:919:30
    #4 0x2061d6b9 in DB::TCPHandler::runImpl() obj-x86_64-linux-gnu/../src/Server/TCPHandler.cpp:312:24
    #5 0x2064397c in DB::TCPHandler::run() obj-x86_64-linux-gnu/../src/Server/TCPHandler.cpp:1615:9
    #6 0x26a513ce in Poco::Net::TCPServerConnection::start() obj-x86_64-linux-gnu/../contrib/poco/Net/src/TCPServerConnection.cpp:43:3
    #7 0x26a51f92 in Poco::Net::TCPServerDispatcher::run() obj-x86_64-linux-gnu/../contrib/poco/Net/src/TCPServerDispatcher.cpp:115:20
    #8 0x26d2db34 in Poco::PooledThread::run() obj-x86_64-linux-gnu/../contrib/poco/Foundation/src/ThreadPool.cpp:199:14
    #9 0x26d27e56 in Poco::ThreadImpl::runnableEntry(void*) obj-x86_64-linux-gnu/../contrib/poco/Foundation/src/Thread_POSIX.cpp:345:27
    #10 0x7f6fe4af6608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x9608)
    #11 0x7f6fe4a1d292 in clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

should be easy to reproduce

@kssenii kssenii force-pushed the insert-union-select branch from 4baff4f to 3a9b11e Compare April 23, 2021 19:09
@KochetovNicolai
Copy link
Copy Markdown
Member

Functional stateful tests - Database default doesn't exist - flaky
Integration tests - test_in_memory_wal, test_bridge_dies_with_parent - flaky

@KochetovNicolai KochetovNicolai merged commit 092ff0f into ClickHouse:master Apr 26, 2021
@sevirov
Copy link
Copy Markdown
Contributor

sevirov commented Apr 26, 2021

Internal documentation ticket: DOCSUP-8939.

azat added a commit to azat/ClickHouse that referenced this pull request Oct 14, 2021
…able column

Required columns of the default expression should not be converted to NULL,
since this map value to default and MATERIALIZED values will not work.

Consider the following structure:
- A Nullable(Int64)
- X Int64 materialized coalesce(A, -1)

With recursive_null_as_default=true you will get:

    _CAST(coalesce(A, -1), 'Int64') AS X, NULL AS A

And this will ignore default expression.

Fixes: ClickHouse#23524 (Cc: @kssenii)
Fixes: ClickHouse#29729 (Cc: @tavplubix)
Backport: 21.7+
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a setting insert_select_null_as_default

5 participants