group_by_overflow_mode does not work for distributed tables #10797

@weiyongyuan

1. The ClickHouse version is 20.1.6.30, with settings distributed_product_mode = local and max_threads = auto.
2. In clickhouse-client on the distributed node, I SET max_result_rows = 81920, max_result_bytes = 102400000, result_overflow_mode = 'break'; max_block_size is left at its default value of 65536.
Then I ran the query: select s_ip,count()counts from ds_ods.iptable group by s_ip order by counts desc limit 500;
On one of the data nodes, the query_log entry for this query shows the result below. My question is: why is the data node's result_rows 11073720, rather than the 81920 I set or 65536*2? It seems that max_result_rows doesn't work on remote servers. Is there a way to limit the result that a local node returns to the distributed node?
type: QueryFinish
event_date: 2020-05-11
event_time: 2020-05-11 11:16:02
query_start_time: 2020-05-11 11:15:54
query_duration_ms: 14261
read_rows: 104054038
read_bytes: 2311618357
written_rows: 0
written_bytes: 0
result_rows: 11073720
result_bytes: 334685660

memory_usage: 3824731963
query: SELECT s_ip, count() AS counts FROM ods.iptables_local GROUP BY s_ip ORDER BY counts ASC
exception:
stack_trace:
is_initial_query: 0
user: default
query_id: 9f86b4db-cce7-4f13-8609-0ac048206760
address: ::ffff:172.24.9.49
port: 65106
initial_user: writer
initial_query_id: 64622f08-3ed9-435b-a087-87dfed405d4f
initial_address: ::ffff:172.24.9.49
initial_port: 43820
interface: 1
os_user: root
client_hostname: platform-server-5
client_name: ClickHouse client
client_revision: 54431
client_version_major: 20
client_version_minor: 1
client_version_patch: 6
http_method: 0
http_user_agent:
quota_key:
revision: 54431
thread_numbers: [329,79,302,90,266,294,103,106,78,77,125,120,293,119,95,124,117,67]
os_thread_ids: [27498,31790,27269,7048,26996,27262,7067,7061,31791,31792,7084,7081,27261,7060,7056,7082,7063,14399]
ProfileEvents.Names: ['Query','SelectQuery','FileOpen','Seek','ReadBufferFromFileDescriptorRead','ReadBufferFromFileDescriptorReadBytes','ReadCompressedBytes','CompressedReadBufferBlocks','CompressedReadBufferBytes','IOBufferAllocs','IOBufferAllocBytes','ArenaAllocChunks','ArenaAllocBytes','MarkCacheHits','CreatedReadBufferOrdinary','DiskReadElapsedMicroseconds','NetworkSendElapsedMicroseconds','SelectedParts','SelectedRanges','SelectedMarks','ContextLock','RWLockAcquiredReadLocks','RealTimeMicroseconds','UserTimeMicroseconds','SystemTimeMicroseconds','SoftPageFaults','OSCPUWaitMicroseconds','OSCPUVirtualTimeMicroseconds','OSWriteBytes','OSReadChars','OSWriteChars']
ProfileEvents.Values: [1,1,98,11,587,517022008,513673783,13552,1479186063,198,79219423,112,536838144,98,98,269456,350581,88,88,13798,40,1,104897887,60187046,4483097,1062755,2447226,64669016,4096,517023744,28672]
Settings.Names: ['queue_max_wait_ms','use_uncompressed_cache','background_pool_size','load_balancing','skip_unavailable_shards','log_queries','distributed_product_mode','max_result_rows','max_result_bytes','result_overflow_mode','max_execution_time','timeout_overflow_mode','max_memory_usage','max_memory_usage_for_user','max_memory_usage_for_all_queries','send_logs_level','allow_experimental_data_skipping_indices']
Settings.Values: ['0','1','32','random','1','1','local','81920','102400000','break','300','throw','100000000000','180000000000','180000000000','trace','1']
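To make the log above easier to read: Settings.Names and Settings.Values appear to be parallel arrays, so they can be paired by position. The following is a minimal Python sketch (the variable names are mine, and the lists are copied from the log entry above) that pairs them and compares the forwarded max_result_rows with the reported result_rows.

```python
# Parallel arrays copied verbatim from the query_log entry above.
names = ['queue_max_wait_ms', 'use_uncompressed_cache', 'background_pool_size',
         'load_balancing', 'skip_unavailable_shards', 'log_queries',
         'distributed_product_mode', 'max_result_rows', 'max_result_bytes',
         'result_overflow_mode', 'max_execution_time', 'timeout_overflow_mode',
         'max_memory_usage', 'max_memory_usage_for_user',
         'max_memory_usage_for_all_queries', 'send_logs_level',
         'allow_experimental_data_skipping_indices']
values = ['0', '1', '32', 'random', '1', '1', 'local', '81920', '102400000',
          'break', '300', 'throw', '100000000000', '180000000000',
          '180000000000', 'trace', '1']

# Pair each setting name with the value at the same position.
settings = dict(zip(names, values))

# result_rows as reported in the same log entry.
result_rows = 11073720

for key in ('max_result_rows', 'max_result_bytes', 'result_overflow_mode'):
    print(f"{key} = {settings[key]}")

# True if the data node returned more rows than the configured limit.
exceeded = result_rows > int(settings['max_result_rows'])
print("result_rows exceeds max_result_rows:", exceeded)
```

So the limits were sent to the data node along with the query (is_initial_query is 0 in this entry), yet result_rows is still far above max_result_rows.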
3. Here is the doc:. In settings.h I can see this code: M(SettingUInt64, max_result_rows, 0, "Limit on result size in rows. Also checked for intermediate data sent from remote servers.", 0). Does this mean it limits the size, in rows, of the intermediate data sent from remote servers?
