Add setting send_profile_events to reduce network traffic#80761
Add setting send_profile_events to reduce network traffic#80761SpencerTorres wants to merge 2 commits intoClickHouse:masterfrom
send_profile_events to reduce network traffic#80761Conversation
|
|
||
| void TCPHandler::sendProfileEvents(QueryState & state) | ||
| { | ||
| if (!state.query_context->getSettingsRef()[Setting::send_profile_events]) |
There was a problem hiding this comment.
I was worrying about this for a long time - #28364 (comment)
We cannot simply just not send them now, since some clients relies on that they are there, i.e. python driver here - https://github.com/mymarilyn/clickhouse-driver/blob/8a4e7c5b99b532df2b015651d893a6f36288a22c/clickhouse_driver/client.py#L725-L757
And using a setting for this does not looks correct because such settings will not work if they are specified in the query directly (i.e. select 1 settings) only in protocol, and we need to clarify this to users. And in general setting that controls the protocol looks odd.
So I guess we need to add one more revision, and let client decide does it need profile events or not (in hello addendum), and only those who can handle this packets will set this flag.
@alexey-milovidov I remember you wanted to avoid the complexity, but there is already a timer for how long does it take to send them... - #62752
@novikd @alexey-milovidov @Algunenano what do you think?
There was a problem hiding this comment.
My concern is that the new setting will most likely be disabled in the Cloud and non-modifiable, because ProfileEvents may be used for accounting, and the fact that they are send between servers in the distributed query execution is also important.
Also, I've recently added the compression of ProfileEvents, which makes this feature less relevant.
PS.
We cannot simply just not send them now, since some clients relies on that they are there
It is a mistake in the Python client if it relies on this specific order of packets. However, it is unlikely to be a problem - for it to happen, the user will have to explicitly disable the profile events.
because such settings will not work if they are specified in the query directly (i.e. select 1 settings) only in protocol
We should make it in a way that the server stops sending ProfileEvents as soon as it applies this setting for a query.
|
@azat @SpencerTorres Let's ensure that ProfileEvents are not sent more frequently than |
This change will break the mentioned driver, but perhaps that's acceptable. Regarding your suggestion - are you proposing not to send The second option might still cause issues. I'm not entirely sure what was causing the original problem - it could be related to the waiting phase during async INSERTs, where |
Yes, the first and the last one can always be sent, and everything in between not frequently than the interactive_delay. |
|
Maybe we can change the protocol itself making all the clients to explicitly say whether they want profile events back or not? We can put that in the |
This would be useful; @azat suggested the same. If we're going to make a new revision, are there other packets we should conditionally exclude? We could do a boolean to toggle I see what @alexey-milovidov said about the @alexey-milovidov How do you feel about the suggested revision? Is it reasonable for clients to decide not to receive these packets? By disabling them we save some CPU/network resources on both client and server. |
AFAIR - no (logs are already under a setting)
I like the idea of using bit mask, but, we still will have to bump the revision I guess, and I don't expect too much changes here, so maybe the complexity does not worth it |
|
Loosely related to #75719. |
|
Related to #63065 |
Ok, this is not applicable because metrics from different servers have different |
|
I think this solution is good enough, and it is flexible enough. |
|
Workflow [PR], commit [358c3a8] Summary: ❌
|
alexey-milovidov
left a comment
There was a problem hiding this comment.
And everything looks good.
|
No need to rush this change, we can go with the other solution of adding a revision with a boolean in the client I want to make sure we enable this ability the proper way, if a setting isn't the right approach then let's wait to add it as a revision to the Hello packet. Let me know what the preferred approach is |
|
Ok. |
|
Dear @azat, @alexey-milovidov, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
|
Dear @azat, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
|
Related to ClickHouse/clickhouse-go#1685 and ClickHouse/clickhouse-go#1686. Skipping the consumption of profile events on the client side can reduce memory allocations by over 3x. It would be great to be able to disable these at the session level. This is especially important for async inserts where profile events take up a bigger chunk of the data that is streamed from the server. |
|
See #89588. |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added
send_profile_eventssetting which allows clients to reduce network traffic when profile events are unused.Documentation entry for user-facing changes
For users who are doing many small async inserts over Native, it's possible that more data will be received from the server than sent by the client. The
ProfileEventspacket was identified to be a large portion of this. To help reduce traffic, I have added a boolean setting calledsend_profile_eventsthat defaults to1. Clients can choose to disable profile events to reduce traffic. In the Go client for example, this data isn't used. It's parsed and immediately ignored. This can be applied per-query.See sample Go client packet log using this new setting:
Packet log with
ProfileEventsdisabledPacket log with
ProfileEventsenabled (default, current behavior)Note the difference, these lines are excluded since the server never sent them:
This change does not affect the
ProfileInfopacket. Setting docs should be auto-generated, let me know if I need to manually add anything. Thanks!