Introduce memory management on cluster link buffers#1
Introduce memory management on cluster link buffers#1
Conversation
* Introduce a new `cluster-link-sendbuf-limit` config that caps memory usage of cluster bus link send buffers. * Introduce a new `CLUSTER LINKS` command that displays current TCP links to/from peers.
| p = events; | ||
| if (link->conn) { | ||
| if (connHasReadHandler(link->conn)) *p++ = 'r'; | ||
| if (connHasWriteHandler(link->conn)) *p++ = 'w'; | ||
| } |
There was a problem hiding this comment.
Not sure this is useful as a point in time. Last i/o time might be useful for both r/w though.
There was a problem hiding this comment.
I borrowed this from client list.
Today link struct does not track last i/o time. Do you think we should add it?
There was a problem hiding this comment.
I didn't realize this was in client list, I find it hard to imagine it's that useful since I would look at the I/O time more than this. I don't feel strongly about it.
There was a problem hiding this comment.
If I try really hard I can imagine a scenario where no events are registered for a link with non-empty buffers, indicating a bug and a resulting zombie link. Then events would be able to help surface that.
Re: I/O time, I agree with your point but since I/O time is not tracked with clusterLink struct currently, I didn't want to introduce it, to limit the scope of this PR to only display existing attributes of clusterLinks.
src/cluster.h
Outdated
| size_t rcvbuf_alloc; /* Allocated size of rcvbuf */ | ||
| struct clusterNode *node; /* Node related to this link if any, or NULL */ | ||
| struct clusterNode *node; /* Node related to this link. Initialized to NULL when unknown */ | ||
| int inbound; /* 1 if this link is accepted from the related node */ |
There was a problem hiding this comment.
Phrasing for the documentation is a little weird, it sounds like this is set to 1 when the connection has been accepted, not that it's inbound.
Also, is the inbound bit strictly needed for correctness?
There was a problem hiding this comment.
Point taken. I treated accepted from as synonym for inbound but in retrospect that is probably confusing.
Changed to 1 if this link is an inbound link accepted from the related node.
Re: is the inbound bit strictly needed for correctness
The need for this inbound bit is derived from the fact I've changed the code to start associating inbound links with nodes as well.
Before my change, only outbound links are associated with nodes, therefore link->node is ubiquitously used to check if link is inbound or not, e.g. in clusterProcessPacket. Now that doesn't work anymore, so I needed to introduce inbound bit to serve that purpose.
| * of the CLUSTER NODES function, and as format for the cluster | ||
| * configuration file (nodes.conf) for a given node. */ | ||
| sds clusterGenLinksDescription() { | ||
| sds links = sdsempty(), link; |
There was a problem hiding this comment.
If you were to call this on a single primary it would return an empty string, which I'm not sure we do anywhere. Maybe just double check that works?
There was a problem hiding this comment.
Good callout. I checked. It does return an empty string.
Should we return a special error message or something? e.g. no existing cluster links, etc. I don't think so but want to hear your take.
There was a problem hiding this comment.
Maybe we should just return this as a RESP MAP (i.e. key/value) ? That is usually more human readable any ways. I'm not sure we want to copy CLIENT LIST anyways.
|
Created new PR against official Redis repo: redis#9774 Deprecate this PR. |
This PR introduced memory management on cluster bus link send buffers, to address issues raised in redis#9688.
On a high level:
cluster-link-sendbuf-limitconfig that caps memory usage of cluster bus link send buffers.CLUSTER LINKScommand that displays current TCP links to/from peers.On an implementation level:
clusterNodestruct, added aninbound_linkfield pointing to the inbound link accepted from thisnode. Before this change, inbound links are not associated with the correspondingclusterNodeand are loosely managed.