Skip to content

Introduce memory management on cluster link buffers#1

Closed
ny0312 wants to merge 2 commits intounstablefrom
cluster-link-memory-management
Closed

Introduce memory management on cluster link buffers#1
ny0312 wants to merge 2 commits intounstablefrom
cluster-link-memory-management

Conversation

@ny0312
Copy link
Owner

@ny0312 ny0312 commented Nov 9, 2021

This PR introduced memory management on cluster bus link send buffers, to address issues raised in redis#9688.

On a high level:

  • Introduced a new cluster-link-sendbuf-limit config that caps memory usage of cluster bus link send buffers.
  • Introduced a new CLUSTER LINKS command that displays current TCP links to/from peers.

On an implementation level:

  • In clusterNode struct, added an inbound_link field pointing to the inbound link accepted from this node. Before this change, inbound links are not associated with the corresponding clusterNode and are loosely managed.

 * Introduce a new `cluster-link-sendbuf-limit` config that caps memory usage of cluster bus link send buffers.
 * Introduce a new `CLUSTER LINKS` command that displays current TCP links to/from peers.
Comment on lines +4462 to +4466
p = events;
if (link->conn) {
if (connHasReadHandler(link->conn)) *p++ = 'r';
if (connHasWriteHandler(link->conn)) *p++ = 'w';
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is useful as a point in time. Last i/o time might be useful for both r/w though.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I borrowed this from client list.

Today link struct does not track last i/o time. Do you think we should add it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize this was in client list, I find it hard to imagine it's that useful since I would look at the I/O time more than this. I don't feel strongly about it.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I try really hard I can imagine a scenario where no events are registered for a link with non-empty buffers, indicating a bug and a resulting zombie link. Then events would be able to help surface that.

Re: I/O time, I agree with your point but since I/O time is not tracked with clusterLink struct currently, I didn't want to introduce it, to limit the scope of this PR to only display existing attributes of clusterLinks.

src/cluster.h Outdated
size_t rcvbuf_alloc; /* Allocated size of rcvbuf */
struct clusterNode *node; /* Node related to this link if any, or NULL */
struct clusterNode *node; /* Node related to this link. Initialized to NULL when unknown */
int inbound; /* 1 if this link is accepted from the related node */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phrasing for the documentation is a little weird, it sounds like this is set to 1 when the connection has been accepted, not that it's inbound.

Also, is the inbound bit strictly needed for correctness?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken. I treated accepted from as synonym for inbound but in retrospect that is probably confusing.

Changed to 1 if this link is an inbound link accepted from the related node.

Re: is the inbound bit strictly needed for correctness

The need for this inbound bit is derived from the fact I've changed the code to start associating inbound links with nodes as well.

Before my change, only outbound links are associated with nodes, therefore link->node is ubiquitously used to check if link is inbound or not, e.g. in clusterProcessPacket. Now that doesn't work anymore, so I needed to introduce inbound bit to serve that purpose.

* of the CLUSTER NODES function, and as format for the cluster
* configuration file (nodes.conf) for a given node. */
sds clusterGenLinksDescription() {
sds links = sdsempty(), link;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you were to call this on a single primary it would return an empty string, which I'm not sure we do anywhere. Maybe just double check that works?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. I checked. It does return an empty string.

Should we return a special error message or something? e.g. no existing cluster links, etc. I don't think so but want to hear your take.

Copy link

@madolson madolson Nov 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just return this as a RESP MAP (i.e. key/value) ? That is usually more human readable any ways. I'm not sure we want to copy CLIENT LIST anyways.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the new revision.

@ny0312
Copy link
Owner Author

ny0312 commented Nov 12, 2021

Created new PR against official Redis repo: redis#9774

Deprecate this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants