Fix crash on RM_Call inside module load#11346
Conversation
The following [PR](redis#9320) instoduces initialization order changes. Now cluster is initialized after modules. This changes causes a crash if the module uses RM_Call inside the load function on cluster mode (the code will try to access `server.cluster` which at this point is NULL). To solve it, seperate cluster initialization into 2 phases: 1. Structure initialization that happened before the modules initialization 2. Listener initialization that happened after. Test was added to verify the fix.
oranagra
left a comment
There was a problem hiding this comment.
i think this segment belongs in clusterInit
it could be that some RM_Call command would need these.
maybe it doesn't matter though, and we rather not mess up the blame log?
@madolson WDYT?
/* Initialize data for the Slot to key API. */
slotToKeyInit(server.db);
/* The slots -> channels map is a radix tree. Initialize it here. */
server.cluster->slots_to_channels = raxNew();
/* Set myself->port/cport/pport to my listening ports, we'll just need to
* discover the IP address via MEET messages. */
deriveAnnouncedPorts(&myself->port, &myself->pport, &myself->cport);
server.cluster->mf_end = 0;
server.cluster->mf_slave = NULL;
resetManualFailover();
clusterUpdateMyselfFlags();
clusterUpdateMyselfIp();
clusterUpdateMyselfHostname();|
I agree with Oran, I don't think we should just bisect the current clusterInit at the position we need for initializing the listeners, since it becomes more confusing in the future to understand what goes where. I would expect if we extract out the listener initialization code into a separate function "clusterInitListeners", hopefully that will protect the git blame enough. |
|
triggered cluster (and TLS) CI: |
|
@MeirShpilraien looks like there's a crush in clusterInit when using TLS |
|
@oranagra I hope I fixed it. Can you trigger it again? |
|
passed. |
PR redis#9320 introduces initialization order changes. Now cluster is initialized after modules. This changes causes a crash if the module uses RM_Call inside the load function on cluster mode (the code will try to access `server.cluster` which at this point is NULL). To solve it, separate cluster initialization into 2 phases: 1. Structure initialization that happened before the modules initialization 2. Listener initialization that happened after. Test was added to verify the fix.
PR redis#9320 introduces initialization order changes. Now cluster is initialized after modules. This changes causes a crash if the module uses RM_Call inside the load function on cluster mode (the code will try to access `server.cluster` which at this point is NULL). To solve it, separate cluster initialization into 2 phases: 1. Structure initialization that happened before the modules initialization 2. Listener initialization that happened after. Test was added to verify the fix.
PR #9320 introduces initialization order changes. Now cluster is initialized after modules.
This changes causes a crash if the module uses RM_Call inside the load function
on cluster mode (the code will try to access
server.clusterwhich at this point is NULL).To solve it, separate cluster initialization into 2 phases:
Test was added to verify the fix.