Skip to content

Fix crash when freeing newly created node when nodeIp2String fail#14055

Merged
sundb merged 2 commits intoredis:unstablefrom
sundb:fix_nodeIp2String_assert
May 21, 2025
Merged

Fix crash when freeing newly created node when nodeIp2String fail#14055
sundb merged 2 commits intoredis:unstablefrom
sundb:fix_nodeIp2String_assert

Conversation

@sundb
Copy link
Collaborator

@sundb sundb commented May 19, 2025

This PR is a reference from valkey-io/valkey#1535

In the process of handling the failure(#14024) of ARM64 CI with TLS, found in slow environment, the nodes might frequent disconnect and trigger this assertion.

Logged crash report (pid 690141):
=== REDIS BUG REPORT START: Cut & paste starting from here ===
690141:M 30 Apr 2025 02:41:12.704 # === ASSERTION FAILED ===
690141:M 30 Apr 2025 02:41:12.704 # ==> cluster_legacy.c:2889 'nodeIp2String(node->ip,link,hdr->myip) == C_OK' is not true

------ STACK TRACE ------

690201 bio_lazy_free
/lib/aarch64-linux-gnu/libc.so.6(+0x81ea0)[0xe75885781ea0]
/lib/aarch64-linux-gnu/libc.so.6(pthread_cond_wait+0x200)[0xe75885784b20]
src/redis-server 127.0.0.1:28638 [cluster](bioProcessBackgroundJobs+0x21c)[0xb14c5547c72c]
/lib/aarch64-linux-gnu/libc.so.6(+0x8595c)[0xe7588578595c]
/lib/aarch64-linux-gnu/libc.so.6(+0xeba4c)[0xe758857eba4c]

690192 bio_close_file
/lib/aarch64-linux-gnu/libc.so.6(+0x81ea0)[0xe75885781ea0]
/lib/aarch64-linux-gnu/libc.so.6(pthread_cond_wait+0x200)[0xe75885784b20]
src/redis-server 127.0.0.1:28638 [cluster](bioProcessBackgroundJobs+0x21c)[0xb14c5547c72c]
/lib/aarch64-linux-gnu/libc.so.6(+0x8595c)[0xe7588578595c]
/lib/aarch64-linux-gnu/libc.so.6(+0xeba4c)[0xe758857eba4c]

690200 bio_aof
/lib/aarch64-linux-gnu/libc.so.6(+0x81ea0)[0xe75885781ea0]
/lib/aarch64-linux-gnu/libc.so.6(pthread_cond_wait+0x200)[0xe75885784b20]
src/redis-server 127.0.0.1:28638 [cluster](bioProcessBackgroundJobs+0x21c)[0xb14c5547c72c]
/lib/aarch64-linux-gnu/libc.so.6(+0x8595c)[0xe7588578595c]
/lib/aarch64-linux-gnu/libc.so.6(+0xeba4c)[0xe758857eba4c]

690141 redis-server *
src/redis-server 127.0.0.1:28638 [cluster](clusterProcessPacket+0xdc8)[0xb14c554685c8]
src/redis-server 127.0.0.1:28638 [cluster](clusterReadHandler+0x144)[0xb14c554689c4]
src/redis-server 127.0.0.1:28638 [cluster](+0x1fa774)[0xb14c554fa774]
src/redis-server 127.0.0.1:28638 [cluster](+0x201130)[0xb14c55501130]
src/redis-server 127.0.0.1:28638 [cluster](aeMain+0x128)[0xb14c5539c02c]
src/redis-server 127.0.0.1:28638 [cluster](main+0x514)[0xb14c55390dc4]
/lib/aarch64-linux-gnu/libc.so.6(+0x284c4)[0xe758857284c4]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0x98)[0xe75885728598]
src/redis-server 127.0.0.1:28638 [cluster](_start+0x30)[0xb14c55391470]

Co-authored-by: Binbin [email protected]

@snyk-io
Copy link

snyk-io bot commented May 19, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found. (View Details)

license/snyk check is complete. No issues have been found. (View Details)

@sundb sundb requested a review from Copilot May 19, 2025 01:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses a crash issue by handling the failure of nodeIp2String gracefully when processing a MEET packet in a cluster node setup.

  • Added an error check for nodeIp2String and, if it fails, logs the error and frees the associated link.
  • Removed an assertion by replacing it with a memcpy of the validated IP from a local buffer.
Comments suppressed due to low confidence (1)

src/cluster_legacy.c:2893

  • Consider including additional connection details or context in the log message to aid troubleshooting.
serverLog(LL_NOTICE, "Closing link even though we received a MEET packet on it, because the connection has an error");

@sundb sundb merged commit ba88a7f into redis:unstable May 21, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants