Fixed the gateway's reconnection system#264
Conversation
🦋 Changeset detectedLatest commit: 67b5b42 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
WalkthroughThe changes enhance WebSocket connection management within the gateway plugin by introducing stricter state tracking, preventing reconnections after closure, and ensuring consistent handling of sequence and session states. The connection monitor now supports uptime resets without full destruction, and new methods encapsulate interval logic for improved maintainability. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GatewayPlugin
participant ConnectionMonitor
participant WebSocket
Client->>GatewayPlugin: connect()
GatewayPlugin->>WebSocket: Close existing connection (if any)
GatewayPlugin->>WebSocket: Create new connection
WebSocket-->>GatewayPlugin: onOpen/onClose/onError/onMessage
GatewayPlugin->>ConnectionMonitor: resetUptime()
alt Reconnection needed
GatewayPlugin->>GatewayPlugin: handleReconnectionAttempt()
GatewayPlugin->>GatewayPlugin: disconnect()
GatewayPlugin->>ConnectionMonitor: resetUptime()
GatewayPlugin->>WebSocket: Close connection
GatewayPlugin->>WebSocket: Attempt reconnection
alt Max attempts or fatal error
GatewayPlugin->>ConnectionMonitor: destroy()
end
end
Poem
📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/carbon/src/plugins/gateway/GatewayPlugin.ts(11 hunks)packages/carbon/src/plugins/gateway/utils/monitor.ts(2 hunks)
🔇 Additional comments (6)
packages/carbon/src/plugins/gateway/utils/monitor.ts (1)
42-43: Clean refactoring that improves maintainability.The extraction of interval creation logic into
createMetricsInterval()and the addition ofresetUptime()method are well-implemented. This allows the monitor to reset its uptime tracking without losing its internal state, which aligns perfectly with the connection lifecycle improvements in the gateway plugin.Also applies to: 45-65, 97-102
packages/carbon/src/plugins/gateway/GatewayPlugin.ts (5)
102-102: Good defensive programming to prevent duplicate connections.Closing any existing WebSocket connection at the start of
connect()ensures a clean state before establishing a new connection.
116-116: Correct usage of the newresetUptime()method.This change properly maintains the monitor instance across reconnections while resetting its uptime tracking, which is essential for accurate metrics after disconnection.
188-190: Good defensive programming for event type validation.Throwing an error for unknown event types helps catch integration issues early and prevents silent failures.
266-266: Correct implementation of reconnection logic improvements.
- Calling
disconnect()at the start ensures a clean state before reconnection- Moving the increment after backoff calculation fixes the doubled backoff time issue
- Preserving the monitor except for terminal conditions maintains metrics continuity
Also applies to: 307-311
343-349: Correct fix for sequence number handling during resume.Using
state.sequenceinstead ofthis.sequenceensures the resume payload contains the last sequence number from before disconnection, which is the correct behavior for resuming a session.
There's a major bug in all of the gateway plugins where it fails to reconnect properly.
This was really annoying to test because the steps to reproduce this was:
GatewayOpcodes.Reconnect. This could take up to multiple hours.Here are all of the issues that were fixed:
GatewayOpcodes.Reconnectis recieved, there's an issue where it tries to reconnect twice, because boththis.handleReconnect()in line 207 andthis.handleClose(code)in line 218 both runthis.handleReconnectionAttempt(...). There are also few other places where something similar could happen as well.1000and1001, which I added.backoffTimewas always two times the value the user set. This was fixed by settingthis.reconnectAttempts++after calculating thebackoffTime.I tested these fixes with both the gateway forwarder and the sharded gateway forwarder! Since both the gateway forwarder and sharded gateway forwarder extend the gateway and sharded gateway plugin, everything should be functional (in theory)!
Summary by CodeRabbit
Summary by CodeRabbit
Bug Fixes
New Features
Improvements