Implement Raft-based State Synchronization for GatewayD Instances #628

@sinadarbouy

Description

This issue builds on the discussion in #169

Expected Behavior

The system should handle stateful parameters and coordination between nodes efficiently and consistently using the Raft consensus algorithm. This keeps all GatewayD instances in sync, so each one sees the same state when establishing connections between the client and the database.

Additionally, when a new request is sent to one of the instances, stateful parameters such as the available connection pool or state related to load-balancer strategies should be updated across all instances. Changes made on one instance are then reflected on the others as soon as they are committed, maintaining consistency and balanced load across the cluster.

Current Behavior

Currently, stateful parameters are handled on a per-instance basis, with no unified state management mechanism across multiple GatewayD instances. This can lead to inconsistencies and difficulties in ensuring proper coordination when scaling up or handling failover scenarios.

Steps to Reproduce

  1. Run multiple instances of GatewayD.
  2. Try to handle stateful parameters without centralized coordination.
  3. Notice inconsistencies in state data when forming connections.
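The inconsistency in the steps above can be illustrated with a minimal sketch, assuming a round-robin load-balancer strategy whose cursor lives only in each process. The `Instance` type, the instance names, and the backend names are hypothetical and not taken from the GatewayD codebase.

```go
package main

import "fmt"

// Instance models one GatewayD process that keeps its round-robin cursor
// locally (hypothetical type, for illustration only).
type Instance struct {
	name   string
	cursor int
}

// Next picks the next backend using only this instance's local cursor.
func (i *Instance) Next(backends []string) string {
	b := backends[i.cursor%len(backends)]
	i.cursor++
	return b
}

func main() {
	backends := []string{"db-1", "db-2", "db-3"}
	a := &Instance{name: "gatewayd-a"}
	b := &Instance{name: "gatewayd-b"}

	// Three client requests land on instance a, then b, then a.
	fmt.Println(a.name, "->", a.Next(backends))
	fmt.Println(b.name, "->", b.Next(backends))
	fmt.Println(a.name, "->", a.Next(backends))
}
```

Because neither instance knows about the other's cursor, the first two requests both go to `db-1` and `db-3` receives none, which is the kind of drift a shared, replicated cursor would avoid.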

Proposed Solution

To address the issue of state management across multiple nodes, we can use HashiCorp’s Raft implementation to manage state and coordination. Here’s the suggested solution:

  1. Expose a Raft Port: Open an additional port for Raft to allow nodes to form a Raft cluster during startup.
  2. Single Raft Cluster for All Config Groups: Instead of managing separate Raft clusters for each configuration group, a single Raft cluster will be used across all configuration groups to simplify state management and reduce overhead.
  3. Handling Stateful Parameters: Store stateful parameters as key-value pairs, similar to how it’s handled in the Redis plugin (configurationGroup-ConfigurationBlock-Key). Raft will ensure all nodes remain consistent with these values.
  4. Consistency and Recovery: HashiCorp Raft can persist its log and stable store in BoltDB (through the separate raft-boltdb package), which covers persistence and recovery, so there is no need to store state in additional files. The Raft algorithm will ensure consistency across nodes, and in-memory storage (using sync.Map) can be used to manage state during runtime without the need for a complex database solution.

Expected Outcome

By implementing this approach, multiple GatewayD instances will be able to receive requests and coordinate via Raft to fetch stateful variables, ensuring consistency before a connection between the client and database is established.

Furthermore, when one instance receives a request and updates stateful parameters (e.g., available connection pool or load-balancer strategies), these changes will be replicated to all instances in the Raft cluster. Every instance applies the same committed updates in the same order, so the instances converge on the same state, maintaining consistent behavior and balanced load across the cluster.
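One way to picture this propagation is as replicas replaying a shared, ordered log of committed updates. The sketch below is a simplified stand-in rather than Raft itself (no elections, terms, or networking), and the `Entry` and `Replica` types and the key names are hypothetical.

```go
package main

import "fmt"

// Entry is a hypothetical committed log entry: set Key to Value.
type Entry struct {
	Key, Value string
}

// Replica applies committed entries strictly in log order.
type Replica struct {
	applied int // number of log entries applied so far
	state   map[string]string
}

func NewReplica() *Replica {
	return &Replica{state: map[string]string{}}
}

// CatchUp applies entries from the shared log up to commitIndex (exclusive).
func (r *Replica) CatchUp(log []Entry, commitIndex int) {
	for ; r.applied < commitIndex && r.applied < len(log); r.applied++ {
		e := log[r.applied]
		r.state[e.Key] = e.Value
	}
}

func main() {
	log := []Entry{
		{"default-pool-available", "10"},
		{"default-pool-available", "9"},
		{"default-loadBalancer-roundRobinIndex", "1"},
	}
	fast, slow := NewReplica(), NewReplica()

	fast.CatchUp(log, 3)
	slow.CatchUp(log, 1) // a follower that has only applied the first entry
	fmt.Println(fast.state["default-pool-available"], slow.state["default-pool-available"])

	slow.CatchUp(log, 3) // once it catches up, both replicas agree
	fmt.Println(fast.state["default-pool-available"] == slow.state["default-pool-available"])
}
```

A follower may briefly lag behind, but since every replica applies the same entries in the same order, it ends up with the same state once it has caught up.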

Metadata

Labels

enhancement: New feature or request

Projects

Status

🎉 Done
