Synology High Availability

White Paper

Based on DSM 6.2

1 Synology White Paper

Table of Contents

Introduction

Overview

Synology High Availability Architecture
  Hardware Components
  Network Interface
  Network Scenarios
  Data Replication
  Special Conditions

Achieving Service Continuity
  Auto-Failover Protection

Best Practices for Deployment
  System Performances
  Reliability

Performance Benchmark
  Performance Considerations
  Switchover Time-to-Completion

Summary

Introduction

Business Challenges

Unexpected downtime can frustrate customers and result in huge losses in revenue. 50% of SMBs worldwide remain unprepared for disaster, and in many cases downtime becomes a potentially fatal problem, costing companies as much as 12,000 USD or more a day. With the growing demand for uninterrupted availability, businesses seek solutions that ensure high levels of service continuity.

Synology Solutions for Service Continuity

A high availability solution is in high demand among those who deploy critical services such as databases, company file servers, virtualized storage, and more. These services have extremely low tolerance for downtime, and one cannot afford to have them interrupted when unexpected events strike.

High availability is mostly offered as an enterprise-exclusive solution due to its high cost and complicated deployment. With Synology High Availability, this high-end feature is available on most plus-series and all FS/XS-series models, making it a cost-effective choice for protecting important services. Synology High Availability eases the burden on IT administrators of fixing system or hardware issues when disaster strikes, while allowing businesses to prevent downtime for mission-critical applications and minimize the lost revenue associated with an incident.


Overview

According to Transparency Market Research (TMR), the global market for high availability servers is anticipated to rise at a CAGR of 11.7% between 2017 and 2025. The market was valued at $4.1bn in 2015 and is expected to reach $12.3bn by 2025.

A key factor driving global market demand for high availability solutions is the reliance on data and the need for data to be more accessible. Demand has increased for a higher level of availability to prevent lost revenue caused by unexpected or undesired incidents that may have a profound impact on data access and an organization's business productivity.

High availability solutions, supporting redundancy and data replication, have become a proven strategy for organizations to retain and manage data while being able to access information in real time. However, due to the complex and expensive nature of high availability technology, small and mid-size businesses often lack the resources or budget to implement a high availability solution to protect against data loss. In most cases, only large enterprises with sufficient IT resources can afford to build a business continuity plan, deploy highly available and fault-tolerant servers, and keep an ever-growing amount of critical data, generated on a daily basis, secure and available at all times.

Synology is dedicated to providing a reliable, cost-effective, comprehensive high availability infrastructure to small and mid-range businesses to effectively minimize data loss and downtime. Synology High Availability (SHA) is introduced as an enterprise-level solution that is affordable, especially for smaller organizations with limited IT resources seeking robust, continuous system availability at a lower budget without worrying about installation and maintenance costs. Contrary to most HA solutions that require expensive, dedicated hardware, SHA is available on most Synology NAS and can be implemented at a lower investment cost.

Key Features of Synology High Availability:

• HA cluster provides comprehensive hardware and data redundancy
• Heartbeat technology achieves real-time data synchronization
• Automatic failover maximizes service uptime and business continuity
• SHA ensures storage for performance-intensive workloads including virtualization solutions such as VMware® vSphere™, Microsoft® Hyper-V®, Citrix® XenServer™, and OpenStack Cinder
• Offers hassle-free HA configuration, management, and maintenance without requiring additional technical resources, with an intuitive setup wizard that comes with built-in, visualized management tools
• Complete protection for file services including SMB, AFP, NFS, FTP, and iSCSI
• Supports critical DSM packages

This white paper aims to provide information on Synology High Availability (SHA) design and architecture, common scenarios, best practices, and performance metrics.


Synology High Availability Architecture

The Synology High Availability solution is a server layout designed to reduce service interruptions caused by system malfunctions. It combines two compatible Synology servers into a "high-availability cluster" (also called "HA cluster"). Once this high-availability cluster is formed, one server assumes the role of the active server, while the other acts as a standby passive server. A full data replication must be performed once the cluster has been successfully created.

Once the high-availability cluster is formed, data is continuously replicated from the active server to the passive server. All files on the active server will be copied to the passive server. In the event of a critical malfunction, the passive server is ready to take over all services. Equipped with a duplicate image of all data on the active server, the passive server will enable the high-availability cluster to continue functioning as normal, minimizing downtime.

Hardware Components

Synology's High Availability solution constructs a cluster composed of two individual compute and storage systems: an active and a passive server. Each server comes with attached storage volumes, and the two are linked by a "Heartbeat" connection that monitors server status and facilitates data replication between the two servers.

• Active Server: Under normal conditions, all services are provided by the active server. In the event of a critical malfunction, the active server will be ready to pass service provisioning to the passive server, thereby circumventing downtime.

• Passive Server: Under normal conditions, the passive server remains in standby mode and receives a steady stream of data replicated from the active server.

• Cluster Connection: The connection network used for the communication between the clients and the high-availability cluster. There is at least one cluster connection for the active server, and one for the passive server, to the client. In order to ensure the communication between both active and passive servers, the cluster connections must go through a switch.

• Heartbeat Connection: The active and passive servers of a high-availability cluster are connected by a dedicated, private network connection known as the "Heartbeat" connection. Once the cluster is formed, the Heartbeat facilitates data replication from the active server to the passive server. It also allows the passive server to constantly detect the active server's presence, allowing it to take over in the event of active server failure. The ping response time between the two servers must be less than 1 ms, while the transmission speed should be at least 500 Mbps. The performance of the HA cluster will be affected by the response time and bandwidth of the Heartbeat connection.

• Main Storage: The storage volume of the active server.

• Spare Storage: The storage volume of the passive server, which continually replicates data received from the main storage via the Heartbeat connection.

Figure 1: Physical components of a typical Synology High Availability (SHA) deployment
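
The Heartbeat connection requirements stated above (ping response time under 1 ms, transmission speed of at least 500 Mbps) can be turned into a simple pre-deployment check. The following Python sketch is only an illustration: the `LinkMeasurement` class and the `heartbeat_link_issues` function are hypothetical names, and the measured values are assumed to come from whatever tools the administrator already uses. It is not part of DSM or SHA.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the Heartbeat Connection description above.
MAX_PING_MS = 1.0        # ping response time must be less than 1 ms
MIN_SPEED_MBPS = 500.0   # transmission speed should be at least 500 Mbps


@dataclass
class LinkMeasurement:
    """Hypothetical container for values measured by the administrator."""
    ping_ms: float
    speed_mbps: float


def heartbeat_link_issues(m: LinkMeasurement) -> List[str]:
    """Return the problems found; an empty list means the link qualifies."""
    issues = []
    if m.ping_ms >= MAX_PING_MS:
        issues.append(f"ping {m.ping_ms} ms is not below {MAX_PING_MS} ms")
    if m.speed_mbps < MIN_SPEED_MBPS:
        issues.append(f"speed {m.speed_mbps} Mbps is below {MIN_SPEED_MBPS} Mbps")
    return issues


if __name__ == "__main__":
    # A direct 1GbE link with sub-millisecond latency passes both checks.
    print(heartbeat_link_issues(LinkMeasurement(ping_ms=0.3, speed_mbps=940.0)))
    # A slow, high-latency link fails both checks.
    print(heartbeat_link_issues(LinkMeasurement(ping_ms=1.5, speed_mbps=300.0)))
```

Returning a list of issues rather than a single boolean makes it easier to report every failed requirement at once.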


Network Interface

Cluster Interface

When the two servers are combined into a high-availability cluster, a virtual interface (unique server name and IP address) must be configured. This virtual interface, also called the cluster interface, allows clients to access the cluster resources using a single namespace. Therefore, when a switchover is triggered and the provision of services is moved to the passive server, there will be no need to modify network configurations on hosts in the data network.

Figure 2: PC clients access a Synology High Availability (SHA) cluster through a single virtual interface

• Cluster Server Name and IP addresses: Servers in the cluster will share IP addresses and a server name, which should be used in all instances instead of the original IP addresses and individual server names.

Heartbeat Interface

The Heartbeat connection is the connection established on the Heartbeat interfaces of the active and passive servers. It is used for replicating data from the active server to the passive server, including the differential data and real-time write operations. All data syncing is performed at block level, which ensures that the active and passive servers contain identical data. As all data is constantly kept up to date, switchover can be accomplished seamlessly.

The Heartbeat IP is automatically selected by the system upon cluster creation.

• Suggested network configurations: It is suggested that you choose the fastest network interface for the Heartbeat connection, or ensure that its capability matches that of the network interface used for the cluster connection. Please refer to the following examples:

1. Both servers have at least two 10GbE network interfaces. One 10GbE interface is suggested for the Heartbeat connection, and the other for the cluster connection.

2. There is only one 10GbE interface between both servers. It should be used for the Heartbeat connection. If there are more than two 1GbE network interfaces, it is also suggested to set up link aggregation for the cluster connection.

3. If there is no 10GbE network interface, make sure the Heartbeat connection and the cluster connection share network interfaces.

Link aggregation increases the bandwidth and provides traffic failover to maintain the network connection in case a connection is down. It is recommended to set up link aggregation for both the cluster connection and the Heartbeat connection. Note that link aggregation must be set up before the creation of the high-availability cluster. Once the HA cluster is created, the link configuration cannot be modified again. In addition, no matter what type of link aggregation is chosen for the Heartbeat connection, SHA creation will set it to round-robin automatically.

Network Scenarios

The physical network connections from the data network to the active server and passive server must be configured properly so that all hosts in the data network can seamlessly switch connections to the passive server in the event a switchover is triggered. The following section covers different configurations for various situations and Synology NAS models.

Network Implementation for Synology NAS with Two LAN Ports

In situations where both servers have only two network ports, one network port on each server will be occupied by the Heartbeat connection, so each server will have only one port available for the HA cluster to connect to the data network. Therefore, there will not be sufficient network ports to accommodate redundant paths between the hosts in the data network and the HA cluster. However, we still recommend using multiple paths to connect hosts to the data network, as well as more than one switch in your data network, to provide redundancy.
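
As an illustration of the interface suggestions in this section (fastest interface for the Heartbeat connection, remaining ports for the cluster connection, link aggregation when several equal ports remain), the following Python sketch plans the port roles for one server. The function name `plan_interfaces`, the dictionary keys, and the rule of aggregating whenever at least two equal-speed ports remain are assumptions made for this example, not behavior documented for SHA.

```python
from typing import Dict, List


def plan_interfaces(port_speeds_gbps: List[float]) -> Dict[str, object]:
    """Suggest port roles for one server, given the speed of each LAN port.

    Assumption: ports are identified by their index in the list.
    """
    if len(port_speeds_gbps) < 2:
        raise ValueError("need at least one port each for Heartbeat and cluster")

    indices = range(len(port_speeds_gbps))
    # The fastest port is reserved for the Heartbeat connection
    # (on ties, the lowest index wins).
    heartbeat = max(indices, key=lambda i: port_speeds_gbps[i])

    # Among the remaining ports, use the fastest ones for the cluster connection.
    remaining = [i for i in indices if i != heartbeat]
    best_remaining = max(port_speeds_gbps[i] for i in remaining)
    cluster = [i for i in remaining if port_speeds_gbps[i] == best_remaining]

    return {
        "heartbeat_port": heartbeat,
        "cluster_ports": cluster,
        # Assumed rule: aggregate when two or more equal-speed ports are left.
        "aggregate_cluster_ports": len(cluster) >= 2,
    }


if __name__ == "__main__":
    print(plan_interfaces([1, 1, 10, 1]))   # one 10GbE port, three 1GbE ports
    print(plan_interfaces([10, 10, 1, 1]))  # two 10GbE ports
    print(plan_interfaces([1, 1]))          # two-port model, no redundancy left
```

With only two ports, the plan leaves a single cluster port and no aggregation, which matches the limitation described for two-port models.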


Figure 3: High-availability cluster network configuration on models with two LAN ports

Synology High Availability (SHA) provides an option to trigger a switchover when the active server detects network failure. When enabled, if the connection between the active server and its switch fails, or the switch itself fails, service continuity will be maintained by switching over to the passive server (assuming the network connection of the passive server is healthy).

Network Implementation for Synology NAS with Four or More LAN Ports

The best way to create a high availability environment is to use a Synology NAS with four network ports. In this instance, you can connect multiple paths between the hosts and the HA cluster, providing a redundant failover path in case the primary path fails. Moreover, I/O connections between the data network and each clustered server can be connected to more than one port, providing a load balancing capability when all the connections are healthy.

Figure 4: High-availability cluster network configuration on models with four or more LAN ports (NAS)

Path Redundancy for the Heartbeat Connection

For Synology NAS models with four or more network ports, link aggregation may be implemented on the Heartbeat connection to provide failover redundancy and load balancing. This feature does not require a switch between the connections.

Network Troubleshooting

• The maximum transmission unit (MTU) and virtual LAN (VLAN) ID between Synology NAS and switch/router must be identical. For example, if the MTU of DS/RS is 9000, make sure the corresponding switch/router supports frames of that size.


• The switch/router should be able to perform multicast routing for the data network. However, the switch/router should also be able to perform fragmentation and handle jumbo frames (MTU value: 9000) if the Heartbeat connection goes through the switch/router.

• Ensure that the firewall settings do not block the ports used by DSM and SHA.

• Ensure that the IP addresses of the active/passive server and of the HA cluster are in the same subnet.

• DHCP, IPv6, and PPPoE are not supported in an SHA environment. Also, ensure the wireless and DHCP server services have been disabled.

• Unstable Internet (reduced ping rate or slow internet speeds) after binding:

  • Try connecting to another switch/router, or connect an independent switch/router to DS/RS and PC/Client for testing.

• The flow control setting on the switch/router may also induce packet loss in the network. Please check that the setting is in accordance with DS/RS, which normally will auto-detect and conform to the setting of the corresponding switch/router. Users are advised to manually enable/disable flow control if inconsistency is observed.

• Ensure that Bypass proxy server for local addresses in the proxy setting is enabled.

Data Replication

Within the high-availability cluster, all data that has been successfully stored (excluding data that still remains in the memory) on internal drives or expansion units will be replicated. Therefore, when services are switched from the active to the passive server, no data loss will occur.

While data replication is a continual process, it has two distinct phases spanning from formation to operation of a high-availability cluster:

• Phase 1: The initial data replication during cluster creation, or the replication of differential data when connection to the passive server is resumed after a period of disconnection (such as when the passive server is switched off for maintenance). During this phase, the initial sync is not yet complete, and therefore switchover cannot be performed. Data changes made on the active server during this initial replication are also synced.

• Phase 2: Real-time data replication after the initial sync has been completed. After the initial sync, all data is replicated in real time and treated as committed if successfully copied. In this phase, switchover can be performed at any time.

During both phases of data replication, all data syncing is performed at block level. For example, when writing a 10 GB file, syncing and committing is broken down into block-level operations and completed piecemeal to ensure that the active and passive servers contain identical data. As all data is constantly kept up to date, switchover can be accomplished seamlessly.
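
The two replication phases and the block-level syncing described above can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions: the names `ReplicationPhase`, `switchover_allowed`, and `block_ranges` are hypothetical, and the block size used in the example is arbitrary, since the paper does not state the block size SHA actually uses.

```python
from enum import Enum
from typing import Iterator, Tuple


class ReplicationPhase(Enum):
    INITIAL_SYNC = 1   # Phase 1: initial or differential replication
    REAL_TIME = 2      # Phase 2: real-time replication after initial sync


def switchover_allowed(phase: ReplicationPhase) -> bool:
    """Switchover is only possible once the initial sync has completed."""
    return phase is ReplicationPhase.REAL_TIME


def block_ranges(total_bytes: int, block_bytes: int) -> Iterator[Tuple[int, int]]:
    """Split a write into (offset, length) pieces that are synced one by one.

    block_bytes is an illustrative value, not SHA's real block size.
    """
    if block_bytes <= 0:
        raise ValueError("block_bytes must be positive")
    for offset in range(0, total_bytes, block_bytes):
        yield offset, min(block_bytes, total_bytes - offset)


if __name__ == "__main__":
    print(switchover_allowed(ReplicationPhase.INITIAL_SYNC))
    print(switchover_allowed(ReplicationPhase.REAL_TIME))
    # A 10-byte write with 4-byte blocks is replicated in three pieces.
    print(list(block_ranges(10, 4)))
```

The generator makes the "piecemeal" nature of the sync explicit: each piece can be copied and committed independently of the rest of the file.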

Data and changes to be replicated include:

• NAS Data Services: All file services including CIFS/NFS/AFP are covered.

• iSCSI Data Services: High-availability clustering supports iSCSI, including iSCSI LUN and iSCSI Target services.

• DSM and Other Services: Management applications, including Synology DiskStation Manager (DSM) and its other services and some add-on packages (e.g. Mail Server, Synology Directory Server) are also covered, including all settings and service statuses.


Special Conditions

Split-Brain Error

When a high-availability cluster is functioning normally, only one of the member servers should assume the role of active server. In this case, the passive server detects the presence of the active server via both the Heartbeat connection and cluster connection.

If all Heartbeat and cluster connections are lost, both servers might attempt to assume the role of active server. This situation is referred to as a "split-brain" error. In this case, connections to the IP addresses of the high-availability cluster will be redirected to either of the two servers, and inconsistent data might be updated or written on the two servers.

When any one of the Heartbeat or cluster connections is reconnected, the system will detect the split-brain error and data inconsistency between the two servers, and will enter high-availability safe mode.

On the other hand, a quorum server helps reduce the split-brain error rate. Users can assign another server to both the active and passive server as the quorum server. For example, a gateway server or DNS server is a good choice because they usually connect to both servers constantly.

With a quorum server, the following circumstances will be controlled:

• If the passive server cannot connect to both the active and quorum servers, failover will not be performed in order to prevent split-brain errors.

• If the active server cannot connect to the quorum server while the passive server can, switchover will be triggered in order to achieve better availability.

High-Availability Safe Mode

Instead of performing a complete replication, high-availability safe mode helps users to identify the new active server and rebuild the cluster by syncing new data and modified settings from the active server to the passive server.

In high-availability safe mode, both servers and the IP addresses of the high-availability cluster will be unavailable until the split-brain error is resolved. Also, additional information will be shown, including (1) the difference of contents in the shared folders on the two servers, (2) the time log indicating when the server became active, and (3) the last iSCSI Target connection information. The information should be found in Synology High Availability or the read-only File Station. Thus, users would be able to identify the newly active server.

When the newly active server is selected, both servers will be restarted. After that, all the modified data and settings on the active server will be synced to the passive server. Hence, a new healthy high-availability cluster shall be in place.

In addition, users can choose to make a complete replication from the active server to the passive server or they can unbind both of them.

To make a complete replication, users should choose one as the active server of the high-availability cluster and unbind the other. Once both servers are restarted, the active server will remain in the high-availability cluster. The unbound server will keep its data and return to Standalone status. Please note that a complete replication will entail binding a new passive server onward.

For detailed information regarding safe-mode resolution, you may refer to this article.

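
The two quorum-server rules listed under Split-Brain Error can be summarized as a small decision function. The Python sketch below is an interpretation, not SHA's implementation: the names `Action` and `quorum_rule` are hypothetical, the first rule is read as "the passive server can reach neither the active server nor the quorum server", and any combination the paper does not describe is reported as unspecified rather than guessed.

```python
from enum import Enum


class Action(Enum):
    BLOCK_FAILOVER = "failover is not performed"
    TRIGGER_SWITCHOVER = "switchover is triggered"
    UNSPECIFIED = "not covered by the two quorum rules"


def quorum_rule(passive_reaches_active: bool,
                passive_reaches_quorum: bool,
                active_reaches_quorum: bool) -> Action:
    """Apply the two quorum-server rules from the text, in order."""
    # Rule 1: an isolated passive server must not take over,
    # which prevents split-brain errors.
    if not passive_reaches_active and not passive_reaches_quorum:
        return Action.BLOCK_FAILOVER
    # Rule 2: the active server lost the quorum server but the passive
    # server still reaches it, so services move to the passive server.
    if not active_reaches_quorum and passive_reaches_quorum:
        return Action.TRIGGER_SWITCHOVER
    # Everything else is outside the scope of the two rules.
    return Action.UNSPECIFIED


if __name__ == "__main__":
    print(quorum_rule(False, False, True))
    print(quorum_rule(True, True, False))
    print(quorum_rule(True, True, True))
```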

Achieving Service Continuity

Auto-Failover Protection

To ensure continuous availability, Synology High Availability allows switching from the active server to the passive server in a normally functioning high-availability cluster at any time. Switchover can be manually triggered for system maintenance, or automatically initiated in the event of the active server malfunctioning, which is known as "failover." After the servers exchange roles, the original active server assumes the role of the passive server and enters standby mode. As resources within the cluster are accessed using a single cluster interface, switchover does not affect the means of access.

• Switchover: The active and passive server can be manually triggered to exchange roles without interruption to service for occasions such as system maintenance.

• Failover: In the event of critical malfunction, the cluster will automatically initiate switchover to maintain service availability.

Auto-Failover Mechanism

Auto-failover can be triggered by either the active server or the passive server, depending on the situation.

• Triggered by the active server: This happens when the active server is aware of a system abnormality and attempts to smoothly transfer services to the passive server. The active server continuously monitors itself to ensure services are functional. When it detects failed management services (e.g. storage space crashed, service error, network disconnection), the active server first halts services and then verifies that data on the storage space and the system configuration are synced with the passive server. After this process, the passive server starts to boot up all services. As a result of the transfer process, users will be unable to manage the cluster and services will stop functioning for a brief period of time (depending on the number of services and storage space).

• Triggered by the passive server: This happens when the active server is in a state in which it is unable to respond to any requests (e.g. accidental shutdown). The passive server tracks the status of the active server through the cluster connection and the Heartbeat connection. Takeover by the passive server will be prompted when network connections with the active server are dropped.

Synology High Availability is designed to protect the system when errors occur under normal status. Services cannot be guaranteed when more than one critical error occurs concurrently. Therefore, to achieve high availability, issues should be resolved immediately each time a failover is performed to allow the cluster to return to normal status.

Failover Events

The following situations commonly trigger system failover:

• Crashed storage space: If a storage space (e.g., volume, Disk Group, RAID Group, SSD Cache, etc.) on the active server has crashed, while the corresponding storage space on the passive server is functioning normally, failover will be triggered unless there are no volumes or iSCSI LUNs (block-level) on the crashed storage space. Storage spaces are monitored every 10 seconds. Therefore, in the worst case, switchover will be triggered 10 to 15 seconds after a crash occurs.

  Note: Manual switchover is not possible when the storage space on the passive server is busy with a Storage Manager related process (e.g., creating or deleting a volume), while auto failover is still allowed.

• Service Error: If an error occurs on a monitored service, failover will be triggered. Services that can be monitored include SMB, NFS, AFP, FTP, and iSCSI. Services are monitored every 30 seconds. Therefore, in the worst case, switchover will be triggered 30 seconds after an error occurs.

• Power Interruption: If the active server is shut down or restarted, both power units on the active server fail, or power is lost, failover will be triggered. Power status is monitored every 15 seconds. Therefore, in the worst case, switchover will be triggered 15 seconds after power interruption occurs. However, depending on the client's protocol behavior (e.g., SMB), the client may not be aware of the fact that data was still in the active server's cache during power interruption. If this is the case, the data that has not been flushed into the storage might not be re-sent by the client after the power interruption, resulting in a partial data loss.

• Cluster Connection Lost: If an error occurs on the cluster connection, and the passive server has more healthy cluster connections, failover will be triggered. For example, if the active server has two cluster connections and one of them is down, the active server will check whether the passive server has two or more available connections. If it does, failover will be triggered in 10 to 15 seconds. Please note that for connections joined with link aggregation, each joined connection group is considered one connection.

Switchover Limitations

Switchover cannot be initiated in the following situations:

• Incomplete data replication: When servers are initially combined to form a cluster, a period of time is required to replicate existing data from the active to the passive server. Prior to the completion of this process, switchover may fail.

• Passive server storage space crash: Switchover may fail if a storage space (e.g., volume, Disk Group, RAID Group, SSD Cache, etc.) on the passive server has crashed.

• Power interruption: Switchover may fail if the passive server is shut down or restarted, if both power units on the passive server malfunction, or if power is lost for any other reason.

• DSM update: When installing DSM updates, all services will be stopped and then come online after the DSM update installation is completed.
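
The monitoring intervals and the cluster-connection rule given under Failover Events can be captured in a short Python sketch. The names `MONITOR_INTERVAL_S` and `cluster_link_failover` are hypothetical and chosen only for this illustration; the numbers come from the text, and each link-aggregation group is assumed to be counted as a single connection before the function is called.

```python
# Polling intervals, in seconds, as stated under Failover Events.
MONITOR_INTERVAL_S = {
    "storage_space": 10,  # switchover within 10 to 15 seconds of a crash
    "service": 30,        # switchover up to 30 seconds after a service error
    "power": 15,          # switchover up to 15 seconds after power interruption
}


def cluster_link_failover(active_healthy: int, passive_healthy: int) -> bool:
    """Decide whether a cluster-connection error should trigger failover.

    Both arguments count healthy cluster connections, where a
    link-aggregation group counts as one connection.
    """
    if active_healthy < 0 or passive_healthy < 0:
        raise ValueError("connection counts cannot be negative")
    # Failover only helps if the passive server is better connected.
    return passive_healthy > active_healthy


if __name__ == "__main__":
    # Active server lost one of two connections; passive still has both.
    print(cluster_link_failover(active_healthy=1, passive_healthy=2))
    # Both servers are equally connected, so nothing is gained by failing over.
    print(cluster_link_failover(active_healthy=1, passive_healthy=1))
    print(MONITOR_INTERVAL_S["service"])
```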

After switchover has occurred, the faulty server may need to be replaced or repaired. If the unit is repaired, restarting the unit will bring the cluster back online and data synchronization will automatically take place. If the unit is replaced, the cluster will need to be re-bound in order to recreate a functioning cluster. Any USB/eSATA devices attached to the active server will have to be manually attached onto the passive server once switchover is complete.

Note: When a switchover occurs, all existing sessions are terminated. A graceful shutdown of the sessions is not possible, and some data loss may occur; however, retransmission attempts should be handled at a higher level to avoid loss. Please note that if the file system created on an iSCSI LUN by your application cannot handle unexpected session terminations, the application might not be able to mount the iSCSI LUN after a failover occurs.
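
The note above leaves retransmission to a higher level. One common way for a client application to do this is a retry wrapper with exponential backoff, sketched below in Python. This is a generic pattern, not something provided by DSM or SHA: the function `with_retries`, its parameters, and the choice of `ConnectionError` as the signal for a terminated session are assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(operation: Callable[[], T],
                 attempts: int = 5,
                 base_delay_s: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying when the connection is dropped.

    Delays grow as base_delay_s * 2**n so the client does not hammer
    the cluster while services move to the other server.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_write() -> str:
        # Simulates a session that is terminated twice during a switchover.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionResetError("session terminated")
        return "written"

    print(with_retries(flaky_write, base_delay_s=0.01))
```

The operation passed in should be safe to repeat (for example, rewriting a whole block at a fixed offset), because the client cannot know how much of the interrupted request reached storage.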


Best Practices for Deployment

Different customers of Synology NAS products may attach importance to different practices according to their needs and purposes. Here, we provide best practices for system performance and for reliability. These two types of configuration are not mutually exclusive. You may apply both practices to optimize the overall system environment.

Before the configuration, make sure that your system environment complies with the basic requirements for Synology High Availability. Both servers must be of the same model and have an identical configuration, including the capacity, quantity, and slot order of drives and RAM modules.

Aside from the drive and RAM module configuration, both servers must have the same number of attached network interface cards and LAN ports.

System Performances

To meet the needs of operating performance-intensive services, and of processing a massive number of connections or frequent data access for a long time, you can optimize performance with the following configuration:

SSD Cache

SSD cache brings a significant rise in the data reading and writing speeds of the system, especially when the data storage consists of hard disk drives (HDD). Since solid-state drives (SSD) are specifically designed for high-performance usage, promoting frequently accessed data into SSDs lets one fully utilize the system's random I/O access to effectively reduce the latency and extra data seek time of HDDs.

SSD cache must be configured identically, and each SSD must be inserted in the same disk slot, in both the active server and passive server. The size of system memory needs to be identical on the active and passive servers, as part of the memory needs to be allocated for operating the SSD cache. Therefore, different memory sizes may result in unavailability of system failover.

The failover mechanism also applies to SSD cache. This means that when the SSD cache on the active server fails, a system failover to the passive server will be triggered.

Fast Heartbeat Connection

When data is transferred to the HA cluster, a copy of that data is transferred to the passive server through the heartbeat connection at the same time. The writing process is complete only when both transfers finish. In this case, if a Gigabit network environment is applied, the writing speed will be limited to 1Gbps by the network environment.

Most plus-series and all FS/XS-series models can be fitted with external network cards that provide additional high-speed network interfaces (e.g. 10Gbps).

The most basic principle of network settings for a High Availability cluster is that the heartbeat connection bandwidth must be greater than or equal to that of the cluster network interface. The Heartbeat connection should use one of the fastest network interfaces available, such as a link aggregation or a 10G/40G network interface.

Synology offers optional 10GbE external network interface cards to be used with a High Availability cluster. When working with multiple external network interface cards, link aggregation must be set up across the different cards to increase fault tolerance. Please refer to the following environment setup examples.

Setup with Single-Volume Storage Pool

When using a single-volume storage pool, the system avoids the performance impact of the LVM (logical volume management) layer that comes with a multiple-volume storage pool. We highly recommend the single-volume storage pool setup for the peer-to-peer storage architecture of Synology High Availability.

Reliability

Synology High Availability is dedicated to the hardware and software protection of Synology NAS servers. However, aside from the NAS storage itself, the normal operation of the whole
service also depends on several other factors, including stable
network connections and power supply.

Direct Heartbeat Connection between Two Servers
When the Heartbeat connection between the active and passive servers
goes through a switch, it becomes harder to manage the risk of
network failure caused by the switch itself or by the connections
from the switch to the respective servers. For the reliability of
your system, we recommend a direct Heartbeat connection between the
active and passive servers.

Heartbeat and HA Cluster Connections with Link Aggregation
A link aggregation is formed by at least two connections between the
same peers. It not only increases transfer rates, but also enhances
the availability of such network connections. When one of the
interfaces or connections in a link aggregation fails, the link will
still work with the rest.

In this section, we provide an environment setup scenario as
follows. Two RS4017xs+ models are each equipped with two E10G17-F2
external network interface cards. There are four 1GbE network
interfaces (local network 1, 2, 3, 4) and four 10GbE network
interfaces (local network 5, 6, 7, 8) on each NAS. Local network 5
and 6 are provided by external network interface card 1, and local
network 7 and 8 are provided by external network interface card 2.
Please refer to the recommended setup below:

• Heartbeat interface: Link Aggregation of the 10GbE network
interface. One interface from external network interface card 1 and
one interface from external network interface card 2.

• Cluster interface: Link Aggregation of the 10GbE network
interface. One interface from external network interface card 1 and
one interface from external network interface card 2.

These configurations ensure that the performance of both the cluster
connection and the Heartbeat connection is maximized while
maintaining redundancy for both connections. The network service
provided by the cluster will not be affected by the Heartbeat
connection, thus increasing fault tolerance for the external network
interface card. If a problem occurs with external network interface
card 1, all services can still be provided through external network
interface card 2.

To prevent slow writing speed caused by replicating data to the
passive server when data is written, it is recommended that the
Heartbeat connection has the same or higher bandwidth than the
service. The Heartbeat connection must be configured on the fastest
network interface. For instance, if the servers are equipped with
10GbE add-on network cards, the Heartbeat connection must be
configured by using the 10GbE cards. In addition, it is strongly
recommended that users build a direct connection (without switches)
between the two servers, the distance between which is usually
shorter than 10 meters. If an HA cluster requires two servers at a
larger distance, the Heartbeat connection between the two servers
must have no other device in the same broadcast domain. This
configuration can be achieved by configuring a separate VLAN on the
Ethernet switch to isolate the traffic from other network devices.
Please make sure that the cluster connection and the Heartbeat
connection are in different loops so that they are not interrupted
at the same time.

Separate Switches for the Cluster Connections
A network switch is required for the HA cluster connection between
the active/passive server and the client computer. To limit the
impact of network switch malfunctions, we recommend connecting each
of the two servers to a different switch. For example, you can set
up a link aggregation with two connections between one switch and
the active server, and set up another one between another switch and
the passive server. Then, the client computer will be configured
with two connections, each of which is linked to one of the two
switches.

Separate Power Supply for Servers and Switches
Aside from the reliability of network connections among the servers,
switches, and clients, a stable power supply for the system should
also be taken into consideration. Most FS/XS-series and plus-series
models with RP are equipped with redundant power supply, allowing
you to allocate different electric power sources to the server. For
the NAS models without redundant power supply, we recommend
allocating one power supply to one server and its connected switch,
and another one to the other server and switch. This helps mitigate
the risk of system failure resulting from a power outage of both
servers/switches.

Split-Brain Prevention with Quorum Server
In a real implementation of Synology High Availability, there is
some possibility that, upon network abnormalities, the passive
server takes over the workload even though failover from the active
server is not triggered. Both servers may then assume the services
at the same time, and in this case split-brain occurs. To avoid
split-brain, you can configure a quorum server to detect the
connection between itself and the active and passive servers
respectively.
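The two quorum-server rules given in the Special Conditions part of this white paper can be sketched as a small decision function. This is only an illustrative sketch under stated assumptions: the function name, the three boolean inputs, and the `UNSPECIFIED` fallback are hypothetical and do not reflect how DSM implements the check.

```python
from enum import Enum


class Decision(Enum):
    NO_FAILOVER = "failover is not performed"
    SWITCHOVER = "switchover is triggered"
    UNSPECIFIED = "not covered by the two rules sketched here"


def quorum_decision(passive_reaches_active: bool,
                    passive_reaches_quorum: bool,
                    active_reaches_quorum: bool) -> Decision:
    """Apply the two quorum-server rules; all names are hypothetical."""
    # Rule 1: the passive server is cut off from both the active server
    # and the quorum server, so it must not take over (avoids split-brain).
    if not passive_reaches_active and not passive_reaches_quorum:
        return Decision.NO_FAILOVER
    # Rule 2: the active server has lost the quorum server while the
    # passive server can still reach it, so the roles are exchanged.
    if not active_reaches_quorum and passive_reaches_quorum:
        return Decision.SWITCHOVER
    # Assumption: every other combination is left undecided in this sketch.
    return Decision.UNSPECIFIED


print(quorum_decision(False, False, True).value)
print(quorum_decision(True, True, False).value)
```

The two rules never overlap, because the first requires that the passive server cannot reach the quorum server and the second requires that it can.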


Performance Benchmark

Performance Considerations

Synology High Availability employs the synchronous commit approach:
write operations are acknowledged only after data has been written
on both the active and passive servers, at the cost of performance.
To enhance random IOPS performance and reduce latency, we recommend
enabling the SSD cache for volumes that require high performance for
random I/O workloads.

To show how the results vary across different levels of hardware,
the performance benchmark was done on DS918+ and RS18017xs+.

Model: RS18017xs+
Testing Version: DSM6.2-23601
File System: Btrfs
Disks and RAID Type: 12 SSD (200GB) with RAID 5

Item                                                        SHA¹         Standalone
Cluster Connection Bandwidth                                10GbE *1     N/A
Heartbeat Connection Bandwidth                              10GbE *1     N/A
SMB - 64KB Sequential Throughput - Read (MBps)              1181         1180.7
SMB - 64KB Sequential Throughput - Write (MBps)             1175.61      1179.54
SMB - 64KB Encrypted Sequential Throughput - Read (MBps)    1180.41      1180.41
SMB - 64KB Encrypted Sequential Throughput - Write (MBps)   1120.22      1179.67
SMB - 4KB Random IOPS - Read                                213529.97    236580.3
SMB - 4KB Random IOPS - Write                               96072.53     120902.54
iSCSI - 4KB Random IOPS - Read                              180690.14    199660.71
iSCSI - 4KB Random IOPS - Write                             92276.7      112535.93

Model: DS918+
Testing Version: DSM6.2-23601
File System: Btrfs
Disks and RAID Type: 4 HDD (2TB) with RAID 5

Item                                                        SHA¹         Standalone
Cluster Connection Bandwidth                                1GbE *1      N/A
Heartbeat Connection Bandwidth                              1GbE *1      N/A
SMB - 64KB Sequential Throughput - Read (MBps)              112.96       112.96
SMB - 64KB Sequential Throughput - Write (MBps)             107.11       112.97
SMB - 64KB Encrypted Sequential Throughput - Read (MBps)    112.96       112.96
SMB - 64KB Encrypted Sequential Throughput - Write (MBps)   106.46       112.97

¹ The configuration of each server in the SHA cluster.
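To make the cost of synchronous commit easier to read off the benchmark, the relative difference between the SHA and standalone results can be computed per row. The following is a minimal sketch using a few RS18017xs+ values copied from the table; the helper name `sha_overhead_percent` and the dictionary layout are illustrative assumptions, not part of any Synology tool.

```python
# Benchmark values copied from the RS18017xs+ results: (SHA, standalone).
RESULTS = {
    "SMB 64KB sequential write (MBps)": (1175.61, 1179.54),
    "SMB 4KB random read (IOPS)": (213529.97, 236580.3),
    "SMB 4KB random write (IOPS)": (96072.53, 120902.54),
    "iSCSI 4KB random read (IOPS)": (180690.14, 199660.71),
    "iSCSI 4KB random write (IOPS)": (92276.7, 112535.93),
}


def sha_overhead_percent(sha: float, standalone: float) -> float:
    """Relative drop of the SHA result versus standalone, in percent."""
    return (standalone - sha) / standalone * 100.0


for name, (sha, standalone) in RESULTS.items():
    overhead = sha_overhead_percent(sha, standalone)
    print(f"{name}: {overhead:.1f}% lower in SHA")
```

For the SMB 4KB random write row this gives roughly 20.5%, while the sequential rows differ by well under 1%, which is consistent with the recommendation to enable SSD cache for random I/O workloads.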


Switchover Time-to-Completion
When switchover is triggered, the active server becomes the
passive server, at which time the original passive server will take
over. During the exchange, there will be a brief period where
both servers are passive and services are paused.
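The role exchange described above can be sketched as a sequence of three states. This is only an illustration of the ordering; the function name and the server names `server_a` and `server_b` are hypothetical.

```python
def switchover_states(active: str, passive: str) -> list[dict[str, str]]:
    """Return the successive role assignments during a switchover."""
    # Initial state: one active server and one passive server.
    before = {active: "active", passive: "passive"}
    # Step 1: the active server becomes passive. For a brief period
    # both servers are passive and services are paused.
    paused = {active: "passive", passive: "passive"}
    # Step 2: the original passive server takes over as active.
    after = {active: "passive", passive: "active"}
    return [before, paused, after]


for step, roles in enumerate(switchover_states("server_a", "server_b")):
    print(step, roles)
```

The middle state is the window in which services are unavailable, which is the interval that the time-to-completion figures measure.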

The time-to-completion varies depending on a number of factors:

• The number and size of volumes or iSCSI LUNs (block-level)
• The number and size of files on volumes
• The allocated percentage of volumes
• The number of running packages
• The number and total loading of services on the cluster

The following table provides estimated time-to-completion:

Settings      DS918+    RS18017xs+
Switchover    64        30
Failover      62        31

• Unit: seconds
• Switchover is triggered manually in the Synology High
Availability package.
• Failover is triggered by unplugging the power cord of the
active server.


Summary

Synology is committed to delivering an exceptional user experience
for customers. Synology High Availability (SHA) comes with an
intuitive user interface with built-in graphical management tools to
simplify complex high availability configurations, streamline
workflow, reduce implementation time and installation expense, and
prevent costly operational efficiency problems. This makes it a
solution especially suitable for small and mid-range businesses with
limited IT resources.

Synology High Availability (SHA) is available on most Synology NAS
models, so new and existing customers can easily set up a
comprehensive business continuity plan when acquiring Synology NAS.
Synology High Availability (SHA) provides a cost-effective and
reliable means of guarding against service downtime while offering
enterprise-grade features that are designed to meet common scenarios
and fulfill data protection and high availability requirements. This
white paper has outlined the basic principles and benefits of
Synology High Availability (SHA), in addition to recommendations on
best practices and strategies for companies of any size. For more
information and customized consultation on deployment, please
contact Synology at www.synology.com.

More Resources
• DSM Live Demo: Take advantage of our free 30-minute DSM
6.2 trial session. Experience our technology yourself before
making the purchase!

• Tutorials and FAQ: Learn more about SHA and get the
information you need with step-by-step tutorials and
frequently asked questions.

• Where to Buy: Connect with our partners and find out how to
choose the best product and solution to suit your business
needs.


SYNOLOGY
INC.
9F, No. 1, Yuan Dong Rd.
Banqiao, New Taipei 22063
Taiwan
Tel: +886 2 2955 1814

SYNOLOGY
AMERICA CORP.
3535 Factoria Blvd SE, Suite #200,
Bellevue, WA 98006
USA
Tel: +1 425 818 1587

SYNOLOGY
UK LTD.
Unit 5 Danbury Court, Linford Wood,
Milton Keynes, MK14 6PL, United
Kingdom
Tel.: +44 (0)1908048029

SYNOLOGY
FRANCE
39 rue Louis Blanc
92400 Courbevoie
France
Tel: +33 147 176288

SYNOLOGY
GMBH
Grafenberger Allee 125
40237 Düsseldorf
Deutschland
Tel: +49 211 9666 9666

SYNOLOGY
SHANGHAI
200070, Room 201,
No. 511 Tianmu W. Rd.,
Jingan Dist., Shanghai,
China

SYNOLOGY
JAPAN CO., LTD.
1F, No.15, Kanda Konyacho,
Chiyoda-ku Tokyo, 101-0035
Japan

synology.com

Synology may make changes to specifications and product descriptions at any time, without notice. Copyright © 2019 Synology Inc. All rights reserved. ® Synology and other names of Synology Products are proprietary marks or registered trademarks of Synology Inc. Other products and company names mentioned herein are trademarks of their respective holders.


