0% found this document useful (0 votes)

86 views16 pages

White Paper Continuous Operations High Availability

sdfdfa

Uploaded by

phamvanha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

86 views16 pages

White Paper Continuous Operations High Availability

sdfdfa

Uploaded by

phamvanha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

White Paper

Continuous Operations and High Availability

Overview
The Cisco Nexus 7000 platform is modular in its design, with an emphasis on redundant critical
components throughout all of its subsystems. A highly modular and compartmentalized approach
to systems design is applied to all facets of the platform, spanning across the physical,
environmental, power, and system software aspects of its architecture. Additionally, a distinct
functional separation between control plane and forwarding data plane is emphasized in its design
in order to allow continuous operation and zero service disruption during planned or unplanned
control-plane events or failures. This document is intended to provide details of the high availability
features and options across all of these areas in order to provide the reader a better understanding
of the hardware and software fault tolerance capabilities of the Cisco Nexus 7000 platform
architecture.

Physical Subsystem Redundancy Options and Features

Supervisor Module Redundancy

The Cisco Cisco Nexus 7000 platform supports dual supervisor modules to provide 1+1
redundancy for the control and management plane. A dual supervisor configuration operates in an
active/standby capacity in which only one of the supervisor modules is active at any given time,
while the other acts as a standby backup. State and configuration remain constantly synchronized
between the two supervisor modules to provide seamless and stateful1 switchover in the event of a
supervisor module failure. NX-OSs Generic On-Line Diagnostics (GOLD) subsystem and
additional monitoring processes on the supervisor facilitate the triggering of a stateful failover to
the redundant supervisor upon the detection of unrecoverable critical failures, service restartability
errors, kernel errors, or hardware failures.

1
Stateful switchover is handled on a per service basis. Not all services have been designed to switch over in a
stateful manner. See subsequent sections of this document for further details.

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 1 of 16
White Paper

Figure 1. Cisco Nexus 7000 Series 10-Slot Chassis Front View

In the event of a supervisor-level unrecoverable failure, the currently active but errored supervisor
will trigger a switchover. In executing the switchover, the standby supervisor becomes the new
active supervisor, using the synchronized state and configuration while the errored supervisor is
reloaded. If the failed supervisor is able to reload and pass self-diagnostics it will initialize, become
the new standby supervisor, and synchronize its operating state with the newly active unit.

Additional details on supervisor switchover are contained in the Supervisor Switchover section of
this document.

Switch Fabric Redundancy

In addition to control-plane and management availability through redundant supervisor modules,
the Cisco Nexus 7000 platform also provides switching fabric availability through redundant switch
fabric module implementation (Figure 2). A single Cisco Nexus 7000 chassis can be configured
with one or more fabric modules, up to a maximum of five for capacity as well as redundancy.
Each I/O module installed in the system will automatically connect to and use all functional
installed switch fabric modules. A failure of a switch fabric module will trigger an automatic
reallocation and balancing of traffic across the remaining active switch fabric modules.
Replacement of the failed fabric module reverses this process. Once the replacement fabric
module is inserted and online, traffic is again redistributed across all installed fabric modules,
thereby restoring the redundancy level.

Figure 2. Cisco Nexus 7000 Series10-Slot Chassis Rear View (Fabric Modules)

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 2 of 16
White Paper

Cooling Subsystem
The Cisco Nexus 7000 10-slot chassis contains two redundant system fan trays for I/O module
cooling and two additional redundant fan trays for switch fabric module cooling (Figure 3). Both
types of fan tray are hot swappable to the system.

The fan speeds are variable and are dynamically adjusted to one of 16 levels2 in order to optimize
system cooling in both sections while minimizing overall system noise and power draw.

The system and fabric fan speeds are automatically adjusted dependent upon a number of factors
including:

Highest power-drawing module installed in the system or in the fabric sections

Temperature value readings on the inlet, supervisor, and fabric module ambient
temperature sensors
Presence of the air inlet filter

Additionally, a detected failure of a fan within a given fan tray will trigger an adjustment (typically
an increase) in the speed and flow rate of the remaining fans in order to compensate for the
failure. Due to the thermal dynamics impact created by a missing fan tray, a detected removal of
an entire fan tray, without replacement, will initiate a series of warnings followed by a system
shutdown after a three-minute warning period.

2
The 16 levels of variable fan output are currently a software limitation. Future versions of software may
increase the granularity of output levels to up to 256 levels.

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 3 of 16
White Paper

Figure 3. Cisco Nexus 7000 Series 10-Slot Chassis Rear View (Fan Trays)

Power Subsystem Availability Features

The Cisco Nexus 7000 platform is powered by three internally redundant power supplies (Figure
4). Each of those individual power supply units is composed of two internalized isolated power
units, giving it effectively two power paths per modular power supply and six paths in total, per
chassis, when fully populated. The power supplies use a proportional load-sharing method for
power distribution to power system components, allowing the efficient use of dissimilar capacity
power supplies in the same chassis. Therefore all installed power supplies are active and share
the overall load. Additionally, the power subsystem allows the three power supplies to be
configured in any one of four redundancy modes.

Figure 4. Cisco Nexus 7000 Series10-Slot Chassis Rear View (Power Supplies)

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 4 of 16
White Paper

The available power supply redundancy modes are:

Nonredundant Combined Power: In this mode, there is no power supply redundancy. All
available power across installed power supplies is combined to provide the sum of all
available power to the usable power budget. In this configuration, a failure of a power
source or power supply will affect the overall available power to the system. If the current
power draw exceeds the postfailure power budget, the system will enter a failure state.

Figure 5. Combined power model (Non-redundant)

N+1 Power Supply Unit Redundancy (default): In this mode, protection against a single
power supply unit failure is provided. In the event of a single power supply failure, loads are
redistributed using available capacity across remaining functional power supply units. N+1
redundancy is available with either 2 or 3 power supplies installed in the system. The total
available power to the chassis is the sum of all installed power supplies minus that of the
largest (for redundancy). This is the default power redundancy mode.

Figure 6. N+1 redundancy model

Input Source Redundancy: This mode provides protection against input source failures. In
order to implement input grid redundancy, each power supply unit must be connected to
two independent power sources (grids). A single power grid failure will allow all installed
power supplies to continue functioning through the remaining internalized half still
connected to the remaining available power grid.

Figure 7. Input grid redundancy model

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 5 of 16
White Paper

Full Redundancy (Power Supply + Input Grid): In full redundancy mode, the power supply
configuration provides protection against either a single power source grid failure or a
single power supply unit failure. It is a logical combination of the N+1 and input grid
redundancy modes.

Figure 8. Full redundancy model

Each of the power supply redundancy modes imposes different power budgeting and allocation
models, which in turn deliver varying usable power yields and capacities. For more information or
details regarding power budgeting, usable capacity, and planning requirements please refer to the
Cisco Nexus 7000 environmental data sheet or white paper.

Internal Ethernet Out-of-Band Channel

Cisco Nexus 7000 uses a switched Ethernet out-of-band channel (EOBC) for management and
control traffic between the supervisors and I/O modules and between the supervisors and fabric
modules. On the supervisor modules, Ethernet connectivity is provided using an on-board 24-port
Ethernet switch on a chip, with a one 1 Gbps Ethernet link from each supervisor to each I/O
modules, each supervisor to each switch fabric module (up to five), and between the two
supervisors (Figure 9). Two additional redundant 1 Gbps Ethernet links are used on each
supervisor to connect to the local CPU within the supervisor. This design provides a highly
redundant switched-Ethernet-based fabric for control and management traffic between the
supervisors and all other processors and modules within the system.

Figure 9. EOBC Connectivity in Cisco Nexus 7000

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 6 of 16
White Paper

The modules utilize a six-port Ethernet switch on a chip, of which only three ports are used to
provide three 1 Gbps lines of connectivity. One line is used for connectivity to the module local
CPU, while two 1 Gbps lines are used to connect to each of the supervisors. These Ethernet-
based channels are used strictly for control-plane signaling and do not carry data-plane traffic.

NX-OS System Software Redundancy Options and Features

The Cisco Nexus 7000 platform runs the NX-OS operating system. NX-OS uses a highly
modularized architecture that compartmentalizes components for redundancy, fault isolation, and
resource efficiency. Functional feature components operate as independent processes referred to
as services. NX-OS services implement availability features by design into each service, as
needed. Most system services are capable of performing stateful restarts, thereby allowing a given
service experiencing a failure to be restarted and to resume operations transparently to other
services within the platform, and fully transparently to neighboring devices within the network.
Back-end management and orchestration of processes, services, and applications within a
platform are handled by a set of high-level system-control services: the system manager,
persistent storage services; message and transaction services; and the Installer.

These high availability infrastructure components provide the services, APIs, monitoring, and
control support that facilitate the service restart and supervisor switchover capabilities of the
platform. By employing multiple levels of service and component monitoring, combined with
various layers of structured failure scenario handling, the NX-OS software architecture provides a

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 7 of 16
White Paper

very comprehensive approach to helping ensure system availability. Table 1 illustrates the various
potential failure scenarios and handling capabilities built into NX-OS.

Table 1. Various failure scenarios handled by System Manager

Failure Scenario Action

Service/process exception Service restart

Service/process crash Service restart

Unresponsive service/process Service restart

Repeated service failure Supervisor reset (single)/switchover (dual)

Unresponsive system manager Supervisor reset (single)/switchover (dual)

Supervisor hardware failure Supervisor reset (single)/switchover (dual)

Kernel Crash Supervisor reset(single)/switchover (dual)

Unresponsive System Supervisor reset(single)/switchover (dual)

The following sections of the document examine the details of the components that deliver these
continuous operations and failure recovery capabilities.

System Redundancy Components

NX-OS leverages three key core infrastructure services to provide overall high availability
functionality:

System manager
Persistent storage service
Message and transaction service

In a redundant configuration, such as when dual supervisor modules are in operation, mirrored
services run on each supervisor module, with configuration and operating state synchronized
between them. One of those supervisors will operate as the active supervisor while the other
operates in a standby facility until activated in a switchover.

Figure 10. NX-OS High-Availability core infrastructure components

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 8 of 16
White Paper

The system manager orchestrates overall system function, service management, and system
health monitoring. It is also responsible for maintaining overall high availability states, enforcing
high availability policies, and managing system configuration. It is the system manager that is
responsible for launching, stopping, monitoring, and restarting services. System manager is is also
leveraged for initiating and managing the syncing of service states and intersupervisor states for
stateful switchover. It is also the system manager that will initiate a supervisor switchover if it
determines that the current supervisor has undergone an unrecoverable failure or if key core
services are undergoing errors and cannot be restarted reliably. In addition, being the overall
control and monitoring process, the system manager is responsible for punching or triggering the
keep-alive indicator for the hardware-based watchdog timer on the supervisor. The lack of this
periodic heartbeat from the system manager within the keep-alive timeout period of the watchdog
timer will indicate a nonresponsive system manager, which will trigger a hardware-based
supervisor reset (single supervisor) or switchover (dual supervisors). The system manager's health
is also monitored by a kernel level module receiving periodic heartbeats sent by the system
manager process. This allows the system to take corrective action in response to an unresponsive
system manager that has exceeded the heartbeat timeout period.

The persistent storage service (PSS) is the base infrastructure component responsible for storing
and managing the operational run-time information and configuration of the other platform
services. It is leveraged by other system services to recover state in the event of a service restart.
PSS provides a managed, structured API to read and write data to and from the storage system,
essentially functioning as a database of state and run-time information. Services wishing to
leverage the PSS infrastructure are able to checkpoint their state information periodically, as
needed. This allows services to subsequently recover to the last known operating state preceding
a failure, thereby allowing for a stateful restart. This state recovery capability is available to NX-OS
services in both single- and dual-supervisor configurations, and helps enable a service to
transparently return to operation without service disruption to data-plane traffic, neighboring
devices, or other internal services.

For example, even in a single supervisor configuration, the PSS enables the stateful restart of a
service such as spanning-tree without impacting the overall spanning-tree topology or stability.

The NX-OS message and transaction service (MTS) is a high performance interprocess
communications (IPC) message broker that specializes in high availability semantics. MTS
handles message routing and queuing between services on and across modules (including across
supervisors). MTS facilitates the exchange of messages such as event notification,
synchronization, and message persistency between system services and system components.
MTS also facilitates and manages message queuing between services and, as a result, can
maintain persistent messages and logged messages in queues for access even after a service
restart.

Figure 11. MTS inter-service communications

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 9 of 16
White Paper

Each of these system infrastructure component services plays a role in providing overall system
availability. Individual functional system services (for example, authentication, authorization, and
accounting [AAA], Spanning Tree Protocol, Network Time Protocol [NTP], command parser, and
so on) are responsible for using these system-level redundancy infrastructure components to
checkpoint their own states, recover the states upon restart, and synchronize the states to their
standby counterparts using the PSS and MTS facilities.

Service Modularity and Restart Capabilities

The services within NX-OS are designed as nonkernel space (user space) processes that perform
a function or set of functions for a subsystem or feature set.

Each service (essentially a feature) and each service instance is run as a separate independent
protected process. This approach provides a highly fault tolerant software infrastructure and fault
isolation between services. In short, a failure in a service instance (such as 802.1q or MST) will not
affect any other services running at that time (such as LACP). Additionally, each instance of a
service can run as an independent process. This implies that two instances of a routing protocol
(for example, two instance of OSPF) run as separate processes, thereby providing fault isolation
even between those two instances of the same service.

This resilient highly modular architecture allows the system to provide the highest level of
protection and fault tolerance for all services running on the platform. This approach also facilitates
rapid fault isolation, resolution, and modular software upgrades to address specific issues, while
minimizing the impact to other critical services and the overall system.

The services in NX-OS are also capable of undergoing rapid restart. A service restart may be
initiated automatically by the system manager in the event of critical fault detection. When a restart
is initiated, a service process is sent a signal to stop and is subsequently shut down and then
rapidly restarted, typically in the order of milliseconds. This allows an error condition to be cleared
and a service to be reset if necessary.

A service can undergo different types of restarts, stateful or stateless. A service may be restarted
by the system manager depending on current errors, failure circumstances, and configured high

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 10 of 16
White Paper

availability policy for the service. If the service is issued a stateful restart, the new service process
will retrieve the previous run-time state data from PSS and resume operations from the last
checkpointed service state. Most of the services in NX-OS are designed to support stateful restart
capabilities by leveraging the high availability infrastructure services of PSS and MTS. If the
service is issued a stateless restart, it will initialize and run as if it had just been started with no
prior state.

Not all services are designed for stateful restart. More specifically, currently the group of Layer 3
routing protocols (for example, Intermediate System-to-Intermediate System [IS-IS], OSPF, BGP,
Routing Information Protocol [RIP], and so on) for IPv4, IPv6, and IP multicast are not designed to
leverage the state persistence of PSS within NX-OS. Instead, these protocols are currently
designed to rebuild their operational state using information obtained from neighbors. Additional
details on the high availability functionality of Layer 3 protocols are addressed in the Layer 3
Protocol High Availability Features section of this document.

Figure 12. Service restart process

Supervisor Switchover
The CISCO NEXUS 7000 platform provides 1+1 redundant supervisor modules that can perform a
supervisor switchover (SSO) in critical failure situations. The switchover of active to standby
supervisor is nondisruptive to the forwarding plane and can therefore provide nonstop forwarding
of data during a supervisor switchover.

2008 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 11 of 16
White Paper

The active and standby supervisor modules in the Cisco Nexus 7000 platform remain
synchronized in state and configuration using a two-stage process. The first time a supervisor
enters a fully operational state, it will determine whether it is the active or the standby supervisor. If
it is the standby unit, it will perform the first stage of the synchronization by initiating a global sync
(GSYNC) with its counterpart. To accomplish this, the standby supervisor will issue a GSYNC
request to the active supervisor. The active supervisor unit, in return, will begin a complete
exchange of current operating configuration and state to the standby unit. Once this first-stage
complete information exchange has taken place, the supervisors enter the second stage of SSO
synchronization, in which event driven and state change triggered sync updates are sent through
messages through the MTS facility. A service that is updating or checkpointing its state will issue a
message to MTS as well as writing to PSS. MTS will route the subsequent sync message to the
peer MTS process on the standby supervisor through MTS messages over the Ethernet out-of-
band channel. The MTS process on the standby supervisor will notify the corresponding process
on the standby supervisor in order for it to update and checkpoint its own state to match that of the
peer process on the active supervisor.

Additionally, the system manager of the active supervisor will monitor the state and operational
readiness of the standby supervisor. If the standby supervisor has entered an unexpected state, or
if MTS synchronization to the standby fails, the standby supervisor is reset in an effort to facilitate
effective and consistent state between the active and standby supervisors.

In the event of the detection of an unrecoverable error, the system manager, depending on the
configured high availability policy, may initiate a supervisor switchover from the current active
supervisor over to the current standby unit. A switchover may be triggered based on one or more
criteria that contribute to the determinations of an unrecoverable failure:

Repeated service restarts within the restart timer stability window

Hardware failure
Kernel panic

In-Service Software Updates

NX-OS provides the ability to perform in-service software updates (ISSUs). The overall modular
software architecture of NX-OS supports plug-in-based services and features. This framework
makes it possible to perform complete image upgrades, nondisruptively without impacting the data
forwarding plane. This capability allows for nonstop forwarding during a software upgrade,
including upgrades between full image versions (for example, from 4.0 to 4.1).

An in-service software upgrade is initiated manually either through the command-line interface
(CLI) by an administrator, or (in future releases) via the management interface of the Datacenter
Network Manager software platform. When initiated, an in-service software upgrade updates (as
needed) the following components on the system:

Supervisor BIOS, Kickstart Image, System Image

I/O module BIOS and Image
Connectivity Management Processor (CMP) BIOS and Image

Once initiated, the ISSU Installer service begins the ISSU cycle. The upgrade process is
composed of several phased stages designed to minimize overall system impact with no impact to
data traffic forwarding. Figure 13 provides a high-level illustration of the in-service software
upgrade process.

Figure 13. In-Service Software Upgrade Process

It is important to note that an ISSU-based upgrade is a systemwide upgrade and not a virtual
device context (VDC)based upgrade. Therefore, the initiation of an ISSU-based upgrade applies
the same image and versions across the entire system, to all configured VDCs. Virtual device
contexts are primarily a control-plane and user-interface virtualization and cannot currently run
independent image versions per virtualized resource.

Additionally, in order to minimize overall network impact, an in-service software upgrade should
only be initiated when the network topology is stable, as control-plane components necessary to
participate in topology state changes may be in the process of upgrading during a transition.

Layer 3 Protocol High Availability Features

The Layer 3 protocols for IPv4, IPv6, and IP multicast do not currently implement stateful restart
capabilities. This capability may be added in the future for these protocol services. Currently, the
Layer 3 protocols utilize two major methods for operational recovery during a restart:

Graceful restart extensions

Protocol-based periodic refresh

OSPFv2, OSPFv3, IS-IS, Enhanced Interior Gateway Routing Protocol (EIGRP), and BGP utilize
graceful restart extensions to the base protocols to provide nonstop forwarding and least obtrusive
routing recovery for those environments. The NX-OS routing protocol extensions for graceful
restart follow the standards outlined in RFCs 3623, 4724, and 3847 for OSPFv2/v3, BGPv4, and
IS-IS respectively. It is important to note that while the NX-OS standards-based implementations of
graceful restart extensions and non-stop forwarding are compatible with Cisco IOS versions which
also leverage the same IETF standards, they are not compatible with older IOS versions that use
the pre-standard implementations of Cisco Non-Stop Forwarding (NSF).

In addition to the standards-based implementations for the aforementioned protocols, graceful

restart extensions have been developed for use with EIGRP. The extensions to EIGRP used in
NX-OS are compatible with those used for EIGRP nonstop forwarding on other Cisco platforms.

Graceful restart allows the Cisco Nexus 7000 platform to maximize the benefits of the distinct
separation between control plane and data forwarding plane in its architecture. If a restart of a
graceful-operations-capable protocol is required, that particular routing protocol will utilize the
relevant mechanisms to signal to its neighbors that a planned restart is being executed. This
notification to its neighbors will allow the neighboring devices to continue forwarding traffic to the
restarting entity as if operations were normal and allows the restarting service to remove itself from
the network control plane non-disruptively. The unit restarting its routing protocol will continue to
forward traffic based on the last established routing and forwarding information bases (RIB/FIB),
independent of control-plane operations, thereby allowing uninterrupted, continuous data delivery
during the restart. Once the restarted routing service has reestablished a stable state, it can notify
its neighbors and rebuild its adjacencies, thereby nondisruptively reinserting itself into the network
from a control-plane perspective.

The RIPv2, Protocol Independent Multicast (PIM), PIM6, Internet Group Management Protocol
(IGMP), MSDP, and MLD protocols do not implement graceful restart extensions. Instead, as
refresh-based protocols, they utilize the periodic refresh from neighbors that is inherent to the
protocol to reestablish their state.

Since the routing protocols themselves are not stateful during recovery from a restart, neither are
the routing information bases for these protocols. The unicast and multicast RIBs must be
reestablished and rebuilt based on the recovered state postrestart of the relevant Layer 3 protocol.
RIB and FIB are maintained during the restart in order to continue forwarding operations, but are
updated once the protocol is stable and has reestablished its recovered state. This allows for any
necessary updates to RIB and FIB, as needed, to accommodate any network changes after
restart.

Virtual Device Contexts

NX-OS implements a logical virtualization at the device level allowing multiple instances of the
device to operate on the same physical switch simultaneously. These logical operating
environments are known as virtual device contexts, or VDCs. Virtual device contexts provide
logically separate device environments that can be independently configured and managed. This
degree of isolation between virtual devices provides a very valuable level of fault isolation in
addition to the security and administrative benefits. Human error or failure conditions due to
configuration are naturally isolated within a given virtual device. While virtual device contexts are
not primarily a high availability feature, their effective creation of operationally independent fault

domains, contributing to availability and preventing service disruption associated with device
configuration, should always be considered.

The specific functional and operational details of virtual device contexts are outside the scope of
this document. For more information on VDCs, please refer to the white paper on Cisco Nexus
7000 virtualization features.

Summary
With this deeper examination of the Cisco Nexus 7000 platform, it becomes apparent that the
entire platform is truly designed from the bottom up with a goal towards highly-available
uninterrupted service under almost any condition. By strategically employing fault isolation, fault
detection, and fault recovery on many levels, in all areas, the Cisco Nexus 7000 platform is
capable of delivering critical enterprise-class service for current and future business needs while
preserving capital investment as well as the data path.

Glossary of Terms Used

NX-OS: Datacenter Operating System

MTS: Message and Transaction Service

PSS: Persistent Storage Service

EOBC: Ethernet Out-of-Band Channel

IETF: Internet Engineering Task Force

RFC: Request For Comment

CMP: Connectivity Management Processor

ISSU: In-Service Software Update

RIB: Routing Information Base

FIB: Forwarding Information Base

VDC: Virtual Device Context

NSF: Non-Stop Forwarding

HA: High-Availability

API: Application Programming Interface

MST: Multiple Spanning-Tree

LACP: Link Aggregation Control Protocol

Printed in USA C11-446927-01 02/08

Nfs With Sso
No ratings yet
Nfs With Sso
21 pages
Cisco NCS 1004 Data Sheet
No ratings yet
Cisco NCS 1004 Data Sheet
8 pages
Cisco-Building Highly Available Layer 3 Networks PDF
No ratings yet
Cisco-Building Highly Available Layer 3 Networks PDF
14 pages
Controllogix 5570 Redundancy: User Manual
No ratings yet
Controllogix 5570 Redundancy: User Manual
192 pages
Quad-Sup White Paper
No ratings yet
Quad-Sup White Paper
19 pages
Spares Planning: Title Page
No ratings yet
Spares Planning: Title Page
52 pages
5580 Redundancy 1756-Um015 - En-P
No ratings yet
5580 Redundancy 1756-Um015 - En-P
184 pages
Highav rm002 - en P
No ratings yet
Highav rm002 - en P
50 pages
Cisco Nexus 7700 Switches Data Sheet
No ratings yet
Cisco Nexus 7700 Switches Data Sheet
10 pages
Cisco Nexus Fab2 Module
No ratings yet
Cisco Nexus Fab2 Module
4 pages
Cisco ASR 9000 High Availability Guide
No ratings yet
Cisco ASR 9000 High Availability Guide
14 pages
Cisco 7600 Redundancy High Availability Configuration Guide
No ratings yet
Cisco 7600 Redundancy High Availability Configuration Guide
113 pages
1756 Um535 en P Redundancy
No ratings yet
1756 Um535 en P Redundancy
192 pages
Prod White Paper09186a0080088874
No ratings yet
Prod White Paper09186a0080088874
18 pages
Controllogix 5570/5560 Redundancy: User Manual
No ratings yet
Controllogix 5570/5560 Redundancy: User Manual
244 pages
Controllogix 5570 Redundancy: User Manual
No ratings yet
Controllogix 5570 Redundancy: User Manual
206 pages
Controllogix 5570 Redundancy Update and Module Replacement Guidelines
No ratings yet
Controllogix 5570 Redundancy Update and Module Replacement Guidelines
62 pages
Cisco Nexus 7000 Series Supervisor Module
No ratings yet
Cisco Nexus 7000 Series Supervisor Module
6 pages
Controllogix Enhanced Redundancy System: User Manual
No ratings yet
Controllogix Enhanced Redundancy System: User Manual
254 pages
1756 Um535 - en P
No ratings yet
1756 Um535 - en P
254 pages
1756-Um535 - REDUNDANCIA CONTROLLOGIX PDF
No ratings yet
1756-Um535 - REDUNDANCIA CONTROLLOGIX PDF
242 pages
Chap13 HA PDF
No ratings yet
Chap13 HA PDF
44 pages
ControlLogix Redundancy Enhanced
No ratings yet
ControlLogix Redundancy Enhanced
296 pages
Nexus VPC
100% (1)
Nexus VPC
11 pages
Cisco NX-OS Software For Cisco Nexus 7000 Series Switches (2012)
No ratings yet
Cisco NX-OS Software For Cisco Nexus 7000 Series Switches (2012)
18 pages
Nexus 7000 Training Docs
No ratings yet
Nexus 7000 Training Docs
12 pages
LAN Redundancy
No ratings yet
LAN Redundancy
45 pages
Cisco IOS Software In-Service Software Upgrade On The Cisco Catalyst 4500 Series
No ratings yet
Cisco IOS Software In-Service Software Upgrade On The Cisco Catalyst 4500 Series
15 pages
Issu
No ratings yet
Issu
40 pages
High Availability Systems Guide
No ratings yet
High Availability Systems Guide
50 pages
s7 Opc Ua Redundancy en
No ratings yet
s7 Opc Ua Redundancy en
41 pages
Nexus 7000 FAQ
No ratings yet
Nexus 7000 FAQ
11 pages
Nsfsso
No ratings yet
Nsfsso
20 pages
Ps7red B en-US
No ratings yet
Ps7red B en-US
252 pages
Redundancy Systems User Manual
No ratings yet
Redundancy Systems User Manual
198 pages
Data Sheet c78-728591
No ratings yet
Data Sheet c78-728591
6 pages
Process Control System PCS 7 High Availability Process Control System v9
No ratings yet
Process Control System PCS 7 High Availability Process Control System v9
252 pages
High Availability Systems: Selection Guide
No ratings yet
High Availability Systems: Selection Guide
16 pages
Application For Process Automation: Configuration of Redundant I/O Modules
No ratings yet
Application For Process Automation: Configuration of Redundant I/O Modules
68 pages
DCX 13.a R - SG PDF
No ratings yet
DCX 13.a R - SG PDF
314 pages
Huawei E9000 Hardware Solution
No ratings yet
Huawei E9000 Hardware Solution
5 pages
Catalyst 6500 Architecture
No ratings yet
Catalyst 6500 Architecture
90 pages
ZXR10 M6000-S (V3.00.20) Products at A Glance - 826854
No ratings yet
ZXR10 M6000-S (V3.00.20) Products at A Glance - 826854
17 pages
LABDCT 2001 (Guide) Nexus.7000
No ratings yet
LABDCT 2001 (Guide) Nexus.7000
44 pages
14 Sup RP Redundancy
No ratings yet
14 Sup RP Redundancy
12 pages
Cisco 7600 Architecture
No ratings yet
Cisco 7600 Architecture
107 pages
Solusion Doc Hyperplex With Aci
No ratings yet
Solusion Doc Hyperplex With Aci
20 pages
Cisco Nexus 7000/7700 Switch Architecture
No ratings yet
Cisco Nexus 7000/7700 Switch Architecture
96 pages
Redundancy PDF
No ratings yet
Redundancy PDF
192 pages
ControlLogix Redundancy
No ratings yet
ControlLogix Redundancy
192 pages
A10 4.1.4-GR1-P5 Avcs
No ratings yet
A10 4.1.4-GR1-P5 Avcs
74 pages
Configuration of Redundant I/O Modules in Simatic Pcs 7: Application Description 03/2015
No ratings yet
Configuration of Redundant I/O Modules in Simatic Pcs 7: Application Description 03/2015
72 pages
Configuration of Redundant I/O Modules in Simatic Pcs 7: Application Description 03/2015
No ratings yet
Configuration of Redundant I/O Modules in Simatic Pcs 7: Application Description 03/2015
72 pages
TechNet24: Your Tech Resource Hub
No ratings yet
TechNet24: Your Tech Resource Hub
17 pages
Head First Python PDF
100% (3)
Head First Python PDF
494 pages
Configuring Static MAC Entries
No ratings yet
Configuring Static MAC Entries
2 pages
CND Laboratory Online Services
No ratings yet
CND Laboratory Online Services
7 pages
Eve Cook Book 1.0
No ratings yet
Eve Cook Book 1.0
197 pages
Central Web Auth with FlexConnect APs
No ratings yet
Central Web Auth with FlexConnect APs
13 pages
CND Labs Module 01 Computer Network and Defense Fundamentals
No ratings yet
CND Labs Module 01 Computer Network and Defense Fundamentals
30 pages
1ca05c8c07684470a0b3d1d5047af005
No ratings yet
1ca05c8c07684470a0b3d1d5047af005
388 pages
DHCP Option 43 00
No ratings yet
DHCP Option 43 00
15 pages
Central Web Authentication On The WLC and ISE Configuration Example
No ratings yet
Central Web Authentication On The WLC and ISE Configuration Example
12 pages
External Web Authentication With Flexconnect Local Switching Deployment Guide
No ratings yet
External Web Authentication With Flexconnect Local Switching Deployment Guide
7 pages
Flex 7500 Wireless Branch Controller Deployment Guide: Document ID: 112973
No ratings yet
Flex 7500 Wireless Branch Controller Deployment Guide: Document ID: 112973
64 pages
Data Center Interconnect Design Guide Test Fail VPC
No ratings yet
Data Center Interconnect Design Guide Test Fail VPC
81 pages
DHCP Option 43 00
No ratings yet
DHCP Option 43 00
15 pages
If VPC
No ratings yet
If VPC
54 pages
Data Center Design With Cisco Nexus Switches and Virtual Portchannel: Overview
No ratings yet
Data Center Design With Cisco Nexus Switches and Virtual Portchannel: Overview
22 pages
Understanding EtherChannel Inconsistency
No ratings yet
Understanding EtherChannel Inconsistency
6 pages
CCSA Class Slides
No ratings yet
CCSA Class Slides
285 pages
Persistent Load Balancing with MikroTik
No ratings yet
Persistent Load Balancing with MikroTik
5 pages
Mastering Fortigate
No ratings yet
Mastering Fortigate
37 pages
Java Network Programming Basics
No ratings yet
Java Network Programming Basics
31 pages
CTNS Reference
No ratings yet
CTNS Reference
13 pages
Understanding FTP and Telnet Protocols
No ratings yet
Understanding FTP and Telnet Protocols
18 pages
IP Addressing & Subnetting Guide
No ratings yet
IP Addressing & Subnetting Guide
36 pages
Cisco Catalyst 9400 Series Hardware Installation Manual
No ratings yet
Cisco Catalyst 9400 Series Hardware Installation Manual
26 pages
Basic Internetworking
No ratings yet
Basic Internetworking
2 pages
Metasploit Guide for Ethical Hackers
No ratings yet
Metasploit Guide for Ethical Hackers
6 pages
LTE Handover
100% (1)
LTE Handover
49 pages
Cable Modems
No ratings yet
Cable Modems
16 pages
VCF Ems Deployment Parameter
No ratings yet
VCF Ems Deployment Parameter
8 pages
HPE - A00110413en - Us - U.11.67 Release Notes
No ratings yet
HPE - A00110413en - Us - U.11.67 Release Notes
32 pages
Cnpilot E410
100% (1)
Cnpilot E410
3 pages
AdventNet ME OpUtils
100% (1)
AdventNet ME OpUtils
180 pages
ICANN78 At-Large Talking Points
No ratings yet
ICANN78 At-Large Talking Points
18 pages
Socket Programming With TCP
No ratings yet
Socket Programming With TCP
14 pages
Ccna CCNP
No ratings yet
Ccna CCNP
36 pages
All mIRC Commands
No ratings yet
All mIRC Commands
4 pages
Black Belt - Switching - Presales - Stage 1
No ratings yet
Black Belt - Switching - Presales - Stage 1
3 pages
DVTI Easy VPN Server Configuration Guide
No ratings yet
DVTI Easy VPN Server Configuration Guide
2 pages
Switching
100% (1)
Switching
30 pages
IP & Email Protocols Guide
No ratings yet
IP & Email Protocols Guide
48 pages
Final Exam Demo Test Part I: THEORY - Attempt 1: TCP Data
No ratings yet
Final Exam Demo Test Part I: THEORY - Attempt 1: TCP Data
101 pages
A07D12HAC Datasheet: Quick Specs
No ratings yet
A07D12HAC Datasheet: Quick Specs
3 pages
Mikrotik: Block Hotspot Shield Guide
No ratings yet
Mikrotik: Block Hotspot Shield Guide
2 pages
BDCOM 2800-30 Test Cases
No ratings yet
BDCOM 2800-30 Test Cases
37 pages
HCIA Datacom 2021 Exam Review
No ratings yet
HCIA Datacom 2021 Exam Review
31 pages
ICT 10 Reviewer
No ratings yet
ICT 10 Reviewer
4 pages
Chapter - 3 Transport Layer
No ratings yet
Chapter - 3 Transport Layer
97 pages

White Paper Continuous Operations High Availability

Uploaded by

White Paper Continuous Operations High Availability

Uploaded by

White Paper

Continuous Operations and High Availability

Physical Subsystem Redundancy Options and Features

Supervisor Module Redundancy

Figure 1. Cisco Nexus 7000 Series 10-Slot Chassis Front View

Switch Fabric Redundancy

Highest power-drawing module installed in the system or in the fabric sections

Power Subsystem Availability Features

The available power supply redundancy modes are:

Figure 5. Combined power model (Non-redundant)

Figure 6. N+1 redundancy model

Figure 7. Input grid redundancy model

Figure 8. Full redundancy model

Internal Ethernet Out-of-Band Channel

Figure 9. EOBC Connectivity in Cisco Nexus 7000

NX-OS System Software Redundancy Options and Features

Table 1. Various failure scenarios handled by System Manager

Failure Scenario Action

Service/process exception Service restart

Service/process crash Service restart

Unresponsive service/process Service restart

Repeated service failure Supervisor reset (single)/switchover (dual)

Unresponsive system manager Supervisor reset (single)/switchover (dual)

Supervisor hardware failure Supervisor reset (single)/switchover (dual)

Kernel Crash Supervisor reset(single)/switchover (dual)

Unresponsive System Supervisor reset(single)/switchover (dual)

System Redundancy Components

Figure 10. NX-OS High-Availability core infrastructure components

Figure 11. MTS inter-service communications

Service Modularity and Restart Capabilities

Figure 12. Service restart process

Repeated service restarts within the restart timer stability window

In-Service Software Updates

Supervisor BIOS, Kickstart Image, System Image

Figure 13. In-Service Software Upgrade Process

Layer 3 Protocol High Availability Features

Graceful restart extensions

In addition to the standards-based implementations for the aforementioned protocols, graceful

Virtual Device Contexts

Glossary of Terms Used

MTS: Message and Transaction Service

PSS: Persistent Storage Service

EOBC: Ethernet Out-of-Band Channel

IETF: Internet Engineering Task Force

RFC: Request For Comment

CMP: Connectivity Management Processor

ISSU: In-Service Software Update

RIB: Routing Information Base

FIB: Forwarding Information Base

VDC: Virtual Device Context

NSF: Non-Stop Forwarding

API: Application Programming Interface

MST: Multiple Spanning-Tree

LACP: Link Aggregation Control Protocol

Printed in USA C11-446927-01 02/08

You might also like