GDPS PDF
June 1, 2001
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Chapter 1: GDPS Installation Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Section 1.1 GDPS Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Section 1.1.1 Need for Data Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Section 1.1.2 GDPS Highlights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Section 1.1.3 IBM Global Services Offerings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Section 1.2 Implementation Process Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Section 1.3 GDPS Naming Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Section 1.4 Dynamic Capacity Backup (CBU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Description/Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Section 1.5 GDPS Implementation Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Single Site Workload . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Multi-site Workload . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
BRS Option - Recovery site provided by vendor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.6 GDPS/XRC Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
SDM Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
COUPLED SDM’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
User interface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7 Status Display Facility (SDF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Chapter 2: GDPS Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Section 2.1 Pre - Installation Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Section 2.2 Installation Task Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Section 2.3 GDPS Installation Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Section 2.3.1 GDPS Software and Hardware Platform Requirements . . . . . . . . . . . . . . 31
Hardware Requirements for GDPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Changing the Current Master System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Production, Controlling and Master System Differences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
XRC Considerations for Controlling Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Takeover Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Section 2.3.3 AO Manager Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.3.1 AO Manager SETUP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.3.2 HMC SETUP CRITICAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.3.3 How GDPS Facility detects if AO Manager isn't working . . . . . . . . . . . . . . . . . . . 36
2.3.3.4 Monitoring AO Manager through GDPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.3.5 Message Layout Used by AO Manager or Substitute Product . . . . . . . . . . . . . . 38
2.3.3.6 AO Manager Verification Feature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.3.7 Limited AO Manager Connectivity - Introduced in GDPS V2R5 . . . . . . . . . . . . . . . . . . 40
Section 2.3.4 Coexistence with other Automation Platforms . . . . . . . . . . . . . . . . . . . . . . . 42
Subsection 2.3.4.1 Starting a system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Subsection 2.3.4.2 Stopping a system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Section 2.3.5 Additional Installation Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Section 2.3.6 - Additional Recommended APARS & PTFS . . . . . . . . . . . . . . . . . . . . . . . 47
Section 2.4 General Installation Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Section 2.5 Uploading GDPS Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Section 2.6 GDPS-PPRC and GDPS-XRC Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.6.2 DASD Mirroring Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Audience
This book is intended for those responsible for the installation and customization of any of the
GDPS environments: GDPS or RCMF. The document is modular in design so that it can be used
for installation and customization of either offering. At the beginning of each section is a heading
called “Environment” which identifies the environment the material applies to. For example, if a
heading contains both “RCMF” and “GDPS”, the material applies to the full GDPS product
environment as well as the subset environment of RCMF. For more information on navigating
this manual, see Section 1.2, “Implementation Process Overview”.
Notices
References in this publication to IBM products (including programs or services) do not imply that
IBM intends to make these available in all countries in which IBM operates. Any references to an
IBM product in this document are not intended to state or imply that only IBM's product may be
used. Any functionally equivalent product (products conforming to the IBM architecture for
GDPS function) may be used instead although the services described as part of this offering may
vary based on non-IBM product readiness and capability to support GDPS. Evaluation is the
responsibility of the customer.
This document contains general information, as well as requirements, for use on IBM and
third-party products. IBM makes no warranty, express or implied, as to its completeness or
accuracy, and the data contained herein is current only as of the date of publication. It assumes
that the user understands the relationship among any affected systems, machines, programs, and
media.
IBM may have patents or pending patent applications covering subject matter in this document.
The furnishing of this document does not give you any license to these patents. You can send
license inquiries, in writing to the IBM Director of Commercial Relations, IBM Corporation,
Purchase, NY 10577.
The terms denoted by an asterisk (*) in this publication are trademarks of the IBM Corporation
in the United States and/or other countries:
DATABASE 2, DB2, DFSMS, DFSMS/MVS, DFSMS/VM, DFSMSdfp, DFSMSdss, DFSMShsm,
Enterprise Systems Connection Architecture, ESCON, ESCON XDF, Hardware Configuration
Definition, GDPS, Geographically Dispersed Parallel Sysplex, IBM, IMS, IMS/ESA, MVS/ESA,
NetView, RAMAC, RCMF, Remote Copy Management Facility, Resource Measurement Facility,
S/390, S/390 Parallel Enterprise Server, Virtual Machine/Enterprise Systems Architecture,
VM/ESA, VSE/ESA
Other company, product, and service names may be trademarks or service marks of others.
ENVIRONMENT: GENERAL
[Figure: GDPS Environment - a Parallel Sysplex spanning Site 1 and Site 2, with 9672 processors
at each site, ESCON connectivity between the sites, and the disk mirrored with PPRC.]
The IBM Geographically Dispersed Parallel Sysplex® (GDPS ®) Services offering brings several
different technologies together to provide a data and application availability solution to customers
requiring high availability and minimum down time. Parallel Sysplex technology, IBM ESCON®
technology, Peer-to-Peer Remote Copy technology, System Automation technology, Wave
Division Multiplexor technology, and Processor Management technology are integrated into a
solution that can minimize the impact of a data center outage by providing an automated switch to
a second data center. IBM Global Services provides the integration of these technologies into a
complete disaster recovery/data availability solution.
The physical topology of GDPS consists of a base or parallel sysplex cluster residing in two
different sites located up to 40 kilometers apart with one or more IBM S/390 ® processors
located at each site. The parallel sysplex cluster and the cross site connectivity are configured to
provide the necessary redundancy.
All critical data resides on storage subsystems at Site1 (the primary copy of the data) and is
mirrored to Site2 (the secondary copy of the data) via PPRC synchronous remote copy.
GDPS consists of production systems, standby systems and controlling systems. The production
systems execute the mission critical workload. The standby systems normally run expendable work
[Figure: GDPS configuration - GDPS/NetView automation and AOM on each system, MVS
consoles, the HMC and network connections to the 9672 processors, and primary and secondary
PPRC disk subsystems.]
By convention all GDPS configuration changes are initiated and coordinated by one controlling
system. All GDPS systems are running GDPS automation based upon Tivoli ® NetView ® for
OS/390 and System Automation for OS/390. GDPS automation is extended to the hardware
environment using Automated Operations Manager (AO Manager). If the current
controlling system fails, the next system in the MASTER list will take on the controlling function.
AO Manager monitors a 3270 terminal data stream for special instructions from GDPS and
executes hardware management operations against 9672 CMOS processors through its interface
to the Hardware Management Console (HMC). GDPS also automates the management of Remote
Copy Disk Subsystems as well as having the ability to automate network management. GDPS
automation may coexist with an enterprise's existing automation (for a current list of products
see the IBM Services representative).
ENVIRONMENT: GDPS/PPRC
The fact that the secondary data image is data consistent means that applications can be restarted
in the secondary location without having to go through a lengthy and time-consuming data
recovery process. Data recovery involves restoring image copies and logs to disk and executing
forward recovery utilities to apply updates to the image copies. Since applications only need to be
restarted, an installation can be up and running within an hour or less, even when the primary site
has been rendered totally unusable.
GDPS uses a combination of storage subsystem functions, Parallel Sysplex technology, and
environmental triggers to ensure, at the first indication of a potential disaster, a data consistent
secondary site copy of the data exists (using the new and recently patented PPRC freeze
function). The freeze function, initiated by automated procedures from all GDPS systems, will
freeze the image of the primary data at the secondary site at the very first sign of a disaster, even
before any database manager is aware of I/O errors. This prevents the logical contamination
of the secondary copy of data that would occur if any storage subsystem mirroring were to
continue after a failure that prevents some but not all secondary volumes from being updated. This
optimizes the secondary copy of data to perform normal restarts (instead of performing database
manager recovery actions). This is the essential design element of GDPS in minimizing the time to
recover the critical workload in the event of a disaster at the primary site.
GDPS manages and protects IT services by handling planned and unplanned exception conditions
and providing near-continuous application and data availability when these conditions occur.
Planned actions are executed by invoking user defined control scripts. A user defined script with
as few as 6 lines of code can completely automate the switch of processing from one site to
another.
Standard actions can be initiated against a single system or a group of systems. Additionally, user
defined actions are supported (e.g., planned site switch in which the workload is switched from
processors in Site1 to processors in Site2).
GDPS provides the ability to perform a controlled site switch for both planned and unplanned site
outages, with no data loss, maintaining full data integrity across multiple volumes and storage
subsystems.
All GDPS functions can be performed from a single point of control, simplifying system resource
management. ISPF style panels are used to manage the entire remote copy configuration, rather
than individual remote copy pairs. This includes the initialization and monitoring of the PPRC
volume pairs based upon policy and performing routine operations on installed storage
subsystems. GDPS can also perform standard operational tasks, and monitor systems in the event
of unplanned outages.
In summary, GDPS provides not only all the continuous availability benefits of a Parallel Sysplex,
but it also significantly enhances the capability of an enterprise to recover from disasters and other
failures, as well as to manage planned exception conditions.
ENVIRONMENT: GENERAL
GDPS/PPRC and GDPS/XRC
The GDPS service offering is initiated with a planning session which is held to define and
document an enterprise's application availability and disaster recovery objectives. Once the
objectives are defined and documented, GDPS automation to: 1) manage and monitor the Sysplex
and the Remote Copy infrastructure, 2) perform routine tasks, and 3) recover from failures, will
be installed, the automation policy customized, and the automation verified. Operational
education will also be provided for the GDPS environment.
RCMF/PPRC and RCMF/XRC
The RCMF service offerings are a subset of the GDPS Service Offerings. Code to manage the
remote copy infrastructure will be installed, the automation policy customized, and the automation
verified. In addition, operational education will be provided for the RCMF environment.
This guide addresses installation and implementation for all four offerings.
ENVIRONMENT: GENERAL
It is assumed that 1) the GDPS prerequisites have been installed, 2) SA for OS/390, NetView and
AO Manager installations have been completed, and 3) PPRC installation has been completed.
With that in mind the implementation of GDPS will flow as follows:
1. Validate that all functions to be automated by GDPS work properly when manually executed
2. GDPS code is uploaded to the target system
3. Tailoring of NetView for GDPS is completed
4. Tailoring of SA for OS/390 for GDPS is completed
5. Tailoring of MVS for GDPS is completed
6. Creating the GDPS policy definitions using SA for OS/390 panels is completed (including
PPRC)
7. PPRC configuration file is created and GDPS is initialized
8. GDPS standard actions for systems and DASD environments are validated
9. GDPS user defined actions are created (both control and takeover actions)
10. All GDPS user defined actions are validated
11. Education of customer staff is completed
Note: The RCMF offering will consist of a subset of these steps as outlined in the install
section of this guide.
This installation guide has been created to be as detailed as is reasonable for the steps that are
unique to the installation of GDPS. However, no attempt has been made to duplicate
documentation that exists for the products that make up the GDPS offering. Please refer to the
appropriate manuals for NetView, SA for OS/390 and MVS. This guide will provide pointers to
these references when practical.
This document is modular in design so that it can be used for installation and customization of
any of the GDPS offerings (RCMF or full GDPS). At the beginning of each section, one will find
a heading called “Environment” which identifies which environment the material applies to. For
example, if a heading contains the field “RCMF” and “GDPS”, this implies that the material
would be appropriate for the full GDPS offering or the subset offering of RCMF.
The following codes are used to distinguish which environment the material applies to:
ENVIRONMENT: GENERAL
The following guidelines should be followed when installing any of the GDPS products for the
first time or during ongoing maintenance
1) After running the SMP/E apply, GDPS will exist in the following target libraries:
SGDPEXEC - CLISTs
SGDPLOAD - Load modules
SGDPMSG - Messages (not delivered in RCMF)
SGDPPNL - Panels
SGDPPARM - Parameters (DSIPARM)
SGDPSAMP - Samples
If the installation does NOT have any REXX runtime library, the one supplied by RCMF should
be concatenated to the STEPLIB DD statement or added to LNKLIST. It must be authorized.
GDPS.CLIST to GDPS.SGDPEXEC
2) It is recommended that MVS systems and LPARs being managed in the GDPS environment use
four character names. This is not required; however, it will facilitate usability. GDPS has LPARs
defined to it using the AO Manager object ID as defined in the AO Manager Bridge on the HMC.
This object ID must be exactly four characters. If LPAR names are four characters, then the AO
Manager object ID can be identical to the LPAR name.
The name displayed on GDPS control panels is the AO Manager Object ID, not the LPAR name.
When the AO Manager Object ID name is the same as the LPAR name it makes the GDPS control
panels more easily understood. In addition to this, GDPS uses the AO Manager Object ID to
invoke certain HMC functions. HMC load profiles must be created for use by GDPS. The names
of the load profiles are determined by the formatting of the messages sent to AO Manager by
GDPS.
For example, when the IPL function within GDPS 'standard actions' is invoked, it sends an
'ACTIVATE' command to the HMC through the AO Manager. As an operand to the ACTIVATE
it sends the name of a load profile. It builds the name of the load profile by concatenating the AO
Manager Object ID defined in the GDPS policy with the system name. So if the object ID were
"GDPB" and the system name were "SYSX", the load profile name would be "GDPBSYSX".
If there were also an LPAR with the name "GDPBSYSX", there would be an 'image' profile with
the name "GDPBSYSX", since the only name one can have for an image profile is the name of the
LPAR. This could result in having two profiles with the same name, an image profile and a load
profile, which could be confusing.
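The derivation above can be shown with a short sketch. The following minimal REXX fragment
uses hypothetical values (the object ID and system name are illustrative examples only, not values
from any actual policy) to show how the load profile name is built by simple concatenation:

   /* Minimal sketch with hypothetical values: derive the HMC load     */
   /* profile name from the AO Manager object ID and the system name.  */
   objectid = 'GDPB'                /* AO Manager object ID (= LPAR name) */
   sysname  = 'SYSX'                /* MVS system name                    */
   loadprof = objectid || sysname   /* load profile name: 'GDPBSYSX'      */
   say 'ACTIVATE would reference load profile' loadprof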
One needs to create load profiles that cover all possible combinations of systems and LPARs that
may occur in the environment. Using a naming convention that makes the first four characters of
the load profile name equal to LPAR name and the second four characters of the load profile name
equal to the system name would make the names nearly self-documenting and easily understood
by those who maintain them.
It is not possible, in the GDPS environment, to always IPL systems in LPARs of the same name.
3) For the GDPS environment, it is necessary to define within AO Manager the command prefix
for NetView or the NetView Procedure ID name to facilitate communication used by GDPS. For
those installations which cannot provide a command prefix for this use, the NetView Startup
Procedure can be started with an identifier of xxxxxxxx, so that when AO Manager issues the 'F
xxxxxxxx,...' command, it will work.
4) Good naming conventions for Disk Subsystem Ids (SSIDs) and device numbers can simplify
operations and GDPS installation. It is recommended that a naming convention that allows easy
identification of location be used to distinguish Site1 from Site2. For example, Site1 SSIDs
should be of the form Znnn where Z is 0 - 7 and Site2 SSIDs should be of the form Znnn
where Z is 8 - F. Device numbers for Site1 should be of the form Znnn where Z is 0 - 7 and Site2
device numbers should be of the form Znnn where Z is 8 - F.
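As an illustration only (the SSID and device number values below are hypothetical and not a
recommendation for any particular configuration), such a convention could look like this:

   Site1 (primary)                       Site2 (secondary)
   SSID 2100, devices 2100-21FF   <-->   SSID A100, devices A100-A1FF
   SSID 2200, devices 2200-22FF   <-->   SSID A200, devices A200-A2FF

With this scheme, the first digit alone identifies whether an SSID or device belongs to Site1
(0 - 7) or Site2 (8 - F).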
The intent of the GDPS CBU Management line-item is to have GDPS automatically manage the
reserved PUs provided by the CBU feature in the event of a processor failure and/or a site failure.
CBU can only be activated from a TAKEOVER script which is running in a controlling system.
The controlling system can be running in a partition in the processor that is the target of CBU or
in another processor. The LPARs for the production systems in the takeover site should have
INIT and RSVD PUs defined. The CBU ACTIVATE will add the reserved PUs to the processor.
If systems are IPLed during takeover processing, CBU should be requested before the IPL
(activation of LPARs for the production systems) and the reserved PUs will be available based on
the values for INIT and RSVD. The GDPS initialisation routine will configure the reserved PUs
online. For systems that do not need to be IPLed the reserved PUs can be configured online by
GDPS using the CBU CONFIGON statement.
Note: During GDPS initialisation all PUs that are found to be offline will be configured online by
GDPS even if CBU has not been activated.
CBU CONFIGON syntax: CBU='CONFIGON sysname' results in offline PUs being configured
online to system sysname. Specific CBU CONFIGON rules follow (see the example after this
list):
• CBU CONFIGON will only be issued in a controlling system when running a
TAKEOVER script
• if there are no offline PUs in the target system, CBU CONFIGON will be treated as a no-op.
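For illustration only, using the syntax given above with a placeholder system name (the
surrounding TAKEOVER script statements are omitted; refer to the GDPS script documentation
for the complete statement set), reserved PUs could be configured online for a system that is not
re-IPLed with a statement such as:

   CBU='CONFIGON SYSA'

Here SYSA stands for the name of a production system in the takeover site.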
NOTE: The CBU function will only work with AO Manager V3 or later with the following fix:
• smgd04007 - enhanced HMC monitoring module supporting CBU
In addition to this fix, you need to ensure that the new AO Manager Bridge is installed on the
HMC. The new AO Manager Bridge is distributed with fix
• smgd03993 or
• smgd04440
Environment: GDPS/PPRC
GDPS has three implementation models: single site workload, multi-site workload and the BRS
model.
Single Site Workload
The single site workload model places the primary systems and primary DASD in Site 1 and the
secondary DASD and backup processor capacity in Site 2. The controlling system is located in
Site 2. There may or may not be coupling facilities in either site depending on the user's
configuration and sysplex connectivity requirements and methodology.
Multi-site Workload
The multi-site workload model places production processing in both sites. There is still a
requirement that all primary DASD be in one site and all secondary DASD be in the other site.
Typically the primary DASD is in Site 1 and the secondary DASD is in Site 2. The controlling
system is located in the site where the secondary DASD is located. Coupling Facility use and
placement are dependent on the user requirements.
This configuration requires the greatest cross-site connectivity bandwidth since all I/O activity is
against the primary DASD. It is considered practical when distance between sites is less than 10
kilometers. Greater distances may be practical, depending on the tolerance of applications
executing in the processors at Site 2 to DASD response time degradation.
The multi-site workload option offers the greatest flexibility in sysplex workload distribution as
well as recovery options. Complete site switches are possible as well as the ability to recover
individual MVS systems at either site. The primary DASD can be located at either site as long as
all primary DASD is located in one site and all secondary DASD is located in the other site.
BRS Option - Recovery Site Provided by Vendor
The BRS option is provided for users who are within PPRC distance of a third party disaster
recovery vendor. This option would place the primary DASD at the user's site and the secondary
DASD at the recovery site. The controlling system would be located at the user's site, but the
DASD used to IPL the controlling system would be located at the recovery site.
Since there are no sysplex timers or coupling facilities at the second site, the distance between
sites is dictated by the requirements of the PPRC implementation being used. For example, with
the IBM ESS the distance could be as much as 103 km.
SDM Performance
The SDM runs many parallel tasks (in excess of 150). Generally speaking, best performance can
be achieved with multiple processors; up to four can be used effectively. These recommendations
are based on G5 processors and earlier.
COUPLED SDM’s
Supported commands:
• START, END, RECOVER, SUSPEND, COUPLE and ADVANCE sessions
• ADDPAIR (one, multiple or all)
• SUSPEND (one, multiple or all)
• DELETE (one, multiple or all)
• Recover session (f antas000,rcvsess)
• Remove a pair from the configuration policy
• Add pairs in session but not defined in the GDPS policy (selectively or all)
• Define a new volume-pair to the GDPS policy
• Manage the GDPS policy file
The SDF facility provided by SA for OS/390 provides the primary status feedback mechanism
for GDPS. It is the only dynamically updated status display available for GDPS. It is important
to use the SDF displays when using GDPS and to stress to the user that the SDF panels are
their monitor for GDPS status.
One gets to the SDF by typing in “SDF” from either the NetView panel or the GDPS panel.
That displays the SDF main panel. At the bottom of the panel 'GDPS' should be seen. Place
the cursor on this line and press PF8. This brings the GDPS SDF panels up.
Site1 Site2
Automation RemoteCopy Automation RemoteCopy
MV02M MV01D MV1K MV1KD
MV02S MV02 MV1KAOM MV1K
MV02C MV01 MV1KM
MV01AOM MV1KS
MV01M MV1KC
MV01S
Trace entries
04:08:11 10,*MASTER*,,IOS076E B465,10,*MASTER*, CLEAR SUBCHANNEL INTERRUPT MI
04:10:01 10,*MASTER*,,IOS076E B49F,10,*MASTER*, CLEAR SUBCHANNEL INTERRUPT MI
===>
PF1=HELP 2=DETAIL 3=END 6=ROLL 7=UP 8=DWN 9=DTRACE 10=DDASD 11=DAUT 12=TOP
The GDPS SDF panel is divided in two parts, the top part contains status indicators and the
lower part is for trace entries. The status indicators are color coded and green means status is
good. Minor problems are indicated by the color pink and serious problems are shown in red.
The goal is of course to have all status indicators in green.
The indicators are the system name (MV01, etc. in the example) plus one to three characters. A
short description of the characters is as follows:
AOM  - AO Manager
C    - Automation
H    - Automation Table
D    - PPRC device status (in controlling system)
F    - Freeze and CGRPLB
L    - PPRC Links
M    - Controlling system(s)
S    - Sysplex
null - Mirroring
When everything is OK only two status indicators per system will be shown, namely Sysplex and
Mirroring. Exceptions (deviations) will be shown in pink or red. If an exception is detected by
monitor1 (which by default runs every 5 minutes), a pink or red entry will be created. As long as
the exception remains, the entry will be replaced every 5 minutes (every monitor1 run), and the
latest "green entry" will be kept; this way it is possible to figure out when the exception occurred.
When the exception has been fixed, a new "green entry" will be created to replace the existing
one and the exception entry will be removed.
More than one indicator for PPRC problems can occur, for example “Freeze disabled due to
P/DAS” and “Freeze device(s) not full duplex”. In this situation, if one of the problems is
corrected, both indicators will remain on the SDF panel. The indicators relating to PPRC
problems will not be automatically removed until all problems have been corrected (and
DASD mirroring status has been set to OK).
Most of the trace entries in the lower part of the SDF screen will not be automatically deleted
from the panels. The user must do this manually with PF9, PF10, or PF11, or by displaying the
details panel and deleting with PF4.
When a GDPS script (standard, planned, or unplanned action) is run, trace entries will be
created. If these trace entries are not deleted manually, they will automatically be deleted when
a new script is run.
The SDF panels are expected to be the first indicator of status information. These panels
will display the status in various colors (e.g., pink indicates there is a problem that is not
catastrophic). If one places the cursor on a field and presses PF2 (details), one gets another level
of detail containing, for example, date and time. One can page up and down (PF7 & PF8)
through the panels. An indicator in the upper right corner of the panel will tell how many
panels there are.
It is important to get in the habit of using SDF. When standard actions or scripts are
executing, each step is displayed in the trace part (lower half) of SDF GDPS panel.
ENVIRONMENT: GENERAL
It is recommended that these one time tasks be completed prior to attempting to install the GDPS
automation code. It is not the intent of this document to provide detailed installation steps for
these stated tasks.
Note: Be aware that there are three levels of LIC available as described below.
• Site names
• Site1 ___________ Site2______________
• System type
• System names
• LPARs in Site1
• LPARs in Site2
• Couple Data set(s)
• CF structures
• Sysplex profiles
• Site1 image profiles
• Site1 load profiles
• Site2 image profiles
• Site2 load profiles
• NetView domain names for each image
• NetView auto-operators
• Identify required volume pairs
• Identify Primary/Secondary SSID pairs
• Identify GDPS utility devices (one per primary and one per secondary SSID)
• Determine which volumes are in the following categories:
  FREEZE/GROUP, NonFREEZE/GROUP, ONLINE/OFFLINE at IPLTIME
• P/DAS yes | no
• Identify PPRC link configuration information
ENVIRONMENT: GENERAL
Please use the following matrix to determine pertinent task information for the environment being
installed.
This section will provide information relating to software requirements and setup. No
information will be provided for specific hardware configuration requirements. Configuration
planning is provided as part of the GDPS service offering workshop.
ENVIRONMENT: GENERAL
• If PPRC commands are protected by RACF (or another product), then NetView or the
NetView user needs authorization to the commands, depending on the security setup in
NetView
• To have GDPS or RCMF use the fast P/DAS option, APARs OW35649 and OW36129 are
needed
• APAR OW34818 describes an extended console problem and should be applied
• For GDPS/XRC, APARs BW42971, AW43050 and AW43315 are required.
• For COUPLED SDM support (GDPS/XRC), APAR AW43316 is required.
The prerequisites for the RCMF environment are the same, except SA for OS/390 is not required.
Beginning with GDPS V2R5 AO Manager is no longer a prerequisite to GDPS. The required AO
Manager function has been integrated into GDPS and is delivered in the base GDPS code.
All processors that will execute in a GDPS environment must support the HMC automation
infrastructure. This requires that all processors be capable of attachment to an IBM HMC and be
able to be managed by the GDPS - HMC automation infrastructure (formerly the AO Manager
Bridge) interface.
The controlling system is used by the GDPS environment to coordinate GDPS processing and
reconfiguration management. It is recommended that it be a separate MVS image within the
Sysplex, dedicated to the GDPS control and monitoring functions. For PPRC, it must have
connectivity to all volumes (all primary and secondary volumes) managed by GDPS. For XRC,
controlling system connectivity to the remote copy DASD is not required as long as there is
sufficient connectivity from the SDM system/s. The controlling system must be able to IPL
independently of the PPRC DASD and the other systems in the sysplex with the exception of
sharing the XCF infrastructure (including the CDS).
GDPS adds minimal requirements over the base system software requirements for an OS/390
image running NetView. For example, the virtual storage requirements for the controlling system
would be the same as those for a base OS/390 system with JES, VTAM, and NetView running.
Likewise the processor storage requirements could be estimated similarly.
Experience has shown that 128MB of central storage and 128MB of expanded storage are
reasonable values for the controlling LPAR. This experience is with pre-z/OS systems; the
requirement may be higher for z/OS. Experience has also shown that in large PPRC configurations
a single CP may be inadequate. The environment where this was an issue had in excess of 8,000
PPRC pairs.
However, CPU processing cycle requirements depend on the environment and activities being
performed. The minimal requirement will be similar to the requirements for a base OS/390 system
running NetView, estimated at 20 MIPS. For those with large PPRC configurations this could be
considerably more due to CQUERY monitoring.
Disk requirements are estimated at 6 volumes for OS/390, JES, NetView, VTAM, and GDPS.
The JES associated with the Controlling System must have its own spool (no MAS).
The controlling system cannot be dependent on any disk devices in the PPRC configuration. This
implies that it will have its own isolated SYSRES, master catalog, SMS, RACF, LOGREC, and
SYSLOG.
Note: If an enterprise wants to use OPERLOG within the production systems to get a merged
"syslog", OPERLOG could also be used on the controlling system; however, we recommend
that SYSLOG be used on the controlling system.
The controlling system must be a system within the Sysplex with connectivity to the couple data
sets. Note: Couple data sets should not be allocated on mirrored (PPRC) disk. The primary
couple data set should be located in the primary site, the alternate couple data set should be
located in the secondary site, and there should be a spare allocated in each site.
PRODUCTION SYSTEM
Any system that is using the mirrored (PPRC) disk is a production system.
CONTROLLING SYSTEM
In GDPS V2R3 (and higher) controlling and master system functions have been modified. Prior to
V2R3 the controlling system was the first system in the MASTER list. With V2R3 a new keyword
has been added to define the number of controlling systems available in a GDPS environment. The
keyword CONTROLLINGSYSTEMS=n is defined in OPTIONS in the SA/390 customization
process. The default is 1. The controlling systems must be the first ‘n’ systems in the MASTER
list.
Therefore, a controlling system is defined as: A system that is connected to but does not use any
of the PPRC DASD (connected to primary and secondary PPRC DASD with primary DASD
online but not allocated) and is in the first ‘n’ positions of the MASTER list (‘n’ being the value
coded for the CONTROLLINGSYSTEMS=n keyword). Defining all systems as controlling
systems is not allowed.
It is recommended that the first controlling system be in the site where the secondary disk is, and
if there is more than one controlling system, there should be at least one in each site.
Only controlling systems are allowed to execute the ALLSITEn and DASDSITEn takeover scripts
because they are supposed to do a DASD switch and cannot run in a production system using the
PPRC disks. The SYSSITEn scripts are only allowed to execute in a controlling system or a
production system in the other site. (The assumption is that the SYSSITEn script will affect all
site n production systems.)
NOTE: Any script, for either planned or unplanned actions, involving DASD statements can
ONLY execute in a controlling system.
In an XRC environment the controlling system is at the recovery site and is not part of the
production sysplex. The controlling system can be a system that has an SDM on it, but it is not
required. If the recovery process requires that systems running the SDMs be re-IPLd for use as
recovery systems, the controlling system will need to be on a dedicated LPAR, since the
controlling system cannot reset itself. In any case, the controlling system and all systems running
SDMs at the recovery site must have GDPS installed on them and be in the same sysplex.
MASTER SYSTEM
All systems MUST be in the MASTER list. Therefore all systems are candidates to become the
master system. There is only one current master system. It is the system highest in the MASTER
list that is active. The MASTER system is the only system eligible to issue the CONFIG command
for the XRC environment. Normally the controlling system and the master system will be the same
system. However, even if there is no controlling system active, there will be a master system, since
all systems must be in the master list.
NOTE: In PPRC, when a master system issues the CONFIG command it must have
connectivity to both primary and secondary DASD subsystems. In XRC with coupled
SDMs, the controlling system does not require connectivity to all DASD. GDPS uses XCF
communication to instruct each SDM to handle the CONFIG processing for the DASD each
SDM manages. In XRC, primary and secondary devices need to be online for CONFIG
processing.
Takeover Processing
Whenever a failure is detected by GDPS/PPRC, a request is sent to the current master system to
initiate takeover processing. This involves analyzing the problem and issuing the takeover prompt.
Logic in the takeover routine has been added to check what takeovers are possible.
• If the current master is one of the controlling systems, any type of takeover is possible. This is
true even if the controlling system is in the same site as the primary DASD, since a controlling
system must not use the PPRC disk.
• If the current master is a production system, the ALLSITEn and DASDSITEn takeovers are
not possible.
• For a system failure, if the current master is a production system that is not in the same site as
the failing system, the SYSSITEn and SYSsysname takeovers are possible.
• For a system failure, if the current master is a production system in the same site as the failing
system, only the SYSsysname takeover is possible.
• User defined takeover scripts will always be present in the takeover prompt, no matter what
type of system is the current master.
This section assumes that AO Manager is installed and functional. The focus here is what is
required to ensure GDPS and AO Manager are interfacing properly.
GDPS will issue a WTOR when it needs to do actions against the hardware. That WTOR has to
be captured by AO Manager. In some cases, for example the duplicate volser issue, assistance is
needed during NIP to reply to NIP message IEA213A. Therefore one has to define an MVS
console from each system, and it has to be defined as the NIP console for the system. That
console's coax cable has to be connected to AO Manager. The definition for the MCS console
attached to AO Manager must include MFORM T,S (timestamp and system name) only.
Note: Please be aware that when NIP consoles are connected to AO Manager, the operator
will need to look for NIP messages via AO Manager instead of the operator's usual place.
Note: If another MFORM layout is chosen AO Manager will not work properly.
Consoles used by AO Manager should have route code 1 specified in the console parmlib member
to minimize traffic to these consoles.
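A sketch of what the CONSOLxx definition for such a console might look like follows. The
device number, console name, and AUTH level are placeholders and must follow the
installation's own standards; only the MFORM and ROUTCODE values are dictated by the text
above.

   CONSOLE DEVNUM(0A80)      /* placeholder device number of AO Manager console */
           NAME(AOMCON1)     /* placeholder console name                        */
           UNIT(3270-X)
           AUTH(MASTER)      /* per installation standards                      */
           MFORM(T,S)        /* timestamp and system name only, as required     */
           ROUTCODE(1)       /* route code 1 to minimize traffic                */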
HMC setup is critical for proper operation of GDPS. AO Manager setup problems that have been
experienced include: (a) IBM CE put the HMC into "single object operations" mode to perform
system maintenance and didn't put it back into "normal" mode - the AO Manager Bridge interface
requires normal mode to communicate; (b) the HMC name was changed - AO Manager has this
value hard coded and it will not work if it is only changed at the HMC. An enterprise must have
processes and procedures in place to protect against unauthorized changes to the HMC
environment (e.g., have HMC user profiles in place for each HMC user with the appropriate level
of authorization, educate CEs, include HMC changes in the change management process, etc.).
Format:
The 'x' is the character that is defined to AO Manager as the NetView subsystem prefix. When AO
Manager captures that message it issues a GETATR with code 09 (GETSTATUS) and responds
OK if successful, NOK if unsuccessful. The response is done by writing either
xVPCEAOMA OK/NOK
or
F xxxxxxxx,VPCEAOMA OK/NOK
on the console.
The ‘F xxxxxxxx’ option is provided for cases where all prefix characters have been used and
there is no command prefix character available for use by NetView. (Note: ‘F’ is the short version
of the MVS ‘MODIFY’ command.) (see Chapter 1 Naming conventions sections for additional
information)
If the ‘F xxxxxxxx’ option is being used, there must be something called 'xxxxxxxx' running for
the modify to work. This AO Manager command assumes that the name of NetView is
‘xxxxxxxx’. This is not usually the case. However, NetView can be started in such a way that it
has an 'alias' name. The startup command would need to be 'S zzzzz.xxxxxxxx etc.....', where
zzzzz is the actual NetView name and xxxxxxxx is the NetView procedure id. The '.xxxxxxxx'
gives MVS the name required to allow the modify command to work and GDPS the ability to
know AO Manager is alive.
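For example (the procedure name NETVPRC and the identifier GDPSAOM below are
placeholders, not shipped names), starting NetView as

   S NETVPRC.GDPSAOM

allows AO Manager's 'F GDPSAOM,...' modify command to be delivered to NetView even
though the started task still runs the NETVPRC procedure.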
If this function is not working properly, manually key in the modify command (F xxxxxxxx,
VPCEAOMA) from the AO Manager console that is to be used for this function and check for the
response. This may provide an indication of why the function is not working properly.
GDPS displays the status of the AO Manager using the Status Display Facility of SA for OS/390.
When AO Manager is functioning properly, it will appear in green on the SDF panel for GDPS.
However, if it appears in pink on the SDF panel, there will also be a “GEO094E AO MGR is
OFF” message displayed to indicate that an error has interrupted normal service.
When AO Manager cannot communicate to the HMC, the AO Manager appears in pink on the
SDF panel and it will provide a message “AOM/HMC communication is off”.
It is important to get in the habit of using SDF. When standard actions or scripts are executing,
each step is displayed in the trace part (lower half) of SDF GDPS panel providing very useful
information.
AO Manager, or a product used in its place, monitors for GEO090A and GEO091A messages and
sends instructions to the HMC accordingly. The message layout is as follows:
There are two messages that are used to manage and monitor processors: GEO090A and
GEO091A.
GEO090A
The actions that can be performed from AO Manager to the HMC console are the
following:
PROCOPTS=SITEMON    PROCOPTS=AOM or other
RESET               RESET
RSTART              RESTART
DEACT               DEACTIVATE
ACTIV               ACTIVATE
LOAD                LOAD
GETATR              GET_ATTRIBUTE
The sysname, processor and lpar parameters are not used by AO Manager.
The object parameter describes the LPAR that the action is issued against.
Parameters in capital letters in the messages below indicate that they contain data that
is to be used by AOM.
ACTIVATE
Activate uses two formats, with and without an activation profile. The activation profile
name is formed by combining the object name and the sysname. An ACTIVATE command
without the activation profile is issued when GDPS wants to change an LPAR from
LOAD
GDPS requests AOM to perform a LOAD operation in the partition indicated by the
object parameter in the message.
GEO090A LOAD sysname processor lpar OBJECT LOADADDR LOADPARM
RESET
GDPS requests AOM to perform a SYSTEM RESET for the partition specified by
the object parameter.
GEO090A RESET sysname processor lpar OBJECT
RSTART
GDPS requests AOM to perform a PSW RESTART operation for the partition
specified by the object parameter.
GEO090A RSTART sysname processor lpar OBJECT
DEACTIVATE
GDPS requests AOM to perform a DEACTIVATE for the partition specified by the
object parameter.
GEO090A DEACT sysname processor lpar OBJECT
GETATR
GDPS wants information from the processor. The attribute value indicates what kind
of information is requested.
GEO090A GETATR sysname processor lpar ATTRIBUTE
Since the message is sent by a WTOR, SA/390/NetView will wait for an answer.
Expected responses are:
OK means that the request was performed successfully; NOK means that the
operation failed or could not be performed.
If there is no response from the processor hardware, AO Manager should respond
with NOK after 5 minutes.
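As a purely illustrative instance of the LOAD format shown above (every operand value here is
a placeholder, not a value from any real configuration):

   GEO090A LOAD SYSX PROC1 LPAR1 GDPB 1A00 0A82SYM1

AO Manager would drive the corresponding HMC LOAD against object GDPB and then reply
OK (or NOK) to the outstanding WTOR.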
GEO091A
The 'x' is the character that is defined as the NetView subsystem prefix. When
AO Manager captures that message it should issue a GETATR with code 09
(GETSTATUS) and respond OK if successful, NOK if unsuccessful. The response is
done by writing:
xVPCEAOMA OK/NOK
on the console.
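For example, if the NetView subsystem prefix defined to AO Manager were '%' (an arbitrary
illustration), a successful status check would be reported by writing

   %VPCEAOMA OK

on the console, and a failed check by writing %VPCEAOMA NOK.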
From the GDPS main menu Option 8 is used to turn automation on and off. Also on this same
panel is a field labeled “AOM Verification=” and YES/NO are the options. When this field is set
to YES a WTOR (GEO800A ...) must be replied to before any command is sent to AO Manager.
This feature is designed for testing and problem determination. It allows an intervention step
before any command is sent to AO Manager. Normally it should be set to NO.
If AO Manager is not connected to all systems in GDPS, it is possible that a standard action
command could be issued on a console that is not connected to AO Manager.
OPERATION
If a message must be sent to AO Manager (GEO090A), GDPS VPCESMON checks to see if it is
executing on a 'connected system'. If it is connected, the message is sent.
MONITORING
During GDPS monitoring the AOMCONNECT table is used to find the first active system in each
site that has a connection to AOM. If there is no system in a site that has a connection to AOM,
an alert is issued on SDF: "SITEx has no connection to AOM".
STANDARD ACTIONS
The Standard Actions panel has been changed to display an indicator for AOM-connection.
Standard Action functions that require AOM-functions will only be honored from an
AOM-connected system. If issued on a non-AOM-connected system there will be an error
message stating 'Rejected, Must be executed on a AOM-connected system'.
ENVIRONMENT: GDPS/PPRC
GDPS can coexist with other automation products like OPS/MVS and AF/Operator. GDPS will:
• Start the automation product
• Stop the automation product
• Send requests to the automation product with parameters indicating what to do.
For GDPS to be able to start the automation product, it has to be defined as the only application
in SA for OS/390. The automation product should not be started during IPL; only NetView (SA
for OS/390) should be started. When GDPS has done its initialization it will start that application
(the automation product).
When GDPS is up and running, SA for OS/390 will detect this and issue any additional command
that is required by the automation product in order to start the rest of applications for that OS/390
system.
The two steps involved in this process are: 1) SA for OS/390 will be activated by the COMMNDxx
member in SYS1.PARMLIB, and 2) SA for OS/390 will start the other automation product when
GDPS has been initialized. If the other automation product needs a reply to initialize, then GDPS
can use the VPCEIMOD function to reply with the IPLMODE value for that system. This is
accomplished by using the NetView message automation table. For example:
IF MSGID = 'xxxxxxx'
   THEN EXEC(CMD('VPCEIMOD REPLY IMODE')) ...
The following diagram depicts this interaction between GDPS and other automation platforms for
Startup processing.
[Figure: Startup interaction between SA/390 (GDPS) and another automation product (OPS
shown): SA/390 replies to the LOAD COMPLETED WTOR (R xx,OK), starts the automation
product, and replies to its initialization prompt (R yy,FAST) with the mode obtained from IMode
(MODE = FAST). (*) AO Manager or equivalent product.]
The other automation product has the responsibility to stop all applications under its control, and
then respond to the outstanding WTOR with "OK". The other automation product has the
responsibility to:
Upon receipt of an "OK" response to the GEO046I message, GDPS will complete the
SHUTDOWN by:
The following diagram depicts the interface between GDPS and other automation platforms for
Shut-Down processing.
[Figure: Shut-Down processing interaction between SA/390, GDPS, and the other automation
product, showing the SHUTSYS "stopappl",SCOPE=ALL command, the R yy,OK and R xx,OK
replies, and the final V XCF,PROD,OFFLINE command.]
For the SA/390 customer the default (STOPAPPL=)JES is good provided the customer uses the
standard SA/390 application name for JES2 (or JES3) which is JES. If the customer is using
another automation product and the customer wants GDPS to stop that product, STOPAPPL
should reference the other automation product. It is not possible to have STOPAPPL reference
the SA/390 & GDPS NetView.
• The controlling system (or the system where you request the Stop) sends a stop request to
GDPS in the system to be stopped and issues a GEO043A message.
• GDPS in the system to be stopped checks if another automation product is being used and, if
that is the case, it issues a GEO046I message.
• When the other automation product replies OK to the GEO046I message, GDPS in the system
to be stopped issues a "SHUTSYS stopappl ALL" command.
• If there is no other automation product, GDPS will immediately issue the "SHUTSYS
stopappl ALL" command.
• When SA/390 has completed the SHUTSYS, GDPS in the system to be stopped replies OK to
the GEO043A message.
• The controlling system issues a "V XCF,sys,OFF" command.
It is not possible to stop NetView/GDPS because if you do, GDPS cannot tell the controlling
system that the stop is complete.
If, for some reason, shutdown processing should fail and manual intervention is required that
results in the system being stopped, but does not result in the response to the GEO046I message
being done, the operator must respond “OK” to the outstanding GEO046I to resume GDPS
system shutdown processing.
Caution is advised because there may be more than one GEO046I message outstanding.
ENVIRONMENT: GENERAL
The following items are required to allow installation of the GDPS code on an MVS system:
• TSO user IDs will be needed to perform job customization and submission, etc.
• PC file upload/download capabilities
• Security facilities will need to be available for the following tasks:
  - PPRC commands should be protected from unauthorized use. This is documented in the
    Remote Copy Storage Administration Guide, SC35-0169.
  - The #HLQ.GDPS.SGDPLOAD library must be APF authorized
• DASD space will be required for the GDPS libraries
• Less than 15 cylinders is required; however, it is recommended that the disk data sets from
the SMP/E install of GDPS be placed on volumes outside of SMS control and on volumes not
participating in "Freeze" groups.
Library       Space   Directory blocks   LRECL   RECFM
...SGDPPARM   1 cyl          5             80      FB
...SGDPSAMP   1 cyl          5             80      FB
...SGDPEXEC   4 cyl         30             80      FB
...SGDPPNL    2 cyl         10             80      FB
...SGDPMSG    2 cyl         10             80      FB
...SGDPLOAD   2 cyl          3              0       U
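As a sketch only, the SGDPEXEC library from the table above could be pre-allocated with JCL
similar to the following; the job card, high-level qualifier, unit, and volume serial are
placeholders, and the volume should be non-SMS and outside any "Freeze" group as
recommended above. The other libraries are allocated the same way with their respective
attributes.

   //GDPSALC  JOB (ACCT),'ALLOC GDPS LIBS',CLASS=A,MSGCLASS=X
   //ALLOC    EXEC PGM=IEFBR14
   //SGDPEXEC DD  DSN=#HLQ.GDPS.SGDPEXEC,DISP=(NEW,CATLG),
   //             UNIT=3390,VOL=SER=NONSMS,
   //             SPACE=(CYL,(4,1,30)),
   //             DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)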
ENVIRONMENT: GENERAL
Information related to required software maintenance should be obtained from the following
informational APARs: II08303 and II11778. These should be checked often, since new
information will be added periodically.
ENVIRONMENT: GENERAL
The purpose of this section is to describe the high level installation steps required to install GDPS
with product specific exceptions noted. It is assumed that 1) the GDPS prerequisites have been
installed, 2) SA for OS/390, NetView and AO Manager installations have been completed, and 3)
PPRC installation has been completed. It is important that one understands which of these high
level installation steps pertain to the environment being implemented in order to help in navigating
the remainder of this manual.
The following steps are required to install the GDPS products with the following
exceptions:
• Steps 8a, 8b, 8d, 8e, and 8g are not performed for the RCMF environment.
ENVIRONMENT: GENERAL
GDPS is distributed to an IBM services specialist once a signed services contract is received. The
following steps are involved in installing the GDPS libraries in the client environment.
NOTE: The REXX library referred to below and supplied by RCMF should only be used if the
installation does not have any REXX run-time library. If a REXX run-time library already exists,
the one supplied by RCMF should not be used.
GDPS is, starting with V2R4, SMP/E installable and it is delivered in a zip file which has to be
unzipped and uploaded to TSO. When it is received in TSO, a PDS is created which contains all
the members needed to install GDPS.
You can choose to install GDPS in an SMP/E environment unique to GDPS, or you can install it
in an existing SMP/E environment. The sample jobs contain information on both alternatives.
Installation Instructions
ENVIRONMENT: ALL
2.6.1 Introduction
Briefly, Section 2.6 discusses the processes required to define the PPRC and XRC configurations
to GDPS. For PPRC there are four main sections, as described below, each with many subsections.
Section 2.6.1 describes the sample PPRC environment used throughout this manual for
illustrations. Included in this section is the description of the ESCON links, and physical
connection information needed to define the PPRC environment. For XRC, details can be found in
Section 2.6.6.
For PPRC the procedure used to define the PPRC environment to GDPS utilizes a “flat file”
composed of control statements defining the PPRC configuration. A DD statement (GEOPARM
DD) must be added to the NetView procedure pointing to this configuration file. A sample
template for this file can be found in the distribution library,
“#HLQ.GDPS.SGDPPARM(GEOPARM)”.
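For example (the data set name below is a placeholder; any data set pointed to by the DD
statement will do), the statement added to the NetView procedure might look like:

   //GEOPARM  DD  DISP=SHR,DSN=#HLQ.GDPS.GEOPARM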
Section 2.6.4 provides the details for setting up the GDPS - PPRC environment using
GEOPARM. Included in this section are: 1) the implementation step details, 2) the GDPS control
file statement syntax explanation, and 3) a control file definition walk-through matching the
sample environment.
Sections 2.6.1, 2.6.2 and 2.6.3 should be reviewed by everyone implementing the GDPS - PPRC
environment, to become familiar with the sample environment and the information pertaining to
that configuration. Section 2.6.5 previously contained instructions for using the SA for OS/390
dialogues for defining the PPRC environment; with GDPS V2R4 this method is no longer an
option.
GDPS supports one freeze group and one non-freeze group of PPRC devices and freeze is
specified at the SSID pair level. Freeze is to be used for critical application data volumes where
data consistency is essential for restart, and non-freeze is to be used for volumes that do not have
a time consistency requirement, for example system volumes.
It is recommended that all DASD be mirrored. The exceptions are Sysplex Couple Data sets, Page
Data Sets, sort work volumes, and batch temporary data volumes (be certain Page Data Sets are
available at the second site).
Some applications maintain duplicate copies of critical data as a hot standby. If the duplicate data
can be stored on non-PPRC DASD devices at the second site, there may be no need to mirror the
data.
The GDPS environment places the sysplex couple data sets in both sites (primary in primary site
and alternate in second site). In the event of a site 1 failure the controlling system can access the
alternate couple data sets at site 2.
NOTE: In a GDPS environment the couple data sets should be on volumes that are not mirrored
by PPRC. The controlling system is a member in the sysplex and is using the couple data sets. If
there is a problem that requires a site switch, it is necessary that the controlling system survives,
which may not be possible if the couple data sets were on PPRC volumes.
Keep in mind that the laws of probability state that the more volumes defined in freeze groups the
higher the likelihood of a freeze occurring. Do not include volumes in freeze groups that do not
have a requirement for time consistency of the data.
Specific recommendations for freeze and non-freeze group candidates are not made in this guide.
These decisions will be unique to each implementation.
To be ready for recovery in case of a disaster it is required that all device pairs in the freeze group
are full duplex because otherwise data consistency cannot be guaranteed. For the non-freeze
group it is acceptable to have device pairs that are not full duplex. So for an unplanned DASD
switch (takeover) it is required that all freeze device pairs are full duplex. For a planned DASD
switch it is required that all device pairs (freeze and non-freeze) are full duplex.
GDPS keeps track of the mirroring status of the PPRC devices which are defined to GDPS, and if
a change to a device pair (from FULL DUPLEX) is made from the GDPS panels, the new status
will be recorded and broadcast to all GDPS systems in the sysplex. If a suspend or delete of a
freeze device pair is made from the GDPS panels, freeze will be disabled and no freeze will occur.
Also, if there is a freeze situation, the freeze will be initiated via the NetView automation table and
the status change will be recorded and broadcast by GDPS.
Note that status changes made in any other way (for example PPRC commands from TSO) are handled differently. A SUSPEND will be trapped by the NetView automation table; if it is for a freeze device pair it will cause GDPS to do a freeze, but if it is for a non-freeze pair there will be no GDPS action. So a suspend of a non-freeze pair, or other status changes such as the delete of a non-freeze pair, will not be detected by GDPS until the DASD monitoring (monitor2) runs.
The current status is shown on the GDPS main panel as “Mirroring” and on the GDPS “DASD
Remote Copy” panels as “DASD Mirroring Status”. The status can take the following values and
colors:
• OK (green) - all device pairs are full duplex. Freeze is enabled and both planned and unplanned DASD switches can be done.
• OK (pink) - all “freeze” device pairs are full duplex and freeze is enabled. At the same time at least one “non-freeze” device pair is not full duplex, or has been swapped by P/DAS. Freeze is enabled but only unplanned DASD switches can be done.
IMPORTANT NOTE: The DASD management panels in GDPS are for MANAGEMENT
of the environment, not monitoring of the environment. The status on the GDPS panels is
not updated dynamically. Updates are usually reflected after some action is taken from the
panels, but, for performance reasons, no automatic, dynamic, periodic update of the panels
is made. Only the SDF panels are updated dynamically and SDF should be used as the
primary monitoring tool for GDPS.
The normal situation of course should be to have a green OK for DASD Mirroring Status, that is
all device pairs should be full duplex.
Note that the value OK (any color) indicates that all freeze device pairs are full duplex and freeze
is enabled. When the OK is green all device pairs are full duplex and full DASD switching
capability, both planned and unplanned, exists. If the color is pink, all freeze device pairs are full
duplex, and a DASD takeover (unplanned DASD switch) is possible.
The value NOK (always in red) indicates that at least one freeze device pair is not full duplex or a
subset of the freeze device pairs have been swapped by P/DAS, and since this means that
secondary data consistency cannot be guaranteed, freeze is disabled. For the same reason DASD
switches, both planned and unplanned, are not allowed.
Finally, the value FRZ (also in red) indicates that a freeze has been done, the secondary data is
consistent, and a DASD takeover is possible (in fact, a GEO112E/GEO113A takeover prompt
should occur when the freeze is done). When the freeze is done, freeze is immediately disabled to
prevent repeated freezes and it will not be enabled again until status has been turned back to OK.
The following actions will change the DASD Mirroring Status from a green OK:
• Doing any of the actions from the GDPS panels that will change a device pair from full duplex, for example DELPAIR or SUSPEND. If it is a freeze pair, status will change to NOK (red), and if it is a non-freeze pair, only the color will change from green to pink. Any of these actions will present a confirmation panel with a warning describing the consequences of the action, for example that freeze will be disabled if you do a DELPAIR or SUSPEND to a freeze device pair. The confirmation panel will only be presented for the first device that you delete or suspend. Any additional deletes or suspends will be done without any confirmation.
• When a freeze message is trapped by the NetView automation table for a freeze pair, a freeze will be initiated and status will change to FRZ (red).
• Status will only change from the current status to one that is considered worse, that is from left to right in the following list: OK (green), OK (pink), and last NOK or FRZ (red). Note that status can never change from NOK to FRZ or vice versa.
Status changes will immediately be shown on the SDF screen for the system where the status was
changed, and for other systems monitor1 will detect the change and update the SDF screen.
Before changing status back to OK, CQUERYs must be issued to all PPRC devices to get current
status of all the pairs, and the value of status will be set to OK only if all freeze device pairs are
full duplex, and to get a green OK all device pairs must be full duplex.
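For reference, the query issued for each device is the TSO CQUERY command; a minimal sketch of the form, using an arbitrary device number for illustration (see the "OS/390 Advanced Copy Services" book for the full CQUERY syntax):
CQUERY DEVN(X'1220') VOLUME FORMAT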
GDPS contains a DASD monitor (monitor2) which is set up to run regularly and the default is that
monitor2 runs every night at 01:00 AM in the controlling system. This can be changed through the
MONITOR2 option in GEOPLEX OPTIONS to run more frequently and/or run in all systems.
Monitor2 will issue CQUERYs to all PPRC devices (primary and secondary) and set DASD
Mirroring Status to the correct value corresponding to the current situation. If the value was
changed, it will be broadcasted to all GDPS systems in the sysplex.
NOTE: For this reason it is important that MONITOR2 only run in systems with full
connectivity to all primary and secondary volumes. Otherwise an error will be detected and
FREEZE will be disabled.
An alternate way of doing a CQUERY to all devices is to issue the 5 (Query) bottom line
command from the SSID pairs panel VPCPQSTC. This will schedule a monitor2 run in the
background and status will be changed as described above. Unlike previous GDPS releases the 5
(Query) will not lock the terminal since the monitor2 will run in the background. There will be a
message on the panel saying that the DASD monitor is running and this message also contains the
starting time for the DASD monitor. When you press ENTER and the message disappears, the
monitor has finished and the status will show the current value.
For performance reasons it has been decided not to issue CQUERYs to all devices when showing PPRC status on the SSID pairs and device pairs panels. The information shown on these panels is from the latest CQUERY issued for each device. This may cause a mismatch between the DASD Mirroring Status and the status shown for device pairs. If this occurs, the device pair status is old and DASD Mirroring Status is the value to trust, because any changes are immediately recorded in DASD Mirroring Status and broadcasted to all GDPS systems in the sysplex. After correcting a problem the reverse can be true, that is, the device information is correct and DASD Mirroring Status has to be updated by running monitor2 as described above.
When a freeze triggering message occurs for a freeze device, GDPS will do a freeze and set
DASD Mirroring Status to FRZ. This will cause an IEA494I message for each and every freeze
device which means that there can be hundreds or thousands of IEA494I messages. Again, for
performance reasons, GDPS will only react to the first freeze triggering message and the rest will
not cause any GDPS action at all. In addition, in a freeze situation, GDPS will not do any
A freeze triggering message for a non-freeze device will be ignored by GDPS so device status and
DASD Mirroring Status will not be updated until monitor2 runs. A mirroring problem for a
non-freeze device is not considered serious and therefore there will be no GDPS action.
Freeze is disabled in the following situations:
1. A freeze device pair is deleted or suspended, or monitor2 detects a freeze device pair that is not full duplex. DASD mirroring status will be NOK.
2. When a freeze occurs, freeze is immediately disabled to prevent a “freeze loop” in GDPS. When the freeze occurs, all PPRC links (paths) between the freeze SSID pairs will be removed. DASD mirroring status is set to FRZ.
NOTE: Freeze will be disabled if a suspend is done for a device pair from the GDPS
panels.
To re-enable freeze and return DASD mirroring status to OK, the corresponding recovery actions are:
1. All freeze device pairs must be established or resync-ed and they must all be full duplex. In addition, monitor2 must be run to get DASD mirroring status set to OK again. Monitor2 can be started by entering 5 (Query) on the SSID pairs panel VPCPQSTC in GDPS.
2. All PPRC links for the freeze SSID pairs must be reestablished and the freeze device pairs must be resync-ed. This can easily be done from the SSID pairs panel VPCPQSTC in GDPS where you can enter 4 (Resynch). Selection 4 will automatically do CESTPATH commands for all SSID pairs and initiate a resync of all device pairs. When all device pairs are full duplex again, monitor2 needs to be run to have DASD mirroring status set to OK.
PLEASE NOTE: SDF is the only dynamically updated status display in GDPS. It should be
the primary monitor used with GDPS.
Mirroring status is also displayed in SDF (Status Display Facility) on the GDPS SDF screen under
the heading Remote Copy.
NOTE: SDF is updated by monitor1, so there may be a delay before an update occurs.
The first entry should be present in all systems. The second entry should be present in the
controlling system/s.
Whenever status is not completely OK, any combination of the following entries can occur:
These messages (in pink or red color) indicate deviations from the normal situation, which is to
have all devices full duplex. Most of the conditions causing these messages will be detected by
checking performed by monitor1 but some will occur immediately when there is a status change.
A message can occur immediately in one system and later when monitor1 runs in other systems. If
additional conditions occur, messages will be added to the SDF screen. When conditions have
been corrected, messages will not disappear until all conditions have been corrected, that is
messages will only be deleted when all device pairs are back to full duplex and monitor2 has been
run. (Monitor2 can be initiated by entering 5 (Query) on the SSID pairs panel VPCPQSTC.)
The GEO100I and GEO126I entries will not be deleted if there is a deviation message but they
will stop being updated. This means that they will remain on the screen and they will show the
latest time when status was found to be OK before the problem occurred.
General Description.
Flashcopy is an Advanced Copy Services function that quickly copies data from source volumes
to target volumes that are within the same logical storage subsystem (LSS). Flashcopy is a feature
on ESS storage subsystems.
For a detailed description of the flashcopy function see the “OS/390 Advanced Copy
Services” book.
GDPS implements support for ‘FlashCopy before Resynch’ and ‘User initiated FlashCopy’. FlashCopy support is not implemented in RCMF.
Data consistency for flashcopy target volumes is a user responsibility. There is no verification in
GDPS whether volumes are defined for flashcopy or not and it is a user decision how to
implement the flashcopy support and use the flashcopy target volumes.
It is required that the flashcopy target devices are defined OFFLINE in IODF and that the
user has a separate IODF for the flashcopy target devices in case of disaster.
In GDPS, a freeze creates a consistent set of volumes. But once a resynch is started there is no
recoverability until the resynch has completed. To keep a set of consistent volumes until
resynch is completed, “flashcopy before resynch” is used.
Before resynchronization starts, a flashcopy is taken of all suspended secondary volumes that have a defined flashcopy device in the configuration file (GEOPARM). The flashcopy commands are executed in command processor ‘GEOCFLC’, one command for each volume that is flashcopied. The flashcopy commands will be executed in parallel, one task per SSID, assuming sufficient auto operators have been defined (one per SSID pair).
When the flashcopy before resynch is implemented in PPRC the flashcopy commands will be
invoked from the script keyword DASD=’START SECONDARY’ and in the ‘RESYNCH’
command in the main DASD remote copy panel ‘VPCPQSTC’.
The FCESTABLISH command is always executed with the NOCOPY parameter. When the FCESTABLISH command is executed, a variable is set in GDPS indicating ‘flashcopy taken’. After the flashcopy commands are completed, the CESTPATH and CESTPAIR resynch commands are executed. Until the resynch is finished, the flashcopy flag prevents new flashcopies from being initiated.
As soon as mirroring status is ‘OK’ after the resynch, the flashcopy flag will be removed in the next DASD query or Monitor 2 execution and the FCWITHDRAW command is executed to remove the relationship between source and target volumes.
If the “Flashcopy before resynch” fails, an SDF alert message and a WTOR are issued, giving the user the option to stop the resynch or to continue with the resynch without having a complete flashcopy.
The current “Latest Freeze” indicator on the GDPS main menu will be the time indicator for the
‘Flashcopy before Resynch’ time in PPRC.
The configuration file must be changed to include statements for flashcopy devices for SSIDs selected for the flashcopy function. This is done as follows:
PPRC='PDEV,SDEV,xx,,FPDEV,FSDEV'
where FPDEV (target device for flash of primary) and FSDEV (target device for flash of secondary) are added at the end of the PPRC statement to specify the first address of the ‘xx’ consecutive flashcopy devices in each SSID.
The user will have the option to specify flashcopy devices for both primary and secondary SSIDs.
This will allow flashcopy to be enabled if volumes are mirrored in opposite direction (site 2 to site
1) or if the user wants to flashcopy primary volumes (site 1).
To have the ‘Flashcopy before resynch’ function enabled for PPRC, the user must define flashcopy volumes for the secondary volumes in the freeze group. If secondary volumes in freeze groups are not defined for flashcopy, the ‘Flashcopy before resynch’ function will not be enabled.
Defining flashcopy volumes in freeze groups or non-freeze groups will enable ‘User initiated
flashcopy’.
NOTE: There is no GDPS option or keyword used to enable or disable flash copy. It is enabled
or disabled by updating the GEOPARM. Once this is done and a resynch is initiated, either from a
script or the GDPS DASD panel, flash copy will occur for all volumes for which a flash copy
target has been defined.
All freeze volumes should be defined for flashcopy to have a consistent set of volumes in
case of disaster during resynch. There is no GDPS verification for this and it is a customer
decision how to implement the flashcopy function and use the flashcopy volumes.
EXAMPLE 1
PPRCSSID='0102,0202'
PPRC='1100,2100,7,,1108,2108'
In example 1 devn 1100-1106 in SSID 0102 are mirrored to devn 2100-2106 in SSID 0202.
In SSID 0102 devn 1108-110E are reserved for Flashcopy of primary devices, in SSID 0202
devn 2108-210E are reserved for Flashcopy of secondary devices.
EXAMPLE 2
To reserve flashcopy devices in the secondary site only, the PPRC definition looks like this:
PPRCSSID='0102,0202'
PPRC='1100,2100,7,,,2108'
Example 2 reserves devn 2108-210E in SSID 0202 for Flashcopy of secondary volumes.
NOTE: It is required that the flashcopy target devices are defined OFFLINE in IODF and that the
customer has a separate IODF for flashcopy devices for use in case of disaster.
“User Initiated Flashcopy” uses the same definitions in config file as the “Flashcopy before
Resynch” function.
User initiated flashcopy supports flashcopy of all volumes using panel commands and dasd script
keywords.
NOTE: User initiated flashcopy can not be invoked if P/DAS is active in PPRC.
The following DASD script keywords have been added to provide flashcopy support:
If a flashcopy has already been taken, a prompt will be displayed to allow the user to decide to
overwrite it, or not.
NOTE: Time for latest user initiated flashcopy will not be saved.
FlashCopy can be defined and used through GDPS if the XRC configuration type is either VOLMASK or DEVNUM; it is not possible for VOLSER. How to include flash devices in the GEOXPARM file is described in section ‘2.6.6 GEOXPARM Definition for XRC’ for DEVNUM and in section ‘2.6.8.9 Enhanced Configuration Management in GDPS/XRC and RCMF/XRC - (VOLMASK)’ for VOLMASK.
User initiated FlashCopy supports FlashCopy of all defined FlashCopy volumes using panel commands and XRC script keywords. Data consistency when doing FlashCopy is a user responsibility. If a FlashCopy has already been taken, a prompt will be displayed to allow the user to decide to overwrite it, or not.
In the beginning of a ‘start session’ and ‘start secondary’, a FlashCopy is taken of all suspended
secondary volumes that have a defined FlashCopy device in the configuration file GEOXPARM.
The FlashCopy commands are executed in command processor ‘GEOCFLC’, one command for
each volume that is flash copied. The FlashCopy commands will be executed in parallel, one task
per SSID, if sufficient auto operators have been defined, at least one per SSID pair.
The FCESTABLISH command is always executed with the NOCOPY parameter. When the
FCESTABLISH command is executed, a variable is set in GDPS indicating ‘FlashCopy taken’.
After the FlashCopy commands are complete, ‘start session’ or ‘start secondary’ will execute. Until the ‘start session’ or ‘start secondary’ is complete, new FlashCopies cannot be taken.
To free up the FlashCopy devices after they have been used in a script, run a script with XRC='sessionid FCWITHDRAW SECONDARY'; otherwise there will be a prompt the next time you use them.
Beginning in GDPS V2R5 a program executes at GDPS initialization and issues a WTO with ROUTCDE=11 for all messages that are trapped in the message table. All programs that are invoked from the message table have been changed to test the route code.
Description/Operation
Message Table Healthcheck is used to verify that all messages necessary for proper GDPS function are included in the message table and have the AUTO(YES) indicator set or defaulted in the MPF table.
When GDPS is initialized, the Message Table Healthcheck is executed. It issues WTOs for all messages that are required for GDPS operation. The WTOs are issued with ROUTCDE=11 so GDPS can recognize them as health check messages. If GDPS detects messages that are not automated, an SDF alert is created to indicate this.
GEO181E At least one message is not active in the message table. In addition, there are trace entries for all missing messages in the TRACE part of SDF stating: Message xxxxxxx not trapped in message table.
User concerns.
If the installation uses the same messages for its own automation, an additional test has to be added in order not to trap on the GDPS health check. Add an extra selection item for those messages to ignore ROUTCDE=11:
routcde() ^= '0000000000100000'
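A sketch of how such a selection item might look in an installation automation table entry; IEA494I is used only as an example message, and MYEXEC and the AUTO1 autotask are hypothetical installation names:
IF MSGID = 'IEA494I' & ROUTCDE() ^= '0000000000100000' THEN
EXEC(CMD('MYEXEC') ROUTE(ONE AUTO1));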
The following DASD environment was established with multiple DASD subsystems to
demonstrate the differences in setting up our PPRC environment with different DASD platforms.
This section does not attempt to convey a comprehensive discussion on configuration alternatives.
This type of information is provided as part of the GDPS planning session provided with the
GDPS Service Offering and additional information on configuration alternatives can be obtained
from the “Remote Copy Administration Guide - SC35-0169”.
In this configuration there are two sites with a mirrored DASD configuration. Site1 has one 3990-006 with 64 volumes and one RVA T82 subsystem with 256 volumes behind it. Site2 has the same hardware configuration. It will not be necessary to mirror all of the DASD from Site1 to
Site2 in order to ensure our critical application recoverability. The specific devices requiring
PPRC protection are identified in the subsequent diagrams.
The device addresses at Site1 and Site2 utilize 4 digit device numbers and the device numbers are
unique across the configuration. Site1 and Site2 addresses are as follows:
Site1 Site2
CU DEV CU DEV
3990-006 2200-223F 3990-006 8200-823F
9393-T82 1200-12FF 9393-T82 9200-92FF
Each site has multiple ESCON directors with the ports identified for use by GDPS (see diagrams
below). Please be aware that in the RVA, the primary subsystems must have ESCON ports
dedicated to PPRC while in the 3990 implementation, ESCON adapters may be shared between
host and PPRC traffic. Furthermore, when defining the PPRC links, be aware that the ESCON interfaces at the secondary site used by PPRC links appear as host interfaces. Also be aware that when establishing cross site links using ESCON directors, all ESCON rules apply, i.e., only one switchable connection is allowed in the path. If multiple directors are used on a single path, then one director path may be switchable; the other director path must be a static path. Finally, since
the secondary subsystem interface appears as a host interface, it may not be used to establish Site2
links while Site1 links are established. This has been illustrated in the diagrams below with solid
versus dashed lines.
In the configuration, there are 2 subsystems at each location with 5 unique ‘logical control units’
and 5 unique Subsystem IDs (SSIDs) as described below.
Site1 Site2
CU SSIDs CU SSIDs
3990-006 1900 3990-006 8900
9393-T82 1800,1810, 9393-T82 8800,8810,
1820,1830 8820,8830
GDPS requires a dedicated utility device on each logical control unit to ensure access to the logical subsystem when needed. In our configuration, the last device in each group of 64 is to be used as the utility device.
The first PPRC subsystem pair environment is detailed below. It details the 3990 environment.
Eight volumes from this subsystem are to be mirrored (2220-2227).
(Figure: Sample 3990 configuration. Two 3990 Model 006 subsystems, serial numbers 0039788 and 0039988, each with Cluster 0 and Cluster 1 and ESCON interfaces A-H, attached through ESCON directors to the Application Host and the Alternate Host. The diagram identifies the ESCON director ports used for the PPRC links.)
(Figure: Sample 9393 configuration. Two RVA Model T82 subsystems, serial numbers 1325007 and 1328006, each with Cluster 0 and Cluster 1 and ESCON interfaces A-H, attached through ESCON directors to the Application Host and the Alternate Host. The diagram identifies the ESCON director ports used for the PPRC links.)
Overview:
The PPRC configuration is defined in a data set or member referred to in the NetView Startup
PROC with a GEOPARM DD statement. GDPS automation uses the definition to allow
establishment of the PPRC environment.
NOTE: GDPS/PPRC expects to have PPRC DASD defined to it. If the GEOPARM is empty you
will get errors at GDPS startup.
Briefly, the steps involved in installing GDPS - PPRC with the GEOPARM configuration file are:
§ Step 1 - Determine and document the PPRC hardware environment including subsystem pairs,
ESCON links, ESCON adapters, ESCON director ports, and volume pairs required
§ Step 2 - Upload GDPS distribution libraries to client system
§ Step 3 - Upload GDPS FIXPAK to client system
§ Step 4 - SMP/E install GDPS code
§ Step 5 - Update the NetView (SA for OS/390) started task procedure
§ Concatenate GDPS data sets to appropriate NetView DD statements
§ Step 6 - APF Authorize the #HLQ.GDPS.SGDPLOAD
§ Step 7 - Modify NetView Parameters
§ Step 8 - Create and customize a GEOPARM configuration file to reflect the PPRC
environment. A sample of the file can be found in the GEOPARM member of
#HLQ.GDPS.SGDPPARM
§ Step 9 - Add a GEOPARM DD statement to the NetView startup PROC pointing at your file
to allow GDPS access to the definitions and recycle NetView or use NetView dynamic
allocation facility.
The sample PPRC environment described in Section 2.6.1 will be used as the basis for the details of the definition process below. This sample was selected to provide examples for both the IBM 9393 DASD subsystem and the IBM 3990 environment.
The following section provides a walk-through of the process details of creating the definition file
for this environment.
The first task is to understand the logical subsystem pairs that will be established and the
associated SSIDs for these pairings. In our environment the following subsystems will be paired:
SSID 1900 to SSID 8900, SSID 1800 to SSID 8800, SSID 1810 to SSID 8810, etc...
The ESCON connections between the logical subsystems must be understood. One must
understand whether PPRC will be using direct attach, or attaching one subsystem to another
through an ESCON director or multiple directors and which ports on the directors are being used.
Finally, one must understand which ESCON adapters are being used for attachment on the
primary and secondary subsystem because the PPRC LINK definitions require information which
is determined by hardware position. In the sample environment, multiple ESCON directors in
each location are being used and the previous diagrams identify which adapters will be used for
these PPRC links. Two paths between each logical control unit pair for our RVAs (one from
each cluster) and 4 paths between our 3990 subsystems (two per cluster) will be used.
Note: Each link is defined using an 8 digit address of the format ‘aaaabbcc’ where ‘aaaa’ is the Subsystem Adapter ID (SAID) which identifies the interface location, ‘bb’ is the ESCON director portid (‘00’ for direct connections), and ‘cc’ is the logical control unit number (‘00-03’).
Remember that ESCON links are always defined from the perspective of the primary subsystem to
secondary subsystem. Also, remember if connecting directly from one subsystem to another, the
PORTID portion of the LINK address will always be ‘00’. Finally, remember in dealing with
subsystems that support logical control units such as the RVA, the last two positions of the LINK address represent the logical control unit number. This will always be ‘00’ for 3990s.
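As a worked illustration of the format, the first link used later in this section for the 1800-8800 subsystem pair combines SAID ‘0000’, ESCON director port ‘B0’, and logical control unit ‘00’:
0000B000 (SAID=0000, PORTID=B0, LCU=00)
A direct connection from the same adapter to the same logical control unit would instead be coded ‘00000000’, since the PORTID portion is ‘00’ for direct connections.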
It is beneficial to utilize a diagram similar to those included previously to document the ESCON
connections or it may be more helpful to complete worksheets (examples provided below) to
document your connections. Information such as Subsystem Adapter IDs (SAIDs) is included in the worksheets, and therefore one may find it helpful to use them to ensure that the SAIDs and LINKS are correct for the environment being established.
The first subsystem pair in our configuration is the 3990 subsystem pair. Previously this
configuration was documented and now it will be used to determine the PPRC link information.
The following worksheet identifies the required Subsystem Adapter IDs (SAIDs) for each PPRC link as well as the ESCON Director PORTIDs, both of which must be known to define the PPRC links. This worksheet provides one valuable approach to the task of documenting this information.
(Worksheet: PPRC link planning for the sample 3990 pair, recording for the application and recovery system subsystems the ESCON director ports used (E0, D0, E4, E8, EB, D4) and the SAIDs of the Cluster 0 and Cluster 1 interfaces A-H, from which the link addresses are built.)
Note: One must check with the hardware manufacturer to determine SAIDs for each unique platform.
(Worksheet: PPRC link planning for the sample RVA pair, recording the ESCON director ports used (C0, B0, C4, B4, B8, CC, BC, C8) and the SAIDs of the Cluster 0 and Cluster 1 interfaces, from which the link addresses are built.)
The next task involves determining the volume pairs and the settings for the CRIT and FREEZE
parameters. The general recommendation for the GDPS environment is to set the CRIT=
parameter to N and the FREEZE= parameter to Y. If CRIT= Y is specified, then any error
resulting in the IEA491E message will force ‘write-inhibit’ to the primary device(s) and
application availability will be affected. Prior to GDPS with FREEZE, CRIT=Y was the only
method that could be used to guarantee consistency on secondary volumes. Now, with FREEZE
set to Y, GDPS can provide a consistent set of secondary volumes without adversely affecting
access to the primary volumes. In the sample environment, only one logical subsystem pair will be
established as a non-freeze group (SSID pair 1830-8830). These volumes are assumed to be
critical to GDPS automation and cannot tolerate volume inaccessibility. All volume pairs will be
specified with the CRIT=N option. A symmetric configuration will be established which will map devices on the primary subsystem to devices on the secondary subsystem with the same CCA, i.e., device 1224 will be paired with device 9224.
This step was discussed in detail in Section 2.5. Please refer to that information if required.
This process was discussed in detail in Section 2.5. Please refer to that information if required.
GDPS install is described in the ReadMe file distributed with the product and in the Info member in the installation library. Also see Section 2.5.
This step requires that the NetView Startup Procedure be modified by concatenating several
GDPS libraries to specific DDs in the Startup PROC. First concatenate the
#HLQ.GDPS.SGDPEXEC library to the DSICLD DD statement. Next, concatenate the
#HLQ.GDPS.SGDPMSG library to the DSIMSG DD statement and concatenate the
#HLQ.GDPS.SGDPPNL to the CNMPNL1 DD statement. Finally, concatenate
#HLQ.GDPS.SGDPPARM library to the DSIPARM DD statement (see sample below).
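A minimal sketch of these concatenations as they might appear in the NetView startup PROC; the NETVIEW.EXISTING.* data set names are placeholders for the installation's existing NetView libraries:
//DSICLD   DD DISP=SHR,DSN=NETVIEW.EXISTING.CLIST
//         DD DISP=SHR,DSN=#HLQ.GDPS.SGDPEXEC
//DSIMSG   DD DISP=SHR,DSN=NETVIEW.EXISTING.MSG
//         DD DISP=SHR,DSN=#HLQ.GDPS.SGDPMSG
//CNMPNL1  DD DISP=SHR,DSN=NETVIEW.EXISTING.PNL
//         DD DISP=SHR,DSN=#HLQ.GDPS.SGDPPNL
//DSIPARM  DD DISP=SHR,DSN=NETVIEW.EXISTING.DSIPARM
//         DD DISP=SHR,DSN=#HLQ.GDPS.SGDPPARM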
There are several ways to provide SGDPLOAD authorization. One method is to add the
#HLQ.GDPS.SGDPLOAD to the LINKLST. A second method is to use the SET PROG APF
command to authorize the library. A third way would be to copy the modules into an already
existing APF authorized library.
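For the second method, a sketch of the operator command; the volume serial is a placeholder, and for an SMS-managed library the SMS keyword would be used instead of VOLUME:
SETPROG APF,ADD,DSNAME=#HLQ.GDPS.SGDPLOAD,VOLUME=volser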
In this installation step, one GDPS member must be included in the NetView DSIDMN member.
While there are several methods for accomplishing this, one such method involves copying
member DSIDMNGP from the GDPS data set (#HLQ.GDPS.SGDPPARM) into the NetView
DSIPARM data set. Then the module is referred to via an include statement in the NetView
DSIDMN member. The include statement can be placed at the end of the DSIDMN member,
prior to the END statement.
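A sketch of how this might look at the bottom of the DSIDMN member, assuming the NetView %INCLUDE facility is used for the copied member:
%INCLUDE DSIDMNGP
END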
Having the PPRC / ESCON environment documentation complete and the GDPS automation
code installed, the configuration definition process can begin. A base file to start with may be
found in the #HLQ.GDPS.SGDPPARM library distributed with GDPS. The member for the file
is GEOPARM. Copy this file into another file or PDS member which can be modified in Step 9
below. Before beginning Step 9, there are some rules and background information that must be
known in order to create a GDPS-PPRC configuration file.
The first thing to understand is the GDPS statement syntax. While it is very simple, please be
aware of the following:
1. Comment statements have an asterisk (*) in column one. The remaining characters on the
line are for documentation purposes only.
2. Variables for the GDPS keywords must be enclosed in single quotes.
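For example, a comment line followed by a keyword statement coded to these rules (values taken from the sample environment definitions later in this section):
**** SITE1 to SITE2 LINKS for mirroring from Primary Site to Alternate Site
SITE1='1800,8800,Y,N,0000B000,0051B400'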
The second item to understand is that the PPRC configuration control file is made up of three
distinct sections. The convention for coding this file is to complete each section before
proceeding with the next section. Failure to do so may produce unpredictable results. These
sections and their objectives follow.
• The "GEOPLEX LINKS" SECTION defines the ESCON LINKS used between the SSID pairs. There are four control statements used in this section. These control statements are:
- "SITE1=" - This statement defines the links from Site1 to Site2 for one SSID pair.
- "SITE2=" - This statement defines the links from Site2 back to Site1 for one SSID pair. These links will be used whenever a DASD Site Switch operation is initiated.
- "SITE1PDAS=" - This statement must be used when PDAS is used to swap DASD from Site1 to Site2 for one SSID pair. It defines links from Site2 to Site1 for one SSID pair.
- "SITE2PDAS=" - This statement must be used when PDAS is used to swap DASD from Site2 to Site1 for one SSID pair. It defines links from Site1 to Site2 for one SSID pair.
Note: SITE2 and SITE2PDAS statements are not valid for the RCMF environment. If used, a message will be issued and the statements ignored.
• The "GEOPLEX MIRROR" SECTION defines the logical SSID pairs and the subsequent PPRC volume pairs. There are two control statements used in this section. These statements are:
- "PPRCSSID=" - This statement defines the primary and secondary SSID pair.
- "PPRC=" - This statement defines the PPRC device pairs for the preceding PPRCSSID pair.
Note: SSID pairs and associated volume pairs should be completed before proceeding to the next subsystem; otherwise, unpredictable results might occur.
• The "GEOPLEX NONSHARE" SECTION defines the GDPS utility devices. The utility device is used to issue PPRC commands to the DASD control unit. The utility device must be defined as NONSHARED to make sure that it's never RESERVED to another (or any) system. One utility device is required for each primary and each secondary subsystem. There is one control statement used in this section, the "NONSHARE=" statement.
The third and final item that must be understood before moving on to the actual file definition is
the syntax of the individual statements identified above. Please recall that there are 7 unique
statements used to define the GDPS-PPRC environment. Let’s begin with the “SITE1=”
statement.
Note: GDPS checks for invalid statement identifiers such as SITE3, etc... but no
checks are made for invalid Values. At execution time these statements will be
ignored.
This statement is used to define the PPRC links from Site1 to Site2 for one SSID pair.
SITE1='psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4,linkdev5,C']
[ linkdev6,linkdev7,linkdev8']
where
'psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4]'
| | | | | | | |_LINK4
| | | | | | |_SAID/DEST FOR LINK3
| | | | | |_SAID/DEST FOR LINK2
| | | | |___ SAID/DEST FOR LINK1
| | | |___ CRIT=Y or CRIT=N
| | |___ FREEZE Y or N (Y SETS UP FREEZE GROUP)
| |___ SECONDARY SSID RESIDING AT SITE2
|___ PRIMARY SSID RESIDING AT SITE1
There is support for up to 8 links, but if you want to define more than 5 links, you need a
continuation. After linkdev5 add ,C' and define links 6, 7, and 8 on the continuation line. Note that
there should be no leading apostrophe on the continuation, but leading spaces are allowed. Also
there should be an ending apostrophe.
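A sketch of the continuation format, using the sample 3990 subsystem pair; the last four link addresses are assumed purely to illustrate an 8 link definition:
SITE1='1900,8900,Y,N,0000D000,0001D400,0010D400,0011D000,0000D800,C'
0001D800,0010D800,0011D800'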
Note: Each link is defined using an 8 digit address of the format ‘aaaabbcc’ where
‘aaaa’ is the Subsystem Adapter ID (SAID) which identifies the interface location, ‘bb’
is the ESCON director portid (‘00’ for direct connections), and ‘cc’ is the logical control
unit number (‘00-03’).
Note: From 1 to 8 links can be defined on a single SITE1 statement for subsystems
supporting 8 links. Some examples of this statement follow:
• SITE1='1800,8800,N,N,0000B000,0010B400'
• SITE1='1900,8900,N,N,0000D000,0001D400,0010D400,0011D000'
The next statement is the SITE2 statement. This statement defines the links from Site2 to Site1 for one SSID pair. It is used whenever a DASD Site Switch is initiated. It has identical syntax to the SITE1 statement; the difference is that it establishes links from Site2 to Site1.
SITE2='psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4,linkdev5,C']
[ linkdev6,linkdev7,linkdev8']
where
'psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4]'
| | || | | | |_LINK4
| | || | | |_SAID/DEST FOR LINK3
| | || | |_SAID/DEST FOR LINK2
| | || |___ SAID/DEST FOR LINK1
| | | |___ CRIT=Y or CRIT=N
| | |___ FREEZE Y or N (Y SETS UP FREEZE GROUP)
| |___ SECONDARY SSID RESIDING AT SITE1
|___ PRIMARY SSID RESIDING AT SITE2
There is support for up to 8 links, but if you want to define more than 5 links, you need a
continuation. After linkdev5 add ,C' and define links 6, 7, and 8 on the continuation line. Note that
there should be no leading apostrophe on the continuation, but leading spaces are allowed. Also
there should be an ending apostrophe.
Note: From 1 to 8 links can be defined on a single SITE2 statement for subsystems
supporting 8 links. Please see examples in the preceding section.
Note: The SITE2 statement is not valid for the RCMF environment. If used, a message
will be issued and the statement will be ignored.
Note: The CRIT and Freeze settings should match those on the SITE1 statement.
Subsection 2.6.4.8.2.3 SITE1PDAS Statement
For those using PDAS to swap from Site1 to Site2 volumes, the SITE1PDAS statement must be
used to identify the PPRC links. There are several formats for this statement as explained below.
SITE1PDAS='psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4,linkdev5,C']
[ linkdev6,linkdev7,linkdev8']
where
'psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4]'
| | | | | | | |_LINK4
| | | | | | |_SAID/DEST FOR LINK3
| | | | | |_SAID/DEST FOR LINK2
| | | | |___ SAID/DEST FOR LINK1
| | | |___ CRIT=Y or CRIT=N (SHOULD MATCH SITE1 STMT)****
| | |___ FREEZE Y or N (SHOULD MATCH SITE1 STMT)
| |___ SECONDARY SSID RESIDING AT SITE2
|___ PRIMARY SSID RESIDING AT SITE1
There is support for up to 8 links, but if you want to define more than 5 links, you need a
continuation. After linkdev5 add ,C' and define links 6, 7, and 8 on the continuation line. Note that
there should be no leading apostrophe on the continuation, but leading spaces are allowed. Also
there should be an ending apostrophe.
Note: From 1 to 8 links can be defined on a single SITE1PDAS statement for subsystems
supporting 8 links.
Note: The CRIT and Freeze settings should match those on the SITE1 statement.
An alternate format for this statement exists provided the links defined in a SITE2 statement can be used for the DASD swap operation. The alternate formats are:
SITE1PDAS='psid,ssid,f,c,=SITE2'
or
SITE1PDAS='psid,ssid,f,c,linkdev1,=USE_LAST_PRIMARY_LINK'
Some examples of this statement follow:
• SITE1PDAS='1800,8800,N,N,=SITE2'
• SITE1PDAS='1900,8900,N,N,0003E000,0013E400,0003E800,0013EB00'
• SITE1PDAS='1900,8900,N,N,0003E000,=USE_LAST_PRIMARY_LINK'
For those using PDAS to swap from Site2 to Site1 volumes, the SITE2PDAS statement must be
used to identify the PPRC links. The SITE2PDAS statement has identical syntax as the
SITE1PDAS statement.
SITE2PDAS='psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4,linkdev5,C']
[ linkdev6,linkdev7,linkdev8']
where
'psid,ssid,f,c,linkdev1[,linkdev2,linkdev3,linkdev4]'
| | | | | | | |_LINK4
| | | | | | |_SAID/DEST FOR LINK3
| | | | | |_SAID/DEST FOR LINK2
| | | | |___ SAID/DEST FOR LINK1
| | | |___ CRIT=Y or CRIT=N (SHOULD MATCH SITE2 STMT)****
| | |___ FREEZE Y or N (SHOULD MATCH SITE2 STMT)
| |___ SECONDARY SSID RESIDING AT SITE1
|___ PRIMARY SSID RESIDING AT SITE2
or
SITE2PDAS='psid,ssid,f,c,=SITE1'
or
SITE2PDAS='psid,ssid,f,c,linkdev1,=USE_LAST_SECONDARY_LINK'
There is support for up to 8 links, but if you want to define more than 5 links, you need a
continuation. After linkdev5 add ,C' and define links 6, 7, and 8 on the continuation line. Note that
there should be no leading apostrophe on the continuation, but leading spaces are allowed. Also
there should be an ending apostrophe.
Note: From 1 to 8 links can be defined on a single SITE2PDAS statement for subsystems
supporting 8 links. Please see examples in the preceding section.
Note: The SITE2PDAS statement is not valid for the RCMF environment. If used, a
message will be issued and the statement will be ignored.
Note: The CRIT and Freeze settings should match those on the SITE2 statement.
Please see previous sections for examples of these statements and a description of the alternate formats for the SITEnPDAS statements.
The next couple of GDPS statements are used to define the primary to secondary volume pairs
relationships. They appear in the GEOPLEX MIRROR section of the PPRC configuration file.
These two GDPS control statements are 1) PPRCSSID and 2) PPRC. There should be a
PPRCSSID statement corresponding to each SITE1 statement. Also, all PPRC statements for
device pairs in a SSID pair should be defined after the PPRCSSID statement. This is not verified
by GDPS and failure to follow this rule may produce unpredictable results. The syntax & rules
for these statements are as follows:
PPRCSSID='psid,ssid' where
'psid,ssid'
| |
| |___ SECONDARY SSID
|___ PRIMARY SSID
PPRC='pdev,sdev,nn[,x]' where
'pdev,sdev,nn[,x]'
| | | |
| | | |___ CRIT=Y OR N (OPTIONAL-DEFAULTS TO
| | | SPECIFICATION ON LINK STMT)****
| | |___ NUMBER OF CONSECUTIVE DEVICE-NUMBERS
| |___ SECONDARY DEVICE NUMBER
|___ PRIMARY DEVICE NUMBER
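For example, the first subsystem pair of the sample environment and its eight device pairs (taken from the GEOPLEX MIRROR listing later in this section) are coded as:
PPRCSSID='1800,8800'
PPRC='1220,9220,08,N'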
Note: The subsystem pairs and the associated volume pairs must be adjacent to one
another in the control file. It is required to completely define one SSID pair before
proceeding with the next. Failure to do so will prevent CONFIG processing from
running in parallel and may also produce unpredictable results.
Note: The CRIT= specification on the PPRC statement will override specification in the
GEOPLEX LINKS policy
PPRC='pdev,sdev,nn[,x],tdevp,tdevs' where
'pdev,sdev,nn[,x],tdevp,tdevs'
| | | | | |
| | | | | |
| | | | | |___ TARGET FLASHCOPY DEVICE NUMBER
| | | | | FOR FIRST SECONDARY DEVICE NUMBER
| | | | |___ TARGET FLASHCOPY DEVICE NUMBER
| | | | FOR FIRST PRIMARY DEVICE NUMBER
| | | |___ CRIT=Y OR N (OPTIONAL-DEFAULTS TO
| | | SPECIFICATION ON LINK STMT)****
| | |___ NUMBER OF CONSECUTIVE DEVICE-NUMBERS
| |___ SECONDARY DEVICE NUMBER
|___ PRIMARY DEVICE NUMBER
Note: The flashcopy device numbers are defined consecutively from the ‘nn’ value.
When defining flashcopy devices verify that the target device numbers start at an address
higher than the highest pprc device number and that all flashcopy device
numbers fit within the address range of the LSS.
• PPRC='1100,2100,7,N,1108,2108'
Flashcopy definition in primary and secondary LSS.
This statement defines PPRC primary devices on address 1100-1106 with flashcopy target devices on address 1108-110E.
PPRC secondary devices are defined on address 2100-2106 with flashcopy target devices on address 2108-210E.
• PPRC='1200,2200,7,N,,2208'
Flashcopy definition in secondary LSS only.
NOTE: The NONSHARE device may not be part of a PPRC pair. If a NONSHARE device is
found as one of the devices in a PPRC pair, Config processing will report an ERROR and will not
complete.
Finally, to complete the GDPS definitions, the GDPS utility devices must be defined using the
“NONSHARE” statement. The rules for coding this statement are as follows:
Note: Each SSID ( primary and secondary) participating in a SSID pair requires a
utility device
For ESS subsystems the logical subsystem has to be defined to GDPS which is done with the LSS
parameter. For ESS subsystems the NONSHARE statement should be
The next step in the process is to define the GDPS-PPRC environment using the GDPS
configuration file according to the syntax described above. This next section provides a walk-thru
of the definition process for the sample configuration described in Section 2.6.1. For those
comfortable with the syntax and application to the PPRC configuration, one may wish to bypass the walk-thru and proceed with Section 2.6.4.10.
GDPS must understand the PPRC configuration it is to establish, manage, and interface with.
Using the control statements just discussed, GDPS provides a panel interface for establishing the
PPRC environment. The panel interface will allow PPRC links and volume pairs to be established,
environment status to be obtained and the PPRC configuration to be managed. The definitions
allow GDPS to interface to PPRC without operations requiring detailed PPRC command
knowledge.
In the sample configuration, there are a number of PPRC subsystem pairs which need to be defined along with the PPRC links between them. As an example, let's establish the definitions for the SSID pair between SSID 1800 at the primary site and SSID 8800 at the secondary site. The first definition required is the SITE1 statement which will define to GDPS the PPRC links between the SSID pair. SITE1='1800,8800,Y,N...' defines the SSID pair and sets this SSID pair up as a
FREEZE group with CRIT=N. This is the normal setting for the Freeze and Crit parameters
when GDPS monitoring and automation are active. These settings will provide a consistent copy
of the data at Site2 without the impact on host application availability that CRIT=Y can have. See the
Remote Copy Administration guide for details on the CRIT=Y parameter. The actual LINK
portion of the SITE1 statement is composed of from 1 to 4 8 digit numbers of the format
‘aaaabbcc’ where ‘aaaa’ is the Subsystem Adapter ID (SAID) which identifies the interface
location, ‘bb’ is the ESCON director portid (‘00’ for direct connections), and ‘cc’ is the logical
control unit number (‘00-03’). Failure to define the links correctly will prevent the PPRC links
from being established, so much caution must be exercised in applying the documented PPRC
configuration to the definition statements. Looking at the diagrams for the sample environment,
the ESCON adapters which can be used are identified as interfaces ‘A’ & ‘F’ on each cluster.
These interfaces have Subsystem Adapter Ids (SAIDs) of ‘0000’ ,’0041’ on cluster 0 and ‘0010’
and ‘0051’ on cluster 1. This is generally the most confusing part of the definition process. It is
imperative that the SAIDs are correctly identified and transferred to the GDPS control statements.
In the sample environment, two PPRC links, one from each cluster, are to be used. This will provide good availability, but improved performance may be experienced when four paths are defined, depending on the I/O activity. In order to define the two links for this subsystem pair, SAIDs ‘0000’ and ‘0051’ will be used and according to the diagrams they will utilize the ‘B0’ and ‘B4’ PORTIDs respectively. Since in the sample environment, an LCU0 to LCU0 SSID pair is
being defined, the ‘cc’ portion of the link address will be ‘00’ (if a different LCU pair were being
defined this value would be different). Combining this information with the proper syntax the
definition statement for this subsystem pair would be,
SITE1='1800,8800,Y,N,0000B000,0051B400'. For each subsystem pair a similar process must
be completed. The following GDPS control statements represent one possible definition for the
sample environment.
GEOPLEX LINKS
**** SITE1 to SITE2 LINKS for mirroring from Primary Site to Alternate Site
SITE1='1800,8800,Y,N,0000B000,0051B400'
SITE1='1810,8810,Y,N,0041BC01,0010B801'
SITE1='1820,8820,Y,N,0000B002,0051B402'
SITE1='1830,8830,N,N,0041BC03,0010B803'
SITE1='1900,8900,Y,N,0000D000,0001D400,0010D400,0011D000'
The SITE2 definition process is used to define secondary to primary SSID pairs and their
associated links. Be aware that SITE2 definitions are not valid for the RCMF environment. The
process is similar to defining the SSID pairs from Site1 to Site2 using the GDPS SITE1 statement.
The difference is that the SSIDs pairs and links will be established from the perspective of the
secondary to primary site. These links will be established by GDPS and will remain active PPRC
links until the paths are deleted through the GDPS panels. These definitions allow GDPS to
perform a DASD Site Switch when required. The FREEZE parameters and CRIT parameters
should be the same as was used to define the corresponding primary to secondary subsystem pairs
links on the SITE1 statement. In our sample environment for subsystems whose SSIDs are 88xx, the interfaces on the secondary subsystems are ‘B’ on cluster 0 and ‘B’ on cluster 1 and will
utilize different physical cross site links so that they can be established concurrently with the links
from Site1 to Site2. The requirement to establish connectivity from Site2 to Site1 is satisfied by
using the SITE2 GDPS statement. Let’s set up the links for the SSID pair of 8800 to 1800.
Using the diagrams once again, follow the ESCON path back on Cluster 0 Interface B. The ESCON portid for this link is ‘C6’ and the SAID is ‘0001’. The logical control unit portion of the link identifier is ‘00’ since this represents the LCU0 subsystem pair. Therefore, one PPRC path would be defined as ‘0001C600’. As mentioned before, it is required that all PPRC link
definitions for the Site2 to Site1 paths be completed as a group otherwise results are
unpredictable. The following statements therefore represent the current definitions (including all
SITE1 and SITE2 definitions) for our sample environment allowing multiple PPRC paths for each
subsystem pair.
GEOPLEX LINKS
**** SITE1 to SITE2 LINKS for mirroring from Primary Site to Alternate Site
SITE1='1800,8800,Y,N,0000B000,0051B400'
SITE1='1810,8810,Y,N,0041BC01,0010B801'
SITE1='1820,8820,Y,N,0000B002,0051B402'
SITE1='1830,8830,N,N,0041BC03,0010B803'
SITE1='1900,8900,Y,N,0000D000,0001D400,0010D400,0011D000'
***** SITE2 to SITE1 LINKS for mirroring from Alternate Site back to Primary Site
SITE2='8800,1800,Y,N,0001C600,0011C200'
SITE2='8810,1810,Y,N,0011C201,0001C601'
SITE2='8820,1820,Y,N,0001C602,0011C202'
SITE2='8830,1830,N,N,0011C203,0001C603'
SITE2='8900,1900,Y,N,0002E000,0012E400,0002EB00,0012E800'
For the sake of illustration it is assumed that PDAS will be used. Recall that the SITE1PDAS and SITE2PDAS statements must be defined in order to notify GDPS which ESCON links to use when PDAS is used. Also, be aware that SITE2PDAS statements are invalid for the RCMF environment. These statements define the SSID pairs and associated links used during the PDAS operation. In the sample environment, defined links are available for PPRC activity (in each direction) and can be used for the PDAS swap activity as well. Adding the SITE1PDAS and SITE2PDAS statements, the complete GEOPLEX LINKS section for the sample environment is:
GEOPLEX LINKS
**** SITE1 to SITE2 LINKS for mirroring from Primary Site to Alternate Site
SITE1='1800,8800,Y,N,0000B000,0051B400'
SITE1='1810,8810,Y,N,0041BC01,0010B801'
SITE1='1820,8820,Y,N,0000B002,0051B402'
SITE1='1830,8830,N,N,0041BC03,0010B803'
SITE1='1900,8900,Y,N,0000D000,0001D400,0010D400,0011D000'
***** SITE2 to SITE1 LINKS for mirroring from Alternate Site back to Primary Site
SITE2='8800,1800,Y,N,0001C600,0011C200'
SITE2='8810,1810,Y,N,0011C201,0001C601'
SITE2='8820,1820,Y,N,0001C602,0011C202'
SITE2='8830,1830,N,N,0011C203,0001C603'
SITE2='8900,1900,Y,N,0002E000,0012E400,0002EB00,0012E800'
***** Links used to establish mirroring back to Primary Site after a SWAP to the Alternate
SITE1PDAS='1800,8800,Y,N,=SITE2'
SITE1PDAS='1810,8810,Y,N,=SITE2'
SITE1PDAS='1820,8820,Y,N,=SITE2'
SITE1PDAS='1830,8830,N,N,=SITE2'
SITE1PDAS='1900,8900,Y,N,=SITE2'
***** Links used to establish mirroring to Alternate Site after a SWAP to the Primary Site
SITE2PDAS='8800,1800,Y,N,=SITE1'
SITE2PDAS='8810,1810,Y,N,=SITE1'
SITE2PDAS='8820,1820,Y,N,=SITE1'
SITE2PDAS='8830,1830,N,N,=SITE1'
SITE2PDAS='8900,1900,Y,N,=SITE1'
Having completed the definition for the PPRC links, the volume pairs can be defined and this is
completed in the GEOPLEX MIRROR section. The “PPRCSSID” and “PPRC” statements are
used for this purpose. The “PPRCSSID” statement sets up the SSID pair for the subsequent
PPRC statement(s) which identifies the device pairs. Remember, it is required to completely
define one SSID pair before proceeding with the next. Failure to do so may produce unpredictable results.
In the sample configuration, devices 1220-1227 are on the subsystem with a SSID of 1800. These devices are to be mirrored to devices 9220-9227 respectively, which are on the subsystem with a SSID of 8800. The subsystem pair that needs to be established therefore is PPRCSSID='1800,8800' and the “PPRC” statement which would follow immediately would be PPRC='1220,9220,8', where the 8 represents how many consecutive device pairs can be
established with this one definition. The CRIT operand should match the default setting
established in the LINKS section unless it is desirable to ‘unit check’ writes for a specific group of
primary volume(s). See the Remote Copy Administration Guide for a discussion on the CRIT=
parameter. The remaining control statements for the other device pairs for the sample
environment are as follows:
GEOPLEX MIRROR
***** The control statements used to define the volume pairs are as follows:
PPRCSSID='1800,8800'
PPRC='1220,9220,08,N'
PPRCSSID='1810,8810'
PPRC='1240,9240,16,N'
PPRCSSID='1820,8820'
PPRC='1290,9290,16,N'
PPRCSSID='1830,8830'
PPRC='12E0,92E0,16,N'
PPRCSSID='1900,8900'
PPRC='2220,8220,08,N'
Finally, the GDPS utility devices must be defined. These devices were identified earlier and
require the GDPS “NONSHARE” statement to be coded. The following statements represent the
sample environment.
GEOPLEX NONSHARE
*****The control statements to define our GDPS utility devices are as follows:
NONSHARE='123F'
NONSHARE='127F'
NONSHARE='12BF'
NONSHARE='12FF'
NONSHARE='923F'
NONSHARE='927F'
NONSHARE='92BF'
NONSHARE='92FF'
NONSHARE='223F'
NONSHARE='823F'
Subsection 2.6.4.10 GDPS-PPRC Installation: Add GEOPARM DD to NetView
Once the coding of the PPRC configuration file has been completed, it should be checked against
the documents created earlier to ensure correctness. When there is agreement that they are
synchronized, add the GEOPARM DD statement to the NetView Startup PROC.
Recycle NetView or use NetView dynamic allocation facility to allow access to this file for GDPS
and begin to test out the PPRC environment using the interactive panels to establish links and
volume pairs, etc...
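A minimal sketch of that DD statement as it might appear in the NetView startup PROC; the data set and member names are placeholders for the configuration file created in Step 8:
//GEOPARM  DD DISP=SHR,DSN=#HLQ.GDPS.CNTL(GEOPARM)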
This function has been removed from GDPS with V2R4 of GDPS.
Definition of the PPRC environment must now be done using the GEOPARM method. Anyone using the SA for OS/390 method of defining PPRC should migrate to GEOPARM as soon as possible.
Documentation of the methodology is being removed from the install guide as of GDPS V2R4.
Users of the SA for OS/390 method should retain copies of previous versions of the install guide
for reference.
Definitions for the XRC configuration are placed in a member of a PDS (or PDSE). This member is allocated via the GEOXPARM DD card that is defined in the NetView started task JCL. (A DD statement for this file has to be added to the NetView procedure using DD name GEOXPARM.) It contains 2, 3 or 4 different statement types:
MSESSIONID='sessname,mhlq'
• The MSESSIONID statement is an optional statement and its presence indicates that Coupled SDM support will be used. Sessname is the name of the master session. Mhlq is the high level qualifier of the master session's control data set. All SESSION statements defined after an MSESSIONID statement will be coupled to the previous MSESSIONID.
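A sketch of an MSESSIONID statement; the master session name and high level qualifier shown are assumed values for illustration:
MSESSIONID='MSESS1,XRCM'
Any SESSION statements that follow it in GEOXPARM would then be coupled under master session MSESS1.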
SESSION=’sessname,normalloc,altloc,errorlevel,sessiontype,hlq,appl-sys’
• The SESSION statement contains positional parameters and care must be taken when not specifying a parameter.
Note: Errorlevel specifies how the XRC session responds when an error occurs that causes the
session to become unable to process a volume.
SESSION
Specifies that, if a permanent error associated with a duplex primary or secondary volume occurs,
XRC suspends all volume pairs in the session, regardless of the volume status. Choose the
SESSION keyword when losing any volume pairs from the session would negatively affect the
usability of the remaining secondary volumes. Specify SESSION to ensure that all secondary
volumes necessary for recovery are consistent up to the time of failure.
VOLUME
Specifies that, if a permanent error occurs, the XRC session suspends only the duplex volume pair or
pairs that are associated with the error. All other volumes continue to process.
group_name
Specifies that, if an error to a duplex primary or secondary volume occurs, XRC suspends all volume
pairs that are associated with that group name. The group_name can be any name acceptable to
TSO. The maximum length of the name is eight characters. Do not include embedded blanks. TSO
ignores leading blanks and trailing blanks. Do not use the reserved names “VOLUME” and
“SESSION”.
Sessiontype specifies the operating mode for the XRC session. Specify XRC for a disaster recovery
session or MIGRATE for a data migration session.
Hlq specifies the high-level-qualifier for the state, control, and journal data sets for this XRC
session.
XRC=’pvolser,svolser,errorlevel,donotblock,excl,sc,utl’
or
XRC=’pdevnum,sdevnum,number,utl,errorlevel,donotblock,excl,sc,priority,primaryFC,secondaryFC’
The XRC statement contains positional parameters and care must be taken when not specifying a parameter.
Note: Errorlevel specifies how the XRC session responds when an error occurs that causes the
session to become unable to process a volume.
SESSION
Specifies that, if a permanent error associated with a duplex primary or secondary volume occurs,
XRC suspends all volume pairs in the session, regardless of the volume status. Choose the
SESSION keyword when losing any volume pairs from the session would negatively affect the
usability of the remaining secondary volumes. Specify SESSION to ensure that all secondary
volumes necessary for recovery are consistent up to the time of failure.
group_name
Specifies that, if an error to a duplex primary or secondary volume occurs, XRC suspends all volume
pairs that are associated with that group name. The group_name can be any name acceptable to TSO. The maximum length of the name is eight characters. Do not include embedded blanks. TSO ignores leading blanks and trailing blanks. Do not use the reserved names “VOLUME” and
“SESSION”.
Note: When an update occurs, a copy of the update is placed in cache for the data mover to read
and the corresponding track bit in the hardware bitmap is set. The data mover reads the data out of
cache. If the update rate overruns the data mover capability to read, then residuals will accumulate.
When the residuals for a single device exceed a limit (x'500' records) then updates to that specific
device are paced to a rate the data mover can read at. This is called device blocking. The user
can exclude a device from being blocked by specifying Y for the donotblock parameter and should do
that for some volumes (like WADs or high performance volumes).
• excl can be Y or N. Y means that this volume pair is allowed to be in a SUSPENDED state without
generating alerts indicating that the SDM session is not in sync. The default is N.
• sc can be a two-character value that defines the primary storage control unit session (cu-session) that
the pair will be added to. The default is '--'.
Note: XRC Version 1 provided a single reader subtask to read data from a primary
storage control unit. XRC Version 2 changes this implementation: the installation can now enable
parallel SDM subtasks to the same primary storage control unit by exploiting the concept of primary
storage control sessions (not to be confused with the SDM session). A primary storage control
session appears to the SDM as another logical primary storage control and since the SDM runs one
subtask per logical primary storage control, it can now run multiple subtasks per physical primary
storage control.
Each XRC session can support a maximum of 80 storage control sessions. The number of storage
control sessions that may be in effect for a single storage control depends on the capability of the
storage control. For example, 3990 and 9390 Storage Controls can each manage a maximum of four
storage control sessions, and each 2105 Storage Control LSS can manage a maximum of 64 storage
control sessions.
All XRC statements defined after a session-statement will be connected to the above session.
The example below shows a GEOXPARM member in the volser-version format for the XRC statement:
GEOPLEX XRC,
SESSION='STHLM3,DSSAN,DSS2N,SESSION,XRC,XRC1,DSSKO',
XRC='DSKE10,DV09D0,SESSION,N',
XRC='DSKE11,DV09D1,GROUP,N,Y',
XRC='DSKE20,DSKE60,VOLUME,Y,N',
XRC='DSKE21,DSKE61,DEFAULT,N,,SC',
XRC='DSKE22,DSKE62,VOLUME,N',
XRC='DSKE23,DSKE63,DEFAULT,N',
XRC='DSKE30,DSKE70,VOLUME,N',
XRC='DSKE31,DSKE71,DEFAULT,N',
XRC='DSKE32,DSKE72,VOLUME,N',
XRC='DSKE33,DSKE73,SESSION,',
SESSION='STHLM3,DSSBN,,SESSION,XRC,XRC1,DSSBO',
XRC='DSKE24,DV09D4,SESSION,XRC',
XRC='DSKE25,DV09D5,GROUP,N',
The example of the GEOXPARM member below represents the device number (devnum-version) format
for the XRC statement:
GEOPLEX XRC,
SESSION='STHLM3,DSSAN,DSS2N,SESSION,XRC,XRC1,DSSKO',
XRC='0E10,09D0,2,Y,SESSION,N,N,AA',
XRC='0E20,0E60,4,N,VOLUME,Y,N',
XRC='0E30,0E70,4,N,VOLUME,N',
SESSION='STHLM3,DSSBN,,SESSION,XRC,XRC1,DSSBO',
XRC='0E24,09D4,2,SESSION,XRC',
XRC='1380,2380,2,N,SESSION,N,N,--,00,1382,2382'
When the XRC configuration type is VOLSER, the configuration can be updated from
the GDPS/XRC interface. The updated member will be stored under the name that is
defined in the GEOXPARM DD card. The naming convention is that the last two characters
of the eight-character name must be '01'; the first six characters are user defined. When an
update is done, the current configuration will be saved with the last two characters changed
to a number between 02 and 09. When member 09 is reached, the next configuration
update wraps back to 02.
NOTE: This is the recommended method for defining XRC remote copy.
To make management of a large number of devices easier, GDPS can locate the volumes that
should be mirrored from generic definitions. This is achieved by:
Hardware definition.
Introduce a new definition parameter, SSID, that defines the primary and secondary control unit.
GDPS will query all devices and create a device pair between the ssids, a one to one mapping
(3rd device in primary ssid will be mapped to 3rd device in secondary ssid). This will be done
using the DS QD command with the SSID parameter. Both the primary and secondary volumes
have to be online to the SDM system(s) when the configuration command is executed in order to be
included in the configuration.
Flashcopy devices can also be mapped the same way by specifying the start address for the devices
in each SSID. The preferable way to do this is to generate the second half offline in HCD and
specify the secondary half start address as pfcstart and sfcstart.
Definition statements:
SSID=’pssid,sssid[,[pfcstart][,sfcstart]]’
Filtering definitions.
Introduce two new statements, INCLUDE and EXCLUDE, that define the volume serial numbers
of the volumes that should be included in the GDPS policy. INCLUDE and, if needed,
EXCLUDE cards shall precede each SSID statement.
INCLUDE pvolmask,svolmask[,errorlevel[,donotblock[,excl[,scs[,utl[,rprio]]]]]]
EXCLUDE volmask
• pvolmask is a mask containing A-Z, 0-9, * and %. Note: the asterisk can only be coded as the
last character.
• svolmask is a mask containing A-Z, 0-9, * and %. Note: the asterisk can only be coded as the
last character. If this parameter is set to *, GDPS will use any (current) volume serial number as the
secondary volume.
• Errorlevel, donotblock, excl, scs, utl and rprio will be used to set the attributes for the
volume pairs with this volume mask. If not defined, the defaults from the session statement will be
used.
Note: the wildcarding in pvolmask and svolmask follows the TSO wildcarding scheme, except that
* is only supported as the last character in the mask.
An example configuration is:
INCLUDE ABC*,BBC*
EXCLUDE DB000*
INCLUDE DB*,*,,Y
INCLUDE 00%200,10%200,,,,CC
SSID=0100,0200
All volumes beginning with ABC will be included in the configuration if the volume serial number
on the secondary device begins with BBC. All options for the pair will be taken from the session
statement.
All volumes starting with DB000 will not be included in the configuration.
All volumes starting with DB will be included in the configuration with the donotblock option.
Note: the volumes starting with DB000 will not be included, since they were rejected by a
previous statement.
All volumes having 00 as the first two characters and 200 as the last three characters will be included
and added to XRC with the control unit session-id CC.
Example 2, showing a coupled XRC environment with 2 sessions and 3 different SSIDs:
GEOPLEX XRC,
MSESSIONID='STHLM,XRC1',
SESSION='STHLM1A,DSSKN,DSS2N,SESSION,XRC,XRC1,DSSKO',
INCLUDE='SH130*,*',
EXCLUDE='SH131*'
EXCLUDE='SH1308'
EXCLUDE='SH1309'
SSID='0106,0206,,2308',
* ONLY SECONDARY FLASHCOPIES!
SESSION='STHLM2A,DSS2N,DSSKN,SESSION,XRC,XRC1,DSSKO',
INCLUDE='SH1280,*',
INCLUDE='SH1281,*',
INCLUDE='SH1282,*',
INCLUDE='SH1283,*',
SSID='0105,0205,1288,2288',
INCLUDE='SH1380,*',
INCLUDE='SH1381,*',
SSID='0107,0207,1382,2382',
Configuration process.
GDPS will detect the volumes when executing the configuration command and build the policy. A
requirement is that all primary volumes have to be online to the SDM system when the
configuration command is executed. Volumes that were in the active configuration will not be
included in the new configuration if they were offline to the SDM.
AUTODETECT=NO|YES
If AUTODETECT=YES, MONITOR2 will execute the configuration process, and if pairs are
detected that match the definitions they will be added to the policy. If the newly detected pairs
were not in sync, an alert will be issued stating 'all volume pairs are not in sync'.
If AUTODETECT=NO, configuration has to be done from the GDPS main panel. This is the default.
Adding and removing pairs from the 'OUT-SIDE' list from the GDPS panels will be rejected.
SECURITY Definitions.
GDPS/XRC will use SAF to verify if the user is authorized to execute the requested GDPS/XRC-function.
DSIDMN has to specify OPERSEC=SAFCHECK and CMDAUTH=SAF.
The NetView started task must have READ access to STGADMIN.ANT.XRC (verify this in your installation).
Within GDPS/XRC the following resources will be checked:
• CLASS=NETCMDS
• PROFILE=*.*.EXCMD.GEOXRC.xrc-command
• PROFILE=*.*.VPCXCFGM
Users who have the authority to execute XRC commands other than XQUERY must have READ access to
the profile *.*.EXCMD.GEOXRC.* (or to profiles for specific commands if desired).
Profile VPCXCFGM manages the access to the GDPS/XRC configuration.
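A minimal sketch of RACF definitions for these resources, assuming the NETCMDS class is used with generic profiles; the group name GEOADMIN is an illustrative assumption, and if NETCMDS is RACLISTed in your installation a SETROPTS RACLIST(NETCMDS) REFRESH would follow:
SETROPTS GENERIC(NETCMDS)
RDEFINE NETCMDS *.*.EXCMD.GEOXRC.* UACC(NONE)
RDEFINE NETCMDS *.*.VPCXCFGM UACC(NONE)
PERMIT *.*.EXCMD.GEOXRC.* CLASS(NETCMDS) ID(GEOADMIN) ACCESS(READ)
PERMIT *.*.VPCXCFGM CLASS(NETCMDS) ID(GEOADMIN) ACCESS(READ)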
ENVIRONMENT: GDPS/PPRC
GDPS is an implementation of Parallel Sysplex that follows all the existing sysplex rules and
procedures and has minimal impact on the configuration of an existing sysplex environment. The
changes required for full implementation of GDPS automation are:
GDPS automates the process of switching the use of coupling facilities by changing the active
CFRM policy, querying XCF to determine which structures exist (D XCF etc...), and using this
information to execute REBUILD commands to move the structures to the desired coupling
facility.
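A sketch of the kind of operator commands GDPS automates for such a move; the policy and structure names (CFRMPOL1, ISGLOCK) follow the examples used elsewhere in this chapter and are illustrative:
SETXCF START,POLICY,TYPE=CFRM,POLNAME=CFRMPOL1
D XCF,STRUCTURE,STATUS=ALLOCATED
SETXCF START,REBUILD,STRNAME=ISGLOCK,LOCATION=OTHER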
This process is limited to what could be accomplished if the user were to execute the commands
manually. GDPS does not add any new functionality to the sysplex environment. If a structure
cannot be rebuilt for some reason, GDPS will not be able to move it. For example, some
structures do not support rebuild. In addition, some structures cannot be moved under specific
circumstances (DB2 Lock and SCA structures cannot be moved if an instance of DB2 is not
running).
These limitations must be planned for by the user. A careful analysis of all structures must be
made and it must be determined if there are structures that the GDPS automation will be unable to
move. Alternate plans must be made and documented to move these structures.
Three CFRM policies are required. CFRMPOL0, CFRMPOL1, and CFRMPOL2, for example.
These are not required names. The user is free to choose whatever is appropriate for their
environment.
Using these names as examples; CFRMPOL0 would be coded to place the structures in the
coupling facilities in both sites, CFRMPOL1 would be coded to place the structures in the
coupling facilities in Site1, and CFRMPOL2 would be used to place the structures in the coupling
facilities in Site2.
When building these policies the user must define all existing coupling facilities in all policies for
the rebuild procedures to execute properly. If a new CFRM policy is activated that does not define
a coupling facility, XCF will not be able to move a structure from that coupling facility using
REBUILD. It also appears that the coupling facilities must be in the preference list for structures
for the rebuild to work properly, even though the policy’s purpose is to move a structure out of
that coupling facility.
Placing an M for modify as shown above under ‘Coupling Facilities’ will display the next panel:
Normal, Site1, and Site2 policy names are entered here as shown. The active policy is also
displayed. The active policy can be changed from this panel by placing the appropriate entry in the
“Selection ==>” field (1 for Site1, 2 for Site2, and 3 for Normal).
It is common practice to have a primary couple data set, an alternate couple data set and a spare
couple data set for each type of couple data set in use in the sysplex (i.e. CFRM, SFM, WLM,
ARM, etc...).
In the GDPS environment the number and placement of these data sets changes. GDPS will
automate the placement of couple data sets during both planned action scenarios and takeover
scenarios. This placement is accomplished by automating the execution of the required data set
switch and definition commands.
Four data sets of each type are required. It is expected that the primary couple data set of each
type will normally be in Site1, the alternate data set of each type will normally be in Site2, and
there will be a spare data set defined in each site.
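A sketch of the couple data set commands that GDPS automates to bring in a spare as the new alternate and then switch it to primary; CFRM is used as the example type, and the data set name and volume serial are illustrative assumptions:
SETXCF COUPLE,TYPE=CFRM,ACOUPLE=(SYS1.GDPS.CFRM.CDS03,CDSVL2)
SETXCF COUPLE,TYPE=CFRM,PSWITCH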
NOTE: In a GDPS environment the couple data sets should be on volumes that are not mirrored
by PPRC. The controlling system is a member in the sysplex and is using the couple data sets. If
there is a problem that requires a site switch it is necessary that the controlling system survives
which may not be possible if the couple data sets were on PPRC volumes.
From the Sysplex Resource Management panel (Option 7 from the GDPS main menu) shown
below:
Select modify for a data set type and the next panel is displayed.
Sysplex Failure Management has several time based parameters that need to be evaluated when
implementing GDPS. They are DEACTTIME, RESETTIME, and ISOLATETIME, as described
in “Setting Up a Sysplex”, GC28-1779.
DEACTTIME specifies the time interval in seconds after which the logical partition on which the
failing system resides is to be deactivated. For example, specifying DEACTTIME(10) causes an
LPAR DEACTIVATE of the failing LPAR 10 seconds after the system status update missing
condition is detected. This action is performed only if there is an active system in the sysplex
running in LPAR mode on the same PR/SM CPC as the named system.
You can specify a time interval for DEACTTIME from 0 to 86400 seconds. Specifying a value of
0 implies that you want the system to deactivate the logical partition as soon as the status update
missing condition is detected. You can set a default value for your installation by using the
NAME(*) DEACTTIME(your default value) option.
Notes:
1) If the failing system resumes its status update before the interval has expired, the
deactivate function is not performed.
2) If the OPNOTIFY interval expires before the time specified for DEACTTIME, the
IXC402D message will be issued before the deactivate is attempted.
RESETTIME specifies the time interval in seconds after which the failing system is to be reset.
You can specify a time interval for RESETTIME from 0 to 86400 seconds. Specifying
RESETTIME(0) causes the failing system to be reset as soon as XCF detects the system status
update missing condition. You can set a default value for your installation by using the NAME(*)
RESETTIME(your default value) option.
Notes:
1) If the failing system resumes its status update before the interval has expired, the system
reset function is not performed.
2) If the OPNOTIFY interval expires before the time specified for RESETTIME, the
IXC402D message is issued before the reset is attempted.
ISOLATETIME specifies the time interval in seconds after which the named system is to be
isolated using the fencing services through the coupling facility. For example, specifying
ISOLATETIME(10) causes the failing system to be fenced 10 seconds after the System Status
Update Missing condition is detected. This action is performed only if there is an active system in
the sysplex that shares connectivity to one or more coupling facilities with the named system.
You can specify a time interval for ISOLATETIME of from 0 to 86400 seconds. Specifying a
value of 0 implies that you want the system to isolate the named system as soon as the status
update missing condition is detected. You can set a default value for your installation by using the
NAME(*) ISOLATETIME (your default value) option.
Notes:
1) If the failing system resumes its status update before the interval has expired, the
ISOLATE function is not performed.
2) If the isolation fails, message IXC102A prompts the operator to reset the system
manually.
GDPS also has functions to RESET and DEACTIVATE logical partitions, and GDPS
recovery actions for a failed system will be delayed for the duration of ISOLATETIME.
There are so many possible variations on the selections made for specific users that no fixed
recommendation can be made. Each sysplex environment must be analyzed to determine if the
SFM policy is using RESETTIME and/or DEACTTIME and whether the functions are to be
eliminated from SFM and taken on by GDPS.
This may or may not be important. Each user must evaluate this option according to their sysplex
management criteria. However, during testing of GDPS, ISOLATETIME is usually perceived as a
delay of GDPS. When a system failure is induced to test GDPS recovery functions,
ISOLATETIME delays GDPS actions and can give the impression that GDPS is very slow in
reacting to failures. Keep this in mind during testing.
For a GDPS environment with two sites the following SFM considerations apply (a policy sketch
follows the list):
1 - SFM active
• You must specify CONNFAIL(NO). You cannot specify CONNFAIL(YES), because if you do
and there is a site failure, the systems in the other site may be stopped by SFM based on the SFM
policy.
• The controlling system should have an SFM weight that is larger than the sum of all other systems'
weights, because you never want SFM to remove the controlling system from the sysplex.
• The REBUILDPERCENT parameter must be set to allow rebuild.
• The ISGLOCK structure must be in Site1 to allow the Site1 systems to continue without a
rebuild. The controlling system (and other systems) in Site2 will rebuild because of their higher
SFM weight.
• If there is a site failure, GDPS will act on the IXC256A and IXC409D messages so that the
result is removal of the failing systems from the sysplex. The IXC409D message would cause the
systems in the failing site to be removed, which is correct.
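A minimal sketch of SFM policy statements reflecting these points, as coded in the SYSIN of the IXCMIAPU administrative data utility; the policy name, system names (KSYS as the controlling system, PROD1 and PROD2 as production systems) and weights are illustrative assumptions:
DATA TYPE(SFM)
DEFINE POLICY NAME(SFMPOL1) CONNFAIL(NO) REPLACE(YES)
  SYSTEM NAME(KSYS)  WEIGHT(100) ISOLATETIME(0)
  SYSTEM NAME(PROD1) WEIGHT(25)  ISOLATETIME(0)
  SYSTEM NAME(PROD2) WEIGHT(25)  ISOLATETIME(0)
Note that REBUILDPERCENT is specified on the STRUCTURE statements of the CFRM policy, not in the SFM policy.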
ENVIRONMENT: GDPS
Subsection 2.7.2.1
SYS1.PARMLIB(MPFLSTxx) Coexistence
Messages that can possibly trigger automation must be seen by NetView. This is accomplished
with the MVS Subsystem Interface. There are three major components of the MVS Subsystem
Interface: 1) the subsystem using the interface must be defined in SYS1.PARMLIB (IEFSSNxx),
2) the subsystem must have a mechanism to look at the message traffic on the subsystem interface,
and 3) the messages must be marked as eligible for automation.
For the NetView SSI to determine if a message is applicable, the AUTO flag must be turned on.
The AUTO flag is defined in SYS1.PARMLIB (MPFLSTxx). Note that not all automation
packages use the MVS Subsystem Interface to determine if a message can possibly trigger
automation.
There are messages that both SA for OS/390 and GDPS need for automation. Therefore, the
MPFLSTxx is impacted. There are several ways of coding the MPFLSTxx for the AUTO
parameter: 1) a specific message can be coded with the AUTO parameter value, 2) a value to be
used when a specific message entry does not have the AUTO parameter coded can be specified
with the .DEFAULT statement, and 3) a value to be used for messages that are not specified in
MPFLSTxx can be specified with the .NO_ENTRY statement.
If the .DEFAULT statement specifies AUTO(YES), all message entries in the MPFLSTxx that are
coded without the AUTO parm are eligible for automation by NetView. If there is no
.NO_ENTRY statement, messages without an entry in the MPFLSTxx are also eligible for
automation by NetView, because the default AUTO parm value for the .NO_ENTRY statement is
YES.
Note that for performance reasons it is recommended that AUTO(YES) not be specified
as the default for messages in the MPFLSTxx.
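A minimal sketch of MPFLSTxx entries that follow this recommendation; the specific message IDs to code with AUTO(YES) come from the NetView, SA for OS/390, and GDPS automation tables, and IEA491E is shown purely as an illustration:
.DEFAULT,AUTO(NO)
.NO_ENTRY,AUTO(NO)
IEA491E,AUTO(YES)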
FREEZE causes one suspend event per PPRC volume pair. Message IEA494I appears at the
operator console for each PPRC volume pair. This may generate a message flood to be displayed
at the consoles. A maximum of four messages per second will be rolled via “Wrap Mode”; this is
the recommended mode for AOM consoles. Thus, it may take several minutes until the last
IEA494I message is actually displayed.
The following message ID specification should be embedded in the MPFLSTxx PARMLIB member.
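A sketch of such an entry, assuming only the SUP and AUTO operands described below are needed:
IEA494I,SUP(YES),AUTO(YES)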
This provides relief on the number of WTO buffers required. Furthermore, the time AOM
has to wait until a potential request service message GEO090A of GDPS appears at the console
and is finally recognized by AOM via screen scraping will be significantly reduced.
AUTO(YES) specifies that the message IEA494I is further eligible for processing by the
automation system NetView. SUP(YES) causes the system not to display the message at the console;
it is still written to the SYSLOG.
SYS1.PARMLIB(COMMNDxx) Coexistence
One of the GDPS requirements in an environment that uses non-IBM automation as the primary
automation tool is that SA for OS/390 must start the primary automation product (e.g., AFOPER).
Therefore, the command to start the coexistent automation must be removed from the
COMMNDxx member in the SYS1.PARMLIB data set and the commands to start the GDPS
Netview and GDPS NetView SSI must be included in the COMMNDxx member in the
SYS1.PARMLIB data set.
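A sketch of the corresponding COMMNDxx entries, assuming the NetView SSI and NetView procedures use the sample names CNMPSSI and CNMPROC shipped with NetView; substitute the procedure names actually used for the GDPS NetView:
COM='S CNMPSSI'
COM='S CNMPROC'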
Both SA for OS/390 and GDPS have automation operator definitions that must be added to
NetView. These definitions are the same regardless of what kind of GDPS system it is (controlling
or non-controlling).
NetView / SA for OS/390 will be executing even though its involvement with non-GDPS
automation will be limited to starting the other automation subsystem.
The SA for OS/390 automation depends on the AUTO1 automation operator executing the
LOGPROF2 clist specified in the DSIPROFC logon profile. SA for OS/390 requires that the
standard NetView version of LOGPROF2 execute.
The SA for OS/390 automation depends on the AUTO2 automation operator executing the
LOGPROF3 clist specified in the DSIPROFD logon profile. For each NetView system, this needs
to be reverified prior to SA for OS/390 GDPS implementation.
In the NetView environment, messages are assigned to automation operators and when
automation is triggered from NetView’s automation table, it will execute under the automation
operator to which the message is assigned. SA for OS/390 automation will setup these message
assignments for any automation operator defined in the SA for OS/390 policy (the required SA for
OS/390 operators as well as any others defined).
As part of the SA for OS/390 installation, there are commands and command lists that must be
defined in NetView’s DSICMD definitions. An include statement for AOFCMD must be added to
the DSICMD definitions. GDPS has no commands or command lists that must be defined so this
is not applicable for GDPS.
As part of the SA for OS/390 and GDPS installations, there are task definitions and initialization
specifications that must be made in NetView’s DSIDMN definitions.
Includes for SA for OS/390’s AOFDMN and GDPS’s DSIDMNGP must be added to NetView’s
DSIDMN definitions.
For SA for OS/390 to initialize correctly, the DSILOG task must not be automatically started via
the DSILOG task statement. During SA for OS/390 initialization, the DSILOG task is started
with a START command in one of the early initialization clists. The resulting DSI240I message is
trapped in the first automation table loaded, to trigger additional SA for OS/390 initialization
activity. Therefore, in NetView’s DSIDMNB definitions, the DSILOG task must be defined with
INIT=N.
Both SA for OS/390 and GDPS require that NetView’s save/restore database (DSISVRT) exists.
Therefore, the DSISVRT task in NetView’s DSIDMNB definitions must be implemented. Plus the
DSISVRT task must be started automatically with INIT=Y to avoid timing problems during SA
for OS/390’s initialization.
For NetView to function properly in a Sysplex environment, the SSIR task must have a unique
name for each NetView. The SSIR task definition in NetView’s DSIDMNB definitions is where
the SSIR task is named.
SA for OS/390 initialization depends upon NetView’s CNME1034 clist executing first in
NetView’s initialization activity. Plus, SA for OS/390’s AOFMSG00 automation table must be
passed as a parameter to the CNME1034 clist and as a result, CNME1034 will load the
AOFMSG00 automation table. AOFMSG00 contains critical entries for the SA for OS/390
initialization activity. During SA for OS/390’s initialization activity, another automation table will
be loaded. This automation table name is specified in SA for OS/390’s policy. NetView’s
CNME1035 clist must be customized to 1) change the start of the SSIR task to match the
tskname in the DSIDMN definition and 2) change the console ID attachment of the AUTO2
automation operator to a valid console name.
Both SA for OS/390 and GDPS have automation table segments that must be added to the
existing automation table before any other entries. SA for OS/390 initialization requires a separate
small automation table that is used during the very beginning of the SA for OS/390 initialization
process. The larger automation table with the bulk of the SA for OS/390 statements, GDPS
statements, and any other statements, is loaded by the SA for OS/390 initialization process. The
default names for these two required SA for OS/390 automation tables are AOFMSG00 and
AOFMSG01.
SA for OS/390’s AOFMSG01 must be modified to include any other non-SA for OS/390
automation table segments or statements. SA for OS/390’s AOFMSG01 has an ALWAYS
CONTINUE(Y) statement at the beginning of AOFMSG01. This sets the CONTINUE default to
be Y for the remainder of the automation table statements. With this default, NetView will
continue searching the automation table if a statement causes a message match. For performance
purposes, if the search should not continue, then CONTINUE(N) must be specified on a specific
automation table statement.
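A sketch of an installation statement added to AOFMSG01 under this default; the message ID, command, and autotask names are illustrative assumptions:
* ALWAYS CONTINUE(Y) is set at the top of AOFMSG01 by SA for OS/390
IF MSGID = 'ABC123I' THEN
   EXEC(CMD('MYCLIST') ROUTE(ONE AUTO1))
   CONTINUE(N);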
NetView, SA for OS/390, and GDPS require the use of MVS extended consoles. Extended
console names must be unique within a Sysplex. NetView, SA for OS/390, and GDPS all provide
standard automation operators that must be included in their setup. But for these operators to
issue MVS commands, they must obtain extended consoles with unique names. By default,
NetView assigns an extended console name the same as the automation operator name. This
default will not work in a Sysplex environment. Thus SA for OS/390 provides a mechanism for obtaining extended consoles with unique names.
Second, there is an initialization clist executed by SA for OS/390, AOFEXDEF, that specifies the
naming mask for the extended console name.
Third, in the logon profiles for the operators provided by SA for OS/390 and GDPS, a SA for
OS/390 clist (AOCGETCN) is executed to obtain an extended console when the automation
operators are started. This clist gets the console mask defined in AOFEXDEF and gets an
extended console with a unique name.
The same consideration exists for people operators logging onto NetView. If MVS commands
will be issued from NetView, an extended console is obtained if one doesn’t exist and the default
name is that of the operator. If that operator is logged onto TSO and has executed commands via
SDSF, then there will be a conflict. On NetView, the MVS command will fail and message
DWO338I is issued, indicating that a unique extended console name could not be obtained. SA for OS/390 has
an entry in its supplied automation table that catches this message and issues the AOCGETCN
clist for that operator and obtains an extended console with a unique name at that point. This
AOCGETCN command could be executed in a logon clist for the people operators to avoid this
one-time failure situation.
For each NetView system, this needs to be reviewed prior to SA for OS/390 GDPS
implementation.
SYS1.PARMLIB(CONSOLxx) Coexistence
The console setup for the AO Manager console, in the CONSOLxx member of SYS1.PARMLIB,
is critical.
The message traffic on the AO Manager console needs to be kept to a minimum. Its main function
is to look for messages from the GDPS environment. These messages are route code 1. Therefore
the AO Manager console cannot be a master console because a master console is required to have
at a minimum route codes 1 and 2. The ROUTCODE parameter in the CONSOLxx member
should be specified as ROUTCODE(1) and the AUTH parameter in the CONSOLxx member
should be specified as ALL. AUTH(ALL) specifies that information, system control, I/O control,
and console control commands may be entered from this console. Commands issued by AO
Manager automation from the AO Manager console do not require master console authority.
Commands that require master console authority are:
+-------------------+----------------------------------+---------------------------------+
¦ MASTER ¦ CONFIG ¦ SWITCH CN ¦
¦ (master console ¦ CONTROL ¦ TRACE (with MT) ¦
¦ control) ¦ DUMP ¦ VARY {CN{...}[,AUTH=...]} ¦
¦ ¦ FORCE ¦ {CONSOLE[,AUTH=...]} ¦
¦ ¦ IOACTION ¦ {GRS } ¦
¦ ¦ QUIESCE ¦ {HARDCPY } ¦
¦ ¦ RESET CN ¦ {MSTCONS } ¦
If the AO Manager automation for duplicate volser messages during IPL will be used, then the
AO Manager console address must be specified as the NIP console in the MVS hardware
configuration definitions (HCD).
The AO Manager console must not be in the alternate console list because it should not be eligible
to become a master console during a console switch situation.
The recommendation for the DEL parameter is DEL(W). With roll mode, DEL(R), messages roll
on the screen which means that the same message can occur on different lines on the screen and
there is a risk that AOM will respond more than once for the same message.
Even with ROUTCODE(1) specified, there can be a good deal of message traffic on the AO
Manager console. With the DEL parameter for the AO Manager console set to DEL(W) for wrap
mode, AO Manager message traffic may back up, which could delay AO Manager's replies to
messages from GDPS or perhaps prevent it from replying at all before GDPS times out.
With wrap mode, DEL(W), you know for sure that messages occur only once on the console
which is the reason for DEL(W) being the recommended setting. Changing to DEL(R) for
performance reasons is an option, but AOM may process a message more than once.
The MFORM parameter for the AO Manager console should be specified as MFORM(S,T). The
AO Manager automation looks for messages with this specific format and does not recognize any
other MFORM message format.
The AO Manager console on the controlling and non-controlling systems do not need to see
messages from each other. Therefore, the MSCOPE parameter for the AO Manager console on
any system can be specified as MSCOPE(*).
Values along these lines are in use on a functioning GDPS test bed.
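A sketch of a CONSOLxx CONSOLE statement assembled from the recommendations above; the console name and device number are illustrative assumptions:
CONSOLE DEVNUM(09E0) NAME(AOMCON)
        AUTH(ALL) ROUTCODE(1) DEL(W)
        MFORM(S,T) MSCOPE(*)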
ENVIRONMENT: All.
This section details the NetView customization that is required for GDPS. It is assumed that
NetView has been installed and no customization has been done. The details presented here are
the minimum required for GDPS setup and are not meant to replace the customization described
in the NetView manuals.
Allocate a DOMAINID.DSIPARM data set and a DOMAINID.CLISTS data set that will contain
members that must be customized.
All GDPS updates are included in the #HLQ.GDPS.SGDPPARM data set, which contains the task
statements for GDPS. Implement the following includes depending on the GDPS environment.
NetView requires several support data sets, such as the NetView log, save-restore database, trace
data sets, etc. There is a group of NetView supplied jobs that can be used for these data set
allocations and they can be found in the NETVIEW.CNMSAMP data set, members CNMSJBUP,
CNMSID01, CNMSID02, CNMSID03, CNMSI101, CNMSI201, CNMSI301, CNMSI401,
CNMSI501, CNMSI601, and CNMSJ004. Please refer to the NetView manuals for creation of
these support data sets.
NetView requires VTAM definitions. A sample member containing these definitions can be found
in NETVIEW.CNMSAMP member CNMS0013. Refer to the NetView manuals for specific
information. The VTAM definitions must be activated prior to NetView startup.
If the NetView STATMON application is started, then the STATMON preprocessor must be
executed to create member DSINDEF in a data set on NetView's DSIPARM concatenation.
Operator AUTO2 and all NetView users who will use the CONFIG (C) command in GDPS should
be authorized to issue the command: 'START TASK=xxxxxx,MOD=PPRCAPI'.
MOD KEYCLASS 1
DSIZDST VALCLASS 1
PPRCAPI VALCLASS 1
=OTHER VALCLASS 1
NetView Performance
GDPS runs in a NetView environment. As such, there are NetView tuning issues that may arise
depending on how extensively NetView is being used. GDPS may increase that usage to the
extent that NetView tuning may need to be revisited.
One of these NetView tuning issues has to do with changing the NetView Constants
Module (DSICTMOD). GDPS uses NetView variables to save information. If there are many
variables being used, it may be necessary to change the NetView Constants Module.
To determine the number of variables being used, you can enter the QRYGLOBL command on
the NetView command line. After setting up and starting GDPS in full production, use the QRYGLOBL
command to find the number of variables used, and refer to the NetView manuals for information on
how to change the NetView Constants Module and on setting values in DSICTMOD.
This section details the SA/390 customization that is required for GDPS. It is assumed that
SA/390 has been installed and no customization has been done. The details presented here are the
minimum required for GDPS setup and are not meant to replace the customization described in
the SA/390 manuals.
Update the DSIDMN member in the DOMAINID.DSIPARM data set to include AOFDMN.
AOFDMN is found in the SA/390 SINGNPRM data set and contains the task statements for
SA/390. This include can be placed at the end of the DSIDMN member, prior to the END
statement.
%INCLUDE AOFDMN
Update the DSIOPF member in the DOMAINID.DSIPARM data set to include AOFOPF.
AOFOPF is found in the SA/390 SINGNPRM data set and contains the automation operator
statements for SA/390. This include can be placed at the end of the DSIOPF member, prior to the
END statement.
%INCLUDE AOFOPF
Copy the DSICMD member from the NETVIEW.DSIPARM data set into the
DOMAINID.DSIPARM data set. Update this member to include AOFCMD. AOFCMD is found
in the SA/390.SINGNPRM data set and contains the command statements for SA/390. This
include can be placed at the end of the DSICMD member, prior to the END statement.
%INCLUDE AOFCMD
Copy the DSIDMNB member from the NETVIEW.DSIPARM data set into the
DOMAINID.DSIPARM data set. Update this member as follows:
• Change the DSISVRT task statement to INIT=Y. This is the save/restore database
task and should be started as soon as possible during NetView - SA/390 initialization.
• Change the CNMCSSIR task name to be unique for each system in the sysplex. This is
required in a sysplex environment when using the extended MCS console interface. In
the following partial example, xxxx are the characters which make the task name
unique.
TASK MOD=CNMCSSIR,TSKID=xxxxSSIR,...
• Change the LUC, VMT, and BRW task name prefixes to match the NetView domain
ID. In the following partial examples, xxxxx are the characters that must match the
NetView domainid (which is found on the NCCFID DOMAINID statement in the
DSIDMNK member).
TASK MOD=CNMTARCA,TSKID=xxxxxVMT,...
TASK MOD=CNMTGBRW,TSKID=xxxxxBRW,...
TASK MOD=DSIZDST,TSKID=xxxxxLUC,...
• Note that the DSILOG task statement must be INIT=N. This is very important for
SA/390. The clists that execute during NetView initialization start the DSILOG task
with a NetView start command. SA/390 reacts to the resulting command response
message to trigger its own initialization processes (see the partial sketch after this list).
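A partial sketch of the resulting DSIDMNB task statements; module names and other operands are elided with '...' as in the examples above and must be taken from the shipped sample member:
TASK MOD=...,TSKID=DSISVRT,...,INIT=Y
TASK MOD=CNMCSSIR,TSKID=xxxxSSIR,...,INIT=Y
TASK MOD=...,TSKID=DSILOG,...,INIT=N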
Copy the DSIDMNK member from the NETVIEW.DSIPARM data set into the
DOMAINID.DSIPARM data set. Change the domainid on the NCCFID statement to reflect the
correct NetView name and change the initial clist on the NCCFIC statement to be CNME1034
AOFMSG00. The CNME1034 is the standard NetView initialization clist which is used by
SA/390. AOFMSG00 is a parameter for CNME1034 and is the initial automation table to be
loaded.
NCCFID DOMAINID=ddddd,...
NCCFIC IC=CNME1034 AOFMSG00
Copy the AOFMSGSY member from the SA/390.SINGNPRM data set to the
DOMAINID.DSIPARM data set. Update this member with the correct NetView domainid,
OS390 system name, VTAM procedure name, and CNMCSSIR task name. AOFMSGSY
contains automation table synonym entries that SA/390 uses in other SA/390 automation table
members.
GDPS is delivered with two sets of SDF panels, one for GDPS/PPRC and the other for
GDPS/XRC. The member names are GEOSDFGn for PPRC and GEOSDFXn for XRC.
Create a new member named AOFPxxxx (where xxxx is the OS390 system name) in the
DOMAINID.DSIPARM data set and copy the contents of the AOFPSYS1 member from the
SA/390.SINGNPRM data set. Change all the SYS1 system name references from the
AOFPSYS1 member to the correct OS390 system name and add the following two lines prior to
the PFKey definitions (again substitute xxxx with the correct OS390 system name). This puts the
GDPS component on the main SDF panel and is the hook into the GDPS GEOSDFG1 (for
GDPS/PPRC) or GEOSDFX1 (for GDPS/XRC) panel, which is the GDPS main SDF panel.
1. SF(xxxx.GEOPLEX,21,04,22,N,,GEOSDFx1)
2. ST(GDPS)
Create a new member named AOFTxxxx (where xxxx is the OS390 system name) in the
DOMAINID.DSIPARM data set and copy the contents of the AOFTSYS1 member from the
SA/390.SINGNPRM data set. Change all the SYS1 system name references from the AOFTSYS1
member to the correct OS390 system name.
%INCLUDE(AOFTxxxx)
%INCLUDE(GEOTREE)
Copy the GEOSDFG1/GEOSDFX1 member from the GDPS.SGDPPARM data set to the
DOMAINID.DSIPARM data set. Update all the SWETSMVS system name references to the
correct OS390 system name. GEOSDFG1/GEOSDFX1 is the GDPS main SDF panel.
Create a new member named AOFPNLS in the DOMAINID.DSIPARM data set and copy the
contents of the AOFPNLS member from the SA/390.SINGNPRM data set. Change the
AOFPNLS member so that it includes just two INCLUDE members, one for the AOFPxxxx
member just created and the updated GEOSDFG1/GEOSDFX1 member.
%INCLUDE(AOFPxxxx)
%INCLUDE(GEOSDFG1) for GDPS/PPRC
%INCLUDE(GEOSDFX1) for GDPS/XRC
SA/390 requires a status file to be created which is used by NetView - SA for OS/390 processing.
Create this status file using the sample allocation information found in the SA/390.SINGSAMP
data set member INGESYSA.
Messages that are automated by SA/390 must be sent over the SSI to NetView. This is handled in
the MVS MPFLST member in the SYS1.PARMLIB data set. There are several methods of
specifying AUTO(YES) for messages. Refer to the MVS System Initialization and Tuning
Reference manual for specific information. Note that no sample MPFLST is provided by SA/390.
Defaults can be specified in MPFLST for AUTO(YES) on all messages. If the defaults are
AUTO(NO), then there must be specific message entries in the MPFLST with AUTO(YES). The
NetView, SA/390, and GDPS automation tables would be the source for the specific messages for
this situation.
GDPS policy must be created using the SA/390 ISPF dialogs. The following steps summarize the
activities involved in defining the SA/390 environment prior to the actual addition of GDPS -
specific policy. During these steps, be sure to use PF1=HELP to get specific information on the
policy fields and their possible values.
On the OS/390 Entry Type Selection that displays after selecting the appropriate policy
database, enter SYS to create a System, then enter NEW on the following SA OS/390
Entry Selection panel. Create specific policy for: AUTOMATION CONSOLE and
AUTOMATION SETUP.
Connect the following policy to the system being created. These policies were created
from the model policy database.
On the OS/390 Entry Type Selection that displays after selecting the appropriate policy
database, enter GRP to create a Sysplex group. Be sure to specify SYSPLEX as the group
type. This is where information about sysplex timers and couple data sets is specified.
Connect the system(s) created to this sysplex group.
If needed, applications are created with the APL selection. Applications can be connected
to application groups which can be connected to systems. Application groups are created
with the APG selection.
First, allocate a data set that will contain the automation control file. It should be a
standard partitioned data set with RECFM=FB and LRECL=80. This data set will be
placed on the DSIPARM DD statement in the NetView procedure.
On the first SA/390 panel, SA OS/390 PolicyDB Selection, enter BUILD on the
appropriate policy database. On the following panel, enter 4=ACF to build automation
control file. On the next panel, specify the data set name containing the automation control
file on the Build Output Data set line. For a small environment, Build Mode can be ONLINE.
Note that the policy should not be built into the policy database; the first time this
panel is used, the Build Output Data set will be pre-filled with the policy database
data set name, so be sure to change it.
During an online build, messages are displayed very quickly. These messages are also
stored in the automation control file data set, member $BLDRPT. Check this member for
errors. The AOFACFMP member in this data set maps the systems to the actual
automation control file name.
Create the NetView procedure for NetView, SA/390, and GDPS. NetView provides sample
procedures for NetView and the companion NetView SSI. They are found in the
NETVIEW.CNMSAMP data set.
AOFEXDEF and AOFRGCON are SA/390 modules that GDPS does not deliver. They are used
to create unique console names. Best case is that you won't have any problems with console
names. If you do have console name problems, you will need to create the AOFEXDEF module to
change the variable AOFCNMASK. This is described in "SA/390 Customization".
ENVIRONMENT: GDPS/PPRC
Definitions have to be made using SA/390 Customization Dialog. Additions and changes are made
in the following sections:
• System (4)
• Applications (6)
• OS/390 Components (33)
• Auto Operators (37)
• Network (39)
• Status Details (42)
• User E-T Pairs (99)
These changes can be made in two ways:
• Typing in all values
• Typing some definitions and using a sample configuration as a model to fill in the 'blanks'. The
sample configuration contains definitions for:
• Auto Operators (37)
• Status Details (42)
• User E-T Pairs (99)
The following steps must be done for all systems that will be part of this SYSPLEX.
Note: Make sure that the first system is always a Site1 system.
• Fill in the following information and set the Environment Exit to VPCEINIT.
(Panel example: system name SWETSMVS, DSITBLXX, Environment Exit VPCEINIT.)
If another automation product is used, define it here. The following entries must be completed:
• Application Information, including job name
If SA for OS/390 is the primary automation product, all tasks that run on the image(s) must be
defined in the Policy. Refer to the “System Automation for OS/390 Customization” manual for
instructions on how to accomplish this.
Subsection 2.7.5.3 OS/390 Components (33)
Select OS/390 Messages and add the commands to be executed when the system:
• grows to a 'production' system (SMSFULL1).
• shrinks to a 'standby' system (SMSMINI1).
SMSMINI1 is intended to be used when one has a system that one wants to 'shrink' down to
a 'hot standby' system. A ‘hot standby’ system is one which has no allocations for PPRC
managed DASD. All necessary routines to make this happen can be added here.
Note that the MODIFY CATALOG,RESTART needs to be executed to release all allocated user catalogs.
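A sketch of the kind of commands that might be added here; the device range is an illustrative assumption, and within NetView these are issued as MVS commands:
F CATALOG,RESTART
V 0E40-0E7F,OFFLINE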
1) From the SA/390 Customization Dialog panel, select option 33 (OS/390 Components).
3) Select “Messages”.
• For the SMSFULL1 message, specify the commands to be executed.
Add the following automation operators to the SA/390 automation operators defined for SA OS/390. The
names must be GEOOPER, GEOOPER1, GEOOPER2 and GEOOPER3. In addition to these, if
parallel execution of DASD switching and CRECOVER and/or parallel execution of FlashCopy is
desired, add as many GEOBATnn operators as there are SSID pairs in the PPRC configuration,
or the number of primary SSIDs in the XRC configuration. The names must be GEOBATnn,
where nn is a number from 1 to 99, and they must be consecutive. When the number of SSID pairs
reaches 100, name the additional automated operators/tasks with numbers 100 and above:
AUTBAxxx/GEOBAxxx.
XRC NOTE: In GDPS/XRC the number of GEOOPER operators should be the same as the
number of primary SSID’s.
NOTE: Failing to define sufficient GEOBATnn operators can seriously impair GDPS
performance during configuration and recovery.
• Enter the operator names, as demonstrated below. The names shown are the required names.
The GEOBATnn operators are optional but recommended for performance reasons.
NOTE: If you don't have sufficient operators defined, parallel processing of RECOVER or SWITCH
DELPAIR commands will not occur.
Note: Make the above selection for ALL systems in the GDPS
1. From the “SA/390 Customization Dialog” main panel, select option 42 (Status Details).
2. Enter ‘new’ to add a new entry with the name of “GEOPLEX” and provide an appropriate
description.
Specify the Priority and Color of different types of alerts. This information is used when the
SDF-screen is updated.
• Select “Where Used” and ensure all GDPS images are selected.
User E-T pairs is used to define the GDPS configuration options for DOMAINS, OPTIONS,
TAKEOVER, CONTROL and BATCH. From GDPS 2.4 and later the GDPS DASD
configuration is defined in a GEOPARM member as described in section 2.6.
1. From the SA/390 Customization Dialog main panel, select option 99 (User E-T Pairs).
2. Enter ‘new’ on the command line to add a new entry with the name of ”GEOPLEX“ and an
appropriate description
After you create the GEOPLEX UET policy object, which is partially illustrated above, you then
create the GEOPLEX DOMAIN, GEOPLEX OPTIONS, TAKEOVER, etc... entry-types.
For each of the entry types below (DOMAIN and OPTIONS), type 'new' on the command line.
On the next panel, in the “Entry” field, enter ‘GEOPLEX’, and in the “Type” field enter a type
shown below, i.e. ‘DOMAIN’.
4. TAKEOVER definitions are used to define actions to be executed in case of a failure in the
GDPS configuration. Add a new “Entry” of TAKEOVER with a “Type” name of up to 14
characters long for each of the takeover actions that GDPS is to handle. For example:
• ALLSITE1
• ALLSITE2
• SYSSITE1
• SYSSITE2
• SYSsysname (one entry for each of the systems that will be in the GDPS definitions)
• DASDSITE1
• DASDSITE2
• xxxxxxxxx (optionally, create one or more user-defined TAKEOVER actions)
Note: Do not start the name of user TAKEOVER actions with SYS, ALL, DASD or
PROC.
Note: Do not start the name of user CONTROL actions with SYS, ALL, DASD, or
PROC.
6. BATCH definitions are used to define planned actions to be executed by GDPS and initiated
from outside GDPS. For each of the batch planned actions desired, add a new “Entry” of
BATCH with a “Type” name of up to 15 characters.
Define all GDPS systems in the Site-table using GEOPLEX DOMAINS. One line for every
system, that is one SITE1 keyword for each system in site 1 and one SITE2 keyword for each
system in site 2. Always make sure that the first entry is SITE1.
NOTE: It is very important that GEOPLEX DOMAINS and the MASTER list be identical in all GDPS
systems. When these parameters have to be changed, you should be very careful and plan the change.
Ideally all systems should be changed and restarted (NetView restart) at the same time, with the
controlling system (first system in the master list) starting first. If you have to run with differences, results
are unpredictable, and the GDPS Standard actions panel may show inconsistencies.
Even though domainids were coded, system names were shown. When implementing GDPS or
adding a system to GDPS the panel will show ???? for system name until GDPS has been
initialized in the new system. GDPS initialization will fill in the system name and broadcast it to
the other GDPSs.
The Standard Actions panel and all GDPS planned and unplanned action scripts refer to system
names and not domainids and it makes more sense to define system names in GEOPLEX
DOMAINS.
The following options are available for defining either system names or domain names in
GEOPLEX DOMAINS:
GEOPLEX DOMAINS
SITE1='(S=ssss/pppp/aaaa/x/yz),text'
SITE2='(S=ssss/pppp/aaaa/x/YN),text'
Or
SITE1='(dmn/pppp/aaaa/x/yz),text'
SITE2='(dmn/pppp/aaaa/x/YN),text'
“S=ssss” is used to define a system name, and “dmn” is used to define domain name. GDPS will
continue to support domainid specification until GDPS V2R7, when the domainid specification
will be removed. All systems should be defined the same way, that is all systems defined by
domainid or all systems defined by system names. If systems are defined by system name in the
domain table, the MASTER list must also define system names.
GDPS ‘Display Options’ will continue to show domainids in the master list.
NOTE: The value for y must be coded as Y for all GDPS systems. By coding any other character
for y, it is possible to define LPARs that can be managed by GDPS Standard Actions. One
example where this can be used is for recovery systems/LPARs in a GDPS/XRC environment.
The value for z can be Y or N, with Y indicating another automation product is used. The default for
yz is YN.
Note: define each system in Site n in a separate SITEn keyword/data pair, that is if there
are three systems in Site1 then three SITE1 definitions exist. (Always define SITE1
systems before SITE2.) In the definition pppp is the AO Manager Object ID of
primary/normal LPAR and aaaa is the AO Manager Object ID of the alternate/abnormal
LPAR. One should specify the AO Manager object id's for the LPARs, so that when
IPLTYPE for a system is NORMAL the system will IPL in its primary/normal LPAR
and when IPLTYPE is ABNORMAL the system will IPL in its alternate/abnormal
LPAR.
There is an additional benefit to coding site statements using system names. Prior to this feature
being an option, it was not possible to code a definition for a system that did not exist in the GDPS
environment. It was not possible to have Standard Action panel entries for recovery site systems
that were not part of the GDPS environment (that is, did not have GDPS code executing). Consequently it was
not possible to manage LPAR resources at the recovery site through the Standard Action panels.
This was a disadvantage in the GDPS/XRC environment.
With the introduction of system name site table entries, site statements can be coded for resources
that are not in the GDPS environment, and they can be managed through the standard action
panels of GDPS.
The following matrix defines which OPTIONS can be coded for PPRC and XRC:
PPRC? XRC?
DASDVARY Yes No
MASTER Yes Yes
CONTROLLINGSYSTEMS Yes NA
MONITOR1 Yes Yes
MONITOR2 Yes Yes
MONITOR3 Yes Yes
FREEZE Yes No
FRTIMEOUT Yes No
RIPLOPT Yes Yes
AOMANAGER Yes Yes
NETOWNER NA NA
ATCCONSUFFIX NA NA
PROCOPTS Yes Yes
STOPAPPL Yes Yes
OVERRIDE Yes Yes
CFMONITORING Yes Yes
REPEATTAKE Yes Yes
THRESHOLD No Yes
AUTODETECT NA Yes
XRCSTART No Yes
XRCUPDATE No Yes
NOTE: MASTER and CONTROLLINGSYSTEMS must be coded initially; the others have defaults.
• From the User E-T Pairs panel, select the OPTIONS entry.
• For each of the items below, create an entry with the specified keyword (underlined) and
appropriate values. Make sure to place the values in quotes if they contain any punctuation
(spaces, commas, periods, etc.).
AOMCONNECT='sysname,sysname,...'
which defines the systems that have AO Manager connectivity when they run in their normal
partition (IPLTYPE=NORMAL). The default is AOMCONNECT=ALL, meaning all systems
have AOM connection. NOTE: The default value (ALL) cannot be coded, AOMCONNECT
OPTION is left undefined to use the default.
At least one system in each site must have AOM connectivity.
The DASDVARY Keyword defines whether or not GDPS will be responsible for Vary Online /
OFFLINE at IPL time. When the setting is DISABLED, then GDPS will not be responsible for the
DASD online condition at IPL time. When the setting is ENABLED, then GDPS will be
responsible for varying its controlled DASD online at IPL time. The default is ENABLED, but
this is not the recommended setting for environments with over 500 DASD volumes under GDPS
control, due to increased IPL time (see Appendix E for further information). The current setting
can be viewed by selecting OPTION 9 (View Definitions) from the GDPS main menu.
NOTE: The DASDVARY option only applies to production systems. In Controlling systems the
primary DASD will always be varied online.
The MASTER keyword defines the NetView domains (or the system names, when system names are
defined in GEOPLEX DOMAINS) in descending order. It
defines which system is to be the MASTER system. The first system defined in the list should be
the Controlling system and normally the Controlling system will be the current MASTER. If the
Controlling system isn't running, the next system in the list will take over and become the current
MASTER. The first active system in the list will always be the current MASTER. Whenever the
Controlling system is restarted it will immediately be the MASTER again. All systems must be
defined in the MASTER list.
The CONTROLLINGSYSTEMS keyword defines the number of controlling systems. The default
value is 1.
The MONITOR1 Keyword defines the monitor interval for SYSPLEX monitoring and AO
Manager heartbeat function. The default is 00:05:00. (every 5 minutes)
The MONITOR2 Keyword defines the monitor interval for PPRC device pair monitoring. The
format of the parameter is:
1) hh:mm:ss or (EVERY and ALL are defaults in this format)
2) xxx,hh:mm:ss,yyyyyy where xxx can be AT or EVERY and yyyyyy can be ALL or
MASTER (when MASTER is specified for MONITOR2, it actually means ‘controlling
system’).
Examples:
• AT,00:03:45,ALL will schedule the monitoring at 03:45 on all systems
• EVERY,04:00:00,MASTER will schedule the monitoring on the current master system
every 4th hour
• 02:00:00 will schedule the monitoring every two hours on all systems
In the first format the time is an interval and the monitor will run in all systems with the specified
interval. In the second format, EVERY indicates that the time is an interval and AT indicates that
the monitor will run once a day at the specified time. The default (if MONITOR2 is omitted) is
AT,01:00:00,MASTER which means it will run in the controlling system every night at one
o’clock. It is recommended that monitor2 runs once per day in the controlling system.
NOTE: Monitor2 will also run at Netview/GDPS initialization in all or MASTER system/s, based
on the ALL/MASTER specification in this keyword.
The times in all monitors can be defined as hh:mm, that is seconds are not required.
This next global policy setting requires management attention and decision because it is driven by
business requirements. GDPS offers several data recovery policy options which address how
much data loss can be tolerated by the business in the event of a disaster. These policy options
relate to events which prohibit updates from being propagated to the secondary site. They are:
• Freeze and Go — GDPS will freeze the secondary copy of data when remote copy
processing suspends and the critical workload will continue to execute, making updates
to the primary copy of data. However, these updates will not be on the secondary
DASD if there is a subsequent Site1 failure in which the primary copy of data is
damaged or destroyed. This is the recommended option for those enterprises that can
tolerate limited data loss or have established processes to recreate the data.
• Freeze and Stop — GDPS will freeze the secondary copy of data when remote copy
processing suspends and will quiesce the production systems, resulting in the critical
workload being stopped and thereby preventing any data loss. This option may cause the
production systems executing the critical workload to be quiesced for transient events
that interrupt PPRC processing, thereby adversely impacting application availability.
• Freeze Conditional — GDPS inspects the reason for the suspension of remote copy
processing using the IEA491E message. If this message indicates a secondary DASD
problem has occurred, GDPS will issue a CGROUP RUN and let the applications
continue to process. If the suspension is due to a primary DASD failure, GDPS will do
a stop action by issuing MVS QUIESCE commands and a system reset to all production
systems. If there is no IEA491E message within the FRTIMEOUT time period, GDPS
will do the stop action. It is essential that the FRTIMEOUT value is less (with some
margin) than the freeze time-out in the storage control units. The default values are 1
minute for FRTIMEOUT and 2 minutes for the storage control units.
The FREEZE Keyword defines what action to take if there is a DASD error. Values can be GO,
STOP or COND.
• GO means that CGROUP FREEZE is invoked, followed by a CGROUP RUN.
• STOP means that CGROUP FREEZE will be invoked, followed by
MVS QUIESCE and a system RESET of the images.
• COND means that CGROUP FREEZE will be invoked, followed by
MVS QUIESCE and a system RESET if it is an error on the primary DASD.
For secondary errors the FREEZE will be followed by a CGROUP RUN.
The FRTIMEOUT Keyword defines the number of minutes and seconds to wait for the IEA491E
message to arrive. If there is no IEA491E message within the FRTIMEOUT time period, GDPS
will do the stop action (MVS QUIESCE and system reset to all production systems). It is
essential that the FRTIMEOUT value is less (with some margin) than the freeze time-out in the
storage control units, as noted above.
Note: This is only used when FREEZE=COND is specified and when no IEA491E message
is issued.
The RIPLOPT Keyword defines how the Standard Action Re-IPL is customized. Re-IPL is a
combination of two Standard Actions, STOP and one of the actions to start a system,
ACTIVATE, IPL, or LOAD. RIPLOPT is used to specify the second step of Re-IPL to be
ACTIVATE, IPL, or LOAD. The default is LOAD.
See the description of Standard Actions for the requirements of the alternatives. In summary,
ACTIVATE requires an image profile with “Load at Activation” specified, IPL requires a load
profile, and LOAD requires a load address (and possibly loadparm).
The AOMANAGER keyword defines which system(s) will monitor the HMC and the AO Manager.
The first system found in the SITE table for each site will do the monitoring.
The NETOWNER keyword defines the name of the system that normally owns the network.
The PROCOPTS keyword specifies how IPLs and other types of processor management are handled.
Possible keywords are:
w AOMGR for AO Manager and Site Manager
w SITEMON for Site Monitor (DOS-version)
The STOPAPPL=applname keyword identifies the subsystem that SA/390 will use when a system is
stopped. GDPS will query SA/390 to get the actual jobname. The jobname is used in the SHUTSYS
command to stop that jobname and all its children. The default is JES.
REPEATTAKE=hh:mm:ss where hh:mm:ss is the amount of time that GDPS will wait before repeating
message GEO112/GEO113. Default is 00:05:00.
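For illustration, the defaults described above correspond to keyword values such as the following
(shown only to illustrate the keyword=value form):
RIPLOPT=LOAD
STOPAPPL=JES
REPEATTAKE=00:05:00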
Definition: Automatic re-IPL of a system that fails is enabled using an OPTION definition in
GDPS.
Where:
tt is the threshold. It can be from 1 to 10.
hh:mm is the elapsed time the threshold is calculated within.
The (tt,hh:mm) is a threshold definition that defines how often an automatic IPL is allowed to be
scheduled. The format is: number of times (tt) within time limit (hh:mm) before automatic IPL is
disabled. If a system fails 2 times within 12 hours (the default), the automation is disabled and a
takeover is scheduled instead. A specification of (04,01:30) means that if a system fails 4 times
within 1 hour 30 minutes, the automated IPL will be disabled.
Function
When a system that is managed by GDPS fails, a takeover request is transferred to the current
master system. Take-over processing will check the following things before scheduling the
automated IPL:
If these requirements are all met the automated IPL is scheduled by calling AOM to schedule the
IPL. The RIPLOPT definition will tell GDPS which flavor of IPL (LOAD, IPL or ACTIVATE)
will be used. In other words, automatic re-IPL will work the same way that the Standard Action
re-IPL will work.
If the requirements are not met, a takeover will be scheduled indicating system problems for the
failing system.
Operational considerations.
THRESHOLD=’LEVEL1=hh:mm:ss LEVEL2=hh:mm:ss’
LEVEL1 and LEVEL2 values define a DELAY trigger. When LEVEL1 time is
exceeded, but not LEVEL2, a yellow alert is generated. When LEVEL2 is exceeded, a red
alert is generated. There are no default settings for LEVEL1 and LEVEL2. Also,
LEVEL1 must be specified before LEVEL2.
XRCUPDATE=IMMED|DEFERRED
Default is IMMED. This keyword is only used when XRC configtype = VOLSER.
XRCUPDATE defines when the XRC-policy definition member will be updated.
IMMED indicates immediately update when a change is done. DEFERRED indicates
that the update will be done when the user returns to the primary panel. Updates through
batch always use IMMED.
AUTODETECT=NO|YES
If AUTODETECT=YES, MONITOR2 will execute the configuration process, and if pairs are
detected that match the definitions they will be added to the policy. If the newly detected pairs
are not in sync, an alert will be issued stating that all volume pairs are not in sync.
AUTODETECT=YES can only be used when XRC configtype=VOLMASK.
If AUTODETECT=NO, configuration is done from the GDPS main panel. This is the default.
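Taken together, these keywords might be specified as follows (the THRESHOLD values are purely
illustrative, since there are no defaults for LEVEL1 and LEVEL2):
THRESHOLD='LEVEL1=00:02:00 LEVEL2=00:05:00'
XRCUPDATE=IMMED
AUTODETECT=NO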
Note: XMVS will load configuration ACFZ995 when the IPLTYPE is ABNORMAL
and YMVS will perform FALLBACK which will start all subsystems defined to GDPS as
secondary with no ARMname.
Unplanned actions are defined with SA/390 "User E-T Pairs" entry-type TAKEOVER. Planned
actions are defined with SA/390 "User E-T Pairs" entry-type CONTROL or BATCH. The
contents in these definitions types can be the same. The difference between TAKEOVER versus
CONTROL and BATCH is that TAKEOVER can only be invoked when something unplanned
happens, like a system goes away or DASD failure. Definitions in CONTROL can only be invoked
from the GDPS ‘User Defined Actions'. Definitions in BATCH can only be invoked from outside
of GDPS, using VPCEXIT2. See Section 2.9.5 for a description of how to use the
BATCH possibility.
A new function in SA OS/390 V1R3.0, called 'Includes Policy Object', enables the user to add
ACF definitions outside of the SA/390 dialogs.
The advantage is that you can maintain the member outside SA/390 dialogs, but there is no
check of the content and syntax for the INCLUDEd members.
This is an example of how to add the following script, DASDS2RETURN, using the SA/390
function ‘Includes Policy Object’.
CONTROL DASDS2RETURN,
COMM='DASD SITE2 RETURN'
SYSPLEX='CDS NORMAL'
DASD='START SECONDARY'
Copy the sample ‘GDPSINCL’ from the GDPS samplib in the ‘Build Output Dataset’ for SA
OS/390. The name GDPSINCL is recommended, but it is not required.
1. In the panels for SA OS/390 find ‘SA OS/390 Entry Type Selection’.
2. Enter ‘98’ at COMMAND to select Includes and press ENTER.
3. Enter ‘New’ at COMMAND and press ENTER.
4. Supply the name of the new member ‘GDPSINCL’.
Recycle Netview and carefully check the log for warnings about the member ‘GDPSINCL’.
You can view your new script through the GDPS panels.
NOTE: The scripts will appear on the GDPS panels in the same way they would if they had been
created through the SA/390 customization panels. From the GDPS display panels there will be no
way of distinguishing scripts created in a data set from scripts created using the SA/390
customization panel interface.
If you want to change the script, or add a new script, update the ‘GDPSINCL’-member and
recycle Netview. Build is not necessary.
Please note that there is no comma after the last line of a script.
Here is an additional example defining two different scripts, one TAKEOVER and one
CONTROL.
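A minimal sketch of such a member follows; the statement contents are illustrative, reusing
keywords and names that appear elsewhere in this guide, and the comma conventions should be
taken from the GDPSINCL sample in the GDPS samplib:
TAKEOVER SYSSITE1,
COMM='SYSTEM FAILURE IN SITE1'
SYSPLEX='CFRECOVER COND'
SYSPLEX='ACTIVATE GP11'
CONTROL DASDS2RETURN,
COMM='DASD SITE2 RETURN'
SYSPLEX='CDS NORMAL'
DASD='START SECONDARY'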
When GDPS/XRC (full and RCMF) initializes, it will check if the SDM is active. If not, it will execute XSTART
and XADDPAIRs for that SDM. In full GDPS this is done by specifying VPCEINIT in the exit list for the system
in SA/390. In the RCMF version a VPCEINIT statement has to be inserted in the NetView initialization routine
(CNME1034 or CNME1035).
VPCEXIT1 (the volume-pair creation exit) will be called when a volume pair is created using the
user interface and the secondary volser or secondary device is given as a question mark. Parameters to the
exit:
1. Session name
2. A request type, VOLSER or DEVNUM
3. A primary volume serial number or a primary device number
The exit has to return the volser of the volume that is to be the secondary volume in the
task global variable RESP and exit with return code zero. If the exit cannot provide a secondary
VPCEXIT3 is scheduled after an XSTART initiated from GDPS has completed. The purpose of
VPCEXIT3 is to allow the installation to specify XSET commands tailored for their particular
environment. Some common XSET parameters that are issued are TIMEOUT, SYNCH,
SCSYNCH and PAGEFIX values. A sample VPCEXIT3 is distributed in the GDPS SAMPLIB
and can be tailored appropriately.
ENVIRONMENT: GDPS/PPRC
GDPS provides a panel interface that allows basic system management functions to be performed
(such as IPLing and stopping systems, and activating and deactivating LPARs). This panel
interface is available through the GDPS main menu option 3, also labeled ‘Standard Actions’.
For detailed information about the functions performed by these standard actions, please refer to
Section 3.2, "GDPS Standard Actions Validation", of this installation guide.
ENVIRONMENT: GDPS/PPRC
Overview
User defined actions fall into three categories: CONTROL scripts, BATCH scripts and
TAKEOVER scripts. All three types use the same syntax and execute the same way. The
difference is how they are invoked.
CONTROL scripts are executed by selecting them from the list provided when the ‘User Defined
Actions’ option is selected from the GDPS main menu. These scripts are created in the SA/390
panels by defining them as "User E-T Pairs" (Option 99) entry type Control when GDPS is
being customized.
BATCH user defined actions allow batch execution of a script defined in GDPS.
TAKEOVER scripts can only be executed when the option to select them is presented in the
GEO112E / GEO113A message generated by GDPS when a failure is detected. They can NOT be
executed manually. These scripts are also created using SA/390 panels to define them as "User
E-T Pairs" (Option 99) entry type Takeover when GDPS is being customized.
A takeover condition can be detected in any system. It is always routed to the current MASTER for
processing. The detecting system classifies the problem as SYS or DASD: SYS when a system
disappears, DASD for PPRC-related problems.
The MASTER system issues the following WTOR with multiple choices depending on GDPS
policy definitions:
When GDPS detects a failure it will always issue a prompt ( GEO112E/GEO113A) and ask what
OPTION should be taken. Note, the OPTIONS available at this time are listed in the GEO112E
message.
- ALLSITEn switches systems and primary DASD from site n to the other site
- DASDSITEn switches primary DASD from site n to the other site
- SYSSITEn actions performed when a system in site n fails
- SYSsysname actions performed when system "sysname" fails
- user defined will be presented in all takeover prompts
In ALLSITEn and DASDSITEn, n will be 1 or 2 depending on which site currently has the primary
DASD. When a system in a site fails, SYSsysname and SYSSITEn for that site will be used.
If the above naming convention is not adhered to, the presentation of the TAKEOVER scripts
will not occur as expected.
Here is a description of how GDPS selects from the available TAKEOVER scripts and what it
presents in the GEO112E/GEO113A prompt. One very important thing is that GDPS keeps track
of where the primary DASD is but it does not keep track of where systems are. This means that
the selection of the TAKEOVERs SYSSITE1/SYSSITE2 is only based on where the systems are
defined to be in the GEOPLEX DOMAINS policy and not on where they currently are running. If
there is a DASD problem the selection of the TAKEOVERs DASDSITE1/DASDSITE2 and
ALLSITE1/ALLSITE2 is based on where the primary DASD is when the problem occurs.
If “user defined” takeovers are defined they will be presented in all takeover prompts. In the
following description only the standard takeovers will be mentioned.
Note: For additional information on TAKEOVER processing see ‘Takeover Script Selection and
Validation Process’ in Section 3.3.
Section 2.9.1 Standard Takeover Actions
If a mirroring failure occurs, GDPS will trigger a FREEZE to stop mirroring and create a consistent set of
secondary DASD. For FREEZE=STOP, production systems will be QUIESCEd and RESET. If
FREEZE=COND and GDPS can determine that the cause of the freeze is a secondary DASD
problem, execution will be allowed to continue and there will be no takeover prompt; otherwise
production systems will be QUIESCEd and RESET. In all cases, except FREEZE=COND with a
secondary DASD problem, GDPS will issue a takeover prompt with the ALLSITEn and
DASDSITEn options. The value of n will be 1 or 2 depending on where the primary DASD is.
Note: A secondary DASD problem can cause a freeze and a takeover prompt, but you should not
do any takeover for a secondary DASD problem.
Note: The ALLSITEn and DASDSITEn options will only be available if the current master system is a controlling system.
System Failures
If a production system in the site of the primary DASD fails, GDPS should suggest a SYSSITEn,
SYSsysname, or ALLSITEn TAKEOVER where n is the site of the primary DASD. When a
system in the site of the secondary DASD fails, GDPS should suggest SYSsysname and SYSSITEn
where n is the site of the failing system.
NetView Failures
GDPS uses XCF to communicate. If a GDPS NetView fails, the XCF communication to that
GDPS will fail and this is indicated by setting the status of that system to NOGDPS. Therefore,
the system where NetView failed should get status NOGDPS and it should be reported in SDF.
No takeover will be suggested.
If NetView in the controlling system fails the next GDPS in the master list will take the master
(controlling) function.
CF Failures
Coupling Facility failures are supposed to be handled by the CFRM policy. There will be no
GDPS action for a Coupling Facility failure.
Currently there are no GDPS actions for processor or coupling facility failures. A processor
failure will cause systems in that processor to fail and GDPS action will be as described for system
failures.
The following keywords are used in BATCH, CONTROL and TAKEOVER scripts. The syntax
of the BATCH, CONTROL and TAKEOVER keywords is as follows:
The existence of ',ETIME=hh:mm' on a script statement will start a timer at execution time of that
statement. When the specified time has elapsed, a warning message will be issued stating that the
estimated time for this action has been exceeded.
§ COMM keyword
w Text
COMM is used to create a heading for the script and it should only be specified first in a
script. The text will be shown when you display the script in GDPS. There is no action related
to COMM. Maximum length of text is 63 characters.
§ DASD keywords
NOTE: Any script, for either planned or unplanned actions, using DASD keywords can ONLY
execute in a controlling system.
Switching the PPRC disk from one site to the other can be done in a planned action using either
the DASD='SWITCH DELPAIR' or DASD='SWITCH P/DAS' statement. SWITCH DELPAIR
requires that all systems using the disk subsystems be stopped and reIPLed after the switch, while
SWITCH P/DAS can be done when systems are running.
It typically takes 15+ seconds to swap a device pair. It is of course recommended to do a P/DAS
switch at a time when the system load is low. The pairs are swapped one by one so a total P/DAS
switch can take considerable time. You also have to take into consideration that there are some
requirements that have to be met for P/DAS to work.
These requirements make it very unlikely that a production mirror can be switched using P/DAS
without manual intervention. The requirements are described in DFSMS/MVS V1 Advanced
Copy Services (SC35-0355). In addition, testing at different installations has shown that P/DAS
sometimes fails for a single volume (or a few volumes).
When executing a DASD='SWITCH P/DAS' statement, if any device fails, the return code will be
set to 8, and this will cause GDPS to issue a GEO118A prompt asking if script execution should
continue or not. If this occurs, a subset of the device pairs have been swapped and manual
intervention is required. If most of the devices have been swapped, you can try to swap the
remaining ones using the PDAS function from the GDPS DASD Mirroring panels and then let the
script continue. If you choose to interrupt the script, you have to manually swap back the device
pairs that were swapped, again using the PDAS function from the GDPS panels. It is
recommended that an ASSIST is added after the DASD='SWITCH P/DAS' statement, and that
you set up proper procedures to follow if you use SWITCH P/DAS. It is also recommended that
you do extensive testing of this function.
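For example, a planned action using P/DAS could be coded as follows, with an ASSIST
immediately after the switch (the script name and the ASSIST text are illustrative assumptions):
CONTROL PDASSWITCH,
COMM='PLANNED DASD SWITCH USING P/DAS'
DASD='SWITCH P/DAS'
ASSIST='VERIFY THAT ALL DEVICE PAIRS HAVE BEEN SWAPPED'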
w SWITCH DELPAIR:
Will use PPRC commands (CDELPAIR / CESTPAIR with NOCOPY) to swap the mirrored
DASD configuration. A script containing this keyword will not execute unless mirroring
status is OK.
w SWITCH P/DAS:
Executes P/DAS on all volumes
w START SECONDARY:
FCESTABLISH SECONDARY (flashcopy secondary volumes)
CESTPATH for all paths
CESTPAIR for all pairs with RESYNCH
If the FlashCopy command fails and messages GEO310W and GEO313I appear on the SDF
screen, a WTOR is also issued in the syslog. Follow the instructions in the WTOR. To avoid
problems if a previous FlashCopy still exists, the user can, as a first
step before the DASD='START SECONDARY' command, execute the
DASD='FCWITHDRAW SECONDARY' command (see the fragment after the RECOVER
description below).
w RECOVER:
CQUERY of all pairs (to check status, unacceptable status not recovered)
VARY OFFLINE of old primary volumes
CRECOVER all pairs with acceptable status
VARY ONLINE of all recovered pairs
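As noted under START SECONDARY above, a previous FlashCopy can be withdrawn first; an
illustrative fragment (other statements omitted):
DASD='FCWITHDRAW SECONDARY'
DASD='START SECONDARY'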
§ SYSPLEX keywords
w STOP ALL|SITEn|sysname
If the parameter is ALL, SITEn, or a list of system names, all systems will be stopped in
parallel. (Note that STOP ALL or STOP SITEn will not stop the system executing the script.)
Stopping a system normally includes stopping all applications/subsystems followed by varying
the system out of the sysplex. The time to do this is installation dependent, but up to half an
hour is not uncommon. The statement after a SYSPLEX STOP will not be executed until all
systems in the SYSPLEX STOP have been removed from the sysplex. This means that you would
in general not have more than one SYSPLEX STOP statement in a user-defined action. (A
combined example of SYSPLEX and DASD statements is shown after the CFRECOVER
description below.)
Note: ACTIVATE is used to IPL a system and therefore we have the system name in the
syntax. GDPS knows what partition the system should run in and will issue the
ACTIVATE to the correct partition. The partition is selected based on the IPLTYPE
setting for the system. This is described in section 3.2 Standard Actions Descriptions and
Indicators. This is true for all script statements that operate on an LPAR
(DEACTIVATE, RESET, PSWRESTART, and LOAD): specify the system name and the
operation will be issued to the "current" LPAR for that system.
Note: As an alternative to specifying sysname, one can specify AOMOBJECT=oooo for
ACTIVATE, DEACTIVATE, and RESET. The oooo should be the AOM object for an
LPAR, and this is intended for manipulating LPARs that do not run a GDPS
system, for example a partition where expendable work is running which has to be stopped
in a failure situation to allow a GDPS system to use the partition.
w DEACTIVATE sysname
Do DEACTIVATE of the LPAR where sysname was active, or the LPAR with object id
oooo.
w RESET sysname
Do SYSTEM RESET of the LPAR where sysname is active, or the LPAR with object id
oooo.
w PSWRESTART sysname
Do PSWRESTART where sysname is active
w CF NORMAL|SITE1|SITE2
Execute SETXCF commands so the CFRM-policy defined in GDPS will be used
The policy names are managed from GDPS user interface choice 7.
Use of this keyword will result in structures being moved via the XCF rebuild command.
Therefore only structures that support rebuild will be moved. Users should consider using
an ‘ASSIST’ in a script to prompt operations to verify the status of the coupling facilities
prior to further action.
The first command changes the CFRM policy and will cause MVS to initiate structure rebuilds;
GDPS then issues the D XCF,STR command and, depending on the response, also SETXCF
START,REBUILD,... commands.
w CFRECOVER COND|UNCOND|PROMPT
Execute SETXCF commands to move structures from the CFs in the primary site.
If COND, structures will be moved if the CF is not working.
If UNCOND, structures will be moved from the primary site.
If PROMPT, operators will be asked if the movement is to be executed;
if the CFs do not work, structures will be moved without operator intervention.
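As a combined illustration of the SYSPLEX and DASD keywords described above (referenced
from the STOP description), a planned-action fragment for moving work to Site2 might contain
the following; the ASSIST text is an assumption:
SYSPLEX='STOP SITE1'
DASD='SWITCH DELPAIR'
SYSPLEX='CDS SITE2'
SYSPLEX='CF SITE2'
ASSIST='VERIFY COUPLING FACILITY STRUCTURES BEFORE CONTINUING'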
§ CBU Keywords
w ACTIVATE AOMOBJECT=oooo
Initiate CBU activate on the CPC with object id oooo. A prompt will be issued and the
operator can select between a real CBU activation, a test CBU activation, or no CBU
activation. The CBU ACTIVATE statement is valid only in TAKEOVER scripts.
w CONFIGON sysname
After CBU activation, reserved PUs will be available but offline. For systems that do not
need to be IPLed during the takeover, the CBU CONFIGON statement can be used and
GDPS will configure all offline PUs online to system sysname. For systems that are IPLed,
GDPS will configure all offline PUs online during GDPS initialization. (A short sketch
follows the activation steps below.)
1. Issue message 'GEO165A Is GDPS authorized to activate CBU?' If reply is NO, issue
'GEO161I CBU Activation not Authorized by Operator' and exit with RC = 1. If reply is
YES or TEST, continue
2. Issue message 'GEO090A GET_ATTRIBUTE S P L obj 32' to get current CBU status. If
response indicates CBU active, issue message 'GEO160I CBU Activate Complete' and exit
with RC = 0. If response indicates CBU not active, continue
3. Issue message 'GEO090A ACTIVATE_CBU S P L obj' or 'GEO090A
ACTIVATE_CBU_TEST S P L obj' to activate CBU
4. Wait for 2 minutes and then go back to step 2. to check if CBU has been activated
Steps 2, 3, and 4 are repeated up to 10 times. If activation was not successful after 10 iterations,
issue message 'GEO163W CBU Activate Confirmation not Received' and exit with RC = 3.
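A TAKEOVER script fragment using the CBU keywords might look like the following, assuming
the same keyword='data' form as the other script statements; oooo stands for the CPC object id,
and the system name and COMM text are illustrative:
TAKEOVER ALLSITE1,
COMM='SITE1 FAILURE - ACTIVATE CBU IN SITE2'
CBU='ACTIVATE AOMOBJECT=oooo'
CBU='CONFIGON GP11'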
§ NETWORK keywords
w ATCCONnn
Note: if a network-owning system is recycled, a message table entry (defined in this Installation
Guide) can be added so the system will remember to do this when VTAM is started.
§ Other keywords
w GETSTORAGE sysname
MVS CF-command will be issued in sysname requesting central and expanded
storage defined in storage element 1
w RELSTORAGE sysname
MVS Config (CF) -command will be issued in sysname releasing storage defined in
storage element 1
w USERPROC xxxxxxxx
Execute user procedure xxxxxxxx
Note: When a USERPROC statement is executed in a script, the return code is saved. If
the return code is >7 then a WTOR is issued on the console:
GEO118A text DID NOT WORK
REPLY 'ACCEPT' TO CONTINUE
'CANCEL' TO CANCEL ACTION
The text is the USERPROC statement and the return code. If the user replies 'ACCEPT',
the script will continue. If the user answers 'CANCEL', the script will be
cancelled/stopped at this point.
w EXECUTE sysname FALLBACK
GDPS will send to system 'sysname' and execute a procedure that will issue an SA for
OS/390 command that finds out which applications have the status of FALLBACK
associated with them. It then issues the SA for OS/390 commands to start those
applications on that system. (A short sketch follows the MESSAGE keyword description below.)
For example, if System A normally runs application ABC and if System B is capable of
running application ABC when system A is not up, then in SA/390 one can define that it is
possible to run ABC in system B even though that's not the normal state.
If system A fails, the takeover script for SYSSYSTEMA can contain EXECUTE
SYSTEMB FALLBACK and that will 'move' the application ABC to SYSTEMB.
w ASSIST text
A WTOR will be issued, message-id GEO045A, requesting operator assistance. When
manual intervention is complete the operator has to respond to the WTOR, and GDPS will
proceed.
NOTE: The ASSIST keyword must not contain single or double quotes (“,’).
w MESSAGE keyword
w Text
MESSAGE is used to create a message in the syslog (WTO). The 'text' will be written in
the syslog preceded by 'GEO045I MESSAGE from GDPS SCRIPT:'. There is no action related
to MESSAGE.
NOTE: The MESSAGE keyword must not contain single or double quotes (“,’).
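Tying the EXECUTE keyword described above to the System A / System B example, a takeover
script might contain the following; the keyword='data' form and the COMM text are assumptions:
TAKEOVER SYSSYSTEMA,
COMM='SYSTEM A FAILED - MOVE WORKLOAD TO SYSTEM B'
EXECUTE='SYSTEMB FALLBACK'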
The following summarizes the use of IPLTYPE and IPLMODE in planned and unplanned
reconfiguration scripts. Both keywords will be used to find the proper entry for a sysname in the
site-table definition.
When a LOAD (IPL) script request is issued, i.e., LOAD sysname, then the current settings of
IPLTYPE and IPLMODE are the search arguments into the site-table definition to find the load
address and load parms associated with this sysname. For example (refer to table below), if the
sysname is SYS1, and the IPLTYPE is NORMAL and IPLMODE is set to PRI, then the load
address of 1000 and load parms of 10018A will be supplied. If there is no match, then the LOAD
request is passed without any load information (via AOM) to the HMC and the last setting in the
load profile will be used.
When the LOAD sysname is issued with load address and load parms explicitly specified, then no
lookup in the site-table is necessary.
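For the example above, the corresponding site-table entry would contain values along these lines
(other columns and entries are omitted):
Sysname  IPLTYPE  IPLMODE  Loadaddr  Loadparm
SYS1     NORMAL   PRI      1000      10018A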
IPLTYPE NORMAL and ABNORMAL identify the object id (Cxyz) as defined by the policy
(DOMAINS). Generally, ABNORMAL is used as an inter-site backup option; this means, when a
system (processor) fails, the failing system has to be restarted in the other site. When the
IPLTYPE is set to an object id, the system is restarted in this specified LPAR. This allows an
Script definitions
Script definitions for GDPS/XRC are defined in SA/390. SA/390 is a prerequisite for the following script
definitions.
ENVIRONMENT: GDPS/PPRC
Site Maintenance
Site Maintenance (Site-1)
Policy (initiated from GPKL):
SYSPLEX='STOP SITE1'
DASD='SWITCH DELPAIR'
DASD='STOP SECONDARY'
SYSPLEX='CDS SITE2'
SYSPLEX='CF SITE2'
SYSPLEX='RESET AOMOBJECT=LP22'
SYSPLEX='RESET AOMOBJECT=LP21'
IPLTYPE='MP11 ABNORMAL'
SYSPLEX='ACTIVATE MP11'
IPLTYPE='MP12 ABNORMAL'
SYSPLEX='ACTIVATE MP12'
Procedure (on GPKL):
Shut down all systems in Site1 and the Site1 CF
Switch DASD; reconnect with NOCOPY and suspend
Reset LP21/LP22
Switch to the Site2 CF
IPL the Site1 systems in LP21 and LP22
Start applications on the Site2 systems
This script would execute an orderly shutdown of the systems in Site1 (STOP SITE1) followed by
a switch of the primary DASD from Site1 to Site2 (SWITCH DELPAIR). It would then place the
secondary DASD (now in Site1) in suspend (STOP SECONDARY). The sysplex couple data sets
and coupling facility use would then be switched to Site2 (CDS SITE2 & CF SITE2).
The LPARs at Site1 would then be reset (RESET AOMOBJECT) followed by placing the
systems GP11 and GP12 in abnormal IPL mode, thus preparing them for IPL at Site2
(IPLTYPE). Finally the systems would be IPLd at Site2 (ACTIVATE).
The coupling facility command (CF SITE2 in this case) will change the CFRM policy based on
definitions in GDPS and execute rebuild commands to move the structures (as described
elsewhere in this document). In many cases this action will not leave the coupling facility ready for
maintenance. If there are structures in the coupling facility that cannot be moved, either because
they do not support rebuild or for any other reason, these structures will need to be moved by
manual intervention and the coupling facility will need to be shut down for maintenance as
described in the manual "Setting Up a Sysplex", GC28-1779.
Return to Site 1 after Maintenance
Policy (initiated from GPKL):
SYSPLEX='STOP SITE1'
DASD='SWITCH DELPAIR'
SYSPLEX='CDS NORMAL'
SYSPLEX='CF NORMAL'
IPLTYPE='GP11 NORMAL'
SYSPLEX='ACTIVATE GP11'
IPLTYPE='GP12 NORMAL'
SYSPLEX='ACTIVATE GP12'
Procedure (on GPKL):
Re-sync all DASD and ask the operator to respond when completed
Shut down all systems and the Site2 CF
Switch DASD; reconnect with NOCOPY
Set CDS to normal
Set CF configuration to normal
IPL GP11 and GP12
Start applications on the Site1 systems
Return to Site1. After maintenance is complete this script would return operations to normal;
GP11 and GP12 executing in LPARs LP11 and LP12, with primary DASD in Site1 and secondary
DASD in Site2. The PPRC devices must be resynched (from the GDPS PPRC control panel) and
DASD mirroring set to OK (by using option 5 - Query, from the GDPS control panel) before the
script can be started.
The STOP SITE1 keyword stops GP11 and GP12, which are running at Site2. The STOP SITE1
statement gets the definition for the systems to be shut down from the domain statement in the
GDPS definition. In this case Site1 is defined as having systems GP11 and GP12, so these systems
are shut down in whatever LPAR they are executing.
The DASD switch moves the primary drives to Site1 and the secondary drives to Site2 (SWITCH
DELPAIR), followed by returning the couple data set and coupling facility use to normal, based
on the CFRM policy defined to GDPS. IPL type is then set to normal, preparing GP11 and GP12
for IPL in Site1. Finally the systems are IPLd at Site1.
It should be noted that LOAD is a viable alternative to ACTIVATE for IPLing systems. It is a
matter of preference.
Site Maintenance (SITE2)
Configuration: two 9672-R26 processors with LPARs ICF, LP11, LP12, LP21, LP22, LPKL, ICF.
Policy - CONTROL/SITE2MAINT (initiated from GPKL):
COMM='SITE 2 HARDWARE MAINTENANCE'
SYSPLEX='CF SITE1'
SYSPLEX='CDS SITE1'
DASD='STOP SECONDARY'
Procedure (GPKL to start the process and GP11 to shut down GPKL):
Suspend the secondary DASD
Move CDS and CF to Site1
This script automates shutdown of GDPS functions in Site2 except for shutting down GPKL (the
controlling system). GDPS will not allow a system to shut itself down. That is why the last
ASSIST is provided. It generates a WTOR to remind the operator of the final step.
The script places all couple data set and coupling facility functions in Site1 and places the
secondary DASD in suspend mode.
Return to Site 2 after Maintenance
Policy - CONTROL/SITE2RETURN (initiated from GP11):
COMM='BRING SITE 2 ONLINE AFTER HARDWARE MAINTENANCE'
SYSPLEX='CF NORMAL'
SYSPLEX='CDS NORMAL'
DASD='START SECONDARY'
IPLTYPE='GPKL NORMAL'
SYSPLEX='ACTIVATE GPKL'
Procedure:
Return CDS and CF to the normal configuration
Start secondary DASD
IPL GPKL
In this environment GP11 is the second system in the list of systems eligible to be the controlling
system. In the absence of GPKL GP11 is the controlling system. This script is initiated from
GDPS on GP11. It restores the couple data set and coupling facility functions to normal,
re-synchs the DASD and IPLs GPKL. As soon as GPKL has IPLd it will automatically and
immediately resume the role of controlling system.
DASD Maintenance (SITE2)
Policy - CONTROL/SITE2DASDMAINT (initiated at GPKL):
COMM='SITE 2 DASD MAINTENANCE'
DASD='STOP SECONDARY'
This script places the secondary DASD devices in suspend to allow maintenance on the devices.
The assumption here is that the maintenance procedures will not change any data resident on the
volumes.
If data were to be changed a different approach would be required. The PPRC pairs would need
to be broken and later reestablished. This would copy all data from primary to secondary, ensuring
data integrity.
Coupling Facility Maintenance (Site-1)
(Initiated at GPKL)
A simple script to allow maintenance on the coupling facility in Site1. It results in the automated
movement of all structures to the Site2 coupling facility.
Earlier comments regarding the scope of GDPS management of structures applies here. Further
manual intervention may be required depending on the structures in use.
It is not likely that it will ever be possible to have GDPS manage ALL kinds of failures in a
complex S/390 environment. One reason is that whatever triggers GDPS is always a symptom of a
problem and human intervention is always needed to find the true problem causing the symptom(s)
and decide what the proper actions are. In most cases GDPS will assist and offer valid takeover
actions and, as soon as anyone has selected a takeover, it is automated and production will be
restored in the fastest possible way.
However, SITE2 failures are most prone to having incorrect actions proposed by GDPS.
There is a very high risk that GDPS will present a takeover prompt with ALLSITE1 and
DASDSITE1. If the controlling system also failed, the master function will move to the next
system in the master list which in most cases is a production system. In this case the takeover
prompt will say "DASD TAKEOVER NOT POSSIBLE" and "ALL TAKEOVER NOT
POSSIBLE", since these takeover scripts/actions are not allowed to run in a production system.
If the customer has two controlling systems, one in each site, the one in site 1 will take the master
function and offer the ALLSITE1 and DASDSITE1 takeovers. It is extremely important that it is
understood that the takeovers offered by GDPS cannot be selected if the secondary DASD has
failed.
The actions that you would normally want to take for a site 2 failure are to switch to the site 1 couple
data sets and site 1 CFs, which can easily be done from the GDPS panels. If it is a multi-site
workload with production systems in site 2, you may want to start these systems in backup
partitions in site 1 and this can also be done from the GDPS panels.
Another, in most cases better, possibility would be to do the planned action for site 2
maintenance, which will probably do exactly what is wanted for a site 2 failure.
If the site 2 maintenance planned action doesn't do exactly what is wanted, a site 2 failure planned
action script could be prepared to be used for a site 2 failure.
It is possible to define ALLSITE2 and DASDSITE2 takeovers which may be a bit confusing
because they will only be offered by GDPS if there is a failure when the primary DASD is in site 2.
These takeovers normally don't apply for a site 2 failure.
System Failure
System Failure and/or Last System Site-1
Policy (detected by GDPS; the operator is prompted to initiate this takeover action to move the
production MVS images to Site2 and leave the primary DASD in Site1):
SYSPLEX='CFRECOVER COND'
SYSPLEX='RESET AOMOBJECT=LP21'
SYSPLEX='RESET AOMOBJECT=LP22'
IPLTYPE='GP11 LP21X'
SYSPLEX='ACTIVATE GP11'
IPLTYPE='GP12 LP22X'
SYSPLEX='ACTIVATE GP12'
This script is similar to the script for moving systems to Site2 for maintenance. The primary
difference is that there is no attempt to shut down systems in an orderly fashion. The systems
running in LPARs LP21 and LP22 are considered to be expendable work. There is also no DASD
movement.
This is not likely to be considered a viable alternative, since the performance degradation incurred
by accessing the remote DASD will usually be unacceptable. It is provided merely as an option to
consider.
Site 1 Failure / Switch All to Site 2
Policy (takeover initiated by GDPS):
DASD='RECOVER'
SYSPLEX='CDS SITE2'
SYSPLEX='CF SITE2'
SYSPLEX='RESET AOMOBJECT=LP21'
SYSPLEX='RESET AOMOBJECT=LP22'
IPLTYPE='GP11 ABNORMAL'
SYSPLEX='ACTIVATE GP11'
IPLTYPE='GP12 ABNORMAL'
SYSPLEX='ACTIVATE GP12'
Procedure (initiated by selecting the takeover option):
Shut down all systems in Site1
Switch DASD; reconnect with NOCOPY and suspend
Reset LP21/LP22
Switch to the Site2 CF and CDS
IPL the Site1 systems in LP21 and LP22
This scenario is likely to be the best choice where a production site switch is required due to a
disaster or major systems failure. It starts by doing a reset for all Site1 systems. This ensures no
I/O can occur to primary DASD. Next, the secondary DASD is placed in simplex mode, leaving
the primary DASD in Primary/Suspend mode. This allows access to the secondary DASD to allow
IPL in Site2.
Couple data sets and coupling facility functions are moved to Site2, followed by an
unceremonious shutdown of the systems running in LP21 and LP22. IPL type is set to abnormal
preparing GP11 and GP12 for IPL at Site2, then they are IPLd at Site2.
The process of cleaning up the DASD and systems environment at Site1 is left to the user, based
on established policies.
System Failure of GP11
Policy (detected by GDPS; the operator is prompted to initiate this takeover action to IPL GP11 in
its normal location (Site1) after a failure of GP11):
IPLTYPE='GP11 NORMAL'
SYSPLEX='ACTIVATE GP11'
DASD Failure in Site-1
Policy (detected by GDPS; the operator is prompted to initiate this takeover action to move the
primary DASD devices to Site2 and leave the production MVS images in Site1):
SYSPLEX='RESET SYSname'
DASD='RECOVER'
IPLTYPE='GP11 LP11Z'
SYSPLEX='ACTIVATE GP11'
IPLTYPE='GP12 LP12Z'
SYSPLEX='ACTIVATE GP12'
This script allows the DASD to be moved to Site2 with the systems remaining at Site1. Like the
first script in this section it is not likely to be a viable alternative. It is included to demonstrate the
flexibility of scripts.
Note that the systems must be re-IPLed, since the DASD device numbers will be different. The
other unusual feature of this script is the uncommon IPLTYPE statement. This
allows a load profile to exist that points the IPL of a system in Site1 to a DASD system residence
volume in Site2.
BATCH user defined actions allow batch execution of a planned action defined in GDPS. The planned
action must be defined in SA/390 under choice 99 (UET pairs). The entry name must be BATCH and the
type is the user-defined name of the batch procedure.
With XRC this could be used to dynamically and automatically modify the configuration of the
DASD environment. This feature could be used to add a pair to an XRC config at an unattended
or vendor recovery site after the volume has been initialized.
- VPC_BATCHRC: OK or NOK
- VPC_BATCHREAS: reason text
Action     Keyword/Data
XRC        '%1'
NOTE: %1, %2, %3, etc. represent values that will be passed to the GDPS VPCEXIT2 CLIST at
invocation time. %1 will be the fourth position variable data passed to the exit. At invocation,
"%1" in the script will be replaced by "STHLM3 CREPAIR SH1000 SH2000". If there were a
%2 defined in the script, it would be replaced by the fifth value in the "Operation Text" field, %3
would be replaced by the sixth value, and so on.
NOTE: A comma (,) must be used as a delimiter in the data field between values that are to be
passed to the VPCEXIT2 CLIST as different variables. Since the example below does not use
commas between the data field values (there are spaces), the %1 in the script is replaced by
'STHLM3 CREPAIR SH1000 SH2000'.
Execute the exit VPCEXIT2, which is supplied as a sample in the GDPS SAMPLIB. The first parameter
(300) is the number of seconds within which the script should terminate; if the script is not finished
within that time, it will be terminated. The second parameter is the name of the BATCH definition
(CREATE). The rest of the parameters are substituted into the batch definition where the substitution
string(s) (%n) are present. The result of the execution is returned in the task variables
VPC_BATCHRC and VPC_BATCHREAS.
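For example, using the CREATE batch definition shown above, the invocation could look like the
following line; the exact NetView invocation syntax and parameter quoting should be taken from
the VPCEXIT2 sample in the GDPS SAMPLIB, so this is only an illustration:
VPCEXIT2 300 CREATE STHLM3 CREPAIR SH1000 SH2000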
If you want to execute a script from a batch job, for example to re-IPL a system as part of scheduled
maintenance or as a task to perform every other weekend, it can be done from batch without
a person on site to perform the IPL.
Note: This example is specific to the OPC/ESA batch scheduler; if you use another scheduler,
tailor this code appropriately.
%1, %2, %3, etc. represent values that will be passed to the GDPS VPCEXIT2 CLIST at invocation
time by OPC. %1 will be the second position variable data passed to OPC. In the example
below, the second position in the "Operation Text" field is "TSMVS1". At invocation, "%1" in the
script will be replaced by "TSMVS1". If there were a %2 defined in the script, it would be
replaced by the third value in the "Operation Text" field, %3 would be replaced by the fourth
value, and so on.
NOTE: A comma (,) must be used as a delimiter in the data field between values that are to be
passed to the VPCEXIT2 CLIST as different variables.
Action     Keyword/Data
SYSPLEX    'STOP %1'
SYSPLEX    'LOAD %1'
(end of screen shot from SA/390)
Note:
1. REIPLSYSTEM is the GDPS script name
2. %1 is a parameter that will be substituted by GDPS from data in VPCEXIT2 -
Note:
1. Operws has to be a WTO workstation.
2. When this operation is scheduled, Message EQQW775I is issued and trapped by GDPS
3. Operation text contains name of batchscript and parameters, in this case:
i. name of Batch script is REIPLSYSTEM
ii. parameter is TSMVS1. The ‘TSMVS1’ value will be substituted in all places within the script
where %1 is coded. See %1 in the above screen shot.
ENVIRONMENT: GDPS/PPRC
‘Standard Actions’ are a set of functions that can be performed on MVS images and LPARs using
the GDPS ‘Standard Actions’ menu, accessed through the GDPS application on NetView. The
primary functions are starting, stopping, and restarting systems in GDPS. Stopping a GDPS
system should always be done from Standard Actions because GDPS will then be aware that
the system leaves the sysplex. If a system is stopped in another way, GDPS will detect that the
system goes away and treat this as a failure situation and initiate a takeover. Since an HMC
interface is required to perform these actions, GDPS also provides some HMC functions as
Standard Actions, such as Load (IPL), system reset, and activating or deactivating LPARs. In
addition IPL Type and IPL Mode settings influence the way some of the other actions operate and
can be changed from the Standard Actions menu.
Note: Standard Actions should be performed only from the Controlling system except for
those functions performed to the Controlling system.
In a non-GDPS environment with automation a system start is normally done by initiating a Load
or Activate from the HMC and the rest of the startup is automated. Likewise a shutdown is
initiated from the automation product and when shutdown is complete a System Reset and
possibly a Deactivate is done from the HMC.
In a GDPS environment additional functionality has been provided by combining some of these
functions into a single Standard Action invoked from the GDPS NetView application, for example
the IPL, Stop, and ReIPL Standard Actions.
Another category of Standard Actions is those that change the way the other Standard Actions
operate, specifically: 1) selecting the LPAR for a system (with the IType Standard Action) that
GDPS will pass to the HMC functions, 2) setting values (with the IMode Standard Action) that
GDPS can pass to non-SA390 automation products to aid in the interoperability of GDPS with
those products, and 3) setting Loadaddr and Loadparm (with the Modify
Standard Action) to be used for the Load Standard Action.
In addition, operations have been simplified since some other HMC actions can be initiated from
GDPS Standard Actions, like System Reset, Deactivate, and Activate.
The interface between GDPS and the HMC is AO Manager (AOM), or an equivalent product.
GDPS requests AOM to perform an action by issuing a WTOR with the format:
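Based on the GEO090A examples shown elsewhere in this guide, the message takes a form such as
the following (the exact operands depend on the action being requested):
GEO090A action sysname P L lpar-id [loadaddr loadparm]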
GDPS keeps track of the LPAR where a system should run and displays it on the Standard
Actions panel and inserts it in the GEO090A message as lpar-id. AOM initiates the action at the
HMC and then replies OK. If for some reason it cannot do the action it will reply NOK.
All Standard Actions must be executed from a system that is not the target of the action and
has a connection to AOM. GDPS has no awareness of which systems have a connection to
AOM. If a Standard Action is performed on a system without a connection to AOM there will be
no reply to the GEO090A message and the action will hang. No system will be allowed by GDPS
to do anything to itself that will cause it to stop functioning. In general Standard Actions will be
performed from the controlling system, but the controlling system will need to be stopped from
another system. In this case, another system will take the controlling function. The selection is
based on the sequence of entries in the MASTER list coded at GDPS installation time. It is
important that ALL systems in the GDPS environment be included in the list. The next eligible
system can be used to perform actions on the former controlling system.
Note: when the primary controlling system is IPL’d successfully, it will automatically become the
controlling system again as soon as GDPS has been initialized.
In Section 3.2 each Standard Action will be accompanied by a description of the action and a
description of what to look for and where to look for it, in order to validate that the expected
operations have taken place.
ENVIRONMENT: GDPS/PPRC
1. Log on to NetView
2. Enter the GDPS command to get into the GDPS Main Menu
3. Select Option 3 “Standard Actions” from the GDPS Main Menu Panel
4. The Standard Actions are listed at the top of the next GDPS panel
5. To execute a Standard Action, type the 1-2 character abbreviation on the underscored line
in front of a sysname and press enter.
6. The next GDPS panel is a verification panel. At the bottom of the panel, type YES and
press Enter for the Standard Action to start.
Note that for the IType and IMode Standard Actions, the verification panel is also where
new data is entered.
Note: It is assumed that Standard Actions are performed just like one would do these operations
from the HMC. For example, one should not do a Reset to a running system. The only actions
supposed to be issued to a system in the Active (or Master) state are Stop and ReIPL.
ENVIRONMENT: GDPS/PPRC
The IPL Standard Action is similar to dropping a system image object, which is associated with a
load profile, onto the activate function on the HMC. It causes the IPL program to be read from a
designated device and initiates the execution of that program.
The IPL Standard Action uses HMC Load Profiles, which contain the parameters needed to put
the system into an operational state. GDPS requires that the load profiles follow a specific naming
convention: LPAR-ID concatenated with SYSNAME. If loadaddr and loadparm are defined to
GDPS, and shown on the Standard Actions panel, they will be added as extra parameters in the
GEO090A message but AOM will ignore them. (If one has another product these parameters can
be used by that product.)
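For example, under the naming convention above, a system GP11 whose current LPAR is LP11
would presumably use a load profile named LP11GP11 (an illustrative combination of names used
elsewhere in this guide).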
ENVIRONMENT: GDPS/PPRC
The Stop Standard Action is used to shutdown a system in an orderly manner, remove the system
from the sysplex (V XCF,sysname,OFF), and then reset the system by invoking the GDPS Reset
Standard Action.
GDPS supports shutting down a system through SA for OS/390, another automation product, or
a combination of SA for OS/390 and another product. Systems are defined to GDPS in the
GEOPLEX DOMAINS automation policy and for each system one specifies a two character
automation flag where each character can have the value Y (for yes) or N (for no). The first
character is for SA for OS/390 and the second is for another automation product. So if both SA
for OS/390 and another product is used, one should specify YY.
Note: One should specify YY even if SA for OS/390 is only used to start and stop the other
automation product.
The Stop standard action can be used to shut down one system or all systems in a site (S is the
only allowed action on the SITE1 and SITE2 lines on the VPCPSTD1 panel.) Note that when you
initiate a Stop standard action, it will not complete until the system(s) has/have been removed
from the sysplex, and no other standard action or planned action can be started until the stop
completes. If you stop a site, all systems in the site will stop in parallel. If you initiate stop of a site
from a system in that site, the action will be allowed, but the system where you initiated the stop
will not stop (in other words: a system will never stop itself).
The status on the GDPS panel will be changed from ACTIVE to STOP.
The following can be seen in the SYSLOG:
VPCTRACE PLANNED ACTION STARTING
VPCTRACE SYSPLEX=STOP sysname STARTED
On the controlling system (or the system initiating the Stop) the following WTOR occurs
GEO043A ID=hh:mm:ss... WAITING FOR SHUTDOWN SYSTEM (sysname)
If the flag for another automation product is Y, the following WTOR message occurs on the
system to be stopped
GEO046I STOP SYSTEM sysname TOKEN=hh:mm:ss...
The non-SA390 automation product should recognize this message occurrence and invoke its
system shutdown process. When the system shutdown process has been completed, then
the non-SA390 automation product should reply OK
Next, if the flag for SA for OS/390 is Y, then GDPS on the system to be stopped will issue a
SHUTSYS command to SA for OS/390 and SA for OS/390 will stop the applications
defined to it (possibly just the other automation product).
When everything has been stopped, GDPS on the stopping system will reply OK to the GEO043A
message and processing will continue on the controlling system where GDPS will issue the
command
V XCF,sysname,OFFLINE
GDPS will reply to the messages that occur because of the V XCF command and finally the
following can be seen in the SYSLOG:
VPCTRACE sysname HAS LEFT THE SYSPLEX
GEO090A RESET sysname P L lpar-id
AO Manager initiates the Reset and replies OK
The status will be changed to RESET.
VPCTRACE SYSPLEX=STOP sysname ENDED RC=0
VPCTRACE PLANNED ACTION COMPLETED
The HMC ICON will go from clear to red.
ENVIRONMENT: GDPS/PPRC
The Load Standard Action is identical to the load performed from the HMC. The load will be
issued to the current LPAR for the system. The current LPAR is displayed in the LPAR field on
the GDPS panel. It causes the IPL program to be read from a designated device and initiates the
execution of that program. For GDPS, the Load Standard Action is used when one does not want
to use the loadaddr and loadparm specified in the selected system’s HMC load profile (refer to the
IPL section for information on how GDPS determines the name of the load profile). The Load
Standard Action will always use the loadaddr and loadparm shown on the Standard Actions panel.
Loadaddr and loadparm are defined on the GDPS panels using the Modify Standard Action and
selected by changing IPL Type and IPL Mode with the IT and IM Standard Actions.
Optionally you can change loadaddr and loadparm on the confirmation panel before you enter
YES. When displayed, the confirmation panel contains the values from the Standard Actions
panel, but they can be changed.
ENVIRONMENT: GDPS/PPRC
The Activate Standard Action is identical to the LPAR (partition) activate performed from the
HMC. The activate will be issued to the current LPAR for the system. The current LPAR is
displayed in the LPAR field on the GDPS panel. An HMC image profile is associated with this
HMC function and will be used for the activation. The name of the image profile is the same as the
LPAR name. An IPL will be initiated if the image profile specifies “Load at Activation”.
ENVIRONMENT: GDPS/PPRC
The ReIPL Standard Action is a combination of the Stop and Activate, IPL, or Load Standard
Actions. It will do an orderly shutdown of the system as described for the STOP Standard Action
and then, immediately after Reset, it will do the Activate, IPL, or Load Standard Action. What
Standard Action to use for restarting the system is defined in GEOPLEX OPTIONS keyword
RIPLOPT.
See above for the Stop and Activate, IPL, or Load actions.
ENVIRONMENT: GDPS/PPRC
The Deactivate Standard Action is identical to the deactivate performed from the HMC. It
performs a Deactivate of the current LPAR for the system. The current LPAR is displayed in the
LPAR field on the GDPS panel.
The Reset Standard Action is identical to the reset clear performed from the HMC. It terminates
any current operations and clears any interruption conditions for the selected image. A reset clear
clears main storage and all registers. Normally, in a GDPS environment, one will not need to use
the Reset Standard Action, because whenever a Reset is needed it will be performed by GDPS for
example as the last step of the Stop Standard Action.
ENVIRONMENT: GDPS/PPRC
The IType Standard Action is used to set a system’s IPL Type to either NORMAL,
ABNORMAL, or a user defined value (4 or 5 characters).
The IPL Type of a system is used to select the LPAR and load profile for a system. Systems are
defined to GDPS in the GEOPLEX DOMAINS automation policy. For each system specified a
primary and optionally an alternate lpar id must be entered. When IPL Type is NORMAL the
primary lpar-id will be used and when it is ABNORMAL the alternate lpar-id will be used. If one
sets IPL Type to anything else the first four characters will be used as the lpar-id.
The load profile for a system is used when the IPL Standard Action is performed. The name of the
load profile is the concatenation of the current lpar-id (primary or alternate) and the system name.
If IPL Type is a user defined value the load profile name will be that value concatenated with the
system name.
When one enters IT to the left of a system a new panel will be shown where one can enter the new
value for IPL Type and then enter YES to confirm the change.
For a change of the IPL Type to ABNORMAL, the IPLtype column on the GDPS panel is
changed to ABNORMAL and the LPAR column on the GDPS panel is changed to the name of
the ‘abnormal’ LPAR.
For a change of the IPL Type to NORMAL, the IPLtype column on the GDPS panel is changed
to NORMAL and the LPAR column on the GDPS panel is changed to the name of the ‘normal’
LPAR.
ENVIRONMENT: GDPS/PPRC
The IMode Standard Action sets a system's IPL Mode to any user-defined value, which can be
passed to a non-SA390 automation product at IPL time to define the type of start to be
performed.
When one enters IM to the left of a system a new panel will be shown where one can enter the
new value for IPL Mode and then enter YES to confirm the change.
Note: Currently the panel states that the value of IPL Mode can be NORMAL, MAINT, or
BASIC. However, there is no such limitation; IPL Mode can take any value.
ENVIRONMENT: GDPS/PPRC
The IPL (abnormal) Standard Action is used when one wants to IPL a system in its alternate
LPAR.
Shut down the system, using the Stop Standard Action and wait until the system to be IPL’d
comes to the RESET status.
Set the IPL Type for the system being IPL’d to ABNORMAL using the IType Standard Action
and verify that the LPAR changes.
IPL the system using the IPL Standard Action.
Should there be a new IPL, planned or unplanned, it will be directed to the alternate LPAR as
long as IPL Type is ABNORMAL. Set the IPL Type for the system to NORMAL when one
wants to return to the primary LPAR and the next IPL will be directed to the primary LPAR.
Validation of the user defined actions discussed in the previous section consists of invoking the
defined actions and validating the proper execution of each. Proper execution is simply having
each user defined action (script) perform the expected tasks to acceptable completion.
The two types of user defined actions, CONTROL scripts and TAKEOVER scripts, are invoked
in different ways, but in either case, validation is demonstrated by successful completion based on
the content of the script.
To execute a ‘CONTROL script’ user defined action, select option 6 (Planned Actions) from the
GDPS main menu.
TAKEOVER actions can be viewed by selecting ‘View Definitions’ (option 9) from the GDPS
main menu.
An alternate approach to testing script functionality is to create the script initially as a control
script, to allow manually invoking it as described above, then changing it to a takeover script after
function testing has been performed.
This approach will not verify that the script will be invoked for a specific failure. In order to
accomplish this the failure that is expected to result in the takeover script option being presented
in the GEO112E /GEO113A message must be induced.
When a failure occurs, GDPS will analyze the situation and suggest a TAKEOVER action based
on the results of the analysis and the TAKEOVERs defined to GDPS. When GDPS detects a
failure it will always issue a prompt and ask what TAKEOVER action is to be invoked. There is a
naming standard for TAKEOVERs and the following can be defined where n is 1 or 2 for Site1 or
Site2:
ALLSITEn switches systems and primary DASD from site n to the other site
DASDSITEn switches primary DASD from site n to the other site
SYSSITEn actions performed when systems in site n fails
SYSsysname actions performed when system "sysname" fails
user defined will be presented in all TAKEOVER prompts
If “user defined” TAKEOVERs are defined they will be presented in all prompts. All
TAKEOVER prompts will have a NO selection. Replying NO terminates the takeover without any
GDPS action.
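As an illustration only (the system names are hypothetical): in a two-site configuration with production systems MVS1 in Site1 and MVS2 in Site2, the takeover scripts ALLSITE1, ALLSITE2, DASDSITE1, DASDSITE2, SYSSITE1, SYSSITE2, SYSMVS1 and SYSMVS2 could be defined, and GDPS would offer the applicable subset of these names in the takeover prompt.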
In ALLSITEn and DASDSITEn n will be 1 or 2 depending on what site currently has the primary
DASD. GDPS is aware of what site currently has the primary DASD. In SYSSITEn n will be the
site of the failing system. The site for a system is the site defined in GEOPLEX DOMAINS. That
is, a system defined in a SITE1 statement is always a Site1 system even if it has moved as the
result of a planned or unplanned action. There is no awareness in GDPS of systems moving
between sites because of a planned or unplanned action.
When a problem is detected by GDPS, a prompt will be issued and applicable options from the
above list will be offered. In addition, REPEAT will be offered, and if you reply REPEAT, GDPS
will repeat its failure analysis and additional selections may be offered. This is intended to cover
rolling disaster situations where the first prompt may indicate only the first of a series of failures.
If a series of failures occurs, GDPS can only handle the first one, and when the first takeover is
initiated and the prompt is issued, new takeovers are disabled. Should this type of situation be
suspected, reply REPEAT, and GDPS will repeat its analysis and reissue the prompt possibly with
additional selections.
NOTE: For DASD problems, ALLSITEn and DASDSITEn will only be suggested if mirroring
status is FRZ. For system problems, ALLSITEn will only be available if mirroring status is OK.
That is, ALLSITEn is only offered when mirroring status is OK or FRZ.
If a system in the site of the primary DASD fails, GDPS should suggest the SYSSITEn,
SYSsysname, and ALLSITEn TAKEOVERs where n is the site of the primary DASD. Note
again that ALLSITEn will only be available if mirroring status is OK. When a system in the site of
the secondary DASD fails, GDPS should suggest SYSsysname and SYSSITEn where n is the site
of the failing system.
When the controlling system fails, the next system in the master list will take the master
(controlling) function and GDPS would suggest the SYSSITEn and SYSsysname TAKEOVER
actions where n is the site of the controlling system.
It is important to note that there is an awareness in GDPS as to where the primary DASD is and
this means that GDPS can handle DASD problems regardless of where the primary DASD is.
However, there is a requirement that all “freeze device pairs” are full duplex for a DASD switch
to occur.
There is no equivalent awareness of where a system is running. GDPS will keep track of what
LPAR a system is currently running in and if there is an IPL it will be directed to that partition.
However, if there is a failure situation, all TAKEOVER selections will be based on the
assumption that systems are running in the site defined in GEOPLEX DOMAINS.
ENVIRONMENT: GDPS/PPRC
In Appendix C, there is a sample REXX program which can be used to verify that the PPRC
freeze function (CGROUP FREEZE/RUN) provides a consistent set of secondary PPRC
volumes.
A point of clarification: this REXX program tool can be used to verify that a disk subsystem
supports ELB/CGROUP when manually triggered by inducing a failure (e.g., suspending a PPRC
volume pair), but this is only part of the freeze function.
Please note that one cannot check the consistency and long busy by issuing a suspend
command. To test extended long-busy one must create an error that causes the device to go
into long-busy when an IEA491E condition is detected. Those conditions are: 1) a primary CU
failure, such as subsystem cache or NVS failing or being set off; 2) the last link gone between
the primary CU and the secondary CU; 3) a secondary CU failure, or subsystem cache, NVS,
DFW or device cache set off; 4) a secondary device failure. The simplest way to accomplish
this is to create the last-link failure.
To do this, go to a TSO session and execute the CESTPATH command to establish one path
to an invalid port ID. This will replace all of the valid paths with the invalid path. The next
write I/O to any primary device will cause that primary device to detect the error (no paths
available) and produce both an IEA494I and an IEA491E message. Either one of these
messages will cause the GDPS automation to do a FREEZE. The key in testing the long-busy
function is to produce an IEA491E message. A device should not go into long-busy unless there
is an IEA491E condition or a FREEZE command. An IEA491E condition puts only the current
device into long-busy. The FREEZE command will put into long-busy all PPRC devices that are
part of the CU pair that the command was issued to, and should not affect any PPRC devices
outside of that group.
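The following is only an illustrative sketch of such a command, using the sample SSID pair 1800/8800 from this guide; the device number, the serial number placeholders (pser and sser), the deliberately invalid LINK value and the CGROUP setting are all assumptions, and the exact CESTPATH operands and link-address format depend on the disk subsystem and DFSMS level (see the DFSMS Advanced Copy Services documentation):
CESTPATH DEVN(X'1220') PRIM(X'1800' pser) SEC(X'8800' sser) LINK(X'00FFB000') CGROUP(YES)
After the long-busy test is complete, remember to re-establish the valid paths (for example, by issuing CESTPATH again with the correct link values from the GEOPLEX LINKS definitions) before resynchronizing the suspended pairs.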
The process used to maintain the GDPS - PPRC environment is a two-step process. First, one
must update the control file and notify GDPS of the changes using the CONFIG function. Step 2
is to use the GDPS Option 1 panels to establish the PPRC environment.
The general maintenance approach would be to modify the GDPS - PPRC control file pointed to
by the NetView GEOPARM DD statement. Once the changes have been completed, GDPS needs
to be notified of the changes. This is accomplished by using OPTION C (Config) from the GDPS
Main Menu. This forces GDPS to open the control file and overlays any previous configuration
with the new configuration. Please note that this affects only what GDPS can manage. The actual
PPRC environment has not been impacted until GDPS initiates actions such as establishing links,
new pairs, etc... Once the control file updates have been introduced to GDPS, then GDPS Option
1 panels are used to establish the new environment.
An example will help to clarify the processes. Let’s assume the following changes are necessary:
1) Add devices (1228 - 122F) to one of the SSID pairs already established, 2) remove the freeze
group for the same SSID pair, and finally 3) add a new SSID pair with the first 16 devices being
mirrored. For the purposes of our example, the following definitions existed for the SSID pair to be
modified:
GEOPLEX LINKS
...
...
SITE1='1800,8800,Y,N,0000B000,0051B400'
SITE2='8800,1800,Y,N,0001C600,0011C200'
SITE1PDAS='1800,8800,Y,N,=SITE2’
SITE2PDAS='8800,1800,Y,N,=SITE1’
GEOPLEX MIRROR
...
...
PPRCSSID='1800,8800'
PPRC='1220,9220,08,N'
...
GEOPLEX NONSHARE
...
...
NONSHARE=’123F’
NONSHARE=’923F’
With the first two changes applied (the device range extended to 16 devices and the freeze group
removed for the SSID pair), the definitions for this SSID pair become:
GEOPLEX LINKS
...
...
SITE1='1800,8800,N,N,0000B000,0051B400'
SITE2='8800,1800,N,N,0001C600,0011C200'
SITE1PDAS='1800,8800,N,N,=SITE2’
SITE2PDAS=’8800,1800,N,N,=SITE1’
GEOPLEX MIRROR
...
...
PPRCSSID='1800,8800'
PPRC='1220,9220,16,N'
...
GEOPLEX NONSHARE
...
...
NONSHARE=’123F’
NONSHARE=’923F’
The following statements were added to the control file to define the new SSID pair, shown here
together with the previously modified definitions:
GEOPLEX LINKS
...
...
SITE1='1800,8800,N,N,0000B000,0051B400'
SITE1='1860,8860,Y,N,0040B000,0050B400'
SITE2='8800,1800,N,N,0001C600,0011C200'
SITE2='8860,1860,Y,N,0001C600,0011C200'
SITE1PDAS='1800,8800,N,N,=SITE2’
SITE1PDAS='1860,8860,Y,N,=SITE2’
SITE2PDAS='8800,1800,N,N,=SITE1’
SITE2PDAS=’8860,1860,Y,N,=SITE1’
GEOPLEX MIRROR
...
...
PPRCSSID='1800,8800'
PPRC='1220,9220,16,N'
PPRCSSID='1860,8860'
PPRC='4220,A220,60,N'
...
GEOPLEX NONSHARE
...
...
NONSHARE=’123F’
NONSHARE=’223F’
NONSHARE=’423F’
NONSHARE=’A23F’
Once these changes have been entered into the control file, from the GDPS Main Menu, select
OPTION C (Config) from the GDPS Controlling System.
When the CONFIG command is issued GDPS will post the warning:
You are requesting to establish your PPRC-pairs for SSID pair xxxx - xxxx FREEZE=x. This
can be a longrunning task depending on what DASD controllers you have installed.
This is to remind you that you are starting a long running process. Confirm with YES to continue.
NOTE 1: When option C - Config is selected, GDPS will do CQUERYs to all devices and it will
take some time (up to a second per device pair) until it completes. During this time option 1 -
DASD Remote Copy will give the message “Config in progress. Wait & retry” until Config is
complete. The Latest Config field on the GDPS main panel will be updated when Config is
complete. When Config is complete in the master system, it will send a message to the other
systems that they should get the new configuration. This will also take some time and during that
time option 1 - DASD Remote Copy will give the message “Retrieving DASD info. Wait &
retry.” When this processing has finished, the Latest Config field will show the same value on all
systems.
NOTE 3: During Config processing CQUERY commands are issued to all devices in the
configuration (primary, secondary, and nonshare devices). The CQUERY is an I/O operation and
if any other system has issued a RESERVE to any of the devices, Config processing will be
delayed or hung up. You should avoid doing Config during times with reserve activity.
If Config processing detects errors, you will get messages describing the problem, and the old
config information will be reloaded.
NOTE 4: At the end of Config processing a background task will be started which will get the
volser information by issuing a "DS QD;SSID=ALL" command to get all the volsers. There will
be a delay from Config complete until the volsers are present and shown on the panels. Also, if
WTO buffers are filled, there is a risk that only a subset of the volsers will be picked up, but you
should be able to find that out from the syslog. That is, you may need to increase the number of
WTO buffers since the DS QD command will display all DASD devices defined/connected to the
system. It is recommended that the MLIM parameter in CONSOLxx is set to a value that is
at least 20% more than the number of DASD devices defined to the system.
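As a minimal illustrative sketch (the numbers are assumptions for a system with roughly 2,000 DASD devices defined), the INIT statement in CONSOLxx might then specify:
INIT MLIM(2400)
Verify the CONSOLxx syntax and the MLIM limits against the MVS Initialization and Tuning Reference for your level of OS/390.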
NOTE 5: Config processing can disable FREEZE. Whenever the remote copy status is
NOK, freeze is disabled by GDPS. If a config adds pairs that are not in duplex when the
config has completed, remote copy status will be NOK and FREEZE will be disabled. The
exposure can be minimized by careful planning. During a config remote copy status is set to
NULL.
Volsers are changed very seldom, so they are picked up at Config time. If you change the volser
of a volume, you can get GDPS to update by using the T command to the device. The T
command issues a DS P command to the device and if the saved volser is different from the one
returned from DS, GDPS will update the saved volser.
From the GDPS main menu, select Option 1 (DASD Remote Copy). The SSID pair panel will be
displayed and one would proceed with the normal panel processes to complete the establishing of
the links and volume pairs.
ENVIRONMENT: GDPS
To add systems to the GDPS environment they must be defined to both AO Manager and SA/390.
AO Manager is beyond the scope of this document. If the required AO Manager expertise is not
available locally, contact IBM Global Services to arrange for tailoring AO Manager.
To add systems to the GDPS environment the DOMAIN definitions in the UET panels of SA/390
are invoked and the Domain statement is modified to include the new systems. See the section
titled ‘Using SA/390 Dialogues to Create the GDPS Policy’.
Once the policy has been modified, execute the ‘build’ function to create a new Automation
Control File (ACF). Ensure that this ACF is included in the DSIPARM concatenation. Then use
the SA for OS/390 command, ACF, to activate (load) the new ACF or restart NetView.
ENVIRONMENT: GDPS
To change system configuration in the GDPS environment the DOMAIN definitions in the UET
panels of SA/390 are invoked and the Domain statement is modified to reflect the changes. See
the section titled ‘Using SA/390 Dialogues to Create the GDPS Policy’.
Once the policy has been modified, execute the ‘build’ function to create a new Automation
Control File (ACF). Ensure that this ACF is included in the DSIPARM concatenation. Then use
the SA for OS/390 command, ACF, to activate (load) the new ACF or restart NetView.
To add control or takeover scripts to the GDPS environment the definitions in the UET panels of
SA/390 are invoked and additional scripts are added.
SYSPLEX='CDS SITE1'
DASD='STOP SECONDARY'
Enter ‘99’ at COMMAND to select User E-T pairs and press ENTER.
Finally do a SA OS/390 BUILD for the new config and activate the new policy in all your
systems. Issue the command ACF MEMBER=ACFZ99x,SAVE. Be careful to use the correct
member in each system.
To modify control or takeover scripts in the GDPS environment the definitions in the UET panels
of SA/390 are invoked and the existing script is edited. See the section titled ‘User Defined
Actions’.
Once the policy has been modified, execute the ‘build’ function to create a new Automation
Control File (ACF). Ensure that this ACF is included in the DSIPARM concatenation. Then use
the SA for OS/390 command, ACF, to activate (load) the new ACF or restart NetView.
If scripts are deleted, they will be remembered by NetView until NetView is restarted, so the way
to have old scripts cleared is to restart NetView.
When implementing a new release or FIXPAK for GDPS follow standard SMP procedures for
your installation.
Release 2 of GDPS contains some new definitions, and some definitions have been removed.
Ÿ 2 new messages in the message table
Ÿ 1 new auto-operator in SA/390 NetView
Ÿ 1 new monitor function
Ÿ Changed monitor defaults
Ÿ Installation sequence
Ÿ Removed GEOPLEX OPTIONS definitions
Ÿ Redesigned SDF-panels
Ÿ New command for querying SYSPLEX status
Ÿ KATBAT support removed
Ÿ Management of mirrored DASD.
IXC256A, IGD706D and IGD703D have been added to the automation in V2R2M0. IOS002A
was added in V2R1M0 FIXPAK 9. Make sure that those messages are defined with
AUTO(YES) in the MPF list (MPFLSTxx) in your system(s).
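A minimal sketch of the corresponding MPFLSTxx entries is shown below; the SUP(NO) setting and the exact entry format are assumptions, so verify them against the MVS Initialization and Tuning Reference for your release:
IOS002A,SUP(NO),AUTO(YES)
IXC256A,SUP(NO),AUTO(YES)
IGD703D,SUP(NO),AUTO(YES)
IGD706D,SUP(NO),AUTO(YES)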
Subsection 4.6.1.2 New Auto-Operator
GEOOPER3 has been added to the GDPS definitions. The update is made in DSIOPFGP in
SAMPLIB. This new operator will run all DASD-monitoring. If the operator is not defined,
GEOOPER2 will be used and that may delay initiation of a takeover. Takeovers will always use
GEOOPER2.
Subsection 4.6.1.3 New Monitor defaults
The MONITOR2 default is 01:00 (1 AM) every night in the master system. The Release 1 default
was every hour in all systems.
The new release has to be installed on the Controlling system first. After that the order of
installation is of no importance.
The MSGPREFIX, FUNCTION and PPRC options have been removed. These options will create a
warning message if they still exist.
The message prefix for GDPS is GEO. FUNCTION and PPRC will always be enabled. If
disabling of GDPS is desired, the automation can be turned off using the User Interface (choice
number 8).
Support for PF08 (next page) on the GDPS status panel has been added. The change is
reflected in PGEOPLEX, PGEOPL2 and PGEOPL3 in DSIPARM.
A new command has been added to query sysplex resources using the IXCQUERY macro. If the
command is installed, GDPS will use it. The benefits are faster response and less output in SYSLOG.
The command processor and the program for the TSO-to-NetView API have been deleted from
GDPS. The update is reflected in DSIDMNGP in SAMPLIB. Remove the subsystem definitions
for KATBAT in SA/390.
ACTIVATE AOMOBJECT=oooo
UNDO AOMOBJECT=oooo
CONFIGON sysname
GEO034I
GEO035I
GEO036I
GEO045I
GEO170I
GEO171W
GEO172E
GEO161I
GEO162W
GEO163W
GEO164E
GEO165A
GEO166I
GEO166W
GEO711I
GEO126I
GEO127W
GEO128W
GEO083W
GEO084I
GEO150I
GEO151I
GEO152E
GEO153E
GEO154W
GEO155I
GEO190I
GEO191I
GEO192W
GEO193W
See message table for details
4.6.4.1 NetView
Release 5 of GDPS contains some changes and new definitions that require the following updates.
Ÿ SGDPPARM member names have been changed requiring updates to NetView DSIPARM
'%INCLUDE' statements and SA/390 SDF definitions.
Ÿ Define system names instead of NetView domain IDs in the SA/390 GEOPLEX DOMAINS table
and in the GEOPLEX OPTIONS Master list (recommended update).
Ÿ When message table HealthCheck support is used, WTOs are issued with routcode=11. This has
to be considered if the installation's own automation ignores messages with routcode=11 for
the same message IDs.
Ÿ The DFSMS APAR OW43316 is required for GDPS/XRC and RCMF/XRC. Please make
sure that the applicable PTF is installed before starting GDPS/XRC or RCMF/XRC.
The process to migrate to the full GDPS environment from a previously installed RCMF subset
environment is basically the same as implementing GDPS for the first time and should be
implemented and tested in a test environment before migrating to the production environment.
The main difference is that this time, some of the work and some of the automation definitions
have already been completed.
In the case of moving from RCMF, AO Manager and SA/390 will be required; if not previously
installed, they must be installed before actually attempting to implement GDPS.
Each of the tasks listed in Section 2.1 should be reviewed and completed if necessary before
proceeding.
One will then need to go through each of the steps for a complete GDPS install. Access to the
new GDPS code and panels can be accomplished as described in Section 4.6 (Migration from
previous releases of GDPS).
All messages appearing in the table below have the format GEOnnnX message-text, where:
nnn Three-digit message number
X One of the following severity codes:
Ÿ A - Action required message. The actions identified need to be performed
Ÿ E - Error message. The error detected needs your action
Ÿ W - Warning
Ÿ I - Information only
Ÿ blank - Information only
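For example, GEO019W in the table below is warning message number 019.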
Explanation: The monitor interval has detected that there are no systems active in
the secondary site. There is also a GEO112E/GEO113A message issued in this
situation.
Explanation: The monitor interval has detected that there are no systems active in
the primary site. There is also a GEO112E/GEO113A message issued in this
situation.
Explanation: Operator has responded YES to a takeover prompt and the takeover
is started.
Explanation: The API-interface for PPRC is not active. Takeover scripts and
user-defined actions are not allowed to start.
Action: Make sure that the GDPS PPRC-API is active and retry the operation.
GEO018I Automation is ON.
Explanation: GDPS monitoring detected that the automation for this system is ON.
GEO019W Automation is OFF.
Explanation: GDPS monitoring detected that the automation for this system is
OFF. Automated Functions on this system will not execute.
Action: Reply YES to continue the script, NO to terminate the script, or REPEAT to
see the cause of the problem again.
GEO026W Freeze not enabled due to P/DAS active
Action: To have the system enabled for freeze use ‘dasd-mirroring’ to get back to
normal PPRC-operation.
GEO027W Freeze not enabled PPRC environment not initialized.
Explanation: Monitor function detected that the PPRC environment was not
initialized. In case of a Freeze situation, a freeze will not be executed.
Action: Check NetView log to determine the reason for this. In case of a new PPRC
configuration being created, this condition will be cleared automatically when the
PPRC-configuration is created.
GEO028W Freeze not enabled due to FREEZE already done
Explanation: Monitor function detected that FREEZE has been executed. In case
of a freeze situation, freeze will not be executed.
Action: To enable freeze again, establish paths and the freeze function will be
enabled again.
GEO033W sysname XCF-communication lost
Action: Determine why the communication was lost. If NetView on a system was
stopped, GDPS can not talk to that system. Start NetView to clear this condition.
GEO034I TAKEOVER IS NOW SERIALIZED
GEO035I PLANNED OR UNPLANNED ACTION IS NOW SERIALIZED
GEO036I Serialization of Takeover, Planned or Standard Action ended
GEO043A ID=token Waiting for shutdown of system (sysname)
Action: When procedure xxxxxx has completed, GDPS will respond to the
WTOR
GEO045A Operator Assist Reply OK NOK --text--
Action: Respond to the WTOR when the action described in the text is
performed.
GEO046I STOP SYSTEM sysname TOKEN=token
Explanation: GDPS has recognized a STOP for a system that has another
automation package. GDPS will wait for an automated response from that
application. Reply from that application can be OK to continue or NOK to
terminate.
GEO047A Text
Explanation: User defined text from the SA/390 Policy is issued when a
system is not found in the site table.
Explanation: GDPS failed to stop system ‘sysname’. Respond YES to continue the
script or NO to terminate the script
Explanation: Monitor process detected that only one couple data set of the function
‘cdstype’ was Active. This alert will disappear when two couple data sets are in use.
Explanation: GDPS monitor process detected that the installation defined couple
data set usage for function ‘cdstype’ was not used. This message will disappear
when the installation defined defaults are in use.
GEO053I SETXCF FORCE,STR,STRNAME=strname
GDPS CFRECOVER wants to execute the command shown in the message.
A verification message (GEO054A) will be issued before executing the
command. This message will only be shown if AOM-verification is set to
yes(main menu, choice 8)
GEO054A REPLY ‘ACCEPT’ to execute SETXCF-commands, ‘CANCEL’ to
bypass.
GDPS requires a verification to execute the command shown in the previous
GEO053I message. Reply ACCEPT to execute the commands. CANCEL to
skip execution of the commands. This verification will only occur when
AOM-verification is set to YES (main menu, choice 8)
GEO061W CF cfname NOT WORKING
Explanation: Monitor process detected that Coupling Facility ‘cfname’ was not
active
This message will disappear when the coupling facility is activated.
Action: Determine the reason and make changes to CF-configuration using choice
7 (Sysplex Resource Management)
GEO062W DEFAULT CFRM POLICY NOT USED
Explanation: Monitor process detected that the default CFRM policy was not
active.
This message will disappear when default policy is active
Action: Determine the reason and make changes to CF-configuration using choice
7 (Sysplex Resource Management)
GEO070E GDPS INITIALIZATION FAILED: reason
Explanation: GDPS has detected error in the input parameters. Reason describes
the error.
Explanation: GDPS has detected a program error. The NetView log contains the error.
Explanation: The CGRPLB values and FREEZE specifications match for all
SSID pairs.
GEO090A action sysname P L object-id [loadprofile loadaddr loadparm ] .
Action: None
GEO092I VPCEAOMB ind IS THE DUPLICATE VOLSER INDICATOR.
Action: Verify that AO Mgr is active and has the GDPS application running
Explanation: GDPS did not find any AOM-connected system which is active in its
normal partition and manual interaction is needed to complete the required HMC
operation.
Action: Perform the requested action on the HMC and then reply OK.
GEO100I MIRRORING IS ACTIVE
Explanation: Outstanding alert(s) has been reset to normal because the monitoring
process detected that all conditions are normal.
Action: Use the Automation ON/OFF (choice 8) to enable the Automation indicator
Action:
GEO112E GDPS Takeover prompt
STATUS:
CF(s) IN SITEn (OK) or
CF(s) IN SITEn (NOK) or
CF(s) IN SITEn (NOT DEFINED)
OPTIONS:
‘SYSMVS2’ TO EXECUTE takeoverscript FOR
Comm=RE-IPL OF MVS2
SYSSITE1 NOT FOUND ALLSITE1 NOT POSSIBLE
Explanation: DASD Monitor process has detected that some PPRC-links which are
defined in the GDPS policy are not active. Verify on GDPS panel MIRRORING
which link(s) are not operational.
Explanation: DASD Monitor function was invoked but the GDPS PPRC
environment was not Initialized. The alert will disappear when the
PPRC-environment has been initialized.
GEO117E Unit is off-line hh????
Explanation:
GEO118A Reply ACCEPT, CANCEL or REPEAT(question)'
Explanation: This message appears when a takeover or planned action fails. User
has a choice to continue, (after fixing the problem manually) or terminate the
takeover / planned action.
Explanation: DASD monitoring was scheduled but an indicator was set to disable
DASD monitoring.
Action: None. Monitoring will be resumed when DASD problems are solved
Explanation: DASD monitoring was scheduled after a freeze had been executed.
DASD monitoring will not be executed until PPRC resynchronization complete.
GEO123I FREEZING Secondary volumes
Explanation: DASD monitoring detected that P/DAS was active. This condition will
inhibit FREEZE if there is a freeze condition. This alert will disappear when P/DAS
is no longer active.
GEO125I WAITING FOR devn TO GET ONLINE.
Explanation: Health Check (Monitor 3) detected that all primary DASD devices are
on-line to the controlling system ‘sysname’.
GEO127W MORE THAN 10 DEVICES OFFLINE/ALLOCATED IN CONTROLLING SYSTEM
sysname
Explanation: Health Check (Monitor3) detected that more than 10 devices are
either offline or allocated from the controlling system ‘sysname’
Action: If the devices are offline change the status of the devices to online. If
devices are allocated determine why and take action to remove the allocation and
have the devices online.
GEO128W DEVICE devn IS status, IN CONTROLLING SYSTEM sysname
Explanation: During initialization GDPS will request dynamic information from the
current MASTER system. This message tells when this function starts/ends.
GEO133 TAKING MASTER FUNCTION
Explanation: This system is placed in front of all other active systems in GDPS and
will now take the role as MASTER/Controlling system.
GEO134 COULD NOT GET INFORMATION FROM MASTER, USING LOCAL VARIABLES.
Explanation: For some reason, this system could not copy information from the current
MASTER system. Recycle NetView.
GEO135 CHECK MASTER FUNCTION.
Explanation: GDPS has detected that the DASD configuration has changed and
retrieves the changed date from the Controlling system.
GEO140I RESTORE OF DASD CONFIG started/ended
Explanation: GDPS detected that the PPRC-configuration has not changed and
reloads its saved configuration.
GEO141I SYSTEM(sysname) NOT FOUND IN GDPS SITE-TABLE.
Explanation: A request was made to retrieve the IPLMODE value for a system.
This request was attempted on a system where GDPS was not defined.
GEO142I DOMAINID(domainid) DID NOT HAVE GDPS ACTIVE.
Explanation: A request was made to GDPS to retrieve IPLMODE. GDPS was not
active on that system.
GEO143 COULD NOT FIND commandprocessor
Action: Check that GDPS command processors are defined in DSIDMN. After
corrections NetView has to be recycled
GEO144 COULD NOT DETERMINE STATUS OF commandprocessor
Action: Check that GDPS command processors are defined in DSIDMN. After
corrections NetView has to be recycled
GEO145W COULD NOT FIND GEOXCFQ
Action: Check that GDPS command processor has INIT=Y on its definition in
DSIDMN
GEO147W COULD NOT DETERMINE IF GEOXCFQ WAS INSTALLED
Action: Reply YES for real CBU activation, TEST for test CBU activation,
or NO for no CBU activation
GEO166I n CPUs Configured Online by GDPS
Action: Check definitions for GDPS CBU actions, and if needed manually
configure CPs online
GEO170I ALL CONTROLLING SYSTEMS ARE ACTIVE
Action: Reply CONFIG or 'name' of the NCP that will be used as active NCP.
GEO201 Activate of node nnnnnnnn NOT OK.
Action:
GEO204I System-type = PRI, full activation of VTAM nodes.
Action:
GEO209I Site-Switch of NCP, nnnnnnnn used as primary.
Explanation:
Action:
GEO213A Are you sure? Reply YES/NO
Action:
GEO214 Active NCP (nnnnn) has severe problems, reply on MCS console.
Action:
GEO220I Network synchronization done.
A script or planned action event stopped and needs manual action before
continuing with the next step. A WTOR will be issued in SYSLOG. Follow
the instructions in the WTOR.
GEO314W No more Flash device errors will be shown, limit passed.
Action: Consider defining FlashCopy pairs for all devices on this SSID defined to
GDPS.
GEO400I SYSTEM sysn JOINED THE SYSPLEX
Explanation: A system has joined the sysplex and GDPS will verify if this system is
part of GDPS
Explanation: GDPS updates SDF with this text when a GDPS system joins GDPS.
This alert will be removed when the PPRC-API function has been tested.
GEO402I PPRC NOT YET EXAMINED
Explanation: GDPS has not yet examined the PPRC-devices. This message is
written to SDF and will be removed when verification of the PPRC configuration has
been executed.
GEO403I PPRC API IS ACTIVE
Explanation: GDPS has tested the PPRC-API and found it working. This line is
displayed on SDF.
GEO404I PPRC API is not active
Explanation: GDPS has tested the PPRC-API and found that it does not work. This
message is displayed on SDF.
GEO405I PPRC is not examined yet
Explanation: GDPS has not yet examined the PPRC-devices. This message is
written to SDF and will be removed when the API is found working
GEO500I 'domainid' INITIATES FREEZE
Explanation: GDPS has detected a freeze condition and is now initiating the
freeze-process.
GEO501I 'domainid' FREEZE DONE
Explanation: A GEOXCFQ command was issued with a device-name that did not
exist.
Explanation: GDPS logs that freeze was executed with nonzero return code
-pppp is the ssid of the primary control unit
-ssss is the ssid of the secondary control unit
-pser is the serial number of the primary control unit
-sser is the serial number of the secondary control unit
GEO902I FREEZE of pppp pser ssss sser DONE
Explanation: GDPS logs that freeze was executed with nonzero return code
-pppp is the ssid of the primary control unit
-ssss is the ssid of the secondary control unit
-pser is the serial number of the primary control unit
-sser is the serial number of the secondary control unit
[Figure: Sample 3990 configuration. Two 3990 Model 006 subsystems (serial numbers 0039788 and
0039988), each with CLUSTER 0 and CLUSTER 1, connected through ESCON directors (ESCD) to the
application system(s) and the recovery system; the original figure showed the SAIDs, addresses and
interface positions (A-H) used for the host and PPRC connections.]
[Figure: Sample RVA configuration. Two RVA Model T82 subsystems (serial numbers 1325007 and
1328006), each with CLUSTER 0 and CLUSTER 1, connected through ESCON directors (ESCD); the
original figure showed the SAIDs, addresses and interface positions (A-H) used for the host and PPRC
connections.]
The following file image will define the GDPS - PPRC environment documented in Section 3.2. It
is provided to allow one to see the completed definitions to match the sample environment.
ENVIRONMENT: GDPS
This sample REXX program tests secondary device consistency. This program will write odd
numbers into the "ODD" and even numbers into the "EVEN" data set. A time stamp is recorded
together with the even/odd numbers.
How to use:
A. Establish two unique SSID pairs (e.g., first pair from primary SSID (SSID1) to secondary
SSID (SSID3) and second pair from primary SSID (SSID2) to secondary SSID (SSID4)). Both
SSID pairs are in the Freeze Group.
B. Allocate one data set (specified by ODD) on one volume pair in the first SSID pair and the
other data set (specified by file definition EVEN) on the other volume pair in the second SSID
pair.
D. Create a PPRC failure for one of the SSID pairs, for example by pulling all PPRC links or
blocking all ESCON director ports. This will cause the primary SSID to detect a PPRC problem and
initiate a freeze. It is also possible to create a FREEZE by issuing a CSUSPEND to a "freeze
device" from TSO. Note, however, that causing the freeze this way will in fact only verify the
software part (the CGROUP FREEZE/RUN) of the freeze function. Also note that if you
CSUSPEND a device from TSO, you should not suspend one of the two devices in the test. The
reason is that when you suspend by command there will be no "long busy".
E. Check the last record content of the two datasets on the secondary devices. This table shows
the valid outcomes for the count values (sequence numbers) in the datasets:
ODD EVEN
n n+1
n n-1
The program performs a loop writing to the ODD and EVEN datasets and depending on where it
is when the freeze occurs, one of the two outcomes should be expected. The difference between
ODD and EVEN may never be more than 1.
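For example (values chosen only for illustration): if the last record in the ODD dataset has count 101, then an EVEN count of 100 or 102 is a valid outcome, while a value such as 98 or 104 would indicate that the secondary volumes are not consistent.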
Freeze processing involves doing the freeze and then disabling freeze processing. Each system will
do the freeze once but only the first one will actually cause the control units to do the freeze and
the rest will be no-ops. After the freeze the GDPS main panel will show the freeze time and
depending on timing the systems can show different times. You need to log on to all the GDPSs
and find the lowest time-stamp value and compare this with the time-stamps in the datasets on the
secondary devices.
Consistency is guaranteed when sequence numbers in the datasets on the suspended secondary
volumes match one of the rows in the table and the time-stamps in the datasets are lower than the
freeze time-stamps.
ENVIRONMENT: GDPS
/*REXX*/
/*---------------------------------------------------------------------------------*/
/* NETVIEW PROGRAM - WRITES ODD AND EVEN NUMBERS */
/*--------------------------------------------------------------------------------*/
PARSE SOURCE . . Id .
"ALLOC DA('GDPS.ODD') F(ODD) OLD"
"ALLOC DA('GDPS.EVEN') F(EVEN) OLD"
ARG N
IF DATATYPE(N,'W') THEN NOP
ELSE N=10000
COUNT=0
SAY 'GEO001U' Id 'PROGRAM STARTED'
DO I=1 TO N
/* GDPS.ODD IS ALLOCATED IN THE FIRST SSID PAIR */
COUNT=COUNT+1
HOUR=TIME('L')
PUSH COUNT HOUR
ADDRESS MVS "EXECIO 1 DISKW ODD"
ADDRESS MVS "EXECIO 0 DISKW ODD (FINIS"
/* remove ADDRESS MVS for TSO */
/* */
/* GDPS.EVEN IS ALLOCATED IN THE SECOND SSID PAIR */
COUNT=COUNT+1
HOUR=TIME('L')
PUSH COUNT HOUR
ADDRESS MVS "EXECIO 1 DISKW EVEN"
ADDRESS MVS "EXECIO 0 DISKW EVEN (FINIS"
/* remove ADDRESS MVS for TSO */
END
SAY 'GEO002U' Id 'PROGRAM ENDED'
"FREE DA(ODD)"
"FREE DA(EVEN)"
EXIT -5
/* Use NetView PIPE command to display records */
/* PIPE < "dsname" ! CONS */
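One hypothetical way to drive the test (the member name and count are illustrative): store the program as member GEOWRITE in a data set in the NetView DSICLD concatenation and issue GEOWRITE 20000 from a NetView command line while the PPRC failure is being induced; under TSO, remove the ADDRESS MVS prefixes as noted in the comments and run it as a normal REXX exec.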
In the following section, one will find information relating to implementation and setup of the
GDPS environment. This information has been produced from experiences with initial
installations.
ENVIRONMENT: GENERAL
§ 1) HMC
ü Be certain to create all the necessary HMC load profiles based on the naming convention
used: the first 4 characters are the AO Manager Object ID, followed by the system
name. Unique load profiles may be needed for unusual IPL requirements.
ü Ensure that everyone that uses the HMC understands the criticality of the HMC within the
GDPS environment and does not modify it or leave it in Single Object Operations mode.
Changes such as these are fatal to GDPS’s ability to manage the environment. GDPS,
through AO Manager, interacts with the HMC based on specific names and TCP/IP
parameters. If these are changed, communication fails. If the HMC is left in Single Object
Operations mode, AO Manager cannot communicate with the HMC and GDPS system
management functions are disabled.
Warning: HMC code replacement may destroy HMC objects and they should be
checked after a code upgrade.
Customer and CE education with regards to the critical path role of the HMC is essential.
Note: If AO Manager cannot communicate with the HMC, this will appear in the
SDF panels.
ü In addition to these considerations, the HMC functions must be duplexed to avoid a single
point of failure. This is typically achieved by having a backup AO Manager server at the
secondary site connected to a second HMC that has access to all processors (CECs) used
in the GDPS environment. Another level of redundancy can be achieved by having two
HMCs at each site connected to AO Manager.
§ 2) Consoles
ü Ensure that all consoles in the sysplex have unique names (which is required by Sysplex
and therefore should not be a new requirement).
ü Some GDPS functions have to be serialized and GDPS makes use of console names for
serialization. Currently the console names VPCEINIT and VPCECONT are being used.
Systems have to be set up to allow commands from these console names.
ü The ONLY acceptable MFORM parameter for AO Manager consoles is (T,S). Use route
code 1 for AOM consoles to minimize message traffic to AO Manager.
ü If the duplicate VOLSER resolution function of AO Manager is used the console attached
to AO Manager must be the NIP console for the system being IPL’d because the duplicate
VOLSER messages occur in the NIP part of the IPL sequence.
ü There may be a need for additional inter-site connectivity to provide master consoles at
both sites.
ü RTME should be set to ¼ second for AO Manager consoles. The minimum value that
can be set is ¼ of a second. Even when set at ¼ second, if more than four messages per
second are arriving, queuing will occur.
ü Ensure terminal access to GDPS/SA/NetView. One way to ensure that the operator has
terminal access while test scenarios run is to let one of the AO Manager 3270
connections be a VTAM connection, so that the operator can log on to NetView there.
§ 3) AO Manager
ü The mask value in the HMC TCP/IP addressing for inter HMC LAN communication needs
to be 255.255.255.255.
ü When NIP consoles are connected to AO Manager, the operator will need to look for
NIP messages via AO Manager instead of the operator's usual place.
§ 4) Sysplex
ü GDPS automates the placement of coupling facility structures when doing a site switch by
changing CFRM policies dynamically. To make this possible there needs to be multiple
CFRM policies. It is suggested that you have three different policies.
ü GDPS automates couple data set selection. To make this management possible there need
to be four pre-allocated couple data sets of each couple data set type being used in the
sysplex. The primary couple data set is to be in Site1, the alternate couple data set is to be
in Site2 and there needs to be a spare couple data set in each site. These data sets are
defined to GDPS using the Sysplex Resource Management Panel and GDPS automatically
issues the necessary commands to select the primary and alternate data sets in the
appropriate site based on the execution of a standard action or the execution of a properly
coded CONTROL or TAKEOVER script.
ü The Automatic Restart Manager (ARM) policy may be coded to perform some of the
same functions that are planned for GDPS management. A careful analysis of ARM
management must be undertaken and appropriate changes made in the ARM policy to
ensure that GDPS and ARM are not attempting to simultaneously resolve sysplex restart
issues.
ARM was designed for cloned applications that participate in sysplex data sharing. When
a system in a sysplex fails, ARM provides a fast restart for data sharing applications to 1)
release data sharing locks as soon as possible, 2) restore SNA sessions, specifically CICS
with VTAM Multi-mode persistent sessions, and 3) restore application capacity (e.g., if
there are two CICS AORs, when one is not available, 50% of capacity is lost).
1. In order for an application to use ARM's application restart automation, it must register with
ARM. The IBM software products that register with ARM are IRLM, DB2, IMS, VTAM, and
CICS. The restart automation policy for other subsystems that do not register with ARM (e.g.
DataCom, ....), would have to be specified in SA/390 policy.
2. SA/390 has the capability to recognize when an application defined to it is under ARM-restart
control. When defining the application's policy to SA/390, the ARM element name is specified. If
there is a RESTART automation exit for the application, then there is a special RESTART.ARM
automation flag that can be turned off to prevent the RESTART exit from executing. With these
facilities, SA/390 will not restart an application if it is under ARM-restart control.
3. Currently, GDPS does not stop ARM from doing restarts of an application that is under
ARM-restart control during a workload move. In a system failure scenario which could be part of
a site failure, ARM will immediately attempt to restart the failed system's applications that are
under ARM-restart control and 1) the restart will take place prior to the operator giving
permission to initiate a takeover and 2) the restart will fail since the DASD hasn't been switched
(the UCBs still point to the DASD at the failed site).
GDPS is based upon parallel sysplex and works in a sysplex environment. GDPS chose to use
SA/390 for all aspects of workload management (e.g., start, shutdown, restart), because ARM
only provides the restart function. ARM changes and/or GDPS changes would have to be made
for ARM and GDPS to work cooperatively for system failures.
It should be noted that the GDPS team does not know of any other customers currently using
ARM for application restart in a GDPS environment.
4. ARM policy is defined in the ARM Sysplex Couple Dataset with a Sysplex utility program. If
ARM is used for an application's restart-automation policy, then the restart-automation policy for
that application would be in ARM and the rest of the automation policy for that application and
other non-ARM applications would be in SA/390. This means that you couldn't go to a single
place to find out about and manage all the automation for an individual application as well as all
production applications. This could be a major systems management issue and exposure.
1. Applications registered with ARM that participate in data sharing will be restarted more
quickly in the event of a system failure. ARM is effectively a sysplex group member and receives
internal notification of problems and will be slightly faster compared to message-based
automation. The difference might be only a couple seconds which could be useful when an
application abends and needs to be restarted.
1. ARM was not designed to work in a GDPS environment. Changing GDPS to temporarily stop
ARM restarts until the DASD has been swapped and the operator gives permission to proceed
with the takeover, is not time-critical to the GDPS project, nor is it on the critical path.
2. All automation policy for production applications need only be specified in one place, that is
SA/390, thus simplifying systems management.
3. The GDPS team does not know of any coordination between ARM and AF/Oper.
The GDPS team recommends using SA/390 for all automation policy.
ü SFM policy integration consideration: The isolate time value will be honored and may
contribute to the perception that GDPS actions are slow. Evaluate the time chosen and use
the minimum time required to properly manage the sysplex.
ü Intra-GDPS communication occurs using XCF. The name of GDPS XCF group is
GEOPLEX.
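As a quick, illustrative check that this intra-GDPS XCF communication is in place (a standard MVS display command, not a GDPS function), the following can be entered on any system in the sysplex:
D XCF,GROUP,GEOPLEX
The response should list the members of the GEOPLEX XCF group; under normal conditions each system running GDPS should be represented.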
§ 5) DASD
ü At NIP time, volumes are varied online based on the settings in the IO generation and
connectivity to devices. In large disk subsystem environments planning must be done to
ensure that the IPL process is completed as quickly as possible especially during recovery.
Furthermore, careful analysis of online/off-line status for primary and alternate site DASD
should be done to avoid duplicate VOLSER problems. With this in mind, the following
recommendations exist for the GDPS environment.
ü For installations with more than 500 disk devices, it is recommended that all disk
devices are gen’d as online and one of the following techniques be used to ensure
connectivity to the appropriate devices:
ü It is recommended that all volumes be copied with PPRC except for page volumes, work
volumes and volumes containing couple data sets. This avoids the possibility of critical data being
inadvertently placed on a DASD device that has been left out of the scope of PPRC and
lost in the event of a system failure. Furthermore, it ensures when a problem occurs
causing a “Freeze” operation, the Controlling System will have access to the couple data
sets. Access to the couple data sets is critical for the Controlling System to survive and to
its ability to take appropriate action.
ü When one of the messages that will trigger a freeze (see member MSGGDPS in the GDPS
DSIPARM dataset) occurs, GDPS will check if the device number in the message is a
primary device defined to GDPS in the freeze group and only if that is the case will the
freeze be performed.
ü Important note: The IEA491E message is an exception because it does not contain any
device number, it only contains serial numbers. For this message GDPS will check if the
serial numbers match an SSID pair defined to GDPS with FREEZE=Y and if that is the
case, GDPS will do the freeze.
ü Note: In subsystems with logical CUs, there can be more than one SSID pair with
matching serial numbers. This can result in unnecessary/unwanted freeze occurrences. For
example if logical CUs are defined and used in test systems, action causing an IEA491E
message may cause a freeze in other (production) systems using the same physical control
units.
§ 6) Do not start GDPS testing until the environment is built and stable
ü It is common for projects to get behind and for installation steps to be accelerated or
attempted in parallel. It is important to avoid this decision when it comes to installing and
testing GDPS automation. If one attempts to automate a function that does not work
properly when manually executed, under automation it will be ‘automatically’ broken. The
perception of the observer/user is that automation is at fault. This perception is usually
inaccurate and difficult to overcome.
ü Further, standard actions should be thoroughly tested before user defined actions are
tested since, for the most part, user defined actions are script driven standard actions.
§ 7) Controlling System/s
ü Ensure that the controlling system/s is/are COMPLETELY isolated from the sysplex but
still a member, i.e., 1) they must have separate JES, Master Catalog, SMSPLEX,
RACF, LOGREC, SYSLOG and OPERLOG data sets, and 2) connectivity to the couple data
sets must be maintained.
ü The controlling system/s must have connectivity to all volumes (both primary and
secondary) managed by GDPS.
ü They must be able to survive in the absence of all other systems and all PPRC mirrored
DASD, and must be able to IPL under these same circumstances.
ü In the event that the Controlling System/s is/are down, another system will take the
controlling function. The selection is based on the sequence of entries in the MASTER list
coded at GDPS installation time. It is essential that ALL systems in the GDPS
environment be included in the MASTER list.
ü It is important to understand that a Controlling System due to its isolated environment can
initiate a DASD TAKEOVER action. Other systems which take on the controlling
function cannot initiate DASD TAKEOVER actions because they will be dependent on
DASD involved in the TAKEOVER operation itself.
ü Some GDPS users have asked for suggestions for having a backup controlling System at
Site1, to be used during the period of time that production systems are located at Site2.
The mechanics of having a ‘hot standby’ system to take over as the controlling system and
be resident at Site1 are fairly straightforward. An LPAR with an active MVS system,
doing no production work, having isolated DASD and subsystems, and defined in GDPS
as the second system eligible for being the controlling system, is what would be required.
Under these conditions the Site2 controlling system could be shut down and the Site1
controlling system would become the controlling system. It would have all the isolation
and survivability characteristics of the Site2 controlling system. There are system
maintenance considerations, but these are no more complex than normal sysplex planning
issues.
This will require setting the CONTROLLINGSYSTEMS keyword to allow for the
appropriate number of controlling systems.
§ 9) GDPS Monitoring Intervals
There are three monitors in GDPS, monitor1, monitor2, and monitor3. All deviations detected
by the monitors are reported through SDF.
ü The MONITOR1 Keyword defines the monitor interval for SYSPLEX monitoring and AO
Manager heartbeat function. Monitor1 will verify that all GDPS NetViews are active. If a
GDPS NetView is missing, monitor1 will check if it is GDPS/NetView only or the
system that is down. If the system is down, GDPS will schedule a TAKEOVER.
Monitor1 will also verify that couple datasets, coupling facilities, and the CFRM policy
are the ones defined to GDPS and that they are OK. (One defines this to GDPS on the
panels displayed when one selects GDPS option 7 - Sysplex Resource Management.) It
also verifies that AOM and the PPRC API (the GEOPPRC task) are functioning.
ü The MONITOR2 Keyword defines the monitor interval for PPRC device pair monitoring.
It is doing CQUERYs to all primary and secondary volumes defined to GDPS. The
format of the parameter is:
1) hh:mm:ss (EVERY and ALL are defaults in this format)
2) xxx,hh:mm:ss,yyyyyy where xxx can be AT or EVERY and yyyyyy can be ALL or
MASTER (when MASTER is used for MONITOR2 it actually means ‘controlling
system’).
Examples: 01:00:00 (run every hour in all systems); AT,01:00:00,MASTER (run once a day at
01:00 in the controlling system).
In the first format the time is an interval and the monitor will run in all systems with the
specified interval. In the second format, EVERY indicates that the time is an interval and
AT indicates that the monitor will run once a day at the specified time. The default (if
MONITOR2 is omitted) is AT,01:00:00,MASTER which means it will run in the
controlling system every night at one o’clock. It is recommended that monitor2 runs once
per day in the controlling system.
ü The MONITOR3 Keyword defines the monitor interval for checking utility devices and PPRC
links and for scheduling the AOM(HIGH/LOW) indicator program. The default is every two
hours. In addition, when monitor3 runs in the controlling system, it will verify that all
PPRC primary devices are online and not allocated. Monitor3 will also check that the
FREEZE specification for an SSID pair and the current value of CGRPLB match for every
SSID pair. If a mismatch is found there will be an SDF alert.
Note: The times in all monitors can be defined as hh:mm, that is seconds are not required.
While it is true that most DASD failures do not precede disasters, it is also true that it is
impossible to guarantee that any one such failure will not be a precursor to an actual
disaster event. The decision of which data to mirror from one site to another must be
closely followed by a decision on ‘how much data loss can be tolerated in a disaster
event?’ The answer to this question can range from absolutely no data loss will be
tolerated to some data loss can be tolerated.
GDPS offers several data recovery policy options which address this issue of how much
data loss can be tolerated in the event of a disaster. These policy options relate to events
which prohibit updates from being propagated to the secondary site. They are:
Ÿ Freeze and Go — GDPS will freeze the secondary copy of data when remote copy
processing suspends and the critical workload will continue to execute making updates
to the primary copy of data. However, these updates will not be on the secondary
DASD if there is a subsequent Site1 failure in which the primary copy of data is
damaged or destroyed. This is the recommended option for those enterprises that can
tolerate limited data loss or have established processes to recreate the data.
Ÿ Freeze and Stop — GDPS will freeze the secondary copy of data when remote copy
processing suspends and will quiesce the production systems resulting in the critical
workload being stopped and thereby preventing any data loss. This option may cause the
production systems executing the critical workload to be quiesced for transient events
that interrupt PPRC processing, thereby adversely impacting application availability.
Ÿ Freeze Conditional — GDPS inspects the reason for the suspension of remote copy
processing. If the suspension is caused by the storage subsystems that contain the
secondary copy of data, processing is the same as for Freeze and Go; otherwise processing is
the same as for Freeze and Stop. This is the recommended option for those enterprises
that cannot tolerate any data loss but require maximum availability.
ü Please be aware that in the RVA, the primary subsystems must have ESCON adapters
dedicated to PPRC while in the 3990 implementation, ESCON adapters may be shared
between host and PPRC traffic. Furthermore, when defining the PPRC links, be aware that
the ESCON interfaces at the secondary site used for PPRC traffic appear as host
interfaces. They can be shared with host initiated I/Os.
ü Also be aware that when establishing cross site links using ESCON directors, all ESCON
rules apply, i.e., only one dynamic connection is allowed in the path. If multiple directors
are used on a single path, then one director path may be dynamic, the other director path
must be a static connection.
ü Since the secondary subsystem interface appears as a host interface, it may not be used to
establish Site2 links while Site1 links are established through that interface.
ü All host connections must be removed before a link can be used as a PPRC primary link.
ü In order for PDAS to function in the GDPS environment there must be SITE1PDAS and /
or SITE2PDAS statements defined. These statements inform GDPS about PPRC links
that can be used during the PDAS operations. There must be paths available from the
secondary to the primary site for each SSID pair. There are a number of ways to
accomplish this: 1) Code SITE1PDAS or SITE2PDAS definitions using the links defined
in SITE2 and SITE1 statements respectively. This assumes that permanent links exist in
both directions. 2) Code SITE1PDAS and SITE2PDAS statements with
“USE_LAST_PRIMARY_LINK” operand to instruct GDPS to take the last link from the
SITE1 or SITE2 definition and to remove it from the primary to secondary SSID pair
connection and use this link for PDAS operation in the reverse direction. The advantage
to this approach is that the PDAS links do not have to be permanent. 3) Code
SITE1PDAS and SITE2PDAS definitions using unique links (differing from those
permanent links defined as SITE1 or SITE2 links). This also assumes that permanent links
exist in both directions.
ü Please be aware that SITE2 and SITE2PDAS statements are not valid for the RCMF
environment. If used, a message will be issued and the statement will be ignored.
ü In order to provide the best availability, recoverability, and the simplest environment to
maintain, IBM recommends uni-directional PPRC for the same execution environment
(i.e., all primary PPRC volumes must reside in the same site) to provide a consistent set of
secondary volumes that can be used for restart in the event of a disaster. If there is a
requirement to run a mixture of PPRC primary and secondary at the same site for the same
recoverable environment then the freeze and stop mode of operation must be selected to
ensure the recoverability when a complete site failure occurs.
ü In the event that WTO buffer shortages become a problem, SA for OS/390 has facilities
to assist. There are two parameters under the MVS Component Policy Object called
WTOBUF RECOVERY and WTOBUF AUTOMATION which deal with setting
thresholds and responding to buffer shortage conditions. For additional information on
these policy settings please refer to WTOBUF information in the ‘Systems Automation for
OS/390 Customization Manual’. The AOM-console has to have DEL=W (or DEL=R) set
so messages are never queued to that one.
The MVS system logger duplexes log writes to 2 of 3 possible locations for high availability: (1)
a dataspace in virtual memory; (2) the CF, using a list structure; and (3) a logger staging dataset.
Normally the log data would be duplexed between (1) and (2), but if the customer had a
volatile CF (i.e., log data would be lost in the event of a power failure) or the CF was on the
same processor as a sysplex system (i.e., log data would be lost in the event of a processor
failure), the system logger would duplex between (2) and (3). Note that whether the system
duplexes between (1) & (2) or between (2) & (3) is determined by the system on a logstream
(requestor) basis, but can be overridden by policy. Obviously, in the event of a disaster, data
contained in a CF list structure would be destroyed, and this would prevent a transaction server
from being able to clean up in-flight transactions and resolve in-doubt transactions. Meanwhile,
the DBMS data (including VSAM files) was updated and mirrored to the other site, so the secondary
copy of DBMS data contains outstanding transactions that cannot be cleaned up. GDPS has a couple
of options, using (2) & (3), to prevent log data loss, as described below:
Regarding option 2(b), since most customers will be changing their environments periodically,
changing freeze policy, etc., and may forget about changing their LOGR set-up, it is
recommended that the option "LOGR staging dataset(s) reside in Site1 and are PPRCed to
Site2" be used.
This set-up will be required for any system logger logstream that contains mission critical data
- exploiters of the system logger include: OPERLOG (syslog replacement which some may
view as critical); LOGREC - which probably is not viewed as critical; CTS - this is critical if
transactions need to be backed out, etc.; IMS Shared Message Queue - this is critical if
transactions need to be backed out; and Resource Recovery Services (aka new OS/390 synch
point manager across multiple database managers) - also viewed as critical.
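As a hedged sketch of one way to ensure such a critical logstream always has a staging data set
(so that its duplexed log data sits on DASD that can be PPRCed, rather than only in a dataspace
and the CF), the LOGR policy can be updated with the IXCMIAPU administrative data utility. The
job card, the logstream name, and the choice of UPDATE rather than DEFINE are illustrative
assumptions; placing the staging data sets on Site1 volumes that are mirrored to Site2 is a
matter of normal allocation/SMS policy and is not shown here:

//LOGRUPD  JOB (ACCT),'LOGR POLICY',CLASS=A,MSGCLASS=X
//* Force duplexing to a staging data set for a mission-critical logstream.
//UPDPOL   EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DATA TYPE(LOGR) REPORT(NO)
  UPDATE LOGSTREAM NAME(EXAMPLE.CRITICAL.LOG)
         STG_DUPLEX(YES)
         DUPLEXMODE(UNCOND)
/*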
In release 10 of OS/390 a Fence option was introduced for JES2 Spool. To mitigate the
impact on PPRC we recommend that SPOOLDEF FENCE=NO be specified to distribute the
write I/O.
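For reference, a minimal sketch of the corresponding JES2 initialization statement; the member
that holds the JES2 initialization deck varies by installation and any other SPOOLDEF operands
are left unchanged:

SPOOLDEF FENCE=NO

With FENCE=NO a job's spool space is not confined to a single spool volume, which spreads the
spool write I/O across volumes.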
In the following section, one will find information relating to GDPS Operations and usability.
This information has been included based on experiences with initial installations.
ENVIRONMENT: GENERAL
ü In the “RCMF” environment, the API indicator on the GDPS main menu will not appear
because in RCMF there is no monitoring. It's the monitoring that checks the status of the
API and updates the information.
ü OS/390 obtains an initial amount of SQA during IPL/NIP prior to processing the
IEASYSxx SQA=(x,y) parameter - the system reserves eight 64K blocks of virtual storage
for SQA and sixteen 64K blocks of virtual storage for extended SQA - which may result
in a wait state during IPL. It has been observed that adding a large number of new
devices and IPLing, or a large number of WTO/WTORs being issued during IPL/NIP, can
exhaust the initial SQA allocation. The latter has a direct impact on GDPS if an
enterprise has a large number of volumes with duplicate VOLSERs, resulting in the
system issuing an IEA213A NIP WTOR for each duplicate VOLSER. There are a couple of
options to circumvent this problem:
For OS/390 2.5 or earlier, the VSM NIP module IEAIPL04 must be zapped to increase the
initial SQA allocation - the following ZAP will probably work, depending on code level:
++ USERMOD (MSYS004) REWORK(00000002).
++ VER (Z038) FMID(HBB6606).
++ ZAP (IEAIPL04).
NAME IEAIPL04
VER 0F32 0012 NVESQA
REP 0F32 0024 INCREASE ESQA BY 12
However, if the ZAP does not go on, OS/390 VSM L3 should be contacted to get
a new zap.
For OS/390 2.6 or above, the new LOADxx INITSQA=(a,b) support may be used to
override the default initial SQA allocation without requiring a ZAP - this support
was shipped in OS/390 2.6 and documented via documentation APAR OW36793.
ü When one changes a volume serial number, GDPS needs to be "informed". This is done by
entering a “T” (DASD Management) in front of the device pair, which causes GDPS to
retrieve the current volume serial number for that pair.
ü Additional information: When doing a C (Config) from the main menu, GDPS will go and
get the volume serial numbers. If a volume happens to be off-line there will not be any
volser and GDPS will show ------. Later on when the device is varied online, use the "T"
command to have GDPS go get the volser.
ü The GDPS device pairs panels VPCPQST1 - VPCPQST4 can show up to 64 device pairs
per panel and a total of 256 device pairs for one SSID pair. PF10 and PF11 are used to
scroll “left and right” if there are more than 64 device pairs for one SSID pair, and PF7
and PF8 are used to scroll to the previous or next SSID pair, that is, it is not necessary to
return (PF3) to the SSID pairs panel to select another SSID pair with V (View devices) or
X (eXceptions).
ü Please note that the bottom line selections on the device pairs panel always operate on all
device pairs for the current SSID pair and not just the device pairs that are shown on the
panel. This is important to understand because if there are more than 64 device pairs for an
SSID pair, or if one is only displaying exceptions, there may be device pairs that are not
shown on the panel.
PPRC-Pair
VOLSER           PRImary   SECondary
DV0985           0C85      0985
CGRPLB           N
CRIT             N
Number of Links  2
Link / status    00110002  01
                 00010002  01
that is, the primary and secondary have been swapped. This situation is corrected by entering
the action F in front of the incorrect device pair.
§ 8) IBM-supplied program for dynamic deletion of consoles
IBM supplies a program that allows for the dynamic deletion of consoles without needing a
sysplex-wide IPL. The program is called IEARELCN and it is provided in OS/390’s
SYS1.SAMPLIB. The documentation in SAMPLIB contains the usual disclaimer “this source has
not been submitted to formal IBM testing”, etc.
GDPS development contacted the owners of the program to verify that there are no known
problems with the program. Their response states that the program has been available since 1995
and no problems have been reported with the code. Their response also states that they would
accept APARs if there were a problem.
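Purely as an illustration - and an assumption on our part, not taken from the SAMPLIB prologue -
IEARELCN is typically assembled and link-edited into an authorized library and then run as a
batch job with the console to be deleted identified in the PARM field. Verify the exact interface
against the prologue comments in SYS1.SAMPLIB(IEARELCN) before use:

//DELCON   JOB (ACCT),'REMOVE CONSOLE',CLASS=A,MSGCLASS=X
//* Hypothetical invocation - the load library and the PARM format are assumptions;
//* see the prologue in SYS1.SAMPLIB(IEARELCN) for the real interface.
//RELCN    EXEC PGM=IEARELCN,PARM='OLDCONS1'
//STEPLIB  DD DISP=SHR,DSN=USER.AUTH.LINKLIB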
If, while setting up the PPRC environment, a primary volume is copied to a secondary volume,
then the pair is deleted and the primary is subsequently copied to a different secondary (different
UCB), it must be remembered that you now have two volumes at the secondary site that have the
same VOLSER. If the IPL process at the recovery site brings both UCBs online, you will get
duplicate VOLSER messages at IPL time.
This is easily resolved by initializing any secondary volume if it is taken out of PPRC service
(a sketch follows).
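A minimal sketch of such a re-initialization with ICKDSF, assuming the old secondary has been
removed from its PPRC pair and varied offline; the device address and volume serials below are
placeholders only:

//INITVOL  JOB (ACCT),'INIT OLD SECONDARY',CLASS=A,MSGCLASS=X
//* Relabel the old secondary so it no longer carries a duplicate VOLSER.
//DSF      EXEC PGM=ICKDSF,PARM='NOREPLYU'
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  INIT UNITADDRESS(0985) VERIFY(DV0985) VOLID(GDPSCR) PURGE
/*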
Checklist of tasks to verify when installing and testing the different flavours of GDPS:
GDPS/PPRC, RCMF/PPRC, GDPS/XRC and RCMF/XRC.
Ÿ SYS1.PARMLIB(MPFLSTxx) coexistence - Messages that can possibly trigger automation must be
seen by NetView. (Environment: GDPS, RCMF/XRC. Reference: Subsection 2.7.2.1)
Ÿ SA/390 modules AOFEXDEF and AOFRGCON - They are used to create unique console names.
(Environment: GDPS. Reference: ‘SA/390 Customization’)
Ÿ MLIM parameter in CONSOLxx - It is recommended that the MLIM parameter in CONSOLxx is set to
a value that is at least 20% more than the number of DASD devices defined to the system.
(Environment: RCMF, GDPS. Reference: Section 4.1.1)
Ÿ GDPS Implementation Tips / Hints - Read all bullets in this Appendix and take action for those
that apply to the current install. (Environment: All. Reference: Appendix E: GDPS Tips and Hints)
Ÿ AO Manager console SETUP - The definition for the MCS console attached to AO Manager must
include MFORM T,S and DEL(W). (Environment: GDPS. Reference: 2.3.3.1)
Ÿ Extended Consoles coexistence - NetView, SA for OS/390, and GDPS require the use of MVS
extended consoles. (Environment: All. Reference: Subsection 2.7.2.2 NetView)
Ÿ NetView Automation Table coexistence - Both SA for OS/390 and GDPS have automation table
segments that must be added to the existing automation table. (Environment: GDPS, RCMF/XRC.
Reference: Subsection 2.7.2.2 NetView)
In the Environment entries, GDPS means GDPS/PPRC and GDPS/XRC; RCMF means RCMF/PPRC and
RCMF/XRC.
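As a hedged illustration of the MLIM item above, the message buffer limit is specified on the
INIT statement of CONSOLxx; the value shown assumes roughly 5,000 DASD devices (5,000 plus 20%)
and is a placeholder only. The AO Manager console attributes (MFORM T,S and DEL(W)) were
sketched earlier in this appendix:

INIT MLIM(6000)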
ENVIRONMENT: GENERAL
When GDPS has been installed and the initial setup is complete one needs to start testing and do
some Installation Verification. The following suggested testing was completed by IBM during its
quality assurance process and represents a good example for each installation to follow to verify
correct installation. It is recommended that this testing is done in the following sequence:
ENVIRONMENT: RCMF
It is recommended that initial PPRC (remote copy) testing is performed with a limited number of
SSID pairs and device pairs and that only test devices are used so that all PPRC functions can be
verified.
GDPS uses the SA/390 function SDF to indicate the status of systems and remote copy (PPRC)
on an SDF screen. Whenever there is a status change this screen is updated. When everything is
working correctly all information on this screen will be green and when something fails the color
will change to pink (warning) or red (error). So in all failure situations being tested, something on
the SDF screen should turn red.
GDPS also uses SDF for tracing actions and this includes Standard Actions as well as CONTROL
and TAKEOVER actions.
Part of all testing is a verification that the SDF screen is updated and that it indicates what
problems exist or what actions were executed.
Before any test of GDPS actions (Standard, CONTROL, or TAKEOVER) the following should
be verified:
Ÿ GDPS Option 1 - DASD Remote Copy, verify Freeze and Crit specifications for each SSID
pair
Ÿ GDPS Option 1 - Make sure that DASD mirroring status is OK and if needed do a 5 (Query)
to have status updated.
Ÿ Do Q(uery) to at least one device pair in each SSID pair and verify that CGRPLB has the
correct value
Ÿ GDPS Option 3 - Standard Actions, verify that each system has correct status and that all
information on the panel has the correct values
Ÿ GDPS Option 7 - Sysplex Resource Management, verify all information
Ÿ SDF, verify that status is correct and delete all trace entries
Ÿ After any FREEZE testing make sure that freeze is enabled again by doing Estpath or Resynch
(all or SSID pair) from GDPS.
After testing any GDPS actions (Standard, CONTROL, or TAKEOVER) the following should be
verified:
When Option 1 is selected, the SSID pairs panel (VPCPQSTC) will be displayed and one should
see the SSID pairs and link (path) status. Try out and verify the different actions and selections
and also select the device pairs panel (action V) and verify actions and selections from there.
3 - Standard Actions
ENVIRONMENT: GDPS
6 - Planned Actions
ENVIRONMENT: GDPS
7 - Sysplex Resource Management
ENVIRONMENT: GDPS
Option 7 will show the “Sysplex Resource Management” screen primed with the current couple
data sets and the current CFRM policy. The couple data set types and the CFRM policy names
will be shown in pink and one needs to provide GDPS with some information so that GDPS can
do the sysplex resource management.
Enter M (modify) to the left of each couple data set type and the next panel will show the current
couple data sets in the top half and defined couple data sets in the bottom part. GDPS will assume
that the currently used data sets should be the primary, in Site1, and alternate, in Site2, and those
fields will be primed on the screen. A spare data set should be allocated in each site and the spare
data set names should be entered in the input fields. If needed, one can change the names of the
primary and alternate data sets. GDPS will verify that the specified data sets exist. When the data
sets have been defined, one should verify that the selections work, for example switching to the
Site1 or Site2 data sets and switching back to normal.
Next enter M (modify) to the left of the CFRM policy name and on the next panel enter the
CFRM policy names for using only the Site1 and only the Site2 coupling facilities. Also enter the
names of the coupling facilities in each site. When done one should verify that it is possible to
switch between the different CFRM policies.
8 - Automation ON/OFF
Verify that one can set automation OFF and back ON and that it is possible to switch between
debug ON and OFF.
9 - View Definitions
ENVIRONMENT: GDPS
Verify that the different selections show the correct information that has been defined to GDPS.
C - Config
When the GDPS or RCMF command is entered for the first time no PPRC configuration exists
and the C - Config option will automatically start. Later one needs to change the configuration
and select OPTION C - Config and verify that the changes are reflected on the “DASD Remote
Copy” panels.
ENVIRONMENT: GDPS
Testing Standard Actions is just a matter of doing all of them and verifying that they operate as
expected, and also verifying that one is not allowed to do any actions to the system on which you
are logged on.
IPL
Activate
Load
Stop
ReIPL
Deactivate
Reset
Please note that the only actions that are supposed to be used against a system in status ACTIVE
(or MASTER) are Stop and ReIPL. Also note that the standard actions should be performed from
the controlling system except actions against the controlling system which need to be performed
from another system.
Please review Sections 3.2.1 - 3.2.11 before continuing. These sections provide additional
information on testing Standard Actions along with expected results.
ENVIRONMENT: GDPS/PPRC
Planned Actions are defined to GDPS through the System Automation dialog with the entry-type
CONTROL in the form of scripts. First one has to decide what planned actions from GDPS are
desired and create CONTROL scripts for these actions. Testing is essentially starting each of the
actions and verifying that all the script statements are performed correctly. User-Defined Actions
should be started from the controlling system except actions involving the controlling system. For
example if a script contains a statement for stopping the controlling system, this script must be run
on another system.
The development test sysplex looks as follows (from the option 3 panel):
Sysname Status
LPAR Mode SA-OA L-addr LOADPARM
_ SITE1
_ SWETSGEO ACTIVE NORMAL MVS1 NORMAL YY
_ SITE2
_ SWETSMVS MASTER NORMAL MVS NORMAL YN
In Site1 the “production” system SWETSGEO is running and the controlling system,
SWETSMVS, is running in Site2. Primary DASD is in Site1.
The following Planned (CONTROL) Actions have been defined and tested by development.
Site1 Maintenance
COMM='SITE1 MAINTENANCE'
SYSPLEX='STOP SITE1'
SYSPLEX='DEACTIVATE SWETSGEO'
SYSPLEX='CDS SITE2'
SYSPLEX='CF SITE2'
DASD='SWITCH DELPAIR'
DASD='STOP SECONDARY'
IPLTYPE='SWETSGEO ABNORMAL'
SYSPLEX='ACTIVATE SWETSGEO'
Site2 Maintenance
COMM='SITE2 MAINTENANCE'
SYSPLEX='STOP SITE2'
SYSPLEX='DEACTIVATE SWETSMVS'
SYSPLEX='CDS SITE1'
SYSPLEX='CF SITE1'
DASD='STOP SECONDARY'
IPLTYPE='SWETSMVS ABNORMAL'
SYSPLEX='ACTIVATE SWETSMVS'
ENVIRONMENT: GDPS/PPRC
Whenever a failure occurs, GDPS will analyze the situation and suggest a takeover action based
on the result of the analysis and the takeovers defined to GDPS. When GDPS detects a failure it
will always issue a prompt and ask what takeover action is to be done. There is a naming standard
for takeovers and the following can be defined where n is 1 or 2 for Site1 or Site2:
Ÿ ALLSITEn switches systems and primary DASD from site n to the other site
Ÿ DASDSITEn switches primary DASD from site n to the other site
Ÿ SYSSITEn actions performed when a system in site n fails
Ÿ SYSsysname actions performed when system “sysname” fails
Ÿ user defined will be presented in all takeover prompts
In ALLSITEn and DASDSITEn n will be 1 or 2 depending on what site currently has the primary
DASD. When the last system in a site fails, SYSSITEn and SYSsysname Takeover actions for
that site will be suggested.
If “user defined” takeovers are defined they will be presented in all takeover prompts. In the
following description only the standard takeovers will be mentioned.
All takeover prompts will include the option REPEAT and if one replies ‘REPEAT’ the analysis
will be repeated and there will be a new prompt which may contain additional options. This is
supposed to cover rolling disasters where the first indication of a failure will cause a prompt based
on the very first problem. When the reply is ‘REPEAT’ the new analysis may detect more
problems and the new prompt will offer options based on the new analysis.
This section contains descriptions of failure creation and what takeovers should be offered for the
different kinds of failures.
DASD Failures
Ÿ Simulate disk array failure by powering off RAMAC3 drawer and part of RVA.
Ÿ Simulate disk subsystem failure by powering off RAMAC3 and RVA.
This should trigger a FREEZE and present the DASDSITEn and ALLSITEn takeovers (if
defined). The value of n will be 1 or 2 depending on where the primary DASD is. One should not
do a takeover if this really is a secondary DASD failure. However, if all PPRC links to the
secondary DASD fail, it could be the beginning of a rolling disaster, and in that case the proper
action might be a complete site switch.
System Failures
Systems are defined to GDPS in the policy GEOPLEX DOMAINS and all systems should be
defined in the policy GEOPLEX OPTIONS in the MASTER list. The controlling system is the
first one in the MASTER list. If the controlling system (or GDPS in the controlling system) is not
active, GDPS in the next system in the MASTER list will take the master function. At any point in
time, the first GDPS in the MASTER list that is active will take the master function and if needed
the master function will automatically switch whenever a GDPS is started or stopped. To be fully
disaster recovery ready the controlling system needs to be active so that it has the master function.
In GEOPLEX DOMAINS Site1 systems are defined in the keyword SITE1 and Site2 systems are
defined in the keyword SITE2. In a backup situation a Site1 system can be IPLed in Site2 in
its alternate LPAR using IPLTYPE=ABNORMAL. However, it is still a Site1 system. There is no
awareness in GDPS as to what site a system is currently running in. So in this description a Site1
system is a system defined in the keyword SITE1 even if it is currently running in a Site2 LPAR.
Ÿ Simulate a production system failure by resetting an LPAR. Note: This should not be done from
GDPS Standard Actions because if done, GDPS will not treat it as a failure.
When a production system fails GDPS should present the SYSsysname and SYSSITEn takeovers
where n is the site of the failing system. In addition, the ALLSITEn takeover will be presented if
the failing system is in the same site as the primary DASD.
When the controlling system fails, the next system in the master list should take the master
function and GDPS should present takeovers the same way as for a production system failure.
Note: If the new master system is using the PPRC disk, one cannot select the ALLSITEn script
even if it is presented. (GDPS currently does not prevent this.)
NetView Failures
The system where NetView failed should get status NOXCF and it should be reported in SDF. No
takeover will be suggested.
If NetView in the controlling system fails the next GDPS in the master list should take the master
(controlling) function.
CF Failures
Coupling Facility failures are supposed to be handled by the CFRM policy. There will be no
GDPS action for a Coupling Facility failure.
Currently there are no GDPS actions for processor or coupling facility failures. A processor
failure will cause systems in that processor to fail and GDPS action will be as described for system
failures.
TAKEOVER scripts
The following Unplanned (TAKEOVER) Actions have been defined and tested by development.
ALLSITE1
DASDSITE1
SYSSITE1
SYSSWETSGEO
A GDPS/PPRC implementation plan will often have a staged implementation of the GDPS/PPRC
functions with the following steps:
Ÿ Remote Copy
Ÿ Freeze
Ÿ Standard Actions
Ÿ Sysplex Resource Management
Ÿ Planned Actions
Ÿ Unplanned Actions
RCMF/PPRC does not supply the freeze function, so to have freeze, GDPS/PPRC must be
installed. However, it is not necessary to implement all GDPS/PPRC functions immediately.
GDPS should be implemented in a test environment, and when putting it into production it should
be possible to activate all functions. However, some users want to do a staged production
implementation; so far the first step has been to activate GDPS remote copy. It is possible to
limit the implementation to just remote copy with (or without) freeze, and the other functions
can be implemented when it suits the user.
Most GDPS implementations will start with remote copy, and this is recommended, but it is not
an absolute requirement. It is possible to start using the different functions in any order.
The rest of the GDPS Install Guide has limited information on how to do a staged implementation
and for that reason this appendix has been added. It describes the minimum GDPS/PPRC
definitions that have to be made for the different GDPS functions with a bias towards remote copy
with freeze.
The description in the GDPS Installation Guide section “2.7 GDPS System Implementation” must
be followed, but here is a description of the entries needed in the steps in section “2.7.5 Using
SA/390 Dialogues to Create the GDPS Policy” if this limited approach is to be used.
2.7.5.1
Exit name VPCEINIT has to be filled in.
2.7.5.2
Application is not needed unless SA/390 automation will be used.
2.7.5.3
OS/390 components is not needed.
2.7.5.4
2.7.5.5
Status details is not needed unless SDF exception reporting is wanted.
2.7.5.6
This section starts the description of the GDPS parameters defined in SA/390. The absolute
minimum that has to be defined is GEOPLEX DOMAINS and GEOPLEX OPTIONS.
2.7.5.7
One SITE1 system in GEOPLEX DOMAINS must be defined with at least the pppp object id,
even if AO Manager will not be used. If AO Manager will not be used, any four characters can be
used for pppp. When systems are added, one SITE1 or SITE2 statement is needed for each
system.
2.7.5.8
The MASTER keyword has to be defined in GEOPLEX OPTIONS and it must contain the same
system name or NetView domainid as defined in GEOPLEX DOMAINS. When more than one
system is defined in GEOPLEX DOMAINS, all systems must be in the MASTER list. The format
of the list is: Keyword is MASTER and Data is a list of system names or domainids separated by
commas and enclosed in quotes, for example: ‘DSSAO,DSSKO’ for two systems with the
NetView domainids DSSAO and DSSKO. The controlling system (or systems if more than one
controlling system is defined) must be the first system(s) in the list.
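Purely as an illustration of the entry just described, as it might be typed into the SA/390
policy dialog (DSSAO and DSSKO are the example domainids from the text; the layout of the
dialog fields is an assumption):

Keyword . . . : MASTER
Data  . . . . : 'DSSAO,DSSKO'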
2.7.5.9
Abnormal IPLs is not needed; if it is not defined, there will be a message during GDPS
initialization.
2.7.5.10
It is not necessary to define any TAKEOVER, CONTROL, or BATCH actions; if they are not
defined, there will be messages during GDPS initialization, with xxx replaced by TAKEOVER,
CONTROL, and BATCH. These are not error messages; they are informational messages.
When the plan is to use only the remote copy functions of GDPS/PPRC all systems should be
defined in GEOPLEX DOMAINS. However, since GDPS is designed to use GDPS Standard
Actions to start and stop systems, the following changes are needed to prevent unwanted GDPS
actions:
Ÿ Disable Monitor 1
Ÿ Change the NetView Automation Table
Disable Monitor 1
One function of the GDPS monitor 1 is to verify that all systems are active. If a system is stopped
without using GDPS Standard Actions, monitor 1 will detect loss of the system and will regard
this as a problem and initiate takeover processing. To prevent this, monitor 1 has to be disabled by
setting the monitor 1 interval to 00:00:00 in GEOPLEX OPTIONS. GDPS will fail to set the
monitor 1 timer during initialization (using the EVERY command) and the following messages
will be issued:
EVERY 00:00:00,ID=GEOMON1,VPCEMON
DSI203I TIME VALUE SPECIFIED INVALID OR OMITTED
DSI202I TIMER REQUEST FAILED TO BE SCHEDULED FOR EXECUTION 'ID=GEOMON1'
The Install Guide states that the GDPS/PPRC SGDPPARM members GEOMSGGP and
GEOMSGG0 should be included in the NetView automation table. GEOMSGGP contains the
messages that GDPS/PPRC needs in the automation table for the freeze function and
GEOMSGG0 contains messages for managing the systems in the sysplex. The IXC messages in
GEOMSGG0 are used for GDPS Standard Actions and in GDPS takeover processing (unplanned
actions). These messages have to be removed when GDPS Standard Actions are not used; that is,
GEOMSGG0 should not be included. If the freeze function is not wanted, no messages are needed
in the automation table for GDPS and neither the GEOMSGGP nor the GEOMSGG0 member is
needed.
Based on whether freeze is wanted or not, make one of the following modifications to the NetView
automation table:
For GDPS with remote copy and freeze, include only the GEOMSGGP member in the automation
table (do not include GEOMSGG0).
For GDPS with remote copy only and no freeze, do not include any GDPS member in the
automation table.
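A hedged sketch of the corresponding automation table fragment, assuming the table is built from
%INCLUDE statements (an installation may instead copy the segments inline); the member holding
this fragment is a placeholder:

* NetView automation table fragment - remote copy with freeze, Standard Actions not used
%INCLUDE GEOMSGGP
* GEOMSGG0 is deliberately not included - its IXC messages drive GDPS Standard
* Actions and takeover processing, which are not in use in this configuration.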
It is possible to run GDPS without AO Manager, but this is of course not recommended in a
production environment. GDPS requests AO Manager actions by issuing a GEO090A WTOR and
AO Manager is supposed to initiate the requested action and then reply OK. The actions are HMC
actions like ACTIVATE, DEACTIVATE, LOAD, and RESET. Before AO Manager is
implemented it is of course possible for a person to initiate the HMC action and then reply OK to
the outstanding GEO090A message.
It is possible to run GDPS/PPRC without the remote copy function and this is accomplished by
creating an empty GEOPARM member. There will be error messages every time NetView is
started because GDPS has no PPRC configuration information; it will automatically try to do a
Config operation and will not find the “required” GEOPLEX LINKS, MIRROR, and
NONSHARE statements. When Dasd Remote Copy is selected from the GDPS main panel, a
panel with uninitialized variables will be shown and no actions can be performed.
The GDPS/PPRC SGDPPARM member GEOMSGGP contains the messages that GDPS/PPRC
needs in the NetView automation table for the freeze function. If the remote copy function is not
used, there is no need to have the “freeze messages” in the automation table and GEOMSGGP
should not be included.
To remove the GDPS remote copy function once it has been used, it is not sufficient to create an
empty GEOPARM member. The reason is that a successful Config saves information in the
NetView DSISVRT dataset and a subsequent Config with an empty GEOPARM fails and GDPS
will restore the old information. To remove the remote copy function requires re-allocation of the
DSISVRT dataset in addition to the empty GEOPARM member.
GDPS is not designed to run with a limited set of functions, which means that even if the
instructions in this document are followed, all selections will exist on the GDPS main panel. It is
of course possible to change the main panel definition to remove the functions that have not yet
been implemented, but it will still be possible to select them (and it is quite obvious that one
enters a number to select a function). To prevent anyone from making a selection, the panel
definition for that function has to be renamed, for example VPCPSTD1 for Standard Actions. When
the selection is tried, there will be an error message like “VIEW .... COMMAND FAILED” and the
selection will fail. After this, GDPS may have to be reentered to get back into GDPS. All panels
have their name in the top row so it is a simple task to find out which members in the panel
library to rename.
You may use this form to communicate your comments about this publication, its organization, or
subject matter, with the understanding that IBM may use or distribute whatever information you
supply in any way it believes appropriate without incurring any obligation to you. Your
comments will be sent to the author's department for whatever review and action, if any, are
deemed appropriate.
Today's date:
__________________________________________________________________________
__________________________________________________________________________
Is there anything you especially like or dislike about the organization, presentation, or writing in
this manual? Helpful comments include general usefulness of the book; possible additions,
deletions, and clarifications; specific errors and omissions.
Name . . . . . . . . . _______________________________________________
Company or Organization _______________________________________________
Address . . . . . . . . _______________________________________________
_______________________________________________
_______________________________________________
Phone No. . . . . . . . _______________________________________________