
IBM z Systems 2016 NY NaSPA Chapter

Performance Optimization for Modern z Processors

C. Kevin Shum, IBM Distinguished Engineer, z Systems Processor Design, Member of IBM Academy of Technology
Charles F. Webb, IBM Fellow, z Systems Development

© 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.
BigInsights, BlueMix, CICS*, COGNOS*, DB2*, DFSMSdfp, DFSMSdss, DFSMShsm, DFSORT, DS6000*, DS8000*, FICON*, GDPS*, HyperSwap, IBM*, IBM (logo)*, IMS, Language Environment*, MQSeries*, Parallel Sysplex*, PartnerWorld*, RACF*, Rational*, Redbooks*, REXX, SmartCloud*, System z10*, Tivoli*, UrbanCode, WebSphere*, zEnterprise*, zSecure, z Systems, z13, z/OS*, z/VM*
* Registered trademarks of IBM Corporation
The following are trademarks or registered trademarks of other companies.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
Java and all Java based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
OpenStack is a trademark of OpenStack LLC. The OpenStack trademark policy is available on the OpenStack website.
TEALEAF is a registered trademark of Tealeaf, an IBM Company.
Windows Server and the Windows logo are trademarks of the Microsoft group of countries.
Worklight is a trademark or registered trademark of Worklight, an IBM Company.
UNIX is a registered trademark of The Open Group in the United States and other countries.
* Other product and service names might be trademarks of IBM or other companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any
user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload
processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have
achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to
change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the
performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
This information provides only general descriptions of the types and portions of workloads that are eligible for execution on Specialty Engines (e.g., zIIPs, zAAPs, and IFLs) ("SEs"). IBM
authorizes customers to use IBM SE only to execute the processing of Eligible Workloads of specific Programs expressly authorized by IBM as specified in the “Authorized Use Table for IBM
Machines” provided at www.ibm.com/systems/support/machine_warranties/machine_code/aut.html (“AUT”). No other workload processing is authorized for execution on an SE. IBM offers SE
at a lower price than General Processors/Central Processors because customers are authorized to use SEs only to process certain types and/or amounts of workloads as specified by IBM in the
AUT.

2 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Introduction / Motivation
 Hardware/Software co-optimization is increasingly important to performance
– Performance gains from technology scaling have ended
– Hardware performance gains are coming from design
• Micro-architectural innovation (and complexity)
• New instructions and architected features
– Coding practices and software exploitation needed to get the full value of the hardware
 More efficient code helps everybody
– Increases value of software
• Extract the maximum useful work from the hardware
– Increases value of z Systems platform
• Solutions delivered more cost-effectively
– Decreases effective cost for end user
 Goal of this session: Motivate you to make performance a priority
– Can only scratch the surface in 45 minutes
– Highlight a few high-leverage areas
– Point you to resources available to assist with optimization

3 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Compilers

 Biggest single performance lever for many applications
– Aggressive use of the latest compiler technology
 Close linkage between compiler and hardware development teams
– Define new instructions and architectural features
– Tune code generation for processor micro-architecture
 ARCH and TUNE options optimize for current hardware designs
– ARCH needs to match the oldest hardware level supported
• May be worth experimenting to test the value of a higher ARCH level on new hardware
– TUNE should match the hardware level for which you care most about performance
• Usually the latest available hardware level
• Code will still run correctly on all supported hardware levels; TUNE affects only how it is scheduled
 Use higher levels of OPT to get the best performance
– At least on performance-sensitive components
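As an illustration (a hedged sketch, not a recommendation: exact option spellings, ARCH numbers, and defaults vary by compiler and release, so check the current User's Guides), a shop whose oldest production machine is a zEC12 but whose performance-critical work runs on a z13 might compile with:

   Enterprise COBOL:   ARCH(10),OPT(2)             ARCH matches the oldest machine (zEC12);
                                                   OPT(2) on performance-sensitive components
   z/OS XL C/C++:      ARCH(10),TUNE(11),OPT(3)    generated code stays zEC12-safe while
                                                   instruction scheduling is tuned for z13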

4 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Compilers on z Systems
 IBM continues to invest in the compiler portfolio on z:
– Increased focus on application program performance in recent years
– Continued advancements in languages and operating systems
• Java / JIT, C/C++, COBOL, PL/I, Linux, z/OS

Enterprise COBOL for z/OS V5.2
• Leverage SIMD instructions to improve processing of certain COBOL statements
• Increased use of DFP instructions for Packed Decimal data
• Support COBOL 2002 language features: SORT and table SORT statements
• Allows applications to access new z/OS JSON services
• Up to 14% reduction in CPU time*

Enterprise PL/I for z/OS V4.5
• Critical Business Language – Committed to invest in leading-edge technology
• Shipped a new release every year since 1999
• Fully supports z/Architecture, including z13 & z13s processors
• Provide full support for JSON (Parse, Generate, and Validate)
• Up to 17% reduction in CPU time*

z/OS V2.2 XL C/C++
• Optional feature of z/OS 2.2
• Provides system programming capabilities with Metal C option
• Fully supports z/Architecture, including z13 & z13s processors
• Ships with high-performance math libraries tuned for z13
• Up to 24% increase in throughput*

XL C/C++ for Linux on z Systems V1.2
• New compiler based on Clang and IBM optimization technology
• Fully supports z/Architecture, including z13 & z13s processors
• Provide easy migration of C/C++ applications to System z
• Up to 14% increase in performance over GCC*
* The performance improvements are based on internal IBM lab measurements. Performance results for specific applications will vary, depending on the
source code, the compiler options specified, and other factors
5 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

Evolution of IBM COBOL on z Systems


Evolution themes (left to right): LE, Debug, USS… → Internationalization → Middleware Interoperability → Application Modernization; recent releases emphasize Day 1 z Processor Support, Rel-Rel Performance Improvement, New COBOL Language Features, and App. Modernization Features.

COBOL/370, COBOL for MVS & VM; COBOL for OS/390 & VM (Ann: 1990's)
• Language Environment
• Intrinsic functions
• Debug Tool

Enterprise COBOL V3 (Ann: 2001-2005)
• Unicode
• Native Java & XML
• CICS & DB2 co-processors; IMS Java regions
• Dynamic Libraries, USS, DB2 coprocessor…

Enterprise COBOL V4 (Ann: 2007-2009)
• XML System Services parser
• DB2 9 SQL support with COBOL coprocessor
• Java 5 & 6 support; UNICODE performance improvement
• Debugging of production code with Debug Tool

Enterprise COBOL V5 (Ann: 2013)
• New advanced optimization framework
• New COBOL runtime
• New COBOL 2002 language features
• DWARF debugging format
• Exploits Program Objects
• Data item limits raised to 128 MB (from 16 MB)

Enterprise COBOL V6 (Ann: 2016)
• Enhanced scalability – compile and optimize very large COBOL programs
• Native JSON “Generate”
• New COBOL 2002 language features
• Enhanced migration help
• Generates SMF 89 records
• Improved debug support for optimized code

6 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Why SW Optimization Matters


Processor design
 Deep instruction pipeline
– Driven by high-frequency design
– z13 pipeline: 20+ cycles from instruction fetch to instruction finish

 Pipeline hazards can be expensive


– Branch flush – 20+ cycles
– Cache reject – 12+ cycles
 Code optimization can help
– Arrange frequent code in “fall through” paths
– Pass values via registers rather than storage
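A hedged HLASM sketch of both points (a code fragment, not compiler output; registers and labels are illustrative): the frequent case falls through while the rare case branches out of line, and the value being tested is already in a register rather than being re-fetched from storage.

*        The common path takes no branch; only the rare failure case
*        leaves the fall-through stream.
         LTR   15,15               test the return code passed back in R15
         JNZ   ERROR               rarely taken: branch out of line on failure only
*        ... frequent path continues here, in the fall-through stream ...
         J     CONTINUE
ERROR    DS    0H                  infrequent error handling kept off the hot path
*        ...
CONTINUE DS    0H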
[Pipeline diagram: instruction fetch → instruction buffer and decode → register mapping → instruction queue, wakeup and issue → fixed-point operation / data cache access; a cache reject loops back to retry the access, and a branch flush restarts the pipeline from instruction fetch.]
7 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

Why SW Optimization Matters

Cache design
 Private (per-core) cache evolution
– Allows improvements in size and latency
– Unified vs. split L2 for instructions and operands
• Split L2 keeps data closer to L1
• Unified (z196) to hybrid (zEC12) to split (z13)
– Integrated vs. serial directory lookup
• Integrated reduces access latency for L2, L3
• Added for operands (zEC12), instructions (z13)
 Allows large, fast L2 caches
– L2 sizes comparable to others’ L3s (MBs)
• Leverages eDRAM technology
– Around 10 cycles to access data from L2
 On-chip shared L3
– Shared by all cores on the CP chip
– Now also the sharing point for I-L2 and D-L2
 Cache line size is 256B throughout the hierarchy
– Safe value to use for separation / alignment

[Diagrams: z196, zEC12, and z13 on-chip cache hierarchies – each core has split L1 instruction and data caches feeding its private L2 (unified on z196, hybrid on zEC12, split I-L2 / D-L2 on z13), with a shared on-chip L3 behind the private caches.]
8 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

Optimizations on local data

Instruction / data proximity
 Instructions & operands in the same cache line
– OK (maybe inefficient) if the operands are read-only
– Problem if there are stores to those operand locations
• Extra cache misses, long delays
 Split L1 caches (re-)introduced in z900 (2000)
– Designs optimized for well-behaved code
• Increasing cost of I/D cache contention
– With the split of the L2 cache, resolution moved to L3
 Not a problem for
– Re-entrant code
– Any LE-based compiler-generated code
– Dynamic run-time code
 Problematic examples
– True self-modifying code
– Classic save area
– Local save of return address
– In-line macro parameters
– Local working area right after code

[Diagrams repeated from the previous slide: z196, zEC12, and z13 on-chip cache hierarchies.]
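A hedged HLASM sketch of the "classic save area" and "local working area right after code" patterns listed above (labels, sizes, and the alternative layout are illustrative, not a prescribed convention): writable fields placed directly after the code that stores into them are likely to share a 256-byte cache line with those instructions, while pushing them onto their own line, or into dynamically obtained per-thread storage for reentrant programs, avoids the instruction/data contention described above.

* Problematic layout: the save area and work buffer follow the code
* immediately, so stores into them contend with instruction fetches
* for the same 256-byte cache line.
DOWORK   STM   14,12,12(13)        save the caller's registers
*        ...                       body of the routine
         LM    14,12,12(13)        restore the caller's registers
         BR    14                  return
SAVEAREA DS    18F                 classic save area right after the code
WORKBUF  DS    XL64                local working area right after the code
*
* Better layout: keep writable storage away from the instruction stream,
* starting on its own 256-byte line (or obtain it per thread at run time).
         DS    0D
         DS    XL256               at least one full cache line of separation
SAVEAREA2 DS   18F                 save area now in its own cache line(s)
WORKBUF2 DS    XL64                working area kept with the other data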
9 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

Optimizations on shared data


Shared data structures among SW threads / processes
 Sharing is not necessarily bad
– Can be very useful to leverage strongly consistent architecture

 …But updates from multiple cores => lines bounce around among caches
– Depending on locations of cores, added access latency can be troublesome
– Need to manage well to get good performance
 True sharing – real-time sharing among multiple SW threads / processes
– Atomic updates, Software locks
– Higher nWay (concurrent SW threads), more frequent access => more care needed
– If contested in real-time, can lead to “hot-cache-line” situations
 False sharing – structures / elements in same cache line
– Can be avoided by separating structures into different cache lines
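A hedged HLASM sketch of that separation (the DSECT, field names, and sizes are illustrative): keeping two hot words at least a full 256-byte cache line apart guarantees that updates from different threads do not keep stealing the same line from each other.

* Two counters updated by different SW threads.  Putting them in the same
* 256-byte line would turn every update into an exclusive fetch of a line
* the other thread also needs (false sharing).
HOTDATA  DSECT
THREAD1C DS    F                   counter updated only by thread group 1
         DS    XL252               pad: the next hot field starts 256 bytes
*                                  later, so it cannot share a cache line
THREAD2C DS    F                   counter updated only by thread group 2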
Cache hit location            Latency in cycles (no queuing)   Intervention overhead (if a core owns it exclusive)
L1                            4                                NA
L2                            ~10                              NA
L3 (on-chip)                  35+                              40+
L3 (on-node)                  180+                             20+
L3 (off-drawer, far column)   700+                             20+

Cache topology and latencies for z13.  [Diagram: z13 inter-node topology – each node has three CP chips (8 cores each, with L1/L2/L3) connected over the XBus to an SC chip (L4 + NIC); the SBus connects the two nodes of a drawer, with further links between drawers.]

10 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Moving / Clearing Large Blocks


Usages of MOVE LONG (MVCL) vs MOVE (MVC) instructions
 Several ways to move or clear a large block of storage
– One MVCL instruction
– Loops of MVCs to move data
– Loops of MVC <Addr>+1(<Len>),<Addr> (propagate a pad byte) or XC <Addr>(<Len>),<Addr> to pad/clear an area

 MVCL is implemented through millicode routines


– Millicode is a firmware layer in the form of vertical microcode
• Incurs some overhead in startup, boundary/exception checking, and ending
– MVCL function implemented using loops of MVCs or XCs

 Millicode has access to special hardware


– Near-memory engines that can do page-aligned move and page-aligned padding
• Can be faster than dragging cache lines through the cache hierarchy
• However, the destination will NOT be in the local cache
 Many factors to consider
– Will the target be needed in local cache soon?
• Then moving “locally” will be better
– Is the source in local cache?
• Then moving “locally” may be better
– How much data is being processed?
• If many pages, then the near-memory engine usage might be beneficial
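A hedged HLASM sketch contrasting the two approaches for clearing a large area (sizes, registers, and labels are illustrative; which variant wins depends on the factors above):

TARGLEN  EQU   4096                size of the area to clear (illustrative)
*
* (1) One MVCL: millicode handles the whole operation and, for page-sized
*     work, may use the near-memory engines - but the cleared area is NOT
*     left in the local caches afterwards.
         LA    2,TARGET            R2 = destination address
         LHI   3,TARGLEN           R3 = destination length
         SLR   4,4                 R4 = source address (ignored, length is 0)
         SLR   5,5                 R5 = source length 0, pad byte X'00' in bits 0-7
         MVCL  2,4                 clear TARGET with the pad byte
*
* (2) A loop of XCs: more instructions, but the cleared lines end up in the
*     local cache, which helps if the area is about to be used.
         LA    2,TARGET            R2 -> area to clear
         LHI   3,TARGLEN/256       R3 = number of 256-byte chunks
CLRLOOP  XC    0(256,2),0(2)       clear one 256-byte chunk
         LA    2,256(,2)           advance to the next chunk
         BCT   3,CLRLOOP           repeat for every chunk
*
TARGET   DS    XL(TARGLEN)         area being cleared (illustrative)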
11 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

Software Aids to Hardware


Hardware cannot read programmers’ minds: Give it some hints
 Instructions designed to help hardware optimize performance
– Modify details of heuristic / history-based hardware mechanisms
– Please use responsibly: Over- or mis-use can be counter-productive
• Increased code image, pathlength
• One wrong hint can outweigh several correct ones
– Some experimentation may be needed to fine-tune usage
– Exact hardware effects will vary by implementation
• Hardware reserves the right to ignore hints

 Branch Prediction Preload [Relative] (BPP, BPRP) Instruction


– Introduced on zEC12
– Specifies future branch instruction and its target
• Target address in GR or relative to current instruction address
– Performs instruction cache touch of the provided branch target address
– Architectural no-op

12 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Software Aids to Hardware (continued)


 Next Instruction Access Intent (NIAI) instruction
– Introduced on zEC12
– Affects hardware handling of the storage operand of the next instruction
• Like a “prefix” instruction but architecturally a separate (no-op) instruction
• Especially useful when referencing shared storage areas / data structures
• May be used by MVCL millicode to optimize use of near-memory engines
– “Read”: This program will only read – not write/change – that location / cache line
– “Write”: This program will be updating the location / cache line later
• Even though this access is a read/ load
– “Use once”: This program will not be using this location again
• Can indicate that the current access is a streaming type access

 Prefetch Data [Relative] (PFD, PFDRL) instruction


– Introduced on z10
– Helps hardware have the right stuff in the caches when needed
– Pre-stage cache lines into the local caches (all the way into L1)
• Specify whether intended usage is read-only or read/write
– “Untouch” cache lines to remove from local caches
• Can be helpful when done using a shared data structure
– Demoting cache line from an exclusive state to a read-only state
• Can be helpful when done updating a shared data structure
– Architectural no-op
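A hedged HLASM sketch of Prefetch Data while walking a chained list (offsets, registers, and the loop shape are illustrative; a NIAI hint could similarly precede an instruction that touches a shared line, but its access-intent codes are omitted here). M1 code 1 requests a fetch-intent prefetch and code 2 a store-intent prefetch; the demote and untouch variants use other M1 codes described in the z/Architecture Principles of Operation.

NEXT     EQU   0                   offset of the forward pointer in an element
HITCTR   EQU   8                   offset of a counter this code will update
*
LOOP     DS    0H
         L     5,NEXT(,4)          R5 -> next element while R4 is being processed
         LTR   5,5                 end of the chain?
         JZ    DONE
         PFD   1,NEXT(,5)          pre-stage the next element, read-only intent
         PFD   2,HITCTR(,5)        pre-stage the field we will store into
*        ... process the element addressed by R4 ...
         LR    4,5                 advance to the already-prefetched element
         J     LOOP
DONE     DS    0H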
13 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

IBM Automatic Binary Optimizer (ABO) for z/OS


Improve performance of compiled COBOL programs

ABO Features
– Internal & customer performance improvements measuring ~15%
– No source code, migration, or performance-options tuning required
– Targets the latest IBM z Systems: zEC12, zBC12, z13, z13s running z/OS 2.1 or z/OS 2.2
– All IBM Enterprise COBOL V3 & V4 compiled programs are eligible for optimization
– Optimized programs guaranteed to be functionally equivalent
– IBM Problem Determination tooling support, plus work with several key 3rd-party tooling vendors in our beta program
– Leverages new z/OS 2.2 infrastructure to target multiple hardware levels automatically

Original program binaries (base ESA/390) → Optimizer → Optimized program binaries (latest z Systems)
14 © 2016 IBM Corporation
IBM z Systems 2016 NY NaSPA Chapter

Other Resources
Like this stuff? There’s lots more available:
 Microprocessor Optimization Primer
– Available under the IBM developerWorks LinuxONE community
• https://www.ibm.com/developerworks/community/groups/community/lozopensource

 CPU Measurement Facilities


– User-accessible hardware instrumentation data to understand performance characteristics
– Documentation and education materials can be found online; some references:
• For z/OS http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TC000066
– (supported under Hardware Instrumentation Services - HIS)
• For z/VM http://www.vm.ibm.com/perf/tips/cpumf.html

 Other related references


– “z/Architecture: Principles of operation,” Int. Bus. Mach. (IBM) Corp., Armonk, NY, USA,
Order No. SA22-7832-10, Feb. 2015. [Online]
– Dan Greiner’s presentations of z/Architecture features with SHARE
– John R. Ehrman's book: Assembler Language Programming for IBM z System Servers
– “The IBM z13 multithreaded microprocessor,” in IBM J. Res. & Dev., pp. 1:1–1:13, 2015

15 © 2016 IBM Corporation


IBM z Systems 2016 NY NaSPA Chapter

Thank you!

(Chung-Lung) Kevin Shum, [email protected]
LinkedIn: https://www.linkedin.com/in/ckevinshum

Charles Webb, [email protected]

16 © 2016 IBM Corporation
