
The AMD Next-Generation

The AMD Zen 3 core architecture, launched between 2020 and 2021, focuses on enhanced single-thread performance and energy efficiency, achieving a 19% IPC uplift over Zen 2. Key features include a reworked pipeline, advanced branch prediction, and the introduction of 3D V-Cache technology for improved gaming performance. Additionally, Zen 3 incorporates new security features and a unified 8-core design that optimizes cache efficiency and reduces latency.

Uploaded by

dhanashree270994

The AMD Next-Generation “Zen 3” Core

•Presented by: Group FRI12


•Vijayasri Kristaparapu
•Dhanashree Shinde

Date: 20th Oct 2025


Based on: Theme Article: Hot Chips 33
1
Background and History

• AMD launched the Zen core architecture in early 2017, marking a complete redesign from prior generations.
• Delivered major gains in instructions per cycle (IPC) and introduced an innovative system-on-chip (SoC) design featuring four-core complexes.
• Zen 2 (2019) improved IPC, clock speeds, and cache size while transitioning to a 7 nm process.
• Zen 3 (2020–2021) rearchitected the cache hierarchy into unified eight-core complexes, enabling lower latency and higher single-thread performance.
• Introduced new ISA extensions, expanded security features, and compatibility with previous AM4 sockets for easy platform upgrades.
• Supported innovations for servers, datacenters, and supercomputers, including 3D V-Cache integration to boost performance efficiency.

2
Introduction
• Zen 3 is another redesign of AMD’s CPU core architecture.
• Focused on higher single-thread performance and better energy efficiency.
• Supports Simultaneous Multithreading (SMT), carried over from earlier Zen cores, to boost
throughput with additional threads.
• Features a reworked pipeline, enhanced branch prediction, and optimized execution
units.
• Achieves a 19% IPC uplift over Zen 2, the largest improvement since the original
“Zen”.
• Improved fetch, decode, integer, and floating-point execution units contribute to the
performance gain.
• Designed for balanced high performance and power efficiency across diverse
workloads.

3
Block diagram
Front End (Fetch & Decode)
• Advanced branch predictor with reduced
misprediction and taken-branch latency.
• 32 KB L1 instruction cache + 4,096-entry op-cache
(fetch up to 8 ops/cycle).

Integer Execution
• Distributed integer scheduler for higher efficiency.
• 4 ALUs, 3 AGUs, plus new branch & store data
units.
• 10 integer ops/cycle, larger reorder buffer (256
entries).

Floating-Point Execution
• Dispatches 6 ops/cycle.
• 2 add + 2 multiply units → 2 FMA ops/cycle (high
throughput).

Load/Store System
• 32 KB L1 data cache, 512 KB L2 cache, 3 memory
ops/cycle.

4
Branch Prediction and Front End

• Advanced TAGE2 branch predictor optimized for latency and accuracy.


• L1 BTB: 1,024 entries, L2 BTB: 6,656 entries, indirect target table: 1,536
entries.
• 32 KB instruction cache, improved prefetching.
• Faster recovery from mispredictions, reduced branch latency.

5
Integer and Floating-Point Execution
• Integer: Increased issue width: 7 → 10 µops per cycle.
• 4 ALUs, 3 AGUs, plus new branch and store units.
• Larger reorder buffer: 256 entries; Scheduler: 96 entries.
• Floating point: 6 µops per cycle; 2 FMA units (256-bit); FMA latency reduced from 5 to 4
cycles.
• FP scheduler entries: 64 (up from 36 in Zen 2).
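
The FMA figures above imply a peak floating-point rate per core. A minimal sketch of that arithmetic, counting only the two 256-bit FMA units from the slide; the function names and the core count and clock in the usage note are illustrative assumptions, not figures from the source.

```python
# Peak floating-point throughput per core, derived only from the slide's figures.
FMA_UNITS = 2        # from the slide: 2 FMA units
VECTOR_BITS = 256    # from the slide: 256-bit FMA datapath
FLOPS_PER_FMA = 2    # one fused multiply-add counts as a multiply plus an add


def peak_flops_per_cycle(element_bits: int) -> int:
    """Peak FLOPs per cycle per core, counting the FMA units only."""
    lanes = VECTOR_BITS // element_bits
    return FMA_UNITS * lanes * FLOPS_PER_FMA


def peak_gflops(element_bits: int, cores: int, ghz: float) -> float:
    """Peak GFLOP/s for a hypothetical core count and clock (both assumptions)."""
    return peak_flops_per_cycle(element_bits) * cores * ghz


if __name__ == "__main__":
    print(peak_flops_per_cycle(32))  # single precision: 32
    print(peak_flops_per_cycle(64))  # double precision: 16
```

For example, `peak_gflops(32, cores=8, ghz=4.0)` gives 1024.0 for a hypothetical 8-core part at 4.0 GHz. This is an upper bound; sustained throughput depends on the instruction mix and memory behaviour.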

6
Load / Store and Memory

• L1 Data Cache: 32 KB, 8-way; L2: 512 KB per core.


• Store queue: 64 entries (was 48 in Zen 2).
• Improved prefetchers for cross-page and multi-level coordination.
• Supports 3 memory ops per cycle (up to 3 loads, of which at most 2 can be stores).
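
The L1 data cache geometry above determines how an address maps to a cache set. A rough sketch, assuming a 64-byte cache line (the line size is not stated on the slide) and a simple physical-address split; `split_address` is a hypothetical helper.

```python
# Address decomposition for a 32 KB, 8-way set-associative cache.
CACHE_BYTES = 32 * 1024   # from the slide
WAYS = 8                  # from the slide
LINE_BYTES = 64           # assumption: line size is not given on the slide

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 64 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1       # 6 bits select a byte in the line
INDEX_BITS = NUM_SETS.bit_length() - 1          # 6 bits select the set


def split_address(addr: int) -> tuple[int, int, int]:
    """Return (tag, set index, byte offset) for an address."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    print(NUM_SETS)                # 64
    print(split_address(0x12345))  # (18, 13, 5)
```

Under this assumption, two addresses conflict in the L1 only when they share the same 6 index bits, and each set can hold 8 such lines before one must be evicted.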

7
L3 Cache and Core Complex

• Unified 8-core CCX with shared 32 MB L3 (vs. 4-core/16 MB in Zen 2).


• Reduces latency and improves data sharing among cores.
• Bi-directional ring bus interconnect for low-latency L3 access.
• L3 filled from L2 victims only for better utilization.
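
The victim-only fill policy can be illustrated with a toy model: a miss fills the L2 only, and the L3 receives a line only when the L2 evicts it. This is a sketch under simplifying assumptions (single core, fully associative LRU caches, tiny capacities, and an L3 hit leaves the line in the L3); it is not the actual Zen 3 replacement logic.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU cache of line addresses (toy model)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.lines = OrderedDict()

    def touch(self, addr: int) -> bool:
        """Return True on a hit and mark the line most recently used."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        return False

    def insert(self, addr: int):
        """Insert a line; return the evicted victim address, or None."""
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            victim, _ = self.lines.popitem(last=False)
            return victim
        return None


def access(l2: LRUCache, l3: LRUCache, addr: int) -> str:
    """Look up L2, then L3. Fills go to L2 only; L3 is filled by L2 victims."""
    if l2.touch(addr):
        return "L2 hit"
    result = "L3 hit" if l3.touch(addr) else "miss"
    victim = l2.insert(addr)       # the requested line goes into L2, not L3
    if victim is not None:
        l3.insert(victim)          # only an evicted L2 line enters L3
    return result


if __name__ == "__main__":
    l2, l3 = LRUCache(2), LRUCache(4)
    for a in (0xA, 0xB, 0xC, 0xA, 0xC):
        print(hex(a), access(l2, l3, a))
```

In the run above, line `0xA` misses, is later evicted from the two-entry L2 into the L3, and is then found there on its second access. Lines that are never evicted from the L2 never occupy L3 space, which is the utilization benefit the slide refers to.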

8
3D V-Cache

• AMD’s 3D V-Cache: vertical stacking of extra L3 cache.


• Adds 64 MB stacked L3 to base 32 MB → 96 MB per CCD.
• Copper-to-copper bonding provides high bandwidth, low power.
• ~15% gaming FPS uplift with V-Cache prototype (12-core test).
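
The capacity figures above can be written out as a short calculation. A minimal sketch; the assumption that a 12-core part uses two CCDs is not stated on the slide, and the function names are illustrative.

```python
# L3 capacity with and without the stacked cache, using the slide's figures.
BASE_L3_MB = 32      # from the slide: base L3 per CCD
STACKED_L3_MB = 64   # from the slide: stacked L3 added per CCD


def l3_per_ccd_mb(stacked: bool) -> int:
    """L3 capacity of one CCD in MB."""
    return BASE_L3_MB + (STACKED_L3_MB if stacked else 0)


def l3_total_mb(ccds: int, stacked: bool) -> int:
    """Total L3 across all CCDs; each CCD's L3 is private to that CCD."""
    return ccds * l3_per_ccd_mb(stacked)


if __name__ == "__main__":
    print(l3_per_ccd_mb(stacked=True))          # 96
    print(l3_total_mb(ccds=2, stacked=False))   # 64
    print(l3_total_mb(ccds=2, stacked=True))    # 192 (assumes two CCDs)
```

Stacking triples the L3 capacity available to each eight-core complex, so workloads whose working set falls between 32 MB and 96 MB are the ones most likely to benefit.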

9
Security Features
• SEV: Secure Encrypted Virtualization (per-VM memory encryption).
• SEV-ES: Adds encrypted CPU register state.
• SEV-SNP (Secure Nested Paging): Protects VM memory mappings from a malicious hypervisor.
• CET Shadow Stack and Memory Protection Keys for client CPUs.
• 256-bit encryption instruction extensions (VAES, VPCLMULQDQ).
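
On a Linux system, several of the features listed above can be checked from the CPU flags line in `/proc/cpuinfo`. A hedged sketch: the flag spellings in `FEATURE_FLAGS` are assumptions based on common Linux kernel naming and may differ by kernel version, and the shadow-stack flag is omitted because its name has varied.

```python
# Check a /proc/cpuinfo-style text for feature flags named on the slide.
# The flag spellings below are assumptions about Linux kernel naming.
FEATURE_FLAGS = {
    "VAES": "vaes",
    "VPCLMULQDQ": "vpclmulqdq",
    "SEV": "sev",
    "SEV-ES": "sev_es",
    "Memory Protection Keys": "pku",
}


def parse_flags(cpuinfo_text: str) -> set[str]:
    """Return the set of flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def report(cpuinfo_text: str) -> dict[str, bool]:
    """Map each feature name to whether its flag is present."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or the file is unreadable
    for name, present in report(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

A missing flag does not prove the hardware lacks the feature: firmware settings, the kernel version, or running inside a virtual machine can all hide it.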

10
Performance Highlights

• +19% IPC uplift at fixed frequency vs. Zen 2.


• +26–50% gaming performance boost.
• Better power efficiency, lower effective latency.
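
The IPC figure above is quoted at fixed frequency, so it combines multiplicatively with any clock change. A small sketch of that relationship, using the simple model performance = IPC × frequency; the 4% clock increase in the usage note is a made-up value for illustration only.

```python
# Single-thread speedup under the simple model: performance = IPC x frequency.
IPC_UPLIFT = 0.19   # from the slide: +19% IPC vs. Zen 2 at fixed frequency


def speedup(ipc_uplift: float, freq_ratio: float = 1.0) -> float:
    """Relative performance given an IPC uplift and a new/old clock ratio."""
    return (1.0 + ipc_uplift) * freq_ratio


if __name__ == "__main__":
    # At the same clock, the speedup equals the IPC uplift.
    print(round(speedup(IPC_UPLIFT), 4))         # 1.19
    # With a hypothetical 4% higher clock (assumption, not from the source).
    print(round(speedup(IPC_UPLIFT, 1.04), 4))   # 1.2376
```

The model ignores memory-bound behaviour, where performance scales less than linearly with core clock, so it is best read as an upper estimate for compute-bound code.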

11
Conclusion and Key Takeaways

• Zen 3’s unified 8-core design greatly reduces latency and boosts IPC.
• The bi-directional ring bus ensures fast, balanced L3 access.
• The victim-only L3 policy maximizes cache efficiency.
• 3D V-Cache demonstrates AMD’s leadership in stacked memory design.
• New security features strengthen cloud and client protection.
• Zen 3 laid the foundation for Zen 4 and Zen 5, showing how smart
architecture can deliver big gains even without a smaller process node.

12
THANK
YOU!!

13
Questions and
Feedback

14
