AMD Zen5 support #8612

changhoon-sung · 2025-04-19T11:55:16Z

This PR closes #8592 and adds support for the AMD Zen5 architecture.

A comparison of Zen5 with adjacent architectures in terms of supported features is as follows:

- Zen4: Zen5 is a superset of Zen4 and includes all of its features.
- SapphireRapids: Zen5 supports AVX_VNNI from SapphireRapids, but does not support AVX512-FP16 or AMX.
- Zen5: Introduces support for AVX512VP2INTERSECT, which is not supported by SapphireRapids.
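The comparison above can be checked mechanically with plain set operations. The following is a minimal sketch, not Halide code: the feature lists are abbreviated to the extensions named in this description, and the names `FeatureSet`, `is_superset`, and `missing_from` are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Hypothetical, abbreviated feature lists: only extensions relevant to the
// comparison are included, not the full ISA of each processor.
using FeatureSet = std::set<std::string>;

FeatureSet zen4_features() {
    return {"AVX512VL", "AVX512VNNI", "AVX512BF16"};
}

FeatureSet zen5_features() {
    // Zen5 starts from everything Zen4 has, then adds two extensions.
    FeatureSet f = zen4_features();
    f.insert("AVX_VNNI");
    f.insert("AVX512VP2INTERSECT");
    return f;
}

FeatureSet sapphire_rapids_features() {
    return {"AVX512VL", "AVX512VNNI", "AVX512BF16",
            "AVX_VNNI", "AVX512FP16", "AMX"};
}

// True if every feature in b is also present in a.
bool is_superset(const FeatureSet &a, const FeatureSet &b) {
    return std::includes(a.begin(), a.end(), b.begin(), b.end());
}

// Features that b has but a lacks.
FeatureSet missing_from(const FeatureSet &a, const FeatureSet &b) {
    FeatureSet out;
    std::set_difference(b.begin(), b.end(), a.begin(), a.end(),
                        std::inserter(out, out.end()));
    return out;
}
```

With these lists, `is_superset(zen5_features(), zen4_features())` holds, while neither Zen5 nor SapphireRapids is a superset of the other, which is why AVX_VNNI has to become a feature of its own.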

Key changes in this PR:

- Separates AVX_VNNI from SapphireRapids so that it can also be used on Zen5.
- Adds the AMD Zen5 architecture target.
- Updates the AVX_VNNI, AVX512VNNI, and AVX512BF16 test cases. AVX_VNNI is tested separately when the feature is present. AVX512VNNI and AVX512BF16 can operate on 128-bit and 256-bit vectors when AVX512VL is supported. Since the AMD_Zen4 target implies AVX512VL, full-width vector testing is performed for this target.
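The width rule described in the last item can be sketched as a small helper. This is an illustration only and not the code in the test suite: `X86Features` and `vnni_test_widths` are hypothetical names, and the struct models just the two flags that matter for this rule.

```cpp
#include <vector>

// Hypothetical stand-in for the feature flags relevant to the width rule.
struct X86Features {
    bool avx512_vnni = false;
    bool avx512_vl = false;
};

// Returns the vector widths (in bits) at which AVX512VNNI operations could
// be exercised. Without AVX512VL only the 512-bit forms are available; with
// AVX512VL the 128-bit and 256-bit forms become usable as well.
std::vector<int> vnni_test_widths(const X86Features &f) {
    if (!f.avx512_vnni) {
        return {};
    }
    if (f.avx512_vl) {
        return {128, 256, 512};
    }
    return {512};
}
```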

Test Environment:

Processors:
- AMD Zen5: AMD Ryzen 9 9950X
- SapphireRapids: Intel(R) Xeon(R) Platinum 8488C
OS: Ubuntu 24.04.2
Compilers: GNU GCC 13.3.0, LLVM 19.1.7

mcourteaux · 2025-04-25T08:11:23Z

Perhaps for the dev-meeting, but I am inclined to argue that feature names (or at the very least their documentation) need to clearly state which architecture they are intended for. Not so long ago, I got confused, as I wanted to do target.has_feature(Target::FMA), which returns false on most architectures that do have FMA support. The culprit of course is that Target::FMA is a feature flag meant to be used exclusively for x86 architectures. The problem here is that FMA and F16C sound like functionalities instead of specific instruction set extensions. I'd argue that, complementary to documentation improvements, we could add Target::host_supports_fma() and Target::host_supports_f16_conversion() and Target::host_supports_f16_ops(), or something like this.
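To make the distinction between a feature flag and a capability concrete, here is a rough sketch of what such a query might look like. It does not use the Halide API: `Arch`, `MockTarget`, and `supports_fma` are hypothetical, and the sketch assumes that 64-bit ARM provides fused multiply-add in its baseline instruction set while x86 needs the explicit FMA flag.

```cpp
// Hypothetical mock types; these are NOT the Halide Target API.
enum class Arch { X86, ARM64, Other };

struct MockTarget {
    Arch arch = Arch::Other;
    bool x86_fma_flag = false;  // models an x86-only flag such as Target::FMA
};

// Capability query: answers "can this target do fused multiply-add?"
// rather than "is the x86 FMA extension flag set?".
bool supports_fma(const MockTarget &t) {
    switch (t.arch) {
    case Arch::X86:
        // On x86 the capability depends on the instruction set extension.
        return t.x86_fma_flag;
    case Arch::ARM64:
        // Assumption: FMA is part of the baseline on 64-bit ARM.
        return true;
    default:
        // Unknown architecture: answer conservatively.
        return false;
    }
}
```

A flag check on the ARM64 mock would report false, while the capability query reports true, which is the confusion described above.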

abadams · 2025-04-25T20:22:58Z

src/CodeGen_X86.cpp

    }
+    if (t.has_feature(Target::AVX512_Zen5)) {
+        t.set_feature(Target::AVX512_Zen4);
+    }


Should Zen5 and SapphireRapids be setting the AVX_VNNI flag here?

changhoon-sung · 2025-04-26

Yes, since we are populating features from a target, AVXVNNI should also be enabled for Zen5 and SapphireRapids. Thank you!
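The outcome of this exchange can be sketched as a small feature-completion step. This is a simplified illustration rather than the actual code in src/CodeGen_X86.cpp: the `Feature` enum, the `Features` alias, and `complete_features` are hypothetical, and only the implications discussed in this review are modeled.

```cpp
#include <bitset>

// Hypothetical, reduced feature enum; the real Target has many more flags.
enum Feature {
    AVX512_Zen4,
    AVX512_Zen5,
    AVX512_SapphireRapids,
    AVXVNNI,
    FeatureCount
};

using Features = std::bitset<FeatureCount>;

// Fills in the features implied by an architecture-level flag.
Features complete_features(Features f) {
    if (f[AVX512_Zen5]) {
        f.set(AVX512_Zen4);  // Zen5 is a superset of Zen4
        f.set(AVXVNNI);      // Zen5 also provides AVX_VNNI
    }
    if (f[AVX512_SapphireRapids]) {
        f.set(AVXVNNI);      // SapphireRapids provides AVX_VNNI
    }
    return f;
}
```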

changhoon-sung · 2025-04-30T18:09:39Z

@mcourteaux @abadams I may be misunderstanding the broader context, but from what I can see, providing this functionality in a clean way would likely require moving away from the current subset/superset model and avoiding the pattern of treating architecture names as feature sets. Instead, I believe it would be better if each target explicitly listed its supported features one-by-one, ideally with target names composed only of features, not architecture labels.

The clear upside is that this avoids edge cases from implicit relationships. The downside is the cost of changing existing code and maintaining consistency going forward, plus the increased verbosity from having to declare both feature-to-arch support and arch-to-feature mappings. Still, for finer-grained and unambiguous control, this may be a necessary step.

One concern I have is that some features may include architectural traits beyond instruction sets—such as register widths—which might complicate this approach. I’m not sure if that’s the case, so I’d appreciate any clarification on that.

@abadams If there are no other reviews or changes needed on this PR, would it be okay to go ahead and merge this first? I’d be happy to open a follow-up issue to continue this discussion and will mention the related discussions.

abadams · 2025-04-30T18:46:57Z

The conclusion from the dev meeting is that we're happy with this design. The only issue preventing a merge is that there appears to be a simd_op_check failure with LLVM 18 (the two other failing bots are unrelated).

changhoon-sung · 2025-04-30T20:00:13Z

The issue was caused by the znver5 tuning flag not being supported in LLVM 18. I’ve updated the configuration to use znver5 only when LLVM version 19 or higher is detected, and to fall back to znver4 for earlier versions.
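The fallback described here amounts to a single version comparison. Below is a minimal, testable sketch: `zen5_tune_cpu` is a hypothetical helper that takes the version as a parameter, and the version number is assumed to use the same encoding as the review snippet in this thread, where 190 denotes LLVM 19.

```cpp
#include <string>

// Hypothetical helper: picks the tuning CPU name for a Zen5 target.
// llvm_version is assumed to be encoded so that LLVM 19 is 190 and
// LLVM 18 is 180.
std::string zen5_tune_cpu(int llvm_version) {
    // znver5 is only used with LLVM 19 or newer; older versions fall back
    // to the closest supported tuning, znver4.
    return (llvm_version >= 190) ? "znver5" : "znver4";
}
```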

abadams · 2025-05-06T19:37:19Z

src/CodeGen_X86.cpp

    case Target::Processor::ZnVer4:
        return "znver4";
+    case Target::Processor::ZnVer5:
+        return (Halide::Internal::get_llvm_version() >= 190) ? "znver5" : "znver4";


The preferred way to do this is:

return (LLVM_VERSION >= 190) ? "znver5" : "znver4";

It's preferred just because that's the pattern we grep for when dropping support for an LLVM version.

changhoon-sung · 2025-05-07

Thank you! I've updated it as you pointed out.

abadams · 2025-05-08T16:04:05Z

Failures are unrelated flakes.

abadams · 2025-05-08T16:05:12Z

Thanks for this! I have merged it.

Extract AVXVNNI from SapphireRapids

1ac2c28

changhoon-sung changed the title ~~AMD Zen4 support~~ AMD Zen5 support Apr 19, 2025

changhoon-sung force-pushed the changhoon/zen5 branch from 6bdd499 to 24aa368 Compare April 19, 2025 13:14

changhoon-sung added 2 commits April 19, 2025 07:24

Add AMD Zen5 target

e2d2ae5

Enhance AVX_VNNI support in x86 code generation and tests

77fba01

changhoon-sung force-pushed the changhoon/zen5 branch from 24aa368 to 77fba01 Compare April 19, 2025 14:24

Enhance Zen3/4 identification

282cc41

alexreinking requested a review from halidebuildbots April 19, 2025 15:24

Fix clang format

cc22a9e

changhoon-sung marked this pull request as ready for review April 20, 2025 15:57

abadams added the dev_meeting Topic to be discussed at the next dev meeting label Apr 23, 2025

abadams reviewed Apr 25, 2025

Add AVXVNNI feature flag for Zen5 and SP target completion

9e10521

changhoon-sung added 2 commits April 30, 2025 12:55

Adjust Zen5 target return value based on LLVM version

c228f8e

Add Zen4/5 tuning options for python bindings

7e3e6a1

changhoon-sung requested a review from abadams May 1, 2025 16:34

abadams reviewed May 6, 2025

Update LLVM version check for Zen4/5

f1c0c18

abadams approved these changes May 8, 2025

abadams merged commit bf9c55d into halide:main May 8, 2025
17 of 19 checks passed

mcourteaux mentioned this pull request Jul 4, 2025

No Generation of VNNI instructions on avx2-avxvnni Targets #8673

Open

BrewTestBot mentioned this pull request Sep 16, 2025

halide 21.0.0 Homebrew/homebrew-core#244220

Merged
