Skip to content

Conversation

@changhoon-sung
Copy link
Contributor

@changhoon-sung changhoon-sung commented Apr 19, 2025

This PR closes #8592 and adds support for the AMD Zen5 architecture.

A comparison of Zen5 with adjacent architectures in terms of supported features is as follows:

  • Zen4: Zen5 is a superset of Zen4 and includes all of its features.
  • SapphireRapids: Zen5 supports AVX_VNNI from SapphireRapids, but does not support AVX512-FP16 or AMX.
  • Zen5: Introduces support for AVX512VP2INTERSECT, which is not supported by SapphireRapids.

Key changes in this PR:

  1. Separates AVX_VNNI from SapphireRapids so that it can also be used on Zen5.
  2. Adds the AMD Zen5 architecture target.
  3. Update AVX_VNNI, AVX512VNNI, AVX512BF16 test cases. AVX_VNNI is tested separately when the feature is present. AVX512VNNI and AVX512BF16 can operate on 128-bit and 256-bit vectors when AVX512VL is supported. Since the AMD_Zen4 target implies AVX512VL, full-width vector testing is performed for this target.

Test Environment:

  • Processor
    • AMD Zen5: AMD Ryzen 9 9950X
    • SapphireRapids: Intel(R) Xeon(R) Platinum 8488C
  • OS: Ubuntu 24.04.2
  • Compilers: GNU GCC 13.3.0, LLVM 19.1.7

@changhoon-sung changhoon-sung changed the title AMD Zen4 support AMD Zen5 support Apr 19, 2025
@changhoon-sung changhoon-sung marked this pull request as ready for review April 20, 2025 15:57
@abadams abadams added the dev_meeting Topic to be discussed at the next dev meeting label Apr 23, 2025
@mcourteaux
Copy link
Contributor

Perhaps for the dev-meeting, but I am inclined to argue that feature names (or at the very least their documentation) needs to clearly state which architecture they are intended for. Not so long ago, I got confused, as I wanted to do target.has_feature(Target::FMA), which returns false on most architectures that do have FMA support. The culprit of course is that Target::FMA is a feature flag meant to be used exclusively for x86 architectures. The problem here is that FMA and F16C, sound like functionalities, instead of specific instruction set extensions. I'd argue that complementary to documentation improvements, we could add Target::host_supports_fma() and Target::host_supports_f16_conversion() and Target::host_supports_f16_ops(), or something like this.

}
if (t.has_feature(Target::AVX512_Zen5)) {
t.set_feature(Target::AVX512_Zen4);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Zen5 and SapphireRapids be setting the AVX_VNNI flag here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since we are populating features from a target, AVXVNNI should also be enabled for Zen5 and SapphireRapids. Thank you!

@changhoon-sung
Copy link
Contributor Author

changhoon-sung commented Apr 30, 2025

@mcourteaux @abadams I may be misunderstanding the broader context, but from what I can see, providing this functionality in a clean way would likely require moving away from the current subset/superset model and avoiding the pattern of treating architecture names as feature sets. Instead, I believe it would be better if each target explicitly listed its supported features one-by-one, ideally with target names composed only of features, not architecture labels.

The clear upside is that this avoids edge cases from implicit relationships, but the downside is the cost of changing existing code and maintaining consistency going forward, the increased verbosity from having to declare both feature-to-arch support and arch-to-feature mappings. Still, for finer-grained and unambiguous control, this may be a necessary step.

One concern I have is that some features may include architectural traits beyond instruction sets—such as register widths—which might complicate this approach. I’m not sure if that’s the case, so I’d appreciate any clarification on that.

@abadams If there are no other reviews or changes needed on this PR, would it be okay to go ahead and merge this first? I’d be happy to open a follow-up issue to continue this discussion and will mention the related discussions.

@abadams
Copy link
Member

abadams commented Apr 30, 2025

The conclusion from the dev meeting is that we're happy with this design. The only issue preventing a merge is that there appears to be a simd_op_check failure with llvm 18 (the two other failing bots are unrelated)

@changhoon-sung
Copy link
Contributor Author

The issue was caused by the znver5 tuning flag not being supported in LLVM 18. I’ve updated the configuration to use znver5 only when LLVM version 19 or higher is detected, and to fall back to znver4 for earlier versions.

@changhoon-sung changhoon-sung requested a review from abadams May 1, 2025 16:34
case Target::Processor::ZnVer4:
return "znver4";
case Target::Processor::ZnVer5:
return (Halide::Internal::get_llvm_version() >= 190) ? "znver5" : "znver4";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preferred way to do this is:

return (LLVM_VERSION >= 190) ? "znver5" : "znver4";

It's preferred just because that's the pattern we grep for when dropping support for an llvm version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I've updated it as you pointed out

@abadams
Copy link
Member

abadams commented May 8, 2025

Failures are unrelated flakes

@abadams abadams merged commit bf9c55d into halide:main May 8, 2025
17 of 19 checks passed
@abadams
Copy link
Member

abadams commented May 8, 2025

Thanks for this! I have merged it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dev_meeting Topic to be discussed at the next dev meeting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AMD Zen5 Architecture Support

3 participants