-
Notifications
You must be signed in to change notification settings - Fork 1.1k
AMD Zen5 support #8612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMD Zen5 support #8612
Conversation
6bdd499 to
24aa368
Compare
24aa368 to
77fba01
Compare
|
Perhaps for the dev-meeting, but I am inclined to argue that feature names (or at the very least their documentation) needs to clearly state which architecture they are intended for. Not so long ago, I got confused, as I wanted to do |
| } | ||
| if (t.has_feature(Target::AVX512_Zen5)) { | ||
| t.set_feature(Target::AVX512_Zen4); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Zen5 and SapphireRapids be setting the AVX_VNNI flag here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since we are populating features from a target, AVXVNNI should also be enabled for Zen5 and SapphireRapids. Thank you!
|
@mcourteaux @abadams I may be misunderstanding the broader context, but from what I can see, providing this functionality in a clean way would likely require moving away from the current subset/superset model and avoiding the pattern of treating architecture names as feature sets. Instead, I believe it would be better if each target explicitly listed its supported features one-by-one, ideally with target names composed only of features, not architecture labels. The clear upside is that this avoids edge cases from implicit relationships, but the downside is the cost of changing existing code and maintaining consistency going forward, the increased verbosity from having to declare both feature-to-arch support and arch-to-feature mappings. Still, for finer-grained and unambiguous control, this may be a necessary step. One concern I have is that some features may include architectural traits beyond instruction sets—such as register widths—which might complicate this approach. I’m not sure if that’s the case, so I’d appreciate any clarification on that. @abadams If there are no other reviews or changes needed on this PR, would it be okay to go ahead and merge this first? I’d be happy to open a follow-up issue to continue this discussion and will mention the related discussions. |
|
The conclusion from the dev meeting is that we're happy with this design. The only issue preventing a merge is that there appears to be a simd_op_check failure with llvm 18 (the two other failing bots are unrelated) |
|
The issue was caused by the |
src/CodeGen_X86.cpp
Outdated
| case Target::Processor::ZnVer4: | ||
| return "znver4"; | ||
| case Target::Processor::ZnVer5: | ||
| return (Halide::Internal::get_llvm_version() >= 190) ? "znver5" : "znver4"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preferred way to do this is:
return (LLVM_VERSION >= 190) ? "znver5" : "znver4";
It's preferred just because that's the pattern we grep for when dropping support for an llvm version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I've updated it as you pointed out
|
Failures are unrelated flakes |
|
Thanks for this! I have merged it. |
This PR closes #8592 and adds support for the AMD Zen5 architecture.
A comparison of Zen5 with adjacent architectures in terms of supported features is as follows:
Key changes in this PR:
Test Environment: