-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Description
More generally, we need to be smarter about features that imply other features.
If you manually set the target, e.g. x86-64-linux-avx2, then the x86 backend completes this target by setting all implied features to produce x86-64-linux-sse41-avx-f16c-fma-avx2 and uses that to make instruction selection decisions. So you don't need to manually set implied features in a build(*).
Host target detection on such a machine will return the latter completed target rather than the former. This is getting increasingly ridiculous. Sapphire rapids avx512 implies everything in cannonlake avx512, skylake avx512, and all the way down, so on a sapphire rapids machine, host target detection returns:
x86-64-linux-sse41-avx-f16c-fma-avx2-avx512-avx512_skylake-avx512_cannonlake-avx512_sapphire_rapids
zen4's avx512 support is a superset of cannonlake and a subset of sapphire rapids, so once that's merged we'd generate:
x86-64-linux-sse41-avx-f16c-fma-avx2-avx512-avx512_skylake-avx512_cannonlake-avx512_zen4-avx512_sapphire_rapids
This discourages us from being too granular with what generations of thing we support, which is silly. Target strings shouldn't be size O(n) after n generations. If we wanted to be very precise on arm, we'd have to have target strings like:
arm-64-linux-aarchv81-aarchv82-aarchv84-aarchv85 ....
I think we should stop manually setting implied features, and rely on the backends to do the feature bit completion on a local copy of the target as necessary if it's convenient for them to make instruction selection decisions. The only thing we'd really need to change to enact this is to stop setting implied features when we detect the host target. The rest is a matter of usage rather than a code change.
A gotcha I anticipate is if a user schedule switches based on something like get_host_target().has_feature(Target::AVX). We might address this with a new target accessor (supports_feature?) that returns true for implied features too. Or maybe has_feature should return true if the feature is implied. Target.cpp would have to keep a list of feature implications (e.g. avx2 -> avx), and that would also let us centralize target completion for the backends.
Yet another option is to set implied features but not print them when converting a target to a string. I think it's probably better if the set flags in the struct mirrors the string exactly though.
(*) though making that change would break schedules that do things like has_feature(Target::AVX) and want that to return true for avx2, so for existing codebases (e.g. google3) we may not want to change the target strings in use.