Skip to content

Comments

Detect x86_64 microarchitecture variant automatically#9788

Draft
zanieb wants to merge 2 commits intomainfrom
zb/micro-detect
Draft

Detect x86_64 microarchitecture variant automatically#9788
zanieb wants to merge 2 commits intomainfrom
zb/micro-detect

Conversation

@zanieb
Copy link
Member

@zanieb zanieb commented Dec 10, 2024

e.g., on my test machine

❯ cargo run -- python install 3.12
Installed Python 3.12.8 in 3.65s
 + cpython-3.12.8-linux-x86_64_v3-gnu

Note this is split into two commits:

  • c0f146c: which adds detection
  • 5d78bf2: which improves behaviors and compatibility between architecture variants

@zanieb zanieb added the enhancement New feature or improvement to existing functionality label Dec 10, 2024
Comment on lines +114 to +160
if is_x86_feature_detected!("avx512f")
&& is_x86_feature_detected!("avx512bw")
&& is_x86_feature_detected!("avx512cd")
&& is_x86_feature_detected!("avx512dq")
&& is_x86_feature_detected!("avx512vl")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh no clue if we should check for all of these or not

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those checks are necessary I think.

I'd even go further and do it similar to https://github.com/HenrikBengtsson/x86-64-level (fyi, it takes the values from /proc/cpuinfo so the names slightly differ)

determine_cpu_version() {
    ## x86-64-v0 (can this happen?)
    level=0
    
    ## x86-64-v1
    has_cpu_flags lm cmov cx8 fpu fxsr mmx syscall sse2 || return 0
    level=1

    ## x86-64-v2
    has_cpu_flags cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3 || return 0
    level=2
    
    ## x86-64-v3
    has_cpu_flags avx avx2 bmi1 bmi2 f16c fma abm movbe xsave || return 0
    level=3

    ## x86-64-v4
    has_cpu_flags avx512f avx512bw avx512cd avx512dq avx512vl || return 0
    level=4
}

I think going bottom-up like this is more robust although not really necessary in practice I guess. A CPU supporting AVX512 will support AVX2 😃 Unless someone is running uv in a crazy emulated environment. Nevertheless, it could be a good idea to comment that each level is a superset of the level below and that in practice it's okay to check top-down instead of bottom-up.

But that abm feature is confusing me, I couldn't find it directly mentioned in these sources

@zanieb zanieb force-pushed the zb/micro-detect branch 11 times, most recently from e8ace72 to 359b5bb Compare December 10, 2024 22:13
@notatallshaw-gts
Copy link

notatallshaw-gts commented Dec 10, 2024

While mostly I would want uv to automatically detect which microarchitecture variant to grab, I have a concern here related to docker images, specifically if uv is used to install Python in the dockerfile, what happens when the build machine and execution machine support different instruction sets?

@zanieb
Copy link
Member Author

zanieb commented Dec 10, 2024

I have a concern here related to docker images, specifically if uv is used to install Python in the dockerfile, what happens when the build machine and execution machine support different instruction sets?

I think it's fair to say the build machine will need to be on the same (or an older) microarchitecture than your execution machine. Do you think that's unreasonable?

@notatallshaw-gts
Copy link

I think it's fair to say the build machine will need to be on the same (or an older) microarchitecture than your execution machine. Do you think that's unreasonable?

Only in the sense it's not something I have to think about right now, and without being familiar with this I'm not sure it would be easy to diagnose when a failure happened?

@zanieb
Copy link
Member Author

zanieb commented Dec 10, 2024

Only in the sense it's not something I have to think about right now, and without being familiar with this I'm not sure it would be easy to diagnose when a failure happened?

Yeah the interpreter would probably just crash at runtime, or, if you use uv as your entry point, it would be "missing"

@notatallshaw-gts
Copy link

Maybe this is already an issue with other tools that are installed into docker images, and it's not that common to face, and people who do face this problem are familiar with the symptom and cause. I just never thought about it before now.

@zanieb zanieb force-pushed the zb/micro-detect branch 2 times, most recently from c3ec284 to d51e5d7 Compare December 10, 2024 22:31
@danielhollas
Copy link
Contributor

Maybe this is already an issue with other tools that are installed into docker images, and it's not that common to face, and people who do face this problem are familiar with the symptom and cause. I just never thought about it before now.

I don't think this is a common issue right now, and you raised a very valid concern. For pre-built binary artifacts, for example most Linux distros AFAIK use conservative microarchitecture and don't do any runtime detection.

There has been some discussion about this in Fedora but the proposal to bump microarchutecture has been withdrawn?

https://discussion.fedoraproject.org/t/will-future-releases-require-x86-64-v3/131200/6

@drmikehenry
Copy link

Assuming this feature is eventually implemented, it would be helpful to have a way to disable CPU microarchitecture detection to retain the current behavior that v1 is used by default. This would make it possible for folks who are mirroring python-build-standalone to continue to mirror only the v1 binaries rather than the whole set of microarchitectures, reducing both space and time requirements for mirroring. Developers might also prefer to use the same Python executables across an entire team for consistency.

@zanieb
Copy link
Member Author

zanieb commented Dec 28, 2024

@drmikehenry thanks for the feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or improvement to existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants