BLD: use smaller scipy-openblas builds #27147
|
@charris This would be nice to backport since it shrinks the wheel sizes |
|
Thanks Matti. |
Done for 2.1 because it will have an rc, but skipped for 2.0. |
|
Has anyone compared these benchmarks against the Zen kernels on AMD chips? The original post only tested Intel archs because it's a mac-focused repo, but it's entirely possible that there will be a not-insignificant performance difference. |
|
We would need someone to rerun the benchmark scripts with an AMD processor that has AVX512 features. |
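
For anyone checking whether a machine qualifies before rerunning the benchmarks, here is a minimal sketch. It assumes Linux, where `/proc/cpuinfo` lists feature flags and `avx512f` marks the AVX-512 Foundation subset. The helper names `cpu_flags` and `has_avx512` are made up for illustration and are not part of the benchmark scripts.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from /proc/cpuinfo-style text (x86 Linux)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough: all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512(cpuinfo_text: str) -> bool:
    """True if the AVX-512 Foundation flag (avx512f) is listed."""
    return "avx512f" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; assumption of this sketch
    if path.exists():
        print("AVX512F available:", has_avx512(path.read_text()))
    else:
        print("No /proc/cpuinfo here; check CPU features another way.")
```

This only reports what the kernel exposes; it says nothing about which OpenBLAS kernel will actually be selected.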
|
Only AVX2 over here, unfortunately. |
|
Looks like M7a instances should do the trick. Edit: I'm just going to do it. |
|
ZEN is marginally worse than SKYLAKEX despite, according to the OpenBLAS docs, being HASWELL with zen2/3 optimizations (i.e. no AVX512). Curious what this looks like on my local zen 3 machine.
|
resounding meh |
|
I am not sure what I am seeing. What are the two results? |
|
First one is an AWS m7a-medium. One zen 4 core. Second is my personal machine, which is zen 3. Can't really see a reason to include the ZEN kernel based on either of those. |
|
And that is using an openblas from before the shrink? |
|
Hm. If I followed the instructions from the script repository exactly, it would have pulled down the latest scipy-openblas, wouldn't it?
Builds on #27140 to use the same OpenBLAS build but with fewer kernels. Based on the analysis in MacPython/openblas-libs#144, there are now 5 kernels, named by CPU core label:
PRESCOTT, NEHALEM, SANDYBRIDGE, HASWELL, and SKYLAKEX. Needs a release note about the possible performance implications, and will also add a note about the Windows changes in #27140.
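
To see which of these kernels a given machine ends up with, something like the following sketch may help. It assumes `threadpoolctl` is installed alongside NumPy and relies on the `architecture` field that `threadpool_info()` reports for OpenBLAS. The helper name `openblas_architectures` is hypothetical and not part of this PR.

```python
def openblas_architectures(pool_info):
    """Collect the 'architecture' value of every OpenBLAS entry.

    `pool_info` is a list of dicts shaped like the output of
    threadpoolctl.threadpool_info().
    """
    return [
        entry.get("architecture", "unknown")
        for entry in pool_info
        if entry.get("internal_api") == "openblas"
    ]


if __name__ == "__main__":
    try:
        import numpy  # noqa: F401  (importing loads NumPy's bundled OpenBLAS)
        from threadpoolctl import threadpool_info
    except ImportError:
        print("numpy and threadpoolctl are needed for the live check")
    else:
        # Prints the kernel name(s) OpenBLAS picked on this CPU.
        print(openblas_architectures(threadpool_info()))
```

On builds with runtime kernel dispatch, OpenBLAS also documents an `OPENBLAS_CORETYPE` environment variable for forcing a specific kernel; it has to be set before the library is loaded, which is handy when comparing kernels on the same machine.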