Member

@seiko2plus seiko2plus commented Oct 9, 2025

The failures are triggered when the Intel x86 sort AVX‑512 kernels for 16‑bit types
are enabled at build time and the CPU/OS also supports them. A quick look suggests that
correct instructions are not being generated for `zmm_vector<float16>::ge(reg_t, reg_t)`.

This patch does not fix the underlying bug. Instead, it disables these kernels on 32‑bit MSVC builds as a stop‑gap, since the issue requires further investigation and an upstream fix. Note: as part of gh-28896, newer NumPy releases may drop AVX‑512 support entirely on 32‑bit builds for all compilers and enable at most AVX2.
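
For illustration only, a compile-time guard of this kind could look roughly like the sketch below. It relies on the `_MSC_VER` and `_M_IX86` macros that MSVC predefines for 32‑bit x86 targets. The macro `EXAMPLE_DISABLE_AVX512_F16_SORT` and the function `example_use_avx512_f16_sort` are hypothetical names made up for this example and are not taken from the actual patch.

```cpp
// Hypothetical names, for illustration only; not the code from this PR.
// _MSC_VER is predefined by MSVC, _M_IX86 when targeting 32-bit x86.
#if defined(_MSC_VER) && defined(_M_IX86)
    #define EXAMPLE_DISABLE_AVX512_F16_SORT 1
#else
    #define EXAMPLE_DISABLE_AVX512_F16_SORT 0
#endif

// Returns true only when the build has not disabled the 16-bit AVX-512
// sort path and the runtime check reports CPU/OS support for it.
constexpr bool example_use_avx512_f16_sort(bool cpu_has_support)
{
    return !EXAMPLE_DISABLE_AVX512_F16_SORT && cpu_has_support;
}
```

With a guard like this, a dispatcher would fall back to another sort path on 32‑bit MSVC builds even when the CPU reports AVX‑512 support.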

closes #29808

@seiko2plus added the `09 - Backport-Candidate` and `component: SIMD` labels Oct 9, 2025
@seiko2plus
Member Author

cc @r-devulap

Member

charris commented Oct 9, 2025

Is this only a problem with MSVC?

@charris merged commit 53b0d99 into numpy:main Oct 9, 2025
77 checks passed
Member

charris commented Oct 9, 2025

Thanks Sayed.

@r-devulap
Member

Thanks @seiko2plus for debugging and fixing this! Disabling on 32-bit MSVC sounds fine to me.

charris pushed a commit to charris/numpy that referenced this pull request Oct 9, 2025
@charris removed the `09 - Backport-Candidate` label Oct 9, 2025
charris added a commit that referenced this pull request Oct 9, 2025
BUG: Fix float16-sort failures on 32-bit x86 MSVC (#29908)
MaanasArora pushed a commit to MaanasArora/numpy that referenced this pull request Oct 13, 2025
IndifferentArea pushed a commit to IndifferentArea/numpy that referenced this pull request Dec 7, 2025
Labels

00 - Bug, component: SIMD

Development

Successfully merging this pull request may close these issues.

New float16 failures in 32-bit Windows wheel build jobs