Skip to content

Conversation

@r-devulap
Copy link
Member

@r-devulap r-devulap commented Oct 29, 2024

As discussed in the optimization meeting, @seiko2plus wanted to split VSX and VXE into two separate PR's.

For clarification, SIMD optimizations for sine and cosine functions on both ppc64 and z/Architecture (IBM Z) were disabled by #25781 to bypass CI tests. This PR aims to re-enable optimizations for z/Architecture after addressing the following runtime errors, while #27627 re-enabled ppc64 optimizations.

numpy/_core/tests/test_umath_accuracy.py:84: AssertionError
=========================== short test summary info ============================
FAILED numpy/_core/tests/test_umath.py::TestSpecialFloats::test_sincos_values[f] - AssertionError: 
FAILED numpy/_core/tests/test_umath.py::TestSpecialFloats::test_sincos_errors[inf-f-sin] - AssertionError: FloatingPointError not raised by sin
FAILED numpy/_core/tests/test_umath.py::TestSpecialFloats::test_sincos_errors[inf-f-cos] - AssertionError: FloatingPointError not raised by cos
FAILED numpy/_core/tests/test_umath.py::TestSpecialFloats::test_sincos_errors[-inf-f-sin] - AssertionError: FloatingPointError not raised by sin
FAILED numpy/_core/tests/test_umath.py::TestSpecialFloats::test_sincos_errors[-inf-f-cos] - AssertionError: FloatingPointError not raised by cos
FAILED numpy/_core/tests/test_umath.py::TestAVXFloat32Transcendental::test_sincos_float32 - AssertionError: Arrays are not almost equal up to 2 ULP (max difference is ...
FAILED numpy/_core/tests/test_umath.py::TestAVXFloat32Transcendental::test_strided_float32 - AssertionError: Arrays are not equal to 2 ULP (max is 1.05785e+09)
FAILED numpy/_core/tests/test_umath_accuracy.py::TestAccuracy::test_validate_fp16_transcendentals[cos] - AssertionError: Arrays are not almost equal up to 1 ULP (max difference is ...
FAILED numpy/_core/tests/test_umath_accuracy.py::TestAccuracy::test_validate_fp16_transcendentals[sin] - AssertionError: Arrays are not almost equal up to 1 ULP (max difference is ...
= 9 failed, 22213 passed, 1559 skipped, 26186 deselected, 7 xfailed, 2 xpassed in 555.12s (0:09:15) =

@r-devulap r-devulap added component: SIMD Issues in SIMD (fast instruction sets) code or machinery HWY features related to google Highway labels Oct 29, 2024
@seberg seberg requested a review from seiko2plus November 7, 2024 09:41
"SIMD in general.");
npy_uint64 simd_maski;
hn::StoreMaskBits(f32, simd_mask, (uint8_t*)&simd_maski);
hn::StoreMaskBits(f32, simd_mask, (uint8_t *)&simd_maski);
Copy link
Member

@seiko2plus seiko2plus Nov 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better approach is needed for libc fallback to properly handle scalable extensions. One potential improvement could be introducing a new intrinsic, such as bool TestBit(MASK, int pos), which would also be endian-friendly. For now, I applied a quick fix below to address a big-endian bug, as passing a uint64_t as a uint8_t array obviously lead to accessing garbage bytes instead.

@seiko2plus
Copy link
Member

@r-devulap, Oh, my editor automatically reformatted the entire source code based on the NumPy clang-format configuration. Would you prefer that I revert these changes, or is this formatting acceptable for you?

@seiko2plus
Copy link
Member

All CI tests for s390x have passed.

@seberg
Copy link
Member

seberg commented Nov 22, 2024

@r-devulap if you are happy about the changes please just put it in (I don't think the reformatting matters too much -- it's not that much).
2.2.0 is going to be branched very soon, it it seems reasonable to want to have this in.

@r-devulap
Copy link
Member Author

@r-devulap, Oh, my editor automatically reformatted the entire source code based on the NumPy clang-format configuration. Would you prefer that I revert these changes, or is this formatting acceptable for you?

Yeah no worries. This looks better anyways.

@r-devulap
Copy link
Member Author

@r-devulap if you are happy about the changes please just put it in (I don't think the reformatting matters too much -- it's not that much). 2.2.0 is going to be branched very soon, it it seems reasonable to want to have this in.

yeah makes sense. I had to fix some rebase problems. But once the CI passes it should be good to go in.

@seberg
Copy link
Member

seberg commented Nov 22, 2024

Let's put this in then, thanks @r-devulap and @seiko2plus.

@seberg seberg merged commit be54bff into numpy:main Nov 22, 2024
69 checks passed
@Sunny-Anand
Copy link

Not sure if this is having any impact here. google/highway#2409

@Sunny-Anand
Copy link

I can confirm that trying to install numpy>= 2.0.0 for python3.10 breaks on UBI-8 image.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

01 - Enhancement component: SIMD Issues in SIMD (fast instruction sets) code or machinery HWY features related to google Highway

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants