matsui528
left a comment
As sse2neon.h is a third-party file, can you move it from src/sse2neon.h to src/extern/sse2neon.h?
This is an excellent PR. Thank you! Although I haven't tested it because I don't have an ARM computer right now, I'll merge it once you resolve the comment above. I've heard that ARM runners will be available next year. After that, it would be great if you could write a CI job to test the code on ARM runners.
Moved the I should be able to do the corresponding CI work once GitHub makes that available. In the meantime, if there's any manual tests other than
Thanks again! It's merged. But I ran into another problem uploading the files to PyPI, and I cannot update the version of the library (#62). If you know the solution, please let me know. If not, I'll solve it someday. (Note that this is not an ARM-related problem.)
@timzag Hi! I just updated the PyPI version to v0.2.11, which includes this PR. Can you please run
@matsui528 Sorry about the delay, the installation works fine in my environment (M2 Max, Ventura 13.5.2, Python 3.11) and
Perfect!
I hope this is a reasonable fix; my C++ is not strong. If there's something wrong with this solution, I'm happy to try another approach.
Newer Mac hardware does not use x86 processors, and as far as I can tell from the error message attached to #57, <x86intrin.h> isn't functional. https://github.com/DLTcollab/sse2neon seems to contain the necessary SSE intrinsics for Arm/Aarch64. It uses the MIT license.
This PR adds the sse2neon.h header and only includes it when the current architecture is aarch64. Tested the fix on a Mac M2 (aarch64) and an AWS m5.xlarge EC2 instance (x86/64).
Also fixed the attribution on a previous fix in the README, I mistakenly did that on the wrong account.
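For readers unfamiliar with the pattern, the architecture-dependent include described above can be sketched roughly as follows. This is not the actual diff from this PR: the relative path "extern/sse2neon.h", the SKETCH_HAVE_SSE macro, and the add4 helper are all hypothetical names chosen for illustration, and the __has_include guard plus scalar fallback are additions so the snippet also compiles where neither header is available.

```cpp
#include <array>
#include <cstddef>

// Pick a header that provides SSE intrinsics for the current target.
// Assumption: on aarch64 the third-party header sits at "extern/sse2neon.h"
// relative to this file; adjust the path to the real project layout.
#if defined(__aarch64__) && defined(__has_include)
#  if __has_include("extern/sse2neon.h")
#    include "extern/sse2neon.h"   // maps SSE intrinsics onto NEON
#    define SKETCH_HAVE_SSE 1      // hypothetical macro, not from the PR
#  endif
#elif defined(__SSE__)
#  include <x86intrin.h>           // native SSE on x86 with GCC/Clang
#  define SKETCH_HAVE_SSE 1
#endif

// Element-wise sum of two 4-float vectors (hypothetical helper).
// Uses the SSE intrinsics when a provider header was found above,
// otherwise falls back to a plain loop.
inline std::array<float, 4> add4(const std::array<float, 4>& a,
                                 const std::array<float, 4>& b) {
    std::array<float, 4> out{};
#if defined(SKETCH_HAVE_SSE)
    const __m128 va = _mm_loadu_ps(a.data());
    const __m128 vb = _mm_loadu_ps(b.data());
    _mm_storeu_ps(out.data(), _mm_add_ps(va, vb));
#else
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] + b[i];
    }
#endif
    return out;
}
```

The calling code stays identical on both architectures: the same _mm_* calls compile against <x86intrin.h> on x86 and against sse2neon.h on aarch64, so only the include block needs to know about the target.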
This PR resolves #57.