Add vir::stdx::simd (Spack: vir-simd) #332
|
@mattkretz are you OK with being listed as Spack package co-maintainer for your library? Please also let me know if I messed up any spellings in the package. |
Title changed from "vir::stdx::simd (Vir-SIMD)" to "vir::stdx::simd (Spack: vir-simd)"
Add a new package for the header-only library `vir::stdx::simd`.
No it won't. But it will get you notified of issues/PRs related to the package. |
|
Oh that's right. Thanks, Adam! |
Just to keep the package rolling. Will add him once he responds.
|
@adamjstewart ready for review/merge :) |
|
Thank you! :) |
|
@ax3l Cool, thanks! You can list me as co-maintainer, even if it's likely that I won't find any time to maintain the Spack package myself. Also, I'm very interested in any feedback you have on the extensions in |
|
Thanks @mattkretz! :) Yes, will do! Thank you for the package and your work on ISO C++! I am completing PRs to AMReX here and to ImpactX here and will post performance numbers in those threads once I finalize them (feel free to subscribe to those if interested). I'll open issues on your GitHub if I have any feedback :) |
## Summary

AMReX does not yet have a concept to help users write effective SIMD code on CPU, besides relying on auto-vectorization and pragmas, which are unreliable for any sufficiently complex code. [1]

Luckily, C++ `std::datapar` was just accepted into C++26, which gives an easy in to write portable SIMD/scalar code. Yet, I did not find a compiler/stdlib with support for it, so I finally had a play with the C++17 `<experimental/simd>` headers, which are not as complete as C++26 but a good in, especially if complemented with the https://github.com/mattkretz/vir-simd library.

This PR adds initial support for portable user-code by providing:
- build system support: `AMReX_SIMD` (default is OFF), relying on [vir-simd](https://github.com/mattkretz/vir-simd)
- an `AMReX_SIMD.H` header that handles includes & helper types
- `ParallelForSIMD<SIMD_WIDTH>(...)`

## Additional background

[1] Fun fact one: As written in the [story behind Intel's ispc compiler](https://pharr.org/matt/blog/2018/04/18/ispc-origins) and credited to [Tim Foley](http://graphics.stanford.edu/~tfoley/), *auto-vectorization is not a programming model.*

Fun fact two: This is as ad-hoc as the implementation for [data parallel types / SIMD in Kokkos](https://kokkos.org/kokkos-core-wiki/API/simd/simd.html), it seems.

## User-Code Examples & Benchmark

[Please see this ImpactX PR for details.](BLAST-ImpactX/impactx#1002)

## Checklist

- [x] clean up commits (separate commits)
- [x] finalize fallbacks & CI checks
- [ ] add a `vir::stdx::simd` test in CI
- [x] CMake
- [ ] GnuMake
- [x] `AMReX_SIMD.H`
- [x] `ParallelForSIMD`
- [x] `ParticleIdWrapper::make_(in)valid(mask)`
- [x] clean up `sincos` support
- [x] `SmallMatrix`
- [x] Support for `GpuComplex` (minimal)
- [x] Support [passing WIDTH as compile-time meta-data](https://godbolt.org/z/7455hqrEc) to callee in `ParallelForSIMD`
- [ ] include documentation in the code and/or rst files, if appropriate
- [x] add `vir::stdx::simd` in package managers:
  - [x] Spack [vir-simd](spack/spack-packages#332)
  - [x] Conda [vir-simd](conda-forge/staged-recipes#30377)

## Future Ideas / PRs

- allocate particle arrays aligned so we can use [stdx::vector_aligned](https://en.cppreference.com/w/cpp/experimental/simd/vector_aligned.html) (for [copies](https://en.cppreference.com/w/cpp/experimental/simd/simd/copy_from) into/out of vector registers - note: makes no difference anymore on modern CPUs)
- Support more/all functions in `ParticleIdWrapper`/`ParticleCpuWrapper`
- Support for [vir::simdize<std::complex<T>>](mattkretz/vir-simd#42) instead of `GpuComplex<SIMD>`
- `ParallelFor` ND support
- `ParallelFor`/`ParallelForSIMD`: one could, maybe, with enable-if magic, etc. fuse them into a single name again
- CMake superbuild: `vir-simd` auto-download for convenience (opt-out)
- Build system: "SIMD provider" selection, once we can opt-in to a C++26 compiler+stdlib instead of C++17 TS2 + vir-simd
- Update AMReX package in package management:
  - Spack [vir-simd](spack/spack-packages#332)
  - Conda [vir-simd](conda-forge/staged-recipes#30377)

---------

Co-authored-by: Alexander Sinn <[email protected]>
Add a new package for the header-only library `vir::stdx::simd` by @mattkretz. I am adding support for SIMD in BLAST WarpX/ImpactX/HiPACE++/... via AMReX and need this library as a dependency to enhance our C++17 implementations (until we can require C++26 `std::datapar`).
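
For readers unfamiliar with the C++17 `<experimental/simd>` header mentioned in this thread, here is a minimal sketch of the portable SIMD/scalar style it enables. It uses only the Parallelism TS 2 header as shipped with libstdc++ (GCC 11 or newer is assumed), not vir-simd or AMReX. The function name `axpy_simd` is hypothetical and is not part of AMReX, ImpactX, or vir-simd, and the sketch does not reflect how `ParallelForSIMD` is implemented.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <vector>

namespace stdx = std::experimental;

// Hypothetical helper for illustration only: computes y[i] = a * x[i] + y[i].
// Assumes x and y have the same length.
inline void axpy_simd(double a, const std::vector<double>& x, std::vector<double>& y)
{
    using simd_t = stdx::native_simd<double>;       // widest SIMD type for the target
    constexpr std::size_t width = simd_t::size();   // number of lanes, known at compile time
    const std::size_t n = x.size();

    const simd_t av(a);  // broadcast the scalar to all lanes

    // Main loop: process `width` elements per iteration.
    std::size_t i = 0;
    for (; i + width <= n; i += width) {
        simd_t xv, yv;
        xv.copy_from(&x[i], stdx::element_aligned);  // load without assuming vector alignment
        yv.copy_from(&y[i], stdx::element_aligned);
        yv = av * xv + yv;
        yv.copy_to(&y[i], stdx::element_aligned);    // store the result back
    }

    // Remainder loop: handle the last n % width elements as scalars.
    for (; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loads and stores use `stdx::element_aligned`, so the arrays need no special allocation. Switching to `stdx::vector_aligned` would require aligned storage, as noted under the future ideas above.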