SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit by seiko2plus · Pull Request #17340 · numpy/numpy

seiko2plus · 2020-09-17T10:28:23Z

This patch implements NPYV intrinsics for partial and non-contiguous memory access,
which paves the way to replace the raw SIMD kernels in simd.inc.src with the universal intrinsics.

required by #16247
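
For readers unfamiliar with the two access patterns named above, here is a rough scalar model in plain C. It is only a sketch: the `sketch_*` names and the fixed `SKETCH_NLANES` lane count are hypothetical and are not the NPYV API. The actual intrinsics in this patch operate on platform vector types, and their signatures and edge-case behavior may differ.

```c
#include <stddef.h>

/* Hypothetical lane count, for illustration only; a real vector width
 * depends on the SIMD extension and the element size. */
#define SKETCH_NLANES 4

/* Partial load: read only the first `nlane` elements from `ptr` and
 * fill the remaining lanes with `fill`, so no memory past the tail is read. */
static void sketch_load_till_f32(float out[SKETCH_NLANES], const float *ptr,
                                 size_t nlane, float fill)
{
    for (size_t i = 0; i < SKETCH_NLANES; ++i) {
        out[i] = (i < nlane) ? ptr[i] : fill;
    }
}

/* Partial store: write only the first `nlane` lanes to `ptr`. */
static void sketch_store_till_f32(float *ptr, size_t nlane,
                                  const float in[SKETCH_NLANES])
{
    for (size_t i = 0; i < SKETCH_NLANES && i < nlane; ++i) {
        ptr[i] = in[i];
    }
}

/* Non-contiguous load: gather lanes that are `stride` elements apart. */
static void sketch_loadn_f32(float out[SKETCH_NLANES], const float *ptr,
                             ptrdiff_t stride)
{
    for (size_t i = 0; i < SKETCH_NLANES; ++i) {
        out[i] = ptr[(ptrdiff_t)i * stride];
    }
}

/* Non-contiguous store: scatter lanes to positions `stride` elements apart. */
static void sketch_storen_f32(float *ptr, ptrdiff_t stride,
                              const float in[SKETCH_NLANES])
{
    for (size_t i = 0; i < SKETCH_NLANES; ++i) {
        ptr[(ptrdiff_t)i * stride] = in[i];
    }
}
```

In a vectorized loop, the partial forms would typically handle the last iteration when fewer than a full vector of elements remain, and the strided forms would handle arrays whose elements are not adjacent in memory.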

numpy/core/src/common/simd/avx2/memory.h

mattip · 2020-10-07T08:43:48Z

This seems to build on PR gh-16782, correct?

seiko2plus · 2020-10-07T09:07:19Z

@mattip, yes, this pull request temporarily merges #16782 so that I can test the new intrinsics.

seiko2plus · 2020-10-09T00:00:35Z

All tests pass. I will move the unit tests for the new intrinsics to #16782 so we can merge this PR.
https://travis-ci.org/github/numpy/numpy/builds/733983312
https://github.com/numpy/numpy/pull/17340/checks?check_run_id=1226297258

numpy/core/src/common/simd/avx2/memory.h

numpy/core/src/common/simd/avx512/memory.h


seiko2plus · 2020-10-21T18:34:24Z

@mattip, these intrinsics are already used by #17587 and #16247. With AVX2 and AVX512F they perform almost as well as the raw SIMD code they replace.
With SSE and VSX they provide massive improvements for non-contiguous memory access. NEON/ASIMD, on the other hand, shows acceptable but less dramatic improvements.

I hope we can merge this pull-request as soon as possible.

charris · 2020-10-25T16:58:52Z

@seiko2plus I notice that you are still making commits here. Do you feel that there is more to do?

mattip · 2020-10-25T17:09:39Z

I was hoping to merge #16782 first, thinking that then we might be able to add some (maybe marked @slow) tests using that infrastructure here. Does that make sense?

seiko2plus · 2020-10-25T17:42:51Z

@charris, no, the last change I made to this PR was 17 days ago:

 seiko2plus force-pushed the seiko2plus:npyv_partial_noncont_mem branch from bec733b to 1b8637d 17 days ago

The other messages appear because #16247 and #17587 are built on top of this PR (they reference its commit).

@mattip,

> I was hoping to merge #16782 first,

I totally agree with you; without test cases it would be chaos.

> thinking that then we might be able to add some (maybe marked @slow) tests using that infrastructure here

There's no need for @slow: the tests in #16782 run quickly, currently taking 1 to 5 seconds depending on
the enabled SIMD extensions. The only issues are the binary size and perhaps the build time.

charris · 2020-10-25T20:48:04Z

Thanks Sayed.

seiko2plus marked this pull request as draft September 17, 2020 10:29

seiko2plus force-pushed the npyv_partial_noncont_mem branch 11 times, most recently from ed975c8 to b699b95 on September 25, 2020 00:47

mattip added the component: SIMD label Sep 25, 2020

Qiyu8 reviewed Sep 28, 2020

numpy/core/src/common/simd/avx2/memory.h (outdated)

seiko2plus force-pushed the npyv_partial_noncont_mem branch from b699b95 to b7761ba on October 7, 2020 08:32

github-actions bot added the 25 - WIP label Oct 7, 2020

seiko2plus force-pushed the npyv_partial_noncont_mem branch 4 times, most recently from 19fd9fd to e7e4699 on October 8, 2020 13:44

seiko2plus mentioned this pull request Oct 8, 2020

ENH:Umath Replace raw SIMD of unary float point(32-64) with NPYV - g0 #16247

Merged

11 tasks

seiko2plus force-pushed the npyv_partial_noncont_mem branch from 0744337 to 24b5841 on October 9, 2020 00:03

seiko2plus marked this pull request as ready for review October 9, 2020 00:03

seiko2plus changed the title ~~WIP:SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit~~ SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit Oct 9, 2020

seiko2plus commented Oct 9, 2020

numpy/core/src/common/simd/avx2/memory.h (outdated)

ENH, SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit (1b8637d)

This patch improves the implementation of memory load/store for VSX

seiko2plus force-pushed the npyv_partial_noncont_mem branch from bec733b to 1b8637d on October 9, 2020 00:18

seiko2plus mentioned this pull request Oct 9, 2020

ENH, TST: Bring the NumPy C SIMD vectorization interface "NPYV" to Python #16782

Merged

7 tasks

seiko2plus mentioned this pull request Oct 19, 2020

SIMD: Replace raw SIMD of sin/cos with NPYV(universal intrinsics) #17587

Merged

5 tasks

charris added 01 - Enhancement and removed 25 - WIP labels Oct 25, 2020

charris merged commit fcba5a6 into numpy:master Oct 25, 2020

Qiyu8 mentioned this pull request Nov 11, 2020

Optimize the performance of rot by using universal intrinsics OpenMathLib/OpenBLAS#2983

Merged

seiko2plus deleted the npyv_partial_noncont_mem branch January 9, 2021 17:01
