This repository was archived by the owner on Dec 22, 2021. It is now read-only.
On MIPS (with the MSA extension) there is the FRINT.df instruction (df is the element format, 32 or 64 bits). It rounds according to one of four rounding modes, selected in the MSACSR register: round to nearest, round toward zero (trunc), round toward +inf (ceil), and round toward -inf (floor).
ARM has vector convert float to integer, round towards zero (truncate). That's the most natural operation to implement since it matches C behavior.
I believe there are also instructions that use the rounding mode in the FP control register (FPSCR), but writing this register stalls FP execution until all in-flight instructions complete, so it can be slow. I believe most ISAs work like this.
I'd be concerned that exposing other rounding modes would be slow. There are clever tricks to simulate the other rounding modes efficiently.
The proposal in #1 doesn't include vectorized versions of the floating-point round-to-integer instructions:
f32x4.ceil(x: v128) -> v128
f32x4.floor(x: v128) -> v128
f32x4.trunc(x: v128) -> v128
f32x4.nearest(x: v128) -> v128
f64x2.ceil(x: v128) -> v128
f64x2.floor(x: v128) -> v128
f64x2.trunc(x: v128) -> v128
f64x2.nearest(x: v128) -> v128
It seems these instructions would be just as useful in vectorized floating-point code as they are in scalar code.
Are these instructions widely available in SIMD instruction sets we care about?