[BUG] NDK r23b neon intrinsics too slow #1607

@DaydreamCoding

Description

Code using NEON intrinsics runs very slowly when compiled with NDK r23b:

https://github.com/Tencent/ncnn/blob/master/src/mat_pixel_affine.cpp#L1173

warpaffine_bilinear_c4

params:
constexpr int width = 160;
constexpr int height = 160;
constexpr float image_matrix[6] = {
-0.00673565036, 0.146258384, 4.34562492,
-0.146258384, -0.00673565036, 162.753372,
};

With NDK r23b, this function takes 8.40 ms.
With NDK r22b, this function takes 0.302 ms, so the r23b build is about 28 times slower.
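
For anyone trying to reproduce the timings above, a minimal timing harness might look like the sketch below. It is not the code used to obtain the numbers in this report: `average_ms` and `touch_rgba_buffer` are hypothetical names, and the workload is only a stand-in that walks a 160x160 four-channel buffer. In a real reproduction the lambda passed to `average_ms` would call warpaffine_bilinear_c4 with the parameters listed above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ncnn): returns the average wall-clock
// time in milliseconds of one call to `fn`, after `warmup` untimed calls.
template <typename Fn>
double average_ms(Fn&& fn, int warmup, int iterations)
{
    assert(iterations > 0);

    // Untimed warm-up calls so caches and CPU frequency can settle.
    for (int i = 0; i < warmup; ++i) fn();

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) fn();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Stand-in workload for illustration only: increments every byte of the
// buffer and returns the sum of the updated bytes.
// Assumption: a real test would call warpaffine_bilinear_c4 here instead.
inline unsigned long long touch_rgba_buffer(std::vector<unsigned char>& pixels)
{
    unsigned long long sum = 0;
    for (std::size_t i = 0; i < pixels.size(); ++i)
    {
        pixels[i] = static_cast<unsigned char>(pixels[i] + 1);
        sum += pixels[i];
    }
    return sum;
}
```

Building the same harness with both NDK versions, using identical optimization flags and running on the same device, would make the two numbers directly comparable. Averaging over many iterations after a warm-up helps reduce noise from frequency scaling and cold caches.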

Environment Details

  • NDK Version: r23b
  • Build system: CMake
  • Host OS: Mac
  • ABI: arm64-v8a
  • NDK API level: android-21
  • Device API level: android-30
