pub fn copy_u32(input: &[u8; 4], output: &mut [u8; 4]) {
    let a = &input[..4];
    let b = &mut output[..4];
    for b in b.chunks_exact_mut(4) {
        b.copy_from_slice(a);
    }
}
Compiled with -C opt-level=s -C inline-threshold=300, this produces the following (the same output as with just -C opt-level=s):
example::copy_u32:
        mov     rax, rsi
        mov     rdx, rdi
        mov     esi, 4
        mov     ecx, 4
        mov     rdi, rax
        jmp     qword ptr [rip + core::slice::<impl [T]>::copy_from_slice@GOTPCREL]
Compiled with -C opt-level=s -C llvm-args=-inline-threshold=300, it produces:
example::copy_u32:
        mov     eax, dword ptr [rdi]
        mov     dword ptr [rsi], eax
        ret
(Godbolt link)
The cause appears to be that, for functions with the optsize attribute, LLVM keeps using a separate inlining threshold rather than the "primary" one supplied by rustc, but only when LLVM did not receive -inline-threshold itself. As a result, -C inline-threshold has no visible effect at opt-level=s, while passing the same value through -C llvm-args does.
Fixing this seems to require either converting the argument rustc received into an LLVM command-line option, or constructing a custom llvm::InlineParams structure instead of using one of the LLVM-provided helper functions?
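To make the suspected behaviour concrete, here is a small Rust model of the threshold selection described above. It is only a sketch of my reading of the problem, not LLVM's actual code: the names `InlineParams`, `get_inline_params` and `effective_threshold` are hypothetical stand-ins, and the optsize cap of 50 is an assumed value used for illustration.

```rust
/// Hypothetical, simplified stand-in for llvm::InlineParams.
#[derive(Debug, Clone, Copy)]
struct InlineParams {
    /// The "primary" threshold.
    default_threshold: i32,
    /// Separate cap applied to optsize functions, if one is set.
    opt_size_threshold: Option<i32>,
}

/// Assumed value of the optsize cap; illustrative only.
const OPT_SIZE_THRESHOLD: i32 = 50;

/// Models the helper that builds the parameters.
/// `rustc_threshold` is the value passed via -C inline-threshold;
/// `llvm_flag` is Some(..) only if LLVM itself saw -inline-threshold.
fn get_inline_params(rustc_threshold: i32, llvm_flag: Option<i32>) -> InlineParams {
    match llvm_flag {
        // The LLVM flag overrides everything, including the optsize cap.
        Some(t) => InlineParams {
            default_threshold: t,
            opt_size_threshold: None,
        },
        // Without the flag, the optsize cap stays in place.
        None => InlineParams {
            default_threshold: rustc_threshold,
            opt_size_threshold: Some(OPT_SIZE_THRESHOLD),
        },
    }
}

/// Threshold actually used when inlining into a given caller.
fn effective_threshold(params: InlineParams, caller_has_optsize: bool) -> i32 {
    match (caller_has_optsize, params.opt_size_threshold) {
        (true, Some(cap)) => params.default_threshold.min(cap),
        _ => params.default_threshold,
    }
}

fn main() {
    // -C opt-level=s -C inline-threshold=300
    let via_rustc = effective_threshold(get_inline_params(300, None), true);
    // -C opt-level=s -C llvm-args=-inline-threshold=300
    let via_llvm = effective_threshold(get_inline_params(300, Some(300)), true);
    println!("via rustc flag: {via_rustc}, via llvm-args: {via_llvm}");
}
```

In this model the rustc-supplied value is silently capped for optsize functions, which would explain why copy_from_slice is not inlined in the first case. Either fix suggested above amounts to making the first call behave like the second.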