Currently, unsafe_load(::Ptr{T}), where T is an immutable struct that is not isbits, heap allocates the new T() instance.
However, there's no need to do that, since immutable structs can be stack allocated, even if they are not isbits, as of a few years ago.
Consider this small MRE:
julia> struct X  # isbits
           x::Int
       end
julia> struct Y  # immutable, but not isbits
           y::Vector{Int}
       end
julia> function caller(v::V) where V
           r = Ref(v)
           GC.@preserve r begin
               p = reinterpret(Ptr{V}, pointer_from_objref(r))
               callee(p)
           end
       end
caller (generic function with 1 method)
julia> function callee(p::Ptr)
           v = unsafe_load(p)
           foo(v)
       end
callee (generic function with 1 method)
julia> foo(x::X) = x.x
foo (generic function with 2 methods)
julia> foo(y::Y) = sum(y.y, init=0)
foo (generic function with 2 methods)
julia> const x1 = X(1)
X(1)
julia> const y1 = Y([1,2])
Y([1, 2])
julia> caller(x1)
1
julia> caller(y1)
3
julia> @time caller(x1)
0.000001 seconds
1
julia> @time caller(y1) # One alloc from the Ref(), one alloc from boxing the Y() instance.
0.000002 seconds (2 allocations: 32 bytes)
3
I'm not sure why it can only elide the Ref() allocation for ::X, but not for ::Y... Is that a similar optimization opportunity that's currently being missed?
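One way to tell the two allocations apart is to create the Ref outside the measured call, so that whatever remains is attributable to the load itself. This is only a minimal sketch, assuming the same Y as in the MRE; load_only and measure_load are hypothetical helper names, and the reported byte count may differ between Julia versions:

```julia
struct Y  # immutable, but not isbits (same definition as in the MRE)
    y::Vector{Int}
end

# Hypothetical helper: performs only the unsafe_load. The Ref is created by
# the caller, so its allocation is not part of this function.
function load_only(r::Base.RefValue{Y})
    GC.@preserve r begin
        p = reinterpret(Ptr{Y}, pointer_from_objref(r))
        v = unsafe_load(p)
        length(v.y)
    end
end

# Hypothetical helper: wrap the call in a function so that @allocated
# measures compiled code rather than global-scope evaluation.
measure_load(r) = @allocated load_only(r)

r = Ref(Y([1, 2]))
measure_load(r)  # warm up so compilation is not counted
println("bytes allocated by the load alone: ", measure_load(r))
```

If this still reports a nonzero number, the allocation comes from unsafe_load rather than from Ref().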
I'm not 100% sure, but I'm pretty sure this can be fixed by just changing this line (julia/src/intrinsics.cpp, line 698 in 95749c3):
    else if (!jl_isbits(ety)) {
to:
    else if (!jl_is_immutable(ety)) {
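The two checks in the proposed change differ exactly for types like Y. As a rough Julia-level analogy (not the C code path itself), isbitstype and ismutabletype show the distinction. This sketch assumes Julia 1.7 or later for ismutabletype, and Z is a hypothetical mutable type added only for contrast:

```julia
struct X  # isbits
    x::Int
end

struct Y  # immutable, but not isbits
    y::Vector{Int}
end

mutable struct Z  # hypothetical mutable type, for contrast only
    z::Int
end

# Print both properties for each type. Y is the case where they disagree:
# it is immutable, yet not isbits because it holds a reference to a Vector.
for T in (X, Y, Z)
    println(T, ": isbitstype = ", isbitstype(T),
            ", immutable = ", !ismutabletype(T))
end
```

Only types for which the second property is true but the first is false would be affected by switching the condition.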