Rework support for reduction operations on Metal GPU to avoid materialization of the interior#5329

Merged
glwagner merged 24 commits into CliMA:main from alesok:metal_bcs
Mar 19, 2026

@alesok
Contributor

@alesok alesok commented Feb 23, 2026

Unfortunately, setting wind stress and heat flux using FluxBoundaryCondition does not modify values inside the modeling domain when running on the Metal GPU. This behavior seems to be the result of materialization of the interior (function maybe_copy_interior in OceananigansMetalExt.jl), which was introduced in PR #4990 to enable reduction operators on the Metal GPU.

It appears that reduction operators can be enabled without materializing the interior. This PR is an attempt to enable reduction operators on the Metal GPU.
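
A minimal sketch, using plain CPU arrays rather than Metal, of why materializing a copy of a region can hide later writes: a copy is detached from its parent array, while a view aliases it. The names `data`, `inner_view`, and `inner_copy` are illustrative only, and the internals of maybe_copy_interior are not reproduced here.

```julia
# Illustrative only: plain Arrays stand in for GPU arrays.
data = zeros(Float32, 4, 4)

inner_view = view(data, 2:3, 2:3)   # aliases the memory of `data`
inner_copy = copy(inner_view)       # "materialized": a separate array

inner_copy .= 1                     # writes go to the detached copy
@assert all(data .== 0)             # the parent array is unchanged

inner_view .= 1                     # writes go through to the parent
@assert sum(data) == 4
```

Anything written into the copy has to be copied back explicitly to become visible in the original array; a view needs no such step.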

@simone-silvestri
Collaborator

It appears that reduction operators can be enabled without materializing the interior.

How is the reduction handled in this case?

r = Field(loc, c.grid, T; indices=indices(c))
initialize_reduced_field!(Base.$(reduction!), identity, r, conditioned_c)
Base.$(reduction!)(identity, maybe_copy_interior(r), conditioned_c, init=false)
Base.$(reduction!)(identity, interior(r), conditioned_c, init=true)
Collaborator

This changes the behavior for all types of reductions. Is it needed only for Metal?

Contributor Author

Sorry, init should be false, as it was before. We just need to go back to using interior(r) instead of maybe_copy_interior(r).

Collaborator

@simone-silvestri simone-silvestri Feb 24, 2026

OK, I see. What changed so that plain interior now works, when it did not work before #4990?

@alesok
Contributor Author

alesok commented Feb 24, 2026

It appears that reduction operators can be enabled without materializing the interior.

How is the reduction handled in this case?

Actually, Metal seems to support reductions on views and reshaped views:

using Metal

A = Metal.ones(Float32, 4, 4, 4)
V = view(A, 2:3, 2:3, 2:3) # Non-contiguous view
@show typeof(V)
println(minimum(V))
println(sum(V))

println()
R = reshape(V, 2, 2, 2, 1)
@show typeof(R)
println(minimum(R))
println(sum(R))

Gives

typeof(V) = SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}
1.0
8.0

typeof(R) = Base.ReshapedArray{Float32, 4, SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}
1.0
8.0

The problem was related to device(ReshapedArray) and I hope that

device(a::Base.ReshapedArray) = device(parent(a))

in OceananigansMetalExt.jl and

device(a::AbstractArray) = device(architecture(a))

in Architectures.jl solve the issue.
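
The forwarding idea can be sketched without Metal. In this sketch, `storage_root` is a hypothetical stand-in for a device query (it is not an Oceananigans or Metal function): each wrapper type forwards to its parent until a concrete array is reached.

```julia
# Hypothetical helper: walk through wrapper types to the underlying storage.
storage_root(a::Array) = a                                     # concrete storage: stop
storage_root(a::SubArray) = storage_root(parent(a))            # view -> viewed array
storage_root(a::Base.ReshapedArray) = storage_root(parent(a))  # reshape -> wrapped array

A = ones(Float32, 4, 4, 4)
V = view(A, 2:3, 2:3, 2:3)   # non-contiguous view
R = reshape(V, 2, 2, 2, 1)   # lazy reshape wrapping the view

@assert R isa Base.ReshapedArray
@assert storage_root(R) === A
```

Without the `Base.ReshapedArray` method, the call `storage_root(R)` would raise a MethodError, which mirrors the failure mode described for device.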

@glwagner
Member

These are nice and simple. Should the ReshapedArray method belong in Architectures? It doesn't depend on Metal as far as I can tell.

@alesok
Contributor Author

alesok commented Feb 24, 2026

these are nice and simple. Should the ReshapedArray method belong in Architectures? It doesn't depend on Metal as far as I can tell.

I’m not sure. Maybe other GPU backends implement a method like device(a::Base.ReshapedArray)? Actually, we don't need to add

device(a::AbstractArray) = device(architecture(a))

to Architectures.jl. Adding

device(a::Base.ReshapedArray) = device(parent(a))

to OceananigansMetalExt.jl appears to be sufficient.

@simone-silvestri
Collaborator

I agree that it should live in Architectures. It is a general method, so we can also reuse it for other architectures (if we ever need it).

@glwagner
Member

I think what you’re describing is a form of type piracy. Metal does not own ReshapedArray. Therefore its extension should not implement special behavior for this foreign type.

@alesok
Contributor Author

alesok commented Feb 27, 2026

I think what you’re describing is a form of type piracy. Metal does not own ReshapedArray. Therefore its extension should not implement special behavior for this foreign type.

Metal defines several device methods, including:

device()
device(::MtlArray)
device(::SubArray)

and I propose adding:

Metal.device(a::Base.ReshapedArray) = Metal.device(parent(a))

Since device is defined in Metal, extending it for Base.ReshapedArray is not exactly type piracy. Metal owns the function; it is only extending behavior to a Base wrapper type.

I tried moving this to Architectures, but ran into dependency issues. Since device is defined and used within Metal, it seems more appropriate to keep the method there.

@inline convert_to_device(::MetalGPU, args::Tuple) = map(Metal.mtlconvert, args)

device(a::Base.ReshapedArray) = device(parent(a))
device(a::AbstractArray) = device(parent(a)) # it's more general than Base.ReshapedArray
Member

Is parent(a) always defined for every AbstractArray?

Also, can you please move this function to src/Architectures.jl?

Contributor Author

@alesok alesok Feb 27, 2026

Is parent(a) always defined for every AbstractArray?

Probably not. It seems safer to use Base.ReshapedArray than the more general AbstractArray.

Also, can you please move this function to src/Architectures.jl?

I’ve tried, but I haven’t succeeded so far.
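
On whether parent(a) is defined for every AbstractArray: Base provides the fallback `parent(a::AbstractArray) = a`, so the call itself succeeds, but for an array that is not a wrapper it returns the same object. A generic method of the form `f(a::AbstractArray) = f(parent(a))` therefore only terminates if a more specific method exists for the root array type. The sketch below uses a hypothetical helper `unwrap` (not part of any package mentioned here) with an explicit fixed-point check.

```julia
a = [1.0, 2.0, 3.0]
@assert parent(a) === a          # Base fallback: a plain Array is its own parent

# Hypothetical helper: unwrap until parent returns the same object.
function unwrap(x::AbstractArray)
    p = parent(x)
    return p === x ? x : unwrap(p)
end

M = ones(4, 4)
W = reshape(view(M, 2:3, 2:3), 4)   # ReshapedArray wrapping a SubArray
@assert unwrap(W) === M
```

Restricting the method to `Base.ReshapedArray`, as suggested above, avoids relying on such a termination condition.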

Member

What's the issue putting this in src/Architectures.jl?

Contributor Author

I am puzzled about placing device(a::Base.ReshapedArray) in Architectures. What we would need in Architectures is

Architectures.device(a::Base.ReshapedArray) = Metal.device(parent(a))

and then import Architectures: device in OceananigansMetalExt. However, we do not want Architectures to depend on Metal by importing Metal: device, correct? Also, Metal calls device internally.

Member

@glwagner glwagner Mar 1, 2026

If indeed you are trying to extend Metal.device for ReshapedArray, then this must be contributed to Metal.jl unfortunately... (I assumed you were trying to extend Architectures.device --- note there is also a KernelAbstractions.device, so the situation is fairly bewildering)

Consider that someone could be using Metal and expect a certain behavior with ReshapedArray. Then somewhere in their code there is a using Oceananigans, which immediately changes how device(::ReshapedArray) works for no apparent reason (this is the outcome of type piracy).

But actually, is device owned by KernelAbstractions or Metal? cc @vchuravy

I think we have noted somewhere else that it is unfortunate that KA defines device as an "id" whereas Architectures defines device to mean the "backend". We should probably fix that in Oceananigans so we are consistent with KA.
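
The ownership distinction in this thread can be sketched with toy modules. `Backend` and `Arch` below are hypothetical stand-ins (loosely, for a package like Metal and for a package that owns its own device function); they are not the real packages and do not reproduce their APIs.

```julia
module Backend   # hypothetical: owns `Backend.device`
device(a::Array) = :backend_device
end

module Arch      # hypothetical: owns a separate `Arch.device`
import ..Backend
device(a::Array) = Backend.device(a)
# Not piracy: `Arch` owns this function, even though the argument
# types (SubArray, ReshapedArray) belong to Base.
device(a::SubArray) = device(parent(a))
device(a::Base.ReshapedArray) = device(parent(a))
end

# Written in a third package, the following WOULD be type piracy:
# that package owns neither `Backend.device` nor `Base.ReshapedArray`.
# Backend.device(a::Base.ReshapedArray) = Backend.device(parent(a))

R = reshape(view(ones(Float32, 4, 4, 4), 2:3, 2:3, 2:3), 2, 2, 2, 1)
@assert Arch.device(R) === :backend_device
@assert !hasmethod(Backend.device, Tuple{typeof(R)})
```

In this sketch, loading `Arch` leaves the behavior of `Backend.device` untouched for every other user of `Backend`, which is the property that piracy breaks.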

Contributor Author

Here is the output with the device method commented out in OceananigansMetalExt:

# Metal.device(a::Base.ReshapedArray) = Metal.device(parent(a))

Sorry, it is too long

% julia --project=. --check-bounds=yes test/test_metal.jl   
Precompiling OceananigansCUDAExt finished.
  1 dependency successfully precompiled in 4 seconds
[2026/03/01 09:23:26.025] INFO  Initializing simulation...
[2026/03/01 09:23:27.465] INFO      ... simulation initialization complete (1.389 seconds)
[2026/03/01 09:23:27.470] INFO  Executing initial time step...
[2026/03/01 09:23:32.542] INFO      ... initial time step complete (5.073 seconds).
[2026/03/01 09:23:32.580] INFO  Simulation is stopping after running for 6.813 seconds.
[2026/03/01 09:23:32.580] INFO  Model iteration 3 equals or exceeds stop iteration 3.
Test Summary:      | Pass  Total   Time
MetalGPU extension |   10     10  47.1s
[2026/03/01 09:24:05.991] INFO  Initializing simulation...
[2026/03/01 09:24:06.179] INFO      ... simulation initialization complete (188.274 ms)
[2026/03/01 09:24:06.179] INFO  Executing initial time step...
[2026/03/01 09:24:15.132] INFO      ... initial time step complete (8.952 seconds).
[2026/03/01 09:24:15.204] INFO  Simulation is stopping after running for 9.208 seconds.
[2026/03/01 09:24:15.204] INFO  Model iteration 10 equals or exceeds stop iteration 10.
Test Summary:                  | Pass  Total   Time
MetalGPU: ImmersedBoundaryGrid |   11     11  44.7s
MetalGPU: test for reductions: Error During Test at /Users/sokolov/models/Oceananigans.jl/test/test_metal.jl:112
  Test threw exception
  Expression: minimum(c) == 1
  MethodError: no method matching device(::Base.ReshapedArray{Float32, 4, SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
  The function `device` exists, but no method is defined for this combination of argument types.
  
  Closest candidates are:
    device()
     @ Metal ~/.julia/packages/Metal/EFzAO/src/state.jl:16
    device(::MtlArray)
     @ Metal ~/.julia/packages/Metal/EFzAO/src/array.jl:125
    device(::SubArray)
     @ Metal ~/.julia/packages/Metal/EFzAO/src/array.jl:552
    ...
  
  Stacktrace:
    [1] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::MtlArray{Float32, 4, Metal.PrivateStorage}; init::Nothing)
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:197
    [2] mapreducedim!
      @ ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:171 [inlined]
    [3] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, 
south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}; init::Nothing)
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:291
    [4] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, 
south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing})
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:171
    [5] mapreducedim!(f::Function, op::Function, R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, 
south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing})
      @ GPUArrays ~/.julia/packages/GPUArrays/3a5jB/src/host/mapreduce.jl:10
    [6] #minimum!#771
      @ ./reducedim.jl:1003 [inlined]
    [7] minimum(f::Function, c::Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, 
west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}; condition::Nothing, mask::Float64, dims::Function)
      @ Oceananigans.Fields ~/models/Oceananigans.jl/src/Fields/field.jl:782
    [8] minimum
      @ ~/models/Oceananigans.jl/src/Fields/field.jl:770 [inlined]
    [9] minimum(c::Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, 
west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing})
      @ Oceananigans.Fields ~/models/Oceananigans.jl/src/Fields/field.jl:791
   [10] top-level scope
      @ ~/models/Oceananigans.jl/test/test_metal.jl:106
   [11] macro expansion
      @ ~/.julia/juliaup/julia-1.12.5+0.aarch64.apple.darwin14/Julia-1.12.app/Contents/Resources/julia/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
   [12] macro expansion
      @ ~/models/Oceananigans.jl/test/test_metal.jl:112 [inlined]
   [13] macro expansion
      @ ~/.julia/juliaup/julia-1.12.5+0.aarch64.apple.darwin14/Julia-1.12.app/Contents/Resources/julia/share/julia/stdlib/v1.12/Test/src/Test.jl:677 [inlined]
MetalGPU: test for reductions: Error During Test at /Users/sokolov/models/Oceananigans.jl/test/test_metal.jl:117
  Test threw exception
  Expression: minimum(add_2_kfo) == 3
  MethodError: no method matching device(::Base.ReshapedArray{Float32, 4, SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
  The function `device` exists, but no method is defined for this combination of argument types.
  
  Closest candidates are:
    device()
     @ Metal ~/.julia/packages/Metal/EFzAO/src/state.jl:16
    device(::MtlArray)
     @ Metal ~/.julia/packages/Metal/EFzAO/src/array.jl:125
    device(::SubArray)
     @ Metal ~/.julia/packages/Metal/EFzAO/src/array.jl:552
    ...
  
  Stacktrace:
    [1] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::MtlArray{Float32, 4, Metal.PrivateStorage}; init::Nothing)
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:197
    [2] mapreducedim!
      @ ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:171 [inlined]
    [3] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::KernelFunctionOperation{Center, Center, Center, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Float32, var"#add_2#add_2##0", Tuple{Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, 
KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}}}; init::Nothing)
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:291
    [4] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::KernelFunctionOperation{Center, Center, Center, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Float32, var"#add_2#add_2##0", Tuple{Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, 
KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}}})
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:171
    [5] mapreducedim!(f::Function, op::Function, R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::KernelFunctionOperation{Center, Center, Center, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Float32, var"#add_2#add_2##0", Tuple{Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, 
KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}}})
      @ GPUArrays ~/.julia/packages/GPUArrays/3a5jB/src/host/mapreduce.jl:10
    [6] #minimum!#771
      @ ./reducedim.jl:1003 [inlined]
    [7] minimum(f::Function, c::KernelFunctionOperation{Center, Center, Center, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Float32, var"#add_2#add_2##0", Tuple{Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, 
west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}}}; condition::Nothing, mask::Float64, dims::Function)
      @ Oceananigans.Fields ~/models/Oceananigans.jl/src/Fields/field.jl:782
    [8] minimum
      @ ~/models/Oceananigans.jl/src/Fields/field.jl:770 [inlined]
    [9] minimum(c::KernelFunctionOperation{Center, Center, Center, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Float32, var"#add_2#add_2##0", Tuple{Field{Center, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, Float32, Float32}, Float32, Float32, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, OffsetVector{Float32, StepRangeLen{Float32, Float32, Float32, Int64}}, GPU{MetalBackend}}, Tuple{Colon, Colon, Colon}, OffsetArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, NoFluxBoundaryCondition, NoFluxBoundaryCondition, Nothing, @NamedTuple{bottom_and_top::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:32, 1:32)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_bottom_and_top_halo!)}, south_and_north::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_south_and_north_halo!)}, 
west_and_east::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.StaticSize{(16, 16)}, Oceananigans.Utils.OffsetStaticSize{(1:38, 1:38)}, typeof(Oceananigans.BoundaryConditions.gpu__fill_periodic_west_and_east_halo!)}}, @NamedTuple{bottom_and_top::Tuple{NoFluxBoundaryCondition, NoFluxBoundaryCondition}, south_and_north::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}, west_and_east::Tuple{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}}}}, Nothing, Nothing}}})
      @ Oceananigans.Fields ~/models/Oceananigans.jl/src/Fields/field.jl:791
   [10] top-level scope
      @ ~/models/Oceananigans.jl/test/test_metal.jl:106
   [11] macro expansion
      @ ~/.julia/juliaup/julia-1.12.5+0.aarch64.apple.darwin14/Julia-1.12.app/Contents/Resources/julia/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
   [12] macro expansion
      @ ~/models/Oceananigans.jl/test/test_metal.jl:117 [inlined]
   [13] macro expansion
      @ ~/.julia/juliaup/julia-1.12.5+0.aarch64.apple.darwin14/Julia-1.12.app/Contents/Resources/julia/share/julia/stdlib/v1.12/Test/src/Test.jl:677 [inlined]
Test Summary:                 | Error  Total  Time
MetalGPU: test for reductions |     2      2  6.2s
RNG of the outermost testset: Xoshiro(0x9363a019d9147530, 0xaeaa4610686054c9, 0x2e20d6e039929cf9, 0x46360c98ccc16a67, 0x7180fcd9cb5c6d39)
ERROR: LoadError: Some tests did not pass: 0 passed, 0 failed, 2 errored, 0 broken.
in expression starting at /Users/sokolov/models/Oceananigans.jl/test/test_metal.jl:105

Member

Indeed I think that this deserves a PR to Metal.jl. The relevant part of the stack trace is:

  MethodError: no method matching device(::Base.ReshapedArray{Float32, 4, SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
  The function `device` exists, but no method is defined for this combination of argument types.
  
  Closest candidates are:
    device()
     @ Metal ~/.julia/packages/Metal/EFzAO/src/state.jl:16
    device(::MtlArray)
     @ Metal ~/.julia/packages/Metal/EFzAO/src/array.jl:125
    device(::SubArray)
     @ Metal ~/.julia/packages/Metal/EFzAO/src/array.jl:552
    ...
  
  Stacktrace:
    [1] mapreducedim!(f::typeof(identity), op::typeof(min), R::SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, A::MtlArray{Float32, 4, Metal.PrivateStorage}; init::Nothing)
      @ Metal ~/.julia/packages/Metal/EFzAO/src/mapreduce.jl:197

Note how the arguments to mapreducedim! are completely independent of Oceananigans. So, it should be possible to create an example that triggers this error which does not use Oceananigans.

Very nice work to uncover this!

Contributor Author

@alesok alesok Mar 2, 2026

I agree that extending Metal.device should live in Metal. In the meantime, we need to get rid of the materialization (function maybe_copy_interior). So I added the following in OceananigansMetalExt:

# This method extension for Metal.device(::ReshapedArray) should live in Metal.jl,
# specifically in array.jl alongside:
# device(a::SubArray) = device(parent(a))
Metal.device(a::Base.ReshapedArray) = Metal.device(parent(a))

Given this, should we merge this into main now, or wait until Metal.jl incorporates the change (if they accept the proposal)?
I am afraid it will take some time.
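
To illustrate why forwarding to the parent is enough, here is a minimal CPU-only sketch. It assumes a hypothetical function `device_of` standing in for `Metal.device`, and plain `Array`s standing in for `MtlArray`s; it does not reproduce the Metal error itself, only the dispatch pattern.

```julia
# `device_of` is a hypothetical stand-in for Metal.device (not a real API).
# Plain Arrays play the role of MtlArrays so this runs without a GPU.
device_of(a::Array) = :cpu
device_of(a::SubArray) = device_of(parent(a))

A = ones(Float32, 4, 4, 4)
V = view(A, 2:3, 2:3, 2:3)   # non-contiguous view of the parent array
R = reshape(V, 2, 2, 2, 1)   # reshaping a SubArray yields a lazy wrapper

@assert R isa Base.ReshapedArray

# No method exists yet for the reshaped wrapper, so a call would throw a MethodError.
@assert !hasmethod(device_of, Tuple{typeof(R)})

# Forward to the parent, mirroring the SubArray method above.
device_of(a::Base.ReshapedArray) = device_of(parent(a))

# The lookup now unwraps ReshapedArray -> SubArray -> Array.
@assert device_of(R) === :cpu
```

Each wrapper method peels off one layer, so any nesting of views and reshapes resolves to the device of the underlying array.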

Collaborator

But without the materialization main will not work at the moment right? So maybe we can wait for the metal PR then we merge this one?

Contributor Author

But without the materialization main will not work at the moment right?

Unfortunately, current main with materialization produces wrong results: surface heat flux and wind stress do not affect values inside the domain.

So maybe we can wait for the metal PR then we merge this one?

It would be nice to fix the bug and remove the materialization now. If we do not want to (temporarily) extend Metal.device(::Base.ReshapedArray) ourselves in OceananigansMetalExt and would rather wait for a Metal.jl PR, we can comment out three tests in test_metal.jl:

#=
@testset "MetalGPU: test for reductions" begin
    arch = GPU(Metal.MetalBackend())
    grid = RectilinearGrid(arch, size=(32, 32, 32), extent=(1, 1, 1))

    # Test reduction of Field
    c = CenterField(grid)
    set!(c, 1)
    @test minimum(c) == 1

    # Test reduction of KernelFunctionOperation
    add_2(i, j, k, grid, c) = @inbounds c[i, j, k] + 2
    add_2_kfo = KernelFunctionOperation{Center,Center,Center}(add_2, grid, c)
    @test minimum(add_2_kfo) == 3
end

@testset "MetalGPU: TimeStepWizard" begin
    arch = GPU(Metal.MetalBackend())

    grid = RectilinearGrid(arch; size=(64, 64, 16), x=(0, 5000), y=(0, 5000), z=(-20, 0))
    @test eltype(grid) == Float32

    model = HydrostaticFreeSurfaceModel(grid; momentum_advection=WENO(), tracer_advection=WENO(),
                                        free_surface=SplitExplicitFreeSurface(grid; substeps=30))

    sim = Simulation(model, Δt=5, stop_iteration=20)

    wizard = TimeStepWizard(cfl=0.7, min_Δt=1, max_Δt=15)
    sim.callbacks[:wizard] = Callback(wizard, IterationInterval(5))

    run!(sim)
    @test time(sim) > 100seconds
end

@testset "MetalGPU: Nonhydrostatic model" begin
    arch = GPU(Metal.MetalBackend());
    grid = RectilinearGrid(arch; size=(32, 32, 8), x=(0, 5000), y=(0, 5000), z=(-20, 0))
    @test eltype(grid) == Float32

    model = NonhydrostaticModel(grid)
    sim = Simulation(model, Δt=5, stop_iteration=2)
    run!(sim)
    @test time(sim) == 10seconds
end
=#


# This method extension for Metal.device(::ReshapedArray) should live in Metal.jl,
# specifically in array.jl alongside:
# device(a::SubArray) = device(parent(a))

Member

open a PR there?

Contributor Author

I think we first need to create an MWE that shows the error without using Oceananigans. Interestingly,

using Metal
A = Metal.ones(Float32, 4, 4, 4)
V = view(A, 2:3, 2:3, 2:3) # Non-contiguous view
R = reshape(V, 2, 2, 2, 1)
@show typeof(R)
println(minimum(R))

works nicely

typeof(R) = Base.ReshapedArray{Float32, 4, SubArray{Float32, 3, MtlArray{Float32, 3, Metal.PrivateStorage}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}
1.0

Contributor Author

I've opened an issue about the missing device(a::Base.ReshapedArray) method in JuliaGPU/Metal.jl#756. Let's see what happens.

Contributor Author

The PR "Fix mapreduce device check" (#757) fixes the issue. We will wait for the next Metal.jl release!

I backported the fix and it should be in 1.9.3

Contributor Author

Thank you, @christiangnrd. Metal tests passed successfully.

codecov bot commented Mar 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.48%. Comparing base (f84994e) to head (88cf026).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5329      +/-   ##
==========================================
- Coverage   73.49%   73.48%   -0.01%     
==========================================
  Files         398      398              
  Lines       22678    22671       -7     
==========================================
- Hits        16667    16660       -7     
  Misses       6011     6011              
Flag Coverage Δ
buildkite 68.83% <100.00%> (+0.01%) ⬆️
julia 68.83% <100.00%> (+0.01%) ⬆️

Member

glwagner commented Mar 19, 2026

@alesok is this ready to merge? I ask because main has been merged many times now. It's best to merge only when necessary to save resources (the CI is pretty heavy). It's been approved so it is good to go from our side. Let us know!

Contributor Author

alesok commented Mar 19, 2026

@alesok is this ready to merge? I ask because main has been merged many times now. It's best to merge only when necessary to save resources (the CI is pretty heavy). It's been approved so it is good to go from our side. Let us know!

Yes, I think it would be good to merge this PR. I have nothing to add now.

@glwagner glwagner merged commit 96f04c9 into CliMA:main Mar 19, 2026
66 of 67 checks passed
@alesok alesok deleted the metal_bcs branch March 19, 2026 23:33
briochemc added a commit to briochemc/Oceananigans.jl that referenced this pull request Mar 23, 2026
…ine-ACCESS-OM2

* bp-claude/distributed-FPivot-TripolarGrid:
  Replace reverse() with reversed-range views in fold halo fills
  Reinstate Docs/Benchmarks (CliMA#5419)
  Update fill_halo_regions.jl (CliMA#5415)
  Temporarily drop Benchmark section from Docs + delete `legacy_benchmarks` (CliMA#5412)
  Add restart verification script (CliMA#5379)
  Rework support for reduction operations on Metal GPU to avoid materialization of the interior (CliMA#5329)
  Fix typo with density perturbation in docs (CliMA#5398)
  Implement ReactantCore.materialize_traced_array for Field (CliMA#5409)
  Remove Oceananigans dependency from Project.toml (CliMA#5414)
  (0.106) Log checkpoint file and mtime when restoring simulations (CliMA#5355)