Closed
Description
Behavior
- ✅ Works: filename = "output" → creates output_rank0.jld2, output_rank1.jld2, etc.
- ❌ Fails: filename = "output.jld2" → hangs (all ranks attempt to write to the same file)
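Until this is fixed, one possible workaround is to strip the extension before handing the name to the writer. The sketch below is plain Julia and only illustrates the naming pattern reported above; `strip_jld2` and `rank_filename` are hypothetical helpers, not Oceananigans functions, and the way the library builds per-rank names internally is an assumption here.

```julia
# Hypothetical helper (not part of Oceananigans): drop a trailing ".jld2"
# so the writer receives an extension-free prefix such as "output".
strip_jld2(filename) = endswith(filename, ".jld2") ? first(splitext(filename)) : filename

# Illustrative reconstruction of the per-rank file names described in this report.
# Assumption: names follow the pattern <prefix>_rank<N>.jld2.
rank_filename(filename, rank) = string(strip_jld2(filename), "_rank", rank, ".jld2")

# Print the names that a 4-rank run would be expected to produce.
for rank in 0:3
    println(rank_filename("output.jld2", rank))
end
```

Passing `filename = strip_jld2("output.jld2")` to the writer would then be equivalent to the `filename = "output"` form that is reported to work.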
Reproducer
using MPI, Oceananigans
MPI.Init()
arch = Distributed(GPU(); partition = Partition(2, 2))
grid = RectilinearGrid(arch, size=(64, 64, 16), extent=(1, 1, 1))
model = NonhydrostaticModel(; grid)
simulation = Simulation(model, Δt=10, stop_iteration=100)
# This hangs:
simulation.output_writers[:out] = JLD2Writer(model, model.tracers,
                                             filename = "output.jld2",
                                             schedule = IterationInterval(10))
run!(simulation)

Test Results
Tested with filename = "output" (without extension):
- Grid: 1440×720×30 (0.25° Arctic Ocean)
- GPUs: 4× RTX 4090
- Partition: Partition(2, 2)
- Output interval: 6 hours
Results:
- ✅ All 4 ranks completed successfully
- ✅ Created output_rank0.jld2 through output_rank3.jld2
- ✅ Good I/O performance (~10 ms write time after initialization)
Originally reported in ClimaOcean.jl: CliMA/ClimaOcean.jl#746
Credit to @glwagner for diagnosing the bug.