Remove initialization_update_state! and clean up initialization by glwagner · Pull Request #5420 · CliMA/Oceananigans.jl

glwagner · 2026-03-21T03:38:56Z

Summary

This PR intends to clarify the meaning of "initialization" versus other actions associated with set!. A principal source of confusion is initialization_update_state!, whose name was intended to mean "update state, during initialization". However, the name seemed to be interpreted as "update state AND initialization", such that this function was overloaded with initialization activities. To fix this confusion this PR eliminates initialization_update_state!.

Another related issue is that set! was being misinterpreted as an initial condition function that can be called only once. This is not true; the intent of the design is that set! can be called multiple times. This actually can be useful for some cases where a second call to set! may use a computation from the first call. There isn't really a use case for this as far as I can tell for ocean modeling, but it applies to other kinds of models (like atmosphere models and as a corollary, for coupled models), and I believe consistency and non-surprising behavior across the ecosystem is important.

To help resolve these inconsistencies, this PR creates a distinction between "state reconciliation" which is needed to ensure consistency between 1) the 3D velocity field and the barotropic mode and 2) between the grid metrics and surface displacement for MutableVerticalDiscretization. In particular, state reconcilation needs to be performed whenever the model state is changed --- so it can occur more often than "initialization", which occurs only once. Moreover state reconciliation is "idempotent", in that multiple calls to reconcile_state!(model) (a new function for this PR) have no additional effect.

In contrast to state reconcilation, initialization is NOT guaranteed to be idempotent, and therefore must strictly be called only once. Initialization is now performed by time_step!(simulation), which is called by run!. This also means that "bare" time-stepping with time_step!(model, dt) (ie without Simulation) is not guaranteed correct unless initialize!(model) is called beforehand. Right now this issue only applies to models that use the Lagrangian-averaged DynamicSmagorinsky closure.

Note that we also call reconcile_state!(model) during initialization (recall that reconcile_state! is idempotent; so it can be called as often as needed), and we call fill_halo_regions! on prognostic variables as well. This resolves an issue raised on #5352 where halo filling needs to be moved out of update_state! (meaning that update_state! can no longer fulfill the role of a complete auxiliary state update). After this PR, a total auxiliary state update requires calling both reconcile_state! (which is not called during time-stepping), and update_state! which is called during time-stepping.

In summary this PR

Removes the confusing initialization_update_state! function, replacing it with a clear separation of concerns
Introduces reconcile_state! — ensures auxiliary state (barotropic velocities, vertical coordinate σ) is consistent with prognostic fields after external mutation (e.g. via set!)
Renames initialize_free_surface! → reconcile_free_surface! and initialize_vertical_coordinate! → reconcile_vertical_coordinate! to reflect their true role as reconciliation, not one-time initialization
Renames maybe_initialize_state! → maybe_prepare_first_time_step! (calls only update_state! at iteration 0 as a safety net for bare model time-stepping)
Adds reconcile_state kwarg to set! for advanced users
Updates model interface documentation with idempotency requirements
Bumps version to 0.106.1

Function roles after this PR

Function	Idempotent?	Purpose
`reconcile_state!`	Yes	Make auxiliary state that isn't changed by `update_state!` (U,V,σ) consistent with prognostic fields
`update_state!`	Yes	Compute auxiliary state except the parts handled by `reconcile_state!`: recompute diagnostics, fill halos (async), compute tendencies
`initialize!`	No	`reconcile_state!` + sync halo fills + closure field init
`maybe_prepare_first_time_step!`	Yes	Safety net: calls `reconcile_state!` and `update_state!` at iter 0 for bare model time-stepping

Closes discussion from #5352.

Test plan

Smoke tests: model construction, set!, SplitExplicit barotropic mode, Simulation.run!, bare time_step!
HFSM tests: 152 passed, 7 broken (pre-existing)
Simulation tests: 55/55 passed
Time stepping tests: 118 passed, 2 broken (pre-existing)
Reactant HFSM tests: 2 errors (pre-existing MLIR failures, same on main)
Multi-region / CubedSphere tests (running)
CI

🤖 Generated with Claude Code

@trace

In tick!, both clock.last_Δt and clock.last_stage_Δt were set to the same Δt.mlir_data, causing them to alias. Inside @trace for loops, this triggers "Attempt to donate a buffer which is also used by the same call to Execute()" because XLA tries to donate one while reading the other in the while loop carry. Break the alias by creating a copy via Δt + zero(Δt) for last_stage_Δt. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Replace the confusing `initialization_update_state!` with a clear separation of concerns between `reconcile_state!`, `initialize!`, and `update_state!`: - `reconcile_state!`: ensures auxiliary state (barotropic velocities, vertical coordinate scaling) is consistent with prognostic fields. Called in `set!` and `initialize!`. Idempotent. - `initialize!`: one-time setup before first time step. Calls `reconcile_state!`, synchronous halo fills, and `initialize_closure_fields!`. Allowed to be non-idempotent. - `update_state!`: idempotent recomputation of diagnostics/derived quantities. Called every time step and after `set!`. Renames: - `initialize_free_surface!` → `reconcile_free_surface!` - `initialize_vertical_coordinate!` → `reconcile_vertical_coordinate!` - `maybe_initialize_state!` → `maybe_initialize!` (now also calls `initialize!`) The `set!` function gains a `reconcile_state` kwarg (default `true`) for advanced users who set barotropic velocities independently. Bump version to 0.106.1. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

maybe_initialize! should only call update_state!, not initialize!. Calling initialize! inside time_step! would cause double-initialization when used with Simulation (which already calls initialize! in Simulation.initialize!). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Also move reconcile_state! definition to TimeSteppers (where update_state! lives) and call it from maybe_prepare_first_time_step! so that bare time_step!(model, Δt) reconciles state on the first step. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

XLA folds Δt + zero(Δt) → Δt, so the aliasing between last_Δt and last_stage_Δt persists. Use Reactant.Ops.optimization_barrier which XLA cannot fold, ensuring distinct buffers for the while loop carry. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Add `materialize_clock!(clock, timestepper)` called in HFSM and NH model constructors. For QuasiAdamsBashforth2TimeStepper, this sets `clock.last_Δt = clock.last_stage_Δt` to ensure they are distinct objects. This is needed for Reactant, where aliased ConcreteRNumber fields cause XLA buffer donation errors in compiled loops. With materialize_clock! breaking the alias at construction time, the optimization_barrier workaround in the Reactant tick! override is no longer needed and is removed. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

materialize_clock! in the constructor doesn't prevent MLIR-level aliasing that occurs at runtime inside tick! when both last_Δt and last_stage_Δt are assigned the same Δt value. The optimization_barrier is still needed until Reactant provides a proper buffer copy op. See #5389 (comment) Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

materialize_clock! in the model constructor ensures last_Δt and last_stage_Δt are distinct ConcreteRNumber objects. Assigning the same value to both in tick! via .mlir_data does not re-alias the objects. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

For QAB2, last_Δt and last_stage_Δt always hold the same value after tick!. In the Reactant extension, materialize_clock! aliases them via setfield! (bypassing the ReactantClock setproperty! override) so that Reactant's tracer sees one buffer, avoiding XLA buffer donation errors. The src/ QAB2 method is removed since aliasing is only needed for Reactant's ConcreteRNumber fields (normal Float64 fields are immutable values with no aliasing concern). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…tion-update-state

simone-silvestri · 2026-03-25T14:15:17Z

I think this is ready to merge. We just need to fix the multi-region tests, I can take a look.

…liMA/Oceananigans.jl into glw/remove-initialization-update-state

src/Models/HydrostaticFreeSurfaceModels/HydrostaticFreeSurfaceModels.jl

glwagner

comment

codecov · 2026-03-25T20:04:47Z

Codecov Report

❌ Patch coverage is 87.09677% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.01%. Comparing base (85ba915) to head (8d89c68).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
ext/OceananigansReactantExt/Models.jl	0.00%	2 Missing ⚠️
ext/OceananigansReactantExt/TimeSteppers.jl	75.00%	1 Missing ⚠️
src/TimeSteppers/TimeSteppers.jl	75.00%	1 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #5420   +/-   ##
=======================================
  Coverage   74.01%   74.01%           
=======================================
  Files         400      400           
  Lines       22875    22876    +1     
=======================================
+ Hits        16930    16932    +2     
+ Misses       5945     5944    -1

Flag	Coverage Δ
buildkite	`68.87% <87.09%> (-0.03%)`	⬇️
julia	`68.87% <87.09%> (-0.03%)`	⬇️
reactant_unit	`3.77% <0.00%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Compatibility broken by CliMA/Oceananigans.jl#5420, ref NumericalEarth/Breeze.jl#583.

glwagner and others added 4 commits March 20, 2026 18:55

glwagner requested a review from simone-silvestri March 21, 2026 04:17

glwagner and others added 6 commits March 21, 2026 06:53

Merge branch 'main' into glw/remove-initialization-update-state

c26c1f9

simone-silvestri approved these changes Mar 24, 2026

View reviewed changes

glwagner and others added 3 commits March 24, 2026 09:23

Merge remote-tracking branch 'origin/main' into glw/remove-initializa…

55659ad

…tion-update-state

rm stale import

fdaf055

Merge branch 'main' into glw/remove-initialization-update-state

c6dc459

simone-silvestri added 3 commits March 25, 2026 16:07

fill free surface halos before timestepping

8e1fad6

Merge branch 'glw/remove-initialization-update-state' of github.com:C…

f9fc85f

…liMA/Oceananigans.jl into glw/remove-initialization-update-state

remove stale imports

1131761

glwagner commented Mar 25, 2026

View reviewed changes

src/Models/HydrostaticFreeSurfaceModels/HydrostaticFreeSurfaceModels.jl Outdated Show resolved Hide resolved

glwagner commented Mar 25, 2026

View reviewed changes

simone-silvestri added 4 commits March 25, 2026 16:41

add fill halos in reconcile_state!

56d30fd

now just need to fix multi-region

a8e49ff

cubed sphere implementation

2f57d80

fix the dispatch

8d89c68

glwagner merged commit f231920 into main Mar 26, 2026
70 checks passed

glwagner deleted the glw/remove-initialization-update-state branch March 26, 2026 05:16

simone-silvestri mentioned this pull request Mar 26, 2026

Bump minor version from 0.106.2 to 0.106.3 #5431

Merged

giordano added the breaking change 💔 Concerning a change which breaks the API label Mar 26, 2026

giordano added a commit to JuliaRegistries/General that referenced this pull request Mar 26, 2026

[Breeze] Restrict comapt with Oceananigans v0.106

520eb37

Compatibility broken by CliMA/Oceananigans.jl#5420, ref NumericalEarth/Breeze.jl#583.

giordano mentioned this pull request Mar 26, 2026

[Breeze] Restrict comapt with Oceananigans v0.106 JuliaRegistries/General#151405

Merged

giordano added a commit to JuliaRegistries/General that referenced this pull request Mar 26, 2026

[Breeze] Restrict comapt with Oceananigans v0.106 (#151405)

4fe3bc0

Compatibility broken by CliMA/Oceananigans.jl#5420, ref NumericalEarth/Breeze.jl#583.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove initialization_update_state! and clean up initialization#5420

Remove initialization_update_state! and clean up initialization#5420
glwagner merged 20 commits intomainfrom
glw/remove-initialization-update-state

glwagner commented Mar 21, 2026 •

edited

Loading

Uh oh!

simone-silvestri commented Mar 25, 2026

Uh oh!

Uh oh!

glwagner left a comment

Uh oh!

codecov bot commented Mar 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

glwagner commented Mar 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Function roles after this PR

Test plan

Uh oh!

simone-silvestri commented Mar 25, 2026

Uh oh!

Uh oh!

glwagner left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

glwagner commented Mar 21, 2026 •

edited

Loading

codecov bot commented Mar 25, 2026 •

edited

Loading