
Improve benchmark tests #4842

Merged
simone-silvestri merged 14 commits into main from ss/improve-benchmarks-pipeline
Oct 13, 2025

Conversation

@simone-silvestri
Collaborator

I am afraid that, given the very low velocity values and the almost constant temperature and salinity, the benchmark tests do not run in a fair way. I had already noticed a case where performance depended on the initial conditions (i.e., if the velocities are zero, the kernel automatically skips computing the reconstructions).

This PR changes the initial conditions of the benchmark tests to make sure they are fair: a fluid stratified in z with much larger velocity fluctuations that cannot die out in the first iteration.
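The idea can be sketched outside of Oceananigans. Below is a minimal NumPy sketch (grid size, amplitudes, and the seed are hypothetical, not the PR's actual setup) of a z-stratified temperature field plus seeded, genuinely nonzero velocity fluctuations:

```python
import numpy as np

# Hypothetical sizes and amplitudes, for illustration only.
nz = 16
z = np.linspace(-1000.0, 0.0, nz)   # depth coordinate (m)

rng = np.random.default_rng(1234)   # fixed seed (the Julia analogue is Random.seed!)

# Temperature stratified in z: warm at the surface, cooler at depth.
T = 20.0 + 0.01 * z                 # 10 degC at -1000 m, 20 degC at the surface

# Velocity fluctuations large enough not to die out in the first iteration,
# so advection kernels cannot short-circuit on zero velocities.
u = 0.1 * rng.standard_normal(nz)

assert np.all(np.diff(T) > 0)       # stably stratified: T increases toward the surface
assert np.any(u != 0.0)             # velocities are genuinely nonzero
```

Seeding the generator keeps the benchmark deterministic across runs while still exercising the full advection path.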

@simone-silvestri added the benchmark performance label (runs preconfigured benchmarks and spits out timing) Oct 10, 2025
@simone-silvestri
Collaborator Author

The benchmarks seem to be fair, as this PR has no effect on the performance results.
We could probably close this PR, though I would prefer merging it so we know we have a solid test case.
However, do any of you know what all these test failures unrelated to this PR are?
@navidcy @glwagner @siddharthabishnu

Updated the bottom function to use a fixed denominator for bathymetry calculations and added Random.seed! for consistency in tests.
@simone-silvestri
Collaborator Author

Also, a random bathymetry with values going from -5000 to -1000 m, as it is right now, is not really representative of an ocean simulation, where the bathymetry is somewhat smoother (the benchmarks here hit far more reduced-order advection near the immersed boundary than would occur in a real simulation). Therefore, I think it is better to use a sloped bathymetry.
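To illustrate the difference (function names and ranges below are hypothetical, not the ones in the PR), compare an uncorrelated random bottom with a smooth slope spanning the same depth range:

```python
import numpy as np

def random_bottom(x, rng):
    """Old-style bottom: uncorrelated depths between -5000 m and -1000 m.
    Neighboring cells can jump by thousands of meters."""
    return rng.uniform(-5000.0, -1000.0, size=np.shape(x))

def sloped_bottom(x, x_max=1.0, h_deep=-5000.0, h_shallow=-1000.0):
    """Smooth linear slope: -5000 m at x = 0, -1000 m at x = x_max."""
    return h_deep + (h_shallow - h_deep) * np.asarray(x) / x_max

x = np.linspace(0.0, 1.0, 8)
h = sloped_bottom(x)

# The slope changes gently between neighboring cells (~571 m per step here),
# so the fraction of cells near the immersed boundary is more realistic.
assert np.all(np.abs(np.diff(h)) < 600.0)
```

With the random bottom, almost every column has an abrupt immersed-boundary transition, which over-weights the reduced-order code paths in the timing.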

@glwagner
Member

The benchmarks should measure time per time-step (not SYPD).
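For reference, SYPD and time per time-step are interconvertible once the model time step is fixed, which is why measuring the latter directly is cleaner. A quick sketch (the numbers are hypothetical):

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

def sypd(dt_model, wall_seconds_per_step):
    """Simulated years per wall-clock day, given the model time step (s)
    and the measured wall time per step (s)."""
    steps_per_wall_day = SECONDS_PER_DAY / wall_seconds_per_step
    return dt_model * steps_per_wall_day / SECONDS_PER_YEAR

# e.g. a 20-minute model step taking 0.1 s of wall time per step:
print(round(sypd(1200.0, 0.1), 2))  # ~32.88 SYPD
```

Time per step is the directly measured quantity; SYPD folds in the choice of Δt, so two runs with different Δt are not comparable in SYPD even when the kernels are identical.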

@glwagner
Member

Where are the benchmarks uploaded? Can we put a link into the documentation or readme or somewhere?

@glwagner (Member) left a comment

are there benchmarks for NonhydrostaticModel?

cc @giordano

@simone-silvestri
Collaborator Author

The benchmarks should measure time per time-step (not SYPD).

They do better (for GPU only)! They measure the timing of each kernel independently. For the CPU tests, I have this open PR #4711, which probably still needs some work for the formatting of the output.

are there benchmarks for NonhydrostaticModel?

No, but we can include them for sure. The tests are run on Tartarus, and a .txt file summarizing the performance of the tested simulation is uploaded as a buildkite artifact.

@glwagner
Member

and a .txt file summarizing the performance of the tested simulation is uploaded as a buildkite artifact.

How can one view the artifact? Is there documentation or instructions somewhere?

@glwagner
Member

They do better (for GPU only)! They measure the timing of each kernel independently.

Why does it matter what the bathymetry is then?

@simone-silvestri
Collaborator Author

Where are the benchmarks uploaded? Can we put a link into the documentation or readme or somewhere?

The tests are run on Tartarus, and a .txt file summarizing the performance of the tested simulation is uploaded as a Buildkite artifact. They are not really user-friendly; they are more tests for developers, to make sure we don't hit a performance regression:

This is an example of the output of the benchmark tests (Buildkite artifact: diff_immersed_output.txt).

The artifacts are uploaded on the Buildkite pipeline, under the artifacts tab. For example:
https://buildkite.com/clima/oceananigans-benchmarks-1/builds/3442#0199cdc9-5dc2-4d76-a0dd-849eb1184893

The bathymetry plays a role because of the different methods used near the immersed boundary, which modify the profile of the simulation.

@glwagner
Member

The bathymetry plays a role because of the different methods used near the immersed boundary, which modify the profile of the simulation.

But if we are using non-short-circuiting logic, the same methods are called in every cell regardless of the immersed boundary.

@simone-silvestri
Collaborator Author

I am not sure that always happens on a GPU; there might be some optimizations that occur under the hood.

@glwagner
Member

I am not sure that always happens on a GPU; there might be some optimizations that occur under the hood.

I don't think it's possible because the bathymetry is determined by an array whose values can change / are only known at runtime (not compile time)

@simone-silvestri
Collaborator Author

Yeah, but at runtime, how CUDA organizes execution might depend on possible zeros.
I noticed that if you multiply a very complicated computation (for example, a WENO reconstruction) by zero, the reconstruction is avoided altogether.
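A toy CPU-side Python sketch of the effect being described (an illustration of the idea, not actual GPU code): when the multiplier is known to be zero, the expensive reconstruction can be skipped without changing the result.

```python
def weno_like_reconstruction(stencil):
    # Stand-in for an expensive WENO reconstruction: many flops per call.
    weights = (0.1, 0.6, 0.3)
    return sum(w * s for w, s in zip(weights, stencil))

def advective_flux(u, stencil):
    # Naive form: always pays for the reconstruction.
    return u * weno_like_reconstruction(stencil)

def advective_flux_guarded(u, stencil):
    # What a compiler or the hardware may effectively do: u == 0 makes the
    # product zero regardless of the stencil, so skip the reconstruction.
    if u == 0.0:
        return 0.0
    return u * weno_like_reconstruction(stencil)

stencil = (1.0, 2.0, 3.0)
assert advective_flux(0.0, stencil) == advective_flux_guarded(0.0, stencil) == 0.0
assert abs(advective_flux(2.0, stencil) - advective_flux_guarded(2.0, stencil)) < 1e-12
```

If a benchmark initializes all velocities to zero, it effectively times the cheap guarded path, which is why nonzero initial fluctuations matter for a fair measurement.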


@giordano
Collaborator

To make Greg happy, we could automatically upload the benchmark results somewhere so that we can produce charts like https://enzymead.github.io/Enzyme-JAX/benchmarks/ and https://enzymead.github.io/Reactant.jl/benchmarks/. Those use https://github.com/benchmark-action/github-action-benchmark, but we can explore alternative visualisations if you prefer; this is just to give an idea. This could be another thing I can work on next month, if you're interested.

@simone-silvestri
Collaborator Author

simone-silvestri commented Oct 13, 2025

I would be very happy if you would like to take this on! Also, this could help with #4711. I can work on adding some relevant non-hydrostatic benchmarks in another PR (if I can find some time).

@giordano
Collaborator

I opened #4851 to track this task.

@simone-silvestri
Collaborator Author

In the meantime, I will merge this PR, which just changes the initial condition and physical setup slightly.

@simone-silvestri simone-silvestri merged commit 6a75213 into main Oct 13, 2025
58 of 59 checks passed
@simone-silvestri simone-silvestri deleted the ss/improve-benchmarks-pipeline branch October 13, 2025 12:00
