Skip to content

Conversation

@bouweandela
Copy link
Member

@bouweandela bouweandela commented Jun 28, 2023

Description

Closes #2114 and closes #2121


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number pull requests:

@bouweandela bouweandela added the preprocessor Related to the preprocessor label Jun 28, 2023
@valeriupredoi
Copy link
Contributor

cheers muchly for starting this @bouweandela 🍺 I am testing with @ledm 's used case:

import iris
from esmvalcore.preprocessor._regrid import extract_levels

fn = "/badc/cmip6/data/CMIP6/CMIP/MOHC/UKESM1-0-LL/historical/r2i1p1f2/Omon/po4/gn/v20190708/po4_Omon_UKESM1-0-LL_historical_r2i1p1f2_gn_200001-201412.nc"

cube= iris.load_cube(fn)
c2 = extract_levels(cube, scheme='nearest', levels = [0.1 ,])

which with main goes through, but uses 12.7GB of mem; with this branch I am getting this:

Traceback (most recent call last):
  File "/home/users/valeriu/lee_blocker.py", line 7, in <module>
    c2 = extract_levels(cube, scheme='nearest', levels = [0.1 ,])
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/users/valeriu/ESMValCore/esmvalcore/preprocessor/_regrid.py", line 1053, in extract_levels
    result = _vertical_interpolate(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/users/valeriu/ESMValCore/esmvalcore/preprocessor/_regrid.py", line 875, in _vertical_interpolate
    new_data = stratify.interpolate(
               ^^^^^^^^^^^^^^^^^^^^^
  File "src/stratify/_vinterp.pyx", line 544, in stratify._vinterp.interpolate
  File "/home/users/valeriu/miniconda3-June2021/envs/esmvaltool-latest/lib/python3.11/site-packages/dask/array/core.py", line 4575, in asarray
    return from_array(a, getitem=getter_inline, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/users/valeriu/miniconda3-June2021/envs/esmvaltool-latest/lib/python3.11/site-packages/dask/array/core.py", line 3487, in from_array
    token = tokenize(x, chunks, lock, asarray, fancy, getitem, inline_array)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/users/valeriu/miniconda3-June2021/envs/esmvaltool-latest/lib/python3.11/site-packages/dask/base.py", line 999, in tokenize
    hasher = _md5(str(tuple(map(normalize_token, args))).encode())
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/users/valeriu/miniconda3-June2021/envs/esmvaltool-latest/lib/python3.11/site-packages/dask/utils.py", line 642, in __call__
    return meth(arg, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/users/valeriu/miniconda3-June2021/envs/esmvaltool-latest/lib/python3.11/site-packages/dask/base.py", line 1288, in normalize_array
    data = hash_buffer_hex(x.ravel(order="K").view("i1"))
                           ^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 11.9 GiB for an array with shape (1603800000,) and data type float64

@valeriupredoi
Copy link
Contributor

@bouweandela your code changes are are proving to be absolutele memory-savers, bud! I tested again with @ledm 's minimal case above, and now the run goes fine, needing order 300M of memory, as compared to a needed allocation of 12G as per file 🍺 🍺 🍺 @ledm you may want to grab this one to save you tons of memory (if you are not using altitude-pressure conversions which are still realized in memory 👍

@bouweandela
Copy link
Member Author

Yes, this should work for normal use cases and possibly also with the altitude - pressure conversion. However, it does not work for derived coordinates like hybrid sigma because these end up with really large chunks (the example I saw had 14 GB chunks). I suspect that is an iris issue though, but need more time to investigate.

@valeriupredoi
Copy link
Contributor

OK how about we merge this as is (needs a couple tests, I can do that) with the caveat emptor to be solved that derived coords need lazy-ing?

@ledm
Copy link
Contributor

ledm commented Jul 6, 2023

This is great - thanks! It seems to fix the minimal test case. I'll check the full run next!

@ledm
Copy link
Contributor

ledm commented Jul 6, 2023

This is great - thanks! It seems to fix the minimal test case. I'll check the full run next!

Nope, failed again with the same error. Starting to think this is something else?

@bouweandela
Copy link
Member Author

it does not work for derived coordinates like hybrid sigma because these end up with really large chunks (the example I saw had 14 GB chunks). I suspect that is an iris issue though, but need more time to investigate.

Opened an issue about this in Iris: SciTools/iris#5457

@bouweandela
Copy link
Member Author

OK how about we merge this as is (needs a couple tests, I can do that) with the caveat emptor to be solved that derived coords need lazy-ing?

It would be great if you have time to finish this pull request

@valeriupredoi
Copy link
Contributor

OK I'll have a look this week then, bud 🍺

@bouweandela bouweandela marked this pull request as ready for review January 15, 2024 15:12
@bouweandela
Copy link
Member Author

Uploading coverage to Codecov is failing for some reason, even if their status page reports all systems are online.

@bouweandela bouweandela added this to the v2.11.0 milestone Jan 15, 2024
@bouweandela bouweandela self-assigned this Jan 16, 2024
@codecov
Copy link

codecov bot commented Jan 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (cbf9394) 94.08% compared to head (e566afa) 94.11%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2120      +/-   ##
==========================================
+ Coverage   94.08%   94.11%   +0.02%     
==========================================
  Files         240      240              
  Lines       13308    13363      +55     
==========================================
+ Hits        12521    12576      +55     
  Misses        787      787              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bouweandela bouweandela requested a review from schlunma January 25, 2024 09:38
@bouweandela
Copy link
Member Author

@schlunma @valeriupredoi @ledm I think I have now fixed all the issues with making the extract_levels preprocessor function lazy. Could you have another go at testing this branch please and let me know if there's any further issues? Please make sure to use the Distributed scheduler to avoid running out of memory.

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bouweandela, code looks good!

Will test this with real data now.

@schlunma
Copy link
Contributor

Just tested this with a real recipe (see below) that covers all cases (regular pressure level coordinate, hybrid height coordinate, conversion altitude -> air pressure). Everything works well! 🚀 Without this PR, the recipe hangs indefinitely; with this PR, the recipe finishes in ~2 min when a reasonably high amount of memory is used:

cluster:
  type: distributed.LocalCluster
  n_workers: 3
  threads_per_worker: 4
  memory_limit: 10 GiB
Recipe
# ESMValTool
---
documentation:
  title: Test
  description: test
  authors:
    - schlund_manuel


datasets:
  - {project: CMIP6, dataset: MPI-ESM1-2-HR,   exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {project: CMIP6, dataset: MPI-ESM1-2-LR,   exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {project: CMIP6, dataset: HadGEM3-GC31-LL, exp: historical, ensemble: r1i1p1f3, grid: gn}
  - {project: native6, dataset: ERA5, type: reanaly, version: v1, tier: 3}


preprocessors:

  test:
    extract_levels:
      levels: 12345  # use weird level to ensure interpolation
      scheme: linear
      coordinate: air_pressure


diagnostics:

  test:
    variables:
      cl:
        timerange: '2003/2007'
        mip: Amon
        preprocessor: test
    scripts:
      null

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing my comments. I am fine with merging this now, but we could also wait until the iris features are available. Please just make sure to remove any duplicated code when the corresponding iris counterpart is available.

@bouweandela bouweandela merged commit 265d2f4 into main Feb 21, 2024
@bouweandela bouweandela deleted the more-lazy-extract-levels branch February 21, 2024 14:03
@bouweandela
Copy link
Member Author

Merging now because the iris 3.8 release candidate is out and it does not include the features needed here. They will hopefully be available the 3.9 release planned for early summer.

@chrisbillowsMO chrisbillowsMO added the dask related to improvements using Dask label May 13, 2024
@nhsavage nhsavage changed the title More lazy extract_levels preprocessor function WIP More lazy extract_levels preprocessor function May 14, 2024
@nhsavage nhsavage changed the title WIP More lazy extract_levels preprocessor function More lazy extract_levels preprocessor function May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dask related to improvements using Dask preprocessor Related to the preprocessor

Projects

No open projects

Development

Successfully merging this pull request may close these issues.

extract_levels is not lazy when using altitude - pressure levels conversion extract_levels preprocessor realizes source levels

6 participants