Description
The Hazard class offers several options to instantiate it from data files, e.g. from_raster, from_excel, etc. The classmethod from_raster, in particular, uses rasterio to open datasets and read their metadata, coordinates, and data. In this issue, I want to discuss whether a general-purpose classmethod for reading data from a NetCDF file into a Hazard object would be useful, and what such a method could look like. A method that implements some of this functionality can be found at climada_petals/blob/feature/wildfire/climada_petals/hazard/wildfire.py#L2247.
What the method should do
Use a single NetCDF file to load data for a consistent instance of Hazard, meaning that if data is missing, it will be set to a sensible default.
The minimal (i.e., essential) data supplied as variables in the file should be:
- hazard intensity data (2D or 3D dataset)
- coordinates (1D dataset each)
- time (1D dataset, if applicable)
Optional data could include:
- hazard fraction data (same dimensions as intensity)
- event frequency (1D)
- event names (1D)
- event IDs (1D)
- coordinate system information (attributes/metadata)
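To make the expected layout concrete, here is a minimal sketch of an in-memory dataset with the essential variables and a few of the optional ones. The variable names ("intensity", "fraction", "frequency"), the "crs" attribute, and all values are assumptions chosen for illustration, not a layout fixed by this proposal.

```python
import numpy as np
import xarray as xr

# Coordinates: one 1D dataset per dimension (illustrative values)
time = np.array(["2000-01-01", "2000-01-02"], dtype="datetime64[ns]")
latitude = np.array([10.0, 11.0, 12.0])
longitude = np.array([-5.0, -4.0])

dims = ("time", "latitude", "longitude")
shape = (time.size, latitude.size, longitude.size)

ds = xr.Dataset(
    data_vars={
        # Essential: hazard intensity (3D here)
        "intensity": (dims, np.random.default_rng(0).random(shape)),
        # Optional: hazard fraction, same dimensions as intensity
        "fraction": (dims, np.ones(shape)),
        # Optional: event frequency, one value per event (time step)
        "frequency": (("time",), np.full(time.size, 0.5)),
    },
    coords={"time": time, "latitude": latitude, "longitude": longitude},
    # Optional: coordinate system information as metadata (assumed attribute name)
    attrs={"crs": "EPSG:4326"},
)
```

Such a dataset could be written to disk with ds.to_netcdf and would then be a candidate input for the proposed method.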
Method signature
from_netcdf should take the following arguments:
- data (path-like or xarray.Dataset, required): The dataset. Open the file if it is a path.
- intensity_var (string, required): The name of the hazard intensity variable in the dataset
- fraction_var (string, optional): The name of the hazard fraction variable in the dataset
- coordinate_vars (dict, optional): A mapping from default coordinate names to the variables used as coords in the dataset, e.g. dict(longitude="lon", latitude="y")
- tbd
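One way the coordinate_vars argument could be handled is to merge the user-supplied mapping into a set of default names. The sketch below assumes the defaults "time", "latitude", and "longitude"; the helper name resolve_coordinate_vars and the constant DEFAULT_COORDINATE_VARS are hypothetical and not part of CLIMADA.

```python
from typing import Dict, Optional

# Assumed default coordinate names (hypothetical, not fixed by the proposal)
DEFAULT_COORDINATE_VARS = {
    "time": "time",
    "latitude": "latitude",
    "longitude": "longitude",
}


def resolve_coordinate_vars(
    coordinate_vars: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Merge a user mapping (default name -> dataset variable) into the defaults."""
    resolved = dict(DEFAULT_COORDINATE_VARS)
    if coordinate_vars is None:
        return resolved

    # Reject keys that are not known default coordinate names
    unknown = set(coordinate_vars) - set(resolved)
    if unknown:
        raise ValueError(f"Unknown coordinate keys: {sorted(unknown)}")

    resolved.update(coordinate_vars)
    return resolved


# The example mapping from the argument description
mapping = resolve_coordinate_vars(dict(longitude="lon", latitude="y"))
```

Rejecting unknown keys early would surface typos such as "longitud" before any data is read.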
Method outline
Suppose a netCDF file contains the following data:
- intensity: 3D dataset (dims: "time", "longitude", "latitude")
- 1D coordinate dataset for each dimension
Then the following code creates a consistent Hazard instance from this data:
import numpy as np
import xarray as xr
from scipy.sparse import csr_matrix
from climada.hazard import Hazard
from climada.hazard.centroids.centr import Centroids
data = xr.open_dataset("...")
hazard = Hazard()
# Transpose the data so we flatten it with longitude running "fastest"
intensity = data["intensity"].transpose("time", "latitude", "longitude")
hazard.intensity = csr_matrix(intensity.values.reshape((data.sizes["time"], -1)))
hazard.intensity.eliminate_zeros()
# Build centroids
lat, lon = np.meshgrid(data["latitude"].values, data["longitude"].values, indexing="ij")
hazard.centroids = Centroids.from_lat_lon(lat.flatten(), lon.flatten())
hazard.centroids.set_lat_lon_to_meta()
# Consistent Hazard also needs
# hazard.fraction, hazard.event_id, hazard.event_name, hazard.frequency, hazard.date
# but these can be defaulted, e.g.
hazard.event_id = np.arange(1, data.sizes["time"] + 1)
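The remaining attributes could be defaulted along the following lines. This is only a sketch: the helper name default_event_attributes is hypothetical, and the chosen defaults (fraction of 1 wherever intensity is stored, frequency of 1 per event, event names derived from the IDs, dates as proleptic Gregorian ordinals) are assumptions that are open for discussion.

```python
import datetime as dt

import numpy as np
from scipy.sparse import csr_matrix


def default_event_attributes(intensity: csr_matrix, times: np.ndarray) -> dict:
    """Return assumed defaults for the attributes not read from the file."""
    num_events = intensity.shape[0]
    event_id = np.arange(1, num_events + 1)

    # Fraction: same sparsity pattern as intensity, every stored entry set to 1
    fraction = intensity.copy()
    fraction.data = np.ones_like(fraction.data)

    # Date: days since 1970-01-01 shifted to proleptic Gregorian ordinals
    days_since_epoch = np.asarray(times).astype("datetime64[D]").astype(np.int64)
    date = days_since_epoch + dt.date(1970, 1, 1).toordinal()

    return {
        "event_id": event_id,
        "event_name": [str(idx) for idx in event_id],
        "frequency": np.ones(num_events),
        "fraction": fraction,
        "date": date,
    }


# Two events on three centroids (illustrative values)
example_intensity = csr_matrix(np.array([[0.0, 1.0, 2.0], [3.0, 0.0, 0.0]]))
example_times = np.array(["2000-01-01", "2000-01-02"], dtype="datetime64[ns]")
defaults = default_event_attributes(example_intensity, example_times)
```

Each entry of the returned dictionary could then be assigned to the corresponding attribute of the Hazard instance.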