Description
pyFlwDir version checks
- I have checked that this issue has not already been reported.
- I have checked that this bug exists on the latest version of pyFlwDir.
Reproducible Example
import numpy as np
import rasterio as rio
import pyflwdir

def accumulate_flow(
    flow_direction_filename,
    headwaters_filename,
):
    # Read the flow direction raster
    with rio.open(flow_direction_filename) as src:
        data = src.read(1)
        nodata = src.nodata
        profile = src.profile

    # Remap flow directions (1-8) to d8 codes; 247 is the d8 nodata value.
    # np.full avoids the uninitialised cells that np.ndarray would leave.
    temp = np.full(np.shape(data), 247, dtype=np.uint8)
    temp[data == 1] = 1
    temp[data == 2] = 128
    temp[data == 3] = 64
    temp[data == 4] = 32
    temp[data == 5] = 16
    temp[data == 6] = 8
    temp[data == 7] = 4
    temp[data == 8] = 2
    flw = pyflwdir.from_array(temp, ftype='d8', check_ftype=False)
    del temp

    # Read the headwaters raster
    with rio.open(headwaters_filename) as src:
        headwaters = src.read(1)
        nodata = src.nodata

    flowaccum = flw.accuflux(headwaters, nodata=nodata, direction='up')
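The eight per-value assignments in the example above can also be expressed as a single vectorized lookup table, which is easier to audit and extend. This is a sketch, not pyflwdir API: the helper name `to_d8` and the assumption that input classes run 1-8 (TauDEM-style, counter-clockwise from east) come from the example, and 247 is the d8 nodata value used there.

```python
import numpy as np

# Lookup table: index = input flow-direction class, value = d8 code.
# Anything outside 1-8 (including nodata) maps to 247 (d8 nodata).
D8_LUT = np.full(256, 247, dtype=np.uint8)
D8_LUT[1:9] = [1, 128, 64, 32, 16, 8, 4, 2]

def to_d8(data, nodata):
    """Remap a flow-direction raster to d8 codes in one vectorized pass."""
    out = np.full(data.shape, 247, dtype=np.uint8)
    valid = (data >= 1) & (data <= 8)
    out[valid] = D8_LUT[data[valid].astype(np.uint8)]
    return out

demo = np.array([[1, 5, -9999], [8, 2, 3]])
print(to_d8(demo, -9999))  # [[  1  16 247] [  2 128  64]]
```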
    ...

Current behaviour
When accumulating flow (accuflux) on larger rasters generated from 1m LiDAR data, segmentation faults occur.
The specifics:
1m LiDAR flow direction file & headwaters file (same size raster)
>>> np.shape(data)
(63708, 79780)
flow_direction_filename is 253M
head_waters_filename is 66M
Rasters generated from 3m LiDAR data do not seg fault and process successfully.
>>> np.shape(data)
(21236, 26593)
flow_direction_filename is 107M
head_waters_filename is 7.4M
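One difference between the two rasters may be relevant: the failing 1m grid has more cells than fit in a 32-bit integer (signed or unsigned), while the working 3m grid fits comfortably in a signed 32-bit index, so any internal use of 32-bit linear indices would overflow only on the larger raster. This is a quick arithmetic check, not a confirmed diagnosis:

```python
# Cell counts for the two rasters reported above.
cells_1m = 63708 * 79780   # failing raster
cells_3m = 21236 * 26593   # working raster

print(cells_1m)              # 5082624240
print(cells_3m)              # 564728948
print(cells_1m > 2**32 - 1)  # True: exceeds even uint32
print(cells_3m < 2**31 - 1)  # True: fits in int32
```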
When the Python script is called from a shell script, an Exit Status of 139 is observed. Further debugging:
export PYTHONFAULTHANDLER=1
root@f7f7e3af5d8b:/# python3 $srcDir/accumulate_headwaters.py -fd flowdir_d8_burned_filled_0.tif -wg headwaters_0.tif
Fatal Python error: Segmentation fault
Current thread 0x00007ff9300d8000 (most recent call first):
File "/usr/local/lib/python3.10/dist-packages/pyflwdir/flwdir.py", line 215 in order_cells
File "/usr/local/lib/python3.10/dist-packages/pyflwdir/pyflwdir.py", line 272 in idxs_seq
File "/usr/local/lib/python3.10/dist-packages/pyflwdir/flwdir.py", line 555 in accuflux
File "/accumulate_headwaters.py", line 73 in accumulate_flow
File "/accumulate_headwaters.py", line 110 in <module>
Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, yaml._yaml, numba.core.typeconv._typeconv, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box, scipy._lib._ccallback_c, _cffi_backend, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy._lib.messagestream, scipy.optimize._trlib._trlib, numpy.linalg.lapack_lite, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, 
scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.spatial._ckdtree, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.optimize._direct, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, rasterio._version, rasterio._err, rasterio._filepath, rasterio._env, _brotli, rasterio._transform, rasterio._base, rasterio.crs, rasterio._features, rasterio._warp, rasterio._io (total: 98)
Segmentation fault (core dumped)
As a naive workaround, the source code was modified to hardcode method="sort" in the call to order_cells; this did not help. As seen in the traceback above, line 215 in order_cells calls core.idxs_seq, which appears to be the root of the problem. No further investigation has been made past this point.
Desired behaviour
Ideally, larger rasters would process without segmentation faults. If not, the error could be handled more gracefully from Python, with a message stating that the raster is too big to process, or....
Another option would be to provide documentation/examples on how to split larger rasters into chunks, along with a tool/utility to join/concatenate the 'blocked' or 'chunked' rasters back into a single pyflwdir.FlwdirRaster object once processing (whether accuflux or another FlwdirRaster method) is finished.
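As a rough illustration of the chunking idea, the sketch below tiles a raster into overlapping windows. The tile size, overlap, and helper name `iter_tiles` are all arbitrary choices for illustration; merging accuflux results across tile boundaries (flow paths cross tile edges) is the hard part and is not solved here.

```python
def iter_tiles(shape, tile=1024, overlap=16):
    """Yield (row_slice, col_slice) windows covering a raster of `shape`,
    padded with `overlap` cells of halo on each side where possible."""
    rows, cols = shape
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            yield (
                slice(max(r0 - overlap, 0), min(r0 + tile + overlap, rows)),
                slice(max(c0 - overlap, 0), min(c0 + tile + overlap, cols)),
            )

tiles = list(iter_tiles((2500, 3000), tile=1024, overlap=16))
print(len(tiles))  # 9: 3 row bands x 3 column bands
```

Each window could be read with rasterio's windowed reading and processed independently; only the stitching step would need new support from pyflwdir.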
Additional context
Memory usage was tracked: less than one third of the available RAM was in use when the segmentation fault occurred, so this does not appear to be a simple out-of-memory condition.
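For anyone reproducing this, the process's peak memory can be recorded alongside the faulthandler output with the stdlib `resource` module (Unix only; this is one way to check memory, not how it was tracked above). On Linux, ru_maxrss is reported in KiB:

```python
import resource

def peak_rss_mib():
    """Peak resident set size of this process in MiB (Linux: ru_maxrss is KiB)."""
    kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return kib / 1024

print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```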