Describe the bug
There appears to be a race condition in tiledbsoma.io.register_h5ads when it is called with multiple H5AD blob store URIs and soma.compute_concurrency_level is greater than 1. I diagnosed this as a race condition because 1) I've only seen it happen while registering multiple H5AD files with soma.compute_concurrency_level greater than 1 and 2) it is non-deterministic under those conditions and sometimes succeeds. I haven't inspected the library internals deeply enough to confirm this logically.
The bug manifests as a FileNotFoundError followed by an UnboundLocalError raised during error handling. The UnboundLocalError at least is pretty clear: the finally block in read_h5ad (tiledbsoma/io/_util.py, line 59 in the traceback) assumes the variable anndata is defined, which is not the case when the call that assigns it raises the FileNotFoundError first. I haven't tried to look into the root cause of the initial FileNotFoundError. Notice, though, that the GCS URI scheme is incorrect in the error message: it is gs:/ instead of gs://, so I'd suspect path handling on some particular codepath triggered by concurrent registration.
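As an illustration of how a gs:// URI can turn into gs:/, the sketch below shows that POSIX path handling in the Python standard library collapses the double slash when a URI is treated as a filesystem path. This is only one plausible mechanism, not a confirmed diagnosis: I have not verified that tiledbsoma or anndata performs this conversion, and the bucket name is a placeholder.

```python
import posixpath
from pathlib import PurePosixPath

# Placeholder URI; the bucket name is made up for illustration.
uri = "gs://example-bucket/test_h5ads/h5ad_6.h5ad"

# Treating the URI as a POSIX path drops the empty component between
# the two slashes that follow the scheme.
as_path = str(PurePosixPath(uri))
normalized = posixpath.normpath(uri)

print(as_path)     # gs:/example-bucket/test_h5ads/h5ad_6.h5ad
print(normalized)  # gs:/example-bucket/test_h5ads/h5ad_6.h5ad
```

If something like this is happening, h5py would then look for a local file named gs:/..., which would explain the "No such file or directory" message.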
$ python repro.py
Experiment URI: /tmp/tmp0gi59nhd
Traceback (most recent call last):
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/_util.py", line 55, in read_h5ad
anndata = ad.read_h5ad(_FSPathWrapper(input_handle, input_path), mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_io/h5ad.py", line 246, in read_h5ad
return read_h5ad_backed(filename, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_io/h5ad.py", line 193, in read_h5ad_backed
adata = AnnData(**d)
^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/legacy_api_wrap/__init__.py", line 82, in fn_compatible
return fn(*args_all, **kw)
^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/anndata.py", line 249, in __init__
self._init_as_actual(
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/anndata.py", line 362, in _init_as_actual
self.file = AnnDataFileManager(self, filename, filemode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/file_backing.py", line 38, in __init__
self.open()
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/file_backing.py", line 100, in open
self._file = h5py.File(self.filename, self._filemode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/h5py/_hl/files.py", line 564, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/h5py/_hl/files.py", line 238, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 102, in h5py.h5f.open
FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = 'gs:/rarebase-data-ttl-14d/clayton/test_h5ads/h5ad_6.h5ad', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/claytonmellina/code/tiledb_race_bug/repro.py", line 38, in <module>
registration_mapping = tiledbsoma.io.register_h5ads(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/ingest.py", line 228, in register_h5ads
axes_metadata = list(
^^^^^
File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 619, in result_iterator
yield _result_or_cancel(fs.pop())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 317, in _result_or_cancel
return fut.result(timeout)
^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/_registration/ambient_label_mappings.py", line 344, in _load_axes_metadata_from_h5ads
with read_h5ad(p, mode="r") as adata:
^^^^^^^^^^^^^^^^^^^^^^
File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/_util.py", line 59, in read_h5ad
if anndata.file:
^^^^^^^
UnboundLocalError: cannot access local variable 'anndata' where it is not associated with a value
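The secondary UnboundLocalError can be reproduced in isolation. The sketch below is a simplified stand-in, not the actual code from tiledbsoma/io/_util.py: open_resource, read_buggy, read_fixed, and error_name are hypothetical names. It shows a context manager whose finally block touches a variable that was never bound, plus one possible guard that lets the original FileNotFoundError surface instead.

```python
from contextlib import contextmanager


def open_resource(path):
    # Hypothetical opener that always fails, standing in for the failing read.
    raise FileNotFoundError(path)


@contextmanager
def read_buggy(path):
    try:
        handle = open_resource(path)  # raises before `handle` is bound
        yield handle
    finally:
        handle.close()  # UnboundLocalError replaces the original error


@contextmanager
def read_fixed(path):
    handle = None  # bind the name before entering the try block
    try:
        handle = open_resource(path)
        yield handle
    finally:
        if handle is not None:
            handle.close()


def error_name(cm_factory):
    # Enter the context manager and report which exception type escapes.
    try:
        with cm_factory("gs://example-bucket/missing.h5ad"):
            pass
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(read_buggy))  # UnboundLocalError
print(error_name(read_fixed))  # FileNotFoundError
```

With a guard of this kind, the caller would see only the FileNotFoundError, which is the error that actually needs investigating.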
To Reproduce
I used GCS to reproduce this, so the repro assumes 1) installation of gsutil and 2) access to a GCS bucket to create test data.
- Create test data (substitute YOUR_GCS_URI with an appropriate gs://... URI):
# Download the base h5ad file used to test tiledb race condition.
SOURCE_FILE="/tmp/base.h5ad"
gsutil cp gs://arc-ctc-tahoe100/2025-02-25/tutorial/plate3_2k-obs.h5ad "$SOURCE_FILE"
DEST_DIR="/tmp/test_h5ads"
N_FILES=10
# Create the output h5ad directory.
mkdir -p "$DEST_DIR"
for i in $(seq 1 $N_FILES)
do
    # Construct the new filename and copy the file
    cp "$SOURCE_FILE" "${DEST_DIR}/h5ad_${i}.h5ad"
done
# Upload the h5ad files to GCS.
gsutil cp -r "$DEST_DIR" "YOUR_GCS_URI"
- Set up the environment:
python -m venv venv
source venv/bin/activate
pip install tiledbsoma==1.17.1 gcsfs
- Repro script (again, substitute YOUR_GCS_URI). For example, save the following in repro.py and run python repro.py. It can be run multiple times since it creates a new temporary experiment directory on each run.
import os
import tempfile
import tiledbsoma.io
_TEST_H5AD_DIR = 'YOUR_GCS_URI'
_N_FILES = 10
_MEASUREMENT_NAME = 'RNA'
_OBS_ID_NAME = 'BARCODE_SUB_LIB_ID'
_VAR_ID_NAME = 'gene_name'
# Build the list of h5ad files
h5ad_files = [
    os.path.join(_TEST_H5AD_DIR, f'h5ad_{i}.h5ad')
    for i in range(1, _N_FILES + 1)
]
_EXPERIMENT_URI = tempfile.mkdtemp()
print('Experiment URI:', _EXPERIMENT_URI)
# Create context with high concurrency.
# Note: Setting the concurrency level to 1 avoids the race condition.
context = tiledbsoma.SOMATileDBContext(
    tiledb_config={'soma.compute_concurrency_level': 20})
# Create the experiment with the schema from the first h5ad file.
tiledbsoma.io.from_h5ad(_EXPERIMENT_URI,
                        h5ad_files[0],
                        measurement_name=_MEASUREMENT_NAME,
                        obs_id_name=_OBS_ID_NAME,
                        var_id_name=_VAR_ID_NAME,
                        ingest_mode='schema_only',
                        X_kind=tiledbsoma.SparseNDArray,
                        context=context)
# Register all h5ad files; this is the call that fails intermittently.
registration_mapping = tiledbsoma.io.register_h5ads(
    _EXPERIMENT_URI,
    h5ad_files,
    measurement_name=_MEASUREMENT_NAME,
    obs_field_name=_OBS_ID_NAME,
    var_field_name=_VAR_ID_NAME,
    context=context,
)
Versions (please complete the following information):
- TileDB-SOMA version: 1.17.1
- TileDB core version (libtiledbsoma): 2.28.1
- Language and language version (e.g. Python 3.9, R 4.3.2): Python 3.12.11
- OS (e.g. MacOS, Ubuntu Linux): Linux 5.15.0-1083-gcp