[Bug] [python] Possible race condition during concurrent H5AD registration from GCS blob storage #4219

@pumpikano

Describe the bug

There appears to be a race condition in tiledbsoma.io.register_h5ads when it is called with multiple H5AD blob store URIs and soma.compute_concurrency_level is greater than 1. I diagnosed this as a race condition because 1) I've only seen it happen while registering multiple H5AD files with soma.compute_concurrency_level greater than 1 and 2) it is non-deterministic under those conditions and sometimes succeeds. I haven't inspected the library internals closely enough to confirm this.

The bug manifests as a FileNotFoundError followed by an UnboundLocalError raised during error handling. The cause of the UnboundLocalError is fairly clear: the finally block in read_h5ad (tiledbsoma/io/_util.py, line 59 in the traceback below) assumes the variable anndata is defined, which is not the case when the initial FileNotFoundError is raised before the assignment completes. I haven't looked into the root cause of the initial FileNotFoundError. Note, though, that the GCS URI scheme is malformed in the error message: it is gs:/ instead of gs://, so I suspect path handling on some codepath triggered by concurrent registration.
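
The masking effect described above can be reproduced without tiledbsoma. The following is a minimal sketch, not the library's code: `_FakeAnnData`, `_open`, `read_unguarded`, `read_guarded`, and `outcome` are hypothetical names invented for illustration. It shows how a finally block that touches a name assigned inside the try block replaces the original error with an UnboundLocalError, and how binding the name to None first lets the original error through.

```python
from contextlib import contextmanager


class _FakeAnnData:
    """Hypothetical stand-in for an AnnData object opened in backed mode."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def _open(path, fail):
    # Hypothetical opener: raises the way h5py does when the file is missing.
    if fail:
        raise FileNotFoundError(path)
    return _FakeAnnData()


@contextmanager
def read_unguarded(path, fail=False):
    try:
        adata = _open(path, fail)
        yield adata
    finally:
        # If _open raised, `adata` was never bound, so this line raises
        # UnboundLocalError and hides the original FileNotFoundError.
        adata.close()


@contextmanager
def read_guarded(path, fail=False):
    adata = None  # bind the name before entering the try block
    try:
        adata = _open(path, fail)
        yield adata
    finally:
        if adata is not None:
            adata.close()


def outcome(reader):
    """Return the name of the exception type raised when the open fails."""
    try:
        with reader("gs://bucket/missing.h5ad", fail=True):
            pass
    except Exception as exc:
        return type(exc).__name__
    return None


print(outcome(read_unguarded))  # UnboundLocalError
print(outcome(read_guarded))    # FileNotFoundError
```

With the guarded variant the caller sees only the FileNotFoundError, which is the more useful error for diagnosing the underlying problem.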

$ python repro.py
Experiment URI: /tmp/tmp0gi59nhd
Traceback (most recent call last):
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/_util.py", line 55, in read_h5ad
    anndata = ad.read_h5ad(_FSPathWrapper(input_handle, input_path), mode)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_io/h5ad.py", line 246, in read_h5ad
    return read_h5ad_backed(filename, mode)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_io/h5ad.py", line 193, in read_h5ad_backed
    adata = AnnData(**d)
            ^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/legacy_api_wrap/__init__.py", line 82, in fn_compatible
    return fn(*args_all, **kw)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/anndata.py", line 249, in __init__
    self._init_as_actual(
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/anndata.py", line 362, in _init_as_actual
    self.file = AnnDataFileManager(self, filename, filemode)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/file_backing.py", line 38, in __init__
    self.open()
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/anndata/_core/file_backing.py", line 100, in open
    self._file = h5py.File(self.filename, self._filemode)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/h5py/_hl/files.py", line 564, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/h5py/_hl/files.py", line 238, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 56, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 57, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = 'gs:/rarebase-data-ttl-14d/clayton/test_h5ads/h5ad_6.h5ad', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/claytonmellina/code/tiledb_race_bug/repro.py", line 38, in <module>
    registration_mapping = tiledbsoma.io.register_h5ads(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/ingest.py", line 228, in register_h5ads
    axes_metadata = list(
                    ^^^^^
  File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/_registration/ambient_label_mappings.py", line 344, in _load_axes_metadata_from_h5ads
    with read_h5ad(p, mode="r") as adata:
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/claytonmellina/.pyenv/versions/3.12.11/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/claytonmellina/code/tiledb_race_bug/venv/lib/python3.12/site-packages/tiledbsoma/io/_util.py", line 59, in read_h5ad
    if anndata.file:
       ^^^^^^^
UnboundLocalError: cannot access local variable 'anndata' where it is not associated with a value
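
The gs:/ prefix in the FileNotFoundError message above is what a gs:// URI looks like after it has been treated as a POSIX filesystem path. The sketch below only illustrates that general Python behavior with a hypothetical URI; it is not a confirmed diagnosis of which call in the failing codepath, if any, performs such a normalization.

```python
import posixpath
from pathlib import PurePosixPath

uri = "gs://bucket/test_h5ads/h5ad_6.h5ad"  # hypothetical URI for illustration

# pathlib parses the string as a relative POSIX path and drops the empty
# component between the two slashes after "gs:".
as_path = str(PurePosixPath(uri))

# posixpath.normpath collapses repeated separators in the same way.
normalized = posixpath.normpath(uri)

print(as_path)     # gs:/bucket/test_h5ads/h5ad_6.h5ad
print(normalized)  # gs:/bucket/test_h5ads/h5ad_6.h5ad
```

If something like this is happening, the string that reaches h5py is no longer a URI at all, and h5py would look for a local file with that name.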

To Reproduce

I used GCS to reproduce this, so the repro assumes 1) installation of gsutil and 2) access to a GCS bucket to create test data.

  1. Create test data — substitute YOUR_GCS_URI with an appropriate gs://... URI
# Download the base h5ad file used to test tiledb race condition.
SOURCE_FILE="/tmp/base.h5ad"
gsutil cp gs://arc-ctc-tahoe100/2025-02-25/tutorial/plate3_2k-obs.h5ad "$SOURCE_FILE"


DEST_DIR="/tmp/test_h5ads"
N_FILES=10

# Create the output h5ad directory.
mkdir -p "$DEST_DIR"
for i in $(seq 1 $N_FILES)
do
  # Construct the new filename and copy the file
  cp "$SOURCE_FILE" "${DEST_DIR}/h5ad_${i}.h5ad"
done

# Upload the h5ad files to GCS.
gsutil cp -r "$DEST_DIR" "YOUR_GCS_URI"
  2. Set up environment
python -m venv venv
source venv/bin/activate
pip install tiledbsoma==1.17.1 gcsfs
  3. Repro script: again, substitute YOUR_GCS_URI, this time with the GCS directory that contains the uploaded h5ad_*.h5ad files

For example, save the following in repro.py and run python repro.py. It can be run multiple times because each run creates a new temporary experiment directory.

import os
import tempfile

import tiledbsoma.io

_TEST_H5AD_DIR = 'YOUR_GCS_URI'
_N_FILES = 10

_MEASUREMENT_NAME = 'RNA'
_OBS_ID_NAME = 'BARCODE_SUB_LIB_ID'
_VAR_ID_NAME = 'gene_name'


# Build the list of h5ad files
h5ad_files = [
    os.path.join(_TEST_H5AD_DIR, f'h5ad_{i}.h5ad')
    for i in range(1, _N_FILES + 1)
]

_EXPERIMENT_URI = tempfile.mkdtemp()
print('Experiment URI:', _EXPERIMENT_URI)

# Create context with high concurrency.
# Note: Setting the concurrency level to 1 avoids the race condition.
context = tiledbsoma.SOMATileDBContext(
    tiledb_config={'soma.compute_concurrency_level': 20})

# Create the experiment with the schema from the first h5ad file.
tiledbsoma.io.from_h5ad(_EXPERIMENT_URI,
                        h5ad_files[0],
                        measurement_name=_MEASUREMENT_NAME,
                        obs_id_name=_OBS_ID_NAME,
                        var_id_name=_VAR_ID_NAME,
                        ingest_mode='schema_only',
                        X_kind=tiledbsoma.SparseNDArray,
                        context=context)

registration_mapping = tiledbsoma.io.register_h5ads(
    _EXPERIMENT_URI,
    h5ad_files,
    measurement_name=_MEASUREMENT_NAME,
    obs_field_name=_OBS_ID_NAME,
    var_field_name=_VAR_ID_NAME,
    context=context,
)
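
Because the failure is non-deterministic, a single run of the script above may succeed. One way to quantify how often it fails is to repeat the call and tally the exception types. The helper below is a hypothetical sketch (`count_failures` and `flaky` are not part of tiledbsoma) and uses a stand-in function in place of the real registration call so that it runs on its own.

```python
from collections import Counter
from itertools import count


def count_failures(fn, attempts=10):
    """Call fn() `attempts` times and tally the names of raised exceptions."""
    failures = Counter()
    for _ in range(attempts):
        try:
            fn()
        except Exception as exc:
            failures[type(exc).__name__] += 1
    return failures


# Stand-in for the real call: fails on every third invocation.
_calls = count(1)


def flaky():
    if next(_calls) % 3 == 0:
        raise FileNotFoundError("simulated")


result = count_failures(flaky, attempts=9)
print(dict(result))  # {'FileNotFoundError': 3}
```

In the real repro, `fn` would be a function that creates a fresh temporary experiment and calls register_h5ads. Running it once with soma.compute_concurrency_level set to 1 and once with 20 should show whether failures occur only in the concurrent case.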

Versions (please complete the following information):

  • TileDB-SOMA version: 1.17.1
  • TileDB core version (libtiledbsoma): 2.28.1
  • Language and language version (e.g. Python 3.9, R 4.3.2): Python 3.12.11
  • OS (e.g. MacOS, Ubuntu Linux): Linux 5.15.0-1083-gcp
