[python] add axis delete to Experiment#4215
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4215 +/- ##
===========================================
+ Coverage 69.46% 89.71% +20.25%
===========================================
Files 169 61 -108
Lines 18972 7284 -11688
Branches 1233 0 -1233
===========================================
- Hits 13179 6535 -6644
+ Misses 5359 749 -4610
+ Partials 434 0 -434
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
I keep forgetting that we need to explicitly add new Python functions to the Sphinx docs (see |
| if isinstance(obj, (DataFrame, SparseNDArray, GeometryDataFrame, PointCloudDataFrame)) and all( | ||
| obj.schema.get_field_index(col) >= 0 for col in join_on | ||
| ): |
There was a problem hiding this comment.
The support for deletes in GeometryDataFrame isn't implemented yet.
There was a problem hiding this comment.
True, but I didn't want to duplicate the error checking already implemented in delete_cells.
That said, we can remove this easily and ignore the spatial arrays, at the cost of potential user confusion. My sense is we need to raise a proactive error if the Experiment contains any unsupported sub-array.
What do you recommend?
There was a problem hiding this comment.
I think we raise a NotImplementedError here if there is a GeometryDataFrame. I think it's unlikely anyone is using the GeometryDataFrame in the wild right now, but I don't want to corrupt an Experiment if they are.
jp-dark
left a comment
There was a problem hiding this comment.
I have a couple small comments, but overall this looks good to me.
| # var_id, n_cells | ||
| # len 1837 | ||
|
|
|
|
||
|
|
||
| @pytest.mark.spatialdata | ||
| def test_experiment_obs_axis_delete_spatial(soma_spatial_experiment, soma_tiledb_context) -> None: # noqa: F811 |
There was a problem hiding this comment.
Missing a similar test for var_axis_delete
aaronwolen
left a comment
There was a problem hiding this comment.
Looks great and the docstrings are all very clear/informative. Most of my suggestions are formatting-related.
| This operation affects all Measurements within an Experiment. Deleted observations are specified: | ||
| - by `obs` DataFrame soma_joinid: a sequence of soma_joinid or a slice | ||
| - or, by `obs` DataFrame value filter / query condition |
There was a problem hiding this comment.
| This operation affects all Measurements within an Experiment. Deleted observations are specified: | |
| - by `obs` DataFrame soma_joinid: a sequence of soma_joinid or a slice | |
| - or, by `obs` DataFrame value filter / query condition | |
| This operation affects all Measurements within an Experiment. Deleted observations are specified: | |
| - by ``obs`` DataFrame soma_joinid: a sequence of soma_joinid or a slice | |
| - or, by ``obs`` DataFrame value filter / query condition |
Inserted space so the list will render properly and fixed inline code formatting.
| Sub-objects affected by this operation: | ||
| - exp.obs - delete matching DataFrame elements by soma_joinid |
There was a problem hiding this comment.
| Sub-objects affected by this operation: | |
| - exp.obs - delete matching DataFrame elements by soma_joinid | |
| Sub-objects affected by this operation: | |
| - exp.obs - delete matching DataFrame elements by soma_joinid |
| The following sub-objects are unaffected by this operation:exp.ms[*].var , exp.ms[*].varm[*], exp.ms[*].varp[*] | ||
| and any user-defined arrays within the collection. |
There was a problem hiding this comment.
| The following sub-objects are unaffected by this operation:exp.ms[*].var , exp.ms[*].varm[*], exp.ms[*].varp[*] | |
| and any user-defined arrays within the collection. | |
| The following sub-objects are unaffected by this operation: | |
| - exp.ms[*].var , exp.ms[*].varm[*], exp.ms[*].varp[*] | |
| - any user-defined arrays within the collection |
| If any affected sub-objects are of type DenseNDArray (dense TileDB array), the entire operation results in an error. | ||
|
|
||
| Either ``coords`` or ``value_filter`` must be provided. When both ``coords`` and ``value_filter`` are provided, | ||
| the cells that match both constraints will be removed. |
There was a problem hiding this comment.
| the cells that match both constraints will be removed. | |
| the observations that match both constraints will be removed. |
| Deleted features are specified: | ||
| - by `var` DataFrame joinid: a sequence of joinid or a slice | ||
| - by `var` DataFrame value filter / query condition |
There was a problem hiding this comment.
| Deleted features are specified: | |
| - by `var` DataFrame joinid: a sequence of joinid or a slice | |
| - by `var` DataFrame value filter / query condition | |
| Deleted features are specified: | |
| - by ``var`` DataFrame joinid: a sequence of joinid or a slice | |
| - by ``var`` DataFrame value filter / query condition |
| Sub-objects affected by this operation, given a named Measurement (ms_name): | ||
| - exp.ms[*ms_name*].var - delete matching DataFrame elements |
There was a problem hiding this comment.
| Sub-objects affected by this operation, given a named Measurement (ms_name): | |
| - exp.ms[*ms_name*].var - delete matching DataFrame elements | |
| Sub-objects affected by this operation, given a named Measurement (ms_name): | |
| - exp.ms[*ms_name*].var - delete matching DataFrame elements |
| - exp.ms[*ms_name*].varp[*] - delete matching SparseNDMatrix values, joining on the soma_dim_0 and soma_dim_1 dimensions | ||
| - exp.spatial[*].varl[*ms_name*][*] - delete matching PointCloudDataFrame/GeometryDataFrame elements, joining on the soma_joinid dimension | ||
|
|
||
| The following Measurement sub-objects are unaffected by this operation: exp.ms[*ms_name*].obsm[*] , exp.ms[*ms_name*].obsp[*] , |
There was a problem hiding this comment.
| The following Measurement sub-objects are unaffected by this operation: exp.ms[*ms_name*].obsm[*] , exp.ms[*ms_name*].obsp[*] , | |
| The following Measurement sub-objects are unaffected by this operation: | |
| - exp.ms[*ms_name*].obsm[*] and exp.ms[*ms_name*].obsp[*] |
| If any affected sub-objects are a DenseNDArray (dense TileDB array), the entire operation results in an error. | ||
|
|
||
| Either ``coords`` or ``value_filter`` must be provided. When both ``coords`` and ``value_filter`` are provided, | ||
| the cells that match both constraints will be removed. |
There was a problem hiding this comment.
| the cells that match both constraints will be removed. | |
| the variables that match both constraints will be removed. |
| >>> with tiledbsoma.Experiment.open(experiment_uri, mode="d") as exp: | ||
| ... exp.var_axis_delete(value_filter="feature_biotype == 'gene'") | ||
|
|
||
| Note: Deleting cells does not change the size of the current domain or possible enumeration values. |
There was a problem hiding this comment.
| Note: Deleting cells does not change the size of the current domain or possible enumeration values. | |
| Note: Deleting variables does not change the size of the current domain or possible enumeration values. |
| """Append the candidate array to the final selection list if: | ||
| - not a dense array |
There was a problem hiding this comment.
| """Append the candidate array to the final selection list if: | |
| - not a dense array | |
| """Append the candidate array to the final selection list if: | |
| - not a dense array |
|
I believe I have addressed all of the issues:
|
Add
Experiment.var_axis_deleteandExperiment.obs_axis_deleteand their unit tests.