Skip to content

[r] Support SparseNDArray writes case#2755

Merged
eddelbuettel merged 4 commits intomainfrom
de/sc-44886/sparse_nd_write
Jul 16, 2024
Merged

[r] Support SparseNDArray writes case#2755
eddelbuettel merged 4 commits intomainfrom
de/sc-44886/sparse_nd_write

Conversation

@eddelbuettel
Copy link
Copy Markdown
Contributor

Issue and/or context:

Following PRs #2704 (data frames) and #2730 (dense arrays) this PR extends to C++-ification to sparse n-dimensional arrays.

Changes:

Refactored code to use C++ classes to write.

Notes for Reviewer:

SC 44886

@eddelbuettel eddelbuettel force-pushed the de/sc-44886/sparse_nd_write branch from e71c07d to e3bd800 Compare July 10, 2024 14:25
@eddelbuettel eddelbuettel force-pushed the de/sc-44886/sparse_nd_write branch from e3bd800 to 6cc17be Compare July 15, 2024 16:56
@eddelbuettel
Copy link
Copy Markdown
Contributor Author

I think this is now ready for review. Locally I get [ FAIL 0 | WARN 0 | SKIP 0 | PASS 3475 ].

There was another change here that we were able to revert so also reverting
this one remaining changed line
@johnkerl johnkerl changed the title [r] Support sparse nd array writes case [r] Support SparseNDArray writes case Jul 15, 2024
Comment on lines +229 to +231
## the 'soma_data' data type may not have been cached, and if so we need to fetch it
if (is.null(private$.type)) {
## TODO: replace with a libtiledbsoma accessor as discussed
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you able to use SOMAArray::arrow_schema or SOMASparseArrayNDArray::schema to get the dtype of soma_data?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not clear where the value would come from. This function is called with just the values to be written, and this segment is where, if no 'cached' type info exists, we go to the schema on disk as a last resort.

## the 'soma_data' data type may not have been cached, and if so we need to fetch it
if (is.null(private$.type)) {
## TODO: replace with a libtiledbsoma accessor as discussed
tpstr <- tiledb::datatype(tiledb::attrs(tiledb::schema(self$uri))[["soma_data"]])
arstr <- arrow_type_from_tiledb_type(tpstr)
private$.type <- arstr
}

In some of the use case we do not need that as private$.type has been set.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it is similar for thw write() function from which this is called. It gets called with just values (and an optional bbox). No schema info.

@nguyenv nguyenv self-requested a review July 16, 2024 12:06
Copy link
Copy Markdown
Contributor

@nguyenv nguyenv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed over DM, I will add a dtype getter for the soma_data into libtiledbsoma.

@eddelbuettel eddelbuettel merged commit 18df8bb into main Jul 16, 2024
@eddelbuettel eddelbuettel deleted the de/sc-44886/sparse_nd_write branch July 16, 2024 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants