[c++/python/r] Move SOMADataFrame creation logic to C++#4421

Merged
XanthosXanthopoulos merged 19 commits into main from xan/SOMA-864
Mar 2, 2026

@XanthosXanthopoulos
Collaborator

@XanthosXanthopoulos XanthosXanthopoulos commented Feb 25, 2026

Issue and/or context: SOMA-864

Changes:
This PR moves the DataFrame creation logic down to the common C++ layer and simplifies the R and Python implementations. Specifically:

  • Schema validation is now shared between R and Python and implemented in C++.
  • Max domains are now the same in R and Python. Because of restrictions imposed by R, the max domains for int32, int64 and uint64 columns are now narrower.
  • The Python and R implementations are now each responsible for casting a column's domain to the correct type before passing it to C++.
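
As a rough illustration of the last point, a range-checked cast of a domain to the column's integer type could look like the sketch below. It is written in C++ only to keep the examples in one language. `fits_in` and `cast_domain` are hypothetical names, not part of this PR or of the TileDB-SOMA API, and the sketch assumes the incoming bounds arrive as `int64_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if `value` is representable in integer type T.
template <typename T>
bool fits_in(std::int64_t value) {
    static_assert(std::is_integral_v<T>, "T must be an integer type");
    if constexpr (std::is_signed_v<T>) {
        return value >= static_cast<std::int64_t>(std::numeric_limits<T>::min()) &&
               value <= static_cast<std::int64_t>(std::numeric_limits<T>::max());
    } else {
        // Unsigned targets: reject negatives, then compare in unsigned space.
        return value >= 0 &&
               static_cast<std::uint64_t>(value) <=
                   static_cast<std::uint64_t>(std::numeric_limits<T>::max());
    }
}

// Hypothetical helper: cast a (lower, upper) domain to the column type T,
// throwing instead of silently truncating out-of-range bounds.
template <typename T>
std::pair<T, T> cast_domain(std::int64_t lower, std::int64_t upper) {
    if (lower > upper) {
        throw std::invalid_argument("domain lower bound exceeds upper bound");
    }
    if (!fits_in<T>(lower) || !fits_in<T>(upper)) {
        throw std::out_of_range("domain does not fit the column type");
    }
    return {static_cast<T>(lower), static_cast<T>(upper)};
}
```

With this sketch, `cast_domain<std::int32_t>(0, 100)` yields the pair `(0, 100)`, while an upper bound of `3000000000` throws `std::out_of_range` rather than wrapping around.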

Notes for Reviewer:

@codecov

codecov Bot commented Feb 25, 2026

Codecov Report

❌ Patch coverage is 91.40625% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.60%. Comparing base (89576e9) to head (a762427).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4421      +/-   ##
==========================================
- Coverage   86.02%   84.60%   -1.42%     
==========================================
  Files         140      146       +6     
  Lines       21213    21502     +289     
  Branches       13       13              
==========================================
- Hits        18248    18192      -56     
- Misses       2965     3310     +345     
Flag      Coverage Δ
python    88.39% <100.00%> (-0.27%) ⬇️
r         82.68% <90.09%> (-1.97%) ⬇️

Flags with carried forward coverage won't be shown.

Components      Coverage Δ
python_api      88.39% <100.00%> (-0.27%) ⬇️
libtiledbsoma   65.93% <ø> (-2.73%) ⬇️

@XanthosXanthopoulos XanthosXanthopoulos changed the title [c++/python/r][WIP] Move SOMADataFrame creation logic to C++ [c++/python/r] Move SOMADataFrame creation logic to C++ Feb 26, 2026
@XanthosXanthopoulos XanthosXanthopoulos marked this pull request as ready for review February 26, 2026 15:53
@jp-dark jp-dark requested a review from aaronwolen February 26, 2026 19:54
Collaborator

@jp-dark jp-dark left a comment

This PR is a huge improvement, but I do have a couple of change requests.

Change requests:

  • Unless there is a good reason to use std::any over std::variant, I think we should switch the domain types to std::variant.
  • A couple of nits about the error messages.
  • TBD: I still need to think through the right way to handle the inconsistencies between R and Python.
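
To make the std::any versus std::variant trade-off concrete, here is a minimal sketch. The `Range` and `Domain` aliases and the set of alternatives are assumptions chosen for illustration, not the types used in this PR. With `std::variant` the set of domain types is closed and `std::visit` must handle every alternative at compile time; with `std::any` the reader has to guess the stored type and only finds out at run time.

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical aliases: a domain is a (lower, upper) pair of one value type.
template <typename T>
using Range = std::pair<T, T>;

// Assumed closed set of supported domain types (illustrative only).
using Domain = std::variant<Range<std::int32_t>,
                            Range<std::int64_t>,
                            Range<double>,
                            Range<std::string>>;

// std::visit instantiates the lambda for every alternative, so adding a new
// alternative that the lambda cannot handle is a compile-time error.
std::string describe(const Domain& domain) {
    return std::visit(
        [](const auto& range) -> std::string {
            using Value = std::decay_t<decltype(range.first)>;
            if constexpr (std::is_same_v<Value, std::string>) {
                return "[" + range.first + ", " + range.second + "]";
            } else {
                return "[" + std::to_string(range.first) + ", " +
                       std::to_string(range.second) + "]";
            }
        },
        domain);
}

// With std::any the stored type is only discoverable by probing with
// std::any_cast; a mismatch yields nullptr here (or std::bad_any_cast when
// casting by value or reference).
bool holds_int64_range(const std::any& domain) {
    return std::any_cast<Range<std::int64_t>>(&domain) != nullptr;
}
```

In this sketch, `describe(Domain{Range<std::int64_t>{0, 10}})` returns `"[0, 10]"`, whereas a `std::any` holding a `Range<std::int32_t>` quietly fails the `holds_int64_range` probe.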

Comment thread libtiledbsoma/src/utils/util.cc Outdated
Comment thread apis/python/HISTORY.md
Comment thread libtiledbsoma/src/soma/soma_dataframe.cc Outdated
Comment thread libtiledbsoma/src/tiledb_adapter/platform_config.cc Outdated
Comment thread libtiledbsoma/src/tiledb_adapter/platform_config.cc Outdated
@XanthosXanthopoulos XanthosXanthopoulos merged commit 92fd7a7 into main Mar 2, 2026
60 of 61 checks passed
@XanthosXanthopoulos XanthosXanthopoulos deleted the xan/SOMA-864 branch March 2, 2026 21:26
2 participants