Skip to content

[python] Address memory leak in domain accessor #4067

Closed
johnkerl wants to merge 1 commit into main from kerl/seal-gcd

@johnkerl
Contributor

@johnkerl johnkerl commented May 19, 2025

Issue and/or context: https://linear.app/tiledb/issue/SOMA-169/c-memory-leak-in-somaarray-get-core-domainish

Changes:

Notes for Reviewer:

I ran this reproduction script from @bkmartinjr on SOMA-169:

cat 169.py

import tiledbsoma as soma


def main():
    uri = "data/soma-experiment-versions-2025-04-04/1.15.7/pbmc3k_processed/ms/RNA/X/data/"
    with soma.SparseNDArray.open(uri) as arr:
        for i in range(100):
            d = arr._handle._handle.domain()


main()

valgrind --error-limit=no --trace-children=yes -s --leak-check=full $(which python) 169.py
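
As a quicker complement to the valgrind run above, a repeated-call loop can be watched for growth in peak resident memory. This is only a rough sketch under stated assumptions: `peak_rss_growth` is a hypothetical helper, not part of tiledbsoma; it relies on the standard-library `resource` module (Unix only); and small per-call leaks only become visible after many iterations. It does not replace valgrind's per-allocation report.

```python
import resource


def peak_rss_growth(fn, iterations=100, warmup=10):
    """Call fn repeatedly; return the growth in peak RSS in ru_maxrss units.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    # Warm up first so one-time allocations (caches, imports) are not counted.
    for _ in range(warmup):
        fn()
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        fn()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


if __name__ == "__main__":
    # Stand-in workload. In the script above, the callable would wrap
    # arr._handle._handle.domain() inside the open() context.
    growth = peak_rss_growth(lambda: None)
    print(f"peak RSS growth: {growth}")
```

A growth figure that keeps rising as `iterations` is increased is a hint worth confirming with valgrind; a flat figure does not prove the absence of a leak.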

@codecov

codecov Bot commented May 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.26%. Comparing base (f29839d) to head (7d328fe).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4067      +/-   ##
==========================================
+ Coverage   88.97%   89.26%   +0.29%     
==========================================
  Files          59       59              
  Lines        7073     7073              
==========================================
+ Hits         6293     6314      +21     
+ Misses        780      759      -21     
Flag Coverage Δ
python 89.26% <ø> (+0.29%) ⬆️

Components Coverage Δ
python_api 89.26% <ø> (+0.29%) ⬆️
libtiledbsoma ∅ <ø> (∅)

@johnkerl johnkerl changed the title [python] Fix memory leak in domain accessor [WIP] [python] Fix memory leak in domain accessor May 19, 2025
@johnkerl johnkerl marked this pull request as ready for review May 19, 2025 20:47
@johnkerl johnkerl changed the title [python] Fix memory leak in domain accessor [python] Address memory leak in domain accessor May 19, 2025
@bkmartinjr
Member

bkmartinjr commented May 19, 2025

To validate the fix I ran:

valgrind --max-threads=4096 --leak-check=full pytest -s -x -v apis/python/tests/test_shape.py

I believe all of the leaks are coming from two tests: test_canned_experiments and test_canned_nonstandard_dataframe_upgrade.

The run shows that there are additional leaks in the _get_core_domainish code path:

==756726== 9 bytes in 3 blocks are definitely lost in loss record 744 of 7,633
==756726==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==756726==    by 0x4A0E58E: strdup (strdup.c:42)
==756726==    by 0x873DDB4D: tiledbsoma::ArrowAdapter::make_arrow_schema_parent(unsigned long, std::basic_string_view<char, std::char_traits<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8731B0C0: tiledbsoma::SOMAArray::_get_core_domainish(Domainish) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8732A218: tiledbsoma::SOMAArray::_set_soma_joinid_shape_helper(long, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x87009BB8: pybind11::cpp_function::initialize<libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}, void, tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg>(libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}&&, void (*)(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x86D05603: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x323D37: cfunction_call (methodobject.c:537)
==756726==    by 0x30726B: _PyObject_MakeTpCall (call.c:240)
==756726==    by 0x21AB4E: _PyEval_EvalFrameDefault.cold (bytecodes.c:2715)
==756726==    by 0x309CF1: _PyObject_FastCallDictTstate (call.c:144)
==756726==    by 0x3327A8: _PyObject_Call_Prepend (call.c:508)

==756726== 28 bytes in 4 blocks are definitely lost in loss record 1,942 of 7,633
==756726==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==756726==    by 0x4A0E58E: strdup (strdup.c:42)
==756726==    by 0x873DDB5C: tiledbsoma::ArrowAdapter::make_arrow_schema_parent(unsigned long, std::basic_string_view<char, std::char_traits<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8731B0C0: tiledbsoma::SOMAArray::_get_core_domainish(Domainish) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8732A218: tiledbsoma::SOMAArray::_set_soma_joinid_shape_helper(long, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x87009BB8: pybind11::cpp_function::initialize<libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}, void, tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg>(libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}&&, void (*)(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x86D05603: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x323D37: cfunction_call (methodobject.c:537)
==756726==    by 0x30726B: _PyObject_MakeTpCall (call.c:240)
==756726==    by 0x21AB4E: _PyEval_EvalFrameDefault.cold (bytecodes.c:2715)
==756726==    by 0x309CF1: _PyObject_FastCallDictTstate (call.c:144)
==756726==    by 0x3327A8: _PyObject_Call_Prepend (call.c:508)

==756726== 555 (48 direct, 507 indirect) bytes in 4 blocks are definitely lost in loss record 6,650 of 7,633
==756726==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==756726==    by 0x873DDB85: tiledbsoma::ArrowAdapter::make_arrow_schema_parent(unsigned long, std::basic_string_view<char, std::char_traits<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8731B0C0: tiledbsoma::SOMAArray::_get_core_domainish(Domainish) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8732A218: tiledbsoma::SOMAArray::_set_soma_joinid_shape_helper(long, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x87009BB8: pybind11::cpp_function::initialize<libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}, void, tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg>(libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}&&, void (*)(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x86D05603: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x323D37: cfunction_call (methodobject.c:537)
==756726==    by 0x30726B: _PyObject_MakeTpCall (call.c:240)
==756726==    by 0x21AB4E: _PyEval_EvalFrameDefault.cold (bytecodes.c:2715)
==756726==    by 0x309CF1: _PyObject_FastCallDictTstate (call.c:144)
==756726==    by 0x3327A8: _PyObject_Call_Prepend (call.c:508)
==756726==    by 0x3FF2FA: slot_tp_call (typeobject.c:8791)
==756726== 

==756726== 728 (48 direct, 680 indirect) bytes in 4 blocks are definitely lost in loss record 6,752 of 7,633
==756726==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==756726==    by 0x873DE03A: tiledbsoma::ArrowAdapter::make_arrow_array_parent(unsigned long) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8731B0CD: tiledbsoma::SOMAArray::_get_core_domainish(Domainish) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x8732A218: tiledbsoma::SOMAArray::_set_soma_joinid_shape_helper(long, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726==    by 0x87009BB8: pybind11::cpp_function::initialize<libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}, void, tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg>(libtiledbsomacpp::load_soma_dataframe(pybind11::module_&)::{lambda(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)#6}&&, void (*)(tiledbsoma::SOMADataFrame&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x86D05603: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/bruce/projects/TileDB-SOMA/apis/python/src/tiledbsoma/pytiledbsoma.cpython-312-x86_64-linux-gnu.so)
==756726==    by 0x323D37: cfunction_call (methodobject.c:537)
==756726==    by 0x30726B: _PyObject_MakeTpCall (call.c:240)
==756726==    by 0x21AB4E: _PyEval_EvalFrameDefault.cold (bytecodes.c:2715)
==756726==    by 0x309CF1: _PyObject_FastCallDictTstate (call.c:144)
==756726==    by 0x3327A8: _PyObject_Call_Prepend (call.c:508)
==756726==    by 0x3FF2FA: slot_tp_call (typeobject.c:8791)
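
The loss records above can be tallied per leaking function with a small parser. This is a minimal sketch, not a tool from the PR: `summarize`, `HEADER`, `FRAME`, and the `needle` parameter are illustrative names, the regular expressions cover only the "definitely lost" header shapes seen in this output, and `SAMPLE` is abbreviated from the first and last records above. Each record's bytes are attributed to the first stack frame containing `tiledbsoma::`.

```python
import re
from collections import defaultdict

# Loss-record headers, with or without the "(N direct, M indirect)" part.
HEADER = re.compile(
    r"^==\d+==\s+([\d,]+)(?: \([\d,]+ direct, [\d,]+ indirect\))? bytes in ([\d,]+) blocks "
    r"are definitely lost"
)
# Stack frame lines: "at 0x...: symbol" or "by 0x...: symbol".
FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (.+)$")


def summarize(log, needle="tiledbsoma::"):
    """Sum 'definitely lost' bytes by the first frame whose symbol contains needle."""
    totals = defaultdict(int)
    pending = None  # bytes of the current record, until attributed to a frame
    for line in log.splitlines():
        header = HEADER.match(line)
        if header:
            pending = int(header.group(1).replace(",", ""))
            continue
        frame = FRAME.match(line)
        if frame and pending is not None and needle in frame.group(1):
            symbol = frame.group(1).split("(")[0]  # drop the argument list
            totals[symbol] += pending
            pending = None
    return dict(totals)


# Abbreviated from the valgrind output above (first and last records only).
SAMPLE = """\
==756726== 9 bytes in 3 blocks are definitely lost in loss record 744 of 7,633
==756726==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==756726==    by 0x4A0E58E: strdup (strdup.c:42)
==756726==    by 0x873DDB4D: tiledbsoma::ArrowAdapter::make_arrow_schema_parent(unsigned long, std::basic_string_view<char, std::char_traits<char> >) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
==756726== 728 (48 direct, 680 indirect) bytes in 4 blocks are definitely lost in loss record 6,752 of 7,633
==756726==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==756726==    by 0x873DE03A: tiledbsoma::ArrowAdapter::make_arrow_array_parent(unsigned long) (in /home/bruce/projects/TileDB-SOMA/dist/lib/libtiledbsoma.so)
"""

for symbol, lost in summarize(SAMPLE).items():
    print(f"{lost:>6} bytes  {symbol}")
```

Applied to all four records above, this attribution gives 592 bytes (9 + 28 + 555) to make_arrow_schema_parent and 728 bytes to make_arrow_array_parent, both reached through _get_core_domainish.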

@johnkerl
Contributor Author

Closing in favor of #4066

@johnkerl johnkerl closed this May 20, 2025
@johnkerl johnkerl deleted the kerl/seal-gcd branch June 30, 2025 17:07