Skip to content

feat(python): add context manager support for safe resource cleanup#412

Merged
HenryNdubuaku merged 7 commits intocactus-compute:mainfrom
yogyam:feat/python-context-manager
Feb 26, 2026
Merged

feat(python): add context manager support for safe resource cleanup#412
HenryNdubuaku merged 7 commits intocactus-compute:mainfrom
yogyam:feat/python-context-manager

Conversation

@yogyam
Copy link
Copy Markdown
Contributor

@yogyam yogyam commented Feb 24, 2026

What

Based of issue #388
Added CactusModel and CactusIndex context manager classes to the Python bindings so cactus_destroy / cactus_index_destroy are called automatically — even when exceptions occur.

# Before — leaks if anything throws between init and destroy:
model = cactus_init("weights/model")
response = cactus_complete(model, messages)
cactus_destroy(model)

# After — cleanup guaranteed:
with CactusModel("weights/model") as model:
    response = model.complete(messages)

Both classes:

  • Guard against double-free (calling destroy() twice is safe)
  • Raise RuntimeError on use-after-destroy

FFI Audit

Reviewed cactus_ffi.h.

Three handle types allocate native memory:

  • cactus_model_t
  • cactus_index_t
  • cactus_stream_transcribe_t

cactus_model_t and cactus_index_t receive wrappers.

cactus_stream_transcribe_t is a sub-resource of a model (freed by cactus_stream_transcribe_stop), so no separate wrapper is needed.


Files Changed

  • python/src/cactus.py
    Added CactusModel and CactusIndex classes (appended; no existing code modified)

  • python/src/__init__.py
    Added lazy imports for new classes

  • python/example.py
    Updated to use the with pattern

  • python/tests/test_context_manager.py
    Added 8 unit tests (mocked FFI; no model weights required)


Tests

$ python -m unittest python/tests/test_context_manager.py -v
Ran 8 tests in 0.002s — OK

Tests verify:

  • destroy is called on normal exit
  • destroy is called on exception
  • Double-destroy is safe
  • Methods raise after destroy

Manual leaks --atExit testing requires a full build + model weights. Happy to add a demo script if needed.


Backward Compatibility

Fully backward-compatible.

All existing free functions remain unchanged. The new classes are purely additive.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds context manager support to the Python bindings for safe resource cleanup. The changes address issue #388 by introducing CactusModel and CactusIndex wrapper classes that implement the context manager protocol (__enter__/__exit__), ensuring that cactus_destroy and cactus_index_destroy are called automatically, even when exceptions occur.

Changes:

  • Added CactusModel and CactusIndex context manager classes with automatic cleanup and use-after-destroy protection
  • Updated __init__.py with lazy imports for the new classes
  • Modernized example.py to demonstrate the new context manager pattern
  • Added comprehensive unit tests with mocked FFI layer

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 11 comments.

File Description
python/src/cactus.py Added CactusModel and CactusIndex context manager classes with enter/exit support, double-free protection, and use-after-destroy guards
python/src/init.py Added lazy imports for CactusModel and CactusIndex classes
python/example.py Updated to demonstrate context manager usage pattern with both model types
python/tests/test_context_manager.py Added 8 unit tests covering normal exit, exception handling, double-destroy safety, and use-after-destroy error handling

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

"""

def __init__(self, model_path, corpus_dir=None, cache_index=False):
self._handle = cactus_init(model_path, corpus_dir, cache_index)
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The init method should check if cactus_init returns None (which indicates initialization failure) and raise an appropriate exception. According to the cactus_init docstring at line 207, it returns None on failure. Without this check, the context manager will have a None handle and operations will fail with less informative error messages. Consider checking the handle and calling cactus_get_last_error() to provide a meaningful error message.

Suggested change
self._handle = cactus_init(model_path, corpus_dir, cache_index)
self._handle = cactus_init(model_path, corpus_dir, cache_index)
if self._handle is None:
# Retrieve a descriptive error message from the underlying library, if available.
try:
error_msg = cactus_get_last_error()
except NameError:
error_msg = None
if error_msg:
raise RuntimeError(f"Failed to initialize CactusModel: {error_msg}")
raise RuntimeError("Failed to initialize CactusModel: cactus_init returned None")

Copilot uses AI. Check for mistakes.
"""

def __init__(self, index_dir, embedding_dim):
self._handle = cactus_index_init(index_dir, embedding_dim)
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The init method should check if cactus_index_init returns None (which indicates initialization failure) and raise an appropriate exception. According to the cactus_index_init docstring at line 654, it returns None on failure. Without this check, the context manager will have a None handle and operations will fail with less informative error messages. Consider checking the handle and raising a RuntimeError with details from cactus_get_last_error().

Suggested change
self._handle = cactus_index_init(index_dir, embedding_dim)
self._handle = cactus_index_init(index_dir, embedding_dim)
if self._handle is None:
# Initialization failed; surface a clear error with details from the C library
error_msg = cactus_get_last_error()
if not error_msg:
raise RuntimeError("Failed to initialize CactusIndex (no additional error information provided)")
raise RuntimeError(f"Failed to initialize CactusIndex: {error_msg}")

Copilot uses AI. Check for mistakes.
self._require_handle()
return cactus_rag_query(self._handle, query, top_k)

def stream_transcribe_start(self, options=None, language="en"):
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding docstrings to the wrapper methods (complete, transcribe, embed, etc.) that reference or summarize the documentation from their underlying functions. This would improve discoverability and make the API easier to use, especially since users may look up help on these methods directly rather than the underlying free functions.

Suggested change
def stream_transcribe_start(self, options=None, language="en"):
def stream_transcribe_start(self, options=None, language="en"):
"""
Start a streaming transcription session with the current model.
This is a convenience wrapper around :func:`cactus_stream_transcribe_start`.
Args:
options: Optional transcription options dict, passed directly to
``cactus_stream_transcribe_start``.
language (str): Language code (e.g. "en") for the audio being transcribed.
Returns:
The handle or session identifier returned by ``cactus_stream_transcribe_start``.
"""

Copilot uses AI. Check for mistakes.
Comment on lines +903 to +915
def add(self, ids, documents, embeddings, metadatas=None):
self._require_handle()
return cactus_index_add(self._handle, ids, documents, embeddings, metadatas)

def delete(self, ids):
self._require_handle()
return cactus_index_delete(self._handle, ids)

def query(self, embedding, top_k=5, options=None):
self._require_handle()
return cactus_index_query(self._handle, embedding, top_k, options)

def compact(self):
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding docstrings to the wrapper methods (add, delete, query, compact) that reference or summarize the documentation from their underlying functions. This would improve discoverability and make the API easier to use, especially since users may look up help on these methods directly rather than the underlying free functions.

Suggested change
def add(self, ids, documents, embeddings, metadatas=None):
self._require_handle()
return cactus_index_add(self._handle, ids, documents, embeddings, metadatas)
def delete(self, ids):
self._require_handle()
return cactus_index_delete(self._handle, ids)
def query(self, embedding, top_k=5, options=None):
self._require_handle()
return cactus_index_query(self._handle, embedding, top_k, options)
def compact(self):
def add(self, ids, documents, embeddings, metadatas=None):
"""Add vectors and associated data to the index.
This is a convenience wrapper around :func:`cactus_index_add`.
Args:
ids: Iterable of application-defined IDs for each vector.
documents: Iterable of raw text (or other payload) for each vector.
embeddings: Iterable or array-like of embedding vectors.
metadatas: Optional iterable of metadata objects aligned with ``ids``.
Returns:
The result returned by :func:`cactus_index_add`, typically a status
or summary of the added entries.
"""
self._require_handle()
return cactus_index_add(self._handle, ids, documents, embeddings, metadatas)
def delete(self, ids):
"""Delete vectors from the index by ID.
This is a convenience wrapper around :func:`cactus_index_delete`.
Args:
ids: Iterable of IDs to remove from the index.
Returns:
The result returned by :func:`cactus_index_delete`, typically a status
or count of deleted items.
"""
self._require_handle()
return cactus_index_delete(self._handle, ids)
def query(self, embedding, top_k=5, options=None):
"""Query the index for the nearest neighbors of a vector.
This is a convenience wrapper around :func:`cactus_index_query`.
Args:
embedding: The query embedding vector.
top_k: Maximum number of nearest neighbors to return.
options: Optional dict of backend-specific query options.
Returns:
The result returned by :func:`cactus_index_query`, typically a
structure containing matched IDs, scores, and associated payloads.
"""
self._require_handle()
return cactus_index_query(self._handle, embedding, top_k, options)
def compact(self):
"""Compact the index, reclaiming space and optimizing layout.
This is a convenience wrapper around :func:`cactus_index_compact`.
Returns:
The result returned by :func:`cactus_index_compact`, if any.
"""

Copilot uses AI. Check for mistakes.
# cactus_index_destroy called automatically, even on errors
"""

def __init__(self, index_dir, embedding_dim):
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a docstring to the init method that documents the parameters (index_dir, embedding_dim) and their purpose. This would help users understand what arguments to pass when creating the context manager directly (outside of a with statement).

Suggested change
def __init__(self, index_dir, embedding_dim):
def __init__(self, index_dir, embedding_dim):
"""Initialize a vector index handle.
Parameters:
index_dir: Path to the directory where the index is stored or will be created.
embedding_dim: Integer dimensionality of the embedding vectors stored in the index.
"""

Copilot uses AI. Check for mistakes.
Comment on lines +49 to +92
class TestCactusModelContextManager(unittest.TestCase):

@patch.object(cactus, "cactus_destroy")
@patch.object(cactus, "cactus_init", return_value=MOCK_HANDLE)
def test_with_calls_destroy(self, mock_init, mock_destroy):
with cactus.CactusModel("weights/test") as model:
self.assertIsNotNone(model._handle)

mock_init.assert_called_once_with("weights/test", None, False)
mock_destroy.assert_called_once_with(MOCK_HANDLE)

@patch.object(cactus, "cactus_destroy")
@patch.object(cactus, "cactus_init", return_value=MOCK_HANDLE)
def test_with_calls_destroy_on_exception(self, mock_init, mock_destroy):
with self.assertRaises(ValueError):
with cactus.CactusModel("weights/test") as model:
raise ValueError("simulated error")

mock_destroy.assert_called_once_with(MOCK_HANDLE)

@patch.object(cactus, "cactus_destroy")
@patch.object(cactus, "cactus_init", return_value=MOCK_HANDLE)
def test_double_destroy_safe(self, mock_init, mock_destroy):
model = cactus.CactusModel("weights/test")
model.destroy()
model.destroy()

mock_destroy.assert_called_once_with(MOCK_HANDLE)

@patch.object(cactus, "cactus_destroy")
@patch.object(cactus, "cactus_init", return_value=MOCK_HANDLE)
def test_method_after_destroy_raises(self, mock_init, mock_destroy):
model = cactus.CactusModel("weights/test")
model.destroy()

with self.assertRaises(RuntimeError):
model.complete([])

with self.assertRaises(RuntimeError):
model.embed("text")

with self.assertRaises(RuntimeError):
model.transcribe("audio.wav")

Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding test cases for initialization failure scenarios (when cactus_init returns None). This would help verify that the context manager properly handles and reports initialization errors, which is currently not tested.

Copilot uses AI. Check for mistakes.
Comment on lines +137 to +138


Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding test cases for initialization failure scenarios (when cactus_index_init returns None). This would help verify that the context manager properly handles and reports initialization errors, which is currently not tested.

Suggested change
@patch.object(cactus, "cactus_index_destroy")
@patch.object(cactus, "cactus_index_init", return_value=None)
def test_init_failure_raises_and_does_not_destroy(self, mock_init, mock_destroy):
# When cactus_index_init fails (returns None), the context manager
# should raise and never call cactus_index_destroy.
with self.assertRaises(RuntimeError):
with cactus.CactusIndex("/tmp/index", 384) as index:
# Body should never execute if initialization fails.
self.fail("Context body executed despite initialization failure")
mock_init.assert_called_once_with("/tmp/index", 384)
mock_destroy.assert_not_called()

Copilot uses AI. Check for mistakes.
# cactus_destroy called automatically, even on errors
"""

def __init__(self, model_path, corpus_dir=None, cache_index=False):
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a docstring to the init method that documents the parameters (model_path, corpus_dir, cache_index) and their purpose. This would help users understand what arguments to pass when creating the context manager directly (outside of a with statement).

Suggested change
def __init__(self, model_path, corpus_dir=None, cache_index=False):
def __init__(self, model_path, corpus_dir=None, cache_index=False):
"""
Initialize a CactusModel instance.
Args:
model_path: Path to the model weights or configuration used to initialize the Cactus model.
corpus_dir: Optional directory containing corpus or index data for retrieval-augmented generation.
cache_index: If True, cache the index in memory for faster subsequent access.
"""

Copilot uses AI. Check for mistakes.
if self._handle is None:
raise RuntimeError("Model has been destroyed")

def destroy(self):
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a docstring to the destroy method explaining that it's safe to call multiple times and that it's automatically called when using the context manager. This would help users who might want to manually call destroy in certain scenarios.

Suggested change
def destroy(self):
def destroy(self):
"""Release the underlying model resources.
This method is safe to call multiple times; after the first call it
becomes a no-op. It is called automatically when using CactusModel
as a context manager (i.e., at the end of a ``with`` block).
"""

Copilot uses AI. Check for mistakes.
if self._handle is None:
raise RuntimeError("Index has been destroyed")

def destroy(self):
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a docstring to the destroy method explaining that it's safe to call multiple times and that it's automatically called when using the context manager. This would help users who might want to manually call destroy in certain scenarios.

Suggested change
def destroy(self):
def destroy(self):
"""Release the underlying index resources.
This method is safe to call multiple times. After the first call,
the internal handle is set to None and subsequent calls will be
no-ops.
When using CactusIndex as a context manager (with CactusIndex(...) as index:),
this method is called automatically from __exit__, even if an exception
is raised inside the with block. Manual calls to destroy are only
needed when managing the lifecycle explicitly without a with statement.
"""

Copilot uses AI. Check for mistakes.
@Udit8348
Copy link
Copy Markdown

Udit8348 commented Feb 24, 2026

At a glance it looks like its going in a good direction. It does add quite a bit more lines of code, but I can see the intent for it. The backwards compat seems like it should be okay since it does not break the prev syntax style.

Co-Pilot is asking to add very long and verbose docstrings. I think these are unreadable, imo. I think its better to find a good place to demonstrate how to migrate one's code from the old style to the new one that uses a more concise api. This is also something repo owner (@HenryNdubuaku) should have input on because it is pretty significant.

If you can verify everything with a full build an a model that would be good. Golden, if you can prove your work with some kind of address sanitizer tool. Will depend on your platform. Redundancy checks like this will be helpful for future contribs as well. With that being said, checking for memory leaks using an address sanitizer tool might be trickier than the scope of this issue. Please feel free to investigate.

@yogyam
Copy link
Copy Markdown
Contributor Author

yogyam commented Feb 24, 2026

Thanks for the review. On my recent commits, I added a migration guide to python/README.md showing old vs new pattern. Restored example.py to its original state — instead of modifying it, I added a separate example_context_manager.py to keep things non-invasive.

I ran the full build verification — cactus build --python + cactus download google/gemma-3-270m-it — and all integration tests pass: normal with-block, cleanup on exception, use-after-destroy protection, and double-destroy safety. The mock-based unit tests also pass. Happy for you or @HenryNdubuaku to run on your end as well.

@HenryNdubuaku
Copy link
Copy Markdown
Collaborator

HenryNdubuaku commented Feb 24, 2026

thanks for this @yogyam !
I love this direction! one thing, this is quite chunky and adds a new file to the project, something we generally avoid.

  • move the example to the existing example.py
  • if we are sticking to context manager, make it the default by refactoring, rather than padding it onto existing flow and bloating the codebase.
  • also update all docs and MD files that use Python.
  • CLI depends on the python, review the implications of your PR to these.
  • avoid redundant or unnecessary code at all cost.

Thanks!

…ager

* Remedied PR bloat by moving standalone _lib wrapper functions directly into CactusModel/CactusIndex
* Removed redundant example_context_manager.py and uses example.py instead
* Updated README.md to reflect CactusModel as primary Python FFI API
* Ensured zero impact on cli.py and core executable
* Stripped docstrings to adhere strictly to contributing guidelines

Signed-off-by: yogyam <[email protected]>
@yogyam
Copy link
Copy Markdown
Contributor Author

yogyam commented Feb 24, 2026

Thanks for the feedback! I deleted the extra example file and moved the code into the existing example.py, refactored the Python bindings to make the CactusModel context manager the default API, updated the docs to reflect this, and verified that these changes don't impact the CLI. Let me know what more needs to be done!

Signed-off-by: HenryNdubuaku <[email protected]>
@HenryNdubuaku
Copy link
Copy Markdown
Collaborator

Thanks for this @yogyam merging now, i decided to retain the old design too, for backward compatibility, while this becomes the new way.

@HenryNdubuaku HenryNdubuaku merged commit d47bb5d into cactus-compute:main Feb 26, 2026
1 of 2 checks passed
HenryNdubuaku added a commit to jrajala6/cactus that referenced this pull request Feb 26, 2026
…actus-compute#412)

* feat(python): add context manager support for safe resource cleanup

Signed-off-by: yogyam <[email protected]>

* docs: add context manager migration guide to python README

Signed-off-by: yogyam <[email protected]>

* style: trim verbose comments to match project guidelines

Signed-off-by: yogyam <[email protected]>

* refactor: restore original example.py, add separate context manager example

Signed-off-by: yogyam <[email protected]>

* refactor(python): encapsulate C-bindings into CactusModel context manager

* Remedied PR bloat by moving standalone _lib wrapper functions directly into CactusModel/CactusIndex
* Removed redundant example_context_manager.py and uses example.py instead
* Updated README.md to reflect CactusModel as primary Python FFI API
* Ensured zero impact on cli.py and core executable
* Stripped docstrings to adhere strictly to contributing guidelines

Signed-off-by: yogyam <[email protected]>

* retain old

Signed-off-by: HenryNdubuaku <[email protected]>

---------

Signed-off-by: yogyam <[email protected]>
Signed-off-by: HenryNdubuaku <[email protected]>
Co-authored-by: HenryNdubuaku <[email protected]>
cattermelon1234 pushed a commit to cattermelon1234/cactus that referenced this pull request Feb 28, 2026
…actus-compute#412)

* feat(python): add context manager support for safe resource cleanup

Signed-off-by: yogyam <[email protected]>

* docs: add context manager migration guide to python README

Signed-off-by: yogyam <[email protected]>

* style: trim verbose comments to match project guidelines

Signed-off-by: yogyam <[email protected]>

* refactor: restore original example.py, add separate context manager example

Signed-off-by: yogyam <[email protected]>

* refactor(python): encapsulate C-bindings into CactusModel context manager

* Remedied PR bloat by moving standalone _lib wrapper functions directly into CactusModel/CactusIndex
* Removed redundant example_context_manager.py and uses example.py instead
* Updated README.md to reflect CactusModel as primary Python FFI API
* Ensured zero impact on cli.py and core executable
* Stripped docstrings to adhere strictly to contributing guidelines

Signed-off-by: yogyam <[email protected]>

* retain old

Signed-off-by: HenryNdubuaku <[email protected]>

---------

Signed-off-by: yogyam <[email protected]>
Signed-off-by: HenryNdubuaku <[email protected]>
Co-authored-by: HenryNdubuaku <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants