Conversation
Signed-off-by: jakmro <[email protected]>
…o README Signed-off-by: jakmro <[email protected]>
Pull request overview
This PR simplifies and aligns the Python, Flutter/Dart, Swift, and Kotlin/Android SDKs by exposing module/top-level functions that more directly mirror the C FFI API (including new telemetry helpers and index document retrieval).
Changes:
- Refactors Python/Flutter/Swift/Kotlin APIs from OO wrappers into direct FFI-style, module/top-level functions and updates READMEs accordingly.
- Adds/threads through new FFI parameters and functions (e.g., `cacheIndex` on init, telemetry `set_app_id`, telemetry shutdown, and `cactus_index_get`).
- Standardizes streaming transcription APIs across SDKs (`*_start`/`process`/`stop`).
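As a rough illustration of that standardized streaming shape, here is a minimal Python sketch; every name below (`stream_start`, `stream_process`, `stream_stop`, `FakeStream`) is a hypothetical placeholder, not one of the SDKs' actual symbols.

```python
# Hypothetical sketch of a *_start/process/stop streaming surface.
class FakeStream:
    """Stand-in for a native stream handle."""
    def __init__(self):
        self.chunks = []

def stream_start():
    # Would call the C FFI to open a streaming session and return a handle.
    return FakeStream()

def stream_process(stream, pcm_chunk):
    # Would feed one PCM chunk into the native session.
    stream.chunks.append(pcm_chunk)

def stream_stop(stream):
    # Would finalize the session and return the accumulated result.
    return b"".join(stream.chunks)

s = stream_start()
stream_process(s, b"ab")
stream_process(s, b"cd")
print(stream_stop(s))  # b'abcd'
```

The point is only the call shape: one opaque handle from `start`, repeated `process` calls, and a single `stop` that yields the final result.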
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| python/src/cactus.py | Reworks Python bindings to be thin wrappers over the C FFI, adds telemetry + index_get, and changes error handling to raise RuntimeError. |
| python/README.md | Updates Python docs/examples to match the new module-level API. |
| flutter/cactus.dart | Refactors Dart API to top-level FFI wrappers, adds telemetry + index_get, and updates stream transcription functions. |
| flutter/README.md | Updates Flutter docs/examples to match the new API. |
| apple/README.md | Updates Swift SDK docs/examples to match the new top-level API. |
| apple/Cactus.swift | Refactors Swift bindings to top-level functions, adds telemetry + index_get, and aligns stream transcription functions. |
| android/cactus_jni.cpp | Updates JNI exports to match new Kotlin API surface, adds telemetry + index_get, and switches error handling to throwing. |
| android/README.md | Updates Android/Kotlin docs/examples to match the new top-level API. |
| android/Cactus.kt | Replaces class-based wrapper with top-level functions backed by a JNI singleton. |
| android/Cactus.ios.kt | Refactors Kotlin/Native (iOS) bindings to top-level functions mirroring the C FFI, adds telemetry + index_get. |
| android/Cactus.common.kt | Replaces expect class API with expect fun top-level functions for the shared surface. |
| android/Cactus.android.kt | Implements Android actual top-level functions that call into the JNI layer. |
Comments suppressed due to low confidence (1)
flutter/cactus.dart:453
`cactusGetLastError()` returns a non-optional `String`, but the underlying C API returns an empty string when there is no error. This makes it hard for callers to distinguish "no error" from an actual empty error message. Consider returning `String?` (null when empty) for parity with the Python API.
```dart
/// Returns the last error message.
String cactusGetLastError() {
  return _cactusGetLastError().toDartString();
}
```
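For comparison, a minimal Python-side sketch of the nullable convention the comment suggests; the helper name `last_error_or_none` is hypothetical.

```python
def last_error_or_none(raw: str):
    """Map the C API's empty-string 'no error' sentinel to None (hypothetical helper)."""
    return raw or None

assert last_error_or_none("") is None
assert last_error_or_none("model not loaded") == "model not loaded"
```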
```python
max_tokens = 8192
arr = (ctypes.c_uint32 * max_tokens)()
n = ctypes.c_size_t(0)
rc = _lib.cactus_tokenize(model, _enc(text), arr, max_tokens, ctypes.byref(n))
if rc < 0:
    raise RuntimeError(_err("Tokenization failed"))
```
`cactus_tokenize` in the C FFI returns -2 when the provided buffer is too small (and supports a length-query call by passing `token_buffer=NULL, token_buffer_len=0`). The new Python wrapper always uses a fixed 8192-token buffer and will fail for longer inputs where the old two-pass implementation would succeed. Consider restoring the two-pass approach (query required length, allocate exact buffer, then fetch).
Suggested change:

```diff
-max_tokens = 8192
-arr = (ctypes.c_uint32 * max_tokens)()
-n = ctypes.c_size_t(0)
-rc = _lib.cactus_tokenize(model, _enc(text), arr, max_tokens, ctypes.byref(n))
-if rc < 0:
-    raise RuntimeError(_err("Tokenization failed"))
+# First, query the required number of tokens by calling with a NULL buffer.
+n = ctypes.c_size_t(0)
+rc = _lib.cactus_tokenize(model, _enc(text), None, 0, ctypes.byref(n))
+if rc < 0:
+    raise RuntimeError(_err("Tokenization failed"))
+if n.value == 0:
+    return []
+# Allocate exactly the required buffer and fetch the tokens.
+arr = (ctypes.c_uint32 * n.value)()
+rc = _lib.cactus_tokenize(model, _enc(text), arr, n.value, ctypes.byref(n))
+if rc < 0:
+    raise RuntimeError(_err("Tokenization failed"))
```
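The two-pass contract can be illustrated with a toy stand-in for the C tokenizer; `fake_tokenize` and `_TOKENS` below are mocks for demonstration, not the real FFI.

```python
import ctypes

# Toy stand-in for the C tokenizer (a mock, not the real FFI): it reports the
# required token count via out_n, returns 0 on success and -2 when the
# provided buffer is too small.
_TOKENS = [101, 7592, 102]

def fake_tokenize(buf, buf_len, out_n):
    out_n.value = len(_TOKENS)
    if buf is None or buf_len == 0:
        return 0   # pure length query succeeds
    if buf_len < len(_TOKENS):
        return -2  # buffer too small
    for i, tok in enumerate(_TOKENS):
        buf[i] = tok
    return 0

# Two-pass pattern: query the length, allocate exactly, then fetch.
n = ctypes.c_size_t(0)
assert fake_tokenize(None, 0, n) == 0
arr = (ctypes.c_uint32 * n.value)()
assert fake_tokenize(arr, n.value, n) == 0
print(list(arr))  # [101, 7592, 102]
```

With a fixed buffer, any input longer than the buffer would hit the `-2` branch; the two-pass version can never under-allocate.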
```python
doc = doc_raw[i].value.decode("utf-8", errors="ignore")
meta_bytes = meta_raw[i].value
meta = meta_bytes.decode("utf-8", errors="ignore") if meta_bytes else None
emb_dim = emb_sizes[i]
```
`cactus_index_get` uses `emb_dim = emb_sizes[i]` (a `ctypes.c_size_t` instance) as a slice bound (`emb_raw[i][:emb_dim]`). Slice bounds must be plain integers; use `emb_sizes[i].value` (and likewise for the `> 0` check) to avoid a `TypeError` at runtime.
Suggested change:

```diff
-emb_dim = emb_sizes[i]
+emb_dim = emb_sizes[i].value
```
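A quick standalone demonstration of why `.value` is needed, independent of the SDK:

```python
import ctypes

n = ctypes.c_size_t(3)
data = [0.1, 0.2, 0.3, 0.4]

# A c_size_t instance is not a valid slice bound and raises TypeError.
try:
    data[:n]
    raise AssertionError("expected TypeError")
except TypeError:
    pass

# .value unwraps it to a plain int, which slices correctly.
print(data[:n.value])  # [0.1, 0.2, 0.3]
```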
```dart
final sb = StringBuffer('{"results":[');
for (var i = 0; i < resultCount; i++) {
  if (i > 0) sb.write(',');
  sb.write('{"id":${idBuffers[0][i]},"score":${scoreBuffers[0][i]}}');
}
sb.write(']}');
return sb.toString();
```
`cactusIndexQuery` returns JSON built via string concatenation. This is brittle (e.g., it requires correct escaping, and non-finite floats can break JSON). Prefer returning a Dart object encoded via `jsonEncode` to guarantee valid JSON.
Suggested change:

```diff
-final sb = StringBuffer('{"results":[');
-for (var i = 0; i < resultCount; i++) {
-  if (i > 0) sb.write(',');
-  sb.write('{"id":${idBuffers[0][i]},"score":${scoreBuffers[0][i]}}');
-}
-sb.write(']}');
-return sb.toString();
+final results = <Map<String, dynamic>>[];
+for (var i = 0; i < resultCount; i++) {
+  results.add({
+    'id': idBuffers[0][i],
+    'score': scoreBuffers[0][i],
+  });
+}
+return jsonEncode({'results': results});
```
```cpp
std::vector<int> idBuffer(static_cast<size_t>(topK));
std::vector<float> scoreBuffer(static_cast<size_t>(topK));
size_t idBufferSize = static_cast<size_t>(topK);
size_t scoreBufferSize = static_cast<size_t>(topK);
```
`nativeIndexQuery` / `nativeIndexGet` accept `topK` as `jlong` and directly size `std::vector` buffers from it. A negative or extremely large `topK` from Kotlin could cause huge allocations or wrap to a very large `size_t`. Validate `topK` bounds (e.g., `> 0` and `<=` some maximum) before allocating.
Suggested change:

```diff
-std::vector<int> idBuffer(static_cast<size_t>(topK));
-std::vector<float> scoreBuffer(static_cast<size_t>(topK));
-size_t idBufferSize = static_cast<size_t>(topK);
-size_t scoreBufferSize = static_cast<size_t>(topK);
+// Validate topK before using it to size buffers
+if (topK <= 0 || topK > static_cast<jlong>(DEFAULT_BUFFER_SIZE)) {
+    env->ReleaseFloatArrayElements(embedding, embData, JNI_ABORT);
+    release_jstring(env, optionsJson, options);
+    throwCactusException(env, "Invalid topK: must be between 1 and 65536");
+    return nullptr;
+}
+size_t kSize = static_cast<size_t>(topK);
+std::vector<int> idBuffer(kSize);
+std::vector<float> scoreBuffer(kSize);
+size_t idBufferSize = kSize;
+size_t scoreBufferSize = kSize;
```
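The same guard translates to any binding layer. A Python sketch of validating before allocating; `MAX_TOP_K`, `alloc_result_buffers`, and the 65536 cap are all illustrative assumptions, not SDK symbols.

```python
import ctypes

MAX_TOP_K = 65536  # illustrative cap; any sane upper bound works

def alloc_result_buffers(top_k):
    # Reject non-positive or oversized values before sizing native buffers.
    if not 0 < top_k <= MAX_TOP_K:
        raise ValueError(f"Invalid topK: {top_k}")
    return (ctypes.c_int * top_k)(), (ctypes.c_float * top_k)()

ids, scores = alloc_result_buffers(10)
print(len(ids), len(scores))  # 10 10
```

Without the check, a caller-supplied `top_k` of `-1` or `10**12` would reach the allocation directly, which is exactly the failure mode the comment describes for the JNI path.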
```python
if pcm_data is not None:
    if isinstance(pcm_data, bytes):
        pcm_arr = (ctypes.c_uint8 * len(pcm_data)).from_buffer_copy(pcm_data)
    else:
        pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)
    pcm_ptr = ctypes.cast(pcm_arr, ctypes.POINTER(ctypes.c_uint8))
    pcm_len = len(pcm_data)
```

```python
_lib.cactus_detect_language(
    model,
    audio_path.encode() if isinstance(audio_path, str) else audio_path,
    buf, len(buf),
    options_json.encode() if options_json else None,
    pcm_ptr, pcm_len
)
```
`pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)` expands `pcm_data` into thousands of positional arguments, which is extremely slow for real audio buffers. Use `from_buffer_copy(pcm_data)` when `pcm_data` is `bytes`/`bytearray` (or wrap with `memoryview`) to avoid the argument explosion.
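The two constructions produce identical buffers, which a small self-contained snippet can verify; only the star-expansion path pays the per-byte argument cost.

```python
import ctypes

pcm_data = bytes(range(16)) * 2  # 32 bytes of stand-in PCM audio

# Star-expansion passes every byte as its own positional argument:
slow = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)

# from_buffer_copy copies the whole buffer in a single call:
fast = (ctypes.c_uint8 * len(pcm_data)).from_buffer_copy(pcm_data)

assert bytes(slow) == bytes(fast) == pcm_data
```

For a few seconds of 16 kHz audio, `pcm_data` is tens of thousands of bytes, so the star-expansion path builds a call with that many arguments per invocation.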
```diff
 buf = (ctypes.c_float * 4096)()
 dim = ctypes.c_size_t()
-_lib.cactus_embed(
-    model,
-    text.encode() if isinstance(text, str) else text,
-    buf, ctypes.sizeof(buf), ctypes.byref(dim), normalize
-)
+rc = _lib.cactus_embed(model, _enc(text), buf, 4096, ctypes.byref(dim), normalize)
+if rc < 0:
+    raise RuntimeError(_err("Embedding failed"))
```
`cactus_embed` expects `buffer_size` in bytes (see the C FFI: it compares `embeddings.size() * sizeof(float)` to `buffer_size`). Passing 4096 here under-allocates (only 4KB), and will fail for larger embedding dims. Pass `ctypes.sizeof(buf)` (or `4096 * ctypes.sizeof(ctypes.c_float)`) instead of 4096.
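The element-count vs byte-size distinction is easy to check standalone:

```python
import ctypes

buf = (ctypes.c_float * 4096)()

# len() counts elements; sizeof() counts bytes -- a 4x difference for c_float.
print(len(buf), ctypes.sizeof(buf))  # 4096 16384
```

Passing `4096` where the callee expects bytes tells it the buffer holds only 1024 floats, a quarter of the space actually allocated.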
```diff
 def cactus_image_embed(model, image_path):
-    """
-    Get image embeddings from a VLM.
-
-    Args:
-        model: Model handle from cactus_init
-        image_path: Path to image file
-
-    Returns:
-        List of floats representing the image embedding vector.
-    """
+    """Generates an image embedding. Returns list of floats."""
     buf = (ctypes.c_float * 4096)()
     dim = ctypes.c_size_t()
-    _lib.cactus_image_embed(
-        model,
-        image_path.encode() if isinstance(image_path, str) else image_path,
-        buf, ctypes.sizeof(buf), ctypes.byref(dim)
-    )
+    rc = _lib.cactus_image_embed(model, _enc(image_path), buf, 4096, ctypes.byref(dim))
+    if rc < 0:
+        raise RuntimeError(_err("Image embedding failed"))
```
Same issue as `cactus_embed`: `cactus_image_embed`'s `buffer_size` parameter is in bytes. Passing 4096 is only 4KB and can cause -2 (buffer too small) for higher-dimensional embeddings. Use `ctypes.sizeof(buf)` (or `4096 * sizeof(float)`).
```diff
 def cactus_audio_embed(model, audio_path):
-    """
-    Get audio embeddings from a Whisper model.
-
-    Args:
-        model: Whisper model handle from cactus_init
-        audio_path: Path to audio file (WAV format)
-
-    Returns:
-        List of floats representing the audio embedding vector.
-    """
+    """Generates an audio embedding. Returns list of floats."""
     buf = (ctypes.c_float * 4096)()
     dim = ctypes.c_size_t()
-    _lib.cactus_audio_embed(
-        model,
-        audio_path.encode() if isinstance(audio_path, str) else audio_path,
-        buf, ctypes.sizeof(buf), ctypes.byref(dim)
-    )
+    rc = _lib.cactus_audio_embed(model, _enc(audio_path), buf, 4096, ctypes.byref(dim))
+    if rc < 0:
+        raise RuntimeError(_err("Audio embedding failed"))
```
Same bytes-vs-floats issue: `cactus_audio_embed` expects `buffer_size` in bytes. Passing 4096 will under-report the available space and can fail for larger embedding dimensions. Use `ctypes.sizeof(buf)` (or `4096 * sizeof(float)`).
```python
n = id_size.value
parts = [f'{{"id":{id_buffer[i]},"score":{score_buffer[i]}}}' for i in range(n)]
return '{"results":[' + ",".join(parts) + ']}'
```
`cactus_index_query` manually constructs a JSON string. Besides string-escaping issues, float values like `nan`/`inf` would serialize as `nan`/`inf` via `str(...)`, which is not valid JSON. Prefer returning structured data via `json.dumps(...)` to ensure correct encoding.
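The non-finite-float pitfall can be reproduced standalone: an f-string serializes NaN as the bare token `nan`, which no strict JSON parser accepts, while `json.dumps(..., allow_nan=False)` rejects the value up front.

```python
import json

score = float("nan")

# Manual string building emits invalid JSON for NaN scores:
manual = f'{{"results":[{{"id":1,"score":{score}}}]}}'
try:
    json.loads(manual)  # bare lowercase `nan` is not valid JSON
    raise AssertionError("expected JSONDecodeError")
except json.JSONDecodeError:
    pass

# json.dumps with allow_nan=False surfaces the bad value explicitly instead:
try:
    json.dumps({"results": [{"id": 1, "score": score}]}, allow_nan=False)
except ValueError:
    pass

print(json.dumps({"results": [{"id": 1, "score": 0.93}]}))
```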
```swift
var sb = "{\"results\":["
for i in 0..<idBufferSize {
    if i > 0 { sb += "," }
    sb += "{\"id\":\(idBuffer[i]),\"score\":\(scoreBuffer[i])}"
}
sb += "]}"
return sb
```
`cactusIndexQuery` returns JSON assembled via string concatenation. This is error-prone (escaping, non-finite floats) and duplicates JSON-formatting logic across SDKs. Prefer returning structured results and encoding with `JSONSerialization` to guarantee valid JSON.
Suggested change:

```diff
-var sb = "{\"results\":["
-for i in 0..<idBufferSize {
-    if i > 0 { sb += "," }
-    sb += "{\"id\":\(idBuffer[i]),\"score\":\(scoreBuffer[i])}"
-}
-sb += "]}"
-return sb
+var resultsArray: [[String: Any]] = []
+resultsArray.reserveCapacity(idBufferSize)
+for i in 0..<idBufferSize {
+    let score = scoreBuffer[i]
+    if !score.isFinite {
+        throw _err("Index query returned non-finite score")
+    }
+    resultsArray.append([
+        "id": idBuffer[i],
+        "score": score
+    ])
+}
+let jsonObject: [String: Any] = ["results": resultsArray]
+let data = try JSONSerialization.data(withJSONObject: jsonObject, options: [])
+guard let jsonString = String(data: data, encoding: .utf8) else {
+    throw _err("Failed to encode index query results as UTF-8 JSON")
+}
+return jsonString
```