
simplify and align sdks #489

Merged
HenryNdubuaku merged 2 commits into main from sdks_simplification_and_alignment on Mar 4, 2026

Conversation

@jakmro
Collaborator

@jakmro jakmro commented Mar 4, 2026

No description provided.

Copilot AI review requested due to automatic review settings March 4, 2026 07:30
Contributor

Copilot AI left a comment


Pull request overview

This PR simplifies and aligns the Python, Flutter/Dart, Swift, and Kotlin/Android SDKs by exposing module/top-level functions that more directly mirror the C FFI API (including new telemetry helpers and index document retrieval).

Changes:

  • Refactors Python/Flutter/Swift/Kotlin APIs from OO wrappers into direct FFI-style, module/top-level functions and updates READMEs accordingly.
  • Adds/threads through new FFI parameters and functions (e.g., cacheIndex on init, telemetry set_app_id, telemetry shutdown, and cactus_index_get).
  • Standardizes streaming transcription APIs across SDKs (*_start/process/stop).
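The *_start/process/stop lifecycle described above can be sketched in Python. This is a toy illustration of the pattern only; the class and method names are illustrative stand-ins, not the actual Cactus SDK surface.

```python
# Toy sketch of the *_start / *_process / *_stop streaming lifecycle the PR
# standardizes across SDKs. Names here are illustrative, not the real API.
class ToyTranscriptionStream:
    def __init__(self):
        self._chunks = None

    def start(self):
        # Begin a streaming session; resets any accumulated state.
        self._chunks = []

    def process(self, pcm_chunk: bytes):
        # Feed one chunk of audio; the real SDKs hand this to the C FFI.
        if self._chunks is None:
            raise RuntimeError("stream not started")
        self._chunks.append(pcm_chunk)

    def stop(self) -> bytes:
        # Finalize the session and return the accumulated result.
        data, self._chunks = b"".join(self._chunks), None
        return data

stream = ToyTranscriptionStream()
stream.start()
for chunk in (b"\x00\x01", b"\x02\x03"):
    stream.process(chunk)
assert stream.stop() == b"\x00\x01\x02\x03"
```

The point of the shape is that each SDK exposes the same three entry points, so callers can drive a transcription session identically regardless of platform.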

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| python/src/cactus.py | Reworks Python bindings to be thin wrappers over the C FFI, adds telemetry + index_get, and changes error handling to raise RuntimeError. |
| python/README.md | Updates Python docs/examples to match the new module-level API. |
| flutter/cactus.dart | Refactors Dart API to top-level FFI wrappers, adds telemetry + index_get, and updates stream transcription functions. |
| flutter/README.md | Updates Flutter docs/examples to match the new API. |
| apple/README.md | Updates Swift SDK docs/examples to match the new top-level API. |
| apple/Cactus.swift | Refactors Swift bindings to top-level functions, adds telemetry + index_get, and aligns stream transcription functions. |
| android/cactus_jni.cpp | Updates JNI exports to match the new Kotlin API surface, adds telemetry + index_get, and switches error handling to throwing. |
| android/README.md | Updates Android/Kotlin docs/examples to match the new top-level API. |
| android/Cactus.kt | Replaces the class-based wrapper with top-level functions backed by a JNI singleton. |
| android/Cactus.ios.kt | Refactors Kotlin/Native (iOS) bindings to top-level functions mirroring the C FFI, adds telemetry + index_get. |
| android/Cactus.common.kt | Replaces the expect class API with expect fun top-level functions for the shared surface. |
| android/Cactus.android.kt | Implements the Android actual top-level functions that call into the JNI layer. |
Comments suppressed due to low confidence (1)

flutter/cactus.dart:453

  • cactusGetLastError() returns a non-optional String but the underlying C API returns an empty string when there is no error. This makes it hard for callers to distinguish "no error" vs an actual empty error message. Consider returning String? (null when empty) for parity with the Python API.
/// Returns the last error message.
String cactusGetLastError() {
  return _cactusGetLastError().toDartString();
}
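The empty-string-vs-null distinction the reviewer raises can be captured in a tiny helper. This is a sketch of the suggested convention, not code from the PR; `normalize_last_error` is a hypothetical name.

```python
def normalize_last_error(msg: str):
    """Map the C convention (empty string means "no error") onto an
    Optional-style return, as the review suggests for cactusGetLastError."""
    return msg if msg else None

assert normalize_last_error("") is None
assert normalize_last_error("model not loaded") == "model not loaded"
```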


Comment on lines +340 to +345
max_tokens = 8192
arr = (ctypes.c_uint32 * max_tokens)()
n = ctypes.c_size_t(0)
rc = _lib.cactus_tokenize(model, _enc(text), arr, max_tokens, ctypes.byref(n))
if rc < 0:
    raise RuntimeError(_err("Tokenization failed"))

Copilot AI Mar 4, 2026


cactus_tokenize in the C FFI returns -2 when the provided buffer is too small (and supports a length-query call by passing token_buffer=NULL, token_buffer_len=0). The new Python wrapper always uses a fixed 8192 buffer and will fail for longer inputs where the old two-pass implementation would succeed. Consider restoring the two-pass approach (query required length, allocate exact buffer, then fetch).

Suggested change
-max_tokens = 8192
-arr = (ctypes.c_uint32 * max_tokens)()
-n = ctypes.c_size_t(0)
-rc = _lib.cactus_tokenize(model, _enc(text), arr, max_tokens, ctypes.byref(n))
-if rc < 0:
-    raise RuntimeError(_err("Tokenization failed"))
+# First, query the required number of tokens by calling with a NULL buffer.
+n = ctypes.c_size_t(0)
+rc = _lib.cactus_tokenize(model, _enc(text), None, 0, ctypes.byref(n))
+if rc < 0:
+    raise RuntimeError(_err("Tokenization failed"))
+if n.value == 0:
+    return []
+# Allocate exactly the required buffer and fetch the tokens.
+arr = (ctypes.c_uint32 * n.value)()
+rc = _lib.cactus_tokenize(model, _enc(text), arr, n.value, ctypes.byref(n))
+if rc < 0:
+    raise RuntimeError(_err("Tokenization failed"))

doc = doc_raw[i].value.decode("utf-8", errors="ignore")
meta_bytes = meta_raw[i].value
meta = meta_bytes.decode("utf-8", errors="ignore") if meta_bytes else None
emb_dim = emb_sizes[i]

Copilot AI Mar 4, 2026


cactus_index_get uses emb_dim = emb_sizes[i] (a ctypes.c_size_t instance) as a slice bound (emb_raw[i][:emb_dim]). Slice bounds must be plain integers; use emb_sizes[i].value (and likewise for the > 0 check) to avoid a TypeError at runtime.

Suggested change
-emb_dim = emb_sizes[i]
+emb_dim = emb_sizes[i].value

Comment on lines +1014 to +1020
final sb = StringBuffer('{"results":[');
for (var i = 0; i < resultCount; i++) {
  if (i > 0) sb.write(',');
  sb.write('{"id":${idBuffers[0][i]},"score":${scoreBuffers[0][i]}}');
}
sb.write(']}');
return sb.toString();

Copilot AI Mar 4, 2026


cactusIndexQuery returns JSON built via string concatenation. This is brittle (e.g., requires correct escaping, and non-finite floats can break JSON). Prefer returning a Dart object encoded via jsonEncode to guarantee valid JSON.

Suggested change
-final sb = StringBuffer('{"results":[');
-for (var i = 0; i < resultCount; i++) {
-  if (i > 0) sb.write(',');
-  sb.write('{"id":${idBuffers[0][i]},"score":${scoreBuffers[0][i]}}');
-}
-sb.write(']}');
-return sb.toString();
+final results = <Map<String, dynamic>>[];
+for (var i = 0; i < resultCount; i++) {
+  results.add({
+    'id': idBuffers[0][i],
+    'score': scoreBuffers[0][i],
+  });
+}
+return jsonEncode({'results': results});

Comment on lines +574 to 578
std::vector<int> idBuffer(static_cast<size_t>(topK));
std::vector<float> scoreBuffer(static_cast<size_t>(topK));
size_t idBufferSize = static_cast<size_t>(topK);
size_t scoreBufferSize = static_cast<size_t>(topK);


Copilot AI Mar 4, 2026


nativeIndexQuery / nativeIndexGet accept topK as jlong and directly size std::vector buffers from it. A negative or extremely large topK from Kotlin could cause huge allocations or wrap to a very large size_t. Validate topK bounds (e.g., >0 and <= some maximum) before allocating.

Suggested change
-std::vector<int> idBuffer(static_cast<size_t>(topK));
-std::vector<float> scoreBuffer(static_cast<size_t>(topK));
-size_t idBufferSize = static_cast<size_t>(topK);
-size_t scoreBufferSize = static_cast<size_t>(topK);
+// Validate topK before using it to size buffers
+if (topK <= 0 || topK > static_cast<jlong>(DEFAULT_BUFFER_SIZE)) {
+    env->ReleaseFloatArrayElements(embedding, embData, JNI_ABORT);
+    release_jstring(env, optionsJson, options);
+    throwCactusException(env, "Invalid topK: must be between 1 and 65536");
+    return nullptr;
+}
+size_t kSize = static_cast<size_t>(topK);
+std::vector<int> idBuffer(kSize);
+std::vector<float> scoreBuffer(kSize);
+size_t idBufferSize = kSize;
+size_t scoreBufferSize = kSize;

Comment on lines 274 to +277
 if pcm_data is not None:
-    if isinstance(pcm_data, bytes):
-        pcm_arr = (ctypes.c_uint8 * len(pcm_data)).from_buffer_copy(pcm_data)
-    else:
-        pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)
+    pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)
     pcm_ptr = ctypes.cast(pcm_arr, ctypes.POINTER(ctypes.c_uint8))
-    pcm_len = len(pcm_data)
-
-_lib.cactus_detect_language(
-    model,
-    audio_path.encode() if isinstance(audio_path, str) else audio_path,
-    buf, len(buf),
-    options_json.encode() if options_json else None,
-    pcm_ptr, pcm_len
+    pcm_size = len(pcm_data)

Copilot AI Mar 4, 2026


pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data) expands pcm_data into thousands of positional arguments, which is extremely slow for real audio buffers. Use from_buffer_copy(pcm_data) when pcm_data is bytes/bytearray (or wrap with memoryview) to avoid argument explosion.
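The difference between the two construction paths is easy to demonstrate with plain ctypes; the buffer below is a stand-in for real audio data.

```python
import ctypes

# Stand-in for a real PCM buffer; real audio would be far larger.
pcm_data = bytes(range(256)) * 16

# Fast path suggested by the review: a single memcpy from the bytes object.
fast = (ctypes.c_uint8 * len(pcm_data)).from_buffer_copy(pcm_data)

# Slow path flagged in the review: each byte becomes a separate positional
# argument to the array constructor.
slow = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)

# Both produce identical buffer contents; only the construction cost differs.
assert bytes(fast) == bytes(slow) == pcm_data
```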

Comment on lines 292 to +296
 buf = (ctypes.c_float * 4096)()
 dim = ctypes.c_size_t()
-_lib.cactus_embed(
-    model,
-    text.encode() if isinstance(text, str) else text,
-    buf, ctypes.sizeof(buf), ctypes.byref(dim), normalize
-)
+rc = _lib.cactus_embed(model, _enc(text), buf, 4096, ctypes.byref(dim), normalize)
+if rc < 0:
+    raise RuntimeError(_err("Embedding failed"))

Copilot AI Mar 4, 2026


cactus_embed expects buffer_size in bytes (see C FFI: it compares embeddings.size() * sizeof(float) to buffer_size). Passing 4096 here under-allocates (only 4KB), and will fail for larger embedding dims. Pass ctypes.sizeof(buf) (or 4096 * ctypes.sizeof(ctypes.c_float)) instead of 4096.
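The elements-vs-bytes distinction can be checked directly with ctypes:

```python
import ctypes

buf = (ctypes.c_float * 4096)()

# The element count (4096) and the byte size differ by sizeof(float).
# ctypes.sizeof(buf) is the value a byte-sized buffer_size parameter needs.
assert ctypes.sizeof(ctypes.c_float) == 4
assert ctypes.sizeof(buf) == 4096 * ctypes.sizeof(ctypes.c_float)  # 16384 bytes
```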

Comment on lines 300 to +306
 def cactus_image_embed(model, image_path):
-    """
-    Get image embeddings from a VLM.
-
-    Args:
-        model: Model handle from cactus_init
-        image_path: Path to image file
-
-    Returns:
-        List of floats representing the image embedding vector.
-    """
+    """Generates an image embedding. Returns list of floats."""
     buf = (ctypes.c_float * 4096)()
     dim = ctypes.c_size_t()
-    _lib.cactus_image_embed(
-        model,
-        image_path.encode() if isinstance(image_path, str) else image_path,
-        buf, ctypes.sizeof(buf), ctypes.byref(dim)
-    )
+    rc = _lib.cactus_image_embed(model, _enc(image_path), buf, 4096, ctypes.byref(dim))
+    if rc < 0:
+        raise RuntimeError(_err("Image embedding failed"))

Copilot AI Mar 4, 2026


Same issue as cactus_embed: cactus_image_embed’s buffer_size parameter is in bytes. Passing 4096 is only 4KB and can cause -2 (buffer too small) for higher-dimensional embeddings. Use ctypes.sizeof(buf) (or 4096 * sizeof(float)).

Comment on lines 310 to +316
 def cactus_audio_embed(model, audio_path):
-    """
-    Get audio embeddings from a Whisper model.
-
-    Args:
-        model: Whisper model handle from cactus_init
-        audio_path: Path to audio file (WAV format)
-
-    Returns:
-        List of floats representing the audio embedding vector.
-    """
+    """Generates an audio embedding. Returns list of floats."""
     buf = (ctypes.c_float * 4096)()
     dim = ctypes.c_size_t()
-    _lib.cactus_audio_embed(
-        model,
-        audio_path.encode() if isinstance(audio_path, str) else audio_path,
-        buf, ctypes.sizeof(buf), ctypes.byref(dim)
-    )
+    rc = _lib.cactus_audio_embed(model, _enc(audio_path), buf, 4096, ctypes.byref(dim))
+    if rc < 0:
+        raise RuntimeError(_err("Audio embedding failed"))

Copilot AI Mar 4, 2026


Same bytes-vs-floats issue: cactus_audio_embed expects buffer_size in bytes. Passing 4096 will under-report available space and can fail for larger embedding dimensions. Use ctypes.sizeof(buf) (or 4096 * sizeof(float)).

Comment on lines +466 to +468
n = id_size.value
parts = [f'{{"id":{id_buffer[i]},"score":{score_buffer[i]}}}' for i in range(n)]
return '{"results":[' + ",".join(parts) + ']}'

Copilot AI Mar 4, 2026


cactus_index_query manually constructs a JSON string. Besides string escaping issues, float values like nan/inf would serialize as nan/inf via str(...), which is not valid JSON. Prefer returning structured data via json.dumps(...) to ensure correct encoding.
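A json.dumps-based version of the idea is sketched below; `index_results_to_json` is a hypothetical helper name, not a function from the PR.

```python
import json
import math

def index_results_to_json(ids, scores):
    # json.dumps with allow_nan=False raises ValueError instead of emitting
    # bare NaN/Infinity tokens, which are not valid JSON.
    payload = {"results": [{"id": i, "score": s} for i, s in zip(ids, scores)]}
    return json.dumps(payload, allow_nan=False)

print(index_results_to_json([3, 7], [0.5, 0.25]))
# {"results": [{"id": 3, "score": 0.5}, {"id": 7, "score": 0.25}]}

try:
    index_results_to_json([1], [math.inf])
except ValueError:
    print("non-finite score rejected")
```

Escaping, float formatting, and non-finite handling all come from the serializer rather than hand-rolled string concatenation.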

Comment on lines +502 to +508
var sb = "{\"results\":["
for i in 0..<idBufferSize {
    if i > 0 { sb += "," }
    sb += "{\"id\":\(idBuffer[i]),\"score\":\(scoreBuffer[i])}"
}
sb += "]}"
return sb

Copilot AI Mar 4, 2026


cactusIndexQuery returns JSON assembled via string concatenation. This is error-prone (escaping, non-finite floats) and duplicates JSON formatting logic across SDKs. Prefer returning structured results and encoding with JSONSerialization to guarantee valid JSON.

Suggested change
-var sb = "{\"results\":["
-for i in 0..<idBufferSize {
-    if i > 0 { sb += "," }
-    sb += "{\"id\":\(idBuffer[i]),\"score\":\(scoreBuffer[i])}"
-}
-sb += "]}"
-return sb
+var resultsArray: [[String: Any]] = []
+resultsArray.reserveCapacity(idBufferSize)
+for i in 0..<idBufferSize {
+    let score = scoreBuffer[i]
+    if !score.isFinite {
+        throw _err("Index query returned non-finite score")
+    }
+    resultsArray.append([
+        "id": idBuffer[i],
+        "score": score
+    ])
+}
+let jsonObject: [String: Any] = ["results": resultsArray]
+let data = try JSONSerialization.data(withJSONObject: jsonObject, options: [])
+guard let jsonString = String(data: data, encoding: .utf8) else {
+    throw _err("Failed to encode index query results as UTF-8 JSON")
+}
+return jsonString

@HenryNdubuaku HenryNdubuaku merged commit 078340f into main Mar 4, 2026
1 of 2 checks passed


3 participants