Conversation
Signed-off-by: jakmro <[email protected]>
…o README Signed-off-by: jakmro <[email protected]>
Pull request overview
This PR simplifies and aligns the Python, Flutter/Dart, Swift, and Kotlin/Android SDKs by exposing module/top-level functions that more directly mirror the C FFI API (including new telemetry helpers and index document retrieval).
Changes:
- Refactors Python/Flutter/Swift/Kotlin APIs from OO wrappers into direct FFI-style, module/top-level functions and updates READMEs accordingly.
- Adds/threads through new FFI parameters and functions (e.g., `cacheIndex` on init, telemetry `set_app_id`, telemetry shutdown, and `cactus_index_get`).
- Standardizes streaming transcription APIs across SDKs (`*_start`/`process`/`stop`).
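As a rough illustration of that standardized streaming shape, here is a minimal Python sketch; every name below (`stream_start`, `stream_process`, `stream_stop`, `FakeStream`) is a hypothetical placeholder, not one of the SDKs' actual symbols.

```python
# Hypothetical sketch of a *_start/process/stop streaming surface.
class FakeStream:
    """Stand-in for a native stream handle."""
    def __init__(self):
        self.chunks = []

def stream_start():
    # Would call the C FFI to open a streaming session and return a handle.
    return FakeStream()

def stream_process(stream, pcm_chunk):
    # Would feed one PCM chunk into the native session.
    stream.chunks.append(pcm_chunk)

def stream_stop(stream):
    # Would finalize the session and return the accumulated result.
    return b"".join(stream.chunks)

s = stream_start()
stream_process(s, b"ab")
stream_process(s, b"cd")
print(stream_stop(s))  # b'abcd'
```

The point is only the call shape: one opaque handle from `start`, repeated `process` calls, and a single `stop` that yields the final result.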
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| python/src/cactus.py | Reworks Python bindings to be thin wrappers over the C FFI, adds telemetry + index_get, and changes error handling to raise RuntimeError. |
| python/README.md | Updates Python docs/examples to match the new module-level API. |
| flutter/cactus.dart | Refactors Dart API to top-level FFI wrappers, adds telemetry + index_get, and updates stream transcription functions. |
| flutter/README.md | Updates Flutter docs/examples to match the new API. |
| apple/README.md | Updates Swift SDK docs/examples to match the new top-level API. |
| apple/Cactus.swift | Refactors Swift bindings to top-level functions, adds telemetry + index_get, and aligns stream transcription functions. |
| android/cactus_jni.cpp | Updates JNI exports to match new Kotlin API surface, adds telemetry + index_get, and switches error handling to throwing. |
| android/README.md | Updates Android/Kotlin docs/examples to match the new top-level API. |
| android/Cactus.kt | Replaces class-based wrapper with top-level functions backed by a JNI singleton. |
| android/Cactus.ios.kt | Refactors Kotlin/Native (iOS) bindings to top-level functions mirroring the C FFI, adds telemetry + index_get. |
| android/Cactus.common.kt | Replaces expect class API with expect fun top-level functions for the shared surface. |
| android/Cactus.android.kt | Implements Android actual top-level functions that call into the JNI layer. |
Comments suppressed due to low confidence (1)
flutter/cactus.dart:453
`cactusGetLastError()` returns a non-optional `String`, but the underlying C API returns an empty string when there is no error. This makes it hard for callers to distinguish "no error" from an actual empty error message. Consider returning `String?` (null when empty) for parity with the Python API.
```dart
/// Returns the last error message.
String cactusGetLastError() {
  return _cactusGetLastError().toDartString();
}
```
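For comparison, a minimal Python-side sketch of the nullable convention the comment suggests; the helper name `last_error_or_none` is hypothetical.

```python
def last_error_or_none(raw: str):
    """Map the C API's empty-string 'no error' sentinel to None (hypothetical helper)."""
    return raw or None

assert last_error_or_none("") is None
assert last_error_or_none("model not loaded") == "model not loaded"
```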
```python
max_tokens = 8192
arr = (ctypes.c_uint32 * max_tokens)()
n = ctypes.c_size_t(0)
rc = _lib.cactus_tokenize(model, _enc(text), arr, max_tokens, ctypes.byref(n))
if rc < 0:
    raise RuntimeError(_err("Tokenization failed"))
```
`cactus_tokenize` in the C FFI returns -2 when the provided buffer is too small (and supports a length-query call by passing `token_buffer=NULL, token_buffer_len=0`). The new Python wrapper always uses a fixed 8192-token buffer and will fail for longer inputs where the old two-pass implementation would succeed. Consider restoring the two-pass approach (query required length, allocate exact buffer, then fetch).
Suggested change:

```diff
-max_tokens = 8192
-arr = (ctypes.c_uint32 * max_tokens)()
-n = ctypes.c_size_t(0)
-rc = _lib.cactus_tokenize(model, _enc(text), arr, max_tokens, ctypes.byref(n))
-if rc < 0:
-    raise RuntimeError(_err("Tokenization failed"))
+# First, query the required number of tokens by calling with a NULL buffer.
+n = ctypes.c_size_t(0)
+rc = _lib.cactus_tokenize(model, _enc(text), None, 0, ctypes.byref(n))
+if rc < 0:
+    raise RuntimeError(_err("Tokenization failed"))
+if n.value == 0:
+    return []
+# Allocate exactly the required buffer and fetch the tokens.
+arr = (ctypes.c_uint32 * n.value)()
+rc = _lib.cactus_tokenize(model, _enc(text), arr, n.value, ctypes.byref(n))
+if rc < 0:
+    raise RuntimeError(_err("Tokenization failed"))
```
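The two-pass contract can be illustrated with a toy stand-in for the C tokenizer; `fake_tokenize` and `_TOKENS` below are mocks for demonstration, not the real FFI.

```python
import ctypes

# Toy stand-in for the C tokenizer (a mock, not the real FFI): it reports the
# required token count via out_n, returns 0 on success and -2 when the
# provided buffer is too small.
_TOKENS = [101, 7592, 102]

def fake_tokenize(buf, buf_len, out_n):
    out_n.value = len(_TOKENS)
    if buf is None or buf_len == 0:
        return 0   # pure length query succeeds
    if buf_len < len(_TOKENS):
        return -2  # buffer too small
    for i, tok in enumerate(_TOKENS):
        buf[i] = tok
    return 0

# Two-pass pattern: query the length, allocate exactly, then fetch.
n = ctypes.c_size_t(0)
assert fake_tokenize(None, 0, n) == 0
arr = (ctypes.c_uint32 * n.value)()
assert fake_tokenize(arr, n.value, n) == 0
print(list(arr))  # [101, 7592, 102]
```

With a fixed buffer, any input longer than the buffer would hit the `-2` branch; the two-pass version can never under-allocate.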
```python
doc = doc_raw[i].value.decode("utf-8", errors="ignore")
meta_bytes = meta_raw[i].value
meta = meta_bytes.decode("utf-8", errors="ignore") if meta_bytes else None
emb_dim = emb_sizes[i]
```
`cactus_index_get` uses `emb_dim = emb_sizes[i]` (a `ctypes.c_size_t` instance) as a slice bound (`emb_raw[i][:emb_dim]`). Slice bounds must be plain integers; use `emb_sizes[i].value` (and likewise for the `> 0` check) to avoid a `TypeError` at runtime.
Suggested change:

```diff
-emb_dim = emb_sizes[i]
+emb_dim = emb_sizes[i].value
```
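A quick standalone demonstration of why `.value` is needed, independent of the SDK:

```python
import ctypes

n = ctypes.c_size_t(3)
data = [0.1, 0.2, 0.3, 0.4]

# A c_size_t instance is not a valid slice bound and raises TypeError.
try:
    data[:n]
    raise AssertionError("expected TypeError")
except TypeError:
    pass

# .value unwraps it to a plain int, which slices correctly.
print(data[:n.value])  # [0.1, 0.2, 0.3]
```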
```dart
final sb = StringBuffer('{"results":[');
for (var i = 0; i < resultCount; i++) {
  if (i > 0) sb.write(',');
  sb.write('{"id":${idBuffers[0][i]},"score":${scoreBuffers[0][i]}}');
}
sb.write(']}');
return sb.toString();
```
`cactusIndexQuery` returns JSON built via string concatenation. This is brittle (e.g., it requires correct escaping, and non-finite floats can break JSON). Prefer returning a Dart object encoded via `jsonEncode` to guarantee valid JSON.
Suggested change:

```diff
-final sb = StringBuffer('{"results":[');
-for (var i = 0; i < resultCount; i++) {
-  if (i > 0) sb.write(',');
-  sb.write('{"id":${idBuffers[0][i]},"score":${scoreBuffers[0][i]}}');
-}
-sb.write(']}');
-return sb.toString();
+final results = <Map<String, dynamic>>[];
+for (var i = 0; i < resultCount; i++) {
+  results.add({
+    'id': idBuffers[0][i],
+    'score': scoreBuffers[0][i],
+  });
+}
+return jsonEncode({'results': results});
```
```cpp
std::vector<int> idBuffer(static_cast<size_t>(topK));
std::vector<float> scoreBuffer(static_cast<size_t>(topK));
size_t idBufferSize = static_cast<size_t>(topK);
size_t scoreBufferSize = static_cast<size_t>(topK);
```
`nativeIndexQuery` / `nativeIndexGet` accept `topK` as `jlong` and directly size `std::vector` buffers from it. A negative or extremely large `topK` from Kotlin could cause huge allocations or wrap to a very large `size_t`. Validate `topK` bounds (e.g., `> 0` and `<=` some maximum) before allocating.
Suggested change:

```diff
-std::vector<int> idBuffer(static_cast<size_t>(topK));
-std::vector<float> scoreBuffer(static_cast<size_t>(topK));
-size_t idBufferSize = static_cast<size_t>(topK);
-size_t scoreBufferSize = static_cast<size_t>(topK);
+// Validate topK before using it to size buffers
+if (topK <= 0 || topK > static_cast<jlong>(DEFAULT_BUFFER_SIZE)) {
+    env->ReleaseFloatArrayElements(embedding, embData, JNI_ABORT);
+    release_jstring(env, optionsJson, options);
+    throwCactusException(env, "Invalid topK: must be between 1 and 65536");
+    return nullptr;
+}
+size_t kSize = static_cast<size_t>(topK);
+std::vector<int> idBuffer(kSize);
+std::vector<float> scoreBuffer(kSize);
+size_t idBufferSize = kSize;
+size_t scoreBufferSize = kSize;
```
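The same guard translates to any binding layer. A Python sketch of validating before allocating; `MAX_TOP_K`, `alloc_result_buffers`, and the 65536 cap are all illustrative assumptions, not SDK symbols.

```python
import ctypes

MAX_TOP_K = 65536  # illustrative cap; any sane upper bound works

def alloc_result_buffers(top_k):
    # Reject non-positive or oversized values before sizing native buffers.
    if not 0 < top_k <= MAX_TOP_K:
        raise ValueError(f"Invalid topK: {top_k}")
    return (ctypes.c_int * top_k)(), (ctypes.c_float * top_k)()

ids, scores = alloc_result_buffers(10)
print(len(ids), len(scores))  # 10 10
```

Without the check, a caller-supplied `top_k` of `-1` or `10**12` would reach the allocation directly, which is exactly the failure mode the comment describes for the JNI path.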
```python
if pcm_data is not None:
    if isinstance(pcm_data, bytes):
        pcm_arr = (ctypes.c_uint8 * len(pcm_data)).from_buffer_copy(pcm_data)
    else:
        pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)
    pcm_ptr = ctypes.cast(pcm_arr, ctypes.POINTER(ctypes.c_uint8))
    pcm_len = len(pcm_data)
```

```python
_lib.cactus_detect_language(
    model,
    audio_path.encode() if isinstance(audio_path, str) else audio_path,
    buf, len(buf),
    options_json.encode() if options_json else None,
    pcm_ptr, pcm_len
)
```
`pcm_arr = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)` expands `pcm_data` into thousands of positional arguments, which is extremely slow for real audio buffers. Use `from_buffer_copy(pcm_data)` when `pcm_data` is `bytes`/`bytearray` (or wrap with `memoryview`) to avoid the argument explosion.
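The two constructions produce identical buffers, which a small self-contained snippet can verify; only the star-expansion path pays the per-byte argument cost.

```python
import ctypes

pcm_data = bytes(range(16)) * 2  # 32 bytes of stand-in PCM audio

# Star-expansion passes every byte as its own positional argument:
slow = (ctypes.c_uint8 * len(pcm_data))(*pcm_data)

# from_buffer_copy copies the whole buffer in a single call:
fast = (ctypes.c_uint8 * len(pcm_data)).from_buffer_copy(pcm_data)

assert bytes(slow) == bytes(fast) == pcm_data
```

For a few seconds of 16 kHz audio, `pcm_data` is tens of thousands of bytes, so the star-expansion path builds a call with that many arguments per invocation.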
```diff
 buf = (ctypes.c_float * 4096)()
 dim = ctypes.c_size_t()
-_lib.cactus_embed(
-    model,
-    text.encode() if isinstance(text, str) else text,
-    buf, ctypes.sizeof(buf), ctypes.byref(dim), normalize
-)
+rc = _lib.cactus_embed(model, _enc(text), buf, 4096, ctypes.byref(dim), normalize)
+if rc < 0:
+    raise RuntimeError(_err("Embedding failed"))
```
`cactus_embed` expects `buffer_size` in bytes (see the C FFI: it compares `embeddings.size() * sizeof(float)` to `buffer_size`). Passing 4096 here under-allocates (only 4KB), and will fail for larger embedding dims. Pass `ctypes.sizeof(buf)` (or `4096 * ctypes.sizeof(ctypes.c_float)`) instead of 4096.
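The element-count vs byte-size distinction is easy to check standalone:

```python
import ctypes

buf = (ctypes.c_float * 4096)()

# len() counts elements; sizeof() counts bytes -- a 4x difference for c_float.
print(len(buf), ctypes.sizeof(buf))  # 4096 16384
```

Passing `4096` where the callee expects bytes tells it the buffer holds only 1024 floats, a quarter of the space actually allocated.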
```diff
 def cactus_image_embed(model, image_path):
-    """
-    Get image embeddings from a VLM.
-
-    Args:
-        model: Model handle from cactus_init
-        image_path: Path to image file
-
-    Returns:
-        List of floats representing the image embedding vector.
-    """
+    """Generates an image embedding. Returns list of floats."""
     buf = (ctypes.c_float * 4096)()
     dim = ctypes.c_size_t()
-    _lib.cactus_image_embed(
-        model,
-        image_path.encode() if isinstance(image_path, str) else image_path,
-        buf, ctypes.sizeof(buf), ctypes.byref(dim)
-    )
+    rc = _lib.cactus_image_embed(model, _enc(image_path), buf, 4096, ctypes.byref(dim))
+    if rc < 0:
+        raise RuntimeError(_err("Image embedding failed"))
```
Same issue as `cactus_embed`: `cactus_image_embed`'s `buffer_size` parameter is in bytes. Passing 4096 is only 4KB and can cause -2 (buffer too small) for higher-dimensional embeddings. Use `ctypes.sizeof(buf)` (or `4096 * sizeof(float)`).
```diff
 def cactus_audio_embed(model, audio_path):
-    """
-    Get audio embeddings from a Whisper model.
-
-    Args:
-        model: Whisper model handle from cactus_init
-        audio_path: Path to audio file (WAV format)
-
-    Returns:
-        List of floats representing the audio embedding vector.
-    """
+    """Generates an audio embedding. Returns list of floats."""
     buf = (ctypes.c_float * 4096)()
     dim = ctypes.c_size_t()
-    _lib.cactus_audio_embed(
-        model,
-        audio_path.encode() if isinstance(audio_path, str) else audio_path,
-        buf, ctypes.sizeof(buf), ctypes.byref(dim)
-    )
+    rc = _lib.cactus_audio_embed(model, _enc(audio_path), buf, 4096, ctypes.byref(dim))
+    if rc < 0:
+        raise RuntimeError(_err("Audio embedding failed"))
```
Same bytes-vs-floats issue: `cactus_audio_embed` expects `buffer_size` in bytes. Passing 4096 will under-report the available space and can fail for larger embedding dimensions. Use `ctypes.sizeof(buf)` (or `4096 * sizeof(float)`).
```python
n = id_size.value
parts = [f'{{"id":{id_buffer[i]},"score":{score_buffer[i]}}}' for i in range(n)]
return '{"results":[' + ",".join(parts) + ']}'
```
`cactus_index_query` manually constructs a JSON string. Besides string-escaping issues, float values like `nan`/`inf` would serialize as `nan`/`inf` via `str(...)`, which is not valid JSON. Prefer returning structured data via `json.dumps(...)` to ensure correct encoding.
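The non-finite-float pitfall can be reproduced standalone: an f-string serializes NaN as the bare token `nan`, which no strict JSON parser accepts, while `json.dumps(..., allow_nan=False)` rejects the value up front.

```python
import json

score = float("nan")

# Manual string building emits invalid JSON for NaN scores:
manual = f'{{"results":[{{"id":1,"score":{score}}}]}}'
try:
    json.loads(manual)  # bare lowercase `nan` is not valid JSON
    raise AssertionError("expected JSONDecodeError")
except json.JSONDecodeError:
    pass

# json.dumps with allow_nan=False surfaces the bad value explicitly instead:
try:
    json.dumps({"results": [{"id": 1, "score": score}]}, allow_nan=False)
except ValueError:
    pass

print(json.dumps({"results": [{"id": 1, "score": 0.93}]}))
```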
```swift
var sb = "{\"results\":["
for i in 0..<idBufferSize {
    if i > 0 { sb += "," }
    sb += "{\"id\":\(idBuffer[i]),\"score\":\(scoreBuffer[i])}"
}
sb += "]}"
return sb
```
`cactusIndexQuery` returns JSON assembled via string concatenation. This is error-prone (escaping, non-finite floats) and duplicates JSON-formatting logic across SDKs. Prefer returning structured results and encoding with `JSONSerialization` to guarantee valid JSON.
Suggested change:

```diff
-var sb = "{\"results\":["
-for i in 0..<idBufferSize {
-    if i > 0 { sb += "," }
-    sb += "{\"id\":\(idBuffer[i]),\"score\":\(scoreBuffer[i])}"
-}
-sb += "]}"
-return sb
+var resultsArray: [[String: Any]] = []
+resultsArray.reserveCapacity(idBufferSize)
+for i in 0..<idBufferSize {
+    let score = scoreBuffer[i]
+    if !score.isFinite {
+        throw _err("Index query returned non-finite score")
+    }
+    resultsArray.append([
+        "id": idBuffer[i],
+        "score": score
+    ])
+}
+let jsonObject: [String: Any] = ["results": resultsArray]
+let data = try JSONSerialization.data(withJSONObject: jsonObject, options: [])
+guard let jsonString = String(data: data, encoding: .utf8) else {
+    throw _err("Failed to encode index query results as UTF-8 JSON")
+}
+return jsonString
```