Conversation
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: Roman Shemet <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
Signed-off-by: justinl66 <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR introduces native C++ telemetry functionality to track model initialization, completion, and transcription events. The telemetry system collects performance metrics and optionally uploads them to a cloud backend, with support for local caching and environment-based configuration.
Changes:
- Added comprehensive telemetry infrastructure with support for completion, transcription, and streaming events
- Integrated telemetry recording at key points (init, complete, transcribe operations)
- Added CLI flags to disable cloud telemetry across all commands
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| third_party/supabase-backend | Added subproject for cloud backend integration |
| third_party/cactus-react-native | Added subproject dependency |
| cactus/telemetry/telemetry.h | New telemetry API header defining metrics structures and recording functions |
| cactus/telemetry/telemetry.cpp | Core telemetry implementation with cloud upload, caching, and event tracking |
| cactus/ffi/cactus_utils.h | Extended options parsing to support telemetry_enabled flag |
| cactus/ffi/cactus_init.cpp | Added telemetry initialization and recording at model load |
| cactus/ffi/cactus_complete.cpp | Integrated telemetry recording for completion operations |
| cactus/ffi/cactus_transcribe.cpp | Added telemetry recording for transcription events |
| cactus/ffi/cactus_stream.cpp | Integrated streaming transcription telemetry with session tracking |
| cactus/ffi/cactus_telemetry_bridge.cpp | New bridge file for FFI telemetry integration |
| cactus/CMakeLists.txt | Added telemetry sources to build configuration |
| tests/test_engine.cpp | Updated test options to explicitly disable telemetry |
| tests/chat.cpp | Added telemetry recording in chat test |
| tests/asr.cpp | Added telemetry recording and error tracking in ASR tests |
| python/src/cli.py | Added --no-cloud-tele flag to CLI commands |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| static std::atomic<bool> atexit_registered{false}; | ||
| static std::atomic<bool> cloud_disabled{false}; | ||
| static std::string supabase_url = "https://ivzeouvbwsnepwojsjya.supabase.co"; | ||
| static std::string supabase_key = "sb_publishable_hgsggXPMkAsEuyhQE_bwuQ_ouLN3rcc"; |
There was a problem hiding this comment.
A publicly visible API key or credential is hardcoded in the source code. Even if this is a 'publishable' key, it should be loaded from an environment variable or configuration file to prevent unauthorized use and facilitate key rotation.
| static std::string supabase_key = "sb_publishable_hgsggXPMkAsEuyhQE_bwuQ_ouLN3rcc"; | |
| static std::string supabase_key = []() { | |
| const char* env_key = std::getenv("SUPABASE_KEY"); | |
| if (env_key && env_key[0] != '\0') { | |
| return std::string(env_key); | |
| } | |
| return std::string(); | |
| }(); |
| oss << std::setw(8) << ((a >> 32) & 0xffffffffULL); | ||
| oss << "-" << std::setw(4) << ((a >> 16) & 0xffffULL); | ||
| oss << "-" << std::setw(4) << (a & 0xffffULL); | ||
| oss << "-" << std::setw(4) << ((b >> 48) & 0xffffULL); | ||
| oss << "-" << std::setw(12) << (b & 0xffffffffffffULL); |
There was a problem hiding this comment.
The UUID generation uses std::setw without ensuring hex formatting is explicitly set for all components. While line 396 sets hex mode, explicit hex formatting should be applied consistently for clarity and correctness.
| #include "cactus_ffi.h" | ||
| #include "telemetry/telemetry.h" | ||
|
|
||
| extern "C" { |
There was a problem hiding this comment.
This file contains an empty extern C block with no implementation. If the bridge is not yet implemented, consider adding a comment explaining its future purpose or removing the file until it's needed.
| extern "C" { | |
| extern "C" { | |
| // Telemetry-related C FFI functions will be exposed here in the future. | |
| // This bridge is intentionally left without exported symbols for now. |
| const char* home = getenv("HOME"); | ||
| if (!home) home = "/tmp"; |
There was a problem hiding this comment.
The fallback to /tmp when HOME is not set may not be appropriate for all platforms. Consider using platform-specific APIs (e.g., NSHomeDirectory on macOS, SHGetKnownFolderPath on Windows) or std::filesystem::temp_directory_path for a more portable solution.
| + std::to_string(MAX_TOKENS) | ||
| + ",\"stop_sequences\":[\"<|im_end|>\",\"<end_of_turn>\"]}"; |
There was a problem hiding this comment.
The indentation changed from spaces to fewer spaces, creating inconsistency with the surrounding code style. Maintain consistent indentation throughout the concatenated string construction.
|
|
||
| curl_global_init(CURL_GLOBAL_DEFAULT); | ||
| if (!atexit_registered.exchange(true)) { | ||
| std::atexit([](){ shutdown(); }); |
There was a problem hiding this comment.
The atexit registration uses exchange(true) but doesn't handle potential race conditions if multiple threads call init() simultaneously. Consider protecting the entire initialization sequence with a mutex or using std::call_once for thread-safe one-time initialization.
| handle->previous_transcription = response; | ||
| handle->previous_audio_buffer_size = handle->audio_buffer.size(); | ||
| } | ||
|
|
There was a problem hiding this comment.
The magic numbers 20000 tokens and 600000 ms (10 minutes) lack documentation explaining why these specific thresholds were chosen for stream telemetry reporting. Add comments describing the rationale.
| // Telemetry batching caps: | |
| // - STREAM_TOKENS_CAP limits the number of tokens included in a single | |
| // telemetry interval. 20k tokens was chosen to balance granularity of | |
| // per-stream metrics against overhead from emitting too many events. | |
| // - STREAM_DURATION_CAP_MS limits the wall-clock length of a telemetry | |
| // interval. 600000 ms (10 minutes) ensures long-lived streams still | |
| // emit periodic metrics even if token throughput is low. |
| } | ||
|
|
||
| std::cout << colored("\n👋 Goodbye!\n", Color::MAGENTA + Color::BOLD); | ||
| if (result != 0) { |
There was a problem hiding this comment.
The telemetry recording on error happens after the 'Goodbye' message is printed but before cactus_destroy. If result is positive (success) but not exactly 0, this incorrectly records a failure. The condition should be result < 0 to match the actual failure case.
| if (result != 0) { | |
| if (result < 0) { |
Signed-off-by: justinl66 <[email protected]>
* Add basic telemetry setup Signed-off-by: justinl66 <[email protected]> * Add init error logging Signed-off-by: justinl66 <[email protected]> * Update device + project ID Signed-off-by: justinl66 <[email protected]> * Align telemetry POSTs to supabase schema Signed-off-by: justinl66 <[email protected]> * Fixing 35-char UUID format bug to make it the correct 36 Signed-off-by: Roman Shemet <[email protected]> * Update telemetry Signed-off-by: justinl66 <[email protected]> * Fix transcription telemetry event Signed-off-by: justinl66 <[email protected]> * Update telemetry to work with new RLS policy Signed-off-by: justinl66 <[email protected]> * Update response time logging Signed-off-by: justinl66 <[email protected]> * Fix transcribe events Signed-off-by: justinl66 <[email protected]> * Fix init error logging Signed-off-by: justinl66 <[email protected]> * Fix token double count during stream transcription Signed-off-by: justinl66 <[email protected]> * Update telemetry with embedding logs Signed-off-by: justinl66 <[email protected]> * Fix streaming transcription telemetry Signed-off-by: justinl66 <[email protected]> * Remove stray third_party folder Signed-off-by: justinl66 <[email protected]> --------- Signed-off-by: justinl66 <[email protected]> Signed-off-by: Roman Shemet <[email protected]> Co-authored-by: Roman Shemet <[email protected]>
* Add basic telemetry setup Signed-off-by: justinl66 <[email protected]> * Add init error logging Signed-off-by: justinl66 <[email protected]> * Update device + project ID Signed-off-by: justinl66 <[email protected]> * Align telemetry POSTs to supabase schema Signed-off-by: justinl66 <[email protected]> * Fixing 35-char UUID format bug to make it the correct 36 Signed-off-by: Roman Shemet <[email protected]> * Update telemetry Signed-off-by: justinl66 <[email protected]> * Fix transcription telemetry event Signed-off-by: justinl66 <[email protected]> * Update telemetry to work with new RLS policy Signed-off-by: justinl66 <[email protected]> * Update response time logging Signed-off-by: justinl66 <[email protected]> * Fix transcribe events Signed-off-by: justinl66 <[email protected]> * Fix init error logging Signed-off-by: justinl66 <[email protected]> * Fix token double count during stream transcription Signed-off-by: justinl66 <[email protected]> * Update telemetry with embedding logs Signed-off-by: justinl66 <[email protected]> * Fix streaming transcription telemetry Signed-off-by: justinl66 <[email protected]> * Remove stray third_party folder Signed-off-by: justinl66 <[email protected]> --------- Signed-off-by: justinl66 <[email protected]> Signed-off-by: Roman Shemet <[email protected]> Co-authored-by: Roman Shemet <[email protected]>
@rshemet