fix(analytics): persist unique CLI install IDs #1348

Merged
davincios merged 7 commits into main from feature/unique-users
May 5, 2026

@davincios

Summary

  • Persist a stable anonymous CLI install ID and use it as the PostHog distinct ID for unique-user counting.
  • Add local analytics event logging/error handling and deterministic insert IDs for one-time lifecycle events.
  • Update docs and tests for the new analytics storage, opt-out, and migration behavior.

Test plan

  • make lint
  • make format-check
  • make typecheck
  • make test-cov

Made with Cursor

@mintlify

mintlify Bot commented May 5, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project: tracer
Status: 🟢 Ready
Updated (UTC): May 5, 2026, 9:20 PM


Comment thread app/analytics/provider.py Fixed
Comment thread app/analytics/provider.py Fixed
Comment thread app/analytics/provider.py Fixed
Comment thread app/analytics/provider.py Fixed
@greptile-apps

greptile-apps Bot commented May 5, 2026

Greptile Summary

This PR persists a stable anonymous CLI install ID to ~/.config/opensre/anonymous_id (migrating from the old ~/.opensre layout), uses it as the PostHog distinct_id for unique-user counting, and adds deterministic $insert_id values for one-time lifecycle events to prevent duplicate rows on retry.
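The deduplication idea in this summary can be sketched as follows. This is a minimal illustration, not the PR's code: the event name in `ONE_TIME_EVENTS`, the namespace string, the use of `uuid.uuid5`, and the choice to carry `$insert_id` inside the properties dict are all assumptions made for the example.

```python
import uuid
from typing import Optional

# Hypothetical one-time lifecycle event; the real set is defined in the PR.
ONE_TIME_EVENTS = frozenset({"install_detected"})

# Hypothetical namespace so derived IDs do not collide with other uuid5 users.
_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "analytics.example.invalid")


def event_insert_id(event: str, anonymous_id: str) -> Optional[str]:
    """Return a stable ID for one-time events and None for everything else."""
    if event not in ONE_TIME_EVENTS:
        return None
    # Same (event, install) pair always hashes to the same UUID, so a retried
    # send carries the same insert ID as the first attempt.
    return str(uuid.uuid5(_NAMESPACE, f"{event}:{anonymous_id}"))


def build_properties(event: str, anonymous_id: str, props: dict) -> dict:
    """Copy props and attach $insert_id only when the event is one-time."""
    merged = dict(props)
    insert_id = event_insert_id(event, anonymous_id)
    if insert_id is not None:
        merged["$insert_id"] = insert_id
    return merged
```

Because the ID depends only on the event name and the install ID, a retry after a timeout produces the same value, which is what lets the backend drop the duplicate.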

  • Anonymous ID persistence: atomic write with os.replace, cross-process file-lock, in-process cache under threading.Lock, and legacy-path migration with a new USER_ID_LOAD_FAILED PostHog event for diagnostics.
  • Event log infrastructure: _log_event_line, _log_debug_line, and _log_failure write to posthog_events.txt with line-cap rotation — but _log_event_line is never called from _send(), so actual events are never logged despite the README promise.
  • Config-dir migration: OPENSRE_HOME_DIR moved from ~/.opensre to ~/.config/opensre throughout constants, guardrails, integrations, docs, and tests.
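The persistence pattern described in the first bullet might look roughly like the sketch below. It assumes the ID is a UUID stored as a single line of text, takes both paths as parameters rather than hard-coding them, and leaves out the cross-process file lock for brevity. The function names are illustrative and are not the ones in `provider.py`.

```python
import os
import tempfile
import threading
import uuid
from pathlib import Path
from typing import Optional

_cache_lock = threading.Lock()
_cached_id: Optional[str] = None


def _read_valid_uuid(path: Path) -> Optional[str]:
    """Return the UUID stored at path, or None if missing or malformed."""
    try:
        return str(uuid.UUID(path.read_text(encoding="utf-8").strip()))
    except (OSError, ValueError):
        return None


def _atomic_write(path: Path, value: str) -> None:
    """Write to a temp file in the target directory, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".anonymous_id.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(value + "\n")
        os.replace(tmp, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def get_or_create_anonymous_id(current: Path, legacy: Path) -> str:
    """Resolve the install ID: in-process cache, current path, legacy path, then a new UUID."""
    global _cached_id
    with _cache_lock:
        if _cached_id is not None:
            return _cached_id
        value = _read_valid_uuid(current)
        if value is None:
            # Migrate a valid legacy ID if one exists; otherwise mint a new one.
            value = _read_valid_uuid(legacy) or str(uuid.uuid4())
            _atomic_write(current, value)
        _cached_id = value
        return value
```

Writing the temp file into the same directory as the target matters: `os.replace` is only atomic when source and destination are on the same filesystem.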

Confidence Score: 3/5

The config-dir migration and ID-persistence machinery are well-engineered and safe; the event log feature ships silently broken.

The anonymous-ID persistence, file-lock, atomic writes, and legacy migration all look correct. The main problem is that _log_event_line is defined but never invoked from _send(), so posthog_events.txt will only contain debug/failure lines rather than the actual events described in the updated README. Secondary concerns are the dead _Envelope.insert_id field and a line counter that drifts when writes are suppressed.
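To illustrate the line-counter concern, here is one hedged way to write a line-capped log whose counter advances only after a write actually succeeds. The class name, the default cap, and the "keep the newest half" rotation policy are assumptions for the example and are not taken from the PR.

```python
from pathlib import Path


class CappedLineLog:
    """Append-only text log that holds at most max_lines lines."""

    def __init__(self, path: Path, max_lines: int = 1000) -> None:
        if max_lines < 2:
            raise ValueError("max_lines must be at least 2")
        self.path = path
        self.max_lines = max_lines
        self._lines = self._count_existing()

    def _count_existing(self) -> int:
        """Count lines already on disk; a missing or unreadable file counts as 0."""
        try:
            with self.path.open("r", encoding="utf-8") as handle:
                return sum(1 for _ in handle)
        except OSError:
            return 0

    def _rotate(self) -> None:
        """Drop the oldest lines, keeping the newest half of the cap."""
        lines = self.path.read_text(encoding="utf-8").splitlines(keepends=True)
        kept = lines[-(self.max_lines // 2):]
        self.path.write_text("".join(kept), encoding="utf-8")
        self._lines = len(kept)

    def write(self, line: str) -> bool:
        """Append one line; on failure return False and leave the counter untouched."""
        try:
            if self._lines >= self.max_lines:
                self._rotate()
            with self.path.open("a", encoding="utf-8") as handle:
                handle.write(line.rstrip("\n") + "\n")
        except OSError:
            return False  # suppressed so logging can never break the CLI
        self._lines += 1  # reached only when the append succeeded
        return True
```

Incrementing after the `try` block rather than inside it keeps the in-memory count aligned with the file even when the disk is full or the directory is missing.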

app/analytics/provider.py — specifically the _send method, the unused _log_event_line function, and the _Envelope dataclass.

Important Files Changed

  • app/analytics/provider.py: Core analytics module, heavily refactored with atomic writes, file locking, legacy migration, event log, and failure breadcrumbs. However, _log_event_line is never called, _Envelope.insert_id is dead code, and the line counter increments on suppressed write failures.
  • tests/analytics/test_provider.py: Extensive new tests for ID persistence, concurrency, legacy migration, insert IDs, and event log placement; the autouse fixture omits the event log global state reset.
  • app/constants/__init__.py: Renames OPENSRE_HOME_DIR to ~/.config/opensre and introduces LEGACY_OPENSRE_HOME_DIR for migration.
  • app/analytics/events.py: Adds the USER_ID_LOAD_FAILED lifecycle event for diagnostics.

Sequence Diagram

sequenceDiagram
    participant CLI
    participant Analytics
    participant _get_or_create_anonymous_id
    participant _compute_anonymous_identity
    participant Disk

    CLI->>Analytics: Analytics()
    Analytics->>_get_or_create_anonymous_id: _get_or_create_anonymous_id()
    _get_or_create_anonymous_id->>_compute_anonymous_identity: under _anonymous_id_lock
    _compute_anonymous_identity->>Disk: check _CONFIG_DIR and _FIRST_RUN_PATH
    alt anonymous_id file exists and valid
        Disk-->>_compute_anonymous_identity: return existing UUID
    else legacy path exists and valid
        Disk-->>_compute_anonymous_identity: migrate legacy UUID to new path
    else no existing ID
        _compute_anonymous_identity->>Disk: _write_new_anonymous_id under file lock
        Disk-->>_compute_anonymous_identity: AnonymousIdentity disk
    end
    _compute_anonymous_identity-->>_get_or_create_anonymous_id: _AnonymousIdentity distinct_id persistence
    _get_or_create_anonymous_id-->>Analytics: cache distinct_id and persistence
    Analytics->>Analytics: _pop_user_id_load_failures capture USER_ID_LOAD_FAILED
    Analytics-->>CLI: ready

    CLI->>Analytics: capture event props
    Analytics->>Analytics: _Envelope event props new_insert_id
    Analytics->>Analytics: _worker_loop _send
    Analytics->>Disk: POST /capture/ with distinct_id and $insert_id for one-time events
    Note over Analytics,Disk: _log_event_line never called here is a bug

Comments Outside Diff (1)

  1. app/analytics/provider.py, lines 57-61

    P2 _Envelope.insert_id field is dead code

    The insert_id field is populated in capture() via _new_insert_id() (a random UUID), but _send() never reads item.insert_id. Instead, _send independently computes _event_insert_id(item.event, self._anonymous_id) which returns a deterministic string for one-time events and None otherwise. The envelope's insert_id is allocated and discarded on every capture() call. Either the field should be removed, or _send should fall back to item.insert_id for non-one-time events if per-event deduplication is desired.
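The second option in this comment, falling back to the envelope's own ID, could be sketched as below. `Envelope` and `resolve_insert_id` are hypothetical stand-ins for `_Envelope` and the logic inside `_send`; the deterministic lookup is passed in as a callable so the example stays self-contained.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Envelope:
    """Queued analytics event; insert_id is a random UUID assigned at capture time."""

    event: str
    properties: dict
    insert_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def resolve_insert_id(
    item: Envelope,
    deterministic: Callable[[str], Optional[str]],
) -> str:
    """Prefer the deterministic ID for one-time events, else use the envelope's ID."""
    return deterministic(item.event) or item.insert_id
```

With this shape the per-capture UUID is no longer discarded: retries of the same queued envelope reuse its ID, while one-time events still collapse to a single deterministic value. If per-event deduplication is not wanted, removing the field is the simpler fix.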

Reviews (1): Last reviewed commit: "fix(analytics): persist unique CLI insta..."

Comment thread app/analytics/provider.py
Comment thread app/analytics/provider.py Outdated
Comment thread tests/analytics/test_provider.py
davincios and others added 6 commits May 5, 2026 22:23
Emit CLI and funnel analytics only after commands are accepted or flow milestones truly occur, so dashboard funnels do not count help, parse failures, templates, or failed verifications as product progress.

Co-authored-by: Cursor <[email protected]>
@davincios davincios merged commit 9cb279f into main May 5, 2026
14 checks passed
@davincios davincios deleted the feature/unique-users branch May 5, 2026 22:26
@github-actions

github-actions Bot commented May 5, 2026

🎲 Researchers are baffled. @davincios opened a PR, got it reviewed without drama, and merged clean. This violates known laws of open source. 🔬


