Releases: jstuart0/agentpulse

v0.4.0-pre.2 — Generic forwardauth abstraction

05 May 17:06


Generic forwardauth abstraction — AgentPulse SSO is no longer hard-coded to Authentik. Operators can now wire it to any forwardauth-capable identity provider via env config alone. Zero migration burden for existing Authentik deployments.

What changed

Before: AGENTPULSE_AUTHENTIK_TRUST_SECRET, hard-coded X-Authentik-* header reads, hard-coded signOutUrl: /outpost.goauthentik.io/sign_out, UI strings hard-coded to "Authentik".

After: FORWARDAUTH_TRUST_SECRET, FORWARDAUTH_PROVIDER, eight FORWARDAUTH_HEADER_* slots (each defaults to its Authentik header name), provider-aware signOutUrl logic, UI label derived from FORWARDAUTH_PROVIDER via a formatProviderLabel helper.
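The slot-with-default pattern can be sketched as follows. This is an illustrative shape only — the slot names, default table, and getter signature are assumptions, not AgentPulse's actual config API:

```typescript
// Hypothetical sketch: each header slot reads FORWARDAUTH_HEADER_<SLOT> and
// falls back to its Authentik header name, which is why existing Authentik
// deployments need no config changes.
const AUTHENTIK_DEFAULTS: Record<string, string> = {
  user: "X-Authentik-Username",
  email: "X-Authentik-Email",
  groups: "X-Authentik-Groups",
  name: "X-Authentik-Name",
};

function forwardauthHeader(
  slot: string,
  env: Record<string, string | undefined>, // pass process.env in real use
): string {
  return env[`FORWARDAUTH_HEADER_${slot.toUpperCase()}`] ?? AUTHENTIK_DEFAULTS[slot] ?? "";
}
```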

Backward compatibility (read this if you're upgrading)

You don't need to do anything. All eight header slots default to the Authentik header names. FORWARDAUTH_PROVIDER defaults to "authentik". AGENTPULSE_AUTHENTIK_TRUST_SECRET continues to work as a deprecated alias for one release (a boot warning prints when the legacy env is set without the new one). The k8s Middleware agentpulse-strip-client-authentik continues to exist as a duplicate of the renamed agentpulse-strip-client-forwardauth so external overlays referencing the old name continue to work.

The deprecation alias and legacy Middleware resource will be removed in the next release boundary.

How to switch to a different IdP

See deploy/k8s/FORWARDAUTH.md (renamed from AUTHENTIK-FORWARDAUTH.md). Provider-specific sub-sections cover:

  • Authentik (default; current homelab setup)
  • Authelia — Remote-User, Remote-Email, Remote-Groups, Remote-Name
  • oauth2-proxy — X-Forwarded-User, X-Forwarded-Email, X-Forwarded-Groups
  • Pomerium — X-Pomerium-Claim-*
  • Cloudflare Access — Cf-Access-Authenticated-User-Email (note: Cloudflare's auth model has no separate verify header; the doc explains the limitation)

Each section gives the env values, the Traefik middleware adjustments needed, and the gotchas.

Implementation summary

Three phases (mozart-orchestrated, STANDARD tier):

  • Phase 1 backend (ba4ee61) — config.forwardauthHeader(slot), config.forwardauthProvider, config.forwardauthTrustSecret getters with Authentik defaults; middleware reads configurable header names; /auth/me dual-emits source: "forwardauth" + new provider field; deprecation warning when only legacy env is set; 20 new unit tests covering configurable headers, default fallback, trust gate, deprecated alias, signOutUrl provider selection.
  • Phase 2 frontend (78fdf79) — new formatProviderLabel(provider) helper handles known special cases (oauth2-proxy → "OAuth2-Proxy", cloudflare/cloudflare-access → "Cloudflare Access"); generic capitalize-first fallback. TopBar/Layout read user.provider with a fallback to source === "authentik" for cached pre-Phase-1 responses. 9 new helper tests.
  • Phase 3 manifests + docs + guard (e6b7085) — k8s Middleware renamed with one-release alias; secret data key renamed to FORWARDAUTH_TRUST_SECRET; deployment binds both env vars to the same secret key (operators rotate once); AUTHENTIK-FORWARDAUTH.md → FORWARDAUTH.md restructured with five provider sub-sections; README + CLAUDE.md + RUNBOOK generalised; new scripts/check-no-authentik-literals.ts architecture guard prevents new X-Authentik-* literals from sneaking into src/.
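The Phase 2 label helper can be sketched like this. The special-case list follows these notes; the normalization details (trim, lowercase key) are assumptions:

```typescript
// Sketch of formatProviderLabel: known providers get hand-tuned labels,
// everything else gets a capitalize-first fallback.
const SPECIAL_LABELS: Record<string, string> = {
  "oauth2-proxy": "OAuth2-Proxy",
  cloudflare: "Cloudflare Access",
  "cloudflare-access": "Cloudflare Access",
};

function formatProviderLabel(provider: string): string {
  const key = provider.trim().toLowerCase();
  return SPECIAL_LABELS[key] ?? key.charAt(0).toUpperCase() + key.slice(1);
}
```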

Verification

  • SQLite tests: 808 pass / 14 skip / 0 fail (was 779/14/0 — gained 29 tests).
  • All architecture guards green: bun-sqlite-only-apis, legacy-schema-import, no-sync-transcript-io, installer-paths, no-authentik-literals (new).
  • TypeScript clean. Biome 0 errors.
  • Manifests render: kubectl kustomize deploy/k8s/ produces 587 lines, no errors.

Plane ticket: AGEN-4.

v0.4.0-pre.1 — Postgres backend + OIDC trust gate completion

05 May 16:18


PostgreSQL backend at portable parity — the headline feature of v0.4.0, plus four post-rollout deploy/auth fixes that surfaced when this image first landed on a real Kubernetes cluster.

Postgres backend (the v0.4.0 work)

Set DATABASE_URL=postgres://user:pw@host:5432/db to opt into Postgres. SQLite remains the default for OSS quickstart and single-machine use.

  • Drizzle dual-dialect schema split with column-factory pattern; per-dialect drizzle-kit baselines committed at drizzle/sqlite/0000_*.sql and drizzle/postgres/0000_*.sql
  • pg_advisory_lock-serialized boot for safe k8s rolling deploys
  • PostgresSearchBackend — ILIKE parity for /api/v1/search
  • withTransaction() helper that bridges bun-sqlite's sync rollback semantics with Postgres's native async transactions
  • GHA two-job CI: test-sqlite (always) + test-postgres (push to main/dev with postgres:16-alpine service container)
  • New deploy/overlays/postgres/ kustomize overlay for opting into Postgres

7 follow-up plan stubs filed for pg-superior items (pgvector, JSONB raw_payload + GIN, LISTEN/NOTIFY, tsvector/pg_trgm search, FOR UPDATE SKIP LOCKED, deterministic search rank, sqlite-legacy-init removal).

Post-v0.4.0 fixes (this prerelease)

These four issues only appeared during actual Kubernetes deployment against a real Authentik/Traefik stack — the campaign itself shipped clean against the test suite. Operators upgrading from any pre-postgres version should apply this patch before deploying v0.4.0 in production.

  • .dockerignore exception for deploy/k8s/scripts/ — the deploy/ exclusion prevented scripts/build-and-push.sh from building the backup-sidecar image. Added !deploy/k8s/scripts/ exception. (a88d289)
  • backup-sidecar cpu request raised from 10m to 50m — the request fell below the new LimitRange floor; pods were rejected at admission. (a88d289)
  • AGENTPULSE_PG_POOL_MAX secretKeyRef marked optional: true — operators upgrading from pre-postgres installs got CreateContainerConfigError on every pod start. (a88d289)
  • AGENTPULSE_AUTHENTIK_TRUST_SECRET wired into base deployment — the v0.3.0 trust gate added the in-process verification but never wired the env var binding. (b0f16ea)
  • authRouter moved to root app — Hono cascades sibling routers' requireAuth() wildcards across the parent router; /api/v1/auth/me was returning 401, breaking the login UI. Mirrors the existing cspReportRouter/telegramWebhookRouter pattern. (403a4d5)
  • OIDC trust gate completed end-to-end — the v0.3.0 design left header injection unimplemented. New agentpulse-inject-verify Traefik middleware injects the shared secret after forwardauth passes (Authentik property mappings populate JWT claims, not forwardauth response headers). Wired as the third step in the protected-route chain (strip → forwardauth → inject-verify). (dc94356)
  • Public-route bypasses added to IngressRoute — /api/v1/ready, /assets/*, and /api/v1/auth/{me,login,logout,signup} now skip forwardauth so the kubelet, the browser asset loader, and the login UI work correctly. (dc94356)

scott documentation pass (201fb6c):

  • CHANGELOG.md — full [0.4.0-pre.1] section
  • deploy/k8s/AUTHENTIK-FORWARDAUTH.md — replaced incorrect property-mapping setup with the correct Traefik middleware approach
  • deploy/k8s/README.md — added Authentik SSO setup section + public-route bypass reference
  • GitHub wiki — Deployment + Architecture pages updated

Operator upgrade notes

If you are upgrading an existing AgentPulse deployment:

  1. Generate a strong shared secret: openssl rand -hex 32
  2. Add it to your secret as AGENTPULSE_AUTHENTIK_TRUST_SECRET
  3. Patch your agentpulse-inject-verify Middleware's customRequestHeaders.X-Authentik-Verify to the same value (use a private overlay; do not commit a real secret)
  4. Verify your IngressRoute middleware chain is strip-client-authentik → forwardauth → inject-verify in that order
  5. Apply, restart the pod, refresh your browser

See deploy/k8s/AUTHENTIK-FORWARDAUTH.md for the full step-by-step.

Verification

Plane ticket: AGEN-3.

v0.2.0-pre.10 — Reliability hotfix

01 May 15:44


Pre-release

Reliability hotfix for two stacked production issues that were surfacing as user-visible "could not connect to the LLM" / "network error" messages from Ask and Telegram.

Fixed

  • Cold-load LLM timeouts. The Ask sync and streaming paths used a 60s timeout on the LLM call, which is too tight for cold-load on local 20B+ models — first-token latency runs 60–90s when a model has been evicted from GPU memory. Bumped both timeouts to 180s. Classifier timeouts (8s) stay tight on purpose so a slow gate fails fast and lets the main turn proceed. When a timeout does fire, the user-facing message is now timeout-specific ("LLM didn't respond in time — usually a cold model load. Try again in a few seconds.") instead of the generic "couldn't reach" copy.
  • Pod restart loop from over-tight liveness probes. Default Kubernetes `timeoutSeconds: 1` was triggering on `/api/v1/health` whenever the event loop was briefly busy (LLM streaming, FTS backfill). Bumped both probes to `timeoutSeconds: 5` and the liveness probe to `failureThreshold: 5`.
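The timeout split above can be sketched as follows. The two deadlines and the user-facing copy come from these notes; the error-shape check (a rejection named "TimeoutError") is an assumption about how the aborted fetch surfaces:

```typescript
// Sketch of the cold-load timeout fix: a long deadline for the main Ask turn,
// a tight one for the classifier gate, and timeout-specific user copy.
const LLM_TIMEOUT_MS = 180_000;      // main turn: must survive 60–90s cold-load first-token latency
const CLASSIFIER_TIMEOUT_MS = 8_000; // gate stays tight: fail fast, let the main turn proceed

function userFacingLlmError(err: unknown): string {
  const name = (err as { name?: string } | null)?.name;
  // AbortSignal.timeout-style aborts reject with name "TimeoutError"/"AbortError"
  return name === "TimeoutError" || name === "AbortError"
    ? "LLM didn't respond in time — usually a cold model load. Try again in a few seconds."
    : "Couldn't reach the LLM.";
}
```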

Validated in production for 41 hours before tagging: 4 restarts vs the prior ~18/day baseline; zero LLM timeouts since deploy.

v0.2.0-pre.9 — Efficiency redesign

29 Apr 04:53


Pre-release

Building on the type-discipline work in pre.8, this release consolidates the largest remaining duplication patterns across the orchestration core so that adding a new kind, gate, executor, or classifier is a one-place change instead of a five-place change.

Changed

  • Inbox card surface unified. 12 nearly-identical action-request card components collapsed into one ActionRequestCard driven by a per-kind spec table. Adding a new action kind is now one entry in one switch, with an exhaustiveness guard. The freeform-alert-rule card now has the same KindBadge / severity / busy-state polish as every other card.
  • Ask orchestrator unified across transports. The 9-gate ladder previously hand-stamped in both the sync and streaming Ask pipelines is now a single ASK_GATES table consumed by a shared runAskGates runner. New gates are one row, automatically wired into both transports; previously a new gate had to be inserted in two places in matching order.
  • Action-request executors consolidated. The thirteen executor functions (launch / archive / edit / delete / alert-rule / etc.) now share succeed, fail, and failExpired helpers that bundle the conditional-update + notify + return triple they all do, plus a runExecutor<K> frame for the simple cases. Each executor's unique logic now reads at a glance.
  • Intent classifiers consolidated. All eight intent detectors (launch / resume / session-action / template-CRUD / alert-rule / bulk-action / Q&A / add-project) now share a classifyJson<T> helper for the provider-fetch / spend-check / adapter-call / JSON-parse boilerplate. Each detector is now its system prompt plus a small parse lambda — the actual differences are immediately visible.
  • Ask message-meta unified. One shared tagged-union parser (parseAskMeta) replaces three separate server encode/extract pairs and three separate frontend parsers; the AskPage chained-&& detection logic is gone.
  • Inbox kind metadata centralized. One KIND_META table drives both the filter dropdown and the footer counters; missing kinds fail to compile. Closes a pre-existing bug where the footer was missing the add_channel counter.
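The single-table gate ladder described above can be sketched as follows. The gate shapes and example rows are assumptions, not AgentPulse's actual types; the point is that both transports consume one ordered list:

```typescript
// Sketch of the ASK_GATES pattern: each gate is one row, and the shared
// runner walks the ladder in order for both the sync and streaming paths.
type GateResult = { pass: true } | { pass: false; reason: string };
type AskGate = { name: string; check: (msg: string) => GateResult };

const ASK_GATES: AskGate[] = [
  { name: "non-empty", check: (m) => (m.trim() ? { pass: true } : { pass: false, reason: "empty message" }) },
  { name: "length", check: (m) => (m.length <= 4000 ? { pass: true } : { pass: false, reason: "too long" }) },
  // adding a new gate = adding one row here; both transports pick it up
];

function runAskGates(msg: string, gates: AskGate[] = ASK_GATES): GateResult {
  for (const gate of gates) {
    const result = gate.check(msg);
    if (!result.pass) return result;
  }
  return { pass: true };
}
```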

Fixed

  • Single canonical archive predicate. sessions.is_archived is now the only source of truth for whether a session is archived; status='archived' is retained only for backward compat. A shared isVisibleSession(s) / isArchivedSession(s) helper pair replaces a mix of isArchived = false and status !== 'archived' filters scattered across the codebase. Idempotent DB backfill at startup converts any historical row with status='archived' to is_archived = 1. The Archive badge still flips immediately on archive click. Net effect: digest counts, the operator inbox, and Ask candidate resolution no longer leak archived-but-active sessions; the transcript-sync worker no longer wastes IO polling archived agents.
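The canonical predicate pair can be sketched like this. Field names follow the notes (is_archived as the single source of truth, status retained only for compat); the row type is a simplified assumption:

```typescript
// Sketch of the shared archive predicates: everything reads is_archived,
// never status, so there is exactly one definition of "archived".
type SessionRow = { is_archived: number; status: string };

const isArchivedSession = (s: SessionRow): boolean => s.is_archived === 1;
const isVisibleSession = (s: SessionRow): boolean => !isArchivedSession(s);
```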

Behind the scenes

  • 24 files modified, 12 deleted, 3 added; −1,821 LOC net while tightening rather than loosening invariants.
  • 612/612 tests pass; typecheck clean; 0 Biome errors.

v0.2.0-pre.2 — Find any past conversation

25 Apr 14:54


The "find any past conversation" release. Three new layers stack on
top of session state so Ask actually works across compaction
boundaries and across past completed work — full-text first, then
LLM query expansion, then optional vector embeddings for true
semantic recall. No breaking changes.

Added

Full-text search (/search)

  • New SQLite FTS5 backend behind a SearchBackend interface so a
    Postgres tsvector impl can slot in later without changing
    routes or UI. Two virtual tables (sessions + events), porter +
    unicode61 tokenizer, BM25 ranking normalized to 0..1.
  • Triggers on INSERT/UPDATE/DELETE of sessions and
    events keep both indexes in sync. Event indexing filters to
    meaningful types (UserPromptSubmit, AssistantMessage,
    Stop, TaskCreated/TaskCompleted, SubagentStop,
    SessionEnd, AiProposal, AiReport, AiHitlRequest).
  • Boot-time backfill detects row-count divergence between source
    tables and FTS and re-indexes the gap in a single transaction —
    upgrades from 0.2.0-pre.1 light up retroactively without manual
    rebuild.
  • Query escaping: tokens are phrase-quoted before MATCH so inputs
    with -, :, (, etc. don't get parsed as FTS5 operators.
    New mode: "and" | "or" filter — AND default for the search
    box, OR for programmatic callers.
  • New /search page with URL-stateful filter UI (agentType,
    status, eventType, kind), <mark>-highlighted snippets, links
    back to the originating session and event.
  • New GET /api/v1/search and POST /api/v1/search/rebuild.
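The query-escaping step described above can be sketched as follows (a minimal version; the real tokenizer and stopword handling are more involved). Each token is phrase-quoted, with embedded double quotes doubled, so characters like -, :, ( read as literal text rather than FTS5 operators; space-separated phrases are an implicit AND in FTS5:

```typescript
// Sketch of phrase-quoting user input before it reaches MATCH.
function toFtsQuery(input: string, mode: "and" | "or" = "and"): string {
  const tokens = input.split(/\s+/).filter(Boolean);
  const quoted = tokens.map((t) => `"${t.replace(/"/g, '""')}"`);
  return quoted.join(mode === "and" ? " " : " OR "); // space = implicit AND in FTS5
}
```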

Semantic Ask (LLM query expansion)

  • Pluggable SemanticEnricher interface returning extraTerms
    (lexical synonyms) and directHits (sessionId → score). Vector
    enrichment populates the latter, leaves the former empty;
    LlmQueryExpander does the inverse.
  • LlmQueryExpander calls the default LLM provider with a tight
    prompt asking for 5–10 comma-separated synonym terms. Output
    parser tolerates chatty preamble, numbered lists, quotes, and
    unclosed <think> blocks. Caps at 15 deduped terms.
  • New CompositeEnricher runs multiple enrichers in parallel and
    unions their results — so vector + LLM expansion compose
    cleanly when both are configured. One enricher failing doesn't
    poison the other.
  • Ask resolver now folds the enricher's extraTerms and
    directHits into its FTS query and pool extension. keen-worm
    (or whatever your "I worked on coupling for two days" session
    is) finally surfaces even when the user's question paraphrases
    rather than quotes the original work.
  • Ask context builder pulls each session's top FTS-matching events, not
    just the most-recent tail, so the LLM sees the evidence that earned each
    session a spot in the candidate list.
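The tolerant output parser described above can be sketched as follows. The preamble/list/quote handling is what the notes describe; the exact regexes and the two-character minimum are assumptions:

```typescript
// Sketch of the expansion-output parser: strip (possibly unclosed) <think>
// blocks and numbered-list prefixes, split on commas/newlines, trim quotes,
// dedupe, and cap the result.
function parseExpansionTerms(raw: string, cap = 15): string[] {
  const cleaned = raw
    .replace(/<think>[\s\S]*?(<\/think>|$)/g, "") // tolerate an unclosed <think> block
    .replace(/^\s*\d+[.)]\s*/gm, "");             // "1. foo" / "2) bar" list prefixes
  const terms = cleaned
    .split(/[,\n]/)
    .map((t) => t.trim().replace(/^["']|["']$/g, ""))
    .filter((t) => t.length > 1);
  return [...new Set(terms)].slice(0, cap);
}
```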

Vector search — install-time-optional, AI-gated

  • New AGENTPULSE_VECTOR_SEARCH=true build flag (off by default,
    zero overhead unset). When set, creates an event_embeddings
    table (event_id PK, model, dim, vector BLOB), a delete-cascade
    trigger, and the Settings → AI → Vector search subsection.
  • EmbeddingAdapter interface with an OllamaEmbeddingAdapter
    implementation (uses /api/embed batched, falls back to legacy
    /api/embeddings per-input on older Ollama versions). Strips a
    trailing /v1 from the LLM provider's baseUrl so the OpenAI-
    compatible chat URL works as the embedding host without
    reconfiguration.
  • Default embedding model mxbai-embed-large (335M params,
    1024-dim, top-5 MTEB English in its weight class, ~30–60ms per
    embed). Switchable in Settings to qwen3-embedding:8b
    (8B, 4096-dim, top-tier MTEB, ~200–500ms per embed) for
    installs with the headroom.
  • Ingest hooks fire-and-forget embedEvent(id) from the session
    bus listener — adds zero latency to the hook hot path. Boot-
    time backfill kicks off a background task when row counts
    diverge and reports progress through the Settings UI.
  • New VectorEmbeddingEnricher brute-force scans event vectors
    for the active model, computes cosine similarity, aggregates
    per-session as max + log1p(count) × 0.05. Filters out hits
    below a 0.4 floor (typical noise threshold for unit-normalized
    retrieval models). Sub-100ms over ~10K vectors; sqlite-vss can
    slot in around 100K events.
  • Settings UI: enable toggle, model picker (datalist with
    recommended models + hints), live-polling indexing progress
    bar, "Re-index now" button.
  • Endpoints: GET/PUT /api/v1/ai/vector-search/status,
    POST /api/v1/ai/vector-search/rebuild.
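The enricher's scoring math above — cosine similarity per event, then per-session aggregation as max + log1p(count) × 0.05 with a 0.4 floor — can be sketched like this (the hit shape and helper names are illustrative):

```typescript
// Plain cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Aggregate per-event similarities into one score per session:
// max(sim) + log1p(hit count) * 0.05, dropping hits below the noise floor.
function scoreSessions(hits: { sessionId: string; sim: number }[], floor = 0.4): Map<string, number> {
  const bySession = new Map<string, number[]>();
  for (const h of hits) {
    if (h.sim < floor) continue;
    const sims = bySession.get(h.sessionId);
    if (sims) sims.push(h.sim); else bySession.set(h.sessionId, [h.sim]);
  }
  const scores = new Map<string, number>();
  for (const [id, sims] of bySession) scores.set(id, Math.max(...sims) + Math.log1p(sims.length) * 0.05);
  return scores;
}
```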

Other additions

  • Resolver tests (#10, merged via #11 from @mvanhorn) — stopword
    filtering, multi-keyword ranking, tie-break ordering,
    archived-session exclusion, explicit-id order preservation.
  • Kustomize base + overlay pattern (deploy/k8s/kustomization.yaml,
    deploy/README-kustomize.md). Environment-specific overlays
    go under gitignored deploy/k8s-*/ so private values
    (registry, hostnames, TLS secret) never leak into the OSS
    base. Full apply flow: kubectl apply -k deploy/k8s-<name>/.

Changed

  • fetchSessionsById(ids) returns rows in the caller's input
    order instead of SQLite rowid order. Internal callers don't
    rely on ordering; external importers can now trust the result
    to match the input list.
  • Ask resolver no longer excludes completed sessions when FTS
    surfaces them. "Find a session where I worked on X" was always
    going to be about past finished work; the active-only filter
    hid the right answer.
  • FTS-surfaced session ranking now uses
    max(score) + log1p(count) × 0.1 per session, not just max.
    BM25 penalizes high-frequency documents; a session about the
    topic (many moderate hits) was losing to one with a single
    rare-term bullseye.
  • LLM provider's openai-compatible adapter:
    • Adds think: false (Ollama ≥0.7) and
      chat_template_kwargs.enable_thinking: false (vLLM/SGLang)
      to suppress reasoning blocks that consumed the entire output
      window without producing the answer.
    • Falls back to choices[0].message.reasoning when content
      is empty so Qwen3 thinking-mode responses surface useful text
      even when the answer didn't fit in the budget.
  • event_processor.insertNormalizedEvents returns real DB row
    IDs via .returning() instead of id: 0 placeholders.
    Required for ingest-time vector indexing; consumers who relied
    on the placeholder behavior… don't exist (verified across the
    repo).
  • Memory limit bumped 512Mi → 1Gi in the base deployment;
    homelab overlay further bumps to 2Gi to absorb Ask streams +
    enricher LLM fetch buffering on bigger workloads.
  • TLS secret name in the base IngressRoute scrubbed from a
    cluster-specific wildcard name to the placeholder
    agentpulse-tls. Real cert names go in the gitignored
    overlay.
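The fetchSessionsById input-order guarantee from the first bullet above reduces to a re-sort by the caller's id list; a minimal sketch (simplified row type, hypothetical helper name):

```typescript
// Rows come back from SQLite in arbitrary (rowid) order; re-order them to
// match the caller's input list, silently dropping ids with no row.
type Row = { id: string };

function orderByInput<T extends Row>(ids: string[], rows: T[]): T[] {
  const byId = new Map(rows.map((r) => [r.id, r]));
  return ids.map((id) => byId.get(id)).filter((r): r is T => r !== undefined);
}
```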

Fixed

  • Search returned 500 in 2ms under any concurrent ingest load. The FTS
    backend was opening a second bun:sqlite connection that raced the primary
    connection's WAL snapshot. Now shares the drizzle-owned handle with
    PRAGMA busy_timeout = 5000 so brief writer collisions block + retry
    instead of throwing.
  • Ask SSE stream dropped during enricher warmup. With LLM
    expansion in front of the main Ask call, time-to-first-token
    on local-Qwen setups climbed to 15–20s. The route now emits
    : keepalive\n\n every 5s while the model is warming, so the
    browser / Traefik don't time out the idle connection.
  • Vector backfill stuck at 22 events with running: true.
    Events without extractable text (Stop, SubagentStop,
    SessionEnd with no content) were correctly skipped, but the
    next batch query's LEFT JOIN re-surfaced them indefinitely.
    Now writes a dim=0 placeholder row so the join excludes them
    from future batches; the cosine query already filters by
    dim = adapter.dim so placeholders are invisible to lookups.
  • Pre-existing FTS data wasn't indexed on upgrades — triggers
    only fire on new writes. The boot-time backfill (above) closes
    this gap automatically.
  • Ollama embed URL hit /v1/api/embed (404) when the LLM
    provider's baseUrl ended in /v1 (the standard OpenAI-
    compatible chat path). Embed adapter now strips a trailing
    /v1 before building the embed URL.
  • Pod OOMed mid-Ask-stream under 512Mi limit (exit 137).
    Memory bumped + responsible code paths tightened.
  • AND-mode FTS query of full Ask message practically never
    matched — every token had to appear in one document. The Ask
    resolver now passes only the stopword-filtered tokens and uses
    OR mode; users still get AND in the search box where
    specificity is the goal.
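The Ollama embed-URL fix in the bullet above comes down to one string transform, sketched here (the helper name is hypothetical):

```typescript
// Derive the embedding host from the OpenAI-compatible chat baseUrl by
// stripping a trailing /v1, so /api/embed resolves instead of /v1/api/embed.
function embedBaseUrl(chatBaseUrl: string): string {
  return chatBaseUrl.replace(/\/v1\/?$/, "");
}
```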

Fixed — documentation

  • Postgres backend is not yet implemented. README, wiki, and
    release notes previously implied DATABASE_URL=postgres://…
    works; it doesn't (parses, then falls back to SQLite with a
    warning). Tracking issue #12. Phased port plan in
    thoughts/2026-04-24-postgres-backend-plan.md.

v0.2.0-pre.1 — Ask assistant + in-app Telegram

23 Apr 16:34


First pre-release after 0.1.0. See CHANGELOG.md for the full detail.

Highlights

Ask assistant (Labs)

  • Global /ask chat grounded in your live session state. Resolver scores sessions by fuzzy-match on name / cwd / branch / current task and hands the LLM a terse <sessions> block. Breadth hints (all / every / across) widen the pool.
  • SSE streaming on the web with markdown rendering (headers, lists, fenced code blocks).
  • Ask via Telegram — DM the enrolled bot and get grounded replies back in the same chat. Origin-preserving delivery so web threads never push to Telegram and vice versa.

Telegram setup without the command line

  • Paste-token wizard; bot token + webhook secret live encrypted in the settings table.
  • Polling delivery mode for instances without public reachability (home-lab, NAT'd, private-DNS). Switch between webhook and polling from the UI at any time.

AI provider UX

  • "Load available models" button probes /models on the configured endpoint and turns the Model field into a dropdown.

Setup / onboarding

  • Dashboard empty state collapses first-run into one screen (mint API key → copy install command → start an agent).

Resilience

  • Auto-reload on expired Authentik — cross-origin 302 is detected and triggers a top-level reload so the OIDC round-trip completes silently.
  • DB migrations retry on SQLite lock contention — rolling k8s updates no longer leave a new pod running against a stale schema.
  • Watcher loop fix: dropped the session_updated trigger that was enqueuing ~19 no-op runs/minute per active session.
  • Streaming fix: worked around ERR_HTTP2_PROTOCOL_ERROR caused by Transfer-Encoding: chunked on HTTP/2.
  • Telegram webhook 401 fix: public webhook now mounts outside the auth'd api bundle.

Upgrade notes

No breaking changes — existing 0.1.0 deployments migrate forward automatically (new columns via idempotent ALTER TABLE with the new retry-on-lock path).

Pre-release. Expect rough edges around streaming markdown flicker while code blocks are partial, and LLM-cost handling for cloud providers on Ask (Qwen local = free).

v0.1.0-beta.1

20 Apr 00:59


Pre-release

First AgentPulse prerelease.

Notes

  • This is a prerelease intended for OSS users to test both observability-only and local orchestration flows.
  • Windows now has a first-class installer path, but it is newly shipped and deserves extra validation from users.