Releases: jstuart0/agentpulse
v0.4.0-pre.2 — Generic forwardauth abstraction
Generic forwardauth abstraction — AgentPulse SSO is no longer hard-coded to Authentik. Operators can now wire it to any forwardauth-capable identity provider via env config alone. Zero migration burden for existing Authentik deployments.
What changed
Before: AGENTPULSE_AUTHENTIK_TRUST_SECRET, hard-coded X-Authentik-* header reads, hard-coded signOutUrl: /outpost.goauthentik.io/sign_out, UI strings hard-coded to "Authentik".
After: FORWARDAUTH_TRUST_SECRET, FORWARDAUTH_PROVIDER, eight FORWARDAUTH_HEADER_* slots (each defaults to its Authentik header name), provider-aware signOutUrl logic, UI label derived from FORWARDAUTH_PROVIDER via a formatProviderLabel helper.
Backward compatibility (read this if you're upgrading)
You don't need to do anything. All eight header slots default to the Authentik header names. FORWARDAUTH_PROVIDER defaults to "authentik". AGENTPULSE_AUTHENTIK_TRUST_SECRET continues to work as a deprecated alias for one release (a boot warning prints when the legacy env is set without the new one). The k8s Middleware agentpulse-strip-client-authentik continues to exist as a duplicate of the renamed agentpulse-strip-client-forwardauth so external overlays referencing the old name continue to work.
The deprecation alias and the legacy Middleware resource will be removed at the next release boundary.
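The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual config module: the environment variable names and the `"authentik"` default come from these notes, while `resolveForwardauthConfig`, `ForwardauthConfig`, and the warning text are hypothetical.

```typescript
// Sketch only: function, type, and warning text are hypothetical.
type Env = Record<string, string | undefined>;

interface ForwardauthConfig {
  provider: string;
  trustSecret: string | undefined;
  warnings: string[];
}

function resolveForwardauthConfig(env: Env): ForwardauthConfig {
  const warnings: string[] = [];
  const current = env["FORWARDAUTH_TRUST_SECRET"];
  const legacy = env["AGENTPULSE_AUTHENTIK_TRUST_SECRET"];

  // The new variable wins; the legacy one is only used as a fallback.
  let trustSecret = current;
  if (!current && legacy) {
    trustSecret = legacy;
    // Warn only when the legacy env is set without the new one.
    warnings.push(
      "AGENTPULSE_AUTHENTIK_TRUST_SECRET is deprecated; set FORWARDAUTH_TRUST_SECRET instead."
    );
  }

  return {
    // Existing Authentik deployments keep working with no env changes.
    provider: env["FORWARDAUTH_PROVIDER"] || "authentik",
    trustSecret,
    warnings,
  };
}
```

With only the legacy variable set, the sketch returns the legacy secret plus one warning; with both set, the new variable is used silently.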
How to switch to a different IdP
See deploy/k8s/FORWARDAUTH.md (renamed from AUTHENTIK-FORWARDAUTH.md). Provider-specific sub-sections cover:
- Authentik (default; current homelab setup)
- Authelia: `Remote-User`, `Remote-Email`, `Remote-Groups`, `Remote-Name`
- oauth2-proxy: `X-Forwarded-User`, `X-Forwarded-Email`, `X-Forwarded-Groups`
- Pomerium: `X-Pomerium-Claim-*`
- Cloudflare Access: `Cf-Access-Authenticated-User-Email` (note: Cloudflare's auth model has no separate verify header; the doc explains the limitation)
Each section gives the env values, the Traefik middleware adjustments needed, and the gotchas.
Implementation summary
Three phases (mozart-orchestrated, STANDARD tier):
- Phase 1 backend (ba4ee61): `config.forwardauthHeader(slot)`, `config.forwardauthProvider`, `config.forwardauthTrustSecret` getters with Authentik defaults; middleware reads configurable header names; `/auth/me` dual-emits `source: "forwardauth"` + new `provider` field; deprecation warning when only legacy env is set; 20 new unit tests covering configurable headers, default fallback, trust gate, deprecated alias, signOutUrl provider selection.
- Phase 2 frontend (78fdf79): new `formatProviderLabel(provider)` helper handles known special cases (`oauth2-proxy` → "OAuth2-Proxy", `cloudflare` / `cloudflare-access` → "Cloudflare Access"); generic capitalize-first fallback. TopBar/Layout read `user.provider` with a fallback to `source === "authentik"` for cached pre-Phase-1 responses. 9 new helper tests.
- Phase 3 manifests + docs + guard (e6b7085): k8s Middleware renamed with one-release alias; secret data key renamed to `FORWARDAUTH_TRUST_SECRET`; deployment binds both env vars to the same secret key (operators rotate once); `AUTHENTIK-FORWARDAUTH.md` → `FORWARDAUTH.md` restructured with five provider sub-sections; README + CLAUDE.md + RUNBOOK generalised; new `scripts/check-no-authentik-literals.ts` architecture guard prevents new `X-Authentik-*` literals from sneaking into `src/`.
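For readers wiring up their own provider, the label helper from Phase 2 might look roughly like this. The special cases and the capitalize-first fallback are taken from the summary above; the lookup table name and the trim/lowercase normalization are assumptions made for this sketch, not confirmed details of the shipped helper.

```typescript
// Sketch only: SPECIAL_LABELS and the input normalization are assumptions.
const SPECIAL_LABELS = new Map<string, string>([
  ["oauth2-proxy", "OAuth2-Proxy"],
  ["cloudflare", "Cloudflare Access"],
  ["cloudflare-access", "Cloudflare Access"],
]);

function formatProviderLabel(provider: string): string {
  const trimmed = provider.trim();
  // Known providers whose display name is not a simple capitalization.
  const special = SPECIAL_LABELS.get(trimmed.toLowerCase());
  if (special !== undefined) return special;
  // Generic fallback: capitalize the first character, keep the rest as given.
  return trimmed.charAt(0).toUpperCase() + trimmed.slice(1);
}
```

Under these assumptions, `FORWARDAUTH_PROVIDER=authelia` would render as "Authelia" in the UI without any provider-specific code.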
Verification
- SQLite tests: 808 pass / 14 skip / 0 fail (was 779/14/0 — gained 29 tests).
- All architecture guards green: bun-sqlite-only-apis, legacy-schema-import, no-sync-transcript-io, installer-paths, no-authentik-literals (new).
- TypeScript clean. Biome 0 errors.
- Manifests render: `kubectl kustomize deploy/k8s/` produces 587 lines, no errors.
Plane ticket: AGEN-4.
v0.4.0-pre.1 — Postgres backend + OIDC trust gate completion
PostgreSQL backend at portable parity — the headline feature of v0.4.0, plus four post-rollout deploy/auth fixes that surfaced when this image first landed on a real Kubernetes cluster.
Postgres backend (the v0.4.0 work)
Set DATABASE_URL=postgres://user:pw@host:5432/db to opt into Postgres. SQLite remains the default for OSS quickstart and single-machine use.
- Drizzle dual-dialect schema split with column-factory pattern; per-dialect drizzle-kit baselines committed at `drizzle/sqlite/0000_*.sql` and `drizzle/postgres/0000_*.sql`
- `pg_advisory_lock`-serialized boot for safe k8s rolling deploys
- `PostgresSearchBackend`: ILIKE parity for `/api/v1/search`
- `withTransaction()` helper that bridges bun-sqlite's sync rollback semantics with Postgres's native async transactions
- GHA two-job CI: `test-sqlite` (always) + `test-postgres` (push to main/dev with postgres:16-alpine service container)
- New `deploy/overlays/postgres/` kustomize overlay for opting into Postgres
7 follow-up plan stubs filed for pg-superior items (pgvector, JSONB raw_payload + GIN, LISTEN/NOTIFY, tsvector/pg_trgm search, FOR UPDATE SKIP LOCKED, deterministic search rank, sqlite-legacy-init removal).
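The `DATABASE_URL` opt-in described above amounts to a dialect switch at boot. A minimal sketch, assuming a hypothetical `selectDialect` helper (the real selection logic is not shown in these notes, and accepting the `postgresql://` scheme is an assumption):

```typescript
// Sketch only: selectDialect is a hypothetical name.
type Dialect = "sqlite" | "postgres";

function selectDialect(databaseUrl: string | undefined): Dialect {
  // Unset or empty: SQLite stays the default for quickstart installs.
  if (!databaseUrl) return "sqlite";
  // Assumption: both postgres:// and postgresql:// schemes opt in.
  return /^postgres(ql)?:\/\//i.test(databaseUrl) ? "postgres" : "sqlite";
}
```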
Post-v0.4.0 fixes (this prerelease)
These four issues only appeared during actual Kubernetes deployment against a real Authentik/Traefik stack — the campaign itself shipped clean against the test suite. Operators upgrading from any pre-postgres version should apply this patch before deploying v0.4.0 in production.
- `.dockerignore` exception for `deploy/k8s/scripts/`: the `deploy/` exclusion prevented `scripts/build-and-push.sh` from building the backup-sidecar image. Added `!deploy/k8s/scripts/` exception. (a88d289)
- backup-sidecar `cpu` request raised from `10m` to `50m`: the request fell below the new LimitRange floor; pods were rejected at admission. (a88d289)
- `AGENTPULSE_PG_POOL_MAX` secretKeyRef marked `optional: true`: operators upgrading from pre-postgres installs got `CreateContainerConfigError` on every pod start. (a88d289)
- `AGENTPULSE_AUTHENTIK_TRUST_SECRET` wired into base deployment: the v0.3.0 trust gate added the in-process verification but never wired the env var binding. (b0f16ea)
- `authRouter` moved to root app: Hono cascades sibling routers' `requireAuth()` wildcards across the parent router; `/api/v1/auth/me` was returning 401, breaking the login UI. Mirrors the existing `cspReportRouter` / `telegramWebhookRouter` pattern. (403a4d5)
- OIDC trust gate completed end-to-end: the v0.3.0 design left header injection unimplemented. New `agentpulse-inject-verify` Traefik middleware injects the shared secret after forwardauth passes (Authentik property mappings populate JWT claims, not forwardauth response headers). Wired as the third step in the protected-route chain (strip → forwardauth → inject-verify). (dc94356)
- Public-route bypasses added to IngressRoute: `/api/v1/ready`, `/assets/*`, and `/api/v1/auth/{me,login,logout,signup}` now skip forwardauth so the kubelet, the browser asset loader, and the login UI work correctly. (dc94356)
scott documentation pass (201fb6c):
- `CHANGELOG.md`: full `[0.4.0-pre.1]` section
- `deploy/k8s/AUTHENTIK-FORWARDAUTH.md`: replaced incorrect property-mapping setup with the correct Traefik middleware approach
- `deploy/k8s/README.md`: added Authentik SSO setup section + public-route bypass reference
- GitHub wiki: Deployment + Architecture pages updated
Operator upgrade notes
If you are upgrading an existing AgentPulse deployment:
- Generate a strong shared secret: `openssl rand -hex 32`
- Add it to your secret as `AGENTPULSE_AUTHENTIK_TRUST_SECRET`
- Patch your `agentpulse-inject-verify` Middleware's `customRequestHeaders.X-Authentik-Verify` to the same value (use a private overlay; do not commit a real secret)
- Verify your IngressRoute middleware chain is `strip-client-authentik` → `forwardauth` → `inject-verify` in that order
- Apply, restart the pod, refresh your browser
See deploy/k8s/AUTHENTIK-FORWARDAUTH.md for the full step-by-step.
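To see why the chain order matters, here is a small in-memory model of the strip → forwardauth → inject-verify sequence and the in-process check. It is an illustration under stated assumptions: the header name `X-Authentik-Verify` comes from the steps above, but the function names, the fail-closed behaviour when no secret is configured, and the hand-rolled comparison are hypothetical (a real implementation would more likely use `timingSafeEqual` from `node:crypto`).

```typescript
// Sketch only: all function names here are hypothetical.
type HeaderMap = Record<string, string>;

// Step 1 (strip): drop client-supplied X-Authentik-* headers so they cannot be spoofed.
function stripClientHeaders(headers: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    const lower = name.toLowerCase();
    if (value === undefined || lower.startsWith("x-authentik-")) continue;
    out[lower] = value;
  }
  return out;
}

// Step 3 (inject-verify): the proxy adds the shared secret after forwardauth has passed.
function injectVerify(headers: HeaderMap, secret: string): HeaderMap {
  return { ...headers, "x-authentik-verify": secret };
}

// Length check plus XOR accumulation, so the loop does not exit early on a mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  return diff === 0;
}

// In-process trust gate. Assumption: fail closed when no secret is configured.
function passesTrustGate(headers: HeaderMap, trustSecret: string | undefined): boolean {
  if (!trustSecret) return false;
  const presented = headers["x-authentik-verify"];
  return presented !== undefined && constantTimeEqual(presented, trustSecret);
}
```

If the strip step ran after inject-verify, it would remove the proxy's own header; if it did not run at all, a client could supply identity headers directly.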
Verification
- SQLite tests: 779 pass / 14 skip / 0 fail
- Postgres tests against live docker postgres:16-alpine: 754 pass / 39 skip / 0 fail (the 39 skips are SQLite-only tests gated via `describeSqliteOnly` / `itSqliteOnly`)
- All architecture guards pass
- Production: https://agentpulse.xmojo.net is running this codebase live
Plane ticket: AGEN-3.
v0.2.0-pre.10 — Reliability hotfix
Reliability hotfix for two stacked production issues that were surfacing as user-visible "could not connect to the LLM" / "network error" messages from Ask and Telegram.
Fixed
- Cold-load LLM timeouts. The Ask sync and streaming paths used a 60s timeout on the LLM call, which is too tight for cold-load on local 20B+ models — first-token latency runs 60–90s when a model has been evicted from GPU memory. Bumped both timeouts to 180s. Classifier timeouts (8s) stay tight on purpose so a slow gate fails fast and lets the main turn proceed. When a timeout does fire, the user-facing message is now timeout-specific ("LLM didn't respond in time — usually a cold model load. Try again in a few seconds.") instead of the generic "couldn't reach" copy.
- Pod restart loop from over-tight liveness probes. Default Kubernetes `timeoutSeconds: 1` was triggering on `/api/v1/health` whenever the event loop was briefly busy (LLM streaming, FTS backfill). Bumped both probes to `timeoutSeconds: 5` and the liveness probe to `failureThreshold: 5`.
Validated in production for 41 hours before tagging: 4 restarts vs the prior ~18/day baseline; zero LLM timeouts since deploy.
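A rough sketch of the timeout split described in the first fix. The 180s and 8s values come from the notes; the function names and the `"timeout"` / `"unreachable"` kinds are hypothetical. It relies on the fact that a `fetch` aborted via `AbortSignal.timeout(ms)` rejects with an error whose `name` is `"TimeoutError"`.

```typescript
// Sketch only: names and failure kinds are hypothetical.
type LlmCall = "ask" | "classifier";
type FailureKind = "timeout" | "unreachable";

function timeoutMsFor(call: LlmCall): number {
  // Ask calls tolerate cold model loads; classifiers stay tight so a slow gate fails fast.
  return call === "ask" ? 180_000 : 8_000;
}

function classifyLlmFailure(err: unknown): FailureKind {
  // AbortSignal.timeout(ms) aborts with an error named "TimeoutError".
  if (err instanceof Error && err.name === "TimeoutError") return "timeout";
  return "unreachable";
}

// Usage (not executed here):
//   fetch(url, { signal: AbortSignal.timeout(timeoutMsFor("ask")) })
//     .catch((err) => showMessageFor(classifyLlmFailure(err)));
// where showMessageFor is a hypothetical UI helper that picks the user-facing copy.
```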
v0.2.0-pre.9 — Efficiency redesign
Building on the type-discipline work in pre.8, this release consolidates the largest remaining duplication patterns across the orchestration core so that adding a new kind, gate, executor, or classifier is a one-place change instead of a five-place change.
Changed
- Inbox card surface unified. 12 nearly-identical action-request card components collapsed into one `ActionRequestCard` driven by a per-kind spec table. Adding a new action kind is now one entry in one switch, with an exhaustiveness guard. The freeform-alert-rule card now has the same KindBadge / severity / busy-state polish as every other card.
- Ask orchestrator unified across transports. The 9-gate ladder previously hand-stamped in both the sync and streaming Ask pipelines is now a single `ASK_GATES` table consumed by a shared `runAskGates` runner. New gates are one row, automatically wired into both transports; previously a new gate had to be inserted in two places in matching order.
- Action-request executors consolidated. The thirteen executor functions (launch / archive / edit / delete / alert-rule / etc.) now share `succeed`, `fail`, and `failExpired` helpers that bundle the conditional-update + notify + return triple they all do, plus a `runExecutor<K>` frame for the simple cases. Each executor's unique logic now reads at a glance.
- Intent classifiers consolidated. All eight intent detectors (launch / resume / session-action / template-CRUD / alert-rule / bulk-action / Q&A / add-project) now share a `classifyJson<T>` helper for the provider-fetch / spend-check / adapter-call / JSON-parse boilerplate. Each detector is now its system prompt plus a small parse lambda, so the actual differences are immediately visible.
- Ask message-meta unified. One shared tagged-union parser (`parseAskMeta`) replaces three separate server encode/extract pairs and three separate frontend parsers; the AskPage chained-`&&` detection logic is gone.
- Inbox kind metadata centralized. One `KIND_META` table drives both the filter dropdown and the footer counters; missing kinds fail to compile. Closes a pre-existing bug where the footer was missing the `add_channel` counter.
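The "missing kinds fail to compile" property can be had with a `Record` keyed by the kind union plus a `never`-typed guard in the switch. The sketch below illustrates that general TypeScript pattern; the kind list, labels, and helper names are hypothetical apart from `KIND_META` and `add_channel`, which are named above.

```typescript
// Sketch only: the real kind union is larger; labels are made up.
type ActionKind = "launch" | "archive" | "add_channel";

interface KindMeta {
  label: string;
  showInFooter: boolean;
}

// Record<ActionKind, ...> makes the compiler reject a table that omits a kind.
const KIND_META: Record<ActionKind, KindMeta> = {
  launch: { label: "Launch", showInFooter: true },
  archive: { label: "Archive", showInFooter: true },
  add_channel: { label: "Add channel", showInFooter: true },
};

// Exhaustiveness guard: reaching this with a real value is a type error.
function assertNever(value: never): never {
  throw new Error(`Unhandled kind: ${String(value)}`);
}

function cardTitle(kind: ActionKind): string {
  switch (kind) {
    case "launch":
      return "Launch session";
    case "archive":
      return "Archive session";
    case "add_channel":
      return "Add channel";
    default:
      return assertNever(kind);
  }
}

// Both the filter dropdown and the footer can derive their entries from the one table.
const filterOptions = Object.keys(KIND_META) as ActionKind[];
```

Adding a fourth member to `ActionKind` in this sketch produces compile errors at both `KIND_META` and `cardTitle` until each is updated.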
Fixed
- Single canonical archive predicate. `sessions.is_archived` is now the only source of truth for whether a session is archived; `status='archived'` is retained only for backward compat. A shared `isVisibleSession(s)` / `isArchivedSession(s)` helper pair replaces a mix of `isArchived = false` and `status !== 'archived'` filters scattered across the codebase. Idempotent DB backfill at startup converts any historical row with `status='archived'` to `is_archived = 1`. The Archive badge still flips immediately on archive click. Net effect: digest counts, the operator inbox, and Ask candidate resolution no longer leak archived-but-active sessions; the transcript-sync worker no longer wastes IO polling archived agents.
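A minimal sketch of the helper pair and the backfill rule, using an in-memory array as a stand-in for the `sessions` table. The helper names come from the note above; the `SessionRow` shape and `backfillArchived` are assumptions for illustration (the real backfill runs against the database).

```typescript
// Sketch only: SessionRow and backfillArchived are hypothetical.
interface SessionRow {
  id: string;
  isArchived: boolean;
  status: string;
}

// The flag is the single source of truth; status is not consulted here.
function isArchivedSession(s: SessionRow): boolean {
  return s.isArchived;
}

function isVisibleSession(s: SessionRow): boolean {
  return !isArchivedSession(s);
}

// Backfill: rows that only carry the legacy status get the flag set.
// Running it a second time changes nothing, which is what makes it idempotent.
function backfillArchived(rows: SessionRow[]): SessionRow[] {
  return rows.map((r) =>
    r.status === "archived" && !r.isArchived ? { ...r, isArchived: true } : r
  );
}
```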
Behind the scenes
- 24 files modified, 12 deleted, 3 added; −1,821 LOC net while tightening rather than loosening invariants.
- 612/612 tests pass; typecheck clean; 0 Biome errors.
v0.2.0-pre.2 — Find any past conversation
The "find any past conversation" release. Three new layers stack on
top of session state so Ask actually works across compaction
boundaries and across past completed work — full-text first, then
LLM query expansion, then optional vector embeddings for true
semantic recall. No breaking changes.
Added
Full-text search (/search)
- New SQLite FTS5 backend behind a `SearchBackend` interface so a Postgres `tsvector` impl can slot in later without changing routes or UI. Two virtual tables (sessions + events), porter + unicode61 tokenizer, BM25 ranking normalized to 0..1.
- Triggers on `INSERT`/`UPDATE`/`DELETE` of `sessions` and `events` keep both indexes in sync. Event indexing filters to meaningful types (`UserPromptSubmit`, `AssistantMessage`, `Stop`, `TaskCreated`/`TaskCompleted`, `SubagentStop`, `SessionEnd`, `AiProposal`, `AiReport`, `AiHitlRequest`).
- Boot-time backfill detects row-count divergence between source tables and FTS and re-indexes the gap in a single transaction, so upgrades from 0.2.0-pre.1 light up retroactively without manual rebuild.
- Query escaping: tokens are phrase-quoted before MATCH so inputs with `-`, `:`, `(`, etc. don't get parsed as FTS5 operators. New `mode: "and" | "or"` filter: AND default for the search box, OR for programmatic callers.
- New `/search` page with URL-stateful filter UI (agentType, status, eventType, kind), `<mark>`-highlighted snippets, links back to the originating session and event.
- New `GET /api/v1/search` and `POST /api/v1/search/rebuild`.
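The query-escaping bullet above can be illustrated with a small builder. This is a sketch of the general FTS5 technique (wrap each token in double quotes, doubling any embedded quote), not the project's actual code; `buildFtsQuery` and the whitespace tokenization are assumptions.

```typescript
// Sketch only: buildFtsQuery is a hypothetical name.
type SearchMode = "and" | "or";

function buildFtsQuery(input: string, mode: SearchMode = "and"): string {
  const tokens = input.split(/\s+/).filter((t) => t.length > 0);
  // Phrase-quote each token; FTS5 escapes a double quote inside a string by doubling it.
  const quoted = tokens.map((t) => `"${t.replace(/"/g, '""')}"`);
  return quoted.join(mode === "and" ? " AND " : " OR ");
}
```

For example, `buildFtsQuery("foo-bar baz:qux")` yields `"foo-bar" AND "baz:qux"`, so the `-` and `:` are matched as text instead of being parsed as a NOT operator or a column filter. The resulting string is still passed to `MATCH` as a bound parameter, not interpolated into SQL.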
Semantic Ask (LLM query expansion)
- Pluggable `SemanticEnricher` interface returning `extraTerms` (lexical synonyms) and `directHits` (sessionId → score). Vector enrichment populates the latter, leaves the former empty; `LlmQueryExpander` does the inverse.
- `LlmQueryExpander` calls the default LLM provider with a tight prompt asking for 5–10 comma-separated synonym terms. Output parser tolerates chatty preamble, numbered lists, quotes, and unclosed `<think>` blocks. Caps at 15 deduped terms.
- New `CompositeEnricher` runs multiple enrichers in parallel and unions their results, so vector + LLM expansion compose cleanly when both are configured. One enricher failing doesn't poison the other.
- Ask resolver now folds the enricher's `extraTerms` and `directHits` into its FTS query and pool extension. `keen-worm` (or whatever your "I worked on coupling for two days" session is) finally surfaces even when the user's question paraphrases rather than quotes the original work.
- Ask context builder pulls each session's top FTS-matching events, not just the most-recent tail, so the LLM sees the evidence that earned each session a spot in the candidate list.
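One way a tolerant expansion parser could work is sketched below. The behaviours it targets (preamble, numbered lists, quotes, unclosed `<think>` blocks, a cap of 15 deduped terms) are from the list above; the specific heuristics, including the four-word limit and taking the text after the last colon, are guesses made for this example and may differ from the shipped parser.

```typescript
// Sketch only: heuristics are assumptions, not the project's parser.
function parseExpansionTerms(raw: string, cap = 15): string[] {
  // Remove closed reasoning blocks, then cut everything from an unclosed <think> onward.
  let text = raw.replace(/<think>[\s\S]*?<\/think>/gi, " ");
  const open = text.toLowerCase().indexOf("<think>");
  if (open !== -1) text = text.slice(0, open);

  const seen = new Set<string>();
  const terms: string[] = [];
  for (const piece of text.split(/[,\n]/)) {
    // "Here are some terms: coupling" keeps only what follows the last colon.
    const afterColon = piece.slice(piece.lastIndexOf(":") + 1);
    const term = afterColon
      .trim()
      .replace(/^(?:\d+[.)]|[-*•])\s*/, "") // list numbering or bullet markers
      .replace(/^["'`]+|["'`.;!]+$/g, "") // surrounding quotes, trailing punctuation
      .trim()
      .toLowerCase();
    // Long pieces are treated as chatty sentences, not search terms.
    if (term.length === 0 || term.split(/\s+/).length > 4) continue;
    if (seen.has(term)) continue;
    seen.add(term);
    terms.push(term);
    if (terms.length >= cap) break;
  }
  return terms;
}
```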
Vector search — install-time-optional, AI-gated
- New `AGENTPULSE_VECTOR_SEARCH=true` build flag (off by default, zero overhead unset). When set, creates an `event_embeddings` table (event_id PK, model, dim, vector BLOB), a delete-cascade trigger, and the Settings → AI → Vector search subsection.
- `EmbeddingAdapter` interface with an `OllamaEmbeddingAdapter` implementation (uses `/api/embed` batched, falls back to legacy `/api/embeddings` per-input on older Ollama versions). Strips a trailing `/v1` from the LLM provider's baseUrl so the OpenAI-compatible chat URL works as the embedding host without reconfiguration.
- Default embedding model `mxbai-embed-large` (335M params, 1024-dim, top-5 MTEB English in its weight class, ~30–60ms per embed). Switchable in Settings to `qwen3-embedding:8b` (8B, 4096-dim, top-tier MTEB, ~200–500ms per embed) for installs with the headroom.
- Ingest hooks fire-and-forget `embedEvent(id)` from the session bus listener, which adds zero latency to the hook hot path. Boot-time backfill kicks off a background task when row counts diverge and reports progress through the Settings UI.
- New `VectorEmbeddingEnricher` brute-force scans event vectors for the active model, computes cosine similarity, aggregates per-session as `max + log1p(count) × 0.05`. Filters out hits below a 0.4 floor (typical noise threshold for unit-normalized retrieval models). Sub-100ms over ~10K vectors; sqlite-vss can slot in around 100K events.
- Settings UI: enable toggle, model picker (datalist with recommended models + hints), live-polling indexing progress bar, "Re-index now" button.
- Endpoints: `GET/PUT /api/v1/ai/vector-search/status`, `POST /api/v1/ai/vector-search/rebuild`.
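The brute-force scoring step can be sketched as follows. The `max + log1p(count) × 0.05` aggregation and the 0.4 floor are from the list above; the type and function names are hypothetical, and the sketch assumes `count` means the number of hits at or above the floor for that session.

```typescript
// Sketch only: EventVector and scoreSessions are hypothetical names.
interface EventVector {
  sessionId: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function scoreSessions(
  query: number[],
  events: EventVector[],
  floor = 0.4,
  countWeight = 0.05
): Map<string, number> {
  const perSession = new Map<string, { max: number; count: number }>();
  for (const e of events) {
    const sim = cosine(query, e.vector);
    if (sim < floor) continue; // below the noise floor: not a hit
    const agg = perSession.get(e.sessionId) ?? { max: 0, count: 0 };
    agg.max = Math.max(agg.max, sim);
    agg.count += 1;
    perSession.set(e.sessionId, agg);
  }
  const scores = new Map<string, number>();
  perSession.forEach((agg, id) => {
    // Best hit dominates; additional hits add a small, diminishing bonus.
    scores.set(id, agg.max + Math.log1p(agg.count) * countWeight);
  });
  return scores;
}
```

The `log1p` term keeps a session with many moderate hits from being ranked purely on its single best event, while never outweighing a clearly better top match.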
Other additions
- Resolver tests (#10, merged via #11 from @mvanhorn): stopword filtering, multi-keyword ranking, tie-break ordering, archived-session exclusion, explicit-id order preservation.
- Kustomize base + overlay pattern (`deploy/k8s/kustomization.yaml`, `deploy/README-kustomize.md`). Environment-specific overlays go under gitignored `deploy/k8s-*/` so private values (registry, hostnames, TLS secret) never leak into the OSS base. Full apply flow: `kubectl apply -k deploy/k8s-<name>/`.
Changed
- `fetchSessionsById(ids)` returns rows in the caller's input order instead of SQLite rowid order. Internal callers don't rely on ordering; external importers can now trust the result to match the input list.
- Ask resolver no longer excludes completed sessions when FTS surfaces them. "Find a session where I worked on X" was always going to be about past finished work; the active-only filter hid the right answer.
- FTS-surfaced session ranking now uses `max(score) + log1p(count) × 0.1` per session, not just max. BM25 penalizes high-frequency documents; a session about the topic (many moderate hits) was losing to one with a single rare-term bullseye.
- LLM provider's openai-compatible adapter:
  - Adds `think: false` (Ollama ≥0.7) and `chat_template_kwargs.enable_thinking: false` (vLLM/SGLang) to suppress reasoning blocks that consumed the entire output window without producing the answer.
  - Falls back to `choices[0].message.reasoning` when `content` is empty so Qwen3 thinking-mode responses surface useful text even when the answer didn't fit in the budget.
- `event_processor.insertNormalizedEvents` returns real DB row IDs via `.returning()` instead of `id: 0` placeholders. Required for ingest-time vector indexing; consumers who relied on the placeholder behavior… don't exist (verified across the repo).
- Memory limit bumped 512Mi → 1Gi in the base deployment; homelab overlay further bumps to 2Gi to absorb Ask streams + enricher LLM fetch buffering on bigger workloads.
- TLS secret name in the base IngressRoute scrubbed from a cluster-specific wildcard name to the placeholder `agentpulse-tls`. Real cert names go in the gitignored overlay.
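The two adapter changes in the list above can be sketched as a request decorator and a response reader. The field names (`think`, `chat_template_kwargs.enable_thinking`, `choices[0].message.reasoning`) are as given in these notes; the function names and the trimmed-down response types are hypothetical.

```typescript
// Sketch only: types cover just the fields this example reads.
interface ChatMessage {
  content?: string | null;
  reasoning?: string | null;
}

interface ChatResponse {
  choices: Array<{ message: ChatMessage }>;
}

// Adds both suppression flags; per the notes, Ollama reads one and vLLM/SGLang the other.
function withThinkingDisabled<T extends object>(body: T) {
  return {
    ...body,
    think: false,
    chat_template_kwargs: { enable_thinking: false },
  };
}

// Prefer the answer; fall back to reasoning text only when the answer is empty.
function extractText(res: ChatResponse): string {
  const message = res.choices[0]?.message;
  const content = message?.content?.trim() ?? "";
  if (content.length > 0) return content;
  return message?.reasoning?.trim() ?? "";
}
```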
Fixed
- Search returned 500 in 2ms under any concurrent ingest load. The FTS backend was opening a second `bun:sqlite` connection that raced the primary connection's WAL snapshot. Now shares the drizzle-owned handle with `PRAGMA busy_timeout = 5000` so brief writer collisions block + retry instead of throwing.
- Ask SSE stream dropped during enricher warmup. With LLM expansion in front of the main Ask call, time-to-first-token on local-Qwen setups climbed to 15–20s. The route now emits `: keepalive\n\n` every 5s while the model is warming, so the browser / Traefik don't time out the idle connection.
- Vector backfill stuck at 22 events with `running: true`. Events without extractable text (`Stop`, `SubagentStop`, `SessionEnd` with no content) were correctly skipped, but the next batch query's LEFT JOIN re-surfaced them indefinitely. Now writes a `dim=0` placeholder row so the join excludes them from future batches; the cosine query already filters by `dim = adapter.dim` so placeholders are invisible to lookups.
- Pre-existing FTS data wasn't indexed on upgrades: triggers only fire on new writes. The boot-time backfill (above) closes this gap automatically.
- Ollama embed URL hit `/v1/api/embed` (404) when the LLM provider's `baseUrl` ended in `/v1` (the standard OpenAI-compatible chat path). Embed adapter now strips a trailing `/v1` before building the embed URL.
- Pod OOMed mid-Ask-stream under 512Mi limit (exit 137). Memory bumped + responsible code paths tightened.
- AND-mode FTS query of full Ask message practically never matched: every token had to appear in one document. The Ask resolver now passes only the stopword-filtered tokens and uses OR mode; users still get AND in the search box where specificity is the goal.
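The embed-URL fix in the list above comes down to a small normalization. A sketch, assuming a hypothetical `ollamaEmbedUrl` helper; only the `/v1` stripping and the `/api/embed` path are taken from these notes, and the trailing-slash handling is an extra assumption.

```typescript
// Sketch only: ollamaEmbedUrl is a hypothetical name.
function ollamaEmbedUrl(baseUrl: string): string {
  const host = baseUrl
    .replace(/\/+$/, "") // assumption: tolerate trailing slashes
    .replace(/\/v1$/, ""); // drop the OpenAI-compatible chat prefix
  return `${host}/api/embed`;
}
```

With this, `http://localhost:11434/v1` and `http://localhost:11434` both resolve to `http://localhost:11434/api/embed`.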
Fixed — documentation
- Postgres backend is not yet implemented. README, wiki, and release notes previously implied `DATABASE_URL=postgres://…` works; it doesn't (parses, then falls back to SQLite with a warning). Tracking issue #12. Phased port plan in `thoughts/2026-04-24-postgres-backend-plan.md`.
v0.2.0-pre.1 — Ask assistant + in-app Telegram
First pre-release after 0.1.0. See CHANGELOG.md for the full detail.
Highlights
Ask assistant (Labs)
- Global `/ask` chat grounded in your live session state. Resolver scores sessions by fuzzy-match on name / cwd / branch / current task and hands the LLM a terse `<sessions>` block. Breadth hints (all/every/across) widen the pool.
- SSE streaming on the web with markdown rendering (headers, lists, fenced code blocks).
- Ask via Telegram — DM the enrolled bot and get grounded replies back in the same chat. Origin-preserving delivery so web threads never push to Telegram and vice versa.
Telegram setup without the command line
- Paste-token wizard; bot token + webhook secret live encrypted in the `settings` table.
- Polling delivery mode for instances without public reachability (home-lab, NAT'd, private-DNS). Switch between webhook and polling from the UI at any time.
AI provider UX
- "Load available models" button probes `/models` on the configured endpoint and turns the Model field into a dropdown.
Setup / onboarding
- Dashboard empty state collapses first-run into one screen (mint API key → copy install command → start an agent).
Resilience
- Auto-reload on expired Authentik — cross-origin 302 is detected and triggers a top-level reload so the OIDC round-trip completes silently.
- DB migrations retry on SQLite lock contention — rolling k8s updates no longer leave a new pod running against a stale schema.
- Watcher loop fix: dropped the `session_updated` trigger that was enqueuing ~19 no-op runs/minute per active session.
- Streaming fix: worked around `ERR_HTTP2_PROTOCOL_ERROR` caused by `Transfer-Encoding: chunked` on HTTP/2.
- Telegram webhook 401 fix: public webhook now mounts outside the auth'd `api` bundle.
Upgrade notes
No breaking changes — existing 0.1.0 deployments migrate forward automatically (new columns via idempotent ALTER TABLE with the new retry-on-lock path).
Pre-release. Expect rough edges around streaming markdown flicker while code blocks are partial, and LLM-cost handling for cloud providers on Ask (Qwen local = free).
v0.1.0-beta.1
AgentPulse first prerelease.
Highlights
- Real-time observability for Claude Code and Codex sessions
- Session workspaces with prompts, responses, progress, notes, and instruction-file editing
- Local orchestration with templates, supervisors, headless tasks, interactive sessions, retries, and host routing
- One-command local install with Bun + SQLite for macOS, Linux, and Windows
- Observability-only install path via --skip-supervisor
- Public hosted installers:
Notes
- This is a prerelease intended for OSS users to test both observability-only and local orchestration flows.
- Windows now has a first-class installer path, but should still be considered newly shipped and worth extra user validation.