feat(infra): add structured observability events system #7

Open
turquoisebaydev wants to merge 3613 commits into main from feature/observability-events

@turquoisebaydev
Owner

Adds a structured observability event system with typed payloads (domain/phase/status), sequence numbering, a listener pattern, and a file-based mirror for debugging. Includes a config schema for enabling or disabling domains and controlling capture verbosity.

vincentkoc and others added 30 commits March 7, 2026 18:00
…estart reason

resolveChannelRestartReason did not handle the "disconnected" evaluation
reason explicitly, so it fell through to "stuck". This conflates a clean
WebSocket drop (e.g. Discord 1006) with a genuinely stuck channel, making
logs misleading and preventing future policy differentiation.

Add "disconnected" to ChannelRestartReason and handle it before the
catch-all "stuck" return.
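
The ordering described above can be sketched as follows. This is a minimal illustration, not the project's code: the real signature of `resolveChannelRestartReason` and the full membership of `ChannelRestartReason` are not shown in this commit message, so the two-member union and the string parameter below are assumptions.

```typescript
// Hypothetical, reduced union: the real ChannelRestartReason likely has more members.
type ChannelRestartReason = "disconnected" | "stuck";

// Assumed shape: the evaluation reason arrives as a plain string.
function resolveChannelRestartReason(evaluationReason: string): ChannelRestartReason {
  // Handle the clean-drop case explicitly, before the catch-all.
  if (evaluationReason === "disconnected") {
    return "disconnected";
  }
  // Anything not handled above still falls through to "stuck".
  return "stuck";
}
```

Keeping the explicit check ahead of the catch-all return is what lets logs and any later restart policy tell the two cases apart.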

Closes openclaw#36404
Landed from contributor PR openclaw#39326 by @dunamismax.

Co-authored-by: dunamismax <[email protected]>
* CLI: add gateway password-file option

* Docs: document safer gateway password input

* Update src/cli/gateway-cli/run.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Tests: clean up gateway password temp dirs

* CLI: restore gateway password warning flow

* Security: harden secret file reads

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Daemon: allow unmanaged gateway lifecycle fallback

* Status: fix service summary formatting

* Changelog: note unmanaged gateway lifecycle fallback

* Tests: cover unmanaged gateway lifecycle fallback

* Daemon: split unmanaged restart health checks

* Daemon: harden unmanaged gateway signaling

* Daemon: reject unmanaged restarts when disabled
Landed from contributor PR openclaw#39313 by @fellanH.

Co-authored-by: Felix Hellström <[email protected]>
Landed from contributor PR openclaw#39238 by @widingmarcus-cyber.

Co-authored-by: Marcus Widing <[email protected]>
Landed from contributor PR openclaw#39196 by @scoootscooob.

Co-authored-by: scoootscooob <[email protected]>
Landed from contributor PR openclaw#39318 by @ql-wade.

Co-authored-by: ql-wade <[email protected]>
AaronWander and others added 30 commits March 8, 2026 19:11
Add browser.relayBindHost config option so the Chrome extension relay
server can bind to a non-loopback address (e.g. 0.0.0.0 for WSL2).
Defaults to 127.0.0.1 when unset, preserving current behavior.

Closes openclaw#39214

Co-Authored-By: Claude Opus 4.6 <[email protected]>
… bind

- Zod schema: validate relayBindHost with ipv4/ipv6 instead of bare string
- Upgrade handler: allow non-loopback connections when bindHost is explicitly
  non-loopback (e.g. 0.0.0.0 for WSL2), keeping loopback-only default
- Test: verify actual bind address via relay.bindHost instead of just checking
  reachability on 127.0.0.1 which passes regardless
- Expose bindHost on ChromeExtensionRelayServer type for inspection
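
A rough sketch of the default and the upgrade check described in these two commits. The helper names (`resolveRelayBindHost`, `isLoopbackAddress`, `shouldAcceptUpgrade`) and the config shape are hypothetical; only the `relayBindHost` option, the `127.0.0.1` default, and the loopback-only-by-default behaviour come from the commit messages.

```typescript
const DEFAULT_RELAY_BIND_HOST = "127.0.0.1";

// Assumed config shape: only the relayBindHost field is taken from the source.
type BrowserConfig = { relayBindHost?: string };

// Fall back to loopback when the option is unset.
function resolveRelayBindHost(browser?: BrowserConfig): string {
  return browser?.relayBindHost ?? DEFAULT_RELAY_BIND_HOST;
}

// Treat 127.0.0.0/8, ::1, and IPv4-mapped loopback (::ffff:127.x.x.x) as loopback.
function isLoopbackAddress(address: string): boolean {
  const normalized = address.startsWith("::ffff:") ? address.slice(7) : address;
  return normalized === "::1" || normalized.startsWith("127.");
}

// Loopback peers are always accepted; other peers only when the operator
// explicitly bound the relay to a non-loopback host (e.g. 0.0.0.0 for WSL2).
function shouldAcceptUpgrade(bindHost: string, remoteAddress: string): boolean {
  if (isLoopbackAddress(remoteAddress)) {
    return true;
  }
  return !isLoopbackAddress(bindHost);
}
```

With no config, `resolveRelayBindHost()` yields `127.0.0.1`, and `shouldAcceptUpgrade("127.0.0.1", "192.168.1.5")` is `false`, which matches the stated goal of preserving the loopback-only default.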

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Sanitized test tailnet hostnames and re-ran the targeted macOS gateway discovery test suite before merge.
Merged via squash.

Prepared head SHA: ed46625
Co-authored-by: shichangs <[email protected]>
Co-authored-by: gumadeiras <[email protected]>
Reviewed-by: @gumadeiras
…ssion error (openclaw#39435)

* fix(setup-podman): cd to TMPDIR before podman load to avoid inherited cwd permission error

* fix(podman): safe cwd in run_as_user to prevent chdir errors

Co-Authored-By: Claude Opus 4.6  <[email protected]>
Signed-off-by: sallyom <[email protected]>

---------

Signed-off-by: sallyom <[email protected]>
Co-authored-by: sallyom <[email protected]>
Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
…d minimal prompt mode (openclaw#40204)

* fix(cron): consolidate announce delivery and detach manual runs

* fix: queue detached cron runs (openclaw#40204)
…enclaw#40281)

Merged via squash.

- Local validation: `pnpm exec vitest run --config vitest.gateway.config.ts src/gateway/server-methods/nodes.invoke-wake.test.ts`
- Local validation: `pnpm build`
- mb-server validation: `pnpm exec vitest run --config vitest.gateway.config.ts src/gateway/server-methods/nodes.invoke-wake.test.ts`
- mb-server validation: `pnpm build`
- mb-server validation: `pnpm protocol:check`
…#40282)

Merged via squash.

- mb-server validation: `swift test --package-path apps/shared/OpenClawKit --filter GatewayNodeSessionTests`
- mb-server validation: `pnpm build`
- Scope note: top-level `RootTabs` shell change was intentionally removed from this PR before merge
…rmissive hosts (openclaw#39449)

* fix(run-openclaw-podman): add SELinux :Z mount option on Linux with enforcing/permissive SELinux

* fix(quadlet): add SELinux :Z label to openclaw.container.in volume mount

* fix(podman): add SELinux :Z mount option for Fedora/RHEL hosts

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: sallyom <[email protected]>

---------

Signed-off-by: sallyom <[email protected]>
Co-authored-by: sallyom <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
* Docker: shrink runtime image payload

* Docker: add runtime pnpm opt-in

* Docker: collapse helper entrypoint chmod layers

* Docker: restore bundled pnpm runtime

* Update CHANGELOG.md
…0342)

* docs(changelog): move post-2026.3.8 entries to unreleased

* Update CHANGELOG.md
…claw#40345)

* fix(tui): improve colour contrast for light-background terminals (openclaw#38636)

Detect light terminal backgrounds via COLORFGBG and apply a WCAG
AA-compliant light palette. Adds OPENCLAW_THEME=light|dark env var
override for terminals without auto-detection.

Uses proper sRGB linearisation and WCAG 2.1 contrast ratios to pick
whichever text palette (dark or light) has higher contrast against
the detected background colour.
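
For reference, the WCAG 2.1 contrast calculation mentioned above can be sketched like this. The function names and the `Rgb` tuple type are illustrative assumptions, and the sketch leaves out COLORFGBG parsing and the `OPENCLAW_THEME` override; only the luminance and contrast formulas follow the WCAG 2.1 definitions.

```typescript
// 8-bit sRGB colour as [red, green, blue].
type Rgb = [number, number, number];

// sRGB linearisation, using the threshold from the WCAG 2.1 definition.
function linearise(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG relative luminance: weighted sum of the linearised channels.
function relativeLuminance([r, g, b]: Rgb): number {
  return 0.2126 * linearise(r) + 0.7152 * linearise(g) + 0.0722 * linearise(b);
}

// WCAG contrast ratio, ranging from 1 (identical) to 21 (black on white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Pick whichever candidate text colour contrasts more with the background.
function pickTextColour(background: Rgb, first: Rgb, second: Rgb): Rgb {
  return contrastRatio(background, first) >= contrastRatio(background, second)
    ? first
    : second;
}
```

WCAG AA asks for a ratio of at least 4.5:1 for normal-size text, so a palette check along these lines would compare `contrastRatio` against that threshold.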

Co-authored-by: ademczuk <[email protected]>

* Update CHANGELOG.md

---------

Co-authored-by: ademczuk <[email protected]>
Co-authored-by: ademczuk <[email protected]>
Adds a domain-typed, sequence-numbered event bus for runtime observability
across the gateway lifecycle. Key components:

- **src/infra/observability-events.ts** — core event bus with:
  - Typed event payloads (domain, phase, status, runId, sessionId, etc.)
  - Global sequence counter for ordering/correlation
  - Listener registration/deregistration pattern
  - Recursion guard (depth >100 drops + logs, never throws)
  - Optional file-based log mirror (tail/JSON formats)
  - Test reset + config override helpers

- **src/config/types.observability.ts** — new config type definitions:
  - `ObservabilityDomain` union ('llm' | 'tool' | 'run' | 'queue' | 'session')
  - `ObservabilityConfig` with master toggle, per-domain event filtering,
    and file-mirror settings (format, filePath, includeEventIds)

- **src/config/types.openclaw.ts** — adds `observability?: ObservabilityConfig`
  to `OpenClawConfig`

- **src/config/types.ts** — re-exports new types.observability module
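
As a rough illustration of the config types listed above: the domain union and the file-mirror field names (`format`, `filePath`, `includeEventIds`) come from this description, while the `enabled`, `domains`, and `fileMirror` property names, the off-by-default behaviour, and the `isDomainEnabled` helper are assumptions made for the sketch.

```typescript
type ObservabilityDomain = "llm" | "tool" | "run" | "queue" | "session";

type ObservabilityConfig = {
  // Master toggle (property name assumed).
  enabled?: boolean;
  // Per-domain filtering (property name and shape assumed).
  domains?: Partial<Record<ObservabilityDomain, boolean>>;
  // File-mirror settings; the three field names are from the description.
  fileMirror?: {
    format?: "tail" | "json";
    filePath?: string;
    includeEventIds?: boolean;
  };
};

// Hypothetical helper: a domain is captured only when the master toggle is on
// and the domain has not been switched off explicitly.
function isDomainEnabled(
  config: ObservabilityConfig | undefined,
  domain: ObservabilityDomain,
): boolean {
  if (!config?.enabled) {
    return false;
  }
  return config.domains?.[domain] ?? true;
}
```

Under these assumptions, `{ enabled: true, domains: { llm: false } }` would capture every domain except `llm`.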

Event schema (ObservabilityEventPayload):
  id, ts, seq, domain, event, phase?, runId?, sessionId?, sessionKey?,
  agentId?, status?, durationMs?, error?, data?

All observability operations fail-safe: listener errors are caught and
logged, file writes swallow errors, recursion is bounded.
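
A minimal sketch of how such a bus could fit together, assuming a factory function and an injected logger. The payload fields, the sequence counter, the listener pattern, and the depth limit of 100 are taken from the description above; `createObservabilityBus`, `subscribe`, the concrete field types, and the `evt-<seq>` id format are hypothetical, and the file mirror is left out.

```typescript
type ObservabilityDomain = "llm" | "tool" | "run" | "queue" | "session";

// Field names follow the event schema above; the value types are assumed.
type ObservabilityEventPayload = {
  id: string; ts: number; seq: number;
  domain: ObservabilityDomain; event: string;
  phase?: string; runId?: string; sessionId?: string; sessionKey?: string;
  agentId?: string; status?: string; durationMs?: number;
  error?: string; data?: Record<string, unknown>;
};

type ObservabilityListener = (evt: ObservabilityEventPayload) => void;
type EmitInput = Omit<ObservabilityEventPayload, "id" | "ts" | "seq">;

const MAX_EMIT_DEPTH = 100;

function createObservabilityBus(log: (message: string) => void = () => {}) {
  const listeners = new Set<ObservabilityListener>();
  let seq = 0;   // global sequence counter for ordering
  let depth = 0; // current nesting depth of emit() calls

  function emit(input: EmitInput): ObservabilityEventPayload | undefined {
    depth += 1;
    try {
      // Recursion guard: drop and log instead of throwing.
      if (depth > MAX_EMIT_DEPTH) {
        log(`observability: dropped "${input.event}" (depth > ${MAX_EMIT_DEPTH})`);
        return undefined;
      }
      seq += 1;
      const evt: ObservabilityEventPayload = {
        ...input, id: `evt-${seq}`, ts: Date.now(), seq,
      };
      // Iterate over a copy so listeners may unsubscribe while being notified.
      for (const listener of [...listeners]) {
        try {
          listener(evt);
        } catch (err) {
          log(`observability: listener failed: ${String(err)}`);
        }
      }
      return evt;
    } finally {
      depth -= 1;
    }
  }

  // Register a listener; the returned function deregisters it.
  function subscribe(listener: ObservabilityListener): () => void {
    listeners.add(listener);
    return () => {
      listeners.delete(listener);
    };
  }

  return { emit, subscribe };
}
```

A caller would do something like `const stop = bus.subscribe(handler); bus.emit({ domain: "run", event: "started" }); stop();`. Catching listener errors inside the loop means one faulty listener cannot prevent delivery to the others.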