SDK runner: close message processing parity gaps with Pi embedded agent #12

Closed
dgarson wants to merge 11 commits into main from claude/review-message-processing-djLHt

dgarson (Owner) commented Feb 7, 2026

  • Gap 1 (HIGH): Add streaming block replies during the event loop via blockReplyBreak/blockReplyChunking params. Supports "message_end" (flush at message boundaries) and "text_end" (per-chunk) modes, matching Pi embedded's streaming delivery. Final onBlockReply is skipped when streaming was active to avoid duplicate delivery.

  • Gap 2 (MEDIUM): Add onReasoningStream callback support. Extract thinking/reasoning text from SDK events classified as "system" and emit via the new callback, enabling reasoning display in UIs.

  • Gap 3 (MEDIUM): Reset chunks/assistantSoFar at message boundaries so onPartialReply reflects only the current message, not accumulated text from all prior turns.

  • Gap 5 (LOW-MED): Add enforceFinalTag support. When enabled, only content inside <final>...</final> tags is returned, preventing intermediate narration leakage from non-Claude models. Adds shared extractFinalTagContent() utility.

  • Gap 6 (LOW-MED): Map all payload fields (mediaUrl, mediaUrls, replyToId, replyToTag, replyToCurrent, audioAsVoice) in adaptSdkResultToPiResult to prevent silent field loss.
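
To make the Gap 5 behavior concrete, here is a minimal sketch of how a `<final>` tag extractor might work. Only the names extractFinalTagContent and enforceFinalTag come from this PR; the signature, the `applyFinalTag` helper, and the choice to return an empty string when no tag is found are assumptions for illustration, not the merged implementation.

```typescript
// Hypothetical sketch: the real extractFinalTagContent() in this PR may differ.

/**
 * Returns the content of every complete <final>...</final> block joined by
 * newlines, or null when the text contains no complete block.
 */
function extractFinalTagContent(text: string): string | null {
  // The regex is created per call so no lastIndex state leaks between calls.
  const re = /<final>([\s\S]*?)<\/final>/gi;
  const parts: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(text)) !== null) {
    parts.push((match[1] ?? "").trim());
  }
  return parts.length > 0 ? parts.join("\n") : null;
}

/**
 * Applies the enforceFinalTag rule (assumed semantics): with enforcement on,
 * text outside <final> tags is dropped; with it off, text passes through.
 */
function applyFinalTag(text: string, enforceFinalTag: boolean): string {
  if (!enforceFinalTag) return text;
  return extractFinalTagContent(text) ?? "";
}
```

With enforcement enabled, a reply such as `Let me think. <final>Done.</final>` would be reduced to `Done.`, which is the narration-leak protection the bullet describes.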

Wire all new params through the full chain:
executor.ts → AgentRuntimeRunParams → sdk-agent-runtime → sdk-runner-adapter → sdk-runner
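
The interaction between Gaps 1 and 3 can be sketched as a small state holder. This is an illustration under stated assumptions: the class name `BlockReplyStreamer`, its method names, and the callback shapes are hypothetical, and only the mode strings "message_end" and "text_end" and the callback names onBlockReply and onPartialReply come from the description above.

```typescript
// Hypothetical sketch; method names and event shapes are assumptions,
// not the SDK runner's actual API.
type BlockReplyBreak = "message_end" | "text_end";

interface StreamCallbacks {
  onBlockReply: (text: string) => void;
  onPartialReply?: (assistantSoFar: string) => void;
}

class BlockReplyStreamer {
  private chunks: string[] = [];
  private streamed = false;
  private readonly mode: BlockReplyBreak;
  private readonly callbacks: StreamCallbacks;

  constructor(mode: BlockReplyBreak, callbacks: StreamCallbacks) {
    this.mode = mode;
    this.callbacks = callbacks;
  }

  /** Handles one text chunk belonging to the current assistant message. */
  onText(chunk: string): void {
    this.chunks.push(chunk);
    // Partial replies cover only the current message (Gap 3).
    this.callbacks.onPartialReply?.(this.chunks.join(""));
    if (this.mode === "text_end") this.emit(chunk);
  }

  /** Handles a message boundary: flush in "message_end" mode, then reset. */
  onMessageEnd(): void {
    if (this.mode === "message_end" && this.chunks.length > 0) {
      this.emit(this.chunks.join(""));
    }
    this.chunks = [];
  }

  /** Delivers the final text only if nothing was streamed, avoiding duplicates. */
  finish(finalText: string): void {
    if (!this.streamed) this.callbacks.onBlockReply(finalText);
  }

  private emit(text: string): void {
    this.streamed = true;
    this.callbacks.onBlockReply(text);
  }
}
```

Tracking a single `streamed` flag keeps the duplicate-delivery rule in one place: the final onBlockReply fires only for runs where no block was delivered during the event loop.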

https://claude.ai/code/session_013FRzxnpzCwncubghL7GGCF

dgarson and others added 10 commits February 6, 2026 19:13
Implement a comprehensive scoring system that evaluates whether experiences
should become long-term memories based on five weighted factors:

- **Novelty**: Is this new information or repetition?
- **Impact**: Does this change understanding, behavior, or system state?
- **Relational**: Does this connect to known entities/people/projects?
- **Temporal**: Is this time-sensitive or evergreen?
- **User Intent**: Was this explicitly marked as important?

Features:
- Configurable weights for each factor (normalized to sum to 1.0)
- Named threshold profiles: balanced, aggressive, conservative, minimal
- Override rules for specific tools (glob-pattern matching)
- evaluateMemoryRelevance() as primary entry point
- shouldCapture(), shouldPersistToGraph(), shouldUseLlmEval() helpers
- Backward-compatible evaluateHeuristic() now delegates to scoring system
- New evaluateHeuristicDetailed() returns full scoring breakdown
- 37 comprehensive tests covering all factors and edge cases

Files:
- extensions/meridia/src/meridia/scoring/types.ts
- extensions/meridia/src/meridia/scoring/config.ts
- extensions/meridia/src/meridia/scoring/factors.ts
- extensions/meridia/src/meridia/scoring/index.ts
- extensions/meridia/src/meridia/scoring/scoring.test.ts
- extensions/meridia/src/meridia/evaluate.ts (updated)
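
As a rough illustration of the weighted scoring described in this commit message, the sketch below normalizes five factor weights to sum to 1.0 and compares the weighted score against a per-profile threshold. The factor and profile names follow the commit message; the function names, the numeric thresholds, and the clamping behavior are assumptions and are not taken from the meridia source.

```typescript
// Hypothetical sketch: weights, thresholds, and helper names are invented
// for illustration and do not reflect the actual meridia scoring module.
type Factor = "novelty" | "impact" | "relational" | "temporal" | "userIntent";
type FactorValues = Record<Factor, number>;

const FACTORS: readonly Factor[] = ["novelty", "impact", "relational", "temporal", "userIntent"];

/** Rescales non-negative weights to sum to 1.0; equal weights if all are zero. */
function normalizeWeights(weights: FactorValues): FactorValues {
  const total = FACTORS.reduce((sum, f) => sum + Math.max(0, weights[f]), 0);
  const out = {} as FactorValues;
  for (const f of FACTORS) {
    out[f] = total > 0 ? Math.max(0, weights[f]) / total : 1 / FACTORS.length;
  }
  return out;
}

/** Weighted sum of per-factor scores, each clamped to the range [0, 1]. */
function weightedScore(scores: FactorValues, weights: FactorValues): number {
  const w = normalizeWeights(weights);
  return FACTORS.reduce(
    (sum, f) => sum + w[f] * Math.min(1, Math.max(0, scores[f])),
    0,
  );
}

// Assumed capture thresholds per named profile (values are placeholders).
const CAPTURE_THRESHOLDS = {
  balanced: 0.5,
  aggressive: 0.3,
  conservative: 0.7,
  minimal: 0.9,
} as const;
type Profile = keyof typeof CAPTURE_THRESHOLDS;

/** Decides capture by comparing a score with the profile's threshold. */
function shouldCaptureSketch(score: number, profile: Profile): boolean {
  return score >= CAPTURE_THRESHOLDS[profile];
}
```

Normalizing inside `weightedScore` means callers can supply weights on any scale, and the resulting score stays comparable across configurations.
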
feat(meridia): multi-factor memory relevance scoring system
feat: add configurable subsystem debug log suppression
- Gap 1 (HIGH): Add streaming block replies during the event loop via
  blockReplyBreak/blockReplyChunking params. Supports "message_end"
  (flush at message boundaries) and "text_end" (per-chunk) modes,
  matching Pi embedded's streaming delivery. Final onBlockReply is
  skipped when streaming was active to avoid duplicate delivery.

- Gap 2 (MEDIUM): Add onReasoningStream callback support. Extract
  thinking/reasoning text from SDK events classified as "system" and
  emit via the new callback, enabling reasoning display in UIs.

- Gap 3 (MEDIUM): Reset chunks/assistantSoFar at message boundaries
  so onPartialReply reflects only the current message, not accumulated
  text from all prior turns.

- Gap 5 (LOW-MED): Add enforceFinalTag support. When enabled, only
  content inside <final>...</final> tags is returned, preventing
  intermediate narration leakage from non-Claude models. Adds shared
  extractFinalTagContent() utility.

- Gap 6 (LOW-MED): Map all payload fields (mediaUrl, mediaUrls,
  replyToId, replyToTag, replyToCurrent, audioAsVoice) in
  adaptSdkResultToPiResult to prevent silent field loss.
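
One way to picture the Gap 6 fix is an adapter that copies every optional payload field explicitly instead of only the text. The field names below come from the commit message; the `ReplyPayload` interface, the field types, and the `copyPayloadFields` helper are assumptions, since the real SDK and Pi payload types are not shown here.

```typescript
// Hypothetical shapes: field types are guesses, only the field names are
// taken from the commit message.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
  mediaUrls?: string[];
  replyToId?: string;
  replyToTag?: boolean;
  replyToCurrent?: boolean;
  audioAsVoice?: boolean;
}

/** Copies every defined payload field so none is silently dropped. */
function copyPayloadFields(src: ReplyPayload): ReplyPayload {
  const out: ReplyPayload = {};
  if (src.text !== undefined) out.text = src.text;
  if (src.mediaUrl !== undefined) out.mediaUrl = src.mediaUrl;
  // Copy the array so later mutation of the source does not affect the result.
  if (src.mediaUrls !== undefined) out.mediaUrls = [...src.mediaUrls];
  if (src.replyToId !== undefined) out.replyToId = src.replyToId;
  if (src.replyToTag !== undefined) out.replyToTag = src.replyToTag;
  if (src.replyToCurrent !== undefined) out.replyToCurrent = src.replyToCurrent;
  if (src.audioAsVoice !== undefined) out.audioAsVoice = src.audioAsVoice;
  return out;
}
```

Listing the fields one by one is verbose, but it makes an omitted field visible in review, which is the failure mode this gap addresses.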

Wire all new params through the full chain:
  executor.ts → AgentRuntimeRunParams → sdk-agent-runtime →
  sdk-runner-adapter → sdk-runner

https://claude.ai/code/session_013FRzxnpzCwncubghL7GGCF
Resolve merge conflicts in meridia scoring module by taking
origin/main's refactored multi-factor scoring system (scorer.ts,
defaults.ts, updated types/factors/index). Remove HEAD-only
scoring/config.ts and scoring.test.ts that used the old API
(origin/main has proper tests in factors.test.ts and scorer.test.ts).

Co-Authored-By: Claude Opus 4.6 <[email protected]>
dgarson added a commit that referenced this pull request Feb 7, 2026
…#208)

* feat: added work-queue workflow workers and a worker manager, integrate w/cron

* fix: address 12 logic flaws in workflow engine, adapter, phases, and cron types

- engine: capture phase before clobbering in catch handler (#1)
- engine: remove dead `state.plan ?? plan` fallback (#2)
- engine: mark workflow failed when all execution nodes fail (#6)
- adapter: fix retry count off-by-one (attemptNumber vs retryCount) (#3)
- adapter: clean up abort listener in sleep on timeout (#10)
- discover: fix batch/entries index mismatch when spawns fail (#4)
- execute: add cycle detection to topological sort (#5)
- review: add autoApproved flag to distinguish fallback approvals (#7)
- plan: add comment clarifying sessionKey reuse across repair attempts (#8)
- decompose: remove unused model/maxPhases/maxTasksPerPhase/maxSubtasksPerTask opts (#9)
- types: add autoApproved to ReviewIteration (#7)
- cron/state: use discriminated union for CronEvent (CronJobEvent | CronHealthEvent) (#12)
- tests: add WorkflowWorkerAdapter test suite (8 tests) (#11)
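
For the "cycle detection to topological sort" item, one common technique is Kahn's algorithm, where a cycle shows up as nodes that never reach in-degree zero. The sketch below uses that technique as an illustration; the `TaskNode` shape and function name are hypothetical, and the workflow engine's execute phase may detect cycles differently.

```typescript
// Hypothetical sketch using Kahn's algorithm; not the engine's actual code.
interface TaskNode {
  id: string;
  dependsOn: string[];
}

/** Returns node ids in dependency order, or throws if the graph has a cycle. */
function topologicalSort(nodes: TaskNode[]): string[] {
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const node of nodes) {
    inDegree.set(node.id, 0);
    dependents.set(node.id, []);
  }
  for (const node of nodes) {
    for (const dep of node.dependsOn) {
      const list = dependents.get(dep);
      if (list === undefined) {
        throw new Error(`Unknown dependency "${dep}" of "${node.id}"`);
      }
      list.push(node.id);
      inDegree.set(node.id, (inDegree.get(node.id) ?? 0) + 1);
    }
  }

  // Start with every node that has no unmet dependencies.
  const ready: string[] = [];
  inDegree.forEach((degree, id) => {
    if (degree === 0) ready.push(id);
  });

  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Nodes on a cycle never reach in-degree 0, so they are missing from order.
  if (order.length !== nodes.length) {
    throw new Error("Cycle detected in execution graph");
  }
  return order;
}
```

Without the final length check, a cyclic plan would yield a silently truncated order rather than an error.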

https://claude.ai/code/session_01L8kquwpmUh5zmU9S4MHgPu

---------

Co-authored-by: Claude <[email protected]>
dgarson added a commit that referenced this pull request Feb 7, 2026
…cron types (openclaw#217)

- engine: capture phase before clobbering in catch handler (#1)
- engine: remove dead `state.plan ?? plan` fallback (#2)
- engine: mark workflow failed when all execution nodes fail (#6)
- adapter: fix retry count off-by-one (attemptNumber vs retryCount) (#3)
- adapter: clean up abort listener in sleep on timeout (#10)
- discover: fix batch/entries index mismatch when spawns fail (#4)
- execute: add cycle detection to topological sort (#5)
- review: add autoApproved flag to distinguish fallback approvals (#7)
- plan: add comment clarifying sessionKey reuse across repair attempts (#8)
- decompose: remove unused model/maxPhases/maxTasksPerPhase/maxSubtasksPerTask opts (#9)
- types: add autoApproved to ReviewIteration (#7)
- cron/state: use discriminated union for CronEvent (CronJobEvent | CronHealthEvent) (#12)
- tests: add WorkflowWorkerAdapter test suite (8 tests) (#11)
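
The CronEvent change can be pictured as a discriminated union that the compiler narrows on a shared tag. Only the type names CronEvent, CronJobEvent, and CronHealthEvent come from the commit message; the `kind` discriminant, the other fields, and the `describeCronEvent` helper are assumptions made for this sketch.

```typescript
// Hypothetical field names; only the three type names come from the commit.
interface CronJobEvent {
  kind: "job";
  jobId: string;
  status: "started" | "finished" | "error";
}

interface CronHealthEvent {
  kind: "health";
  healthy: boolean;
  detail?: string;
}

type CronEvent = CronJobEvent | CronHealthEvent;

/** Formats an event; the switch narrows the union on the `kind` tag. */
function describeCronEvent(event: CronEvent): string {
  switch (event.kind) {
    case "job":
      return `job ${event.jobId}: ${event.status}`;
    case "health":
      return event.healthy
        ? "scheduler healthy"
        : `scheduler unhealthy: ${event.detail ?? "no detail"}`;
    default: {
      // Compile-time exhaustiveness check: adding a new variant without a
      // matching case makes this assignment a type error.
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

Compared with a single interface full of optional fields, the union prevents a handler from reading `jobId` on a health event.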

https://claude.ai/code/session_01L8kquwpmUh5zmU9S4MHgPu

Co-authored-by: Claude <[email protected]>
dgarson (Owner, Author) commented Feb 7, 2026

Already merged to main via git merge. Closing stale PR.

dgarson closed this Feb 7, 2026