Skip to content

fix(gateway): hot-reload agents.defaults.models changes#33820

Open
Mylszd wants to merge 3 commits intoopenclaw:mainfrom
Mylszd:fix/hotreload-agents-defaults-models
Open

fix(gateway): hot-reload agents.defaults.models changes#33820
Mylszd wants to merge 3 commits intoopenclaw:mainfrom
Mylszd:fix/hotreload-agents-defaults-models

Conversation

@Mylszd
Copy link
Copy Markdown
Contributor

@Mylszd Mylszd commented Mar 4, 2026

Summary

  • treat agents.defaults.models as a hot-reloadable prefix in gateway reload planning
  • trigger heartbeat restart when model allowlist entries are added/removed
  • extend config reload test coverage for agents.defaults.models.* paths

Why

Issue #33600 reports that editing agents.defaults.models is detected but not applied at runtime. The reload plan previously only handled agents.defaults.model and models.*, so allowlist edits were skipped by hot actions.

Testing

  • pnpm vitest src/gateway/config-reload.test.ts

Closes #33600

@openclaw-barnacle openclaw-barnacle bot added gateway Gateway runtime agents Agent runtime and tooling size: S labels Mar 4, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 4, 2026

Greptile Summary

This PR fixes hot-reload handling for agents.defaults.models config paths (issue #33600), adds a SILENT_REPLY_TOKEN return path for heartbeat runs that deliver via messaging tools, and extends failover classification to recognize zhipuai's "Weekly/Monthly Limit Exhausted" rate-limit errors.

  • config-reload-plan.ts: New agents.defaults.models prefix rule is correctly inserted and disambiguated from the existing agents.defaults.model rule by the dot-delimiter matching in matchRule.
  • agent-runner.ts: The SILENT_REPLY_TOKEN path is only reachable when payloadArray is non-empty (the guard at payloadArray.length === 0 fires earlier). If a heartbeat agent sends via messaging tool but emits no direct payloads, undefined would be returned instead of SILENT_REPLY_TOKEN.
  • failover-matches.ts: The new /weekly\/monthly limit/i regex precisely targets zhipuai's error format. The accompanying plain string "limit exhausted" is broad and could match SDK/transport retry-exhaustion messages, causing them to be misrouted through the rate-limit failover path instead of timeout.
  • Test coverage is appropriately extended across all three change areas.

Confidence Score: 3/5

  • Generally safe to merge with two issues worth addressing before the next release: a code-path gap in the SILENT_REPLY_TOKEN feature and a potentially over-broad rate-limit pattern.
  • The gateway hot-reload fix is clean and well-tested. The heartbeat SILENT_REPLY_TOKEN path has a gap where a zero-payload messaging-tool delivery would bypass the new logic and return undefined instead of SILENT_REPLY_TOKEN. The "limit exhausted" pattern is too broad and could cause false-positive rate-limit classifications for transport/retry errors, leading to incorrect failover behavior.
  • src/auto-reply/reply/agent-runner.ts (early-return gap) and src/agents/pi-embedded-helpers/failover-matches.ts (overly broad string pattern)

Last reviewed commit: fbe2ee1

Comment on lines 503 to 515
if (replyPayloads.length === 0) {
// Heartbeat runs can be configured to deliver via a messaging tool
// (for example when heartbeat.target routes to iMessage/WhatsApp).
// In that case buildReplyPayloads may suppress duplicate payloads because
// the text/media was already delivered externally, leaving the main session
// with "no response" and triggering followup chaining.
const deliveredViaMessagingTool =
(runResult.messagingToolSentTexts?.length ?? 0) > 0 ||
(runResult.messagingToolSentMediaUrls?.length ?? 0) > 0;
if (isHeartbeat && deliveredViaMessagingTool) {
return finalizeWithFollowup({ text: SILENT_REPLY_TOKEN }, queueKey, runFollowupTurn);
}
return finalizeWithFollowup(undefined, queueKey, runFollowupTurn);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early return skips SILENT_REPLY_TOKEN for zero-payload messaging-tool deliveries

The new check for isHeartbeat && deliveredViaMessagingTool is placed inside the replyPayloads.length === 0 block, which is only reached when payloadArray is non-empty (there is an earlier return at the payloadArray.length === 0 guard above).

If a heartbeat agent invokes a messaging tool to deliver content but emits no direct payloads (i.e., runResult.payloads is empty/undefined), the function returns undefined via the earlier early-return, completely bypassing the new logic. This would fail to return SILENT_REPLY_TOKEN even though the content was delivered via the messaging tool.

The deliveredViaMessagingTool guard should also be applied to the earlier early-return path:

// Around line 475-477
if (payloadArray.length === 0) {
  const deliveredViaMessagingTool =
    (runResult.messagingToolSentTexts?.length ?? 0) > 0 ||
    (runResult.messagingToolSentMediaUrls?.length ?? 0) > 0;
  if (isHeartbeat && deliveredViaMessagingTool) {
    return finalizeWithFollowup({ text: SILENT_REPLY_TOKEN }, queueKey, runFollowupTurn);
  }
  return finalizeWithFollowup(undefined, queueKey, runFollowupTurn);
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner.ts
Line: 503-515

Comment:
**Early return skips `SILENT_REPLY_TOKEN` for zero-payload messaging-tool deliveries**

The new check for `isHeartbeat && deliveredViaMessagingTool` is placed inside the `replyPayloads.length === 0` block, which is only reached when `payloadArray` is **non-empty** (there is an earlier return at the `payloadArray.length === 0` guard above).

If a heartbeat agent invokes a messaging tool to deliver content but emits **no direct payloads** (i.e., `runResult.payloads` is empty/undefined), the function returns `undefined` via the earlier early-return, completely bypassing the new logic. This would fail to return `SILENT_REPLY_TOKEN` even though the content was delivered via the messaging tool.

The `deliveredViaMessagingTool` guard should also be applied to the earlier early-return path:

```ts
// Around line 475-477
if (payloadArray.length === 0) {
  const deliveredViaMessagingTool =
    (runResult.messagingToolSentTexts?.length ?? 0) > 0 ||
    (runResult.messagingToolSentMediaUrls?.length ?? 0) > 0;
  if (isHeartbeat && deliveredViaMessagingTool) {
    return finalizeWithFollowup({ text: SILENT_REPLY_TOKEN }, queueKey, runFollowupTurn);
  }
  return finalizeWithFollowup(undefined, queueKey, runFollowupTurn);
}
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +13 to +14
"limit exhausted",
/weekly\/monthly limit/i,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"limit exhausted" substring match is overly broad

The plain string pattern "limit exhausted" is matched via value.includes(pattern) on the lowercased error message. This will also match phrases like "retry limit exhausted" or "reconnection limit exhausted" — which are transport/SDK-level errors and should not trigger the rate-limit failover path.

The accompanying regex /weekly\/monthly limit/i is precise enough to catch the specific zhipuai "Weekly/Monthly Limit Exhausted" error on its own. Consider removing the generic "limit exhausted" string pattern, or replacing it with a tighter regex (e.g., anchored to the usage/api/weekly/monthly context) to avoid false positives that could redirect retries through the rate-limit cooldown path when the real cause is a transient connectivity issue.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/failover-matches.ts
Line: 13-14

Comment:
**`"limit exhausted"` substring match is overly broad**

The plain string pattern `"limit exhausted"` is matched via `value.includes(pattern)` on the lowercased error message. This will also match phrases like `"retry limit exhausted"` or `"reconnection limit exhausted"` — which are transport/SDK-level errors and should not trigger the rate-limit failover path.

The accompanying regex `/weekly\/monthly limit/i` is precise enough to catch the specific zhipuai "Weekly/Monthly Limit Exhausted" error on its own. Consider removing the generic `"limit exhausted"` string pattern, or replacing it with a tighter regex (e.g., anchored to the `usage`/`api`/`weekly`/`monthly` context) to avoid false positives that could redirect retries through the rate-limit cooldown path when the real cause is a transient connectivity issue.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling gateway Gateway runtime size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: agents.defaults.models allowlist changes not applied on hot-reload

1 participant