fix(routing): make bindings dynamic by calling loadConfig() per-message by juanpablodlc · Pull Request #11372 · openclaw/openclaw

juanpablodlc · 2026-02-07T19:12:37Z

Problem

When bindings are added or changed in openclaw.json while the gateway is running, the config watcher detects the change and logs "config change applied (dynamic reads: bindings)" — but messages continue routing to the old/default agent until the gateway is fully restarted.

This affects all channels (WhatsApp, Telegram, Discord) that use multi-agent routing via bindings.

Related issues: #9351 (Telegram accountId routing — different root cause but same file touched), #9198 (Google Chat DM routing — likely same stale-config pattern but extension channel not covered here)

Root Cause (3 layers)

config-reload.ts classifies bindings as kind: "none", meaning "this section is dynamically read per-request, no reload action needed." That assumption would be correct if channel monitors re-read config per-message — but they don't. They call loadConfig() once at startup and capture it in a closure.

Layer 1 — Incorrect assumption: config-reload.ts:72 — { prefix: "bindings", kind: "none" } assumes bindings are read dynamically. They aren't.
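To make the Layer 1 assumption concrete, here is a minimal sketch of how a prefix-based reload rule table could classify a changed config path. Only the `{ prefix: "bindings", kind: "none" }` entry comes from this PR; the `"restart"` kind, the `gateway` prefix, and the `classifyChange` helper are hypothetical and not taken from config-reload.ts.

```typescript
// Hypothetical rule shape modelled on `{ prefix: "bindings", kind: "none" }`.
// The "restart" kind and the "gateway" prefix are illustrative assumptions.
type ReloadKind = "none" | "restart";
type ReloadRule = { prefix: string; kind: ReloadKind };

const rules: ReloadRule[] = [
  { prefix: "bindings", kind: "none" }, // "read per-request, nothing to reload"
  { prefix: "gateway", kind: "restart" },
];

// Returns the reload action for a changed config path such as "bindings.0.agentId".
function classifyChange(changedPath: string): ReloadKind {
  const rule = rules.find(
    (r) => changedPath === r.prefix || changedPath.startsWith(r.prefix + "."),
  );
  // Unknown sections fall back to the conservative action in this sketch.
  return rule ? rule.kind : "restart";
}

console.log(classifyChange("bindings.0.agentId"));
```

A `"none"` result is only safe if every consumer of that section really does re-read config on each request, which is the assumption the channel monitors were breaking.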

Layer 2 — Config captured once at startup:

| Channel | File | Line | Pattern |
| --- | --- | --- | --- |
| WhatsApp | `web/auto-reply/monitor.ts` | 65 | `const baseCfg = loadConfig()` passed to `createWebOnMessageHandler` |
| Telegram | `telegram/bot.ts` | 117 | `const cfg = loadConfig()` used in reaction handler |
| Discord | `discord/monitor/provider.ts` | 133 | `const cfg = loadConfig()` passed to `createDiscordMessageHandler` |

Layer 3 — Routing uses stale snapshot: resolveAgentRoute({ cfg: params.cfg, ... }) in each channel's message handler uses the startup-time config, never seeing new bindings.
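The three layers can be reproduced in a few lines. This is a simplified sketch, not the real code: the `Config` and `Binding` shapes, `configOnDisk`, and `createStaleHandler` are hypothetical stand-ins, and `loadConfig` and `resolveAgentRoute` here are toy versions that only share names with the real functions.

```typescript
// Hypothetical, simplified shapes; the real openclaw types are richer.
type Binding = { peerId: string; agentId: string };
type Config = { bindings: Binding[] };

// Stand-in for the config file on disk.
let configOnDisk: Config = { bindings: [] };

// Toy loadConfig(): returns a snapshot of the current file contents.
function loadConfig(): Config {
  return { bindings: [...configOnDisk.bindings] };
}

// Toy resolveAgentRoute(): first matching binding wins, otherwise "default".
function resolveAgentRoute(params: { cfg: Config; peerId: string }): string {
  const match = params.cfg.bindings.find((b) => b.peerId === params.peerId);
  return match ? match.agentId : "default";
}

// Mirrors the buggy pattern: config is read once when the monitor starts
// and captured by the handler closure.
function createStaleHandler(): (peerId: string) => string {
  const cfg = loadConfig();
  return (peerId) => resolveAgentRoute({ cfg, peerId });
}

const handle = createStaleHandler();

// A binding is added while the "gateway" is running.
configOnDisk = { bindings: [{ peerId: "alice", agentId: "support" }] };

const staleRoute = handle("alice"); // still sees the startup snapshot
const freshRoute = resolveAgentRoute({ cfg: loadConfig(), peerId: "alice" });
console.log({ staleRoute, freshRoute });
```

In this sketch the closure keeps routing "alice" to the default agent, while the call that reads config at routing time picks up the new binding.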

Fix

loadConfig() in config/io.ts already has a 200ms TTL cache (DEFAULT_CONFIG_CACHE_MS = 200). It was designed for per-request dynamic reads with minimal disk I/O.

This PR changes the resolveAgentRoute() call in each channel to use loadConfig() instead of the captured startup config:

// Before:
const route = resolveAgentRoute({ cfg: params.cfg, channel: "whatsapp", ... });

// After:
// Fresh config for bindings lookup; other routing inputs are payload-derived.
const route = resolveAgentRoute({ cfg: loadConfig(), channel: "whatsapp", ... });

After this change, the kind: "none" classification for bindings becomes correct — routing truly reads config dynamically now.

Files Changed

| File | Change |
| --- | --- |
| `src/web/auto-reply/monitor/on-message.ts` | `import type` → value import; `params.cfg` → `loadConfig()` at routing call |
| `src/telegram/bot-message-context.ts` | Add `loadConfig` import; `cfg` → `loadConfig()` at routing call |
| `src/telegram/bot.ts` | `cfg` → `loadConfig()` in reaction handler routing call |
| `src/discord/monitor/message-handler.preflight.ts` | Add `loadConfig` import; `params.cfg` → `loadConfig()` at routing call |
| `...mentionpatterns-match.test.ts` | Mock `loadConfig()` so bindings test passes fresh config to routing |

Mixed-config-source tradeoff

After this change, handler functions still receive cfg via params for non-routing purposes (channel config, media limits, mention patterns). Only resolveAgentRoute() reads fresh config. Each call site is annotated with an inline comment explaining this is intentional:

// Fresh config for bindings lookup; other routing inputs are payload-derived.
const route = resolveAgentRoute({ cfg: loadConfig(), ... });

The other routing inputs (account ID, chat type, peer ID) come from the message payload, not config. A broader "fresh config everywhere" refactor could follow separately if runtime changes to non-binding config become a real requirement.

Why this is safe

Performance: 200ms cache means at most 5 disk reads/second. JSON5 parse is <1ms for typical configs.
Thread safety: Node.js is single-threaded; loadConfig() reads synchronously.
Corrupted config: loadConfig() returns last cached valid config on parse failure.
200ms staleness: After writing a new binding, routing may use old bindings for up to 200ms. This is acceptable — the alternative (gateway restart) causes 2-5s of downtime.
Backward compatible: No new APIs, no config changes required.
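The performance, corrupted-config, and staleness points above all follow from a TTL cache with a last-valid fallback. The sketch below shows that behaviour under stated assumptions: `createCachedLoader`, `readRaw`, and the injectable `now` clock are hypothetical names, and plain `JSON.parse` stands in for the JSON5 parser. It is not the code in config/io.ts. With a 200ms TTL the read rate is bounded by 1000 / 200 = 5 reads per second.

```typescript
// Hypothetical sketch; only DEFAULT_CONFIG_CACHE_MS is a name taken from the PR.
type Config = Record<string, unknown>;

const DEFAULT_CONFIG_CACHE_MS = 200;

function createCachedLoader(
  readRaw: () => string, // stand-in for a synchronous file read
  now: () => number = Date.now,
  ttlMs: number = DEFAULT_CONFIG_CACHE_MS,
): () => Config {
  let cached: Config | undefined;
  let cachedAt = 0;

  return function load(): Config {
    const t = now();
    if (cached !== undefined && t - cachedAt < ttlMs) {
      return cached; // within the TTL: no disk read
    }
    try {
      cached = JSON.parse(readRaw()) as Config; // JSON.parse stands in for JSON5
      cachedAt = t;
    } catch (err) {
      // Parse failure: keep serving the last valid config if there is one.
      if (cached === undefined) throw err;
      cachedAt = t; // retry only after the next TTL window
    }
    return cached;
  };
}

// Usage with a fake clock and an in-memory "file".
let clock = 0;
let file = '{"bindings":["a"]}';
let reads = 0;
const loadConfig = createCachedLoader(() => { reads += 1; return file; }, () => clock);

loadConfig();                      // first call reads the file
file = "{ not json";               // the file becomes corrupt
clock = 100;
loadConfig();                      // inside the TTL: served from cache, no read
clock = 250;
const afterCorrupt = loadConfig(); // TTL expired: read fails, last valid config returned
console.log(reads, afterCorrupt);
```

Injecting the clock keeps the sketch deterministic; a real loader would simply use `Date.now`.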

Test plan

pnpm lint — 0 warnings, 0 errors
pnpm build — clean
pnpm test — 961 test files, 6510 tests passed, 0 failures
Manual: start gateway, add binding at runtime, verify routing picks it up without restart

AI Disclosure

This PR was prepared with assistance from Claude Opus 4.6. The bug analysis, code tracing, and fix were developed collaboratively. I understand the code and the fix. Lightly tested (automated tests pass; manual runtime testing not yet performed).

Greptile Overview

Greptile Summary

This PR fixes stale multi-agent routing by switching channel message handlers (web/WhatsApp, Telegram, Discord) to call loadConfig() at the resolveAgentRoute() call site instead of using the startup-time config snapshot. The change leverages the existing TTL cache in loadConfig() so updated bindings in openclaw.json take effect without restarting the gateway, while leaving other handler logic on the originally provided cfg.

It also updates a Discord monitor test to mock loadConfig() so binding-based routing can be exercised under the new behavior.

Confidence Score: 4/5

This PR is likely safe to merge after addressing one test robustness issue around the new loadConfig() behavior.
The production changes are narrowly scoped (only routing config source changes, leveraging existing cached loadConfig()), but the updated Discord test introduces a default loadConfigMock value of {} that can make config-dependent behavior inconsistent unless each test explicitly sets the mock return to the intended config.
src/discord/monitor.tool-result.accepts-guild-messages-mentionpatterns-match.test.ts


greptile-apps

1 file reviewed, 1 comment

greptile-apps · 2026-02-07T19:13:46Z

Additional Comments (1)

src/telegram/bot.ts
Stale config still used

createTelegramBot captures const cfg = opts.config ?? loadConfig() at startup, and this cfg is still used for resolveTelegramForumThreadId inputs (via isForum/isGroup computed from the reaction) and several other behaviors in this handler’s surrounding logic. Only resolveAgentRoute was switched to loadConfig(), so any routing decisions that depend on other config (e.g., default agent selection and session scoping via helper logic that uses cfg) can still behave inconsistently after a live config update. If the intent is “bindings changes apply live”, this handler should avoid mixing startup config with per-message config for any values that influence the route/session keys.


juanpablodlc · 2026-02-07T19:21:07Z

@greptile-apps

Good catch on the mixed config sources. This is the tradeoff noted in the PR description under "Mixed-config-source tradeoff": only the resolveAgentRoute() call was switched to loadConfig(), while surrounding logic still uses the startup cfg.

The specific values flagged (isForum, isGroup, chatId, resolveTelegramForumThreadId) are computed from the incoming Telegram message/reaction payload, not from config — they come from reaction.chat.type, reaction.chat.is_forum, and reaction.chat.id. The account.accountId that feeds into the routing call is resolved from startup config, but account identity doesn't change at runtime (nobody hot-swaps Telegram bot tokens). The only config values that realistically change at runtime are bindings entries — and those are now read fresh via loadConfig().

Making all config access dynamic across the full handler would be a significantly larger change and goes beyond the scope of this bug fix. This PR intentionally targets the narrow path where stale config causes incorrect behavior: the resolveAgentRoute() call that reads listBindings(cfg) to determine which agent handles a message. A broader "dynamic config everywhere" refactor could be a worthwhile follow-up, but would touch many more code paths and test surfaces.

greptile-apps · 2026-02-07T19:21:59Z

I understand your reasoning. You're correct that the specific values I flagged (isForum, isGroup, chatId, resolveTelegramForumThreadId) come from the Telegram payload, not config. However, there's still a subtle config consistency issue in the Telegram bot reaction handler.

Looking at src/telegram/bot.ts around line 110-130, the reaction handler does:

const cfg = loadConfig(); // startup-time config
// ... later ...
const route = resolveAgentRoute({ 
  cfg: loadConfig(), // fresh config for routing
  channel: "telegram",
  // ... other params derived from cfg (startup-time) and reaction payload
});

The issue is that resolveAgentRoute can return different results based on other parts of cfg beyond just bindings — for example, if someone changes routing.defaultAgent or channel-level routing rules in openclaw.json at runtime. The old cfg (startup-time) is still used to compute context that flows into the routing decision, while the new cfg (fresh) is used for the binding lookup.

However, re-reading your explanation and the code more carefully: if all the routing-relevant inputs other than bindings are either:

Derived from the Telegram payload (channel identity, chat type, etc.), OR
Part of config that truly doesn't change at runtime in your deployment model (account mappings, default agents)

...then the practical risk is indeed very low, and your "narrow fix for bindings only" approach is defensible.

My recommendation: Document this tradeoff more explicitly in a code comment at each resolveAgentRoute call site where you've made the swap, so future maintainers understand the mixed-config-source pattern is intentional and scoped. Something like:

// Dynamic config read for bindings only — other routing inputs 
// (account, chat type) are payload-derived or runtime-stable
const route = resolveAgentRoute({ cfg: loadConfig(), ... });

This preserves your narrow fix while making the design decision explicit. The broader "dynamic config everywhere" refactor can be deferred to a follow-up if runtime changes to non-binding routing config become a real requirement.

Is this PR ready to merge with that documentation addition, or would you prefer to tackle the broader refactor now?


juanpablodlc · 2026-02-07T19:34:39Z

@greptile-apps Done. I updated the comments, and I believe the PR is ready to merge with that documentation addition. Please review, modify the original comment, and provide a confidence score.

greptile-apps

1 file reviewed, 1 comment

greptile-apps · 2026-02-07T19:36:12Z

Additional Comments (1)

src/discord/monitor.tool-result.accepts-guild-messages-mentionpatterns-match.test.ts
Test mock can break routing

This test now relies on loadConfig() being called to pick up bindings, but loadConfigMock defaults to {} in beforeEach. In the first test case (accepts guild messages when mentionPatterns match), cfg includes messages/groupChat/mentionPatterns, but it’s never returned from loadConfigMock, so any code path that uses loadConfig() for config-derived gating will see {} and can change behavior or make the test order-dependent. Consider setting loadConfigMock.mockReturnValue(cfg) in every test that expects config-driven behavior, not just the thread-routing case.
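A minimal sketch of the failure mode described in this comment, using a hand-rolled mock instead of a test framework so it runs on its own. `createMock`, `shouldAccept`, and the flattened `mentionPatterns` field are hypothetical; whether the real mention gating reads through `loadConfig()` is not confirmed here.

```typescript
// Hypothetical, simplified config shape for illustration only.
type Config = { mentionPatterns?: string[] };

// Minimal stand-in for a test-framework mock with a mockReturnValue method.
function createMock<T>(initial: T) {
  let value = initial;
  return Object.assign(() => value, {
    mockReturnValue(next: T): void {
      value = next;
    },
  });
}

// Mirrors the test setup: the mock defaults to an empty config.
const loadConfigMock = createMock<Config>({});

// Hypothetical code under test: gating that reads config through loadConfig().
function shouldAccept(text: string): boolean {
  const patterns = loadConfigMock().mentionPatterns ?? [];
  return patterns.some((p) => text.includes(p));
}

const cfg: Config = { mentionPatterns: ["@bot"] };

// With the default {} the test's cfg is never seen by loadConfig() callers.
const withDefault = shouldAccept("hello @bot");

// Returning cfg explicitly makes the test independent of defaults and ordering.
loadConfigMock.mockReturnValue(cfg);
const withCfg = shouldAccept("hello @bot");
console.log({ withDefault, withCfg });
```

Setting the return value inside each test, rather than relying on the `beforeEach` default, keeps the config a test builds and the config the code under test sees in sync.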


juanpablodlc · 2026-02-09T06:10:29Z

Is there a way to re-run the failing "CI / macos (pull_request)" check (failing after 25m)? This should just work.


Integrated upstream improvements:
- CRITICAL: Fix bundled hooks broken since 2026.2.2 (openclaw#9295)
- Grok web search provider (xAI) with inline citations
- Telegram video note support with tests and docs
- QMD model cache sharing optimization (openclaw#12114)
- Context overflow false positive fix (openclaw#2078)
- Model failover 400 status handling (openclaw#1879)
- Dynamic config loading per-message (openclaw#11372)
- Gateway post-compaction amnesia fix (openclaw#12283)
- Skills watcher: ignore Python venvs and caches
- Telegram send recovery from stale thread IDs
- Cron job parameter recovery (openclaw#12124)
- Auto-reply weekday timestamps (openclaw#12438)
- Utility consolidation refactoring (PNG, JSON, errors)
- Cross-platform test normalization (openclaw#12212)
- macOS Nix defaults support (openclaw#12205)

Preserved DEV enhancements:
- Docker multi-stage build with enhanced tooling (gh, gog, obsidian-cli, uv, nano-pdf, mcporter, qmd)
- Comprehensive .env.example documentation (371 lines)
- Multi-environment docker-compose support (DEV/PRD)
- GOG/Tailscale integration
- Fork-sync and openclaw-docs skills
- UI config editor (Svelte)
- Fork workflow documentation

Merge strategy: Cherry-picked 22 upstream commits, preserved DEV Docker architecture.

Docker files unchanged: Dockerfile, docker-compose.yml, docker-setup.sh, .env.example

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>


- Replace cached cfg with fresh loadConfig() in routing lookups
- Discord, Telegram, and WhatsApp all use per-message config
- Config changes take effect without restart


fix(routing): use loadConfig() per-message so bindings take effect without restart

2a8ee50

openclaw-barnacle bot added the channel: discord, channel: telegram, and channel: whatsapp-web labels Feb 7, 2026

greptile-apps bot reviewed Feb 7, 2026


fix(routing): add inline comments documenting mixed-config-source tradeoff

2552656

greptile-apps bot reviewed Feb 7, 2026


SutanuNandigrami pushed a commit to SutanuNandigrami/openclaw that referenced this pull request Feb 8, 2026

Merge upstream PR openclaw#11372

889998d

docs: add changelog entry for dynamic bindings

7f7011b

thewilloftheshadow merged commit 8d96955 into openclaw:main Feb 9, 2026
23 of 24 checks passed

yeboster pushed a commit to yeboster/openclaw that referenced this pull request Feb 9, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (openclaw#11372)

546d1db

NikolasP98 pushed a commit to NikolasP98/openclaw that referenced this pull request Feb 9, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (openclaw#11372)

8d4e63f

Ethermious pushed a commit to Ethermious/openclaw that referenced this pull request Feb 9, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (openclaw#11372)

9423757

lucasmpramos pushed a commit to butley/openclaw that referenced this pull request Feb 10, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (openclaw#11372)

88cd40e

mcaxtr mentioned this pull request Feb 12, 2026

fix(line): use loadConfig() for agent route bindings lookup #14915

Closed

6 tasks

yeboster pushed a commit to yeboster/openclaw that referenced this pull request Feb 13, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (openclaw#11372)

981ee94

skyhawk14 pushed a commit to skyhawk14/openclaw that referenced this pull request Feb 13, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (openclaw#11372)

d9bd809

mbelinky pushed a commit that referenced this pull request Feb 14, 2026

fix(routing): make bindings dynamic by calling loadConfig() per-message (#11372)

e2aabc4

thewilloftheshadow merged 3 commits into openclaw:main from juanpablodlc:fix/bindings-dynamic-routing
