fix(matrix): only use 2-member DM fallback when dm refresh fails#54890
fix(matrix): only use 2-member DM fallback when dm refresh fails#54890gumadeiras merged 2 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR fixes a regression (linked to issue #54772) where any 2-member Matrix room was being classified as a DM, causing separate rooms with the same two participants to collapse into a single session. The fix moves the Key changes:
One issue found:
Confidence Score: 3/5The core bug fix is correct and well-tested, but there is one targeted fix needed before merge: restoring TTL rate-limiting for failed DM cache refreshes to avoid a request flood under persistent failure conditions. The logic change correctly solves the reported issue. The only concrete problem is the unintended removal of the 30-second retry gate when
|
| Filename | Overview |
|---|---|
| extensions/matrix/src/matrix/monitor/direct.ts | Core logic change correctly restricts the 2-member DM fallback to failure cases only, fixing the routing bug. One P1 issue: lastDmUpdateMs is no longer updated on refresh failure, silently removing the 30-second retry rate-limit for the error path. |
| extensions/matrix/src/matrix/monitor/direct.test.ts | Tests are well-restructured to cover the new success/failure distinction. The TTL test does not exercise dms.update() failure, so the missing rate-limiting behaviour noted in direct.ts would not be caught by these tests. |
Comments Outside Diff (1)
-
extensions/matrix/src/matrix/monitor/direct.ts, line 45-52 (link)TTL rate-limiting lost on refresh failure
lastDmUpdateMsis now only updated whenclient.dms.update()succeeds. When the refresh fails persistently,lastDmUpdateMsstays at0(or its previous stale value), which meansnow - lastDmUpdateMsalways exceedsDM_CACHE_TTL_MS. Every subsequent call toisDirectMessagewill attempt a newclient.dms.update()call rather than being gated by the 30-second TTL.The original code set
lastDmUpdateMs = nowbefore the try/catch so that even a failed refresh would suppress retries for 30 s. That rate-limiting is now gone for the failure path.If
dms.update()is failing persistently — e.g. a network issue — a high-volume room will hammer the backend on every single message received.This preserves the TTL rate-limiting regardless of outcome while still propagating the error to
isDirectMessagefor the fallback path.Prompt To Fix With AI
This is a comment left during a code review. Path: extensions/matrix/src/matrix/monitor/direct.ts Line: 45-52 Comment: **TTL rate-limiting lost on refresh failure** `lastDmUpdateMs` is now only updated when `client.dms.update()` succeeds. When the refresh fails persistently, `lastDmUpdateMs` stays at `0` (or its previous stale value), which means `now - lastDmUpdateMs` always exceeds `DM_CACHE_TTL_MS`. Every subsequent call to `isDirectMessage` will attempt a new `client.dms.update()` call rather than being gated by the 30-second TTL. The original code set `lastDmUpdateMs = now` **before** the try/catch so that even a failed refresh would suppress retries for 30 s. That rate-limiting is now gone for the failure path. If `dms.update()` is failing persistently — e.g. a network issue — a high-volume room will hammer the backend on every single message received. This preserves the TTL rate-limiting regardless of outcome while still propagating the error to `isDirectMessage` for the fallback path. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/matrix/src/matrix/monitor/direct.ts
Line: 45-52
Comment:
**TTL rate-limiting lost on refresh failure**
`lastDmUpdateMs` is now only updated when `client.dms.update()` succeeds. When the refresh fails persistently, `lastDmUpdateMs` stays at `0` (or its previous stale value), which means `now - lastDmUpdateMs` always exceeds `DM_CACHE_TTL_MS`. Every subsequent call to `isDirectMessage` will attempt a new `client.dms.update()` call rather than being gated by the 30-second TTL.
The original code set `lastDmUpdateMs = now` **before** the try/catch so that even a failed refresh would suppress retries for 30 s. That rate-limiting is now gone for the failure path.
If `dms.update()` is failing persistently — e.g. a network issue — a high-volume room will hammer the backend on every single message received.
```suggestion
const refreshDmCache = async (): Promise<void> => {
const now = Date.now();
if (now - lastDmUpdateMs < DM_CACHE_TTL_MS) {
return;
}
lastDmUpdateMs = now;
await client.dms.update();
};
```
This preserves the TTL rate-limiting regardless of outcome while still propagating the error to `isDirectMessage` for the fallback path.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(matrix): only use 2-member DM fallba..." | Re-trigger Greptile
f2eaab1 to
297852c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 297852c713
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Fixed in 297852c. It will now set the variable so duplicate requests will only cause 1 cache update. |
297852c to
c5e220a
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c5e220a014
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
c5e220a to
cab689f
Compare
Issue openclaw#54772 reports that two separate 2-person Matrix rooms with the same participants can both get routed as DMs. The regression comes from treating strict 2-member membership as a general fallback even when the DM cache is available and says the room is not a DM. Move the catch from refreshDmCache() into isDirectMessage(). When the refresh succeeds, keep the existing client.dms.isDm(roomId) && isStrictDirectMembership(...) gate so non-DM 2-person rooms stay grouped. Only when the refresh fails do we fall back to the exact 2-member heuristic and keep the warning log.
cab689f to
e32d220
Compare
|
Merged via squash.
Thanks @private-peter! |
…nclaw#54890) Merged via squash. Prepared head SHA: e32d220 Co-authored-by: private-peter <[email protected]> Co-authored-by: gumadeiras <[email protected]> Reviewed-by: @gumadeiras
…nclaw#54890) Merged via squash. Prepared head SHA: e32d220 Co-authored-by: private-peter <[email protected]> Co-authored-by: gumadeiras <[email protected]> Reviewed-by: @gumadeiras
…nclaw#54890) Merged via squash. Prepared head SHA: e32d220 Co-authored-by: private-peter <[email protected]> Co-authored-by: gumadeiras <[email protected]> Reviewed-by: @gumadeiras
…nclaw#54890) Merged via squash. Prepared head SHA: e32d220 Co-authored-by: private-peter <[email protected]> Co-authored-by: gumadeiras <[email protected]> Reviewed-by: @gumadeiras
Issue #54772 reports that two separate 2-person Matrix rooms with the same participants can both get routed as DMs. The regression comes from treating strict 2-member membership as a general fallback even when the DM cache is available and says the room is not a DM.
Move the catch from refreshDmCache() into isDirectMessage(). When the refresh succeeds, keep the existing client.dms.isDm(roomId) && isStrictDirectMembership(...) gate so non-DM 2-person rooms stay grouped. Only when the refresh fails do we fall back to the exact 2-member heuristic and keep the warning log.
Summary
Describe the problem and fix in 2–5 bullets:
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause / Regression History (if applicable)
For bug fixes or regressions, explain why this happened, not just what changed. Otherwise write
N/A. If the cause is unclear, writeUnknown.members.length == 2so it assumes that any room with only 2 members is a DM. All DMs get collapsed into a single session. Openclaw can't keep context separate or figure out which room to reply to.git blame, prior PR, issue, or refactor if known): issue was originally fixed here https://github.com/openclaw/openclaw/pull/31023/files#r2991791467Regression Test Plan (if applicable)
For bug fixes or regressions, name the smallest reliable test coverage that should have caught this. Otherwise write
N/A.User-visible / Behavior Changes
List user-visible changes (including defaults/config).
If none, write
None.Users with multiple matrix rooms will no longer have their conversations duplicated or mis-routed.
Security Impact (required)
Yes/No) NoYes/No) NoYes/No) NoYes/No) NoYes/No) NoYes, explain risk + mitigation:Repro + Verification
Environment
Steps to reproduce
Expected behavior
A 2-person room should only be treated as a DM if it is explicitly marked as one (via m.direct or is_direct member state). Otherwise it should be handled as a regular group room.
Actual behavior
This causes replies to be routed to both rooms simultaneously, producing duplicate or out-of-context messages.
Human Verification (required)
What you personally verified (not just CI), and how:
Review Conversations
If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.
Compatibility / Migration
Yes/No)Yes/No)Yes/No)Failure Recovery (if this breaks)
Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write
None.