fix(msteams): keep provider promise pending until abort to stop auto-restart loop #22605
OpakAlex wants to merge 2 commits into openclaw:main
Conversation
Force-pushed 6cb542d to 612f04a
@steipete your commit adds HTML tags. Should we allow HTML tags in CI? Thanks
@obviyus Can you please check?
Force-pushed eb13354 to 67183f7
…restart loop

- monitorMSTeamsProvider now returns a promise that stays pending until opts.abortSignal fires and shutdown() completes, so the gateway no longer treats the channel as exited and restarts it in a loop.
- Add docs/channels/gateway-lifecycle.md describing the startAccount contract.
- Gateway test: a startAccount that resolves on abort does not trigger auto-restart; call stopChannel so test cleanup exits.
- MSTeams test: use a file URL for the OneDrive mediaUrl so isLocalPath works on all platforms (e.g. Windows CI).
- Apply oxfmt to gateway-lifecycle.md and src/browser/* (format check).

Co-authored-by: Cursor <[email protected]>
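The commit's core change can be sketched as follows. This is an illustrative stand-in, not the actual extension code: `monitorProvider` and `shutdown` are hypothetical names, and the real `monitorMSTeamsProvider` also starts an Express server, omitted here. The point is the shape of the returned promise: pending while healthy, settling only after abort plus shutdown.

```typescript
// Sketch (assumed names): a startAccount-style monitor whose promise stays
// pending until opts.abortSignal fires and shutdown() completes.
function monitorProvider(opts: { abortSignal?: AbortSignal }): Promise<void> {
  // Placeholder for the real shutdown, which would close the HTTP server.
  const shutdown = (): Promise<void> => Promise.resolve();

  if (!opts.abortSignal) {
    // No abort signal: never settle, so the gateway never sees an "exit".
    return new Promise<void>(() => {});
  }

  return new Promise<void>((resolve) => {
    opts.abortSignal!.addEventListener(
      "abort",
      () => {
        // Only settle after shutdown completes, per the startAccount contract.
        void shutdown().then(resolve);
      },
      { once: true },
    );
  });
}
```

Previously the promise settled as soon as the server was listening, which the gateway read as a channel exit; under this shape, only an explicit abort (stopChannel or gateway shutdown) ends the channel.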
Force-pushed 67183f7 to c94c7b3
Just noticed a connection: several other PRs seem to address the same problem.

Issue #22169 reports the msteams provider starting twice on gateway boot, causing EADDRINUSE; PR #22182 fixes the false auto-restart loop by keeping the provider promise pending until abort. Both approaches have merit, so it might be worth coordinating. Related issue(s): #22169. If any of these links don't look right, let me know and I'll correct them.
This pull request has been automatically marked as stale due to inactivity. |
Closing this PR as superseded by follow-up work on the same issue path. The msteams lifecycle fix is covered in #22182, so I am closing this one to keep the queue clean and avoid duplicate maintenance. Thanks everyone for the review context and cross-links.
Field report from a live production recovery (sanitized, no secrets / no env values). Posting this in case it helps maintainers and others triaging the same restart-loop class.

Executive summary

We hit a Microsoft Teams provider auto-restart loop with the same user-visible signature described here:
In our incident, there were three independent contributors. Fixing only one did not fully resolve the loop:
What we observed (timeline-style)

Phase A — restart loop with repeated startup success signals
Phase B — eliminated local package-collision factor
Phase C — fixed missing credential config
Phase D — remaining behavior matches known monitor/lifecycle bug class
Distinguishing signals that helped triage

These indicators were the most useful to separate local misconfiguration from upstream bug behavior:

Secure remediation checklist that worked best for us

(Generalized to avoid host-specific details)

1) Eliminate package duplication first
2) Validate Teams auth fields completely
3) Restart and verify by behavior, not process state
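The "verify by behavior" step can be done against the gateway log. A minimal sketch, assuming log lines like those quoted later in this PR ("starting provider (port 3978)", "auto-restart attempt N/10"); the log path and sample content here are placeholders, not from the incident:

```shell
# Hypothetical triage helper: judge health by log behavior, not by whether
# the process is up. A healthy channel shows startup lines and zero
# "auto-restart attempt" lines.
log=/tmp/gateway-sample.log   # placeholder path; point at your real gateway log
printf '%s\n' \
  'starting provider (port 3978)' \
  'msteams provider started on port 3978' > "$log"

starts=$(grep -c 'starting provider' "$log")
# grep -c exits non-zero on zero matches; tolerate that so set -e shells survive.
restarts=$(grep -c 'auto-restart attempt' "$log" || true)
echo "starts=$starts restarts=$restarts"
```

A repeating starts counter with a climbing restarts counter points at the lifecycle bug class; a single start with zero restarts indicates the channel is actually healthy.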
4) If still looping, classify as likely upstream lifecycle bug
What to include in repro data (high value for maintainers)

Recommend sharing these artifacts (all sanitized):

Suggested maintainer-facing acceptance test

A robust guardrail test for this bug class would assert:
Current status from this field report
If helpful, I can provide a compact sanitized log excerpt in follow-up showing exact line ordering (startup → resolution → restart) without any identifiers.
Superseded by already-merged MSTeams lifecycle work:
The provider lifecycle now stays pending until abort/shutdown, with monitor lifecycle regression coverage in main. This addresses the same restart-loop root cause. Closing as duplicate/superseded to keep the queue focused. Thank you for the thorough write-up and docs context.
Summary
Describe the problem and fix in 2–5 bullets:
monitorMSTeamsProvider returns a promise that stays pending until abort + shutdown (not on first listen); added a gateway-lifecycle doc and a server-channels test that a pending startAccount does not trigger auto-restart.

Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
MS Teams channel no longer enters an auto-restart loop; provider stays running until the user stops the channel or the gateway exits. New doc for extension authors: gateway channel lifecycle (startAccount contract).
Security Impact (required)
Yes, explain risk + mitigation: N/A

Repro + Verification
Environment
Steps
Expected
One "starting provider (port 3978)" and "msteams provider started on port 3978"; no repeated "auto-restart attempt N/10" unless the provider actually crashes.
Actual (before fix)
"starting provider (port 3978)" followed immediately by "auto-restart attempt 1/10 in 5s", then cycle repeats with backoff up to 10 attempts.
Evidence
server-channels.test.ts — "does not auto-restart when startAccount promise stays pending" (startAccount returns a never-resolving promise → one call, running stays true).

docs/channels/gateway-lifecycle.md — startAccount promise contract and MS Teams fix.

Human Verification (required)
Compatibility / Migration
Failure Recovery (if this breaks)
Risks and Mitigations
Greptile Summary
Fixes the MS Teams provider auto-restart loop by making monitorMSTeamsProvider return a promise that stays pending while the server is running, matching the gateway's startAccount contract. Previously, the promise resolved immediately after expressApp.listen(), causing the gateway to treat the channel as "exited" and enter a restart loop.

- extensions/msteams/src/monitor.ts: wraps the return value in a Promise that stays pending until the abort signal fires and shutdown completes. Without an abort signal, returns a never-resolving promise.
- src/gateway/server-channels.test.ts: adds a test confirming that a pending startAccount promise does not trigger auto-restart.
- docs/channels/gateway-lifecycle.md: new documentation describing the startAccount promise contract for extension channel plugins.

Confidence Score: 5/5
Last reviewed commit: 0dc50ed