fix(msteams): keep monitor alive until shutdown#24580

Merged
steipete merged 2 commits into openclaw:main from chilu18:fix/msteams-monitor-lifecycle on Mar 2, 2026

Conversation

@chilu18 (Contributor) commented Feb 23, 2026

Summary

  • keep monitorMSTeamsProvider alive until server close so the channel manager does not treat startup as an immediate exit
  • fail fast on startup bind/listen errors (for example EADDRINUSE) so restart backoff can classify a real startup failure
  • wire abort handling with a stable listener and cleanup on close
  • add lifecycle regression tests covering:
    • monitor stays pending until abort
    • startup rejects on listen error

Fixes #24522.
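The lifecycle described above can be sketched as follows. This is a minimal illustration of the contract, not the PR's actual implementation; `monitorSketch` and `MonitorOptions` are hypothetical names.

```typescript
import http from "node:http";

interface MonitorOptions {
  port: number;
  abortSignal?: AbortSignal;
}

function monitorSketch(opts: MonitorOptions): Promise<void> {
  return new Promise((resolve, reject) => {
    const server = http.createServer();
    const onAbort = () => server.close();

    // Fail fast on bind/listen errors (for example EADDRINUSE) so restart
    // backoff can classify a real startup failure.
    server.once("error", reject);

    server.listen(opts.port, () => {
      // Startup succeeded: register a stable abort listener and keep the
      // promise pending until the server actually closes.
      opts.abortSignal?.addEventListener("abort", onAbort, { once: true });
      server.once("close", () => {
        // Clean up the listener in case close came from shutdown, not abort.
        opts.abortSignal?.removeEventListener("abort", onAbort);
        resolve();
      });
    });
  });
}
```

The key point is that the returned promise settles only on server close or startup error, never on successful startup alone.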

Tests

  • corepack pnpm --dir /tmp/openclaw-next4-CwcfMD exec vitest run extensions/msteams/src/monitor.lifecycle.test.ts src/gateway/server-channels.test.ts

Greptile Summary

Fixed lifecycle issue where monitorMSTeamsProvider returned immediately after startup, causing the channel manager to treat it as an exit and trigger restart loops. The fix keeps the monitor promise pending until the HTTP server closes (via abort signal or shutdown), and fails fast on startup errors like EADDRINUSE.

Confidence Score: 5/5

  • Safe to merge - fixes critical lifecycle bug preventing MSTeams monitor from staying alive
  • Implementation correctly addresses the root cause (monitor returning immediately instead of staying pending), includes comprehensive lifecycle tests covering both normal operation and error scenarios, and follows existing patterns from other channel monitors
  • No files require special attention

Last reviewed commit: d839de4

@greptile-apps (bot) left a comment


2 files reviewed, 1 comment


resolve();
});
});
opts.abortSignal?.removeEventListener("abort", onAbort);

removeEventListener unnecessary since listener added with { once: true } on line 317

Suggested change
opts.abortSignal?.removeEventListener("abort", onAbort);
// Listener auto-removed by { once: true } option
Inline comment location: extensions/msteams/src/monitor.ts, line 327
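The reviewer's point can be shown with a standalone snippet: a listener registered with `{ once: true }` is removed automatically after its first invocation, so a later `removeEventListener` call is redundant (though harmless).

```typescript
const target = new EventTarget();
let calls = 0;
const handler = () => { calls += 1; };

target.addEventListener("ping", handler, { once: true });
target.dispatchEvent(new Event("ping")); // handler fires and is auto-removed
target.dispatchEvent(new Event("ping")); // no effect: listener already gone

console.log(calls); // prints 1
```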

@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added the stale Marked as stale due to inactivity label Mar 1, 2026
@BradGroux (Contributor)

Field report from a live production recovery (sanitized, no secrets / no env values). Posting this in case it helps maintainers and others triaging the same restart-loop class.

Executive summary

We hit a Microsoft Teams provider auto-restart loop with the same user-visible signature described here:

  • provider logs startup and directory/channel resolution
  • then immediately reports auto-restart attempts (1/10, 2/10, ...)
  • loop persists despite successful startup-side lookups

In our incident, there were three independent contributors. Fixing only one did not fully resolve the loop:

  1. Package collision (legacy global package co-installed with current package)
  2. Missing Teams bot credential field (app secret not set)
  3. Upstream lifecycle bug pattern (provider promise appears to resolve too early, matching this issue family)

What we observed (timeline-style)

Phase A — restart loop with repeated startup success signals

  • Teams provider repeatedly logged startup, user resolution, and channel resolution.
  • Immediately after these success logs, gateway health/auto-restart kicked in.
  • Backoff pattern matched known restart-loop behavior.

Phase B — eliminated local package-collision factor

  • Found old global package and current package both installed.
  • Removed legacy global package.
  • Result: removed one clear conflict vector, but loop still present.

Phase C — fixed missing credential config

  • Added missing Teams bot app password field (client secret value) in channel config.
  • Triggered config reload / gateway restart.
  • Result: credential state improved, but restart loop still present.

Phase D — remaining behavior matches known monitor/lifecycle bug class

  • Even after cleanup + credentials corrected, pattern remained:
    • start provider
    • resolve users/channels
    • immediate auto-restart
  • This aligns with issue reports where monitor/start function resolves prematurely and channel manager interprets that as provider exit.

Distinguishing signals that helped triage

These indicators were the most useful to separate local misconfiguration from upstream bug behavior:

  1. Success-before-failure pattern

    • If user/group/channel resolution consistently succeeds before restart, network/auth may not be the primary blocker.
  2. Stable repeating loop shape

    • Consistent startup → resolution → restart sequence with backoff strongly suggests lifecycle contract mismatch.
  3. Persistence across remediation layers

    • If the loop persists after removing duplicate installs, setting required credentials, and a clean restart, then the upstream monitor lifecycle is likely involved.

Secure remediation checklist that worked best for us

(Generalized to avoid host-specific details)

1) Eliminate package duplication first

  • Verify only one canonical global installation is active.
  • Remove legacy/conflicting package names that may still register plugins.
  • Re-check plugin load path consistency.

2) Validate Teams auth fields completely

  • Ensure app ID, tenant ID, and app password (client secret value) are all present.
  • Confirm config validation passes before restart.

3) Restart and verify by behavior, not process state

  • Don’t trust “running” status alone.
  • Verify no new auto-restart attempt lines over a timed observation window.
  • Verify inbound/outbound Teams flow during that same window.

4) If still looping, classify as likely upstream lifecycle bug

  • Capture exact log sequence around each restart boundary.
  • Attach sanitized sequence to issue/PR for maintainers.

What to include in repro data (high value for maintainers)

Recommend sharing these artifacts (all sanitized):

  • startup line for msteams provider
  • user/group/channel resolution lines
  • first line indicating restart scheduling
  • health-monitor line indicating “reason: stopped” (if present)
  • whether duplicate package installations were found
  • whether credentials were complete at time of test
  • whether loop persists after those are corrected

Suggested maintainer-facing acceptance test

A robust guardrail test for this bug class would assert:

  1. startAccount() for Teams does not resolve while provider is healthy.
  2. Resolver success (users/channels) alone does not trigger subsystem restart logic.
  3. Restart path only activates on explicit stop, fatal error, or abort signal.
  4. No duplicate listener bind occurs during healthy run (prevents EADDRINUSE cascades).

Current status from this field report

  • Local config/package hygiene issues: addressed.
  • Remaining loop signature: still consistent with upstream lifecycle bug described in this issue family.
  • Net: this report supports merging a monitor/start lifecycle fix that keeps provider task pending until true shutdown.

If helpful, I can provide a compact sanitized log excerpt in follow-up showing exact line ordering (startup → resolution → restart) without any identifiers.

@aymiee

aymiee commented Mar 2, 2026

I've validated this PR against latest main and it resolves the restart loop for our MSTeams deployment. This is currently blocking our testing of the MSTeams provider. Is there anything preventing a merge?

@BradGroux (Contributor)

> I've validated this PR against latest main and it resolves the restart loop for our MSTeams deployment. This is currently blocking our testing of the MSTeams provider. Is there anything preventing a merge?

I posted an update to fix the Teams issue three weeks ago - #13089

I even got Peter to reply to a request to review Teams issues and PRs, but that hasn't happened. Sadly, Teams isn't a priority; it has been broken for over a month and there have been dozens of issues and PRs submitted.

@steipete steipete merged commit 1567713 into openclaw:main Mar 2, 2026
26 checks passed
@steipete

steipete commented Mar 2, 2026

Landed via temp rebase onto main.

  • Gate: pnpm vitest run extensions/msteams/src/monitor.lifecycle.test.ts src/gateway/server-channels.test.ts --maxWorkers=1; pnpm check && pnpm build
  • Land commit: b745bc2
  • Merge commit: 1567713

Thanks @chilu18!

@aymiee

aymiee commented Mar 3, 2026

Thank you all! I'm back up and running with MSTeams.

john-ver pushed a commit to apmcoin/apmclaw that referenced this pull request Mar 9, 2026
dustin-olenslager pushed a commit to dustin-olenslager/ironclaw-supreme that referenced this pull request Mar 24, 2026

Labels

  • channel: msteams (Channel integration: msteams)
  • size: M
  • stale (Marked as stale due to inactivity)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: MSTeams adapter: startMonitor() returns immediately causing EADDRINUSE restart loop

4 participants