Discord WebSocket: resume loop needs circuit breaker #13180

@BYWallace


Bug

The Discord bot's WebSocket connection can enter an unrecoverable resume loop. Observed on Feb 9, 2026 starting ~9:53 AM PST.

Behavior

  1. Connection closes with codes 1005/1006
  2. Client attempts resume with escalating backoff (1s → 2s → 4s → 8s → 16s)
  3. The resumed connection is closed again immediately
  4. Backoff resets to 1s and the cycle repeats indefinitely (hundreds of times over 12+ hours)
  5. Bot appears online but receives no messages
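
To make the timing concrete, here is a small model of the delay sequence described in the steps above. It is only an illustration of the observed schedule, not the client's actual reconnect code, and the name `observedResumeDelaysMs` is made up for this sketch.

```typescript
// Model of the observed behavior only: delays double from 1s to 16s,
// then the backoff resets to 1s and the cycle starts over.
// observedResumeDelaysMs is a hypothetical helper, not part of OpenClaw.
function observedResumeDelaysMs(attempts: number): number[] {
  const baseMs = 1000;
  const stepsPerCycle = 5; // 1s, 2s, 4s, 8s, 16s
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    // i % stepsPerCycle models the reset back to the 1s delay.
    delays.push(baseMs * 2 ** (i % stepsPerCycle));
  }
  return delays;
}

const firstSeven = observedResumeDelaysMs(7);
console.log(firstSeven.join(", "));
// prints "1000, 2000, 4000, 8000, 16000, 1000, 2000"

// Total time spent waiting in one full cycle of five attempts.
const cycleWaitMs = observedResumeDelaysMs(5).reduce((a, b) => a + b, 0);
console.log(cycleWaitMs); // prints 31000
```

Because the delay never grows past 16s and nothing counts failed cycles, this schedule by itself never gives up on the session.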

Errors observed during the loop

  • connection stalled: no HELLO received within 30000ms
  • AggregateError
  • DNS failure: getaddrinfo ENOTFOUND gateway-us-east1-d.discord.gg

What didn't fix it

  • openclaw gateway restart (SIGUSR1): fixed it temporarily, but the loop resumed within minutes
  • Two automated restart attempts had the same result

What fixed it

  • openclaw gateway --force (full process kill + fresh start) — resolved it permanently

Likely root cause

A transient DNS/network issue caused the initial disconnect. The session became stale, but the client kept trying to resume it indefinitely instead of falling back to a fresh connection.

Suggested fix

Add a circuit breaker: after N consecutive failed resumes, abandon the session (discard the stored session ID and sequence number) and do a full reconnect with a new IDENTIFY handshake.
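
A minimal sketch of what such a breaker could look like, assuming the client keeps a session ID and last sequence number for resuming. All names here (`ResumeCircuitBreaker`, `SessionState`, `maxFailedResumes`, `nextPlan`) are hypothetical and not taken from the OpenClaw codebase, and the threshold of 3 is an arbitrary placeholder for N.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
interface SessionState {
  sessionId: string | null; // null means there is no resumable session
  sequence: number | null; // last sequence number seen from the gateway
}

type ReconnectPlan = "resume" | "identify";

class ResumeCircuitBreaker {
  private failedResumes = 0;
  private readonly maxFailedResumes: number;

  constructor(maxFailedResumes: number = 3) {
    this.maxFailedResumes = maxFailedResumes;
  }

  // Call when a resume attempt ends without the gateway confirming it
  // (socket closed, HELLO timeout, DNS error, and so on).
  recordResumeFailure(): void {
    this.failedResumes += 1;
  }

  // Call only after the gateway has confirmed the session is live.
  recordSuccess(): void {
    this.failedResumes = 0;
  }

  // Decide how the next connection attempt should authenticate.
  // Once the breaker trips, the stale session is discarded so that every
  // later attempt starts from a fresh identify.
  nextPlan(session: SessionState): ReconnectPlan {
    const hasSession = session.sessionId !== null && session.sequence !== null;
    if (!hasSession) {
      return "identify";
    }
    if (this.failedResumes >= this.maxFailedResumes) {
      session.sessionId = null;
      session.sequence = null;
      this.failedResumes = 0;
      return "identify";
    }
    return "resume";
  }
}

// Simulation: a stale session where every resume attempt fails.
const session: SessionState = { sessionId: "stale-session", sequence: 42 };
const breaker = new ResumeCircuitBreaker(3);
const plans: ReconnectPlan[] = [];
for (let attempt = 0; attempt < 5; attempt++) {
  const plan = breaker.nextPlan(session);
  plans.push(plan);
  if (plan === "resume") {
    breaker.recordResumeFailure();
  }
}
console.log(plans.join(" "));
// prints "resume resume resume identify identify"
```

The important design choice in this sketch is that the failure counter is cleared only by `recordSuccess()` or by tripping the breaker, never by the socket merely opening, so a connection that opens and immediately closes still counts toward the limit.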

Labels: bug, channel: discord
