[Bug]: Plugin runtime eagerly loads all channel SDKs causing sustained high CPU on startup (3MB+ bundle parse) #28587

@hmemcpy

Summary

The OpenClaw gateway exhibits sustained high CPU usage (75-85%+) during and after startup on CPU-constrained hardware (ARM64 VPS). Profiling indicates this is caused by the plugin runtime eagerly loading all channel SDKs (Discord, Slack, Telegram, Signal, Line, iMessage, WhatsApp) regardless of which channels are actually configured.

The culprit is a 3.0 MB bundled chunk (dist/plugin-sdk/reply-Dxhp8Y9P.js) containing 1,052+ imports from heavy dependencies like @mariozechner/pi-coding-agent, @buape/carbon, and grammy. This chunk is loaded unconditionally when createPluginRuntime() is called, triggering ~22,429 file system operations and causing V8 JIT thrashing.

Environment

  • OpenClaw Version: 2026.2.26 (npm install)
  • Node.js: v22.12.0
  • OS: Ubuntu 24.04 (ARM64/aarch64)
  • Hardware: Hetzner VPS (ARM64, 4GB RAM, shared CPU)
  • Channels configured: Telegram only
  • CPU quota applied: 85% (systemd CPUQuota=85%)

Steps to Reproduce

  1. Install OpenClaw on an ARM64 VPS with limited CPU
  2. Configure only Telegram channel (single channel)
  3. Start the gateway: openclaw gateway
  4. Observe CPU usage with ps aux or htop

Expected: CPU should drop to near-idle after startup (~5-10%)
Actual: CPU remains at 75-85% indefinitely, making the gateway unresponsive

Root Cause Analysis

1. Eager Loading in src/plugins/runtime/index.ts

The runtime creates channel adapters for all channels at startup:

// Lines 41-44, 61-69, 76-93, 110-113, 122-132, etc.
import { discordMessageActions } from "../../channels/plugins/actions/discord.js";
import { signalMessageActions } from "../../channels/plugins/actions/signal.js";
import { telegramMessageActions } from "../../channels/plugins/actions/telegram.js";
// ... 40+ more channel imports

These imports resolve to openclaw/plugin-sdk exports.

2. The 3MB Plugin-SDK Bundle

dist/plugin-sdk/reply-Dxhp8Y9P.js:

  • Size: 3.0 MB
  • Imports: 1,052+ import/require statements
  • Dependencies: @mariozechner/pi-coding-agent, @buape/carbon, grammy, discord-api-types/v10, ajv, undici, jiti, ws, etc.

This file is loaded when dist/plugin-sdk/index.js (4,116 lines) is imported.

3. Loader Sequence

In src/plugins/loader.ts:391:

const runtime = createPluginRuntime();  // Triggers all imports

This happens before checking which channels are actually configured.

4. Impact

  • 22,429+ file system operations during startup (package.json probes)
  • 13,682+ openat calls in the first 5 seconds
  • V8 JIT compiler thrashes trying to compile 3MB of code
  • On CPU-constrained hardware, GC can't keep up → sustained high CPU

Evidence from Profiling

# CPU Profile (Node.js --prof)
65.9% of CPU in Node.js runtime (module resolution/JIT)
22.6% in C++ (system calls)
3.8% in JavaScript

# System Call Trace (strace)
22,429 module file accesses in ~5 seconds
745 opens of openclaw/dist/package.json
405 opens of @mariozechner/pi-coding-agent/dist/package.json
13,682 openat calls total
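
Counts like the strace figures above can be tallied from a saved trace with a short script. The sketch below is not the script used for this report. It assumes a log captured with something like `strace -f -e trace=openat`, where each call appears as `openat(AT_FDCWD, "<path>", ...)`. The function name `summarizeOpenat` and the sample paths are illustrative only.

```typescript
interface OpenatSummary {
  total: number;
  byPath: Map<string, number>;
}

// Count openat calls in an strace log, in total and per opened path.
function summarizeOpenat(log: string): OpenatSummary {
  const byPath = new Map<string, number>();
  let total = 0;
  // Matches: openat(AT_FDCWD, "/some/path", ...)
  const pattern = /openat\([^,]+,\s*"([^"]+)"/;
  for (const line of log.split("\n")) {
    const match = pattern.exec(line);
    const path = match?.[1];
    if (path === undefined) continue; // not an openat line
    total += 1;
    byPath.set(path, (byPath.get(path) ?? 0) + 1);
  }
  return { total, byPath };
}

// Made-up sample lines in strace's output format (not taken from the real trace).
const sampleLog = [
  '[pid 812] openat(AT_FDCWD, "/app/node_modules/openclaw/dist/package.json", O_RDONLY|O_CLOEXEC) = 21',
  '[pid 812] openat(AT_FDCWD, "/app/node_modules/grammy/package.json", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)',
  '[pid 812] openat(AT_FDCWD, "/app/node_modules/openclaw/dist/package.json", O_RDONLY|O_CLOEXEC) = 21',
  '[pid 812] read(21, "{...}", 4096) = 512',
].join("\n");

const summary = summarizeOpenat(sampleLog);
console.log(
  summary.total,
  summary.byPath.get("/app/node_modules/openclaw/dist/package.json"),
); // prints: 3 2
```

Sorting `byPath` by count would surface the most frequently probed files, such as the repeated package.json opens listed above.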

Current Workaround

Only WhatsApp is lazy-loaded today (via the loadWebOutbound() pattern). Code for every other channel is imported eagerly, even when that channel is disabled.
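
The existing WhatsApp implementation is not reproduced in this issue, so the following is only a minimal sketch of the general idea: wrap each channel's import in a memoized loader so the module is loaded once, on first use. The names `lazy`, `fakeImport`, `channelLoaders`, and `ChannelActions` are hypothetical, and `fakeImport` stands in for a real dynamic `import()` so the example runs on its own.

```typescript
// Hypothetical shape of what a channel module exports.
interface ChannelActions {
  channel: string;
}

type Loader<T> = () => Promise<T>;

// Wrap a loader so the underlying import runs at most once, on first use.
function lazy<T>(load: Loader<T>): Loader<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load();
    return cached;
  };
}

// Counts how often each stand-in module is "imported".
const loadCounts: Record<string, number> = { telegram: 0, discord: 0 };

// Stand-in for a dynamic import such as
//   import("../../channels/plugins/actions/telegram.js")
function fakeImport(channel: string): Loader<ChannelActions> {
  return async () => {
    loadCounts[channel] = (loadCounts[channel] ?? 0) + 1;
    return { channel };
  };
}

const channelLoaders = {
  telegram: lazy(fakeImport("telegram")),
  discord: lazy(fakeImport("discord")),
};

// Only Telegram is configured here, so only its loader is ever called.
const first = channelLoaders.telegram();
const second = channelLoaders.telegram();

// Both calls share one promise; the Discord stand-in was never loaded.
console.log(first === second, loadCounts);
```

With real modules, the cost of parsing and compiling a channel's code would be paid only when that channel's loader is first invoked.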

Proposed Solutions

  1. Lazy-load channel runtimes (Recommended)

    • Only import channel code when that channel is configured
    • Pattern already exists for WhatsApp (sendMessageWhatsAppLazy)
    • Apply same pattern to Discord, Slack, Telegram, Signal, Line, iMessage
  2. Split plugin-sdk into channel-specific entry points

    • Instead of one 3MB reply-Dxhp8Y9P.js, have:
      • plugin-sdk/discord
      • plugin-sdk/telegram
      • plugin-sdk/slack
      • etc.
  3. Pass config to createPluginRuntime()

    • Only initialize adapters for enabled channels:
    createPluginRuntime({ channels: config.channels })
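
Option 3 might look roughly like the following. This is a sketch under stated assumptions, not the actual createPluginRuntime() signature: the config shape (an optional `enabled` flag per channel), the names `createPluginRuntimeSketch` and `selectEnabledChannels`, and the stand-in loaders are all hypothetical.

```typescript
type ChannelId =
  | "discord" | "slack" | "telegram" | "signal" | "line" | "imessage" | "whatsapp";

// Assumed config shape: a channel counts as enabled when it has an entry
// that is not explicitly disabled.
type ChannelsConfig = Partial<Record<ChannelId, { enabled?: boolean }>>;

interface ChannelAdapter {
  id: ChannelId;
}

type AdapterLoader = () => Promise<ChannelAdapter>;

// Records which stand-in loaders ran, so the effect is observable.
const loadedChannels: ChannelId[] = [];

// Stand-in for a per-channel dynamic import (for example a hypothetical
// "plugin-sdk/telegram" entry point, as in option 2).
function makeLoader(id: ChannelId): AdapterLoader {
  return async () => {
    loadedChannels.push(id);
    return { id };
  };
}

const adapterLoaders: Record<ChannelId, AdapterLoader> = {
  discord: makeLoader("discord"),
  slack: makeLoader("slack"),
  telegram: makeLoader("telegram"),
  signal: makeLoader("signal"),
  line: makeLoader("line"),
  imessage: makeLoader("imessage"),
  whatsapp: makeLoader("whatsapp"),
};

// Decide which channels to initialize before any channel code is loaded.
function selectEnabledChannels(channels: ChannelsConfig): ChannelId[] {
  return (Object.keys(adapterLoaders) as ChannelId[]).filter((id) => {
    const entry = channels[id];
    return entry !== undefined && entry.enabled !== false;
  });
}

async function createPluginRuntimeSketch(options: { channels: ChannelsConfig }) {
  const ids = selectEnabledChannels(options.channels);
  const adapters = await Promise.all(ids.map((id) => adapterLoaders[id]()));
  return { adapters };
}

const runtimePromise = createPluginRuntimeSketch({
  channels: { telegram: {}, discord: { enabled: false } },
});
runtimePromise.then((runtime) => {
  console.log(runtime.adapters.map((adapter) => adapter.id)); // prints: [ 'telegram' ]
});
```

The key design point is ordering: the config is consulted first, and only the selected loaders are invoked, so a Telegram-only setup never touches the other six channels' code.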

Additional Context

This issue does not occur on:

  • macOS development machines (M1/M2/M3 - fast CPU)
  • x86_64 servers with fast CPUs
  • Docker containers with multiple CPU cores

It does affect:

  • ARM64 instances and boards (Raspberry Pi, Hetzner ARM, AWS Graviton)
  • Any CPU-constrained environment where parsing 3MB of JS causes GC pressure

Would you accept a PR that implements lazy loading for channel runtimes? I'm happy to contribute a fix that follows the WhatsApp pattern.
