Implement system prompt builder #28

@alexey-pelykh

Summary

Implement a simplified system prompt builder that assembles the ~10 sections RemoteClaw needs to inject into CLI subprocess agents. This replaces OpenClaw's 26-section, 665-line builder with a clean ~100 LoC module.

File: src/middleware/system-prompt.ts
Test: src/middleware/system-prompt.test.ts
Depends on: types module (PR #4, merged)

Background

RemoteClaw's CLI-subprocess architecture means the CLI agent handles most context management natively (skills, workspace files, memory, reasoning, tool discovery via MCP). The system prompt only needs to inject what the CLI agent cannot know: channel identity, messaging conventions, and formatting guidance.

Budget: ~770–1,270 tokens (2–5% of context window), which fits comfortably in all 4 runtimes. See the system prompt size budget analysis for details.

API Surface

/** Parameters for system prompt generation. */
export type SystemPromptParams = {
  /** Channel provider name (e.g., "telegram", "discord", "whatsapp"). */
  channelName: string;
  /** Display name of the user sending the message. */
  userName?: string | undefined;
  /** Agent identifier within RemoteClaw. */
  agentId?: string | undefined;
  /** IANA timezone string (e.g., "America/New_York"). */
  timezone?: string | undefined;
  /** Working directory for the CLI subprocess. */
  workspaceDir: string;
  /** Phone numbers / IDs of authorized senders (owner allowlist). */
  authorizedSenders?: string[] | undefined;
  /** Channel-specific formatting hints (e.g., LINE directives, Discord component schema). */
  messageToolHints?: string[] | undefined;
  /** Reaction/emoji guidance level. */
  reactionGuidance?: { level: "minimal" | "extensive"; channel: string } | undefined;
};

/**
 * Build the RemoteClaw system prompt for injection into CLI subprocess agents.
 *
 * Assembles ~10 sections totaling ~3,000-5,000 chars (~770-1,270 tokens).
 * No promptMode switch (always full), no bootstrap context (CLI agent handles),
 * no tool list (MCP handles), no skills/memory/sandbox sections.
 */
export function buildSystemPrompt(params: SystemPromptParams): string;

Sections (10 total)

Each section is either static (constant text) or dynamic (parameterized).

1. Identity & Context (~200 chars) — Dynamic

You are running inside RemoteClaw, a middleware that connects AI agents to messaging channels.
You are responding to a message from {userName} on {channelName}.

Sets the agent's role context: it's operating as a messaging agent, not a standalone CLI tool.

2. Safety (~563 chars) — Static

Core safety principles. Always included, immutable. Adapted from OpenClaw's safety section for the RemoteClaw context. Covers: no harmful content, respect user privacy, no credential exposure.

3. Messaging Guidance (~500 chars) — Static

Cross-session messaging conventions: how to use sessions_send for reaching other channels, messaging tool behavior, reply expectations.

4. Message Tool Hints (0–2,000 chars) — Dynamic (conditional)

Channel-specific formatting guidance injected from params.messageToolHints. Only included when hints are provided. Examples:

  • LINE: ~2,000 chars (50 lines, 9 directives)
  • Discord: ~500 chars (component schema)
  • MSTeams: ~400 chars (Adaptive Cards)
  • Feishu: ~200 chars (auto-detect)

5. Reply Tags (~420 chars) — Static

Syntax for reply targeting:

  • [[reply_to_current]] — reply to the message that triggered this response
  • [[reply_to:<id>]] — reply to a specific message by ID
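
A minimal sketch of how the middleware side might recognize these two tags in agent output. The names `parseReplyTag` and `ReplyTarget` are hypothetical, and the choices to take only the first tag and strip it from the text are assumptions; the issue defines the tag syntax only.

```typescript
// Hypothetical types and helper; only the [[...]] tag syntax comes from the issue.
type ReplyTarget = { kind: "current" } | { kind: "message"; id: string };

// Matches [[reply_to_current]] or [[reply_to:<id>]] (the id is captured in group 1).
const REPLY_TAG = /\[\[reply_to_current\]\]|\[\[reply_to:([^\]\s]+)\]\]/;

function parseReplyTag(output: string): { target: ReplyTarget | null; text: string } {
  const match = REPLY_TAG.exec(output);
  if (!match) return { target: null, text: output };

  // Group 1 is only set for the [[reply_to:<id>]] form.
  const target: ReplyTarget =
    match[1] !== undefined ? { kind: "message", id: match[1] } : { kind: "current" };

  // Assumption: the tag is removed from the text that gets delivered.
  const text = (output.slice(0, match.index) + output.slice(match.index + match[0].length)).trim();
  return { target, text };
}
```

Under these assumptions, `parseReplyTag("[[reply_to:42]] On it")` yields a `message` target with id `42` and the text `On it`.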

6. Silent Replies (~450 chars) — Static

NO_REPLY token behavior: when the agent determines no response is needed, it outputs NO_REPLY and the middleware suppresses delivery.
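
The suppression check on the middleware side could be as small as the sketch below. `isSilentReply` is a hypothetical name, and treating only a whole-output match (after trimming whitespace) as silent is an assumption; the issue does not say how the token is matched.

```typescript
const SILENT_REPLY_TOKEN = "NO_REPLY";

// Assumption: a reply is silent only when the entire output is the token,
// so a message that merely mentions NO_REPLY is still delivered.
function isSilentReply(agentOutput: string): boolean {
  return agentOutput.trim() === SILENT_REPLY_TOKEN;
}
```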

7. Authorized Senders (~150 chars) — Dynamic (conditional)

Only included when params.authorizedSenders is non-empty. Lists phone numbers / IDs that are trusted senders (owner allowlist).

8. Reactions (~300 chars) — Dynamic (conditional)

Emoji reaction guidance, level-dependent ("minimal" = react sparingly, "extensive" = react freely). Only included when params.reactionGuidance is provided.

9. Runtime (~300 chars) — Dynamic

Runtime metadata: channel name, timezone, agent ID. Always included.

10. Workspace (~200 chars) — Dynamic

Working directory path. Always included.
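
As a sanity check, the per-section estimates above can be summed directly. The sketch below uses only the character counts listed in the section headings (taking the 2,000-char LINE figure as the upper bound for Message Tool Hints) and ignores the separators between sections; the constant and helper names are illustrative.

```typescript
type SectionEstimate = { name: string; chars: number; conditional: boolean };

// Character estimates copied from the section headings above.
const SECTION_ESTIMATES: SectionEstimate[] = [
  { name: "Identity & Context", chars: 200, conditional: false },
  { name: "Safety", chars: 563, conditional: false },
  { name: "Messaging Guidance", chars: 500, conditional: false },
  { name: "Message Tool Hints", chars: 2000, conditional: true }, // upper bound (LINE)
  { name: "Reply Tags", chars: 420, conditional: false },
  { name: "Silent Replies", chars: 450, conditional: false },
  { name: "Authorized Senders", chars: 150, conditional: true },
  { name: "Reactions", chars: 300, conditional: true },
  { name: "Runtime", chars: 300, conditional: false },
  { name: "Workspace", chars: 200, conditional: false },
];

const sumChars = (sections: SectionEstimate[]): number =>
  sections.reduce((total, section) => total + section.chars, 0);

// Always-included sections only.
const baseChars = sumChars(SECTION_ESTIMATES.filter((s) => !s.conditional)); // 2633
// Every section, with the largest hint block.
const worstCaseChars = sumChars(SECTION_ESTIMATES); // 5083
```

Both totals sit below the 4,000-char and 6,000-char limits set under Size budget in the Test Requirements.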

Implementation Notes

  • ~100 lines of production code: 10 section builders + assembly function
  • Static sections (Safety, Messaging Guidance, Reply Tags, Silent Replies) are constant strings
  • Always-included dynamic sections (Identity, Runtime, Workspace) are template-based
  • Conditional sections (Message Tool Hints, Authorized Senders, Reactions) appear only when their params are provided
  • Sections joined with \n\n — each section is a self-contained paragraph/block
  • No promptMode switch — always full (unlike OpenClaw's full/minimal/none)
  • No bootstrap context — CLI agents load their own workspace files natively
  • No tool list — MCP protocol handles tool discovery
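
The notes above could translate into something like the following sketch. It is one possible shape under stated assumptions, not the final module: the static section bodies are placeholders (the issue gives their topics and sizes, not their text), `listSection` is a hypothetical helper, and the fallbacks for a missing `userName`, `timezone`, or `agentId` are assumptions.

```typescript
type SystemPromptParams = {
  channelName: string;
  userName?: string | undefined;
  agentId?: string | undefined;
  timezone?: string | undefined;
  workspaceDir: string;
  authorizedSenders?: string[] | undefined;
  messageToolHints?: string[] | undefined;
  reactionGuidance?: { level: "minimal" | "extensive"; channel: string } | undefined;
};

// Placeholder bodies; the real static text is not part of this issue.
const SAFETY = "Safety: (placeholder text)";
const MESSAGING = "Messaging: (placeholder text)";
const REPLY_TAGS = "Reply tags: [[reply_to_current]] or [[reply_to:<id>]]";
const SILENT_REPLIES = "Silent replies: output NO_REPLY when no response is needed.";

// Hypothetical helper: returns undefined for a missing or empty list,
// which causes the section to be omitted during assembly.
function listSection(title: string, items: string[] | undefined): string | undefined {
  if (!items || items.length === 0) return undefined;
  return [title, ...items.map((item) => `- ${item}`)].join("\n");
}

function buildSystemPrompt(params: SystemPromptParams): string {
  const { channelName, userName, agentId, timezone, workspaceDir, reactionGuidance } = params;
  const sections: Array<string | undefined> = [
    "You are running inside RemoteClaw, a middleware that connects AI agents to messaging channels.\n" +
      // Assumption: generic fallback when no user name is supplied.
      `You are responding to a message from ${userName ?? "a user"} on ${channelName}.`,
    SAFETY,
    MESSAGING,
    listSection("Message tool hints:", params.messageToolHints),
    REPLY_TAGS,
    SILENT_REPLIES,
    listSection("Authorized senders:", params.authorizedSenders),
    reactionGuidance
      ? `Reactions on ${reactionGuidance.channel}: ` +
        (reactionGuidance.level === "minimal" ? "react sparingly." : "react freely.")
      : undefined,
    // Assumption: optional runtime fields are left out when absent.
    ["Runtime:", `channel=${channelName}`, timezone && `timezone=${timezone}`, agentId && `agent=${agentId}`]
      .filter(Boolean)
      .join(" "),
    `Workspace: ${workspaceDir}`,
  ];
  // Drop omitted sections, then join the rest as self-contained blocks.
  return sections.filter((s): s is string => s !== undefined).join("\n\n");
}
```

Keeping the sections in one ordered array makes the section order explicit and lets each conditional section opt out by returning `undefined`, so the assembly step needs no per-section branching.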

Test Requirements

Test file: src/middleware/system-prompt.test.ts

Section inclusion

  • All required sections present in output (identity, safety, messaging, reply tags, silent replies, runtime, workspace)
  • messageToolHints section present only when hints provided
  • Authorized senders section present only when senders provided
  • Reactions section present only when reactionGuidance provided

Dynamic content

  • Identity section includes channel name and user name
  • Runtime section includes timezone and agent ID
  • Workspace section includes working directory path
  • Authorized senders lists all provided sender IDs

Conditional omission

  • No messageToolHints section when messageToolHints is undefined or empty
  • No authorized senders section when authorizedSenders is undefined or empty
  • No reactions section when reactionGuidance is undefined

Size budget

  • Base prompt (no hints, no optional sections) is under 4,000 chars
  • Full prompt with LINE hints (worst case) is under 6,000 chars

Content correctness

  • Safety section is always present and contains safety keywords
  • Reply tags section contains [[reply_to_current]] syntax
  • Silent replies section contains NO_REPLY token
  • No OpenClaw-specific references (no "OpenClaw", no pi-embedded, no SOUL.md)
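
The last check can be expressed as a small helper that a test then asserts on. This is a sketch: `findForbiddenReferences` is a hypothetical name, the term list contains only the three strings named above, and case-sensitive substring matching is an assumption.

```typescript
// Terms taken from the requirement above; extend as needed.
const FORBIDDEN_TERMS = ["OpenClaw", "pi-embedded", "SOUL.md"];

// Returns the forbidden terms found in the prompt, in list order.
function findForbiddenReferences(prompt: string): string[] {
  return FORBIDDEN_TERMS.filter((term) => prompt.includes(term));
}
```

A test would then expect an empty array for any generated prompt. Note that "RemoteClaw" does not contain "OpenClaw" as a substring, so the identity text passes this check.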

Acceptance Criteria

  • src/middleware/system-prompt.ts exports buildSystemPrompt() and SystemPromptParams type
  • 10 sections assembled: identity, safety, messaging, hints (conditional), reply tags, silent replies, authorized senders (conditional), reactions (conditional), runtime, workspace
  • Static sections contain appropriate constant text
  • Dynamic sections correctly interpolate parameters
  • Conditional sections omitted when params not provided
  • No OpenClaw-specific references in output
  • Base prompt under 4,000 chars, worst-case under 6,000 chars
  • All tests pass: npx vitest run src/middleware/system-prompt.test.ts
  • Full suite passes: npx vitest run

Metadata

Labels: enhancement (New feature or request)
