Skip to content

fix: include .env file vars in gateway service environment on install#52492

Merged
steipete merged 2 commits intoopenclaw:mainfrom
Chevron7Locked:fix/dotenv-service-install
Mar 23, 2026
Merged

fix: include .env file vars in gateway service environment on install#52492
steipete merged 2 commits intoopenclaw:mainfrom
Chevron7Locked:fix/dotenv-service-install

Conversation

@Chevron7Locked
Copy link
Copy Markdown
Contributor

Summary

  • Read and parse ~/.openclaw/.env (or $OPENCLAW_STATE_DIR/.env) during gateway install and merge those key-value pairs into the service environment
  • .env vars are merged at the lowest priority — config env vars, auth-profile refs, and core service env all override them
  • Dangerous host env vars (NODE_OPTIONS, LD_PRELOAD, etc.) are filtered using the same security policy applied to config env vars

Problem

On macOS, the LaunchAgent plist never included env vars from ~/.openclaw/.env. This meant:

  1. Stale credentials: If a key existed in both .env and the plist (from a previous install), the stale plist value won because dotenv.config() uses override: false
  2. Missing keys: User secrets like BRAVE_API_KEY, OPENROUTER_API_KEY, DISCORD_BOT_TOKEN stored in .env weren't visible to launchctl print (hard to debug)
  3. Race condition: Child processes spawned before the gateway's loadCliDotEnv() call had no access to .env vars

The same issue exists on Linux (systemd) and Windows (Scheduled Task) — the install plan never read .env.

Fix

Added readStateDirDotEnvVars() in daemon-install-helpers.ts that:

  1. Resolves the state dir (respects OPENCLAW_STATE_DIR)
  2. Reads and parses .env using the existing dotenv dependency
  3. Filters dangerous vars via isDangerousHostEnvVarName / isDangerousHostEnvOverrideVarName
  4. Returns clean key-value pairs

The merge priority in buildGatewayInstallPlan is now:

.env file vars  (lowest)  →  config env vars  →  auth-profile refs  →  service env  (highest)

Test plan

  • readStateDirDotEnvVars — reads key-value pairs from .env
  • readStateDirDotEnvVars — returns empty when .env missing
  • readStateDirDotEnvVars — drops dangerous vars (NODE_OPTIONS)
  • readStateDirDotEnvVars — drops empty/whitespace values
  • readStateDirDotEnvVars — respects OPENCLAW_STATE_DIR override
  • buildGatewayInstallPlan — merges .env vars into install plan
  • buildGatewayInstallPlan — config env vars override .env vars
  • buildGatewayInstallPlan — service env overrides .env vars
  • buildGatewayInstallPlan — works when .env does not exist
  • All 12 pre-existing tests still pass (21 total)

Fixes #37101
Relates to #22663

When building the gateway install plan, read and parse
~/.openclaw/.env (or $OPENCLAW_STATE_DIR/.env) and merge those
key-value pairs into the service environment at the lowest
priority — below config env vars, auth-profile refs, and the
core service environment (HOME, PATH, OPENCLAW_*).

This ensures that user-defined secrets stored in .env (e.g.
BRAVE_API_KEY, OPENROUTER_API_KEY, DISCORD_BOT_TOKEN) are
embedded in the LaunchAgent plist (macOS), systemd unit (Linux),
and Scheduled Task (Windows) at install time, rather than
relying solely on the gateway process loading them via
dotenv.config() at startup.

Previously, on macOS the LaunchAgent plist never included .env
vars, which meant:
- launchctl print did not show user secrets (hard to debug)
- Child processes spawned before dotenv loaded had no access
- If the same key existed in both .env and the plist, the stale
  plist value won via dotenv override:false semantics

Dangerous host env vars (NODE_OPTIONS, LD_PRELOAD, etc.) are
filtered using the same security policy applied to config env
vars.

Fixes openclaw#37101
Relates to openclaw#22663
@openclaw-barnacle openclaw-barnacle bot added commands Command implementations size: S labels Mar 22, 2026
- Apply normalizeEnvVarKey({ portable: true }) before security
  filtering, matching the established pattern in env-vars.ts.
  Rejects non-portable key names (spaces, special chars) that
  would produce invalid plist/systemd syntax.

- Isolate existing tests from the developer's real ~/.openclaw/.env
  by providing a temp HOME directory, preventing flaky failures
  when the test machine has a populated .env file.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR fixes a long-standing gap where ~/.openclaw/.env vars were never included in the installed service environment (LaunchAgent plist, systemd unit, Scheduled Task). The fix adds readStateDirDotEnvVars() in daemon-install-helpers.ts, which reads and parses the .env file from the state dir, filters out dangerous variables (NODE_OPTIONS, LD_PRELOAD, etc.) using the same security policy applied to config env vars, and merges them as the lowest-priority source in buildGatewayInstallPlan. The implementation is clean, the merge priority is clearly documented, and test coverage for new functionality is thorough.

  • readStateDirDotEnvVars correctly handles missing files, dangerous vars, empty/whitespace values, and OPENCLAW_STATE_DIR overrides
  • The merge order (dotenv → config → auth-profile refs → service env) matches the documented intent and is correctly implemented via spread + Object.assign
  • Test isolation concern: The pre-existing buildGatewayInstallPlan tests pass env: {}, which now causes readStateDirDotEnvVars to fall back to the real os.homedir() and attempt to read ~/.openclaw/.env. A developer whose machine has that file would see the strict toEqual assertions fail. The new dotenv-merge tests correctly avoid this by using env: { HOME: tmpDir } — the existing tests should be updated the same way.

Confidence Score: 4/5

  • Safe to merge; the production logic is correct and secure, with one test-isolation improvement recommended for existing tests.
  • The implementation is correct, the security filtering is consistent with the rest of the codebase, and new tests are comprehensive. The only issue is that pre-existing tests with env: {} now have an implicit dependency on the developer's real ~/.openclaw/.env (via os.homedir() fallback), which could cause non-deterministic local test failures. This is a non-blocking concern that won't affect CI or production.
  • Pre-existing buildGatewayInstallPlan tests in src/commands/daemon-install-helpers.test.ts (lines 89–332) should have env: {} updated to use an isolated path.

Comments Outside Diff (1)

  1. src/commands/daemon-install-helpers.test.ts, line 89-110 (link)

    P2 Pre-existing tests now read from the real ~/.openclaw/.env

    The pre-existing buildGatewayInstallPlan tests all pass env: {}. With this PR, buildGatewayInstallPlan now calls readStateDirDotEnvVars(params.env), which internally calls resolveStateDir({}). Since no HOME (or OPENCLAW_STATE_DIR) is set in the empty env object, resolveStateDir falls back to os.homedir() from the real OS, which means it attempts to read the actual ~/.openclaw/.env on the developer's machine.

    If a developer has that file (a reasonable scenario given the feature being built), the strict toEqual assertion below will fail non-deterministically:

    expect(plan.environment).toEqual({ OPENCLAW_PORT: "3000" });

    The new buildGatewayInstallPlan — dotenv merge tests correctly avoid this by using env: { HOME: tmpDir }. The pre-existing tests should be updated similarly (or mock readStateDirDotEnvVars) to preserve test isolation. For example:

    const plan = await buildGatewayInstallPlan({
      env: { HOME: "/nonexistent-home-for-test" },
      port: 3000,
      runtime: "node",
      nodePath: "/custom/node",
    });

    This also applies to the other pre-existing tests at lines 112–127, 129–148, 150–178, 180–203, 205–224, 226–245, 247–271, 273–306, and 308–332.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/commands/daemon-install-helpers.test.ts
    Line: 89-110
    
    Comment:
    **Pre-existing tests now read from the real `~/.openclaw/.env`**
    
    The pre-existing `buildGatewayInstallPlan` tests all pass `env: {}`. With this PR, `buildGatewayInstallPlan` now calls `readStateDirDotEnvVars(params.env)`, which internally calls `resolveStateDir({})`. Since no `HOME` (or `OPENCLAW_STATE_DIR`) is set in the empty env object, `resolveStateDir` falls back to `os.homedir()` from the real OS, which means it attempts to read the actual `~/.openclaw/.env` on the developer's machine.
    
    If a developer has that file (a reasonable scenario given the feature being built), the strict `toEqual` assertion below will fail non-deterministically:
    
    ```typescript
    expect(plan.environment).toEqual({ OPENCLAW_PORT: "3000" });
    ```
    
    The new `buildGatewayInstallPlan — dotenv merge` tests correctly avoid this by using `env: { HOME: tmpDir }`. The pre-existing tests should be updated similarly (or mock `readStateDirDotEnvVars`) to preserve test isolation. For example:
    
    ```typescript
    const plan = await buildGatewayInstallPlan({
      env: { HOME: "/nonexistent-home-for-test" },
      port: 3000,
      runtime: "node",
      nodePath: "/custom/node",
    });
    ```
    
    This also applies to the other pre-existing tests at lines 112–127, 129–148, 150–178, 180–203, 205–224, 226–245, 247–271, 273–306, and 308–332.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/commands/daemon-install-helpers.test.ts
Line: 89-110

Comment:
**Pre-existing tests now read from the real `~/.openclaw/.env`**

The pre-existing `buildGatewayInstallPlan` tests all pass `env: {}`. With this PR, `buildGatewayInstallPlan` now calls `readStateDirDotEnvVars(params.env)`, which internally calls `resolveStateDir({})`. Since no `HOME` (or `OPENCLAW_STATE_DIR`) is set in the empty env object, `resolveStateDir` falls back to `os.homedir()` from the real OS, which means it attempts to read the actual `~/.openclaw/.env` on the developer's machine.

If a developer has that file (a reasonable scenario given the feature being built), the strict `toEqual` assertion below will fail non-deterministically:

```typescript
expect(plan.environment).toEqual({ OPENCLAW_PORT: "3000" });
```

The new `buildGatewayInstallPlan — dotenv merge` tests correctly avoid this by using `env: { HOME: tmpDir }`. The pre-existing tests should be updated similarly (or mock `readStateDirDotEnvVars`) to preserve test isolation. For example:

```typescript
const plan = await buildGatewayInstallPlan({
  env: { HOME: "/nonexistent-home-for-test" },
  port: 3000,
  runtime: "node",
  nodePath: "/custom/node",
});
```

This also applies to the other pre-existing tests at lines 112–127, 129–148, 150–178, 180–203, 205–224, 226–245, 247–271, 273–306, and 308–332.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: normalize env var keys and isolate ..." | Re-trigger Greptile

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0b5248161f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// 3. Auth-profile env refs (credential store → env var lookups)
// 4. Service environment (HOME, PATH, OPENCLAW_* — highest)
const environment: Record<string, string | undefined> = {
...readStateDirDotEnvVars(params.env),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Don't freeze state-dir .env keys into the service definition

Adding readStateDirDotEnvVars(params.env) here turns every ~/.openclaw/.env entry into an install-time snapshot. On later starts the gateway still calls loadCliDotEnv() (src/cli/run-main.ts:93), and that loader uses override: false (src/cli/dotenv.ts:12-19), so rotating or deleting a key in .env no longer updates the running service: the stale value already embedded in the plist/unit/task wins forever until the user reinstalls. Before this change, a service restart would pick up the current .env contents.

Useful? React with 👍 / 👎.

Comment on lines +56 to +59
if (isDangerousHostEnvVarName(key) || isDangerousHostEnvOverrideVarName(key)) {
continue;
}
entries[key] = value;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject multiline .env values before copying them into service env

This helper now forwards parsed .env values straight into the install plan, but quoted multiline values are valid dotenv syntax (for example PEM/private-key entries). The Linux and Windows service renderers explicitly reject CR/LF in environment values (src/daemon/systemd-unit.ts:30-34, src/daemon/cmd-set.ts:58-60), so a previously valid ~/.openclaw/.env with a multiline secret will now make openclaw gateway install fail on those platforms.

Useful? React with 👍 / 👎.

@Chevron7Locked
Copy link
Copy Markdown
Contributor Author

CI failures are unrelated to this PR:

  • check-additional: plugin-sdk-api-drift baseline check
  • checks-windows shard 1/6: bundled-plugin-metadata.test.ts formatter failure on Windows

Neither involves the files changed here. The daemon-install-helpers tests pass on all platforms.

@steipete steipete merged commit dd860e7 into openclaw:main Mar 23, 2026
37 of 39 checks passed
@steipete
Copy link
Copy Markdown
Contributor

Landed via temp rebase onto main.

  • Gate: pnpm check && pnpm build && pnpm test
  • Land commit: d02bcd1
  • Merge commit: dd860e7

Thanks @Chevron7Locked!

Chevron7Locked added a commit to Chevron7Locked/openclaw that referenced this pull request Mar 23, 2026
The isolatedHome pattern added in dd860e7 covered only 2 of 10
buildGatewayInstallPlan call sites. The remaining 8 still passed
env: {} (or env without HOME), causing readStateDirDotEnvVars to
fall back to os.homedir() and read the developer's real
~/.openclaw/.env — producing non-deterministic failures when
that file contains keys matching test assertions.

Fixes all remaining call sites to include HOME: isolatedHome,
completing the isolation started in openclaw#52492.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commands Command implementations size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Gateway restart does not reload updated env.MAIL_API_KEY (LaunchAgent keeps stale env)

2 participants