bug(skills): os-automation skill over-triggers for generic shell commands, suppresses native bash tool usage

## Summary

When a user asks something like "Run the shell command: echo hello-world", the skill disambiguator selects `os-automation` with confidence ~0.80. With the skill injected into context, gpt-4o-mini consistently refuses to use the native `bash` tool and responds with "I cannot execute shell commands directly."

## Reproduction

Config: `.local/config/testing.toml` (gpt-4o-mini, `os-automation` skill in `.zeph/skills/`)

Prompt:
```
Run the shell command: echo hello-world
```

Expected: `bash` tool invoked with `echo hello-world`  
Actual: "I cannot execute shell commands directly. However, you can run..."

The `bash` tool is present in the tool schema (confirmed via debug dump). The issue is that injecting the `os-automation` skill at high confidence leads the LLM to believe it should only perform OS-level automation tasks (desktop notifications, clipboard, screenshots), so it stops considering the native `bash` tool for arbitrary shell commands.

By contrast, coding-context prompts ("What is the current git branch? Run git status.") successfully invoke `bash` because a different skill (or no skill) is injected.

## Root Cause Hypothesis

The `os-automation` skill description lists specific use cases: notifications, clipboard, screenshots, opening URLs, launching apps, etc. Generic shell commands such as `echo` match none of these use cases, but the skill is still selected because the prompt embeds close to "OS automation" concepts. Once injected, the skill context appears to override the LLM's awareness of the native `bash` tool.

## Impact

Users asking for simple shell commands in a non-coding context (no git/cargo/file references) may get unhelpful responses despite `bash` being available.

## Suggested Fix

1. Tighten the `os-automation` skill's embedding descriptor so that it explicitly excludes generic shell execution
2. Or: add a `should_not_use_when` / `exclusion_patterns` field to SKILL.md that prevents the skill from triggering on bare shell command requests
3. Or: when a skill is matched with confidence below 0.85 and the user explicitly requests a tool by name ("Run the shell command"), prefer the native tool over the skill context
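
To make options 2 and 3 concrete, here is a minimal Python sketch of a pre-injection gate. It is illustrative only and not taken from the codebase: `SkillMatch`, `should_inject_skill`, the `EXPLICIT_BASH_REQUEST` regex, and the way `exclusion_patterns` is represented are all assumptions, and the real disambiguator may be structured quite differently.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Assumed threshold, copied from option 3 above; not a value read from the codebase.
EXPLICIT_TOOL_CONFIDENCE_FLOOR = 0.85

# Hypothetical phrasings in which the user names the native shell tool explicitly.
EXPLICIT_BASH_REQUEST = re.compile(
    r"\b(run|execute)\s+(the\s+|this\s+)?(shell|bash)\s+command\b",
    re.IGNORECASE,
)


@dataclass
class SkillMatch:
    """Hypothetical result of the skill disambiguator for one prompt."""

    name: str
    confidence: float
    # Option 2: regexes a skill could declare (e.g. in SKILL.md) to opt out.
    exclusion_patterns: list[str] = field(default_factory=list)


def should_inject_skill(prompt: str, match: SkillMatch) -> bool:
    """Return False when the skill should be skipped so native tools take over."""
    # Option 2: the skill's own exclusion patterns veto injection outright.
    for pattern in match.exclusion_patterns:
        if re.search(pattern, prompt, re.IGNORECASE):
            return False
    # Option 3: a match below the floor loses to an explicit tool request.
    if (
        match.confidence < EXPLICIT_TOOL_CONFIDENCE_FLOOR
        and EXPLICIT_BASH_REQUEST.search(prompt)
    ):
        return False
    return True
```

With the reproduction prompt and a 0.80 match, this gate would skip `os-automation`, while a clipboard or screenshot request at the same confidence would still get the skill. Whether a regex is precise enough to detect "explicit tool requests" in practice is an open question.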

## Session

CI-344 (2026-03-31). Provider: gpt-4o-mini. Log: `.local/testing/debug/ci344.log`.

Issue: #2501