bug(skills): rust-agent-handoff skill injected at low confidence (0.14) hijacks general memory/update tasks #2512

@bug-ops

Summary

During live testing (CI-350, 2026-03-31), the rust-agent-handoff skill was selected at confidence 0.14 for a generic memory update prompt:

INFO zeph_core::agent::context::assembly: disambiguation selected skill skill=rust-agent-handoff confidence=0.14100000262260437

Prompt: "Now update: Alice was promoted and now works as a Senior Engineer at Acme Corp."

Observed Behavior

With disambiguation_threshold = 0.05 in testing.toml, rust-agent-handoff (0.141 > 0.05) passes the threshold and its SKILL.md context is injected. The skill's content references .local/handoff/ YAML files, causing the agent to:

  1. Attempt an edit on .local/handoff/Alice_work_update.yaml (file not found)
  2. Call find_path to search for handoff YAML files
  3. Eventually call memory_save as a fallback

Expected behavior: memory_save is called directly, without the filesystem detour.

Impact

  • Wasted tool calls (two extra tool invocations before the correct action)
  • Longer response latency
  • Potential user confusion from unexpected tool output
  • disambiguation_threshold = 0.05 is too permissive for production use

Root Cause

disambiguation_threshold = 0.05 is intentionally low in testing.toml to exercise skill matching. However, skills like rust-agent-handoff contain strong filesystem/YAML operation patterns that redirect generic update tasks to file operations.
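
The effect of the threshold can be illustrated with a minimal sketch. The `SkillCandidate` type and `select_skill` function below are hypothetical names invented for illustration, not the actual zeph_core API, and the strict `>` comparison is an assumption based on the "0.141 > 0.05" observation above.

```rust
/// Hypothetical candidate produced by skill matching. The type and field
/// names are illustrative and are not taken from zeph_core.
#[derive(Debug, Clone, PartialEq)]
pub struct SkillCandidate {
    pub name: String,
    pub confidence: f32,
}

/// Returns the highest-confidence candidate whose confidence strictly
/// exceeds `threshold`, or `None` if no candidate clears it.
/// (Assumption: the real gate is a simple comparison like this one.)
pub fn select_skill(candidates: &[SkillCandidate], threshold: f32) -> Option<&SkillCandidate> {
    candidates
        .iter()
        .filter(|c| c.confidence > threshold)
        .max_by(|a, b| a.confidence.total_cmp(&b.confidence))
}
```

Under these assumptions, a candidate with confidence 0.141 is selected at a threshold of 0.05 and rejected at 0.25.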

Suggestions

  1. Raise disambiguation_threshold to at least 0.25 in testing.toml (and in production defaults); a match at 0.14 is noise
  2. Consider adding a "scope guard" or intent classifier to reject skill injection when the user intent clearly maps to a core capability (memory, search, etc.)
  3. Alternatively, add negative examples to rust-agent-handoff SKILL.md to prevent triggering on generic update/save prompts
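
One possible shape for the scope guard in suggestion 2 is sketched below. Everything here is an assumption made for illustration: `CoreIntent`, `classify_core_intent`, `should_inject`, the keyword cues, and the `strong_match` parameter do not exist in the codebase, and a real classifier would likely use embeddings or a model rather than keywords.

```rust
/// Core capabilities that a weakly matched skill should not override.
/// (Hypothetical enum; not part of zeph.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CoreIntent {
    Memory,
    Search,
}

/// Toy keyword classifier, for illustration only.
pub fn classify_core_intent(prompt: &str) -> Option<CoreIntent> {
    const MEMORY_CUES: [&str; 3] = ["remember", "now update", "save this"];
    const SEARCH_CUES: [&str; 2] = ["search for", "look up"];

    let p = prompt.to_lowercase();
    if MEMORY_CUES.iter().any(|cue| p.contains(cue)) {
        Some(CoreIntent::Memory)
    } else if SEARCH_CUES.iter().any(|cue| p.contains(cue)) {
        Some(CoreIntent::Search)
    } else {
        None
    }
}

/// Decides whether a skill should be injected. When the prompt maps to a
/// core intent, the skill must clear `strong_match` (or `threshold`, if
/// that is higher); otherwise the base `threshold` applies.
pub fn should_inject(prompt: &str, confidence: f32, threshold: f32, strong_match: f32) -> bool {
    let required = if classify_core_intent(prompt).is_some() {
        strong_match.max(threshold)
    } else {
        threshold
    };
    confidence > required
}
```

With the prompt from this report, a confidence of 0.141, a base threshold of 0.05, and an assumed `strong_match` of 0.6, `should_inject` returns false, so the agent would fall through to its core memory path. The same confidence on a prompt with no core-intent cue would still be injected, which keeps the testing config's permissive matching intact.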

Config

  • testing.toml, disambiguation_threshold = 0.05
  • cosine_weight = 0.7
  • 27 skills loaded

Labels

P3 (Research, medium-high complexity); bug (Something isn't working); memory (zeph-memory crate, SQLite); skills (zeph-skills crate)
