Skip to content

fix(clashd): Starlark dict serialization — use insert_hashed with get_hashed#2

Merged
bglusman merged 24 commits intomainfrom
feat/starlark-eval-execution
Apr 10, 2026
Merged

fix(clashd): Starlark dict serialization — use insert_hashed with get_hashed#2
bglusman merged 24 commits intomainfrom
feat/starlark-eval-execution

Conversation

@bglusman
Copy link
Copy Markdown
Owner

@bglusman bglusman commented Apr 9, 2026

Description

Brief description of the changes in this PR.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Refactoring
  • CI/CD improvement

Testing

  • cargo test passes for all affected crates
  • cargo clippy passes with no warnings
  • cargo fmt is clean
  • New tests added for new functionality

Checklist

  • Code follows the project style guidelines
  • Self-review completed
  • Comments added for complex logic
  • Documentation updated (if applicable)
  • No new warnings introduced

Related Issues

Fixes # (issue number)

Breaking Changes

List any breaking changes and migration steps:

Additional Notes

Any additional context or screenshots.

Librarian added 8 commits April 8, 2026 15:02
- Add PolicyEvaluator with AST parsing and execution
- JSON to Starlark value conversion
- Support string and dict return types from policies
- Comprehensive test suite for allow/deny/review verdicts
- Add clashd to workspace
- Rewrite main.rs to use PolicyEngine (no more hardcoded rules)
- Domain list manager with exact, regex, and HOSTS format matching
- Dynamic threat feed fetching (URLHaus, StevenBlack hosts, Yoyo ad servers)
- Per-agent policy configs (allowed/denied domains)
- Starlark policy exposes domain, domain_lists, agent context
- Background domain list refresh (every 6 hours)
- /domains/summary and /domains/check endpoints
- Default policy.star with domain filtering + tool rules
- clashd README with full API reference and examples
- Plugin README for zeroclawed-policy-plugin
- Installation script (systemd service setup)
- Policy activation script for OpenClaw agents
- Updated top-level README with clashd architecture
- Fix plugin to pass agent_id (not identity) to clashd
- test_engine_allows_by_default: Basic allow verdict
- test_engine_denies_when_policy_returns_deny: Deny with reason
- test_engine_fail_closed_on_invalid_policy: Missing evaluate function
- test_engine_fail_closed_on_runtime_error: Starlark runtime errors
- test_domain_extraction_from_url: URL parsing
- test_domain_extraction_from_domain_field: Direct domain field
- test_domain_extraction_no_domain: No domain in args
- test_agent_config_loading: Per-agent config application
- Deprecate onecli-client policy module (now fails closed)
- Add HTTP timeouts for domain list fetching (30s total, 10s connect)
- Initial domain list refresh on startup with error handling
- Make agent config parsing fail-closed on malformed JSON
- Add retry behavior for missed refresh intervals
…_hashed

The previous json_to_starlark Object handler used heap.alloc(entries)
which created a list of tuples instead of a dict. Now uses
SmallMap::insert_hashed with Value::get_hashed() following the
starlark crate's own AllocDict pattern.

Also fixes:
- Deprecate onecli-client policy module (fails closed)
- Add HTTP timeouts for domain list fetching
- Make agent config parsing fail-closed on malformed JSON
- Remove f-strings from default policy (Starlark dialect doesn't support them)
Copilot AI review requested due to automatic review settings April 9, 2026 20:15
Librarian added 2 commits April 9, 2026 20:16
- Add Default impl for DomainListManager
- Add name() getter for DomainList to use the name field
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces clashd, a new Rust sidecar service that evaluates OpenClaw tool calls against Starlark policies (with domain filtering + threat intel feeds), and updates the policy plugin/docs to route evaluations through clashd.

Changes:

  • Add new crates/clashd service (Axum HTTP API) with Starlark evaluation, domain list management, and per-agent context.
  • Update zeroclawed-policy-plugin to send agent_id context to clashd and add plugin documentation.
  • Deprecate onecli-client’s local policy module behavior and expand repository documentation around clashd.

Reviewed changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
README.md Updates high-level architecture + adds clashd policy enforcement section.
crates/zeroclawed-policy-plugin/README.md Adds docs for the OpenClaw policy plugin and clashd integration.
crates/zeroclawed-policy-plugin/before_tool_call/index.ts Sends agent_id context to clashd for policy evaluation.
crates/onecli-client/src/policy.rs Changes behavior of the legacy policy-check API and marks it deprecated.
crates/clashd/src/policy/mod.rs Introduces core policy types (Verdict, PolicyResult) and module wiring.
crates/clashd/src/policy/eval.rs Implements Starlark policy loading + evaluation + JSON→Starlark conversion.
crates/clashd/src/policy/engine.rs Adds PolicyEngine combining Starlark evaluation with domain lists + agent config.
crates/clashd/src/policy/engine/tests.rs Adds tests for engine evaluation, fail-closed behavior, domain extraction.
crates/clashd/src/main.rs Adds Axum server with /evaluate, domain endpoints, and refresh loop.
crates/clashd/src/lib.rs Declares library modules and re-exports main types.
crates/clashd/src/domain_lists.rs Adds dynamic/static domain list parsing, matching, and refresh logic.
crates/clashd/scripts/install.sh Adds a systemd-oriented install script for clashd.
crates/clashd/scripts/activate-policy.sh Adds an activation script to install/configure the OpenClaw plugin.
crates/clashd/README.md Adds service documentation, API docs, and usage examples.
crates/clashd/config/default-policy.star Adds default Starlark policy rules (domain/tool-level checks).
crates/clashd/config/agents.example.json Adds example per-agent configuration for threat feeds and allow/deny lists.
crates/clashd/Cargo.toml New crate manifest for clashd.
Cargo.toml Adds crates/clashd to the workspace members.
Cargo.lock Adds new dependencies for clashd (Axum, Starlark, Reqwest, etc.).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +63 to +69
/// Set per-agent configurations
pub async fn set_agent_configs(&self, configs: Vec<AgentPolicyConfig>) {
let mut map = self.agent_configs.write().await;
for config in configs {
map.insert(config.agent_id.clone(), config);
}
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_agent_configs stores domain_list_sources on each AgentPolicyConfig, but those sources are never registered with DomainListManager (no calls to domain_manager.add_source). As a result, dynamic threat intel feeds will never be fetched/refreshed, and domain_manager.matches() will always return empty.

Suggested fix: when setting agent configs (or when loading them in main.rs), iterate config.domain_list_sources and add them to domain_manager (dedupe by name/url) using refresh_secs as the refresh interval.

Copilot uses AI. Check for mistakes.
Comment thread crates/clashd/src/policy/engine.rs Outdated
Comment on lines +84 to +93
// Add agent-specific context
if let Some(agent_id) = agent_id {
context["agent_id"] = Value::String(agent_id.to_string());

// Check if this tool call involves a domain
let domain = Self::extract_domain(args);
if let Some(ref domain_str) = domain {
// Check against dynamic domain lists
let matched = self.domain_manager.matches(domain_str).await;
context["domain_lists"] = Value::Array(
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Domain extraction + domain list matching only happens inside if let Some(agent_id) = agent_id { ... }. If the caller omits context.agent_id, context["domain"] and context["domain_lists"] are never set, so the default policy (and any policy relying on those fields) can be bypassed.

Suggested fix: extract domain and evaluate domain_manager.matches() regardless of whether agent_id is present (and/or require agent_id at the API layer and default it to "unknown").

Copilot uses AI. Check for mistakes.
Comment thread crates/clashd/src/main.rs
Comment on lines +224 to +229
let addr = SocketAddr::from(([0, 0, 0, 0], port));
info!("");
info!("🚀 Listening on http://{}", addr);

let listener = tokio::net::TcpListener::bind(addr).await?;
axum::serve(listener, app).await?;
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server binds to 0.0.0.0 by default, exposing an unauthenticated policy-evaluation API on all interfaces. For a sidecar intended to be called locally, this is a risky default.

Suggested fix: default-bind to 127.0.0.1 (or make bind address configurable via env), and consider adding an auth token/header check if this may be exposed beyond localhost.

Copilot uses AI. Check for mistakes.
Comment on lines +186 to +198
let mut lists = self.lists.write().await;
let now = Instant::now();

for dl in lists.iter_mut() {
let needs_refresh = dl.last_fetched
.map(|f| now.duration_since(f) > dl.refresh_interval)
.unwrap_or(true);

if needs_refresh {
if let Err(e) = dl.fetch(client).await {
warn!(name = %dl.name, error = %e, "Failed to refresh domain list");
}
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DomainListManager::refresh_all holds the write lock on lists while awaiting network fetches (dl.fetch(client).await). This blocks concurrent matches()/summary() calls for the full duration of refresh and can stall policy evaluations.

Suggested fix: avoid holding the lock across .await (e.g., collect the indices/URLs to refresh under a read lock, then fetch outside the lock and swap results back in, or give each list its own lock).

Suggested change
let mut lists = self.lists.write().await;
let now = Instant::now();
for dl in lists.iter_mut() {
let needs_refresh = dl.last_fetched
.map(|f| now.duration_since(f) > dl.refresh_interval)
.unwrap_or(true);
if needs_refresh {
if let Err(e) = dl.fetch(client).await {
warn!(name = %dl.name, error = %e, "Failed to refresh domain list");
}
}
let now = Instant::now();
let due_indices = {
let lists = self.lists.read().await;
lists
.iter()
.enumerate()
.filter_map(|(idx, dl)| {
let needs_refresh = dl.last_fetched
.map(|f| now.duration_since(f) > dl.refresh_interval)
.unwrap_or(true);
needs_refresh.then_some(idx)
})
.collect::<Vec<_>>()
};
for idx in due_indices {
let mut dl = {
let mut lists = self.lists.write().await;
if idx >= lists.len() {
continue;
}
lists.remove(idx)
};
if let Err(e) = dl.fetch(client).await {
warn!(name = %dl.name, error = %e, "Failed to refresh domain list");
}
let mut lists = self.lists.write().await;
let insert_idx = idx.min(lists.len());
lists.insert(insert_idx, dl);

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +109
// Check exact matches (including subdomain checks)
if self.exact.contains(&domain_lower) {
return true;
}

// Check subdomain matches
// e.g., "foo.example.com" should match "example.com"
for part in self.exact.iter() {
if domain_lower.ends_with(&format!(".{}", part)) {
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DomainList::matches checks subdomain matches by iterating every exact entry and doing ends_with, making each lookup O(N) in the size of the list. With large threat feeds this will become a hot path.

Suggested fix: compute parent suffixes from the queried domain (e.g., a.b.ca.b.c, b.c, c) and check those in the hash set, which keeps lookup ~O(labels) instead of O(entries).

Suggested change
// Check exact matches (including subdomain checks)
if self.exact.contains(&domain_lower) {
return true;
}
// Check subdomain matches
// e.g., "foo.example.com" should match "example.com"
for part in self.exact.iter() {
if domain_lower.ends_with(&format!(".{}", part)) {
// Check exact match first.
if self.exact.contains(&domain_lower) {
return true;
}
// Check parent-domain suffixes.
// e.g., "foo.example.com" should match "example.com"
// by checking "example.com" and then "com" directly in the hash set.
for (idx, ch) in domain_lower.char_indices() {
if ch == '.' && self.exact.contains(&domain_lower[idx + 1..]) {

Copilot uses AI. Check for mistakes.
Comment on lines +18 to +32
# Build clashd
echo "Building clashd..."
cd "$(dirname "$0")/.."
cargo build --release -p clashd

# Create directories
echo "Creating directories..."
sudo mkdir -p "$CLASHD_DIR"
sudo mkdir -p "$CONFIG_DIR"
sudo mkdir -p /var/log/clashd

# Install binary
echo "Installing binary..."
sudo cp target/release/clashd "$CLASHD_DIR/"
sudo chmod +x "$CLASHD_DIR/clashd"
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script builds from the workspace root target directory, but then copies target/release/clashd relative to crates/clashd/ (cd ".../.."). In a Cargo workspace the binary will typically be at ../../target/release/clashd, so the current path will fail on a default setup.

Suggested fix: copy from the workspace target dir (e.g., resolve with cargo metadata / CARGO_TARGET_DIR, or use a path relative to the workspace root).

Copilot uses AI. Check for mistakes.
Comment thread crates/clashd/README.md
Comment on lines +24 to +27
# Run with custom agent configs
CLASHD_POLICY=./config/default-policy.star \
CLASHD_AGENTS=./config/agents.json \
./target/release/clashd
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Quick Start references CLASHD_AGENTS=./config/agents.json, but this PR adds config/agents.example.json (no agents.json). This makes the documented command fail.

Suggested fix: either add a real config/agents.json (or copy from example at runtime) or update the README to reference agents.example.json and explain how to create agents.json.

Copilot uses AI. Check for mistakes.
Comment on lines +43 to +44
cp "$PLUGIN_SOURCE/index.ts" "$PLUGIN_DIR/before_tool_call/"
cp "$PLUGIN_SOURCE/tsconfig.json" "$PLUGIN_DIR/before_tool_call/" 2>/dev/null || true
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The activation script installs index.ts into the OpenClaw plugin directory, but the repo also contains a compiled index.js and Node environments typically execute JS (TS requires an additional loader). As written, the installed plugin may not run depending on OpenClaw's plugin runtime.

Suggested fix: install the compiled JS output (or both JS + TS), and/or document the required runtime (e.g., ts-node) explicitly.

Suggested change
cp "$PLUGIN_SOURCE/index.ts" "$PLUGIN_DIR/before_tool_call/"
cp "$PLUGIN_SOURCE/tsconfig.json" "$PLUGIN_DIR/before_tool_call/" 2>/dev/null || true
if [ -f "$PLUGIN_SOURCE/index.js" ]; then
cp "$PLUGIN_SOURCE/index.js" "$PLUGIN_DIR/before_tool_call/"
fi
if [ -f "$PLUGIN_SOURCE/index.ts" ]; then
cp "$PLUGIN_SOURCE/index.ts" "$PLUGIN_DIR/before_tool_call/"
fi
if [ -f "$PLUGIN_SOURCE/tsconfig.json" ]; then
cp "$PLUGIN_SOURCE/tsconfig.json" "$PLUGIN_DIR/before_tool_call/"
fi
if [ ! -f "$PLUGIN_SOURCE/index.js" ] && [ ! -f "$PLUGIN_SOURCE/index.ts" ]; then
echo "Error: No plugin entrypoint found in $PLUGIN_SOURCE (expected index.js or index.ts)"
exit 1
fi

Copilot uses AI. Check for mistakes.
## See Also

- [clashd](../clashd/) — Policy sidecar documentation
- [OpenClaw Plugin System](../../docs/plugins.md) — General plugin docs
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README links to ../../docs/plugins.md, but that file does not exist in this repository, so the "OpenClaw Plugin System" reference is broken.

Suggested fix: update the link to an existing doc (or add the missing document).

Suggested change
- [OpenClaw Plugin System](../../docs/plugins.md) — General plugin docs
- OpenClaw Plugin System — General plugin docs

Copilot uses AI. Check for mistakes.
Comment on lines +29 to 46
/// Check a tool against policy via clashd (not local clash binary)
///
/// DEPRECATED: This module is being replaced by clashd sidecar + zeroclawed-policy-plugin.
/// For new code, use the clashd HTTP API directly or the plugin hook.
///
/// This function now FAILS CLOSED (denies on error) for security.
pub async fn check_tool(policy_file: Option<&std::path::Path>, check: &PolicyCheck) -> PolicyResult {
let policy_file = match policy_file {
Some(p) => p,
None => {
debug!("No policy file configured, allowing all");
return PolicyResult {
allowed: true,
reason: None,
action: PolicyAction::Allow,
};
}
};
let _ = policy_file; // Unused - kept for API compatibility
let _ = check; // Unused - kept for API compatibility

match check_with_clash(policy_file, check).await {
Ok(result) => result,
Err(e) => {
warn!("Policy check failed: {}, defaulting to allow", e);
PolicyResult {
allowed: true,
reason: Some(format!("Policy check error: {}", e)),
action: PolicyAction::Allow,
}
}
// This module is deprecated. clashd is the new policy enforcement path.
// Fail closed rather than allowing potentially unsafe operations.
warn!("onecli-client policy module is deprecated, use clashd instead");
PolicyResult {
allowed: false,
reason: Some("Policy module deprecated, use clashd".to_string()),
action: PolicyAction::Deny,
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment says this checks policy via clashd, but check_tool currently ignores its inputs and unconditionally returns allowed: false. That’s a semantic breaking change for any callers still using this API, and it doesn’t match the stated "deny on error" behavior.

Suggested fix: either (1) keep the previous behavior and only fail closed on actual evaluation errors, (2) implement a real clashd HTTP call here, or (3) gate the hard-deny behavior behind a feature flag/env and mark the function #[deprecated] with a migration path.

Copilot uses AI. Check for mistakes.
Librarian added 14 commits April 9, 2026 20:25
- Add debug logging for cipher matching
- Check custom fields (type 1 = hidden/password) often used for API keys
- Also check username field in login ciphers
- Better error messages for troubleshooting
VaultWarden custom fields with encrypted names default to type 0 (text)
not type 1 (hidden). Now checks any field with a non-empty value.
Documents:
- Environment variable vs VaultWarden storage
- Encryption limitation (API returns ciphertext)
- Recommended setup for working API proxy
- Testing and troubleshooting guide
- Security considerations
More descriptive naming. Added [[bin]] target with HTTP service:
- GET /health — service status
- POST /peek — URL peeking (stub)
- POST /scan — content scanning (structural + semantic)
- POST /scan/injection — injection detection

Replaces Python outpost-lite on port 9800.
- Replaces opt-in Outpost model with a mandatory proxy gateway
- Implements outbound exfiltration scan and inbound injection scan
- Implements automatic credential injection for API providers
- Added Tier 1 (Env Var) enforcement roadmap and docs
- Deployed to .210:8080
- Outbound exfiltration scan + inbound injection scan via adversary-detector
- Credential injection for 7 providers (OpenAI, Anthropic, Google, etc.)
- 10 unit tests passing
- Deployed to 192.168.1.210:8080
- Tier 1 (HTTP_PROXY env var) enforcement ready
- New agent_config.rs: reads same agents.json as clashd
- Security gateway auto-loads credentials from provider configs
- New install-security-stack.sh: one-command deploy of full stack
- Updated agents.json with providers + proxy policy per agent
- 4 tests passing (2 unit + 2 e2e ignored for CI)
- Install script now accepts multiple host IPs
- Falls back to scripts/targets.txt if no args given
- Added: status, restart actions for fleet management
- Parallel-ready: loops over all targets per action
- ci.yml: replace 'outpost' with 'adversary-detector', add clashd + security-gateway
- integration-tests.yml: skip gateway e2e tests in CI (need running service)
- Added docs/MANUAL_INSTALL.md as fallback when automated deploy fails
- All 474 tests pass, 0 clippy warnings
@bglusman bglusman merged commit 9f7cf96 into main Apr 10, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants