Commit b447fa1

feat(memory): benchmarking harness, cost-sensitive store routing, goal-conditioned write gate (#2485)
Closes #2419, #2444, #2408.

- #2419: Add .local/testing/bench-memory.py, which seeds N facts, runs M recall queries, and reports hit_rate/latency/compression/interference as JSON with CSV append for longitudinal tracking.
- #2444: Add cost-sensitive store routing to zeph-memory. New StoreRouter trait with HeuristicRouter, LlmRouter, and HybridRouter. HybridRouter uses the heuristic confidence signal and escalates to the LLM only when the query is ambiguous. New [memory.store_routing] config section with routing_classifier_provider field. Follow-up wiring: #2484.
- #2408: Add CraniMem goal-conditioned write gate to admission control. goal_utility added as 6th AdmissionFactor. Gate is disabled by default (goal_conditioned_write = false). Degenerate goal text guard (<10 chars) prevents arbitrary admission on trivial inputs. New goal_utility_provider config field. Follow-up wiring: #2483.
1 parent da8775b commit b447fa1

9 files changed: +789 -13 lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 - feat(skills): ERL experiential reflective learning — `spawn_erl_reflection()` fires a background LLM call after each successful skill+tool turn to extract transferable heuristics; heuristics are stored in `skill_heuristics` table with Jaccard deduplication; at skill matching time `build_erl_heuristics_prompt()` prepends a `## Learned Heuristics` section to the skill context; controlled by `[skills.learning] erl_enabled = false`, `erl_max_heuristics_per_skill = 3`, `erl_min_confidence = 0.5` (closes #2463)
 - feat(db): migrations 057 (`skill_usage_log`) and 058 (`skill_heuristics`) for STEM and ERL storage; both SQLite and Postgres variants
 - feat(config): `LearningConfig` extended with 14 new fields for ARISE/STEM/ERL (all disabled by default); new fields registered in `config/default.toml` as commented-out entries
+- feat(testing): memory benchmarking harness — `.local/testing/bench-memory.py` seeds N facts via agent CLI `--pipe` mode, recalls them, and reports `hit_rate`, `avg_recall_latency_ms`, `p50_ms`, `p99_ms`, `compression_ratio`, `interference_rate` as JSON; token-level hit matching tolerates LLM paraphrase; optional CSV append for longitudinal tracking; stdlib-only Python, no external dependencies (closes #2419)
+- feat(memory): cost-sensitive store routing — new `[memory.store_routing]` config section with `strategy` (`heuristic`/`llm`/`hybrid`), `routing_classifier_provider`, `confidence_threshold`, `fallback_route`; `RoutingDecision` struct carries route + confidence + reasoning; `LlmRouter` calls configured provider with quoted query (injection hardening) and parses structured JSON response; `HybridRouter` runs heuristic first and escalates to LLM only when confidence < threshold; `HeuristicRouter.route_with_confidence()` returns granular confidence (`1.0 / matched_count` for ambiguous queries); `AsyncMemoryRouter` trait for async callers; all LLM paths fall back to heuristic on failure (closes #2444)
+- feat(memory): CraniMem goal-conditioned write gate (#2408) — `goal_utility` sixth factor added to `AdmissionFactors` and `AdmissionWeights`; new config fields `goal_conditioned_write`, `goal_utility_provider`, `goal_utility_threshold` (default 0.4), `goal_utility_weight` (default 0.25) in `[memory.admission]`; `GoalGateConfig` and `AdmissionControl::with_goal_gate()`; embedding-first scoring with optional LLM refinement for borderline cases; minimum goal text length check (< 10 chars treated as absent, W3.1 fix); soft floor of 0.1 prevents off-goal memories from scoring zero above threshold; zero regression when `goal_conditioned_write = false` (closes #2408)
 - feat(core): `/new` slash command — resets conversation context (messages, compaction state, tool caches, focus/sidequest, pending plans) while preserving memory, MCP connections, providers, and skills; creates a new `ConversationId` in SQLite for audit trail; generates a session digest for the outgoing conversation fire-and-forget unless `--no-digest` is passed; active sub-agents and background compression tasks are cancelled; `--keep-plan` preserves a pending plan graph; available in all channels (CLI, TUI, Telegram) via the unified `handle_builtin_command` path (closes #2451)
 - feat(memory): Kumiho AGM-inspired belief revision for graph edges — new `BeliefRevisionConfig` with `similarity_threshold`; `find_superseded_edges()` uses contradiction heuristic (same relation domain + high cosine similarity = supersession); `superseded_by` column added to `graph_edges` for audit trail; `invalidate_edge_with_supersession()` in `GraphStore`; `resolve_edge_typed` accepts optional `BeliefRevisionConfig`; controlled by `[memory.graph.belief_revision] enabled = false` (migration 056, closes #2441)
 - feat(memory): D-MEM RPE-based tiered graph extraction routing — `RpeRouter` computes heuristic surprise score from context similarity and entity novelty; low-RPE turns skip the MAGMA LLM extraction pipeline; `consecutive_skips` safety valve forces extraction after `max_skip_turns` consecutive skips; `extract_candidate_entities()` helper for cheap regex+keyword entity detection; controlled by `[memory.graph.rpe] enabled = false, threshold = 0.3, max_skip_turns = 5` (closes #2442)

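The store-routing entry above describes a confidence-gated escalation: run the heuristic, accept it when its confidence reaches the threshold, otherwise ask the LLM and fall back to the heuristic result if that call fails. A minimal sketch of that control flow follows. The keyword table, the route names (`episodic`, `graph`, `semantic`), the zero-match confidence of `0.0`, and the closure standing in for the LLM classifier are all hypothetical and are not taken from zeph-memory.

```rust
// Illustrative sketch only: types and helpers are simplified stand-ins,
// not the actual zeph-memory implementation.

#[derive(Debug, Clone, PartialEq)]
struct RoutingDecision {
    route: String,
    confidence: f32,
    reasoning: String,
}

/// Hypothetical heuristic: every keyword found in the query counts as one match.
/// Confidence is `1.0 / matched_count`, so ambiguous queries score lower.
fn heuristic_route(query: &str, fallback_route: &str) -> RoutingDecision {
    // Assumed keyword table; the real patterns are not shown in this commit.
    let patterns = [("when", "episodic"), ("related", "graph"), ("what", "semantic")];
    let q = query.to_lowercase();
    let matched: Vec<&str> = patterns
        .iter()
        .filter(|(kw, _)| q.contains(*kw))
        .map(|(_, route)| *route)
        .collect();
    match matched.len() {
        // Assumption: no match means zero confidence and the fallback route.
        0 => RoutingDecision {
            route: fallback_route.to_string(),
            confidence: 0.0,
            reasoning: "no pattern matched".to_string(),
        },
        n => RoutingDecision {
            route: matched[0].to_string(),
            confidence: 1.0 / n as f32,
            reasoning: format!("{n} pattern(s) matched"),
        },
    }
}

/// Heuristic first; escalate to `llm` only when confidence is below `threshold`.
/// Any LLM error falls back to the heuristic decision.
fn hybrid_route<F>(query: &str, threshold: f32, fallback_route: &str, llm: F) -> RoutingDecision
where
    F: Fn(&str) -> Result<RoutingDecision, String>,
{
    let heuristic = heuristic_route(query, fallback_route);
    if heuristic.confidence >= threshold {
        return heuristic;
    }
    llm(query).unwrap_or(heuristic)
}
```

With the default `confidence_threshold` of `0.7`, a query matching a single pattern (confidence `1.0`) never reaches the LLM, while a query matching two or more patterns (confidence `0.5` or lower) does.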
crates/zeph-config/src/memory.rs

Lines changed: 95 additions & 1 deletion
@@ -775,6 +775,12 @@ pub struct MemoryConfig {
     /// Default: `None` (uses `sqlite_path` instead).
     #[serde(default)]
     pub database_url: Option<String>,
+    /// Cost-sensitive store routing (#2444).
+    ///
+    /// When `store_routing.enabled = true`, query intent is classified and routed to
+    /// the cheapest sufficient backend instead of querying all stores on every turn.
+    #[serde(default)]
+    pub store_routing: StoreRoutingConfig,
 }
 
 fn default_crossover_turn_threshold() -> u32 {
@@ -1419,6 +1425,11 @@ pub struct AdmissionWeights {
     /// Content type prior based on role. Default: `0.15`.
     #[serde(deserialize_with = "validate_admission_weight")]
     pub content_type_prior: f32,
+    /// Goal-conditioned utility (#2408). `0.0` when `goal_conditioned_write = false`.
+    /// When enabled, set this alongside reducing `future_utility` so total sums remain stable.
+    /// Normalized automatically at runtime. Default: `0.0`.
+    #[serde(deserialize_with = "validate_admission_weight")]
+    pub goal_utility: f32,
 }
 
 impl Default for AdmissionWeights {
@@ -1429,6 +1440,7 @@ impl Default for AdmissionWeights {
             semantic_novelty: 0.30,
             temporal_recency: 0.10,
             content_type_prior: 0.15,
+            goal_utility: 0.0,
         }
     }
 }
@@ -1443,7 +1455,8 @@ impl AdmissionWeights {
             + self.factual_confidence
             + self.semantic_novelty
             + self.temporal_recency
-            + self.content_type_prior;
+            + self.content_type_prior
+            + self.goal_utility;
         if sum <= f32::EPSILON {
             return Self::default();
         }
@@ -1453,6 +1466,7 @@ impl AdmissionWeights {
             semantic_novelty: self.semantic_novelty / sum,
             temporal_recency: self.temporal_recency / sum,
             content_type_prior: self.content_type_prior / sum,
+            goal_utility: self.goal_utility / sum,
         }
     }
 }
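The three hunks above extend the weight normalization to a sixth factor. A minimal sketch of the effect, assuming a stand-in `Weights` struct and arbitrary example values (the fallback is passed in explicitly here because the full default weights are not visible in this diff):

```rust
// Illustrative sketch mirroring the normalization shown in the diff.
// `Weights` is a stand-in type and the example values are arbitrary.

#[derive(Debug, Clone, Copy, PartialEq)]
struct Weights {
    future_utility: f32,
    factual_confidence: f32,
    semantic_novelty: f32,
    temporal_recency: f32,
    content_type_prior: f32,
    goal_utility: f32,
}

impl Weights {
    /// Sum of all six factors, including the new `goal_utility`.
    fn sum(&self) -> f32 {
        self.future_utility
            + self.factual_confidence
            + self.semantic_novelty
            + self.temporal_recency
            + self.content_type_prior
            + self.goal_utility
    }

    /// Scales every weight so the total is 1.0; returns `fallback` when the sum is ~0.
    fn normalized(&self, fallback: Weights) -> Weights {
        let sum = self.sum();
        if sum <= f32::EPSILON {
            return fallback;
        }
        Weights {
            future_utility: self.future_utility / sum,
            factual_confidence: self.factual_confidence / sum,
            semantic_novelty: self.semantic_novelty / sum,
            temporal_recency: self.temporal_recency / sum,
            content_type_prior: self.content_type_prior / sum,
            goal_utility: self.goal_utility / sum,
        }
    }
}
```

Because the sum now includes `goal_utility`, giving it a non-zero weight shrinks the other five factors proportionally after normalization, which is why the doc comment suggests lowering `future_utility` at the same time.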
@@ -1489,6 +1503,32 @@ pub struct AdmissionConfig {
     /// Background RL model retraining interval in seconds. Default: `3600`.
     #[serde(default = "default_rl_retrain_interval_secs")]
     pub rl_retrain_interval_secs: u64,
+    /// Enable goal-conditioned write gate (#2408). When `true`, memories are scored
+    /// against the current task goal and rejected if relevance is below `goal_utility_threshold`.
+    /// Zero regression when `false`. Default: `false`.
+    #[serde(default)]
+    pub goal_conditioned_write: bool,
+    /// Provider name from `[[llm.providers]]` for goal-utility LLM refinement.
+    /// Used only for borderline cases (similarity within 0.1 of threshold).
+    /// Falls back to the primary provider when empty. Default: `""`.
+    #[serde(default)]
+    pub goal_utility_provider: String,
+    /// Minimum cosine similarity between goal embedding and candidate memory
+    /// to consider it goal-relevant. Below this, `goal_utility = 0.0`. Default: `0.4`.
+    #[serde(default = "default_goal_utility_threshold")]
+    pub goal_utility_threshold: f32,
+    /// Weight of the `goal_utility` factor in the composite admission score.
+    /// Set to `0.0` to disable (equivalent to `goal_conditioned_write = false`). Default: `0.25`.
+    #[serde(default = "default_goal_utility_weight")]
+    pub goal_utility_weight: f32,
+}
+
+fn default_goal_utility_threshold() -> f32 {
+    0.4
+}
+
+fn default_goal_utility_weight() -> f32 {
+    0.25
 }
 
 impl Default for AdmissionConfig {
@@ -1502,6 +1542,58 @@ impl Default for AdmissionConfig {
             admission_strategy: AdmissionStrategy::default(),
             rl_min_samples: default_rl_min_samples(),
             rl_retrain_interval_secs: default_rl_retrain_interval_secs(),
+            goal_conditioned_write: false,
+            goal_utility_provider: String::new(),
+            goal_utility_threshold: default_goal_utility_threshold(),
+            goal_utility_weight: default_goal_utility_weight(),
+        }
+    }
+}
+
+/// Routing strategy for `[memory.store_routing]`.
+#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Deserialize, Serialize)]
+#[serde(rename_all = "snake_case")]
+pub enum StoreRoutingStrategy {
+    /// Pure heuristic pattern matching. Zero LLM calls. Default.
+    #[default]
+    Heuristic,
+    /// LLM-based classification via `routing_classifier_provider`.
+    Llm,
+    /// Heuristic first; escalates to LLM only when confidence is low.
+    Hybrid,
+}
+
+/// Configuration for cost-sensitive store routing (`[memory.store_routing]`).
+///
+/// Controls how each query is classified and routed to the appropriate memory
+/// backend(s), avoiding unnecessary store queries for simple lookups.
+#[derive(Debug, Clone, Deserialize, Serialize)]
+#[serde(default)]
+pub struct StoreRoutingConfig {
+    /// Enable configurable store routing. When `false`, `HeuristicRouter` is used
+    /// directly (existing behavior). Default: `false`.
+    pub enabled: bool,
+    /// Routing strategy. Default: `heuristic`.
+    pub strategy: StoreRoutingStrategy,
+    /// Provider name from `[[llm.providers]]` for LLM-based classification.
+    /// Falls back to the primary provider when empty. Default: `""`.
+    pub routing_classifier_provider: String,
+    /// Route to use when the classifier is uncertain (confidence < threshold).
+    /// Default: `"hybrid"`.
+    pub fallback_route: String,
+    /// Confidence threshold below which `HybridRouter` escalates to LLM.
+    /// Range: `[0.0, 1.0]`. Default: `0.7`.
+    pub confidence_threshold: f32,
+}
+
+impl Default for StoreRoutingConfig {
+    fn default() -> Self {
+        Self {
+            enabled: false,
+            strategy: StoreRoutingStrategy::Heuristic,
+            routing_classifier_provider: String::new(),
+            fallback_route: "hybrid".into(),
+            confidence_threshold: 0.7,
         }
     }
 }
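The doc comments above imply two rules: the configured `strategy` only takes effect when `enabled = true`, and `confidence_threshold` is expected to lie in `[0.0, 1.0]`. A minimal sketch of both, using local copies of the types without the serde derives. `effective_strategy` and `validate` are hypothetical helpers; the diff documents the range but shows no validator.

```rust
// Illustrative sketch only: local copies of the config types without serde,
// plus two hypothetical helpers that are not part of the commit.

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
enum StoreRoutingStrategy {
    #[default]
    Heuristic,
    Llm,
    Hybrid,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct StoreRoutingConfig {
    enabled: bool,
    strategy: StoreRoutingStrategy,
    routing_classifier_provider: String,
    fallback_route: String,
    confidence_threshold: f32,
}

impl Default for StoreRoutingConfig {
    fn default() -> Self {
        // Same defaults as the diff above.
        Self {
            enabled: false,
            strategy: StoreRoutingStrategy::Heuristic,
            routing_classifier_provider: String::new(),
            fallback_route: "hybrid".into(),
            confidence_threshold: 0.7,
        }
    }
}

/// When routing is disabled the heuristic router is used regardless of `strategy`.
fn effective_strategy(cfg: &StoreRoutingConfig) -> StoreRoutingStrategy {
    if cfg.enabled {
        cfg.strategy
    } else {
        StoreRoutingStrategy::Heuristic
    }
}

/// Hypothetical range check for the documented `[0.0, 1.0]` interval (rejects NaN too).
fn validate(cfg: &StoreRoutingConfig) -> Result<(), String> {
    if (0.0..=1.0).contains(&cfg.confidence_threshold) {
        Ok(())
    } else {
        Err(format!(
            "confidence_threshold must be in [0.0, 1.0], got {}",
            cfg.confidence_threshold
        ))
    }
}
```

Setting `strategy = "hybrid"` while leaving `enabled = false` therefore changes nothing in this sketch, which matches the "existing behavior" wording in the doc comment.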
@@ -1648,6 +1740,7 @@ mod tests {
             semantic_novelty: 3.0,
             temporal_recency: 1.0,
             content_type_prior: 3.0,
+            goal_utility: 0.0,
         };
         let n = w.normalized();
         let sum = n.future_utility
@@ -1686,6 +1779,7 @@ mod tests {
             semantic_novelty: 0.0,
             temporal_recency: 0.0,
             content_type_prior: 0.0,
+            goal_utility: 0.0,
         };
         let n = w.normalized();
         let default = AdmissionWeights::default();

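The goal-gate fields in `AdmissionConfig` describe an embedding-first score: cosine similarity against the goal, `0.0` below `goal_utility_threshold`, a soft floor of `0.1` at or above it, goals shorter than 10 characters treated as absent, and LLM refinement only within `0.1` of the threshold. One way those rules could fit together is sketched below. The linear rescaling above the threshold, the constant names, and the function names are assumptions; the commit states only the numeric defaults.

```rust
// Illustrative sketch only. The rescaling formula and all names are assumptions;
// only the numeric values (10 chars, 0.1 floor, 0.1 margin) come from the commit.

const MIN_GOAL_CHARS: usize = 10; // shorter goals are treated as absent
const SOFT_FLOOR: f32 = 0.1; // lowest score for a memory at or above the threshold
const BORDERLINE_MARGIN: f32 = 0.1; // "within 0.1 of threshold"

/// Cosine similarity of two embeddings; returns 0.0 for mismatched or zero vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns `None` when there is no usable goal (the gate does not apply),
/// `Some(0.0)` below the threshold, and a floored, rescaled score otherwise.
fn goal_utility(goal_text: Option<&str>, similarity: f32, threshold: f32) -> Option<f32> {
    let goal = goal_text?;
    if goal.chars().count() < MIN_GOAL_CHARS {
        return None;
    }
    if similarity < threshold {
        return Some(0.0);
    }
    // Assumed rescaling: map [threshold, 1.0] onto [0.0, 1.0], then apply the floor.
    let span = (1.0 - threshold).max(f32::EPSILON);
    let rescaled = ((similarity - threshold) / span).clamp(0.0, 1.0);
    Some(rescaled.max(SOFT_FLOOR))
}

/// True when the similarity is close enough to the threshold to justify an LLM call.
fn is_borderline(similarity: f32, threshold: f32) -> bool {
    (similarity - threshold).abs() <= BORDERLINE_MARGIN
}
```

Under this reading, the soft floor matters only because of the rescaling: a memory sitting exactly at the threshold would otherwise score `0.0`, the same as an off-goal memory.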
crates/zeph-config/src/root.rs

Lines changed: 1 addition & 0 deletions
@@ -206,6 +206,7 @@ impl Default for Config {
                 crossover_turn_threshold: 20,
                 consolidation: crate::memory::ConsolidationConfig::default(),
                 database_url: None,
+                store_routing: crate::memory::StoreRoutingConfig::default(),
             },
             telegram: None,
             discord: None,

crates/zeph-core/src/bootstrap/mod.rs

Lines changed: 39 additions & 1 deletion
@@ -363,14 +363,52 @@ impl AppBuilder {
                 semantic_novelty: w.semantic_novelty,
                 temporal_recency: w.temporal_recency,
                 content_type_prior: w.content_type_prior,
+                goal_utility: w.goal_utility,
             };
-            let control = zeph_memory::AdmissionControl::new(
+            let mut control = zeph_memory::AdmissionControl::new(
                 self.config.memory.admission.threshold,
                 self.config.memory.admission.fast_path_margin,
                 weights,
             )
             .with_provider(admission_provider);
 
+            if self.config.memory.admission.goal_conditioned_write {
+                let goal_provider = if self
+                    .config
+                    .memory
+                    .admission
+                    .goal_utility_provider
+                    .is_empty()
+                {
+                    None
+                } else {
+                    match create_named_provider(
+                        &self.config.memory.admission.goal_utility_provider,
+                        &self.config,
+                    ) {
+                        Ok(p) => Some(p),
+                        Err(e) => {
+                            tracing::warn!(
+                                provider = %self.config.memory.admission.goal_utility_provider,
+                                error = %e,
+                                "goal_utility_provider not found, LLM refinement disabled"
+                            );
+                            None
+                        }
+                    }
+                };
+                control = control.with_goal_gate(zeph_memory::GoalGateConfig {
+                    threshold: self.config.memory.admission.goal_utility_threshold,
+                    provider: goal_provider,
+                    weight: self.config.memory.admission.goal_utility_weight,
+                });
+                tracing::info!(
+                    threshold = self.config.memory.admission.goal_utility_threshold,
+                    weight = self.config.memory.admission.goal_utility_weight,
+                    "A-MAC: goal-conditioned write gate enabled"
+                );
+            }
+
             if self.config.memory.admission.admission_strategy == zeph_config::AdmissionStrategy::Rl {
                 tracing::warn!(
                     "admission_strategy = \"rl\" is configured but the RL model is not yet wired \

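The bootstrap hunk above follows a common builder pattern: resolve an optional provider, degrade to `None` with a warning when the name cannot be resolved, and attach the gate only when the feature flag is set. A self-contained sketch of that shape follows. `Provider`, the `HashMap` registry, `resolve_goal_provider`, and `build_control` are hypothetical stand-ins for `create_named_provider` and the real zeph types; how a `None` provider is interpreted downstream is not visible in this diff.

```rust
// Illustrative sketch only: stand-in types, not the zeph-core or zeph-memory API.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Provider {
    name: String,
}

#[allow(dead_code)]
#[derive(Debug)]
struct GoalGateConfig {
    threshold: f32,
    provider: Option<Provider>,
    weight: f32,
}

#[derive(Debug, Default)]
struct AdmissionControl {
    goal_gate: Option<GoalGateConfig>,
}

impl AdmissionControl {
    /// Consuming builder method, mirroring the `with_goal_gate` call in the diff.
    fn with_goal_gate(mut self, cfg: GoalGateConfig) -> Self {
        self.goal_gate = Some(cfg);
        self
    }
}

/// Empty name means "no dedicated provider"; an unknown name logs and degrades to `None`.
fn resolve_goal_provider(name: &str, registry: &HashMap<String, Provider>) -> Option<Provider> {
    if name.is_empty() {
        return None;
    }
    match registry.get(name) {
        Some(p) => Some(p.clone()),
        None => {
            // The real code uses `tracing::warn!`; stderr keeps this sketch std-only.
            eprintln!("goal_utility_provider '{name}' not found, LLM refinement disabled");
            None
        }
    }
}

/// Attaches the goal gate only when the feature flag is set.
fn build_control(
    goal_conditioned_write: bool,
    provider_name: &str,
    threshold: f32,
    weight: f32,
    registry: &HashMap<String, Provider>,
) -> AdmissionControl {
    let mut control = AdmissionControl::default();
    if goal_conditioned_write {
        control = control.with_goal_gate(GoalGateConfig {
            threshold,
            provider: resolve_goal_provider(provider_name, registry),
            weight,
        });
    }
    control
}
```

A misconfigured provider name does not abort startup in this shape: the gate is still attached, only without a dedicated provider.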