
(feat): mmr and temporal decay #18097

Merged
steipete merged 5 commits into openclaw:main from
rodrigouroz:feat/mmr-temporal-refresh-2026-02-16
Feb 16, 2026

Conversation

@rodrigouroz
Contributor

@rodrigouroz rodrigouroz commented Feb 16, 2026

Summary

  • Problem: PR #13391 (feat(memory): Add MMR re-ranking and temporal decay for hybrid search) drifted from current main, and part of its config UI metadata touched src/config/schema.ts in a way that no longer matches the current config schema organization.
  • Why it matters: a stale PR is harder to review/merge and can introduce avoidable conflict/churn; memory ranking improvements were still relevant and should be mergeable now.
  • What changed: ported MMR re-ranking + temporal decay onto current main, aligned config metadata to src/config/schema.labels.ts + src/config/schema.help.ts, preserved opt-in defaults, and revalidated tests/checks.
  • What did NOT change (scope boundary): no default behavior change for users; no new memory backends, no channel/integration behavior changes, no auth/sandbox capability changes.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • New optional hybrid memory controls:
    • agents.defaults.memorySearch.query.hybrid.mmr.enabled (default false)
    • agents.defaults.memorySearch.query.hybrid.mmr.lambda (default 0.7)
    • agents.defaults.memorySearch.query.hybrid.temporalDecay.enabled (default false)
    • agents.defaults.memorySearch.query.hybrid.temporalDecay.halfLifeDays (default 30)
  • When enabled, hybrid search pipeline becomes:
    • Vector + Keyword -> Weighted Merge -> Temporal Decay -> Sort -> MMR -> Top-K
  • Defaults remain unchanged (features are opt-in).
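
For reviewers who want the new keys and stage order in one place, here is a minimal sketch. Only the key names, default values, and stage order come from the description above; the type name, `resolveHybridConfig`, `activeStages`, and the assumption that disabled stages are simply skipped are hypothetical and not taken from the PR's code.

```typescript
// Hypothetical shape: only the key names and defaults come from the PR description.
interface HybridPostProcessingConfig {
  mmr: { enabled: boolean; lambda: number };
  temporalDecay: { enabled: boolean; halfLifeDays: number };
}

// Both features are opt-in, so both `enabled` flags default to false.
const DEFAULT_HYBRID_POST_PROCESSING: HybridPostProcessingConfig = {
  mmr: { enabled: false, lambda: 0.7 },
  temporalDecay: { enabled: false, halfLifeDays: 30 },
};

type PartialHybridConfig = {
  mmr?: Partial<HybridPostProcessingConfig["mmr"]>;
  temporalDecay?: Partial<HybridPostProcessingConfig["temporalDecay"]>;
};

// Overlay user-provided keys on top of the defaults (hypothetical helper).
function resolveHybridConfig(user: PartialHybridConfig = {}): HybridPostProcessingConfig {
  return {
    mmr: { ...DEFAULT_HYBRID_POST_PROCESSING.mmr, ...user.mmr },
    temporalDecay: {
      ...DEFAULT_HYBRID_POST_PROCESSING.temporalDecay,
      ...user.temporalDecay,
    },
  };
}

// List the pipeline stages in the order given in the description.
// Assumption: a disabled stage is skipped and the rest keep their order.
function activeStages(cfg: HybridPostProcessingConfig): string[] {
  const stages = ["vector+keyword", "weighted-merge"];
  if (cfg.temporalDecay.enabled) stages.push("temporal-decay");
  stages.push("sort");
  if (cfg.mmr.enabled) stages.push("mmr");
  stages.push("top-k");
  return stages;
}
```

Calling `activeStages(resolveHybridConfig())` yields the stage list with neither optional stage, which matches the claim that defaults remain unchanged.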

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:
    • None.

Repro + Verification

Environment

  • OS: macOS (local dev machine)
  • Runtime/container: Node/pnpm repo workflow
  • Model/provider: Opus-4-6, Codex-5-3
  • Integration/channel (if any): N/A
  • Relevant config (redacted): default test config + memory hybrid settings in test cases

Steps

  1. Check out branch from current main.
  2. Run:
    • pnpm vitest run src/memory/hybrid.test.ts src/memory/mmr.test.ts src/memory/temporal-decay.test.ts src/config/schema.test.ts
  3. Run:
    • pnpm check

Expected

  • Memory/config tests pass.
  • Formatting/type/lint checks pass.
  • Diff is limited to memory feature, config schema wiring, docs/changelog.

Actual

  • All listed tests/checks passed locally.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    • MMR ranking behavior and tie-break coverage via src/memory/mmr.test.ts
    • Temporal decay behavior (dated files, evergreen exclusions, mtime fallback) via src/memory/temporal-decay.test.ts
    • Hybrid merge async pipeline integration via src/memory/hybrid.test.ts
    • Config schema export/hints still valid via src/config/schema.test.ts
  • Edge cases checked:
    • MMR disabled/default behavior
    • Lambda bounds handling
    • Temporal decay disabled/default behavior
    • No timestamp available (fallback/no decay)

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (Yes)
  • Migration needed? (No)
  • If yes, exact upgrade steps:
    • N/A. Optional new config keys only.

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly:
    • Set hybrid.mmr.enabled: false and hybrid.temporalDecay.enabled: false (defaults already false).
  • Files/config to restore:
    • Revert PR commits touching memory hybrid post-processing and config schema metadata.
  • Known bad symptoms reviewers should watch for:
    • Unexpected hybrid ranking order shifts when features are enabled.
    • Incorrect decay on evergreen memory files.

Risks and Mitigations

  • Risk: ranking regressions when users enable MMR/temporal decay.
    • Mitigation: features are opt-in and independently toggleable; defaults preserve prior behavior.
  • Risk: stale config UI metadata if wired to wrong schema file.
    • Mitigation: metadata placed in current canonical files (schema.labels.ts, schema.help.ts) and validated with schema tests.
  • Risk: async merge path could introduce subtle ordering differences.
    • Mitigation: targeted hybrid/MMR/temporal tests and full pnpm check passed.

Greptile Summary

Adds MMR (Maximal Marginal Relevance) re-ranking and optional temporal decay to the hybrid memory search pipeline. Both features are opt-in (disabled by default) and can be enabled independently via memorySearch.query.hybrid.mmr and memorySearch.query.hybrid.temporalDecay.

Key Changes:

  • Implements MMR algorithm using Jaccard similarity on tokenized content to reduce near-duplicate results
  • Adds exponential temporal decay with configurable half-life to boost recent memories over stale ones
  • Evergreen files (MEMORY.md, non-dated memory/*.md) are exempt from temporal decay
  • Integrates both features into the existing hybrid search pipeline: Vector + Keyword → Merge → Temporal Decay → Sort → MMR → Top-K
  • Config schema properly aligned with current schema.labels.ts and schema.help.ts structure
  • Comprehensive test coverage (371 tests for MMR, 173 for temporal decay)
  • Excellent documentation with real-world examples in docs/concepts/memory.md
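
As an illustration of the first bullet, the following is a rough sketch of greedy MMR selection with Jaccard similarity over token sets. It is not the code in mmr.ts: the `Scored` type, the tokenizer regex, the function names, and the tie-break (first candidate wins) are assumptions made for this example, and scores are assumed to be roughly in the 0 to 1 range.

```typescript
// Hypothetical result shape for illustration only.
interface Scored {
  id: string;
  score: number; // relevance, assumed to be roughly in [0, 1]
  content: string;
}

// Lowercased alphanumeric tokens as a set (a deliberately simple tokenizer).
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|; two empty sets are treated as 0 here.
function jaccard(a: Set<string>, b: Set<string>): number {
  let intersection = 0;
  for (const token of a) if (b.has(token)) intersection++;
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Greedy MMR: repeatedly pick the candidate maximizing
// lambda * relevance - (1 - lambda) * (max similarity to already selected items).
function mmrRerank<T extends Scored>(items: T[], lambda = 0.7): T[] {
  const l = Math.min(1, Math.max(0, lambda)); // clamp lambda to [0, 1]
  const tokens = new Map<string, Set<string>>();
  for (const item of items) tokens.set(item.id, tokenize(item.content)); // tokenize once

  const remaining = [...items];
  const selected: T[] = [];
  while (remaining.length > 0) {
    let bestIdx = 0;
    let bestScore = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const candidate = remaining[i];
      let maxSim = 0;
      for (const chosen of selected) {
        const sim = jaccard(tokens.get(candidate.id)!, tokens.get(chosen.id)!);
        if (sim > maxSim) maxSim = sim;
      }
      const mmrScore = l * candidate.score - (1 - l) * maxSim;
      // Strict ">" means the earliest candidate wins ties.
      if (mmrScore > bestScore) {
        bestScore = mmrScore;
        bestIdx = i;
      }
    }
    selected.push(remaining.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```

With `lambda = 1` the function reduces to a plain sort by relevance; lower values push near-duplicate chunks further down the list.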

Implementation Quality:

  • Clean separation of concerns with dedicated modules (mmr.ts, temporal-decay.ts)
  • Proper async handling in hybrid merge pipeline
  • Edge case handling (tie-breaking, negative scores, future dates, missing timestamps)
  • Type-safe with generic constraints
  • Lambda clamping and half-life validation
  • Efficient token caching in MMR algorithm

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation is production-ready with excellent test coverage, proper defaults (features opt-in), clean architecture, comprehensive documentation, and no breaking changes to existing behavior. The code follows best practices with proper error handling, type safety, and edge case coverage.
  • No files require special attention

Last reviewed commit: b5d57df

Adds Maximal Marginal Relevance (MMR) re-ranking to hybrid search results.

- New mmr.ts with tokenization, Jaccard similarity, and MMR algorithm
- Integrated into mergeHybridResults() with optional mmr config
- 40 comprehensive tests covering edge cases and diversity behavior
- Configurable lambda parameter (default 0.7) to balance relevance vs diversity
- Updated CHANGELOG.md and memory docs

This helps avoid redundant results when multiple chunks contain similar content.

Exponential decay (half-life configurable, default 30 days) applied
before MMR re-ranking. Dated daily files (memory/YYYY-MM-DD.md) use
filename date; evergreen files (MEMORY.md, topic files) are not
decayed; other sources fall back to file mtime.

Config: memorySearch.query.hybrid.temporalDecay.{enabled, halfLifeDays}
Default: disabled (backwards compatible, opt-in).
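
The decay rule in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of temporal-decay.ts: the function names, the path regexes, the UTC date handling, and the choice to treat future dates as age zero are all hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Exponential half-life decay: the multiplier halves every `halfLifeDays`.
// Assumption: an invalid half-life or age disables decay (multiplier 1).
function decayMultiplier(ageDays: number, halfLifeDays: number): number {
  if (!(halfLifeDays > 0) || !Number.isFinite(ageDays)) return 1;
  const age = Math.max(0, ageDays); // assumption: future dates count as age 0
  return Math.pow(0.5, age / halfLifeDays);
}

// Pick the timestamp to decay from, or null when no decay should apply.
function resolveTimestamp(path: string, mtimeMs?: number): number | null {
  // Dated daily files (memory/YYYY-MM-DD.md) use the date in the filename.
  const dated = /(?:^|\/)memory\/(\d{4})-(\d{2})-(\d{2})\.md$/.exec(path);
  if (dated) {
    return Date.UTC(Number(dated[1]), Number(dated[2]) - 1, Number(dated[3]));
  }
  // Evergreen files (MEMORY.md and non-dated memory/*.md) are never decayed.
  if (/(?:^|\/)MEMORY\.md$/.test(path) || /(?:^|\/)memory\/[^/]+\.md$/.test(path)) {
    return null;
  }
  // Other sources fall back to file mtime; with no timestamp, skip decay.
  return mtimeMs ?? null;
}

// Apply decay to a merged score before sorting and MMR re-ranking.
function applyTemporalDecay(
  score: number,
  path: string,
  nowMs: number,
  halfLifeDays = 30,
  mtimeMs?: number,
): number {
  const timestamp = resolveTimestamp(path, mtimeMs);
  if (timestamp === null) return score;
  return score * decayMultiplier((nowMs - timestamp) / DAY_MS, halfLifeDays);
}
```

Under these assumptions, a daily file dated exactly one half-life before `nowMs` keeps half of its score, while MEMORY.md keeps its full score regardless of age.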

- DEFAULT_MMR_CONFIG.enabled = false (opt-in, was incorrectly true)
- Tie-break: handle bestItem === null so first candidate always wins
- CHANGELOG URL: docs.clawd.bot → docs.openclaw.ai
- Tests updated to pass enabled: true explicitly where needed
@openclaw-barnacle (bot) added the docs, agents, size: XL, and trusted-contributor labels on Feb 16, 2026
@steipete steipete merged commit 7d8d8c3 into openclaw:main Feb 16, 2026
28 checks passed
@steipete
Contributor

Thanks again for the work here, and sorry for the churn.

We accidentally merged a batch of PRs and need to temporarily roll this back for now.

Revert details:

  • merged commit SHA: 7d8d8c338bfc8b2063c4f5da666ab9386833dcfa
  • revert commit SHA: 563df5638bc8a6fd5cc5fb7c036844ce1f71d573

We can't reopen right now, but we will revisit this later and bring this back properly.

@steipete
Contributor

Please re-open this in a new PR; it needs far more time to review.

@rodrigouroz
Contributor Author

But I'm curious: did something break because of this? I'd like more details to help make it better. Or do you want it again as is?

Labels

  • agents (Agent runtime and tooling)
  • docs (Improvements or additions to documentation)
  • size: XL

2 participants