feat(memory): Add MMR re-ranking and temporal decay for hybrid search by rodrigouroz · Pull Request #13391 · openclaw/openclaw

rodrigouroz · 2026-02-10T13:07:52Z

Summary

Two complementary, opt-in post-processing features for hybrid memory search that improve result quality without changing existing behavior.

MMR Re-ranking (Maximal Marginal Relevance)

Reduces redundancy by penalizing results too similar to already-selected ones.

Config: memorySearch.query.hybrid.mmr.enabled (default: false)
Lambda: memorySearch.query.hybrid.mmr.lambda (default: 0.7, range 0–1)
Uses Jaccard similarity over tokenized result text

Example — query: "home network setup"

Without MMR, top 3 might be three near-identical snippets about the same router config from different daily notes. With MMR, duplicates drop out and diverse results (DNS setup, reference doc) surface instead.
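The selection behavior described above can be sketched as a greedy loop. This is a minimal illustration, not the PR's `mmr.ts`: the `Candidate` shape, the `tokenize` regex, and the `mmrSelect` name are assumptions, scores are assumed to be normalized to [0, 1] already, and ids are assumed unique.

```typescript
// Hypothetical result shape; the PR's actual types are not shown in this thread.
interface Candidate {
  id: string;
  score: number; // relevance, assumed already normalized to [0, 1]
  text: string;
}

// Lowercased alphanumeric tokens (an assumed tokenizer, for illustration only).
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Greedy MMR: repeatedly pick the candidate maximizing
// lambda * relevance - (1 - lambda) * (max similarity to anything already picked).
function mmrSelect(
  candidates: Candidate[],
  lambda = 0.7,
  limit = candidates.length,
): Candidate[] {
  const tokens = new Map<string, Set<string>>();
  for (const c of candidates) tokens.set(c.id, tokenize(c.text));

  const remaining = [...candidates];
  const selected: Candidate[] = [];
  while (remaining.length > 0 && selected.length < limit) {
    let bestIndex = 0;
    let bestValue = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const candidate = remaining[i];
      let maxSimilarity = 0;
      for (const chosen of selected) {
        const sim = jaccard(tokens.get(candidate.id)!, tokens.get(chosen.id)!);
        maxSimilarity = Math.max(maxSimilarity, sim);
      }
      const value = lambda * candidate.score - (1 - lambda) * maxSimilarity;
      // Strict comparison: on exact ties this sketch keeps input order.
      if (value > bestValue) {
        bestValue = value;
        bestIndex = i;
      }
    }
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

With `lambda = 1` the loop reduces to plain relevance ordering; lower values trade relevance for diversity, which is how a near-duplicate router snippet can lose its slot to a lower-scored DNS note.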

Temporal Decay (Recency Boost)

Exponential score decay based on memory age, so recent notes rank higher.

Config: memorySearch.query.hybrid.temporalDecay.enabled (default: false)
Half-life: memorySearch.query.hybrid.temporalDecay.halfLifeDays (default: 30)
Dated files (memory/YYYY-MM-DD.md) use filename date
Evergreen files (MEMORY.md, topic files) are not decayed
Other sources fall back to file mtime
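The three timestamp rules above could be expressed roughly as follows. This is a sketch under stated assumptions: `resolveTimestamp` and both regular expressions are hypothetical, "topic files" is taken to mean any non-dated `.md` file directly under `memory/`, and filename dates are interpreted as UTC midnight. The PR's `temporal-decay.ts` may decide each of these differently.

```typescript
// Returns a timestamp in milliseconds, or null when the entry should not decay.
// `mtimeMs` is passed in by the caller (e.g. from a prior fs.stat) so the
// function stays pure and easy to test.
function resolveTimestamp(relPath: string, mtimeMs: number): number | null {
  const normalized = relPath.replace(/\\/g, "/");

  // Rule 1: dated daily notes (memory/YYYY-MM-DD.md) use the filename date.
  const dated = /(?:^|\/)memory\/(\d{4})-(\d{2})-(\d{2})\.md$/.exec(normalized);
  if (dated) {
    return Date.UTC(Number(dated[1]), Number(dated[2]) - 1, Number(dated[3]));
  }

  // Rule 2: evergreen files are exempt from decay.
  // Assumption: MEMORY.md plus any other .md file directly under memory/.
  const isEvergreen =
    /(?:^|\/)MEMORY\.md$/.test(normalized) ||
    /(?:^|\/)memory\/[^/]+\.md$/.test(normalized);
  if (isEvergreen) return null;

  // Rule 3: everything else falls back to the file's modification time.
  return mtimeMs;
}
```

Returning `null` for evergreen files lets the caller skip the decay multiplier entirely instead of encoding "no decay" as a special timestamp.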

With half-life=30: today=100%, 7d≈85%, 30d=50%, 90d=12.5%
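These percentages follow from the standard half-life formula, multiplier = 0.5^(ageDays / halfLifeDays). A minimal sketch, where `decayMultiplier` and `applyDecay` are illustrative names rather than the PR's exports:

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay: the multiplier halves every `halfLifeDays`.
// Invalid inputs return 1 (no decay); future timestamps are treated as age 0.
function decayMultiplier(ageDays: number, halfLifeDays: number): number {
  if (!(halfLifeDays > 0) || !Number.isFinite(ageDays)) return 1;
  return Math.pow(0.5, Math.max(0, ageDays) / halfLifeDays);
}

// Applies decay to a score; a null timestamp means "evergreen, do not decay".
function applyDecay(
  score: number,
  timestampMs: number | null,
  nowMs: number,
  halfLifeDays: number,
): number {
  if (timestampMs === null) return score;
  const ageDays = (nowMs - timestampMs) / MS_PER_DAY;
  return score * decayMultiplier(ageDays, halfLifeDays);
}
```

For a 30-day half-life, a 7-day-old note keeps 0.5^(7/30) ≈ 0.85 of its score and a 90-day-old note keeps 0.5^3 = 0.125.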

Pipeline

Vector + Keyword → Weighted Merge → Temporal Decay → Sort → MMR → Top-K

Design decisions

Both opt-in (default disabled) — fully backwards compatible
Both independent — enable one, the other, or both
Applied at different pipeline stages (no conflict)
Config wired through zod schema, types, field labels/help
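To make the independence of the two flags concrete, here is a rough sketch of the post-merge stages. The config field names mirror the documented keys under `memorySearch.query.hybrid`; everything else (`Scored`, `postProcess`, passing the MMR step in as a `diversify` callback, precomputed `ageDays`) is an assumption for illustration and not the signature of `mergeHybridResults`.

```typescript
interface HybridPostConfig {
  mmr: { enabled: boolean; lambda: number };
  temporalDecay: { enabled: boolean; halfLifeDays: number };
}

// Documented defaults: both features off, lambda 0.7, half-life 30 days.
const DEFAULT_HYBRID_POST_CONFIG: HybridPostConfig = {
  mmr: { enabled: false, lambda: 0.7 },
  temporalDecay: { enabled: false, halfLifeDays: 30 },
};

// Hypothetical merged result; ageDays === null marks an evergreen entry.
interface Scored {
  id: string;
  score: number;
  ageDays: number | null;
}

// Weighted Merge (done by the caller) -> Temporal Decay -> Sort -> MMR -> Top-K
function postProcess(
  merged: Scored[],
  config: HybridPostConfig,
  topK: number,
  diversify: (sorted: Scored[], lambda: number) => Scored[],
): Scored[] {
  const halfLife = config.temporalDecay.halfLifeDays;
  const decayed = config.temporalDecay.enabled
    ? merged.map((r) =>
        r.ageDays === null
          ? r
          : { ...r, score: r.score * Math.pow(0.5, Math.max(0, r.ageDays) / halfLife) },
      )
    : merged;

  const sorted = [...decayed].sort((a, b) => b.score - a.score);
  const reranked = config.mmr.enabled ? diversify(sorted, config.mmr.lambda) : sorted;
  return reranked.slice(0, topK);
}
```

With the defaults, the function only sorts and truncates, which matches the backwards-compatibility claim. Decay changes scores before sorting and MMR reorders after sorting, so neither stage needs to know whether the other ran.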

Tests

127 tests passing across 17 test files. New test coverage:

mmr.test.ts — 40 tests (algorithm, diversity, edge cases)
temporal-decay.test.ts — 5 tests (decay math, half-life, evergreen files, mtime fallback, integration)

Greptile Overview

Greptile Summary

Adds two opt-in post-processing stages to the hybrid memory search pipeline: MMR re-ranking for result diversity (reduces near-duplicate snippets) and temporal decay for recency-aware scoring (exponential decay based on memory age). Both are disabled by default and fully backwards compatible.

New files: src/memory/mmr.ts (MMR algorithm with Jaccard similarity), src/memory/temporal-decay.ts (exponential decay with evergreen file exemption)
mergeHybridResults in hybrid.ts changed from sync to async to accommodate filesystem calls in temporal decay
Config wired through zod schema, types, and memory-search.ts config resolution with proper validation/clamping
Pipeline order: Weighted Merge → Temporal Decay → Sort → MMR → Top-K
Tests: comprehensive coverage for both features including edge cases
Previous review threads identified the MMR tiebreaker bug (now fixed via ?? -Infinity fallback) and the timestampCache race condition (now fixed by caching Promise objects instead of resolved values)

Confidence Score: 4/5

This PR is safe to merge — both features are opt-in (disabled by default) with no changes to existing behavior.
Well-structured, opt-in features with comprehensive test coverage. The code is clean with proper input validation, clamping, and edge case handling. The sync-to-async change in mergeHybridResults is the main surface area for regressions, and existing tests were correctly updated. No critical issues found beyond the previously flagged items which appear addressed.
No files require special attention. The core algorithm files (mmr.ts, temporal-decay.ts) are well-tested and the integration points in hybrid.ts and manager.ts are minimal.

greptile-apps

3 files reviewed, 3 comments

greptile-apps · 2026-02-10T13:11:36Z

src/memory/mmr.ts

+
+      // Use original score as tiebreaker (higher is better)
+      if (
+        mmrScore > bestMMRScore ||
+        (mmrScore === bestMMRScore && bestItem && candidate.score > bestItem.score)
+      ) {


MMR tiebreaker never triggers

In mmrRerank, the tie-break condition can’t select a higher-scoring candidate when bestItem is still null, because it’s gated by bestItem && .... If multiple candidates produce the same mmrScore (common early on when selected is empty, or when many items normalize to the same relevance), the first iterated candidate “wins” due to Set iteration order rather than score.

This should compare against bestItem?.score and allow setting bestItem when bestItem is null (e.g., include !bestItem in the tie path) so deterministic, score-based ordering is preserved on ties.
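One way to get the score-based tie-break the comment asks for is to compare against `best?.score ?? -Infinity`, so the comparison still works while `best` is `null`. The sketch below is an assumption-labeled illustration: `pickBest`, `Scored`, and the `mmrScoreOf` callback are hypothetical names, and the loop is simplified relative to the `mmrRerank` hunk quoted above.

```typescript
interface Scored {
  id: string;
  score: number; // original relevance score, used only to break ties
}

// Returns the candidate with the highest MMR score; on equal MMR scores the
// one with the higher original score wins, regardless of iteration order.
function pickBest(
  candidates: Scored[],
  mmrScoreOf: (candidate: Scored) => number,
): Scored | null {
  let best: Scored | null = null;
  let bestMMRScore = -Infinity;

  for (const candidate of candidates) {
    const mmrScore = mmrScoreOf(candidate);
    // `best?.score ?? -Infinity` makes the tie path usable while best is null,
    // so a candidate whose MMR score equals the initial -Infinity can still win.
    const bestOriginal = best?.score ?? -Infinity;
    if (
      mmrScore > bestMMRScore ||
      (mmrScore === bestMMRScore && candidate.score > bestOriginal)
    ) {
      best = candidate;
      bestMMRScore = mmrScore;
    }
  }
  return best;
}
```

Because the tie path no longer depends on `best` being set, the result is the same for any ordering of the input.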

Path: src/memory/mmr.ts
Line: 148:153

greptile-apps · 2026-02-10T13:11:41Z

CHANGELOG.md

+- Memory: add MMR (Maximal Marginal Relevance) re-ranking for hybrid search diversity. Configurable via `memorySearch.query.hybrid.mmr`. Thanks @rodrigouroz. https://docs.clawd.bot/concepts/memory
+- Memory: add opt-in temporal decay for hybrid search scoring, with configurable half-life via `memorySearch.query.hybrid.temporalDecay`. Thanks @rodrigouroz.
+
 ## 2026.2.9

 ### Added


Broken / inconsistent docs link

The changelog entry links to https://docs.clawd.bot/concepts/memory, but the repo header above uses https://docs.openclaw.ai and the PR description references /concepts/memory under the OpenClaw docs. This link is very likely incorrect and will send users to the wrong place.

Please update the URL to the correct OpenClaw docs domain/path used elsewhere in the repo.

Path: CHANGELOG.md
Line: 9:14

greptile-apps · 2026-02-10T13:11:43Z

Additional Comments (1)

src/memory/mmr.test.ts
Default config contradicts feature flag

DEFAULT_MMR_CONFIG.enabled is asserted as true here, but the rest of the PR (docs/config defaults and mergeHybridResults usage) describes MMR as opt-in / default disabled. This test will fail if the runtime default is corrected to match config defaults, and it currently bakes in the opposite behavior.

Align DEFAULT_MMR_CONFIG.enabled (and this test) with the intended default (false) so “opt-in” behavior is consistent across runtime, schema docs, and tests.

Path: src/memory/mmr.test.ts
Line: 916:920

rodrigouroz · 2026-02-10T13:23:03Z

The "Check code file sizes" check is failing, but I only added 6 and 13 lines to the affected files. I don't want to tackle a refactor of those files here.

rodrigouroz · 2026-02-11T20:49:16Z

@greptile-apps can you review again?

greptile-apps

12 files reviewed, 2 comments

greptile-apps · 2026-02-11T20:53:13Z

src/memory/temporal-decay.ts

+  return Promise.all(
+    params.results.map(async (entry) => {
+      const cacheKey = `${entry.source}:${entry.path}`;
+      if (!timestampCache.has(cacheKey)) {
+        const timestamp = await extractTimestamp({
+          filePath: entry.path,
+          source: entry.source,
+          workspaceDir: params.workspaceDir,
+        });
+        timestampCache.set(cacheKey, timestamp);
+      }


timestampCache doesn't work as intended inside Promise.all — all mapper callbacks start concurrently, so entries with the same source:path will all pass the !timestampCache.has(cacheKey) check before any await extractTimestamp() resolves, resulting in duplicate fs.stat calls.

Not a correctness bug (all calls produce the same result), but the cache is effectively a no-op. If deduplication matters, consider collecting unique cache keys first and resolving them sequentially or via a dedup map of promises.
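The "dedup map of promises" option works because the promise is stored synchronously, before any `await` yields control, so concurrent callbacks for the same key share one in-flight lookup. A minimal sketch, in which `getTimestampOnce`, `resolveAll`, and the injected `loadFor` callback are illustrative names and the loader stands in for the `extractTimestamp` call in the hunk above:

```typescript
type TimestampPromise = Promise<number | null>;

// Returns the cached promise for `key`, starting the load only on first use.
// The check and the set happen in the same synchronous step, so there is no
// window in which a second caller can miss the cache.
function getTimestampOnce(
  cache: Map<string, TimestampPromise>,
  key: string,
  load: () => TimestampPromise,
): TimestampPromise {
  let pending = cache.get(key);
  if (pending === undefined) {
    pending = load();
    cache.set(key, pending);
  }
  return pending;
}

interface Entry {
  source: string;
  path: string;
}

// Resolves timestamps for all entries concurrently, one load per unique key.
function resolveAll(
  entries: Entry[],
  loadFor: (entry: Entry) => TimestampPromise,
): Promise<(number | null)[]> {
  const cache = new Map<string, TimestampPromise>();
  return Promise.all(
    entries.map((entry) =>
      getTimestampOnce(cache, `${entry.source}:${entry.path}`, () => loadFor(entry)),
    ),
  );
}
```

One trade-off of caching promises is that a rejected lookup stays cached for the lifetime of the map. With a per-call cache as in `resolveAll` that is usually acceptable, since `Promise.all` rejects on the first failure anyway.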

Path: src/memory/temporal-decay.ts
Line: 137:147

greptile-apps · 2026-02-11T20:53:14Z

CHANGELOG.md

+- Agents: honor tools.exec.safeBins in exec allowlist checks. (#2281)
+- Docs: tighten Fly private deployment steps. (#2289) Thanks @dguido.


These two entries appear to be unrelated to this PR and were added to an already-released version section (2026.1.29). Likely a merge/rebase artifact — consider reverting these lines from this PR's diff.

Path: CHANGELOG.md
Line: 413:414

rodrigouroz · 2026-02-11T20:58:07Z

@greptile-apps please check now and if all good update the PR description with your score

Adds Maximal Marginal Relevance (MMR) re-ranking to hybrid search results.

- New mmr.ts with tokenization, Jaccard similarity, and MMR algorithm
- Integrated into mergeHybridResults() with optional mmr config
- 40 comprehensive tests covering edge cases and diversity behavior
- Configurable lambda parameter (default 0.7) to balance relevance vs diversity
- Updated CHANGELOG.md and memory docs

This helps avoid redundant results when multiple chunks contain similar content.

Exponential decay (half-life configurable, default 30 days) applied before MMR re-ranking.

- Dated daily files (memory/YYYY-MM-DD.md) use filename date
- Evergreen files (MEMORY.md, topic files) are not decayed
- Other sources fall back to file mtime

Config: memorySearch.query.hybrid.temporalDecay.{enabled, halfLifeDays}
Default: disabled (backwards compatible, opt-in).

- DEFAULT_MMR_CONFIG.enabled = false (opt-in, was incorrectly true)
- Tie-break: handle bestItem === null so first candidate always wins
- CHANGELOG URL: docs.clawd.bot → docs.openclaw.ai
- Tests updated to pass enabled: true explicitly where needed

rodrigouroz · 2026-02-16T14:16:02Z

Closing as superseded by a refreshed PR rebased/ported to current main, with the same feature set and updated validation.

Replacement PR: #18097

openclaw-barnacle bot added the docs and agents labels Feb 10, 2026

greptile-apps bot reviewed Feb 10, 2026


rodrigouroz force-pushed the feat/mmr-reranking-clean branch from 1f16e23 to 5747b23 February 10, 2026 13:14

greptile-apps bot reviewed Feb 11, 2026


rodrigouroz changed the title ~~Feat/mmr reranking clean~~ feat(memory): Add MMR re-ranking and temporal decay for hybrid search Feb 11, 2026

rodrigouroz added 4 commits February 15, 2026 14:15

Memory: fix MMR tie-break and temporal timestamp dedupe

85a73d4

rodrigouroz force-pushed the feat/mmr-reranking-clean branch from 0f54bb4 to 85a73d4 February 15, 2026 14:18

openclaw-barnacle bot added the size: XL and trusted-contributor labels Feb 15, 2026

thewilloftheshadow force-pushed the main branch from bfc1ccb to f92900f February 15, 2026 18:46

rodrigouroz mentioned this pull request Feb 16, 2026

(feat): mmr and temporal decay #18097

Merged

18 tasks

rodrigouroz closed this Feb 16, 2026

rodrigouroz wants to merge 4 commits into openclaw:main from rodrigouroz:feat/mmr-reranking-clean
