Skip to content

feat(memory-lancedb): make auto-capture max length configurable#16624

Closed
ciberponk wants to merge 1 commit intoopenclaw:mainfrom
ciberponk:feat/memory-lancedb-configurable-capture-limit-en
Closed

feat(memory-lancedb): make auto-capture max length configurable#16624
ciberponk wants to merge 1 commit intoopenclaw:mainfrom
ciberponk:feat/memory-lancedb-configurable-capture-limit-en

Conversation

@ciberponk
Copy link

@ciberponk ciberponk commented Feb 14, 2026

Summary

  • add captureMaxChars config for memory-lancedb auto-capture filtering
  • keep existing behavior by default (1500 max chars)
  • expose the option in plugin schema/ui hints and plugin manifest

Why

Capture length limits are currently hardcoded. Making the limit configurable lets operators tune recall/capture quality for different channels and message styles without patching source code.

Changes

  • extensions/memory-lancedb/config.ts
    • add captureMaxChars?: number to config type
    • parse and validate range (100..10000)
    • default to 1500 when unset
    • add UI hint metadata for the new setting
  • extensions/memory-lancedb/index.ts
    • make shouldCapture accept optional { maxChars }
    • apply cfg.captureMaxChars in auto-capture flow
  • extensions/memory-lancedb/openclaw.plugin.json
    • expose captureMaxChars in UI hints and JSON schema
  • extensions/memory-lancedb/index.test.ts
    • add config validation test for out-of-range values
    • add capture boundary assertions with explicit max-char option

Validation

  • ./node_modules/.bin/vitest run extensions/memory-lancedb/index.test.ts

Greptile Overview

Greptile Summary

This PR adds a configurable captureMaxChars setting to the memory-lancedb plugin, allowing operators to tune the maximum message length for auto-capture filtering. The implementation is clean and properly validates the range (100-10000), with good test coverage.

Critical Issue: The PR description states it will "keep existing behavior by default (1500 max chars)" but the actual previous hardcoded value was 500 chars (verified in commit 5b23999). The new default of 1500 triples the capture length limit, which is a breaking behavioral change that contradicts the stated intent. This will cause the auto-capture system to accept 3x longer messages than before, potentially affecting memory quality and performance.

Changes:

  • Added captureMaxChars config option with validation (100-10000 range)
  • Modified shouldCapture function to accept optional maxChars parameter
  • Added UI hints and JSON schema for the new setting
  • Added test coverage for config validation and capture boundary tests

Recommendation: Change DEFAULT_CAPTURE_MAX_CHARS from 1500 to 500 to truly preserve existing behavior, or update the PR description to acknowledge this is an intentional 3x increase in the default capture limit.

Confidence Score: 2/5

  • PR contains a breaking behavior change that contradicts the stated intent
  • The PR claims to preserve existing behavior with a default of 1500 chars, but the actual existing behavior was 500 chars. This increases the default by 3x, which changes how auto-capture filters messages and could significantly alter system behavior in production
  • config.ts and index.test.ts need the default value corrected from 1500 to 500

Last reviewed commit: d615545

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

export type MemoryCategory = (typeof MEMORY_CATEGORIES)[number];

const DEFAULT_MODEL = "text-embedding-3-small";
const DEFAULT_CAPTURE_MAX_CHARS = 1500;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default changed from 500 to 1500 - this triples the max chars limit and changes existing behavior despite PR description claiming to "keep existing behavior by default"

Suggested change
const DEFAULT_CAPTURE_MAX_CHARS = 1500;
const DEFAULT_CAPTURE_MAX_CHARS = 500;
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/memory-lancedb/config.ts
Line: 21:21

Comment:
default changed from 500 to 1500 - this triples the max chars limit and changes existing behavior despite PR description claiming to "keep existing behavior by default"

```suggestion
const DEFAULT_CAPTURE_MAX_CHARS = 500;
```

How can I resolve this? If you propose a fix, please make it concise.

expect(config).toBeDefined();
expect(config?.embedding?.apiKey).toBe(OPENAI_API_KEY);
expect(config?.dbPath).toBe(dbPath);
expect(config?.captureMaxChars).toBe(1500);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test expects 1500 but the original default was 500 - update to match actual existing behavior

Suggested change
expect(config?.captureMaxChars).toBe(1500);
expect(config?.captureMaxChars).toBe(500);
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/memory-lancedb/index.test.ts
Line: 64:64

Comment:
test expects 1500 but the original default was 500 - update to match actual existing behavior

```suggestion
    expect(config?.captureMaxChars).toBe(500);
```

How can I resolve this? If you propose a fix, please make it concise.

@jheeanny
Copy link

Memory LanceDB auto-capture max length: we set per-agent thresholds (e.g. 1000 chars) and also prune during heartbeat. For large memories, we store summaries instead of raw logs. Does this feature allow dynamic per-agent overrides?

@vignesh07
Copy link
Contributor

Thanks for this contribution. #16641 Implemented it differently in this PR. Credited you in the changelog.

@vignesh07 vignesh07 closed this Feb 15, 2026
@openclaw-barnacle openclaw-barnacle bot added extensions: memory-lancedb Extension: memory-lancedb size: S labels Feb 15, 2026
vincentkoc added a commit to vincentkoc/openclaw that referenced this pull request Feb 15, 2026
* 'main' of https://github.com/vincentkoc/openclaw: (1129 commits)
  Browser: avoid single-page target lookup hang under blocked CDP attach
  fix(test): avoid vitest mock type inference issues
  Changelog: configurable LanceDB capture limit
  Memory-lancedb: configurable capture limit (openclaw#16624) (thanks @ciberponk)
  feat(memory-lancedb): make auto-capture max length configurable
  fix(nodes): raise transport timeout for exec.approval.request (openclaw#12098) (openclaw#12188)
  refactor(line): extract node webhook handler + shared verification
  ci(sandbox): add sandbox-common smoke
  refactor(sandbox): add sandbox-common dockerfile
  perf(test): speed up update-startup and docker-setup suites
  perf(cron): make wakeMode now busy-wait configurable
  perf(test): consolidate inbound access-control suites
  perf(test): reuse temp roots in session suites
  refactor(test): dedupe gateway auth e2e lockout setup
  refactor(test): dedupe discord status tool-result test setup
  refactor(test): reuse think directive fixtures
  refactor(test): reuse directive per-agent allowlist config
  refactor(test): share telegram forum ctx helper
  refactor(test): share directive elevated config
  refactor(test): share slack monitor helpers
  ...
akoscz pushed a commit to akoscz/openclaw that referenced this pull request Feb 15, 2026
GwonHyeok pushed a commit to learners-superpumped/openclaw that referenced this pull request Feb 15, 2026
vincentkoc pushed a commit to vincentkoc/openclaw that referenced this pull request Feb 15, 2026
snowzlm pushed a commit to snowzlm/openclaw that referenced this pull request Feb 15, 2026
Muggleee added a commit to Muggleee/openclaw that referenced this pull request Feb 15, 2026
…patches (#4)

* fix: validate state for manual Chutes OAuth

* test: fix Signal tool-result mocks

* test(signal): avoid unused monitor import

* refactor(test): table npm global update cases

* refactor(process): share stdin/session guards

* refactor(test): share temp home env harness

* fix(gateway): abort active runs during sessions.reset (openclaw#16576)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 43da87f
Co-authored-by: Grynn <[email protected]>
Co-authored-by: gumadeiras <[email protected]>
Reviewed-by: @gumadeiras

* refactor(test): reuse nodes media gateway mock

* fix(security): add optional workspace-only path guards for fs tools

* docs(changelog): note exec allowlist command substitution fix

* docs(changelog): clarify exec allowlist mode only

* test(signal): ensure tool-result mocks apply before monitor import

* test(signal): load monitor after tool-result mocks

* tui: cap local shell output buffering

* fix: add safety timeout to session.compact() to prevent lane deadlock (openclaw#16533)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 21e4045
Co-authored-by: BinHPdev <[email protected]>
Co-authored-by: gumadeiras <[email protected]>
Reviewed-by: @gumadeiras

* refactor(test): share cron isolated agent fixtures

* fix(process): satisfy tool execute typing

* fix(test): remove unused cron imports

* Memory/QMD: cap qmd command output buffering

* Memory/QMD: prefer exact docid lookup in index

* Memory/QMD: robustly parse noisy qmd JSON output

* Memory/QMD: add limit arg to search command

* Memory/QMD: optimize qmd readFile for line-window reads

* Memory/QMD: skip unchanged session export writes

* Memory/QMD: parse scope once in qmd scope checks

* docs(changelog): soften exec allowlist scope note

* docs: consolidate 2026.2.14 changelog

* perf(test): speed up session store lock suite

* perf(test): reuse memory manager batch suite

* perf(test): speed up web auto-reply last-route coverage

* chore(test): fix oxlint errors

* perf(test): speed up sessions suite

* perf(test): speed up dns cli test

* perf(test): reuse managers in embedding batches suite

* perf(test): reuse managers in embedding token limit suite

* perf(test): speed up archive suite

* perf(test): speed up session store pruning suite

* perf(test): reduce sync passes in memory batch failure test

* perf(test): speed up memory index suite

* perf(test): speed up path env suite

* fix(web): remove leaked SIGINT handler when keepAlive=false

* perf(test): consolidate web auto-reply suites

* test: fix processMessage contract test lint

* test: isolate OPENCLAW_HOME in withTempHome

* fix(sandbox): switch to root user for package installation in sandbox-common-setup

The base image (Dockerfile.sandbox) sets USER sandbox at the end, so
when sandbox-common-setup.sh builds FROM it, apt-get runs as the
unprivileged sandbox user and fails with 'Permission denied'.

Add USER root before apt-get/npm/curl install steps, and restore
USER sandbox at the end to preserve the non-root runtime default.

Fixes openclaw#16420

* fix(line): return 200 for webhook verification requests without signature

LINE Platform sends POST {"events":[]} without an X-Line-Signature
header when the user clicks 'Verify' in the LINE Developers Console.
Both webhook.ts and monitor.ts rejected this with 400 'Missing
X-Line-Signature header', causing verification to fail.

Now detect the verification pattern (no signature + empty events array)
and return 200 OK immediately, while still requiring valid signatures
for all real webhook deliveries with non-empty events.

Fixes openclaw#16425

* fix: LINE webhook verification 200; fix tsgo error (openclaw#16582) (thanks @arosstale)

* Memory/QMD: treat prefixed no-results markers as empty

* agents: reduce prompt token bloat from exec and context (openclaw#16539)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 8e1635f
Co-authored-by: CharlieGreenman <[email protected]>
Co-authored-by: gumadeiras <[email protected]>
Reviewed-by: @gumadeiras

* docs: document bootstrap total cap and exec log/notify behavior

* fix(workspace): create BOOTSTRAP.md regardless of workspace state (openclaw#16457) (openclaw#16504)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: a57718c
Co-authored-by: robbyczgw-cla <[email protected]>
Co-authored-by: gumadeiras <[email protected]>
Reviewed-by: @gumadeiras

* Memory/QMD: make status checks side-effect free

* Memory/QMD: handle fallback init failures gracefully

* refactor(test): share overflow compaction mocks

* refactor(test): share auto-reply temp home harness

* refactor(test): share sessions_spawn e2e mocks

* fix(test): remove unused vitest imports

* refactor(test): share gateway server plugin mocks

* refactor(test): dedupe fuzzy model directive config

* refactor(test): dedupe discord handler setup

* refactor(test): share cron service fixtures

* fix(test): remove unused cron import

* fix(test): complete gateway plugin registry mock

* refactor(test): reuse base CLI program mocks

* refactor(test): dedupe pi subscribe text_end cases

* refactor(test): share slack monitor helpers

* refactor(test): share directive elevated config

* refactor(test): share telegram forum ctx helper

* refactor(test): reuse directive per-agent allowlist config

* refactor(test): reuse think directive fixtures

* refactor(test): dedupe discord status tool-result test setup

* refactor(test): dedupe gateway auth e2e lockout setup

* perf(test): reuse temp roots in session suites

* perf(test): consolidate inbound access-control suites

* perf(cron): make wakeMode now busy-wait configurable

* perf(test): speed up update-startup and docker-setup suites

* refactor(sandbox): add sandbox-common dockerfile

* ci(sandbox): add sandbox-common smoke

* refactor(line): extract node webhook handler + shared verification

* fix(nodes): raise transport timeout for exec.approval.request (openclaw#12098) (openclaw#12188)

`openclaw nodes run` always timed out after 35s with "gateway timeout
after 35000ms" even though `openclaw nodes invoke system.run` worked
instantly on the same node.

Root cause: the CLI's default --timeout of 35s was used as the WebSocket
transport timeout for exec.approval.request, but the gateway-side
handler waits up to 120s for user approval — so the transport was always
killed 85s too early.

Fix: override opts.timeout for the approval call to
Math.max(parseTimeoutMs(opts.timeout) ?? 0, approvalTimeoutMs + 10_000)
(130s by default), ensuring the transport outlasts the approval wait
while still honoring any larger user-supplied --timeout.

* feat(memory-lancedb): make auto-capture max length configurable

* Memory-lancedb: configurable capture limit (openclaw#16624) (thanks @ciberponk)

* Changelog: configurable LanceDB capture limit

* fix(test): avoid vitest mock type inference issues

* Browser: avoid single-page target lookup hang under blocked CDP attach

* fix: improve sqlite missing runtime error

* refactor: centralize exec approval timeout

* fix(workspace): persist bootstrap onboarding state

* changelog: add workspace onboarding attribution

* perf(test): reduce memory suite resets

* perf(test): streamline imessage monitor suites

* perf(test): avoid process.env cloning in update-startup suite

* perf(test): drop polling waits in qmd manager suite

* perf(test): drop recursive mkdir in qmd manager suite

* test(web): stabilize processMessage inbound contract cleanup

* test(web): stabilize processMessage inbound contract cleanup

* perf(test): remove sleeps from session store lock suite

* refactor(test): dedupe cron isolated-agent e2e setup

* refactor(test): dedupe web auto-reply last-route test

* refactor(test): dedupe cloudflare onboarding provider auth cases

* refactor(test): dedupe update-cli downgrade setup

* refactor(test): dedupe loadWorkspaceSkillEntries plugin setup

* refactor(test): dedupe pi-tools schema union checks

* refactor(test): dedupe trigger greeting prompt cases

* fix(test): align trigger harness config types

* fix(test): avoid base-to-string in nodes-media e2e logs

* refactor: share file lock via plugin-sdk

* refactor(bluebubbles): dedupe webhook normalization

* refactor(msteams): share Graph helpers

* refactor(test): dedupe gemini oauth fixture setup

* refactor(test): dedupe googlechat webhook routing setup

* fix(ci): avoid TS2742 vitest mock export types

* TUI/Gateway: emit internal hooks for /new and /reset

* TUI: honor explicit session key in global scope

* Changelog: note explicit TUI session override fix

* perf(test): stop polling cron job list

* perf(test): reuse temp root in slack prepare contract suite

* refactor(test): dedupe session reset policy setup

* perf(test): avoid per-test rm in update-startup suite

* perf(test): avoid dynamic imports in session reset suites

* perf(test): reduce mkdir churn in path env suite

* perf(test): reuse imports in models cli suite

* Sandbox: add shared bind-aware fs path resolver

* Sandbox: honor bind mounts in file tools

* perf(test): keep single media server and fast cleanup

* Changelog: note sandbox bind-mount file tool fix

* perf(test): avoid env cloning in docker-setup suite

* Media: include state workspace/sandbox in local path allowlist

* Changelog: note media local root allowlist update

* Lockfile: sync msteams specifiers

* refactor(onboarding): share promptAccountId helper

* refactor(zalo): share outbound chunker

* refactor(whatsapp): share target resolver

* refactor(slack): share message action helpers

* refactor(imessage): share target parsing helpers

* refactor(agents): dedupe claude oauth parsing

* refactor(gateway): share config restart sentinel builder

* refactor(telegram): share outbound param parsing

* refactor(bluebubbles): share send helpers

* refactor(memory): share sync indexing helper

* refactor(slack): dedupe member join/leave handlers

* perf(test): speed up qmd manager suite

* TUI: honor gateway bind mode for local connection URL

* Changelog: note TUI gateway bind URL fix

* Memory: reduce watcher FD pressure for markdown sync

* Changelog: note memory watcher FD-pressure hardening

* perf(test): remove gateway lock sleep waits

* fix(test): mock whatsapp outbound target resolver

* perf(test): avoid importing update-check in startup suite

* Protocol: regenerate Swift gateway models

* Diagnostics: bound in-memory session state tracking

* Changelog: note diagnostic session-state bounds

* fix(test): disable safeBins expectations on Windows

* fix(test): make sandbox fs-path expectations cross-platform

* fix(image): allow workspace and sandbox media paths (openclaw#15541)

* fix: media allowlist finalize (openclaw#16697) (thanks @tyler6204)

* refactor(line): share inbound context builder

* refactor(outbound): share tool payload extraction

* refactor(memory): dedupe batch embedding glue

* refactor(usage): share claude window builder

* refactor(gateway): share node session touch

* refactor(cli): share exec approvals save flow

* refactor(cli): dedupe browser start/stop

* refactor(feishu): share download buffer reader

* refactor(plugin-sdk): reuse dedupe cache

* Gateway: bound agent run sequence tracking

* Changelog: note agentRunSeq map hardening

* Auto-reply: bound abort memory map growth

* Changelog: note abort memory map hardening

* chore(release): bump versions to 2026.2.14

* Slack: bound thread starter cache growth

* Changelog: note Slack thread starter cache bounds

* Outbound: bound directory cache memory growth

* Changelog: note directory cache bounds hardening

* Skills: clean up remote node cache on disconnect

* Changelog: note remote skills cache disconnect cleanup

* fix(image): propagate workspace root for image allowlist (openclaw#16722)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 24a1367
Co-authored-by: steipete <[email protected]>
Co-authored-by: steipete <[email protected]>
Reviewed-by: @steipete

* Memory/QMD: self-heal null-byte collection metadata on update

* Memory/QMD: add null-byte collection repair regressions

* Changelog: note QMD null-byte collection self-heal

* Subagents: retain announce queue items on send failure

* Subagents: add announce queue failure retry regressions

* Changelog: note subagent announce queue retry hardening

* chore (exec): add PTY background abort regression test

* fix: deliver tool result media when verbose is off (openclaw#16679)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 6e16feb
Co-authored-by: christianklotz <[email protected]>
Co-authored-by: christianklotz <[email protected]>
Reviewed-by: @christianklotz

* fix(security): default apply_patch workspace containment

* fix(infra): avoid req.destroy(err) in request body limiters

* fix (memory/lancedb): harden memory recall and auto-capture

* chore (changelog): note memory-lancedb injection hardening

* fix (memory/lancedb): require explicit opt-in for auto-capture

* chore (changelog): note memory-lancedb auto-capture opt-in

* fix(config): stop defaulting slack/discord dm.policy

* fix(security): harden Windows child process spawning

* fix (tui): preserve active stream during concurrent run finals

* chore (changelog): note TUI concurrent stream hardening

* refactor(media): harden localRoots bypass (openclaw#16739)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 89dce69
Co-authored-by: steipete <[email protected]>
Co-authored-by: steipete <[email protected]>
Reviewed-by: @steipete

* fix(agents): block workspaceOnly apply_patch delete symlink escape

* fix (tui): sanitize binary-heavy history text before render

* chore (tui): replace control-char regex with codepoint sanitizer

* chore (changelog): note TUI binary history render hardening

* fix(security): apply tools.fs.workspaceOnly to sandbox file tools

* chore (tests): format apply-patch e2e test

* fix(discord): harden voice message media loading

* fix (tui): preserve streamed text across tool boundary deltas

* chore (tui): add stream assembler regression for tool boundary drops

* chore (changelog): note TUI tool-boundary stream fix

* fix(memory): prevent QMD scope deny bypass

* fix (tui): harden render sanitization for narrow terminals

* chore (tui): add sanitizer regressions for narrow width safety

* chore (changelog): note narrow-terminal TUI sanitizer hardening

* fix(allowlist): canonicalize Slack/Discord allowFrom

* fix (memory/builtin): keep status dirty state stable across invocations

* chore (memory): add status dirty rebound regression test

* chore (changelog): note stable memory status dirty reporting

* docs: update Slack/Discord allowFrom references

* test: stabilize sessions_spawn e2e mocks

* ci: reduce docker e2e log brittleness

* fix: support file: npm specs in plugin install

* fix: accept auth code in chutes oauth manual flow

* test (agents): cover empty-chunk timeout failover behavior

* fix (agents): classify empty-chunk stream failures as timeout

* chore (changelog): document empty-chunk timeout handling

* docs(changelog): reorder 2026.2.14 notes

* fix (telegram): return webhook timeout responses to prevent retry storms

* test (telegram): assert webhook callback timeout-safe options

* chore (changelog): note telegram webhook timeout retry-storm fix

* test: quiet docker onboard e2e noise

* fix (signal): preserve case for group target normalization

* test (signal): cover mixed-case group target ids

* chore (changelog): note signal group-id normalization fix

* fix (memory/qmd): avoid multi-collection query ranking corruption

* test (memory/qmd): cover per-collection query fallback behavior

* chore (changelog): note qmd multi-collection query fix

* fix (discord): ignore empty guild channel maps in allowlist resolution

* test (discord): cover empty guild channels config fallback

* chore (changelog): note discord empty channels allowlist fix

* docs(changelog): mark 2026.2.14 released

* fix (cron): skip startup replay for interrupted running jobs

* test (cron): cover interrupted startup job replay guard

* chore (changelog): note cron interrupted-start replay fix

* fix (tui): keep assistant text contrast theme-adaptive

* test (tui): cover assistant default-foreground theme behavior

* chore (changelog): note tui light-theme contrast fix

* fix (agents): accept read file_path alias in tool-start path checks

* test (agents): cover read file_path alias in tool-start diagnostics

* chore (changelog): note read tool file_path alias warning fix

---------

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Vishal Doshi <[email protected]>
Co-authored-by: Grynn <[email protected]>
Co-authored-by: gumadeiras <[email protected]>
Co-authored-by: Vignesh Natarajan <[email protected]>
Co-authored-by: Bin Deng <[email protected]>
Co-authored-by: BinHPdev <[email protected]>
Co-authored-by: artale <[email protected]>
Co-authored-by: Charlie Greenman <[email protected]>
Co-authored-by: CharlieGreenman <[email protected]>
Co-authored-by: Gustavo Madeira Santana <[email protected]>
Co-authored-by: Robby <[email protected]>
Co-authored-by: robbyczgw-cla <[email protected]>
Co-authored-by: Marcus Castro <[email protected]>
Co-authored-by: fan <fan@FANCOOL-P16V>
Co-authored-by: Tyler Yust <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: steipete <[email protected]>
Co-authored-by: Christian Klotz <[email protected]>
Co-authored-by: christianklotz <[email protected]>
Co-authored-by: zackleeli <[email protected]>
Benkei-dev pushed a commit to Benkei-dev/openclaw that referenced this pull request Feb 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

extensions: memory-lancedb Extension: memory-lancedb size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments