
Long memory indexing should use first-class chunked vectorization instead of relying on single-record embedding #530

@dddgogogo

Summary

Long-term memory records can become too large for stable single-record vectorization. In practice, memory indexing becomes much more reliable when memory content is chunked into multiple vector records rather than being embedded as one oversized block.

I am opening this issue because long memory indexing appears to need chunking as a first-class strategy, not just an incidental workaround.

Environment

  • Repository: OpenViking
  • Deployment: local source checkout + systemd service
  • Storage backend: local AGFS + local vectordb
  • Memory flow: session commit -> memory extraction -> vector indexing
  • Embedding provider: OpenAI-compatible embedding backend

Actual behavior

When a memory content block becomes very long, indexing it as a single vectorization unit is fragile:

  • large memory records are more likely to hit embedding input limits
  • retrieval quality suffers because one large memory blob is semantically broad
  • indexing failures or degraded embeddings affect memory recall reliability

In our local testing and patching, chunked memory indexing was significantly more robust:

  • keep a base record for the memory itself
  • create chunk records for long content
  • index each chunk separately while preserving linkage to the original memory URI
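The pattern above can be sketched as follows. This is a minimal illustration, not OpenViking's actual API: `VectorRecord`, the `memory_base` kind, and the choice of storing a truncated preview in the base record are all assumptions; the `#chunk-NNNN` URI suffix and the `chunk_source_uri` / `chunk_index` / `chunk_count` metadata follow the naming used later in this issue.

```python
from dataclasses import dataclass, field
from typing import List

CHUNK_SIZE = 1000   # characters per chunk (illustrative default)
OVERLAP = 100       # characters shared between adjacent chunks

@dataclass
class VectorRecord:
    uri: str
    text: str
    metadata: dict = field(default_factory=dict)

def index_memory(uri: str, content: str) -> List[VectorRecord]:
    """Build one base record plus chunk records for long content."""
    # Base record for the memory itself (here: a truncated preview,
    # an assumption about what the base record would embed).
    records = [VectorRecord(uri=uri, text=content[:CHUNK_SIZE],
                            metadata={"kind": "memory_base"})]
    if len(content) <= CHUNK_SIZE:
        return records  # short memory: no chunking needed
    step = CHUNK_SIZE - OVERLAP
    chunks = [content[i:i + CHUNK_SIZE] for i in range(0, len(content), step)]
    for idx, chunk in enumerate(chunks):
        records.append(VectorRecord(
            uri=f"{uri}#chunk-{idx:04d}",          # e.g. memory-uri#chunk-0001
            text=chunk,
            metadata={"chunk_source_uri": uri,     # linkage back to the memory
                      "chunk_index": idx,
                      "chunk_count": len(chunks)},
        ))
    return records
```

Each chunk record carries enough metadata to reconstruct its position in the original memory, so indexing failures or re-indexing can be handled per chunk.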

Expected behavior

Long memory indexing should be natively handled in a way that is:

  • reliable for large memory contents
  • semantically better for retrieval
  • explicit and auditable in the index structure

Why chunking helps

Chunking improves both stability and retrieval behavior:

  1. It avoids embedding overly large input blocks.
  2. It preserves more local semantics than one huge vector.
  3. It lets retrievers match the relevant part of a long memory instead of scoring the whole memory as one document.
  4. It gives the system more predictable indexing behavior as memory sizes grow.
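Points 1 and 4 follow mechanically from bounding chunk size: no matter how large a memory grows, every embedded input stays under a fixed window. A minimal sketch, assuming naive whitespace tokenization (a real implementation would use the embedding model's own tokenizer, and the parameter names are illustrative):

```python
def token_window_chunks(text: str, max_tokens: int = 256, overlap: int = 32):
    """Split text into overlapping token windows so every chunk stays
    under the embedding input limit, regardless of total memory length."""
    tokens = text.split()  # naive tokenization, for illustration only
    if len(tokens) <= max_tokens:
        return [text]
    step = max_tokens - overlap
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), step)]
```

Because each window is bounded, indexing cost and failure modes scale with the number of chunks rather than with the size of the largest memory.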

Reproduction pattern

A general reproduction path is:

  1. Produce a session that yields a long extracted memory.
  2. Let OpenViking index the memory as one vectorization unit.
  3. Observe either:
    • embedding input size issues,
    • unstable indexing behavior,
    • or poorer recall quality than expected.

Suggested product direction

I would recommend making chunked memory indexing an explicit built-in strategy for long memories:

  1. Chunk long memory content before vectorization

    • split content by a configurable character or token window
    • support overlap between adjacent chunks
  2. Keep a base record plus child chunk records

    • base record for the memory URI itself
    • chunk URIs like memory-uri#chunk-0001
    • preserve chunk_source_uri, chunk_index, and chunk_count metadata
  3. Make retrieval aware of chunk relations

    • a matching chunk should be able to point back to the parent memory
    • dedup / rerank should avoid overwhelming final results with many neighboring chunks
  4. Expose chunking as configuration

    • chunk size
    • overlap size
    • whether chunking applies only to memory or to other context types too
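For item 3, the retrieval-side behavior could look something like the following hedged sketch: chunk hits resolve back to their parent memory via `chunk_source_uri`, and dedup keeps only the best-scoring chunk per parent so neighboring chunks do not overwhelm the final results (the hit tuple shape and scoring convention are assumptions, not OpenViking's actual retrieval interface):

```python
def dedup_chunk_hits(hits):
    """Collapse chunk-level hits so each parent memory appears at most once.

    `hits` is a list of (uri, score, metadata) tuples, higher score = better.
    Chunk records point at their parent via metadata["chunk_source_uri"];
    non-chunk hits map to themselves.
    """
    best = {}
    for uri, score, meta in hits:
        parent = meta.get("chunk_source_uri", uri)
        if parent not in best or score > best[parent][1]:
            best[parent] = (uri, score, meta)  # keep best chunk per parent
    # Rank parent memories by their best chunk's score.
    return sorted(best.values(), key=lambda h: h[1], reverse=True)
```

A fuller implementation might rerank parents by aggregating several chunk scores instead of taking the maximum, but the key invariant is the same: the caller sees one result per memory, with a pointer to the specific chunk that matched.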

Why this is different from plain truncation

Chunking is not just a safety hack. It changes the indexing model in a way that preserves more semantic coverage and gives better retrieval behavior for long memories.

Additional note

If maintainers think this already exists conceptually, I would still suggest documenting it as an official indexing strategy, because long-memory stability and recall quality depend on it.
