fix(indexing): stream endpoint namespace + errors parity (#590) #591
Merged
``IndexEngine.index_path_stream`` and the ``GET /api/index/stream``
endpoint were missing two contract elements that ``index_path`` /
``POST /api/index`` already had: the ``namespace`` parameter (silently
dropped, so streamed indexes ignored user namespace selection) and
error aggregation in the ``complete`` event (per-file
``result["errors"]`` was summed only into chunk counters and never
emitted, so partial-failure runs appeared successful in the UI).
Engine: thread ``namespace`` through to ``_index_file``, accumulate
per-file errors, and include them as ``errors: list[str]`` in the
``complete`` event. This is the same loose shape as
``IndexingStats.errors``, so non-stream UI handlers can be reused
verbatim. Path-prefix the stream-level uncaught-exception branch
(``f"{fp.name}: {exc}"``) to match the non-stream
``asyncio.gather(return_exceptions=True)`` branch.
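The engine change can be pictured with a minimal sketch. Everything in it is a stand-in rather than the project's code: ``_index_file_stub``, ``index_path_stream_sketch``, ``collect_events``, the ``"type"`` key, and the per-file event shape are assumptions made for illustration. Only the ``namespace`` forwarding, the ``errors`` list in the ``complete`` event, and the ``f"{fp.name}: {exc}"`` prefix come from the description above.

```python
import asyncio
from pathlib import Path
from typing import Any, AsyncIterator, Dict, List, Optional


async def _index_file_stub(
    fp: Path, force: bool, namespace: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical stand-in for ``_index_file``; not the real engine method."""
    if fp.name == "boom.md":
        # Simulates an exception escaping the per-file call.
        raise RuntimeError("disk read failed")
    if fp.suffix == ".bin":
        # Simulates a handled per-file failure reported via result["errors"].
        return {
            "indexed_chunks": 0,
            "skipped_chunks": 0,
            "errors": [f"{fp.name}: binary file skipped"],
        }
    return {
        "indexed_chunks": 2,
        "skipped_chunks": 0,
        "errors": [],
        "namespace": namespace,
    }


async def index_path_stream_sketch(
    files: List[Path], force: bool = False, namespace: Optional[str] = None
) -> AsyncIterator[Dict[str, Any]]:
    """Yield one event per indexed file, then a single ``complete`` event."""
    indexed = 0
    skipped = 0
    all_errors: List[str] = []
    for fp in files:
        try:
            # The namespace is forwarded instead of being silently dropped.
            result = await _index_file_stub(fp, force, namespace=namespace)
        except Exception as exc:
            # Path-prefix uncaught exceptions so every error has the same shape.
            all_errors.append(f"{fp.name}: {exc}")
            continue
        indexed += result["indexed_chunks"]
        skipped += result["skipped_chunks"]
        # Previously only the counters were summed; errors are kept as well.
        all_errors.extend(result["errors"])
        yield {"type": "file", "file": fp.name, "namespace": result.get("namespace")}
    yield {
        "type": "complete",
        "total_files": len(files),
        "indexed_chunks": indexed,
        "skipped_chunks": skipped,
        "errors": all_errors,  # always present, empty on a clean run
    }


async def collect_events(
    files: List[Path], namespace: Optional[str] = None
) -> List[Dict[str, Any]]:
    """Drain the stream into a list (helper for the sketch only)."""
    return [ev async for ev in index_path_stream_sketch(files, namespace=namespace)]
```

Draining the generator with ``asyncio.run(collect_events(...))`` ends with a ``complete`` event whose ``errors`` list is empty on a clean run and holds one path-prefixed string per failure otherwise.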
Route: forward the new ``namespace`` query param to the engine.
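As a rough, framework-agnostic illustration of the route change, the sketch below pulls an optional ``namespace`` out of a query string so it can be handed to the engine. The web framework used by the real route is not shown here, so ``stream_params_from_url`` and the ``path`` query parameter are hypothetical; only the optional ``namespace`` parameter defaulting to ``None`` is taken from the description.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def stream_params_from_url(url: str) -> Dict[str, Optional[str]]:
    """Extract the arguments a stream route would forward to the engine.

    Hypothetical helper: ``path`` is an assumed parameter name; ``namespace``
    is optional and stays ``None`` when the caller does not send it.
    """
    query = parse_qs(urlsplit(url).query)
    return {
        "path": query.get("path", [None])[0],
        "namespace": query.get("namespace", [None])[0],
    }
```

A request such as ``/api/index/stream?path=docs&namespace=ns590`` would then forward ``namespace="ns590"``, while a request without the parameter keeps the previous behaviour of passing no namespace.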
Tests: ``test_index_path_stream_namespace_propagates`` (chunks gain
``metadata.namespace`` for the streamed run) and
``test_index_path_stream_complete_errors_no_silent_drop`` (the binary
file triggers a per-file error that surfaces in ``complete.errors``).
The test names document intent so future grep finds the regression
guards.
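The shape of the second guard can be sketched without the engine. The helpers below (``index_file_errors_stub``, ``complete_event_stub``, ``check_complete_errors_no_silent_drop``) are hypothetical stand-ins, and treating any NUL byte as "binary" is an assumption about the detection rule, not a statement of how the engine decides.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def index_file_errors_stub(fp: Path) -> List[str]:
    """Hypothetical stand-in for the binary-detected branch of ``_index_file``."""
    if b"\x00" in fp.read_bytes():
        return [f"{fp.name}: binary file, skipped"]
    return []


def complete_event_stub(files: List[Path]) -> Dict[str, Any]:
    """Aggregate per-file errors into a ``complete``-style event."""
    errors: List[str] = []
    for fp in files:
        errors.extend(index_file_errors_stub(fp))
    return {"total_files": len(files), "errors": errors}


def check_complete_errors_no_silent_drop() -> Dict[str, Any]:
    """Write one text file and one NUL-byte file, then inspect the event."""
    with tempfile.TemporaryDirectory() as tmp:
        good = Path(tmp) / "notes.md"
        good.write_text("# hello\n")
        bad = Path(tmp) / "blob.md"
        bad.write_bytes(b"abc\x00def")
        complete = complete_event_stub([good, bad])
    # A partial failure must not look like a clean run.
    assert complete["errors"], "errors were silently dropped"
    assert any("blob.md" in err for err in complete["errors"])
    return complete
```

The two assertions mirror the intent stated above: the error list is non-empty, and it names the offending file.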
Out of scope: per-file streaming of errors (option ii/iii from the
planning discussion); normalizing ``_index_file``'s mixed
path-prefix convention. Both noted in the issue.
Unblocks #582 PR-B (Prev #1 — Index tab two-button collapse) and
informs #582 PR #6 (4.11 — indexing in-flight visibility).
Refs #590.
Co-Authored-By: Claude <[email protected]>
Summary
Closes #590.
`IndexEngine.index_path_stream` and `GET /api/index/stream` were missing two contract elements that the non-stream `index_path` / `POST /api/index` pair already had:

- `namespace` parameter: silently dropped. Folder-mode UI / CLI / MCP callers that streamed an index ignored the user's namespace selection.
- `_index_file` returns `result["errors"]` per file (engine.py:606, 628, 684, 752); the stream loop summed only chunk counters (754-757) and never propagated errors. The `complete` event lacked an `errors` field, so partial-failure runs appeared successful in the UI.

Changes
- Engine (`packages/memtomem/src/memtomem/indexing/engine.py:707-784`):
  - Add `namespace: str | None = None` to `index_path_stream` and forward it to `_index_file(fp, force, namespace=namespace)` (which already accepted the param).
  - Accumulate `result["errors"]` into `all_errors` and emit `errors: list[str]` in the `complete` event. This is the same loose shape as `IndexingStats.errors`, so non-stream UI handlers (5-cap "+N more" rendering, error-row toggle) can be reused verbatim.
  - `errors: []` for shape consistency.
  - Path-prefix the stream-level uncaught-exception branch with `f"{fp.name}: {exc}"` so consumers see the same shape regardless of which branch caught the error. Matches non-stream's `asyncio.gather(return_exceptions=True)` branch (line 394).
- Route (`packages/memtomem/src/memtomem/web/routes/system.py:768-790`): add `namespace: str | None = None` query param to `GET /api/index/stream`, forward to engine.
- Tests (`packages/memtomem/tests/test_indexing_engine.py`):
  - `test_index_path_stream_namespace_propagates`: pass `namespace="ns590"`, verify indexed chunks have `metadata.namespace == "ns590"`.
  - `test_index_path_stream_complete_errors_no_silent_drop`: write a file with a NUL byte (binary-detected branch at engine.py:621-628), assert `complete.errors` is non-empty and mentions the file name. Test name documents intent so future grep finds the regression guard.

Compatibility
- Existing callers (`web/routes/system.py:789` legacy call site already updated; `cli/init_cmd.py:1546` reads only `total_files` / `indexed_chunks` / `skipped_chunks` from the `complete` event and is unaffected) continue to work without change.
- `runIndexStream` (app.js:3467) doesn't yet read `complete.errors`; that wiring and the namespace URL param land in #582 PR-B (Prev #1 collapse) and PR #6 (4.11 toast). This PR makes those follow-ups possible without silent regressions.

Out of scope
- Per-file streaming of errors (option ii/iii from the planning discussion): `complete.errors` is sufficient for current UX; revisit if user reports indicate large indexing runs (>100 files) where waiting for `complete` is impractical.
- Normalizing `_index_file`'s mixed path-prefix convention: some branches prefix `f"{file_path.name}: ..."` (file-too-large, binary), others don't (embedding failures at engine.py:684). Separate non-stream bug; this PR preserves the existing loose contract and only path-prefixes the stream-level uncaught-exception branch (line 752) for consistency with the non-stream outer-gather branch.

Refs
- `complete.errors` via toast)

Test plan
- `uv run ruff check packages/memtomem/src && uv run ruff format --check packages/memtomem/src`
- `uv run pytest packages/memtomem/tests/test_indexing_engine.py -k "test_index_path_stream" -m "not ollama"`: 4/4 pass (2 existing exclude-guard tests + 2 new parity guards)
- `uv run pytest packages/memtomem/tests/test_indexing_engine.py packages/memtomem/tests/test_web_routes.py packages/memtomem/tests/test_web_exclude_guard.py packages/memtomem/tests/test_init_cmd.py -m "not ollama"`: 441/441 pass (combined). Confirms no regression in route handlers, exclude guards, or the `mm init` wizard's seed flow.

🤖 Generated with Claude Code