Skip to content

fix(web): drop search-input pollution from _searchByTag (closes #672)#749

Merged
memtomem merged 2 commits intomainfrom
fix/672-tag-search-pollution
May 3, 2026
Merged

fix(web): drop search-input pollution from _searchByTag (closes #672)#749
memtomem merged 2 commits intomainfrom
fix/672-tag-search-pollution

Conversation

@pandas-studio
Copy link
Copy Markdown
Collaborator

@pandas-studio pandas-studio commented May 3, 2026

Summary

  • _searchByTag no longer copies the clicked tag into search-input — it sets only tag-filter. The Tags Cloud / Tags List pill click is now a pure narrowing filter on whatever the user was already searching for, instead of running a confusing dual-axis search where BM25 ranked documents that merely mentioned the tag in prose on top of the tag-filter constraint.
  • When search-input is empty, the click acts as a "filter prep": Search tab opens, tag-filter populated, but doSearch() early-returns on the empty q so no results render. Follow-up tracked in Allow tag-only search (relax empty-q guard in /api/search and doSearch) #750 (relax frontend + backend empty-q guard so a fresh-session click runs as a true tag-only search).
  • CHANGELOG entry added under [Unreleased] / Changed (Keep a Changelog format) — user-visible behaviour change, so noted explicitly.

Why this is one focused PR (per CONTRIBUTING.md)

The bug is a UX axis-duplication: the same tag string was being applied as both BM25 query and tag filter. Removing the duplication (this PR) is independent of supporting empty-q tag-only searches (#750). Doing both together would force a backend q schema change into a frontend behaviour fix, which CONTRIBUTING.md explicitly discourages.

Test plan

  • uv run ruff check packages/memtomem/src && uv run ruff format --check packages/memtomem/src — clean
  • uv run pytest -m "not ollama" — 3994 passed, 7 skipped, 46 deselected
  • Manual mm web:
    • Narrowing: type a query in Search → Tags tab → click pill → Search tab returns, search-input keeps original query, tag-filter shows clicked tag, results = (query ∩ tag).
    • Filter prep: fresh session → Tags tab → click pill → Search tab activates, tag-filter populated, search-input empty, no results render. Type query → results.
    • Result-panel tag chip click (app.js:2413): unchanged — still overwrites tag-filter and re-runs.
    • Filter chip X clear: unchanged.

Copy link
Copy Markdown
Owner

@memtomem memtomem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

memtomem and others added 2 commits May 3, 2026 20:20
Tag pill clicks were copying the clicked tag into both `search-input`
(BM25 query) and `tag-filter`, running a dual-axis search where BM25
ranked documents that merely mentioned the tag in prose on top of the
tag-filter constraint. The search bar also looked as if the user had
typed the tag themselves, which made later filter-clear behaviour
confusing.

Now `_searchByTag` sets only `tag-filter` and leaves `search-input`
alone, so a tag pill click is a pure narrowing filter on whatever the
user was already searching for. When `search-input` is empty the call
to `doSearch()` early-returns on the empty `q`, so the click acts as
filter prep (Search tab opens, `tag-filter` populated, no results
until the user types a query). A follow-up will relax the empty-q
guard in `doSearch()` and the backend `q: str = Query(...)` so a tag
pill click on a fresh session can run as a true tag-only search.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Reviewer asked for an explicit tracker reference so readers can find
the follow-up that lifts the empty-q guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@memtomem memtomem force-pushed the fix/672-tag-search-pollution branch from c08d08d to 8eb2abb Compare May 3, 2026 11:20
memtomem added a commit that referenced this pull request May 3, 2026
Follow-up to #672 / PR #749. After PR #749 stopped `_searchByTag` from
copying the clicked tag into `search-input`, a fresh-session tag pill
click left `search-input` empty and `doSearch()` early-returned —
landing the user on Search with `tag-filter` populated but no results,
requiring an extra keystroke to actually run.

The fix relaxes the empty-`q` guard on three layers:

- Frontend (`app.js`): `doSearch()` and the load-more button now run
  whenever `q` *or* `tag-filter` has a value; `saveToHistory` is gated
  on a non-empty `q` so tag-only searches don't pollute the suggest
  history.
- Route (`/api/search`): `q` is optional (`str | None`); the route
  rejects only "no axis at all" with a 400 + actionable detail
  ("Provide at least one of q, tag_filter, or source_filter."). The
  pre-existing 422 for missing-everything was FastAPI's default for a
  required param — the new 400 is explicit and lists the available
  axes.
- Pipeline (`SearchPipeline.search`): empty-query calls route through
  a new `_filter_only_search` branch that enumerates via
  `storage.recall_chunks(tag_filter=…, source_filter=…)` and skips
  BM25/dense/expansion/rescue/rerank — those need a query/embedding
  signal. The post-filter stages (validity → decay → access boost →
  importance boost → context window) still apply so the rank reflects
  recency × access × importance even without a keyword. MMR is also
  skipped because there's no semantic axis to diversify on.

`recall_chunks` gained an optional `tag_filter` kwarg with the same
comma-separated OR semantics as the keyword path's post-fusion tag
filter — keeping the two paths' tag semantics aligned so a click that
narrows down a keyword search and a click that runs tag-only on a
fresh session both match the same set of chunks.

Tests: four new non-ollama regression tests in `test_pipeline.py`
(`TestFilterOnlySearch`) lock the contract — the empty-q path enumerates
via `recall_chunks` and never reaches `bm25_search` / `dense_search` /
`embed_query` (they're wired to raise so a regression that drops the
early-return branch fails loudly). `test_web_routes.py::TestSearch`
gains `test_search_tag_only_returns_results` and updates
`test_search_missing_query_returns_422` → `_no_axis_returns_400` to
match the new contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@memtomem memtomem merged commit c368f77 into main May 3, 2026
9 checks passed
@memtomem memtomem deleted the fix/672-tag-search-pollution branch May 3, 2026 11:29
@github-actions github-actions Bot locked and limited conversation to collaborators May 3, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants