mm init: -y (non-interactive) accepts provider/tokenizer choices without checking required extras #396

@memtomem

Summary

mm init -y (non-interactive) accepts --provider onnx|ollama|openai and --tokenizer kiwipiepy without checking whether the corresponding Python extras (fastembed, ollama, openai, kiwipiepy) are importable. It writes the user-specified choice to config.json as if it were valid, and the failure only surfaces at runtime — as a warning from component_factory / fts_tokenizer, not at mm init time.

The interactive wizard already surfaces missing extras via _collect_missing_extras (Phase 1 of #360 / #361). The -y path bypasses the wizard steps entirely and doesn't call the equivalent check.

Adjacent to the packaging / docs story that just landed in #395 (primary install flipped to memtomem[all]), but orthogonal: even after that docs change, a user who takes the Minimal install path and then runs mm init -y --provider onnx ... gets exactly this gap.

Where (verified on 2026-04-23 on a fresh uv tool install memtomem, no extras)

$ mm init -y --provider onnx --model BAAI/bge-small-en-v1.5 --mcp skip
  memtomem init
  ─────────────
  Detected: source install
  ...
  Setup complete!
  Provider:   onnx/BAAI/bge-small-en-v1.5 (0d)  ← silently writes onnx config
  Search:     top_k=10, tokenizer=unicode61
  ...

The (0d) in the summary is the tell — the provider is configured as onnx but the embedder's dimension is 0, because fastembed is not importable and the factory falls back silently.

Later, at index/search time:

WARNING Embedding dimension mismatch detected at startup — entering degraded mode.
WARNING DB embedding_dimension=0 but configured provider is 'onnx' — continuing in recovery mode.
        Run 'mm embedding-reset --mode apply-current' to fix.

Same pattern for --tokenizer kiwipiepy on a fresh install:

$ mm init -y --tokenizer kiwipiepy --mcp skip  # succeeds
$ mm search "..."  # first call
WARNING kiwipiepy not installed — falling back to unicode61. Install with: pip install kiwipiepy

--provider ollama / --provider openai should behave the same (same importlib.util.find_spec seam) but I didn't re-test those in this session.
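As a rough illustration of the find_spec seam mentioned above, the availability probe could look something like the sketch below. The names PROVIDER_MODULES, TOKENIZER_MODULES, and missing_modules are hypothetical and are not taken from the memtomem codebase; the provider-to-module mapping is the one listed in the Summary. The find_spec parameter is there only so the lookup can be swapped out in tests.

```python
import importlib.util

# Hypothetical mapping (names invented for this sketch); the pairs themselves
# come from the Summary: onnx -> fastembed, ollama -> ollama, openai -> openai.
PROVIDER_MODULES = {"onnx": "fastembed", "ollama": "ollama", "openai": "openai"}
TOKENIZER_MODULES = {"kiwipiepy": "kiwipiepy"}


def missing_modules(provider=None, tokenizer=None, find_spec=importlib.util.find_spec):
    """Return the modules required by the given choices that are not importable."""
    required = []
    if provider in PROVIDER_MODULES:
        required.append(PROVIDER_MODULES[provider])
    if tokenizer in TOKENIZER_MODULES:
        required.append(TOKENIZER_MODULES[tokenizer])
    # find_spec returns None for a top-level module that cannot be found.
    return [name for name in required if find_spec(name) is None]
```

Choices that need no extra (for example tokenizer unicode61) simply produce an empty list.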

Why the gap exists (hypothesis, not verified in code)

The interactive wizard walks through steps and _collect_missing_extras runs at the appropriate step to surface an install-type-aware hint. The -y path short-circuits to the config writer using the flag values as given, with no call into _collect_missing_extras at all.

This is an architectural consistency gap, not a one-line bug — the remediation is to route the -y path through the same extras check the interactive wizard does, or to factor out a validation function both entry points call.

Handled by this issue

  • Route -y non-interactive flag handling through the same extras-availability check the interactive wizard already has.
  • Behavior on failure: failing with a clear error (consistent with mm web, which exits loudly when fastapi is missing) is preferred over "warn and continue" for non-interactive flows, since -y is typically used in automation, where a stale config is worse than a failed init.
  • No change to the interactive path's current behavior.
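A minimal sketch of the "validate first, fail loudly" shape described in the list above, assuming a shared validation function that the -y path calls before writing anything. Every name here (REQUIRED_MODULE, validate_extras, init_noninteractive, write_config) is hypothetical, and the real check would presumably reuse _collect_missing_extras and add the install-type-aware fix command, which this sketch deliberately leaves out.

```python
import importlib.util
import sys

# Hypothetical lookup: (flag, value) -> module that must be importable.
REQUIRED_MODULE = {
    ("provider", "onnx"): "fastembed",
    ("provider", "ollama"): "ollama",
    ("provider", "openai"): "openai",
    ("tokenizer", "kiwipiepy"): "kiwipiepy",
}


def validate_extras(provider, tokenizer, find_spec=importlib.util.find_spec):
    """Return one human-readable problem per missing module (empty list if none)."""
    problems = []
    for flag, value in (("provider", provider), ("tokenizer", tokenizer)):
        module = REQUIRED_MODULE.get((flag, value))
        if module is not None and find_spec(module) is None:
            problems.append(f"--{flag} {value} requires '{module}', which is not importable")
    return problems


def init_noninteractive(provider, tokenizer, write_config, find_spec=importlib.util.find_spec):
    """Validate before writing; return a process exit code (0 ok, 1 missing extras)."""
    problems = validate_extras(provider, tokenizer, find_spec)
    if problems:
        for problem in problems:
            print(f"error: {problem}", file=sys.stderr)
        return 1  # config is never written, so no stale config is left behind
    write_config({"provider": provider, "tokenizer": tokenizer})
    return 0
```

Returning an exit code rather than raising keeps the sketch easy to call from both a CLI entry point and a test; the interactive wizard could call validate_extras on its own and keep its current warn-and-hint behavior.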

Explicitly out of scope

Acceptance criteria

  • mm init -y --provider onnx on a fresh uv tool install memtomem (no extras) exits non-zero with a clear message pointing at the missing extra and the install-type-aware fix command.
  • mm init -y --tokenizer kiwipiepy on the same setup exits non-zero similarly.
  • The interactive wizard path is unchanged.
  • Tests: add CliRunner coverage for the -y --provider <x> and -y --tokenizer <x> paths under missing-extras conditions. Use the _isatty / find_spec seam established in #379 (feat(cli): add mm uninstall for state cleanup separate from binary removal; see feedback_clirunner_isatty_seam.md) for reliable patching.
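The find_spec half of that test seam might be patched roughly as follows. This is a dependency-free sketch using only unittest.mock: extra_available is a stand-in for whatever check the CLI ends up calling, fake_find_spec_missing is a hypothetical helper, and the CliRunner invocation and the _isatty patch from #379 are not reproduced here.

```python
import importlib.util
from unittest import mock


def extra_available(module_name):
    """Stand-in for the real check: True if the module can be located."""
    return importlib.util.find_spec(module_name) is not None


def fake_find_spec_missing(*missing):
    """Build a find_spec replacement that hides only the named modules."""
    real_find_spec = importlib.util.find_spec  # captured before any patching

    def _fake(name, package=None):
        if name in missing:
            return None  # simulate "extra not installed"
        return real_find_spec(name, package)

    return _fake


def test_reports_missing_extra():
    fake = fake_find_spec_missing("fastembed")
    with mock.patch("importlib.util.find_spec", fake):
        assert extra_available("fastembed") is False
        # Modules not named in the fake still resolve through the real lookup.
        assert extra_available("json") is True
```

Hiding only the named modules, rather than returning None for everything, keeps unrelated lookups working while the patch is active.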

Related

Fact-check evidence

See this comment on #391 for the full fresh-install reproduction log that exposed this gap.

Labels: enhancement