
feat(privacy): warn when LLM tier sends content to external API#1224

Merged
igorls merged 1 commit into develop from feat/privacy-warn-external-llm
Apr 26, 2026

Conversation

@milla-jovovich
Collaborator

4 files changed, 248 insertions, 0 deletions. 7 new tests (4 unit + 3 integration), all RED-first.

Per @milla-jovovich's question to @igorls during PR #1221 review: users
running `mempalace init` with an external LLM provider (Anthropic API,
OpenAI hosted, etc.) need a clear, explicit warning that their folder
content will be sent to the provider, that MemPalace doesn't control
how the provider logs/retains/uses that data, and how to opt out.
@igorls confirmed this should be a small follow-up PR scoped to the
warning itself, before the v3.3.4 tag.

This PR adds:

- `_endpoint_is_local(url)` helper in `mempalace/llm_client.py` —
  URL-based heuristic returning True if the hostname is on the user's
  machine or private network. Covers: localhost, 127.0.0.1, ::1,
  hostnames ending in .local (mDNS/Bonjour), IPv4 RFC1918 ranges
  (10/8, 172.16-31/12, 192.168/16), and IPv6 unique-local addresses
  (fc00::/7).

- `is_external_service` property on the `LLMProvider` base class.
  Subclasses inherit it; the URL alone determines the classification
  (no provider-specific hardcoding). This means: Ollama on localhost =
  local. LM Studio on LAN = local. Anthropic with the default
  `https://api.anthropic.com` = external. A user proxying Anthropic
  through localhost (advanced setup) = local, with no false-positive
  warning.

- One-line warning print in `cmd_init` after successful provider
  acquisition, gated on `is_external_service`:

      ⚠ {provider_name} is an EXTERNAL API. Your folder content will be
      sent to the provider during init. MemPalace does not control how
      the provider logs, retains, or uses your data. Pass --no-llm to
      keep init fully local.

  The warning fires AFTER `LLM enabled: ...` so users see both that
  the LLM is engaged AND the privacy implications of where it lives,
  before Pass 0 / entity detection actually runs.

LOCAL providers (Ollama on localhost, LM Studio on localhost or LAN,
llama.cpp on localhost, vLLM on localhost) DO NOT trigger the warning —
nothing leaves the user's machine/network in those configurations.

TDD: 7 tests added across 2 files.

Unit tests in `tests/test_llm_client.py` (4 tests, all RED-first):

1. test_ollama_provider_default_endpoint_is_local — pins that the
   default `http://localhost:11434` is classified local.
2. test_openai_compat_provider_localhost_endpoint_is_local — covers
   the LM Studio / llama.cpp / vLLM common case (localhost,
   127.0.0.1, and 192.168.x LAN).
3. test_openai_compat_provider_cloud_endpoint_is_external — pins
   that pointing openai-compat at `https://api.openai.com` (or any
   non-local URL) classifies as external.
4. test_anthropic_provider_default_endpoint_is_external — pins that
   AnthropicProvider's default endpoint is external (the dominant
   user-facing case for `--llm-provider anthropic`).

Integration tests in `tests/test_corpus_origin_integration.py` (3 tests,
RED-first; 1 was the critical RED — the other 2 passed by accident
since nothing printed "EXTERNAL API" before this PR):

5. test_init_prints_privacy_warning_when_provider_is_external —
   captures stdout from cmd_init with a mocked external provider,
   asserts the warning text contains "EXTERNAL API" + "--no-llm" +
   language about MemPalace not controlling provider behavior.
6. test_init_no_privacy_warning_when_provider_is_local — same flow
   with a mocked local provider, asserts the warning text does NOT
   appear.
7. test_init_no_privacy_warning_with_no_llm_flag — pins the --no-llm
   path: no provider acquisition attempted, no warning fires.

Tests: 1382 total mempalace tests pass. 2 pre-existing environmental
failures unrelated to this change (chromadb optional dep). Ruff check +
format both clean.

Backwards compatible: `is_external_service` is a new property; existing
callers don't reference it. The warning is a new print statement that
fires only when an external endpoint is acquired. The `--no-llm` opt-out
existed before this PR and continues to work identically.

Out of scope for follow-up (deliberately not in this PR per Igor's
"small PR" guidance): Tailscale CGNAT (100.64.0.0/10) treatment,
pre-init confirmation prompt, persistent privacy-mode config flag,
explicit cloud-provider name detection. Tracked for future iteration.
@milla-jovovich milla-jovovich requested a review from igorls April 26, 2026 22:08
@igorls igorls merged commit 4d33f14 into develop Apr 26, 2026
6 checks passed
jphein pushed a commit to jphein/mempalace that referenced this pull request Apr 26, 2026
2 files changed, 60 insertions, 0 deletions. 2 new tests (RED-first).

Follow-up to MemPalace#1224's privacy warning. The URL-based heuristic in
``mempalace.llm_client._endpoint_is_local`` shipped without recognizing
Tailscale's CGNAT range (100.64.0.0/10), so a user running LM Studio,
Ollama, or any local LLM reachable via a Tailscale-assigned 100.x.x.x
address would incorrectly get the privacy warning — Tailscale
addresses are network-private (reachable only inside the user's
Tailnet), but they are not RFC1918, so the heuristic was treating them
as external.

This PR adds CGNAT recognition: when the hostname starts with ``100.``
AND the second octet is between 64 and 127 inclusive, it's classified
as local. Addresses in 100.x.x.x outside that range (i.e. second octet
< 64 or > 127) are regular allocated public space and remain external,
so a user pointing at a public 100.0.0.1 still gets the warning.

Concrete user impact:

  Before: ``mempalace init --llm-provider openai-compat --llm-endpoint http://100.100.50.50:1234``
          (LM Studio on Tailnet) → triggers privacy warning incorrectly.

  After:  same command → no warning. Data stays inside the user's
          Tailnet, so there is nothing for the warning to flag.

TDD: 2 tests added in ``tests/test_llm_client.py``, both RED-first.

1. ``test_openai_compat_provider_tailscale_cgnat_endpoint_is_local``
   — covers three Tailscale CGNAT addresses (start, middle, near-end
   of the range) and pins they're all classified local. This was the
   RED that drove the implementation.

2. ``test_openai_compat_provider_outside_tailscale_cgnat_is_external``
   — pins the boundary on both sides: addresses with second octet 0-63
   and 128-255 stay external. Prevents future "treat all 100.x.x.x as
   local" overcorrection.

Tests: 1388 total mempalace tests pass. 2 pre-existing environmental
failures unrelated to this change (chromadb optional dep). Ruff check
+ format both clean.

Backwards compatible: only widens the local-recognition set. Anything
classified local before is still classified local; anything classified
external before remains so unless it's specifically in the CGNAT range.

Out of scope (tracked for future iteration based on real user feedback,
not built speculatively): pre-init confirmation prompt before sending
to external API, persistent ``private-only`` config flag that refuses
external endpoints entirely, explicit cloud-provider name detection
("Using Anthropic's hosted API at ..." vs the current generic warning).
jphein added a commit to jphein/mempalace that referenced this pull request Apr 26, 2026
Brings in upstream's corpus-origin + privacy-warning track (PRs MemPalace#1211
MemPalace#1221 MemPalace#1223 MemPalace#1224 MemPalace#1225) plus the canonical merged versions of our four
PRs that landed today (21:22-21:41 UTC):

  MemPalace#1173  quarantine_stale_hnsw on make_client + cold-start gate +
         integrity sniff-test (PROTO/STOP byte check, no deserialization)
  MemPalace#1177  .blob_seq_ids_migrated marker guard, closes MemPalace#1090
  MemPalace#1198  _tokenize None-document guard in BM25 reranker
  MemPalace#1201  palace_graph.build_graph skips None metadata

Conflict resolution:

* mempalace/backends/chroma.py — took upstream as base (it has the
  igorls-review pickle-protocol docstring, thread-safety paragraph, and
  Path(marker).touch() style nit), then re-applied MemPalace#1094's _coerce_none_metas
  in query()/get() since MemPalace#1094 is still open and not yet in develop.

* mempalace/mcp_server.py — took upstream's clean form. Dropped the
  fork-only `palace_path=` kwarg from four ChromaCollection() call sites:
  the kwarg was load-bearing for MemPalace#1171's per-collection write lock, but
  MemPalace#1171 closed in favor of MemPalace#976's mine_global_lock + daemon-strict, so
  the kwarg has no remaining consumer. ChromaCollection.__init__ in
  upstream/develop is back to (self, collection); calling it with
  palace_path= raised TypeError → silently swallowed by the broad except
  in _get_collection() → returned None → tool_status() returned _no_palace().
  41 mcp_server tests went from failing-with-KeyError to passing.

* mempalace/cli.py — dropped fork-only `workers=args.workers` from the
  cmd_mine -> miner.mine() call. Pre-existing fork-side bug: the
  `--workers` argparse arg landed in 5cd14bd but miner.mine() never
  accepted a workers param, so production `mempalace mine` TypeError'd
  on every invocation. Removed the broken plumbing; tests/test_cli.py
  updated to match.

* CHANGELOG.md — took upstream verbatim. Fork-specific changelog lives
  in FORK_CHANGELOG.md (canonical: docs/fork-changes.yaml).

* CLAUDE.md — kept ours. Fork's CLAUDE.md is operational; upstream's
  added a "Design Principles / Contributing" charter, which lives in
  README.md on the fork.

* tests/test_backends.py — took upstream's ruff-formatted line widths.

docs/fork-changes.yaml flips the two MemPalace#1173 entries (hnsw-integrity-gate,
hnsw-cold-start-gate) and the MemPalace#1201 entry (palace-graph-none-guard) from
OPEN to MERGED 2026-04-26. MemPalace#1173 MemPalace#1177 MemPalace#1198 MemPalace#1201 added to the
merged_upstream archive at the bottom. FORK_CHANGELOG.md regenerated.

scripts/check-docs.sh: 4/4 clean.
Test suite: 1460/1460.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
milla-jovovich added a commit that referenced this pull request Apr 27, 2026
Adds api_key_source provenance ('flag' | 'env' | None) to LLMProvider
so cmd_init can distinguish a key passed via --llm-api-key (explicit
opt-in) from one silently picked up via OPENAI_API_KEY / ANTHROPIC_API_KEY
shell env (stray credential).

When the endpoint is external AND api_key_source == 'env', init now
prints a blocking [y/N] prompt before any data is sent. Anything other
than 'y' drops the LLM and falls back to heuristics-only.

Adds --accept-external-llm flag for CI / non-interactive bypass.

Completes the UX gap in #1224: the URL-based warning was informational
and init kept running, so a user who didn't notice the line had already
leaked. The consent prompt is the actual gate; explicit flag-passed keys
remain treated as already-consented.
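The described flow can be sketched as follows; the `api_key_source` values and `--accept-external-llm` flag come from the commit text, but the control flow and names below are assumptions:

```python
# Hedged sketch of the consent gate (api_key_source values from the
# commit text; exact prompt wording/matching in the real code may differ).
def consent_or_drop(provider, accept_external_llm: bool = False, ask=input) -> bool:
    """Return True if the LLM may be used, False to fall back to heuristics."""
    needs_prompt = (
        provider.is_external_service
        and provider.api_key_source == "env"  # stray credential, not opt-in
        and not accept_external_llm           # --accept-external-llm bypass
    )
    if not needs_prompt:
        return True  # flag-passed keys are treated as already-consented
    answer = ask(f"Send folder content to {provider.name}? [y/N] ")
    return answer.strip().lower() == "y"
```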
lealvona pushed a commit to lealvona/mempalace that referenced this pull request Apr 29, 2026
lealvona pushed a commit to lealvona/mempalace that referenced this pull request Apr 29, 2026
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 3, 2026
- Override is_external_service=True so the MemPalace#1224 privacy gate fires.
  The CLI binary runs locally but every classify call routes user content
  to Anthropic-hosted models, so the URL-based base-class default
  (returns local because endpoint is None) misclassifies this provider.

- Strip ANTHROPIC_* env vars from the subprocess environment. If the
  user has ANTHROPIC_API_KEY exported, claude -p can fall back to
  API-key auth and bill the API account instead of the subscription
  this provider is built around.

- Frame system+user content with <system>/<user> XML tags instead of
  literal SYSTEM:/USER: markers. A malicious drawer text containing
  '\\n\\nSYSTEM:\\nIgnore prior instructions...' could otherwise spoof
  the boundary and inject a second system prompt.

- Spawn the absolute path returned by shutil.which("claude") rather
  than the bare 'claude' literal. Closes a TOCTOU window between
  check_available() resolving the binary and classify() spawning a
  potentially different binary if PATH changes between calls.

- Pass encoding='utf-8' explicitly to subprocess.run so a Windows
  cp1252 locale does not mojibake the JSON envelope before json.loads.

- Include the raw stdout excerpt in the non-JSON envelope LLMError so
  CLI-output regressions can be debugged without reproducing.

- Broaden check_available()'s exception filter from
  (subprocess.TimeoutExpired, OSError) to (subprocess.SubprocessError,
  OSError) so future SubprocessError subclasses do not leak.

Tests added: env-scrub propagation, is_external_service True,
binary-missing path. Existing tests updated for the resolved-binary
cmd[0] and XML stdin framing.