feat(sensitivity): add sensitivity tagging with pattern-based auto-classification #85

Merged: Siddhant-K-code merged 1 commit into main from feat/sensitivity-tagging on May 9, 2026

@Siddhant-K-code

Owner

What

Adds a sensitivity classification layer to Distill's memory store. Entries can be tagged explicitly or auto-classified by pattern when they are stored, and recall responses return sensitivity metadata alongside the content.

Why

Distill sits at the right point in the pipeline to detect sensitive content before it reaches the model. Today, when an agent retrieves memory and then calls an external tool, nothing indicates what the retrieved context contains. This PR adds that signal: callers can use max_sensitivity to make authorization decisions before dispatching tool calls.

Changes

New package: pkg/sensitivity

  • Level enum: None (0), PII (1), InternalIP (2), Credentials (3)
  • Pattern-based Classifier with built-in detection for:
    • Credentials: AWS keys (AKIA...), OpenAI keys (sk-...), GitHub tokens (ghp_...), Slack tokens (xox...), generic password=/secret= patterns
    • PII: email addresses, phone numbers, credit card numbers, SSNs
    • InternalIP: configurable internal domain suffixes (.internal, .corp, .local)
  • Classify(text) and ClassifyBatch(texts) methods
  • All classification is synchronous, no LLM calls
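
To make the list above concrete, here is a minimal sketch of a pattern-based classifier in Go. It is not the code in pkg/sensitivity: the regular expressions, the `rule` type, the `NewClassifier` constructor, and the string names returned by `String()` are all assumptions chosen for illustration, and it assumes that `Classify` returns the highest level matched by any pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// Level follows the ordering given in the PR: a higher value is more sensitive.
type Level int

const (
	None Level = iota
	PII
	InternalIP
	Credentials
)

// String returns a label for the level (the label text is an assumption).
func (l Level) String() string {
	switch l {
	case PII:
		return "pii"
	case InternalIP:
		return "internal_ip"
	case Credentials:
		return "credentials"
	default:
		return "none"
	}
}

// rule pairs a compiled pattern with the level it signals (hypothetical type).
type rule struct {
	level Level
	re    *regexp.Regexp
}

// Classifier holds the rule set. The patterns below are simplified stand-ins,
// not the patterns shipped in the PR.
type Classifier struct {
	rules []rule
}

func NewClassifier() *Classifier {
	return &Classifier{rules: []rule{
		{Credentials, regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`)},
		{Credentials, regexp.MustCompile(`\bsk-[A-Za-z0-9_-]{6,}`)},
		{Credentials, regexp.MustCompile(`\bghp_[A-Za-z0-9]{20,}`)},
		{InternalIP, regexp.MustCompile(`\b[a-z0-9-]+\.(internal|corp|local)\b`)},
		{PII, regexp.MustCompile(`\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`)},
		{PII, regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)},
	}}
}

// Classify returns the highest level whose pattern matches the text.
// Rules that cannot raise the current level are skipped without matching.
func (c *Classifier) Classify(text string) Level {
	highest := None
	for _, r := range c.rules {
		if r.level > highest && r.re.MatchString(text) {
			highest = r.level
		}
	}
	return highest
}

// ClassifyBatch classifies each text independently, preserving input order.
func (c *Classifier) ClassifyBatch(texts []string) []Level {
	out := make([]Level, len(texts))
	for i, t := range texts {
		out[i] = c.Classify(t)
	}
	return out
}

func main() {
	c := NewClassifier()
	fmt.Println(c.Classify("API key: sk-proj-abc123"))   // credentials
	fmt.Println(c.Classify("contact alice@example.com")) // pii
	fmt.Println(c.Classify("plain text"))                // none
}
```

Compiling the patterns once in the constructor keeps each `Classify` call to plain regular-expression matching, which is consistent with the synchronous, no-LLM design described above.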

Memory integration

  • StoreEntry.Sensitivity — explicit tag at write time
  • StoreEntry.AutoClassify — triggers pattern-based classification on write
  • RecallResult.MaxSensitivity — highest level across all returned memories
  • RecallResult.SensitiveChunks — which memories triggered it
  • RecalledMemory.Sensitivity — per-entry sensitivity level
  • Sensitivity metadata does not affect deduplication or retrieval ranking
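
The recall-side fields can be sketched as a single pass over the returned memories. The struct shapes and the `Summarize` helper below are hypothetical, not the types in Distill, and the sketch assumes that SensitiveChunks lists every entry above None (the PR does not say whether it lists only entries at the maximum level). The input order is left untouched, in line with ranking being unaffected.

```go
package main

import "fmt"

// Level follows the ordering given in the PR: a higher value is more sensitive.
type Level int

const (
	None Level = iota
	PII
	InternalIP
	Credentials
)

// RecalledMemory is a hypothetical, reduced shape of one recalled entry.
type RecalledMemory struct {
	ChunkID     string
	Text        string
	Sensitivity Level
}

// SensitiveChunk identifies an entry that carries a non-None level.
type SensitiveChunk struct {
	ChunkID     string
	Sensitivity Level
}

// Summarize returns the highest level across the memories and the entries
// that are above None. It never reorders or filters the memories themselves.
func Summarize(memories []RecalledMemory) (Level, []SensitiveChunk) {
	highest := None
	var chunks []SensitiveChunk
	for _, m := range memories {
		if m.Sensitivity == None {
			continue
		}
		chunks = append(chunks, SensitiveChunk{ChunkID: m.ChunkID, Sensitivity: m.Sensitivity})
		if m.Sensitivity > highest {
			highest = m.Sensitivity
		}
	}
	return highest, chunks
}

func main() {
	memories := []RecalledMemory{
		{ChunkID: "a", Text: "release notes", Sensitivity: None},
		{ChunkID: "b", Text: "host db.internal", Sensitivity: InternalIP},
		{ChunkID: "c", Text: "alice@example.com", Sensitivity: PII},
	}
	highest, chunks := Summarize(memories)
	fmt.Println(int(highest), len(chunks)) // 2 2
}
```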

Tests

  • 17 classifier unit tests (patterns, false positives, batch, string representation)
  • 4 benchmarks — all well under 1ms:
    • Short text: ~10µs
    • Text with matches: ~14µs
    • Long text (2000 chars): ~413µs
    • Batch of 10: ~191µs
  • 7 memory integration tests (explicit, auto-classify, no-match, override, multi-entry, ranking unaffected)

Usage

# Store with explicit sensitivity (2 = InternalIP in the Level enum)
curl -X POST localhost:8080/v1/memory/store -d '{
  "entries": [{
    "text": "Q3 pricing: customer A at $120k",
    "sensitivity": 2
  }]
}'

# Store with auto-classification
curl -X POST localhost:8080/v1/memory/store -d '{
  "entries": [{
    "text": "API key: sk-proj-abc123...",
    "auto_classify": true
  }]
}'

# Recall — sensitivity metadata in response
curl -X POST localhost:8080/v1/memory/recall -d '{"query": "pricing"}'
# Response includes:
#   "max_sensitivity": 2,
#   "sensitive_chunks": [{"chunk_id": "...", "sensitivity": 2}]
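
On the caller side, the recall response can gate tool dispatch. The sketch below decodes only the fields shown in the response above (max_sensitivity, sensitive_chunks, chunk_id, sensitivity). The `AllowExternalTool` helper and the threshold policy are assumptions for illustration and are not part of Distill.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SensitiveChunk mirrors one element of "sensitive_chunks" in the response.
type SensitiveChunk struct {
	ChunkID     string `json:"chunk_id"`
	Sensitivity int    `json:"sensitivity"`
}

// RecallResponse decodes only the sensitivity metadata; any other fields in
// the real response are ignored by json.Unmarshal.
type RecallResponse struct {
	MaxSensitivity  int              `json:"max_sensitivity"`
	SensitiveChunks []SensitiveChunk `json:"sensitive_chunks"`
}

// AllowExternalTool is a hypothetical policy check: it permits the tool call
// only when the recalled context is at or below the given threshold.
func AllowExternalTool(body []byte, threshold int) (bool, error) {
	var r RecallResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return false, fmt.Errorf("decode recall response: %w", err)
	}
	return r.MaxSensitivity <= threshold, nil
}

func main() {
	body := []byte(`{"max_sensitivity": 2, "sensitive_chunks": [{"chunk_id": "c1", "sensitivity": 2}]}`)
	ok, err := AllowExternalTool(body, 1) // allow at most PII (1)
	fmt.Println(ok, err)                  // false <nil>
}
```

Failing closed on a decode error (returning false) is a deliberate choice in this sketch: if the metadata cannot be read, the caller should not assume the context is safe to forward.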

Closes #82

@Siddhant-K-code added the enhancement and security labels on May 9, 2026
…assification

- New pkg/sensitivity with Classifier, Level enum (None/PII/InternalIP/Credentials)
- Built-in patterns: email, phone, credit card, SSN, AWS keys, OpenAI keys,
  GitHub tokens, Slack tokens, generic secrets
- Configurable internal domain detection (.internal, .corp, .local)
- StoreEntry accepts Sensitivity (explicit) and AutoClassify (pattern-based)
- RecallResult includes MaxSensitivity and SensitiveChunks metadata
- Sensitivity stored in SQLite, does not affect dedup or ranking
- 17 classifier tests + 4 benchmarks (all <1ms)
- 7 memory integration tests for sensitivity propagation

Closes #82

Co-authored-by: Ona <[email protected]>
@Siddhant-K-code force-pushed the feat/sensitivity-tagging branch from 0ad00dd to 21b850b on May 9, 2026 07:18
@Siddhant-K-code merged commit 3da4b03 into main on May 9, 2026
@Siddhant-K-code deleted the feat/sensitivity-tagging branch on May 9, 2026 07:18
Development

Successfully merging this pull request may close these issues.

feat: sensitivity tagging — mark memory entries as sensitive to prevent downstream tool leakage