Add Unicode variation selector detection to content scanner (Glassworm attack vector)

## Context

Follow-up from the excellent community feedback in #312 (comment by @raye-deng): our content scanner currently detects tag characters and bidi overrides, but is missing **Unicode variation selectors** — the specific mechanism used in the Glassworm supply-chain attacks (March 2026) that compromised Wasmer, multiple npm packages, and 72 VS Code extensions.

Variation selectors are particularly insidious because they attach to visible characters: the byte stream carries invisible payload bytes that humans and most diff viewers do not show. AST-based tools (ESLint, SonarQube, Semgrep) typically do not flag them, because the selectors usually sit inside string literals or comments, which parsers treat as opaque content.
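To make the mechanism concrete, here is a minimal sketch of how arbitrary bytes can ride on a single visible character. The byte-to-selector mapping (VS1-16 for byte values 0-15, VS17-256 for 16-255) is one commonly described scheme and is an assumption here, not necessarily the exact encoding used in the Glassworm payloads. `hide` and `reveal` are hypothetical helper names.

```python
# Illustrative sketch only: the mapping below is an assumed encoding.
VS_BMP_START = 0xFE00    # VS1..VS16   carry byte values 0..15
VS_SUPP_START = 0xE0100  # VS17..VS256 carry byte values 16..255


def hide(carrier: str, payload: bytes) -> str:
    """Append one invisible variation selector per payload byte."""
    def selector(byte: int) -> str:
        if byte < 16:
            return chr(VS_BMP_START + byte)
        return chr(VS_SUPP_START + byte - 16)

    return carrier + "".join(selector(b) for b in payload)


def reveal(text: str) -> bytes:
    """Recover the bytes carried by any variation selectors in text."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            out.append(cp - VS_BMP_START)
        elif 0xE0100 <= cp <= 0xE01EF:
            out.append(cp - VS_SUPP_START + 16)
    return bytes(out)


stego = hide("A", b"hello")
print(len("A"), len(stego))   # the carrier grows by one code point per byte
print(reveal(stego))
```

The resulting string typically renders as a plain `A` in editors and diff viewers, which is why detection has to happen at the code point level.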

## Changes

Add the following ranges to the content scanner's `_SUSPICIOUS_RANGES` detection table:

| Range | Name | Severity | Rationale |
|-------|------|----------|-----------|
| U+E0100–E01EF | VS17-256 (SSP, plane 14) | **critical** | No expected legitimate use in prompt files. 240 invisible characters that can encode arbitrary data. |
| U+FE00–FE0D | VS1-14 (BMP) | **warning** | Rare CJK typography variants. Unusual in prompt files. |
| U+FE0E | VS15 (text presentation) | **warning** | Forces text rendering. Uncommon in prompts. |
| U+FE0F | VS16 (emoji presentation) | **info** | Extremely common with emoji — only shown with `--verbose` to avoid noise. |

### Key design decisions

- **VS16 (U+FE0F) is info-level**: Many common emoji include this character (❤️ = ❤ + U+FE0F). Flagging it as warning/critical would generate noise on nearly every prompt file that contains emoji. Info level means it only appears with `--verbose`.
- **No architecture changes**: Extends the existing `_SUSPICIOUS_RANGES` table and `_CHAR_LOOKUP` O(1) dict. No changes needed to the audit command, install security gate, or compile/pack scanning — they all use `ContentScanner` generically.
- **Strip behavior**: `apm audit --strip` will remove warning/info-level variation selectors (VS1-16) but preserve critical ones (VS17-256) for manual review — consistent with existing strip behavior.
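The strip rule described above can be sketched as follows. This is a standalone illustration under stated assumptions: `strip_variation_selectors` and `_severity` are hypothetical names, and the actual `apm audit --strip` code path may be organized differently.

```python
def _severity(ch):
    """Classify a character using the severities from the table above."""
    cp = ord(ch)
    if 0xE0100 <= cp <= 0xE01EF:
        return "critical"
    if 0xFE00 <= cp <= 0xFE0E:
        return "warning"
    if cp == 0xFE0F:
        return "info"
    return None


def strip_variation_selectors(text):
    """Drop warning/info selectors; keep critical ones for manual review.

    Returns the cleaned text and the number of critical selectors left in it.
    """
    kept = []
    critical_remaining = 0
    for ch in text:
        severity = _severity(ch)
        if severity in ("warning", "info"):
            continue  # VS1-16: removed automatically
        if severity == "critical":
            critical_remaining += 1  # VS17-256: preserved and counted
        kept.append(ch)
    return "".join(kept), critical_remaining


cleaned, remaining = strip_variation_selectors("a\uFE0Fb\uFE01c\U000E0100")
print(repr(cleaned), remaining)
```

Note that removing U+FE0F can change how some emoji render (text presentation instead of emoji presentation), which is worth keeping in mind when reviewing stripped files.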

### Scope

- [x] Scanner ranges in `content_scanner.py`
- [x] Unit tests for detection and strip behavior
- [x] End-to-end audit command tests with injected fixtures
- [x] Security documentation update with Glassworm reference
- [x] CHANGELOG entry

Closes via PR.

cc @raye-deng — thank you for the detailed analysis!

Add Unicode variation selector detection to content scanner (Glassworm attack vector) #320
