Add incremental annotation overlay API to avoid DOCX re-conversion#107

Merged
JSv4 merged 9 commits into main from claude/resolve-issue-106-LVxfG on Mar 16, 2026
JSv4 commented Mar 15, 2026

Summary

Implements an incremental annotation overlay API that decouples HTML conversion from annotation projection. Annotations can be added, removed, or modified on already-converted HTML without a full DOCX re-conversion; the benchmark test added in this PR asserts that incremental projection, single add, and single remove are each faster than re-converting.

Key Changes

Core API Methods (C#)

  • ProjectAnnotationsOntoHtml() - Projects a complete annotation set onto pre-converted HTML
  • AddAnnotationToHtml() - Adds a single annotation to existing HTML with optional label styling
  • RemoveAnnotationFromHtml() - Removes annotations by ID, unwrapping spans back to plain text
  • GenerateVisibilityCss() - Generates CSS to hide/show annotations by label ID without re-rendering
  • GenerateAnnotationCss() - Generates annotation styling CSS independently from HTML content

TypeScript/JavaScript Bindings

  • Exported all five new methods as async functions with proper error handling
  • Added comprehensive JSDoc documentation with usage examples
  • Integrated with existing ExternalAnnotationProjectionSettings and AnnotationLabel types

Implementation Details

  • Refactored CSS generation logic: extracted BuildAnnotationCssString() to enable CSS-only generation
  • Added AddSingleAnnotationCss() helper for per-annotation styling in incremental workflows
  • All methods validate inputs and return consistent error responses via SerializeError()
  • Maintains compatibility with existing annotation projection infrastructure

Testing

  • Added 6 new unit tests covering:
    • Basic projection onto HTML
    • Single annotation addition
    • Annotation removal with text preservation
    • Visibility CSS generation
    • CSS generation for label sets
    • Multi-annotation workflows with selective removal

Type System Updates

  • Added CssResponse type for CSS generation endpoints
  • Updated DocxodusJsonContext to serialize new response types
  • Extended DocxodusWasmExports interface with new method signatures

Workflow Benefits

Users can now:

  1. Convert DOCX to HTML once (expensive operation)
  2. Cache the base HTML
  3. Overlay/modify annotations on cached HTML (fast, no re-conversion)
  4. Toggle annotation visibility via CSS (instant, no re-rendering)

This is particularly valuable for collaborative annotation scenarios where multiple users modify annotations on the same document.
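The convert-once, overlay-many workflow can be sketched as a small cache. Everything here is hypothetical: `convert` stands in for the expensive DOCX-to-HTML call and `project` for the overlay call, and both are synchronous stand-ins to keep the sketch short (the real bindings are async, so a real caller would `await` them).

```typescript
// Hypothetical names throughout; this is a sketch of the caching pattern,
// not the library's API.
type OverlayAnnotation = { id: string; labelId: string; text: string };

function createOverlayRenderer(
  convert: (docId: string) => string,
  project: (html: string, annotations: OverlayAnnotation[]) => string,
) {
  const baseHtmlByDoc = new Map<string, string>();
  let conversions = 0;

  return {
    // Converts a document at most once, then re-projects annotations
    // onto the cached base HTML on every call.
    render(docId: string, annotations: OverlayAnnotation[]): string {
      let baseHtml = baseHtmlByDoc.get(docId);
      if (baseHtml === undefined) {
        baseHtml = convert(docId);
        conversions += 1;
        baseHtmlByDoc.set(docId, baseHtml);
      }
      return project(baseHtml, annotations);
    },
    // Exposed so callers (or tests) can confirm the cache is being hit.
    conversionCount: (): number => conversions,
  };
}
```

The cached value is always the un-annotated base HTML, so each render starts from a clean document rather than stacking annotations on top of a previous overlay.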

https://claude.ai/code/session_01EQQ8N9xQoSSogqhsXWn3sF

claude and others added 5 commits March 15, 2026 20:12
Decouple HTML conversion from annotation projection to avoid full WASM
re-conversion when annotations change. New API enables:
- ProjectAnnotationsOntoHtml: overlay annotations on cached HTML
- AddAnnotationToHtml: add single annotation without re-conversion
- RemoveAnnotationFromHtml: remove annotation by ID, preserving text
- GenerateVisibilityCss: CSS-based label toggling without re-rendering
- GenerateAnnotationCssString: independent CSS generation

All methods available across .NET, WASM (JSExport), and npm/TS layers.

https://claude.ai/code/session_01EQQ8N9xQoSSogqhsXWn3sF

The offset-based annotation creation was fragile when document content
text differed from the raw input string. Using CreateAnnotationFromSearch
and separate paragraphs ensures reliable text matching.
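As a rough illustration of why search-based lookup is less fragile than hard-coded offsets, the sketch below derives a character range from the text actually extracted from the document. `findOccurrence` is a hypothetical helper; the signature of `CreateAnnotationFromSearch` is not shown in this PR and is not assumed here.

```typescript
// Hypothetical helper: returns the character range of the n-th (0-based)
// occurrence of `needle` in `haystack`, or null if there is no such match.
function findOccurrence(
  haystack: string,
  needle: string,
  occurrence = 0,
): { start: number; end: number } | null {
  if (needle.length === 0) return null;
  let from = 0;
  for (let i = 0; ; i++) {
    const start = haystack.indexOf(needle, from);
    if (start === -1) return null;
    if (i === occurrence) return { start, end: start + needle.length };
    // Continue searching after the current match.
    from = start + needle.length;
  }
}
```

The range is computed against whatever text the document yields, so it stays valid even when that text differs from the raw input string (for example, when paragraphs add separators).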

https://claude.ai/code/session_01EQQ8N9xQoSSogqhsXWn3sF

After projecting the first annotation (e.g. wrapping "Alpha" with a label
span containing "A"), the text map was rebuilt but htmlText was not. The
label text "A" shifted all subsequent offsets, causing the second annotation
("Beta") to wrap the wrong character range.

Fix: GetTextNodes now skips already-projected annotation wrappers (elements
with data-annotation-id), and both textMap and htmlText are rebuilt each
iteration. This prevents label text from polluting the offset calculation
for subsequent annotations.
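The fix can be sketched on a toy node tree. This is an illustration under assumptions, not the C# code: the `HtmlNode` type and `buildTextMap` function are hypothetical, and only the `data-annotation-id` attribute comes from the commit message. The point is that the text and its map are produced by the same walk, and that walk ignores wrappers that were already projected.

```typescript
// Toy DOM model for illustration only.
type HtmlNode =
  | { kind: "text"; value: string }
  | { kind: "element"; attrs: Record<string, string>; children: HtmlNode[] };

interface TextMap {
  text: string; // concatenated searchable text
  entries: { value: string; start: number }[]; // each text node and its offset
}

// Builds the text and the offset map in one pass so they cannot disagree.
// With skipProjected = true, elements carrying data-annotation-id are
// skipped entirely, so injected label text never enters the offsets.
function buildTextMap(root: HtmlNode, skipProjected = true): TextMap {
  const map: TextMap = { text: "", entries: [] };
  const visit = (node: HtmlNode): void => {
    if (node.kind === "text") {
      map.entries.push({ value: node.value, start: map.text.length });
      map.text += node.value;
      return;
    }
    if (skipProjected && "data-annotation-id" in node.attrs) return;
    node.children.forEach(visit);
  };
  visit(root);
  return map;
}
```

Calling `buildTextMap` again after each projection mirrors the "rebuilt each iteration" part of the fix: the next annotation is located against text that reflects the current tree.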

JSv4 linked an issue on Mar 16, 2026 that may be closed by this pull request

JSv4 added 4 commits March 15, 2026 22:08

…rsion

Measures and reports wall-clock time for:
- Full DOCX → HTML with external annotations (re-parses DOCX every time)
- Incremental projection (convert DOCX once, project annotations on cached HTML)
- Single annotation add/remove on existing HTML

Reports honest results — no assertion that one is faster than the other.
The numbers tell the truth.

The benchmark test now asserts that incremental projection, single add,
and single remove are all faster than full DOCX re-conversion. The exact
speedup doesn't matter — just that it's faster.
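A wall-clock comparison of this kind can be sketched as follows. `meanMillis` is a hypothetical helper, not the benchmark from this PR; the injectable `now` clock is an assumption added so the helper can be tested deterministically.

```typescript
// Hypothetical timing helper: returns the mean wall-clock milliseconds per
// call of `fn` over `iterations` runs. `now` defaults to Date.now and can be
// replaced with a fake clock in tests.
function meanMillis(
  fn: () => void,
  iterations: number,
  now: () => number = Date.now,
): number {
  if (iterations <= 0) throw new Error("iterations must be positive");
  const start = now();
  for (let i = 0; i < iterations; i++) fn();
  return (now() - start) / iterations;
}
```

A "faster than" assertion then reduces to comparing two means, for example the mean for incremental projection against the mean for full re-conversion, without pinning an exact speedup.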

C# JSON serializer produces PascalCase (Content, LabelledText, TextLabels).
Handle both casings so the test works regardless of serialization convention.
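One way to tolerate both casings is a small accessor that tries the camelCase name first and falls back to PascalCase. This is a sketch of the approach, not the code in the test; `readField` is a hypothetical name.

```typescript
// Hypothetical helper: reads a property by its camelCase name, falling back
// to the PascalCase spelling that a C# serializer may emit.
function readField<T>(obj: Record<string, unknown>, camelName: string): T | undefined {
  const pascalName = camelName.charAt(0).toUpperCase() + camelName.slice(1);
  if (camelName in obj) return obj[camelName] as T;
  if (pascalName in obj) return obj[pascalName] as T;
  return undefined;
}
```

For example, `readField<string>(response, "content")` returns the value whether the payload uses `content` or `Content`.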

- New architecture doc: docs/architecture/incremental_annotation_overlay.md
  covering problem statement, architecture, all API surfaces (.NET/WASM/TS),
  text-search-based projection, offset-drift fix, performance benchmarks,
  usage examples, and limitations

- Updated docs/npm-package.md with External Annotations section covering
  all five TypeScript functions, parameter tables, code examples, and
  performance comparison callout

- Updated CLAUDE.md with ExternalAnnotationProjector module description
  in the Core Modules section

JSv4 merged commit a402c74 into main on Mar 16, 2026
10 checks passed

Development

Successfully merging this pull request may close these issues.

Support incremental annotation overlay to avoid full WASM re-conversion

2 participants