-
Notifications
You must be signed in to change notification settings - Fork 19
Rewrite Zod Schema Parser to Use Pattern Matching #40
Description
Task 3: Rewrite Zod Schema Parser to Use Pattern Matching
Status
- Priority: 3
- Status: Todo
- Effort: Medium (4-6 hours)
- Assigned: Unassigned
Context
The current Zod schema parser uses line-by-line parsing with brace counting to extract field definitions from src/content.config.ts. This approach is fragile and breaks with common formatting patterns like:
coverImage: z
.object({
image: image().optional(),
alt: z.string().optional(),
})
.optional(),Why This Matters
Image fields and reference fields in nested objects don't work correctly. When a field like coverImage.image uses the image() helper, it should render as an ImageField component with upload functionality. Instead, it renders as a plain text input because the parser can't properly identify the image() helper when the definition spans multiple lines.
Root Cause
The Zod parser's job is NOT to parse the entire schema structure (the JSON schema parser handles that). It has exactly two responsibilities:
- Find fields using
image()helper → mark them as Image type - Find fields using
reference()helper → mark them as Reference type
The current line-based parsing approach tries to do too much and is too sensitive to formatting variations.
Proposed Solution
Rewrite the Zod parser to use a two-pass pattern-matching approach:
Pass 1: Find Special Helper Calls
- Scan entire schema text for
image()andreference()occurrences - Extract surrounding context (field name and containing structure)
- Don't try to parse the entire schema - just find the helpers
Pass 2: Resolve Field Paths
- For each helper found, trace backwards through the schema text
- Use brace-level counting to find parent object names
- Build dotted paths (e.g.,
coverImage.image,metadata.author.avatar)
Benefits
- More robust: Works with any formatting (multi-line, inline, etc.)
- Simpler: Focused on actual purpose (find helpers, resolve paths)
- Maintainable: Easier to understand and extend
- Aligned with architecture: JSON schema parser handles structure, Zod parser just augments it
Implementation Details
Current Code Location
- File:
src-tauri/src/parser.rs - Function:
parse_schema_fields()(lines ~420-492) - Related:
process_field_with_parent(),parse_nested_object(),parse_object_field()
Suggested Approach
-
Create new pattern matching functions:
fn find_helper_calls(schema_text: &str, helper_name: &str) -> Vec<HelperMatch> { // Find all occurrences of helper_name() with position info } fn resolve_field_path(schema_text: &str, helper_position: usize) -> String { // Trace backwards from helper position through brace levels // Return dotted path like "coverImage.image" } fn extract_zod_special_fields(schema_text: &str) -> Vec<ZodField> { // Main function: find image() and reference() calls, resolve paths }
-
Replace line-based parsing:
- Remove
parse_schema_fields()complexity - Keep the regex-based schema extraction (finding the schema object itself)
- Replace field parsing with pattern matching
- Remove
-
Handle edge cases:
- Deeply nested fields (3+ levels)
- Arrays of objects:
gallery: z.array(z.object({ src: image() })) - Comments between field names and definitions
- Inline vs. multi-line formatting
-
Maintain backwards compatibility:
- Top-level image fields must continue working:
heroImage: image() - Reference fields must work the same way
- All existing tests must pass
- Top-level image fields must continue working:
Testing Strategy
Unit Tests to Update
test_image_helper_detection- should still passtest_reference_helper_detection- should still passtest_nested_object_with_image_fields- should pass with new approach
New Tests to Add
- Multi-line image field definitions
- Deeply nested image fields (3+ levels)
- Arrays with image fields
- Mixed formatting (some inline, some multi-line)
- Comments in field definitions
Integration Testing
- Test with
test/dummy-astro-project/src/content.config.ts - Verify nested image fields render as ImageField components
- Verify existing non-nested fields still work
- Check both notes collection (has nested coverImage) and articles collection
Risks and Mitigations
Risk 1: Breaking Existing Functionality
- Impact: High
- Mitigation: Comprehensive test suite, test with real project schemas
- Validation: All existing tests must pass before merging
Risk 2: Complex Nested Structures
- Impact: Medium
- Mitigation: Design path resolution to handle arbitrary nesting depth
- Validation: Test with 3+ level nesting
Risk 3: Formatting Edge Cases
- Impact: Low
- Mitigation: Document supported formatting patterns, focus on common Prettier output
- Validation: Test with various formatting styles
Risk 4: Future Extensibility
- Impact: Low
- Mitigation: Make helper detection parameterized (pass in list of helpers to find)
- Validation: Easy to add new helpers in the future
Success Criteria
- ✅ Nested image fields render as ImageField components (not text inputs)
- ✅ All existing Zod parser tests pass
- ✅ Multi-line field definitions work correctly
- ✅ Reference fields continue working
- ✅ Top-level image fields continue working
- ✅ Code is cleaner and easier to understand than before
- ✅ Performance is comparable or better (pattern matching is O(n), line parsing was also O(n))
Dependencies
- None (self-contained refactor)
Related Files
src-tauri/src/parser.rs- Main implementationsrc-tauri/src/schema_merger.rs- Uses Zod parser outputtest/dummy-astro-project/src/content.config.ts- Test schemadocs/tasks-done/task-2-images-in-frontmatter.md- Previous related work
Notes
This task was created after attempting to fix nested image field support by improving the line-based parser. The approach became increasingly complex and fragile. The pattern-matching rewrite is a better architectural fit because:
- It aligns with the actual purpose of the Zod parser (find special cases, not parse everything)
- It leverages the JSON schema parser for structure (single source of truth)
- It's more robust to formatting variations
- It's easier to maintain and extend
The immediate benefit is nested image fields working correctly. The longer-term benefit is a more maintainable codebase that's easier to extend when new Astro/Zod helpers need special handling.