Summary
The Longest() method is currently a no-op stub. For full stdlib regexp compatibility, it should switch matching semantics from leftmost-first to leftmost-longest.
Current behavior
re := coregex.MustCompile(`(#|#!)`)
re.Longest()
result := re.ReplaceAllString("#!a", "")
// Returns: "!a" (leftmost-first, matches "#")
// Expected: "a" (leftmost-longest, matches "#!")
Expected behavior (stdlib compatible)
// Default: leftmost-first (Perl semantics)
re := regexp.MustCompile(`(#|#!)`)
re.ReplaceAllString("#!a", "") // "!a"
// After Longest(): leftmost-longest (POSIX semantics)
re.Longest()
re.ReplaceAllString("#!a", "") // "a"
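The stdlib behavior above can be checked with a small runnable program. This is a minimal sketch that uses only the standard regexp package; the helper name stripMarker is made up for illustration and is not part of coregex or the stdlib.

```go
package main

import (
	"fmt"
	"regexp"
)

// stripMarker (hypothetical helper) removes every match of (#|#!) from s.
// When longest is true it calls Longest() first, which switches the
// compiled regexp to leftmost-longest matching.
func stripMarker(s string, longest bool) string {
	re := regexp.MustCompile(`(#|#!)`)
	if longest {
		re.Longest()
	}
	return re.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(stripMarker("#!a", false)) // !a  (leftmost-first matches "#")
	fmt.Println(stripMarker("#!a", true))  // a   (leftmost-longest matches "#!")
}
```

Swapping the regexp import for coregex should, once this issue is resolved, produce the same two lines of output.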
Research findings
| Engine | Default | Longest() support |
| --- | --- | --- |
| Go stdlib | leftmost-first | ✅ Yes |
| Rust regex | leftmost-first | ❌ No |
| RE2 | leftmost-first | ⚠️ Opt-in (longest_match option) |
| coregex | leftmost-first | ❌ No (stub) |
Note: The Rust regex crate does not implement leftmost-longest, and RE2 offers it only as an opt-in option (longest_match). For true stdlib drop-in compatibility, coregex should support it.
Implementation plan
- Add longest bool flag to Regex struct
- Modify Longest() to set the flag
- Update PikeVM search to continue looking for longer matches when flag is set
- Propagate flag through meta engine coordination
- Benchmark to ensure no performance regression in default mode
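The flag-based approach in the plan can be illustrated with a toy matcher. This is only a sketch under strong assumptions: toyRegex is a hypothetical type that handles an alternation of non-empty literal strings, not coregex's actual Regex struct or PikeVM, and all names in it are invented for the example. It shows the one behavioral difference the flag controls: stop at the first alternative that matches, or keep looking for a longer one.

```go
package main

import (
	"fmt"
	"strings"
)

// toyRegex is a hypothetical stand-in for the Regex struct. It only
// supports an alternation of non-empty literals, tried in pattern order.
type toyRegex struct {
	alts    []string // alternatives in pattern order, e.g. "#", "#!"
	longest bool     // set by Longest(); false means leftmost-first
}

// Longest switches the matcher to leftmost-longest semantics.
func (re *toyRegex) Longest() { re.longest = true }

// matchAt returns the end offset of the match starting at pos, or -1.
func (re *toyRegex) matchAt(s string, pos int) int {
	end := -1
	for _, alt := range re.alts {
		if !strings.HasPrefix(s[pos:], alt) {
			continue
		}
		if !re.longest {
			return pos + len(alt) // leftmost-first: first alternative wins
		}
		if e := pos + len(alt); e > end {
			end = e // leftmost-longest: keep looking for a longer match
		}
	}
	return end
}

// ReplaceAllString replaces every non-overlapping match with repl.
func (re *toyRegex) ReplaceAllString(s, repl string) string {
	var b strings.Builder
	for pos := 0; pos < len(s); {
		end := re.matchAt(s, pos)
		if end <= pos { // no match at pos: copy one byte and move on
			b.WriteByte(s[pos])
			pos++
			continue
		}
		b.WriteString(repl)
		pos = end
	}
	return b.String()
}

func main() {
	re := &toyRegex{alts: []string{"#", "#!"}}
	fmt.Println(re.ReplaceAllString("#!a", "")) // !a
	re.Longest()
	fmt.Println(re.ReplaceAllString("#!a", "")) // a
}
```

In a real PikeVM the same idea would apply to threads rather than literal alternatives: in longest mode the search would presumably not cut off lower-priority threads when a match is found, and would instead record the longest match end seen for the leftmost start.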
Performance considerations
- Default mode: Expected ~0% overhead (single bool check)
- Longest mode: Expected 10-50% overhead (the search must keep exploring alternatives after the first match instead of stopping)
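One way to measure these estimates is a small harness that times the same replacement in both modes. The sketch below uses the stdlib regexp package as a stand-in, since the harness shape is the point; benchReplace is a hypothetical helper, and the input size is an arbitrary assumption. No timings are claimed here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"testing"
)

// benchReplace (hypothetical helper) times ReplaceAllString on input,
// optionally after switching the regexp to leftmost-longest mode.
func benchReplace(input string, longest bool) testing.BenchmarkResult {
	testing.Init() // registers testing flags; no effect if already called
	re := regexp.MustCompile(`(#|#!)`)
	if longest {
		re.Longest()
	}
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			re.ReplaceAllString(input, "")
		}
	})
}

func main() {
	input := strings.Repeat("#!a", 1000) // arbitrary input size
	first := benchReplace(input, false)
	long := benchReplace(input, true)
	fmt.Printf("leftmost-first:   %d ns/op\n", first.NsPerOp())
	fmt.Printf("leftmost-longest: %d ns/op\n", long.NsPerOp())
}
```

Running the default-mode case before and after the change would show whether the added flag check costs anything measurable, and the ratio between the two modes gives the longest-mode overhead.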
Acceptance criteria
- Longest() actually switches to leftmost-longest semantics
- Longest() behaviors matched
Related