A zero-dependency regular expression transform for partial matching, enabling validation of incomplete input strings against regex patterns.
Unlike C/C++ (via PCRE/PCRE2, RE2, Boost.Regex), Python (via the third-party regex module) or Java (via hitEnd), JavaScript has no canonical or built-in partial matching for regular expressions.
This library transforms regular expressions to support partial matching on a best-effort basis, allowing you to test whether an incomplete string could still go on to match the full pattern. This is particularly useful for real-time input validation, autocomplete systems, progressive form validation, stream chunk matching, etc.
Based on an algorithm created by Lucas Trzesniewski, re-created for npm under the ISC license, with permission.
npm install regex-partial-match

import createPartialMatchRegex from "regex-partial-match";
const pattern = /hello world/;
const partial = createPartialMatchRegex(pattern);
partial.test("h"); // true - could match
partial.test("hello"); // true - could match
partial.test("hello world"); // true - full match
partial.test("goodbye"); // false - cannot match

import "regex-partial-match/extend";
const partial = /hello world/.toPartialMatchRegex();
partial.test("hel"); // true

The library transforms a regular expression by wrapping each atomic element in a non-capturing group with an alternation to end-of-input ($):

/abc/ → /(?:a|$)(?:b|$)(?:c|$)/

This allows the pattern to match prefixes of the original pattern, enabling validation of incomplete input.
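To make the transformation concrete, here is a minimal sketch covering the simplest case only: a plain literal string. The helper name naivePartialFromLiteral is hypothetical and is not the library's implementation, which handles full regex syntax. The leading ^ anchor is also an assumption of this sketch, added so that unrelated input cannot match the empty string at the end.

```javascript
// Hypothetical helper for illustration: supports plain literal strings only.
function naivePartialFromLiteral(literal) {
  // Escape characters that have special meaning in a regex source.
  const escapeChar = (ch) => ch.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
  // Wrap each character as (?:x|$) so the input may end at any point.
  const source = Array.from(literal, (ch) => `(?:${escapeChar(ch)}|$)`).join("");
  // Assumption: anchor at the start so only prefixes of the literal match.
  return new RegExp(`^${source}`);
}

const abcPartial = naivePartialFromLiteral("abc");
console.log(abcPartial.source); // ^(?:a|$)(?:b|$)(?:c|$)
console.log(abcPartial.test("a")); // true
console.log(abcPartial.test("ab")); // true
console.log(abcPartial.test("abd")); // false
console.log(abcPartial.test("b")); // false
```

Each character can be replaced by "end of input", so the match succeeds as long as the input stops somewhere along the literal and never diverges from it.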
Since the library accepts only valid regular expressions [1], the algorithm can make many unguarded assumptions about the source of the expression.
The library has been stress-tested with various regular expression features in isolation, and some in likely combination, but the test space is unbounded, and syntactically valid regular expressions can still contain contradictory patterns, e.g.
- /\b\B/ - impossible to match both a word boundary and a non-word boundary
- /$^/ - end cannot come before start
- x{2}? - lazy quantifiers are mutually exclusive to fixed-length quantifiers
Such combinations have not been tested.
- 🔤 Literal characters
- 📋 Character classes ([abc], [^abc], [a-z])
- 🔣 Character escapes (\n, \t, \x61, \u0061, \u{1F600})
- 🌐 Unicode character class escape (\p{Letter}, \P{Letter})
- 🔢 Quantifiers (*, +, ?, {n}, {n,}, {n,m})
- 🔀 Disjunction (a|b)
- 👥 Groups (capturing and non-capturing) ((?:abc), (abc), (?<named>abc))
- 👀 Lookahead assertions ((?=...), (?!...))
- 👈 Lookbehind assertions ((?<=...), (?<!...))
- ⚓ Input Boundaries (^, $)
- 🆒 Word Boundaries (\b, \B)
- 🏴 Flags: g, i, m, s, u, d, y (See caveats for y)
The following regex features are not currently supported:
- ⚠️ Backreferences (\1, \k<name>) - Can be included, but can't partially match. See caveats.
- ❌ Unicode sets (v flag) - ES2024+. See issue.
- ❌ Modifiers ((?ims:...), (?-ims:...)) - ES2025+. See issue.
The library is compiled to ES5 for broad compatibility with older browsers and JavaScript environments. However, certain regular expression features naturally require ES2015+ support:
- Unicode escapes (\u{...}) - ES2015+
- Unicode property escapes (\p{...}, \P{...}) - ES2018+
- Lookbehind assertions ((?<=...), (?<!...)) - ES2018+
- Named capturing groups ((?<name>...)) - ES2018+
- s (dotAll) flag - ES2018+
- d (hasIndices) flag - ES2022+
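If you need to know whether the current engine supports one of these features before building a pattern, one common approach is to try constructing the regex and catch the resulting error. The sketch below is a general JavaScript technique rather than part of this library, and supportsRegexSyntax is a hypothetical name.

```javascript
// Hypothetical helper: returns true if this engine can compile the pattern.
function supportsRegexSyntax(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (error) {
    // Unsupported syntax or flags throw a SyntaxError at construction time.
    return false;
  }
}

// Results depend on the engine running the code.
console.log(supportsRegexSyntax("(?<=a)b", "")); // lookbehind assertion
console.log(supportsRegexSyntax("\\p{Letter}", "u")); // Unicode property escape
console.log(supportsRegexSyntax("(?<name>a)", "")); // named capturing group
console.log(supportsRegexSyntax("a", "d")); // hasIndices flag
```

Passing the pattern as a string matters here: a regex literal using unsupported syntax would fail when the file is parsed, before any try/catch could run.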
The library produces an expression that always matches an empty string, at the end of the input. Feasibly, this is the start of a new partial match.
Hence:
/x/.test("a") === false; /* untransformed regex */
/(?:x|$)/.test("a") === true; /* what's produced by the library */

To mitigate, a start boundary anchor (^) ensures that the empty match can only occur when the input itself is empty:

/^(?:x|$)/.test("") === true;
/^(?:x|$)/.test("x") === true;
/^(?:x|$)/.test("a") === false;

On this basis, .test() should be used with caution, and a match of an empty string at the end of the input should instead be considered "no match" when validating what came before, i.e.

/(?:x|$)/.exec("a"); // ['', index: 1, input: 'a', groups: undefined]
"a".match(/(?:x|$)/); // ['', index: 1, input: 'a', groups: undefined]

Since the library produces a native RegExp object, no attempt has been made to proxy or translate this output to null, but a helper could be produced in future, for clarity. See issue.
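Until such a helper exists, a small wrapper in your own code can treat the trailing empty match as "no match". The following is a minimal sketch: execNonEmpty is a hypothetical name and not part of the library's API, and it assumes the regex carries no g or y flag, so lastIndex plays no part.

```javascript
// Hypothetical helper, not part of the library's API.
// Assumes the regex has no g or y flag, so lastIndex plays no part.
function execNonEmpty(partialRegex, input) {
  const match = partialRegex.exec(input);
  if (match === null) return null;
  // An empty match on non-empty input is only the trailing "$" alternative
  // firing at the end of the input, so report it as "no match".
  if (match[0] === "" && input.length > 0) return null;
  return match;
}

// Hand-written equivalent of the transformed /x/ from the examples here.
const xPartial = /(?:x|$)/;
console.log(execNonEmpty(xPartial, "a")); // null
console.log(execNonEmpty(xPartial, "x")[0]); // x
console.log(execNonEmpty(xPartial, "")[0] === ""); // true
```

An empty input is still reported as a match, since an empty string is a valid prefix of any pattern.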
Backreferences cannot be partially matched because they are atomic. A backreference like \1 must match the complete captured text or fail entirely, and cannot be split into individual characters for partial matching like regular atoms can.
Fixed-length patterns like /(abc)\1/ could theoretically become /(?:(a)|$)(?:(b)|$)(?:(c)|$)(?:\1|$)(?:\2|$)(?:\3|$)/ (accepting polluted capture indexes as a side-effect), but this doesn't work for variable-length captures.
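The theoretical fixed-length rewrite can be tried directly in native JavaScript. This is a hand-written regex, not output from the library, and the leading ^ is an assumption of the sketch, added to rule out the empty match at the end of unrelated input.

```javascript
// Hand-written form of the theoretical transform of /(abc)\1/, with a leading
// ^ added (an assumption of this sketch) to avoid the empty match at the end.
const backrefPartial = /^(?:(a)|$)(?:(b)|$)(?:(c)|$)(?:\1|$)(?:\2|$)(?:\3|$)/;

console.log(backrefPartial.test("ab")); // true
console.log(backrefPartial.test("abca")); // true
console.log(backrefPartial.test("abcabc")); // true
console.log(backrefPartial.test("abcx")); // false

// The side effect: three capture groups instead of the original one.
console.log(backrefPartial.exec("abca").slice(1)); // [ 'a', 'b', 'c' ]
```

This only works because each capture is exactly one known character. A capture such as (a+) has no fixed per-character split, which is why the approach does not generalise.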
Whilst forming a match, a positive lookbehind must match in its entirety for the pattern to match. This is inherent in the concept of zero-width assertions: they consume no input themselves, and only qualify the atoms that do.
e.g.
/(?<=foo)bar/;

"f" through "foo" is not a match, but "foob" is.
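The behaviour can be approximated with a hand-written regex. The exact expression the library produces for this pattern is not shown here, so the form below is an assumption: the lookbehind is left intact and only the consuming atoms are wrapped. Empty matches are treated as "no match", following the empty-string caveat.

```javascript
// Assumption: hand-written approximation of a partial form of /(?<=foo)bar/,
// leaving the lookbehind intact and wrapping only the consuming atoms.
const lookbehindPartial = /(?<=foo)(?:b|$)(?:a|$)(?:r|$)/;

// Treat an empty match as "no match", as advised in the empty-string caveat.
function matchedText(input) {
  const match = lookbehindPartial.exec(input);
  return match && match[0] !== "" ? match[0] : null;
}

console.log(matchedText("f")); // null
console.log(matchedText("foo")); // null
console.log(matchedText("foob")); // b
console.log(matchedText("fooba")); // ba
```

Note that "foo" does produce an empty match at the end of the input, which is why the non-empty check is needed to report it as "not a match".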
In Unicode-aware mode (u flag), only whole astral characters are supported. Partial matching of an individual surrogate (one half of a surrogate pair) is not supported. For example, /😀/u will match the complete emoji character, but not its first surrogate in isolation. Hence, if partially matching a byte stream, be sure to pipe it through a TextDecoder first.
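The sketch below illustrates both points using only standard APIs. The regex is a hand-written partial form of /😀/u, anchored with ^ for clarity (an assumption of this sketch), and the byte split simulates a chunk boundary falling in the middle of the character.

```javascript
// Hand-written partial form of /😀/u, anchored for clarity (an assumption).
const emojiPartial = /^(?:😀|$)/u;

// A lone high surrogate is not a prefix the u-mode regex can match.
const loneSurrogate = "😀".slice(0, 1); // "\uD83D"
console.log(emojiPartial.test(loneSurrogate)); // false

// Decoding in streaming mode holds back incomplete UTF-8 sequences, so the
// regex only ever sees whole characters.
const bytes = new TextEncoder().encode("😀"); // 4 UTF-8 bytes
const decoder = new TextDecoder("utf-8");
const firstHalf = decoder.decode(bytes.subarray(0, 2), { stream: true });
const secondHalf = decoder.decode(bytes.subarray(2), { stream: true });

console.log(firstHalf === ""); // true
console.log(secondHalf === "😀"); // true
console.log(emojiPartial.test(firstHalf + secondHalf)); // true
```

The stream: true option is what makes the decoder buffer the incomplete sequence instead of emitting a replacement character.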
Sticky Flag (y)
The sticky flag may not behave as expected in partial matching scenarios. The sticky flag requires matches to start at lastIndex, but a partial match failure resets lastIndex to 0. This means subsequent attempts cannot "continue" from where the previous match failed, making progressive character-by-character validation problematic.
Example:
const pattern = /hello/y;
const partial = createPartialMatchRegex(pattern);
partial.lastIndex = 0;
partial.test("h"); // true - lastIndex advances to 1
partial.test("he"); // false - matching started at lastIndex 1, and the failure resets lastIndex to 0
// Cannot reliably continue partial matching with sticky flag

Recommendation: Avoid using the y flag with partial matching unless you fully understand the implications.
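The lastIndex behaviour can be observed with a hand-written sticky regex in plain JavaScript. The expression below assumes the transformed output keeps the shape described earlier and preserves the y flag. testFromStart is a hypothetical workaround that gives up position tracking and always validates the whole input from the start.

```javascript
// Hand-written partial form of /hello/y (an assumption about the output shape).
const stickyPartial = /(?:h|$)(?:e|$)(?:l|$)(?:l|$)(?:o|$)/y;

console.log(stickyPartial.test("h")); // true
console.log(stickyPartial.lastIndex); // 1
console.log(stickyPartial.test("he")); // false, matching started at index 1
console.log(stickyPartial.lastIndex); // 0

// Hypothetical workaround: always restart from the beginning of the input.
function testFromStart(regex, input) {
  regex.lastIndex = 0;
  return regex.test(input);
}

console.log(testFromStart(stickyPartial, "h")); // true
console.log(testFromStart(stickyPartial, "he")); // true
console.log(testFromStart(stickyPartial, "hex")); // false
```

Resetting lastIndex on every call makes the sticky flag behave like a start anchor, which may or may not be what the original pattern intended.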
The global flag is preserved but may not be necessary for partial matching use cases. The g flag affects behavior when using .exec() repeatedly to find all matches, but partial matching typically validates a single prefix at a time.
The global flag does not cause issues like the sticky flag, as partial patterns naturally match from the beginning of the input. However, if you're using lastIndex to track position, be aware that failed matches will reset it to 0.
const emailPattern = /^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$/i;
const partial = createPartialMatchRegex(emailPattern);
function validateEmail(input) {
  return partial.test(input) ? "valid" : "invalid";
}
validateEmail("user"); // 'valid' - could become valid
validateEmail("user@"); // 'valid' - could become valid
validateEmail("user@example"); // 'valid' - could become valid
validateEmail("user@example.com"); // 'valid' - complete match
validateEmail("@@invalid"); // 'invalid' - cannot match

const commandPattern = /^(help|quit|save|load)/;
const partial = createPartialMatchRegex(commandPattern);
function getSuggestions(input) {
  return partial.test(input) ? "valid prefix" : "no suggestions";
}
getSuggestions("h"); // 'valid prefix'
getSuggestions("hel"); // 'valid prefix'
getSuggestions("help"); // 'valid prefix'
getSuggestions("xyz"); // 'no suggestions'

// Process streaming data with pattern matching at chunk boundaries
const pattern = /\{"[^"]+":"[^"]+"\}/; // Match flat JSON objects with one string value
const partial = createPartialMatchRegex(pattern);
let buffer = "";

function processChunk(chunk) {
  buffer += chunk;
  const matches = [];
  // Extract complete matches
  let match;
  while ((match = pattern.exec(buffer))) {
    matches.push(match[0]);
    buffer = buffer.slice(match.index + match[0].length);
  }
  // Discard buffer if it cannot possibly complete. An empty match at the end
  // of the input counts as "no match" (see the empty-string caveat above).
  const tail = partial.exec(buffer);
  if (!tail || tail[0] === "") {
    buffer = "";
  }
  return matches;
}

processChunk('{"na'); // [] - partial, buffer: '{"na'
processChunk('me":"Jo'); // [] - partial, buffer: '{"name":"Jo'
processChunk('hn"}{"age":'); // ['{"name":"John"}'] - buffer: '{"age":'
processChunk('"25"}'); // ['{"age":"25"}'] - buffer: ''
processChunk("invalid"); // [] - discarded, buffer: ''

Useful for parsing log files, network streams, or any chunked data where records may be split across boundaries.
Transforms a regular expression to support partial matching.
Available via the default entry point of the package.
Parameters:
- regex - The regular expression to transform
Returns:
- A new RegExp that matches partial strings
When using import 'regex-partial-match/extend', the toPartialMatchRegex() method is added to RegExp.prototype.
Returns:
- A new RegExp that matches partial strings, created from the RegExp instance the method was called on.
ISC License - see LICENSE file for details.
Algorithm created by Lucas Trzesniewski.
Contributions are welcome! Please open an issue or pull request on GitHub.
| Project | Description |
|---|---|
| incr-regex-package | Incremental regex matcher |
| dfa | Compiles a regular expression like syntax to fast deterministic finite automata, which could be used to partial match? |
| refa | Can convert regular expressions to an Abstract Syntax Tree, which might afford partial-match capability? |
| @eslint-community/regexpp | A regular expression parser for ECMAScript with AST generation and visitor implementation |
| Regex+ | template literal, transforming native regular expressions |
| Awesome Regex | Curated list of tools, tutorials, libraries, and other resources, covering all major regex flavours |
Footnotes
1. To remain lightweight, no runtime type validation is applied, so non-TypeScript consumers must rely on the underlying errors thrown if the library is used incorrectly.