In https://github.com/kpdecker/jsdiff/blob/master/src/diff/word.ts#L5-L23 it currently is:
const extendedWordChars = 'a-zA-Z0-9_\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';
Unless I'm misunderstanding something (e.g. the exceptions mentioned in the comment, or how the `extendedWordChars` string is actually interpreted by JS), it seems to be missing 0080–00BF when I break down the Unicode part of the string and compare it to the comment:
\u{C0}-\u{FF}
\u{D8}-\u{F6}
\u{F8}-\u{2C6}
\u{2C8}-\u{2D7}
\u{2DE}-\u{2FF}
\u{1E00}-\u{1EFF}
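
To double-check the break-down above, here is a small sketch (not part of jsdiff; `parseRanges` and `hex` are hypothetical helper names) that pulls the `\u{..}-\u{..}` pairs out of the string and prints them as code point ranges. Note that `\\u` in the source is an escaped backslash, so the string itself contains a literal `\u{C0}`. That only turns into a code point escape later, assuming the string is interpolated into a `RegExp` created with the `u` flag.

```typescript
// Copied from src/diff/word.ts as quoted above.
const extendedWordChars =
  'a-zA-Z0-9_\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';

// Hypothetical helper (not in jsdiff): extract every `\u{X}-\u{Y}` pair
// from the string and return it as a [low, high] code point tuple.
function parseRanges(chars: string): Array<[number, number]> {
  const pair = /\\u\{([0-9A-Fa-f]+)\}-\\u\{([0-9A-Fa-f]+)\}/g;
  const found: Array<[number, number]> = [];
  let m: RegExpExecArray | null;
  while ((m = pair.exec(chars)) !== null) {
    found.push([parseInt(m[1], 16), parseInt(m[2], 16)]);
  }
  return found;
}

// Format a code point as at least four upper-case hex digits.
function hex(n: number): string {
  const s = n.toString(16).toUpperCase();
  return s.length >= 4 ? s : ('0000' + s).slice(-4);
}

const ranges = parseRanges(extendedWordChars);
for (const [lo, hi] of ranges) {
  console.log(`U+${hex(lo)}-U+${hex(hi)}`);
}
```

The ASCII part (`a-zA-Z0-9_`) is skipped on purpose, since only the `\u{...}` ranges are in question here.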
| extendedWordChars | comment |
| --- | --- |
| SEEMS TO BE MISSING 0080-00BF? | // Latin-1 Supplement, 0080–00FF |
| \u{C0}-\u{FF} | // Latin-1 Supplement, 0080–00FF |
| SEEMS TO NOT BE EXCLUDED PROPERLY UNLESS THIS WAS INTENTIONAL? | // - U+00D7 × Multiplication sign |
| \u{D8}-\u{F6} | seems redundant since covered by other range \u{C0}-\u{FF} |
| SEEMS TO NOT BE EXCLUDED PROPERLY UNLESS THIS WAS INTENTIONAL? | // - U+00F7 ÷ Division sign |
| \u{F8}-\u{2C6} | covered by other ranges 0080–00FF and 0100–017F and 0180–024F and 0250–02AF and 02B0–02FF |
| included in other ranges | // Latin Extended-A, 0100–017F |
| included in other ranges | // Latin Extended-B, 0180–024F |
| included in other ranges | // IPA Extensions, 0250–02AF |
| \u{2C8}-\u{2D7} | // Spacing Modifier Letters, 02B0–02FF covered by other ranges and exclusions combined |
| intentionally excluded | // - U+02C7 ˇ Caron |
| intentionally excluded | // - U+02D8 ˘ Breve |
| intentionally excluded | // - U+02D9 ˙ Dot Above |
| intentionally excluded | // - U+02DA ˚ Ring Above |
| intentionally excluded | // - U+02DB ˛ Ogonek |
| intentionally excluded | // - U+02DC ˜ Small Tilde |
| intentionally excluded | // - U+02DD ˝ Double Acute Accent |
| \u{2DE}-\u{2FF} | // Spacing Modifier Letters, 02B0–02FF |
| \u{1E00}-\u{1EFF} | // Latin Extended Additional, 1E00–1EFF |
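
To check the rows above without reading ranges by eye, here is a minimal sketch that drops the string into a character class and tests a few code points directly. It assumes the string is used inside a `RegExp` built with the `u` flag; `wordChar`, `isWordChar` and `samples` are names made up for this example and are not jsdiff APIs.

```typescript
// Copied from src/diff/word.ts as quoted above.
const extendedWordChars =
  'a-zA-Z0-9_\\u{C0}-\\u{FF}\\u{D8}-\\u{F6}\\u{F8}-\\u{2C6}\\u{2C8}-\\u{2D7}\\u{2DE}-\\u{2FF}\\u{1E00}-\\u{1EFF}';

// Assumption: the string ends up inside [...] in a `u`-flagged RegExp,
// which is what makes `\u{C0}` mean "code point U+00C0".
const wordChar = new RegExp(`^[${extendedWordChars}]$`, 'u');

// Hypothetical helper: true if the single character is in the class.
function isWordChar(ch: string): boolean {
  return wordChar.test(ch);
}

const samples: Array<[string, string]> = [
  ['\u00A9', 'U+00A9 copyright sign (inside 0080-00BF)'],
  ['\u00D7', 'U+00D7 multiplication sign (listed as an exception)'],
  ['\u00F7', 'U+00F7 division sign (listed as an exception)'],
  ['\u00E9', 'U+00E9 e with acute'],
  ['\u02C7', 'U+02C7 caron (listed as an exception)'],
  ['\u02D8', 'U+02D8 breve (listed as an exception)'],
];

for (const [ch, label] of samples) {
  console.log(`${label}: ${isWordChar(ch)}`);
}
```

With the string as quoted, U+00D7 and U+00F7 come out as word characters because `\u{C0}-\u{FF}` already covers them, while U+00A9, U+02C7 and U+02D8 do not match. That lines up with the "not excluded properly" and "intentionally excluded" rows in the table.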