-
-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Update: Fix no-useless-escape false negative in regexes (fixes #7424)
#7425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@not-an-aardvark, thanks for your PR! By analyzing the history of the files in this pull request, we identified @onurtemizkan, @vitorbal and @kaicataldo to be potential reviewers. |
|
LGTM |
platinumazure
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment about a complex arrow function, otherwise LGTM.
lib/rules/no-useless-escape.js
Outdated
| * characters to be valid in general, and filter out '-' characters that appear in the middle of a | ||
| * character class. | ||
| */ | ||
| .filter((charInfo, index, array) => charInfo.text !== "-" || !charInfo.inCharClass || index === 0 || index === array.length - 1 || !array[index - 1].inCharClass || !array[index + 1].inCharClass) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see this extracted to a function.
| * @param {Set} setB The second set | ||
| * @returns {Set} The union of the two sets | ||
| */ | ||
| function union(setA, setB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so cool.. I never knew abt yield*.
Had to read up on couple of things to understand this but so awesome. 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a super clever way to merge Sets, i like it 💯
lib/rules/no-useless-escape.js
Outdated
| } else { | ||
| return; | ||
| } | ||
| parseRegExp(node.raw.slice(1, -1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I found a problem: .slice(1, -1) will be incorrect if the regex has flags, since the trailing slash won't be at index -1.
f5b4021 to
3f4a7ca
Compare
|
LGTM |
|
@platinumazure Separated the complex behavior into its own function. The number of linting errors this causes on the existing ESLint codebase is a bit concerning 😕 |
| * @returns {Set} The union of the two sets | ||
| */ | ||
| function union(setA, setB) { | ||
| return new Set(function *() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not function* (?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the generator-star-spacing rule configured to enforce this spacing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, interesting choice
| * @param {Set} setB The second set | ||
| * @returns {Set} The union of the two sets | ||
| */ | ||
| function union(setA, setB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a super clever way to merge Sets, i like it 💯
|
CI is failing due to this line and this line. Both of these seem to be valid linting errors, but why are they only getting reported on AppVeyor? Travis didn't report any issues, and I can't reproduce the linting error when running edit: Actually, it doesn't seem to report any linting errors in |
|
LGTM |
lib/rules/arrow-body-style.js
Outdated
| const tokenAfterArrowBody = sourceCode.getTokenAfter(arrowBody); | ||
|
|
||
| if (tokenAfterArrowBody && tokenAfterArrowBody.type === "Punctuator" && /^[(\[\/`+-]/.test(tokenAfterArrowBody.value)) { | ||
| if (tokenAfterArrowBody && tokenAfterArrowBody.type === "Punctuator" && /^[([\/`+-]/.test(tokenAfterArrowBody.value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the \/ in this example also be flagged by the rule? It's in a character class but I don't know if the slash token is consumed greedily by the tokenizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, looks like you're right; I had thought / always needed to be escaped, but apparently it only needs to be escaped in character classes.
Sidenote: This implies that /[/]/ is a valid regex, but oddly, /[/]/.toString() returns /[\/]/ (i.e. it auto-escapes the slash, at least in Chrome). It doesn't seem to do this for other escaped characters.
platinumazure
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for extracting that function.
I left one question, but I'm half certain of the answer (and that it would require no changes).
|
LGTM |
1 similar comment
|
LGTM |
1e32241 to
6d9bea0
Compare
|
LGTM |
New open questions on the best approach for this rule (should we put this behind an option, should we enhance with exception lists, etc.). Possible large ecosystem impact.
|
As we find new problems, there are concerns about ecosystem impact. The rule is correct in reporting all the new useless escapes, but there may be some cases where users might want to intentionally allow (or prefer) some useless escapes for readability/maintainability. So we may need to consider carefully how best to release this. |
|
I think we should do the following:
|
lib/rules/no-useless-escape.js
Outdated
| } | ||
| } | ||
| return { | ||
| charList: state.charList.concat({text: char, escaped: state.escapeNextChar, inCharClass: state.inCharClass, index}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to create an array instance for each character? It looks to generate many useless copies.
|
LGTM |
mysticatea
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me.
But there is a few matter about unnecessary escapes.
lib/rules/no-useless-escape.js
Outdated
| "\\", | ||
| ".", | ||
| "-", | ||
| "^", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The escape of ^ in character class is unnecessary except the 1st character.
/[\^a]/; // this escape is necessary; it becomes `not a` if the `\` is removed.
/[a\^]/; // this escape is unnecessary; the `^` is a character.
lib/rules/no-useless-escape.js
Outdated
| "(", | ||
| ")", | ||
| "b", | ||
| "B", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The escape of B in character class is unnecessary.
/[\B]/; // this escape is unnecessary; the `B` is a character.On the other hand, the escape of \b is necessary. \b in character class is meaning a backspace character. (\b outside of character class is a word boundary)
|
LGTM |
35ec446 to
4611d81
Compare
|
LGTM |
mysticatea
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, awesome!
|
TSC Summary: TSC question: Should we merge this PR for the upcoming release and consider the opt-out option separately? If not, how should we handle this fix? |
|
TSC Resolution: Merge as is. Opt-out option could be considered in the future. |
|
@eslint/eslint-tsc @not-an-aardvark Thanks very much for taking the time to carefully deliberate this. |
What is the purpose of this pull request? (put an "X" next to item)
[ ] Documentation update
[x] Bug fix (template)
[ ] New rule (template)
[ ] Changes an existing rule (template)
[ ] Add autofixing to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:
See #7424
What changes did you make? (Give an overview)
This updates
no-useless-escapeto verify whether a character needs to be escaped based on its position in a regular expression. Previously,no-useless-escapehad a list of escapable characters for regexes, and it would never report cases where any of them were escaped. However, some characters only need to be escaped if they appear in a character class, and other characters only need to be escaped if they appear outside of a character class. This PR updates the rule to parse the regular expression to determine which characters are in character classes, and report the characters if they are escaped inside a character class and only need to be escaped outside a class (or vice versa).Is there anything you'd like reviewers to focus on?
We should make sure that the
REGEX_GENERAL_ESCAPES,REGEX_CHARCLASS_ESCAPES, andREGEX_NON_CHARCLASS_ESCAPESlists are correct.