Update: Fix `no-useless-escape` false negative in regexes (fixes #7424) #7425

not-an-aardvark · 2016-10-22T01:07:29Z

What is the purpose of this pull request? (put an "X" next to item)

[ ] Documentation update
[x] Bug fix (template)
[ ] New rule (template)
[ ] Changes an existing rule (template)
[ ] Add autofixing to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:

See #7424

What changes did you make? (Give an overview)

This updates no-useless-escape to verify whether a character needs to be escaped based on its position in a regular expression. Previously, no-useless-escape had a list of escapable characters for regexes, and it would never report cases where any of them were escaped. However, some characters only need to be escaped if they appear in a character class, and other characters only need to be escaped if they appear outside of a character class. This PR updates the rule to parse the regular expression to determine which characters are in character classes, and report the characters if they are escaped inside a character class and only need to be escaped outside a class (or vice versa).
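A minimal sketch of the position-based check described above, assuming a simplified parser. The names `classifyRegexChars`, `ONLY_NEEDED_OUTSIDE_CLASS` and `findUselessEscapesInClasses` are hypothetical, and the character list is illustrative rather than the rule's actual `REGEX_*` lists:

```javascript
// Hypothetical helper: walks a regex pattern (no delimiters, no flags) and
// records, for each character, whether it was escaped and whether it sits
// inside a character class.
function classifyRegexChars(pattern) {
    const chars = [];
    let inCharClass = false;
    let escapeNext = false;

    for (let index = 0; index < pattern.length; index++) {
        const text = pattern[index];

        if (escapeNext) {
            // The character right after a backslash is an escaped character.
            chars.push({ text, index, escaped: true, inCharClass });
            escapeNext = false;
        } else if (text === "\\") {
            escapeNext = true;
        } else {
            // Unescaped brackets open and close a character class.
            if (text === "[" && !inCharClass) {
                inCharClass = true;
            } else if (text === "]" && inCharClass) {
                inCharClass = false;
            }
            chars.push({ text, index, escaped: false, inCharClass });
        }
    }
    return chars;
}

// Illustrative, deliberately incomplete list (an assumption, not the PR's list).
const ONLY_NEEDED_OUTSIDE_CLASS = new Set(["(", ")", ".", "*", "+", "?", "/"]);

// Returns the characters that are escaped inside a class although the escape
// is only meaningful outside one.
function findUselessEscapesInClasses(pattern) {
    return classifyRegexChars(pattern)
        .filter(c => c.escaped && c.inCharClass && ONLY_NEEDED_OUTSIDE_CLASS.has(c.text))
        .map(c => c.text);
}

console.log(findUselessEscapesInClasses("[\\(\\.]\\("));
```

In this sketch the escaped `(` and `.` inside the class are reported, while the escaped `(` after the class is not, because it is needed there.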

Is there anything you'd like reviewers to focus on?

We should make sure that the REGEX_GENERAL_ESCAPES, REGEX_CHARCLASS_ESCAPES, and REGEX_NON_CHARCLASS_ESCAPES lists are correct.

mention-bot · 2016-10-22T01:07:30Z

@not-an-aardvark, thanks for your PR! By analyzing the history of the files in this pull request, we identified @onurtemizkan, @vitorbal and @kaicataldo to be potential reviewers.

eslintbot · 2016-10-22T01:07:30Z

LGTM

platinumazure

Just one comment about a complex arrow function, otherwise LGTM.

platinumazure · 2016-10-22T02:07:19Z

lib/rules/no-useless-escape.js

+                     * characters to be valid in general, and filter out '-' characters that appear in the middle of a
+                     * character class.
+                     */
+                    .filter((charInfo, index, array) => charInfo.text !== "-" || !charInfo.inCharClass || index === 0 || index === array.length - 1 || !array[index - 1].inCharClass || !array[index + 1].inCharClass)


I'd like to see this extracted to a function.
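One way the extraction could look, as a sketch only: the function name `isNotMidClassDash` is hypothetical, and the condition is the inline arrow function above rewritten with De Morgan's law so the intent is readable.

```javascript
// Hypothetical name. Returns false only for a "-" that sits in the middle of
// a character class (where it denotes a range), true for everything else.
function isNotMidClassDash(charInfo, index, array) {
    const isMidClassDash =
        charInfo.text === "-" &&
        charInfo.inCharClass &&
        index > 0 &&
        index < array.length - 1 &&
        array[index - 1].inCharClass &&
        array[index + 1].inCharClass;

    return !isMidClassDash;
}

// Usage: the characters of the class body "a-z", all inside the class.
const rangeChars = [
    { text: "a", inCharClass: true },
    { text: "-", inCharClass: true },
    { text: "z", inCharClass: true }
];

console.log(rangeChars.filter(isNotMidClassDash).map(c => c.text));
```

The call site then reduces to `.filter(isNotMidClassDash)`.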

gyandeeps · 2016-10-22T02:21:09Z

lib/rules/no-useless-escape.js

+* @param {Set} setB The second set
+* @returns {Set} The union of the two sets
+*/
+function union(setA, setB) {


This is so cool. I never knew about yield*.
I had to read up on a couple of things to understand this, but it's so awesome. 💯

this is a super clever way to merge Sets, i like it 💯
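The diff excerpt only shows the signature of `union`. Going by the `yield*` mention, the body presumably looks something like the following sketch (an assumption, not copied from the PR):

```javascript
/**
 * Sketch of a Set union built from an immediately invoked generator.
 * `yield*` delegates to each set's iterator, so the Set constructor receives
 * every element of setA followed by every element of setB.
 */
function union(setA, setB) {
    return new Set(function *() {
        yield* setA;
        yield* setB;
    }());
}

console.log(Array.from(union(new Set([1, 2]), new Set([2, 3]))));
```

Duplicates are dropped by the `Set` constructor itself, so no explicit membership check is needed.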

not-an-aardvark · 2016-10-22T02:53:01Z

lib/rules/no-useless-escape.js

-            } else {
-                return;
-            }
+                parseRegExp(node.raw.slice(1, -1))


Oops, I found a problem: .slice(1, -1) will be incorrect if the regex has flags, since the trailing slash won't be at index -1.
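A small sketch of the problem and one possible fix. The helper name `getPatternFromRaw` is hypothetical, and this is not necessarily how the PR ended up handling it; reading the `regex.pattern` property that ESTree defines on regex literal nodes would be another option.

```javascript
// Hypothetical helper: extracts the pattern from a regex literal's raw text.
// Flags never contain "/", so the last slash is always the closing delimiter.
function getPatternFromRaw(raw) {
    return raw.slice(1, raw.lastIndexOf("/"));
}

const raw = "/a\\/b/gi"; // source text of the literal /a\/b/gi

console.log(raw.slice(1, -1));       // includes the closing slash and a flag
console.log(getPatternFromRaw(raw)); // just the pattern
```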

eslintbot · 2016-10-22T03:09:27Z

LGTM

not-an-aardvark · 2016-10-22T03:11:30Z

@platinumazure Separated the complex behavior into its own function.

The number of linting errors this causes on the existing ESLint codebase is a bit concerning 😕

ljharb · 2016-10-22T03:13:41Z

lib/rules/no-useless-escape.js

+* @returns {Set} The union of the two sets
+*/
+function union(setA, setB) {
+    return new Set(function *() {


not function* (?

We have the generator-star-spacing rule configured to enforce this spacing.

ah, interesting choice

ljharb · 2016-10-22T03:14:06Z

lib/rules/no-useless-escape.js

+* @param {Set} setB The second set
+* @returns {Set} The union of the two sets
+*/
+function union(setA, setB) {


this is a super clever way to merge Sets, i like it 💯

not-an-aardvark · 2016-10-22T03:28:27Z

CI is failing due to this line and this line. Both of these seem to be valid linting errors, but why are they only getting reported on AppVeyor? Travis didn't report any issues, and I can't reproduce the linting error when running npm test locally.

edit: Actually, it doesn't seem to report any linting errors in ast-utils for me, even after clearing the cache. Maybe it's a Windows filepath thing?

eslintbot · 2016-10-22T04:16:11Z

LGTM

platinumazure · 2016-10-22T23:39:29Z

lib/rules/arrow-body-style.js

                            const tokenAfterArrowBody = sourceCode.getTokenAfter(arrowBody);

-                            if (tokenAfterArrowBody && tokenAfterArrowBody.type === "Punctuator" && /^[(\[\/`+-]/.test(tokenAfterArrowBody.value)) {
+                            if (tokenAfterArrowBody && tokenAfterArrowBody.type === "Punctuator" && /^[([\/`+-]/.test(tokenAfterArrowBody.value)) {


Should the \/ in this example also be flagged by the rule? It's in a character class but I don't know if the slash token is consumed greedily by the tokenizer.

Oh, looks like you're right; I had thought / always needed to be escaped, but apparently it only needs to be escaped outside of character classes.

Sidenote: This implies that /[/]/ is a valid regex, but oddly, /[/]/.toString() returns /[\/]/ (i.e. it auto-escapes the slash, at least in Chrome). It doesn't seem to do this for other escaped characters.
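A quick check of the sidenote, as a sketch: both spellings of the class match a slash. The serialized form is printed without an expected value because, as noted above, it depends on the engine.

```javascript
const unescapedSlashClass = /[/]/;  // slash left unescaped inside the class
const escapedSlashClass = /[\/]/;   // same class with the escape

console.log(unescapedSlashClass.test("a/b"));
console.log(escapedSlashClass.test("a/b"));

// Engine-dependent: some engines re-escape the slash when serializing.
console.log(unescapedSlashClass.toString());
```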

platinumazure

LGTM, thanks for extracting that function.

I left one question, but I'm half certain of the answer (and that it would require no changes).

eslintbot · 2016-10-22T23:53:29Z

LGTM

eslintbot · 2016-10-22T23:59:49Z

LGTM

eslintbot · 2016-10-23T00:06:14Z

LGTM

New open questions on the best approach for this rule (should we put this behind an option, should we enhance with exception lists, etc.). Possible large ecosystem impact.

platinumazure · 2016-10-23T00:07:46Z

As we find new problems, there are concerns about ecosystem impact. The rule is correct in reporting all the new useless escapes, but there may be some cases where users might want to intentionally allow (or prefer) some useless escapes for readability/maintainability. So we may need to consider carefully how best to release this.

not-an-aardvark · 2016-10-23T00:22:49Z

I think we should do the following:

Fix the default behavior by merging this PR. The current behavior is unambiguously a bug.
Introduce an option such as ignoreCharacterClasses: true to revert to the previous behavior, for anyone that wants to allow some useless escapes. This option should be false by default.
Ideally, it would be best if this bugfix and new option were both released in the same version, but I don't think this is a requirement; we can consider the bugfix and the new option independently.

mysticatea · 2016-10-23T12:05:32Z

lib/rules/no-useless-escape.js

+            }
+        }
+        return {
+            charList: state.charList.concat({text: char, escaped: state.escapeNextChar, inCharClass: state.inCharClass, index}),


Does it need to create a new array instance for each character? It looks like this generates many useless copies.
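To illustrate the concern with a simplified sketch (the function names are hypothetical and the records are reduced to `text` and `index`): `concat` allocates a fresh array on every step, while `push` appends to a single array.

```javascript
// Allocates a new array per character, copying all earlier entries each time.
function collectWithConcat(pattern) {
    let list = [];

    for (let index = 0; index < pattern.length; index++) {
        list = list.concat({ text: pattern[index], index });
    }
    return list;
}

// Appends to one array; no intermediate copies are created.
function collectWithPush(pattern) {
    const list = [];

    for (let index = 0; index < pattern.length; index++) {
        list.push({ text: pattern[index], index });
    }
    return list;
}

console.log(collectWithPush("[a-z]").length);
```

Both produce the same list; only the number of intermediate allocations differs.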

eslintbot · 2016-10-23T18:47:31Z

LGTM

mysticatea

Looks great to me.
But there are a few issues regarding unnecessary escapes.

mysticatea · 2016-10-24T04:33:12Z

lib/rules/no-useless-escape.js

    "\\",
-    ".",
-    "-",
    "^",


Escaping ^ in a character class is unnecessary unless it is the first character.

/[\^a]/; // this escape is necessary; it becomes `not a` if the `\` is removed.
/[a\^]/; // this escape is unnecessary; the `^` is a character.
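The difference can be checked directly. This sketch only uses `RegExp.prototype.test`:

```javascript
// Leading position: removing the escape turns the class into a negation.
console.log(/[\^a]/.test("b")); // false: the class contains only "^" and "a"
console.log(/[^a]/.test("b"));  // true: "b" is anything but "a"

// Non-leading position: the escape makes no difference.
console.log(/[a\^]/.test("^")); // true
console.log(/[a^]/.test("^"));  // true
```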

mysticatea · 2016-10-24T04:37:27Z

lib/rules/no-useless-escape.js

-    "(",
-    ")",
    "b",
    "B",


Escaping B in a character class is unnecessary.

/[\B]/; // this escape is unnecessary; the `B` is a character.

On the other hand, the escape in \b is necessary. Inside a character class, \b means a backspace character. (Outside of a character class, \b is a word boundary.)
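A short sketch of both cases, assuming regexes without the `u` flag (unicode mode is stricter about identity escapes such as `\B` inside a class):

```javascript
// Inside a class, \b is the backspace character (U+0008), not the letter "b".
console.log(/[\b]/.test("\u0008")); // true
console.log(/[\b]/.test("b"));      // false
console.log(/[b]/.test("b"));       // true, so dropping the escape changes meaning

// Inside a class, \B is just the letter "B", same as the unescaped form.
console.log(/[\B]/.test("B"));      // true
console.log(/[B]/.test("B"));       // true

// Outside a class, both escapes are assertions and must stay escaped.
console.log(/\bcat/.test("a cat"));  // true: "cat" starts at a word boundary
console.log(/\Bcat/.test("concat")); // true: "cat" starts inside a word
```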

eslintbot · 2016-10-24T05:49:16Z

LGTM

eslintbot · 2016-10-24T06:39:07Z

LGTM

mysticatea

LGTM, awesome!

not-an-aardvark · 2016-10-26T21:39:51Z

TSC Summary: no-useless-escape currently has a false negative in regex character classes. This is clearly a bug in the rule, but there is reason to believe that the fix might have an unusually large impact on the ecosystem (there are 11 existing violations in the ESLint codebase that were not caught due to the bug). In addition, it's plausible that a user might be okay with useless escapes in character classes for readability, even though these escapes are useless according to the rule's definition. The current proposal is to accept this PR as a bugfix (since the rule is currently not working as intended), and add an opt-out option to ignore character classes (something like #7455).

TSC question: Should we merge this PR for the upcoming release and consider the opt-out option separately? If not, how should we handle this fix?

alberto · 2016-10-27T21:46:32Z

TSC Resolution: Merge as is. Opt-out option could be considered in the future.

platinumazure · 2016-10-27T21:55:21Z

@eslint/eslint-tsc @not-an-aardvark Thanks very much for taking the time to carefully deliberate this.

not-an-aardvark added the bug, rule, and accepted labels Oct 22, 2016

jquerybot added the CLA: Valid label Oct 22, 2016

platinumazure suggested changes Oct 22, 2016

gyandeeps reviewed Oct 22, 2016

Update: Fix no-useless-escape false negative in regexes (fixes #7424)

5cbd278

not-an-aardvark commented Oct 22, 2016

not-an-aardvark added 3 commits October 21, 2016 22:54

Fix existing linting errors in the codebase

9af2e56

Handle regexes with flags correctly

6096e2a

Move range-dash checking into its own function

3f4a7ca

not-an-aardvark force-pushed the no-useless-escape-character-classes branch from f5b4021 to 3f4a7ca Compare October 22, 2016 03:09

ljharb reviewed Oct 22, 2016

not-an-aardvark mentioned this pull request Oct 22, 2016

Files in lib/ aren't getting linted #7426

Closed

Remove useless escapes in lib/

bf13222

platinumazure reviewed Oct 22, 2016

platinumazure previously approved these changes Oct 22, 2016

/ is only a valid escape outside of character classes

2fdef28

Fix uselessly-escaped slashes in the codebase

6d9bea0

not-an-aardvark force-pushed the no-useless-escape-character-classes branch from 1e32241 to 6d9bea0 Compare October 23, 2016 00:06

mysticatea reviewed Oct 23, 2016

Don't create unnecessary arrays when parsing RegExps

a77608c

mysticatea suggested changes Oct 24, 2016

\B only needs to be escaped outside of character classes

4eda426

\^ only needs to be escaped at the start of a character class

4611d81

not-an-aardvark force-pushed the no-useless-escape-character-classes branch from 35ec446 to 4611d81 Compare October 24, 2016 06:39

mysticatea approved these changes Oct 24, 2016

not-an-aardvark mentioned this pull request Oct 26, 2016

Update: WIP: add ignoreCharClasses option for no-useless-escape #7455

Closed

not-an-aardvark added the tsc agenda label Oct 26, 2016

alberto removed the tsc agenda label Oct 27, 2016

not-an-aardvark merged commit c675d7d into master Oct 27, 2016

not-an-aardvark deleted the no-useless-escape-character-classes branch October 27, 2016 21:50

not-an-aardvark mentioned this pull request Nov 6, 2016

lib,test: remove unneeded escaping of / nodejs/node#9485

Closed

nightwing mentioned this pull request May 24, 2017

make eslint happy ajaxorg/ace#3151

Closed

eslint-deprecated bot locked and limited conversation to collaborators Feb 6, 2018

eslint-deprecated bot added the archived due to age label Feb 6, 2018
