New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update: handle uncommon linebreaks consistently in rules (fixes #7949) #8049
Conversation
@not-an-aardvark, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mysticatea, @vitorbal and @nzakas to be potential reviewers. |
lib/rules/no-multi-spaces.js
Outdated
@@ -93,7 +93,7 @@ module.exports = { | |||
const sourceCode = context.getSourceCode(), | |||
source = sourceCode.getText(), | |||
allComments = sourceCode.getAllComments(), | |||
pattern = /[^\n\r\u2028\u2029\t ].? {2,}/g; // note: repeating space | |||
pattern = /[^\s].? {2,}/g; // note: repeating space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new regEx is a bit more restrictive then before. It will include tabs and spaces on top of the linebreaks in the "not" group. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be misunderstanding -- the old "not" group also had tabs and spaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you are right, sorry. I didn't notice white space and \t
towards the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at the reference, \s
is the same as [\f\n\r\t\v\u00A0\u2028\u2029]
, so it looks like you still picked up a few extra constrains. Not sure if it matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, looks like you're right. I'll switch it back.
The effect of this is that the following code would not be reported:
foo\f\f + 1
@@ -34,7 +40,7 @@ module.exports = { | |||
create(context) { | |||
const sourceCode = context.getSourceCode(); | |||
|
|||
const BLANK_CLASS = "[ \t\u00a0\u2000-\u200b\u2028\u2029\u3000]", | |||
const BLANK_CLASS = "[ \t\u00a0\u2000-\u200b\u3000]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct, but I'm not sure why it originally included line breaks? Maybe there was a reason for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the linebreaks are redundant here because the regex is only executed on elements of sourceCode.lines
.
line: baseLoc.line + lineInComment, | ||
column | ||
}; | ||
return astUtils.getLocationFromRangeIndex(sourceCode, comment.range[0] + 2 + getColumnInComment(variable, comment)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unrelated to the topic of this PR, but I think we can let it slide:-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. (This actually is a place where no-unused-vars
had a bug due to a bad linebreak regex, but I suppose it could have been fixed in a different way.)
4a1f724
to
79ea45e
Compare
Currently, rules that deal with line-splitting need to include a list of JS linebreaks in some form (e.g. in a regex character class). As a result, splitting a string into JS lines is implemented in 22 different places in the codebase. As described in #7949, not all of these implementations are correct (many of them forget to account for \r, \u2028, and \u2029. This commit updates ast-utils.js with the following properties: * `astUtils.LINEBREAKS`: a Set containing all JS linebreaks * `astUtils.LINEBREAK_MATCHER`: a regular expression to match JS linebreaks * `astUtils.createGlobalLinebreakMatcher`: a function that returns a new global regex to match JS linebreaks Additionally, it updates the other line-splitting code to use these constants. Observable changes: * The error message for `newline-per-chained-call` will handle property names with \u2028 and \u2029 the same way it handles names with \n. * `no-multi-str` will now report an error for strings that contain \r, \u2028, or \u2029. (Previously, it only reported strings containing \n.) * `no-unused-vars` will now report the correct location if a `/* global foo */` comment contains a linebreak other than \n. (Previously, it would report an invalid location.) * `spaced-comment` will no longer report an error if a comment contains tokens in the `exceptions` option followed by \u2028 or \u2029.
79ea45e
to
851b533
Compare
LGTM |
LGTM |
LGTM. Thanks! |
What is the purpose of this pull request? (put an "X" next to item)
[x] Bug fix (see #7949)
What changes did you make? (Give an overview)
Currently, rules that deal with line-splitting need to include a list of JS linebreaks in some form (e.g. in a regex character class). As a result, splitting a string into JS lines is implemented in 22 different places in the codebase. As described in #7949, not all of these implementations are correct (many of them forget to account for \r, \u2028, and \u2029).
This commit updates ast-utils.js with the following properties:
astUtils.LINEBREAKS
: a Set containing all JS linebreaksastUtils.LINEBREAK_MATCHER
: a regular expression to match JS linebreaksastUtils.createGlobalLinebreakMatcher
: a function that returns a new global regex to match JS linebreaksAdditionally, it updates the other line-splitting code to use these constants.
Observable changes:
newline-per-chained-call
will handle property names with \u2028 and \u2029 the same way it handles names with \n.no-multi-str
will now report an error for strings that contain \r, \u2028, or \u2029. (Previously, it only reported strings containing \n.)no-unused-vars
will now report the correct location if a/* global foo */
comment contains a linebreak other than \n. (Previously, it would report an invalid location.)spaced-comment
will no longer report an error if a comment contains tokens in theexceptions
option followed by \u2028 or \u2029.Is there anything you'd like reviewers to focus on?
Nothing in particular