Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update: handle uncommon linebreaks consistently in rules (fixes #7949) #8049

Merged
merged 3 commits into from Feb 13, 2017

Conversation

not-an-aardvark
Copy link
Member

What is the purpose of this pull request? (put an "X" next to item)

[x] Bug fix (see #7949)

What changes did you make? (Give an overview)

Currently, rules that deal with line-splitting need to include a list of JS linebreaks in some form (e.g. in a regex character class). As a result, splitting a string into JS lines is implemented in 22 different places in the codebase. As described in #7949, not all of these implementations are correct (many of them forget to account for \r, \u2028, and \u2029).

This commit updates ast-utils.js with the following properties:

  • astUtils.LINEBREAKS: a Set containing all JS linebreaks
  • astUtils.LINEBREAK_MATCHER: a regular expression to match JS linebreaks
  • astUtils.createGlobalLinebreakMatcher: a function that returns a new global regex to match JS linebreaks

Additionally, it updates the other line-splitting code to use these constants.

Observable changes:

  • The error message for newline-per-chained-call will handle property names with \u2028 and \u2029 the same way it handles names with \n.
  • no-multi-str will now report an error for strings that contain \r, \u2028, or \u2029. (Previously, it only reported strings containing \n.)
  • no-unused-vars will now report the correct location if a /* global foo */ comment contains a linebreak other than \n. (Previously, it would report an invalid location.)
  • spaced-comment will no longer report an error if a comment contains tokens in the exceptions option followed by \u2028 or \u2029.

Is there anything you'd like reviewers to focus on?

Nothing in particular

@not-an-aardvark not-an-aardvark added accepted There is consensus among the team that this change meets the criteria for inclusion bug ESLint is working incorrectly chore This change is not user-facing enhancement This change enhances an existing feature of ESLint rule Relates to ESLint's core rules labels Feb 8, 2017
@mention-bot
Copy link

@not-an-aardvark, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mysticatea, @vitorbal and @nzakas to be potential reviewers.

@@ -93,7 +93,7 @@ module.exports = {
const sourceCode = context.getSourceCode(),
source = sourceCode.getText(),
allComments = sourceCode.getAllComments(),
pattern = /[^\n\r\u2028\u2029\t ].? {2,}/g; // note: repeating space
pattern = /[^\s].? {2,}/g; // note: repeating space
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new regEx is a bit more restrictive then before. It will include tabs and spaces on top of the linebreaks in the "not" group. Is this intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding -- the old "not" group also had tabs and spaces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you are right, sorry. I didn't notice white space and \t towards the end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, looking at the reference, \s is the same as [\f\n\r\t\v\u00A0\u2028\u2029], so it looks like you still picked up a few extra constrains. Not sure if it matters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looks like you're right. I'll switch it back.

The effect of this is that the following code would not be reported:

foo\f\f  + 1

@@ -34,7 +40,7 @@ module.exports = {
create(context) {
const sourceCode = context.getSourceCode();

const BLANK_CLASS = "[ \t\u00a0\u2000-\u200b\u2028\u2029\u3000]",
const BLANK_CLASS = "[ \t\u00a0\u2000-\u200b\u3000]",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct, but I'm not sure why it originally included line breaks? Maybe there was a reason for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the linebreaks are redundant here because the regex is only executed on elements of sourceCode.lines.

line: baseLoc.line + lineInComment,
column
};
return astUtils.getLocationFromRangeIndex(sourceCode, comment.range[0] + 2 + getColumnInComment(variable, comment));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks unrelated to the topic of this PR, but I think we can let it slide:-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. (This actually is a place where no-unused-vars had a bug due to a bad linebreak regex, but I suppose it could have been fixed in a different way.)

Currently, rules that deal with line-splitting need to include a list of JS linebreaks in some form (e.g. in a regex character class). As a result, splitting a string into JS lines is implemented in 22 different places in the codebase. As described in #7949, not all of these implementations are correct (many of them forget to account for \r, \u2028, and \u2029.

This commit updates ast-utils.js with the following properties:

* `astUtils.LINEBREAKS`: a Set containing all JS linebreaks
* `astUtils.LINEBREAK_MATCHER`: a regular expression to match JS linebreaks
* `astUtils.createGlobalLinebreakMatcher`: a function that returns a new global regex to match JS linebreaks

Additionally, it updates the other line-splitting code to use these constants.

Observable changes:

* The error message for `newline-per-chained-call` will handle property names with \u2028 and \u2029 the same way it handles names with \n.
* `no-multi-str` will now report an error for strings that contain \r, \u2028, or \u2029. (Previously, it only reported strings containing \n.)
* `no-unused-vars` will now report the correct location if a `/* global foo */` comment contains a linebreak other than \n. (Previously, it would report an invalid location.)
* `spaced-comment` will no longer report an error if a comment contains tokens in the `exceptions` option followed by \u2028 or \u2029.
@eslintbot
Copy link

LGTM

@eslintbot
Copy link

LGTM

@ilyavolodin
Copy link
Member

LGTM. Thanks!

@ilyavolodin ilyavolodin merged commit acc3301 into master Feb 13, 2017
@not-an-aardvark not-an-aardvark deleted the unify-linebreak-logic branch February 13, 2017 03:02
@eslint-deprecated eslint-deprecated bot locked and limited conversation to collaborators Feb 6, 2018
@eslint-deprecated eslint-deprecated bot added the archived due to age This issue has been archived; please open a new issue for any further discussion label Feb 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion archived due to age This issue has been archived; please open a new issue for any further discussion bug ESLint is working incorrectly chore This change is not user-facing enhancement This change enhances an existing feature of ESLint rule Relates to ESLint's core rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants