Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: clarify AST and don't use node.start/node.end (fixes #8956) #8984

Merged
merged 6 commits into from Aug 5, 2017

Conversation

mysticatea
Copy link
Member

What is the purpose of this pull request? (put an "X" next to item)

[X] Documentation update
[X] Bug fix (template)

Fixes #8956.

What changes did you make? (Give an overview)

This PR updates the document "Working with Custom Parsers" to clarify the specification of AST.
Also, this PR fixes 8 rules which are violating the spec.

It's hard that we catch the violations in review since start and end properties are legal on Location objects. So I added the custom parser which removes the properties from the result of espree. Tests of core rules use the custom parser by default. This does not affect to plugin developers.

Is there anything you'd like reviewers to focus on?

  • Please correct my English.
  • Check whether the added document is enough or not.

@eslintbot
Copy link

LGTM

@mention-bot
Copy link

@mysticatea, thanks for your PR! By analyzing the history of the files in this pull request, we identified @vitorbal, @alberto and @gyandeeps to be potential reviewers.

@eslintbot
Copy link

LGTM

Copy link
Member

@not-an-aardvark not-an-aardvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This mostly LGTM, but I have a question about using getters rather than deleting the properties.

*/
function removeStartEnd(node) {
delete node.start;
delete node.end;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be confusing for new developers if the start and end properties are removed. For example, if someone uses console.log(node) when running the rule on regular code, it will have the start and end properties, but the properties will disappear during the test.

Would it work to do this instead?

Object.defineProperty(node, "start", { get() {
    throw new Error("Use node.range[0] instead of node.start");
} };

Object.defineProperty(node, "end", { get() {
    throw new Error("Use node.range[1] instead of node.end");
} };

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good idea!
I will do.

@eslintbot
Copy link

LGTM

@eslintbot
Copy link

LGTM

@mysticatea
Copy link
Member Author

I added about Literal#raw properties since I have noticed the property does not exist in ESTree.

Copy link
Member

@platinumazure platinumazure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, just left a few comments.

One thing I'll add here (because otherwise my comment will remain in the diff forever even after making this change): For the parser itself, it would be great to have a @fileoverview comment with a quick comment about the parser and why it exists.


The AST that custom parsers should create is based on [ESTree](https://github.com/estree/estree) basically. The AST requires some additional properties about detail information of the source code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the word "basically" can be removed.


All nodes must have `range` property.

* `range` (`number[]`) is the array of two numbers. Both numbers are a 0-based index which is the position in the array of source code characters. The first is the start position of the node, the second is the end position of the node. `code.slice(node.range[0], node.range[1])` must be the text of the node. This range does not include spaces which are around the node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first sentence should probably end with "is an array of two numbers" (replace the with an).

Do we need to make a note about parentheses surrounding the node, in some cases? I'm fairly sure parentheses are usually excluded also from the range, unless they're part of the grammar (e.g., arrow function parameters).

All nodes must have `range` property.

* `range` (`number[]`) is the array of two numbers. Both numbers are a 0-based index which is the position in the array of source code characters. The first is the start position of the node, the second is the end position of the node. `code.slice(node.range[0], node.range[1])` must be the text of the node. This range does not include spaces which are around the node.
* `loc` (`SourceLocation`) must not be `null`. [The `loc` property is defined as nullable by ESTree](https://github.com/estree/estree/blob/25834f7247d44d3156030f8e8a2d07644d771fdb/es5.md#node-objects), but ESLint requires this property. On the other hand, `SourceLocation#source` property can be `undefined`. ESLint does not use the `SourceLocation#source` property.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to note that for location positions, line is one-based and column is zero-based.

Saying so, maybe this should have a sublist? The main point is the loc must not be null even though loc is nullable in ESTree; then the sublist's bullet points can say that start and end must be a position with 1-based line and 0-based column, and source is not used by ESLint at all.

* `range` (`number[]`) is the array of two numbers. Both numbers are a 0-based index which is the position in the array of source code characters. The first is the start position of the node, the second is the end position of the node. `code.slice(node.range[0], node.range[1])` must be the text of the node. This range does not include spaces which are around the node.
* `loc` (`SourceLocation`) must not be `null`. [The `loc` property is defined as nullable by ESTree](https://github.com/estree/estree/blob/25834f7247d44d3156030f8e8a2d07644d771fdb/es5.md#node-objects), but ESLint requires this property. On the other hand, `SourceLocation#source` property can be `undefined`. ESLint does not use the `SourceLocation#source` property.

The `parent` property of all nodes must be rewriteable. ESLint set their parent node to the `parent` properties while traversing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the second sentence would flow better with the first by reversing the nouns:

ESLint sets each node's parent properties to its parent node while traversing.


#### The `Program` node:

The `Program` node must have `tokens` and `comments` properties. Both properties are the array of the below Token interface.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "Both properties are an array..." works better here.

}
```

* `tokens` (`Token[]`) is the array of tokens which affect the behavior of programs. Arbitrary spaces can exist between tokens, so rules check the `Token#range` to detect spaces between tokens. This must be sorted by `Token#range[0]`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the range of tokens does not include surrounding whitespace, just like for nodes?

* `tokens` (`Token[]`) is the array of tokens which affect the behavior of programs. Arbitrary spaces can exist between tokens, so rules check the `Token#range` to detect spaces between tokens. This must be sorted by `Token#range[0]`.
* `comments` (`Token[]`) is the array of comment tokens. This must be sorted by `Token#range[0]`.

The `tokens` and the `comments` must not be overlapped themselves and each other.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what you are trying to say here. Are you saying that for any comment or token, its range[1] must be less than or equal to all later tokens/comments range[0] so that the ranges do not overlap?

If so, I think that could be expressed like this:

The range indexes of all tokens and comments must not overlap with the range of other tokens and comments.

Otherwise, if I've misunderstood, let me know.

const Traverser = require("../util/traverser");

/**
* Remove `start`/`end` properties from the given node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're no longer removing the start/end properties, but rather redefining them as getters that throw. Does this JSDoc need to be changed?

* @param {ASTNode} node The node to remove.
* @returns {void}
*/
function removeStartEnd(node) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're no longer removing the start/end properties, but rather redefining them as getters that throw. Does this function name need to be changed?


/**
* Remove `start`/`end` properties from the given node.
* @param {ASTNode} node The node to remove.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're no longer removing the start/end properties, but rather redefining them as getters that throw. Does this JSDoc need to be changed?

@not-an-aardvark not-an-aardvark added accepted There is consensus among the team that this change meets the criteria for inclusion chore This change is not user-facing documentation Relates to ESLint's documentation labels Jul 24, 2017
@eslintbot
Copy link

LGTM

@mysticatea
Copy link
Member Author

I'm sorry for delay.

@platinumazure Thank you for the review! I fixed those except one; about 1-based line and 0-based column of SourceLocation. It's defined by ESTree and ESLint has no different.

@kaicataldo kaicataldo merged commit b3e4598 into master Aug 5, 2017
@mysticatea mysticatea deleted the issue8956 branch August 5, 2017 16:22
@eslint-deprecated eslint-deprecated bot locked and limited conversation to collaborators Feb 6, 2018
@eslint-deprecated eslint-deprecated bot added the archived due to age This issue has been archived; please open a new issue for any further discussion label Feb 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion archived due to age This issue has been archived; please open a new issue for any further discussion chore This change is not user-facing documentation Relates to ESLint's documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify non-standard properties of AST that custom parsers should create
7 participants