Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory allocation in rules and utils #2976

Draft
wants to merge 13 commits into
base: master
Choose a base branch
from
Draft
12 changes: 7 additions & 5 deletions lib/rules/jsx-indent-props.js
Expand Up @@ -138,8 +138,8 @@ module.exports = {
*/
function getNodeIndent(node) {
let src = context.getSourceCode().getText(node, node.loc.start.column + extraColumnStart);
const lines = src.split('\n');
src = lines[0];
const matches = src.match(/^(.*)\n/);
src = matches ? matches[1] : src;

let regExp;
if (indentType === 'space') {
Expand All @@ -150,14 +150,16 @@ module.exports = {

const indent = regExp.exec(src);
const useOperator = /^([ ]|[\t])*[:]/.test(src) || /^([ ]|[\t])*[?]/.test(src);
const useBracket = /^([ ]|[\t])*[<]/.test(src);

line.currentOperator = false;
if (useOperator) {
line.isUsingOperator = true;
line.currentOperator = true;
} else if (useBracket) {
line.isUsingOperator = false;
} else {
const useBracket = /^([ ]|[\t])*[<]/.test(src);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this regex at module level, rather than nesting it deeper? (same with the useOperator regexes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - this actually resulted in my learning that, on top of regex literals being faster than dynamic (new RegExp(string)) ones, in ES5, regex literal instances have their own identity, which means caching them at the module level is faster yet again:
https://stackoverflow.com/questions/9750338/dynamic-vs-inline-regexp-performance-in-javascript/32523333

The storedRegExp is about 5 - 20% percent faster across browsers than inlineRegExp
Always create your immutable regexps with literal syntax and cache it if it's to be re-used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm relatively sure that's not unique to ES5, that's eternally been the semantics in JS.

if (useBracket) {
line.isUsingOperator = false;
}
}

return indent ? indent[0].length : 0;
Expand Down
13 changes: 6 additions & 7 deletions lib/rules/jsx-indent.js
Expand Up @@ -149,20 +149,19 @@ module.exports = {
excludeCommas = excludeCommas || false;

let src = context.getSourceCode().getText(node, node.loc.start.column + extraColumnStart);
const lines = src.split('\n');
let matches;
if (byLastLine) {
src = lines[lines.length - 1];
matches = src.match(/.*\n(.*)$/);
} else {
src = lines[0];
matches = src.match(/^(.*)\n/);
}
Comment on lines +152 to 157
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let matches;
if (byLastLine) {
src = lines[lines.length - 1];
matches = src.match(/.*\n(.*)$/);
} else {
src = lines[0];
matches = src.match(/^(.*)\n/);
}
const matches = byLastLine ? src.match(/.*\n(.*)$/) : src.match(/^(.*)\n/);


const skip = excludeCommas ? ',' : '';
src = matches ? matches[1] : src;

let regExp;
if (indentType === 'space') {
regExp = new RegExp(`^[ ${skip}]+`);
regExp = excludeCommas ? /^[ ,]+/ : /^[ ]+/;
} else {
regExp = new RegExp(`^[\t${skip}]+`);
regExp = excludeCommas ? /^[\t,]+/ : /^[\t]+/;
}

const indent = regExp.exec(src);
Expand Down
2 changes: 1 addition & 1 deletion lib/rules/jsx-no-useless-fragment.js
Expand Up @@ -19,7 +19,7 @@ function isJSXText(node) {
* @returns {boolean}
*/
function isOnlyWhitespace(text) {
return text.trim().length === 0;
return !/\S/.test(text);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these aren't necessarily always the same thing; trim can be polyfilled to be more correct, but regexes can't be.

Additionally, is this better than /^[^\S]*$/.test(text), which might be clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - it's hard to know for sure if this regex is truly equivalent to trim - a cursory glance at the spec says it removes "WhiteSpace and LineTerminator.".

The nice thing is it doesn't involve creating a new string, and will finish early the moment it finds a non-whitespace character, rather than continuing on trying to remove all the whitespace.

In terms of clarity I'm not sure - I'm happy to go with your suggestion here, but is /^\s*$/ equivalent, i.e. "Match only on whitespace from start to end"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm not convinced that .trim() is meaningfully slower than using a custom regex. Do you have data to support that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the late reply!

I was really convinced a regex would be faster because there'd be less work (stop the moment non-whitespace is found, rather than doing the work of constructing a new string, then checking its length), but after constructing some basic micro benchmarks to check it, they're basically equivalent in speed, and in Chrome 91 + latest V8 sometimes trim was faster.

Thanks for pushing back on this, I'm glad it was double checked - will revert this change.

}

/**
Expand Down
2 changes: 1 addition & 1 deletion lib/rules/no-unstable-nested-components.js
Expand Up @@ -35,7 +35,7 @@ function generateErrorMessageWithParentName(parentName) {
* @returns {Boolean}
*/
function startsWithRender(text) {
return (text || '').startsWith('render');
return text !== undefined && text.startsWith('render');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return text !== undefined && text.startsWith('render');
return text && text.startsWith('render');

since null and false and a few other non-undefined things are falsy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot - if it's a false-y string, it won't start with 'render' anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm specifically interested in changes related to this rule. Thanks for testing the performance @willheslam!

While this improvement is definitely welcome I'm having a hard time believing it actually speeds up this rule by ~40%.
Is it really that slow to create new Strings? This method is not even accessed that often.
Why shouldn't other Strings also be declared once in module scope, e.g. render, CallExpression?

Rule Time (ms) Relative
react/no-unstable-nested-components 172.198 2.7%

to

Rule Time (ms) Relative
react/no-unstable-nested-components 95.287 2.3%

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - the performance gains for no-unstabled-nested-components don't come from the change here - it's from the changes in lib/util/Components.js.
As an example, running soley no-unstable-nested-components on a large-ish project results in isES6Component being invoked many times, and each time (without the change) a regular expression is created dynamically via a string, used once, then discarded.

There are lots of other optimisations inside lib/util/Components.js that are contributing to this speed up too - to prove this is the case, I tried running no-unstable-nested-components with just the lib/util/Components.js changes applied, and got a similar speed up as you note above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that makes perfect sense. Those utilities are perfect spot for optimization as they are being used widely across the code base.
That's impressive improvement! 💯

}

/**
Expand Down
20 changes: 16 additions & 4 deletions lib/util/Components.js
Expand Up @@ -27,7 +27,12 @@ function usedPropTypesAreEquivalent(propA, propB) {
if (!propA.allNames && !propB.allNames) {
return true;
}
if (Array.isArray(propA.allNames) && Array.isArray(propB.allNames) && propA.allNames.join('') === propB.allNames.join('')) {
if (Array.isArray(propA.allNames)
&& Array.isArray(propB.allNames)
&& (
(propA.allNames === propB.allNames)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will this check ever actually be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my testing, most of the time.
This ended up being a strict optimisation above and beyond the (expensive) joins.
Ideally, we'd probably be better off swapping out the join comparison for some kind of fast array of strings equality check as it's still run a good chunk of the time.

Please check my logic below:

With this basic benchmarking on a project with about 550 jsx files:

let conditionChecked = 0
let arraysDoHaveIdentityEquality = 0
let arraysDoNotHaveIdentityEquality = 0
function usedPropTypesAreEquivalent(propA, propB) {
  if (propA.name === propB.name) {
    if (!propA.allNames && !propB.allNames) {
      return true;
    }
    if (Array.isArray(propA.allNames) && Array.isArray(propB.allNames)) {
      conditionChecked ++
    }
    if (Array.isArray(propA.allNames) && Array.isArray(propB.allNames) && propA.allNames.join('') === propB.allNames.join('')) {
      if(propA.allNames === propB.allNames) {
        arraysDoHaveIdentityEquality ++
      } else {
        arraysDoNotHaveIdentityEquality ++
      }
      console.log('Arrays have identity equality:', arraysDoHaveIdentityEquality)
      console.log('Arrays do not have identity equality', arraysDoNotHaveIdentityEquality)
      console.log('Ratio', arraysDoHaveIdentityEquality / arraysDoNotHaveIdentityEquality)
      console.log('Total times condition reached', arraysDoHaveIdentityEquality + arraysDoNotHaveIdentityEquality)
      console.log("Condition checked", conditionChecked)
      return true;
    }
    return false;
  }
  return false;
}

With an otherwise unaltered lib/util/Components.js,
I sometimes got a ratio of arrays having identity equality of 4 to 1 when the condition had been checked nearly 80,000 times:

Arrays have identity equality: 60555
Arrays do not have identity equality 14723
Ratio 4.112952523262922
Total times condition reached 75278
Condition checked 78661

but this settled down to about 2 to 1 after the condition had been checked almost a million times.

Arrays have identity equality: 572805
Arrays do not have identity equality 291390
Ratio 1.9657675280551838
Total times condition reached 864195
Condition checked 990195

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm surprised that the AST parser would generate two memoized allNames arrays for different nodes.

|| propA.allNames.join('') === propB.allNames.join('')
)) {
Comment on lines +30 to +35
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (Array.isArray(propA.allNames)
&& Array.isArray(propB.allNames)
&& (
(propA.allNames === propB.allNames)
|| propA.allNames.join('') === propB.allNames.join('')
)) {
if (
Array.isArray(propA.allNames)
&& Array.isArray(propB.allNames)
&& (
propA.allNames === propB.allNames
|| propA.allNames.join('') === propB.allNames.join('')
)
) {

return true;
}
return false;
Expand Down Expand Up @@ -235,6 +240,10 @@ function componentRule(rule, context) {
const components = new Components();
const wrapperFunctions = getWrapperFunctions(context, pragma);

let es5ComponentRegex;
let es6ComponentRegex;
let pureComponentRegex;

// Utilities for component detection
const utils = {

Expand All @@ -248,7 +257,8 @@ function componentRule(rule, context) {
if (!node.parent) {
return false;
}
return new RegExp(`^(${pragma}\\.)?${createClass}$`).test(sourceCode.getText(node.parent.callee));
es5ComponentRegex = es5ComponentRegex || new RegExp(`^(${pragma}\\.)?${createClass}$`);
return es5ComponentRegex.test(sourceCode.getText(node.parent.callee));
},

/**
Expand All @@ -265,7 +275,8 @@ function componentRule(rule, context) {
if (!node.superClass) {
return false;
}
return new RegExp(`^(${pragma}\\.)?(Pure)?Component$`).test(sourceCode.getText(node.superClass));
es6ComponentRegex = es6ComponentRegex || new RegExp(`^(${pragma}\\.)?(Pure)?Component$`);
return es6ComponentRegex.test(sourceCode.getText(node.superClass));
},

/**
Expand Down Expand Up @@ -308,7 +319,8 @@ function componentRule(rule, context) {
*/
isPureComponent(node) {
if (node.superClass) {
return new RegExp(`^(${pragma}\\.)?PureComponent$`).test(sourceCode.getText(node.superClass));
pureComponentRegex = pureComponentRegex || new RegExp(`^(${pragma}\\.)?PureComponent$`);
return pureComponentRegex.test(sourceCode.getText(node.superClass));
}
return false;
},
Expand Down
9 changes: 7 additions & 2 deletions lib/util/ast.js
Expand Up @@ -79,6 +79,11 @@ function getComponentProperties(node) {
}
}

function getLastNewLine(string) {
const match = string.match(/\n(.*)$/);
return match ? match[1] : string;
}

/**
* Gets the first node in a line from the initial node, excluding whitespace.
* @param {Object} context The node to check
Expand All @@ -92,11 +97,11 @@ function getFirstNodeInLine(context, node) {
do {
token = sourceCode.getTokenBefore(token);
lines = token.type === 'JSXText'
? token.value.split('\n')
? getLastNewLine(token.value)
: null;
} while (
token.type === 'JSXText'
&& /^\s*$/.test(lines[lines.length - 1])
&& /^\s*$/.test(lines)
);
return token;
}
Expand Down