core(tracehouse): structure main thread task creation to be more forgiving #9491

Merged (11 commits, Sep 26, 2019)

Conversation

@patrickhulce (Collaborator) commented on Jul 31, 2019

Summary
tl;dr - a rewrite of main-thread-tasks to make way for us dropping invalid trace events in the future

This PR converts our main-thread task creation to focus on individual tasks first, then moves on to the hierarchy step. We were being fancy before by doing everything in a single pass, which is more efficient and easier when the trace events are valid, but next to impossible to reason about when you're trying to recover from invalid data. With the new approach, it's much easier to identify unmatched pairs of events, overextending tasks, etc.

Functionality-wise it's mostly identical to our current setup, with the exception of no longer requiring the logic in #9230 to handle same-timestamped events, so the impact should be minimal as-is. The matching of B/E events now necessitates a more forgiving parent/child timing rule so that we can handle missorted events: we allow 1ms of slop between parent and child start/end times.

The next step would be for main-thread-tasks to push the offending trace event onto an invalidTraceEvents array instead of throw new Error, so that we can continue processing while still informing the user that something squirrely was going on.
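
For readers less familiar with this file, here's a minimal sketch of the two-phase structure described above. The names and object shapes are illustrative, not the actual code in this PR; the simple pairing below assumes well-ordered events and omits the forgiving matching that this PR adds on top.

// Sketch only: phase 1 pairs events into flat tasks, phase 2 builds the hierarchy.
function createTasksSketch(mainThreadEvents, traceEndTs) {
  const makeTask = (event, startTime, endTime) =>
    ({event, startTime, endTime, children: [], parent: undefined});

  // Phase 1: pair B/E events (and expand X events) into flat task objects.
  const tasks = [];
  const unmatchedBeginEvents = [];
  for (const event of mainThreadEvents) {
    if (event.ph === 'X') {
      tasks.push(makeTask(event, event.ts, event.ts + (event.dur || 0)));
    } else if (event.ph === 'B') {
      unmatchedBeginEvents.push(event);
    } else if (event.ph === 'E') {
      const beginEvent = unmatchedBeginEvents.pop();
      // This is where invalid data becomes detectable in isolation and, per the
      // follow-up work, would be pushed onto an invalidTraceEvents array instead
      // of throwing.
      if (!beginEvent || beginEvent.name !== event.name) throw new Error('Unmatched E event');
      tasks.push(makeTask(beginEvent, beginEvent.ts, event.ts));
    }
  }

  // Any B events left unmatched at the end of the trace are closed at traceEndTs.
  for (const beginEvent of unmatchedBeginEvents) {
    tasks.push(makeTask(beginEvent, beginEvent.ts, traceEndTs));
  }

  // Phase 2: sort by start time (ties: longer task first) and attach each task
  // to the deepest still-open task that contains it.
  tasks.sort((a, b) => a.startTime - b.startTime || b.endTime - a.endTime);
  let currentTask;
  for (const task of tasks) {
    while (currentTask && task.startTime >= currentTask.endTime) currentTask = currentTask.parent;
    if (currentTask) {
      task.parent = currentTask;
      currentTask.children.push(task);
    }
    currentTask = task;
  }

  return tasks;
}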

Related Issues/PRs
moves us a step closer to addressing #7764

eliminates the need for #9230 in order to address #7764, though as discussed in the PR description, it's still worth doing IMO, as it sets this code up to follow the hot path and is just a more sensible way to receive trace events.

@patrickhulce (Collaborator, author) commented on Aug 5, 2019

OK, so it turns out we already had an "invalid" trace checked into our fixtures and just never noticed, because the main thread task logic didn't check that the corresponding B/E event names matched; it just assumed they were valid. Now we check them, and so we started throwing Child cannot end after parent errors.

This PR now also allows up to 1ms of slop between parent/child events to accommodate these minor mistakes, which should hopefully reduce the number of fatal trace processing errors.
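
For illustration, the slop check amounts to something like the following (hypothetical names, not the exact code in this PR; timestamps are trace-event microseconds, so 1ms is 1000):

// Hypothetical sketch of the ~1ms tolerance when validating parent/child nesting.
const MAX_TASK_TIME_SLOP_US = 1000; // 1ms in trace-event microseconds

function childFitsWithinParent(child, parent) {
  // A child is still considered nested if it starts slightly before or ends
  // slightly after its parent, within the slop allowance.
  return child.startTime >= parent.startTime - MAX_TASK_TIME_SLOP_US &&
    child.endTime <= parent.endTime + MAX_TASK_TIME_SLOP_US;
}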

@patrickhulce changed the title from "core(tracehouse): structure main thread creation to be more forgiving" to "core(tracehouse): structure main thread task creation to be more forgiving" on Aug 6, 2019
@paulirish added the 5.3 label on Aug 27, 2019
@exterkamp (Member) left a comment:

Trying to review this (read: I'm not super familiar with this code). It looks good, but I think it needs some more documentation to really make it understandable.

priorTaskData,
timerInstallEventsReverseQueue
);
} else {
Member:

Remove else

Collaborator Author:

Not sure how to reconcile this one with the request for more comments and documentation; this is explicitly here so that the reader knows what the else branch would have been and why it's not necessary.

Member:

FWIW, I think something like // If there's still a currentTask, that means we're currently in the middle of a task, so nextTask is a child. up at the top of the if works (and then we can drop the else).

currentTask.endTime = traceEndTs;
currentTask = currentTask.parent;
/**
* This function takes the raw trace events sorted in increasing timestamp order and outputs connected task nodes.
Member:

This description is very helpful for understanding this file; I wish it were higher up, or that there were some top-level description of this file.

Collaborator Author:

There is a top-level description of this file that covers everything the file does, while this comment discusses how this particular piece of the puzzle is computed (the only parts of this file that were touched were the trace-parsing pieces).

It sounds like you think it might be worth splitting this file into different components?

I'm not quite sure what that would look like though. Maybe just...

  • main-thread-tasks.js
  • thread-task-creation.js

?

// Create a reversed copy of the array to avoid copying the rest of the queue on every mutation.
const taskEndEventsReverseQueue = taskEndEvents.slice().reverse();

for (let i = 0; i < taskStartEvents.length; i++) {
Member:

There are a lot of nested loops where one side searches left to right and then looks up right to left. Is there any way to iterate once, build a map, and look up what's been seen that way, i.e. go from O(x^2) to O(x)? This might be overcomplicating code that is already complex.

^ That might not make 100% sense given this code is searching 2 separate lists and trying to handle nesting, but this definitely has a suspicious smell to it. Might just need better comments.

Member:

i was curious about the perf here too.

on a theverge trace (42MB), _createTasksFromStartAndEndEvents takes 21ms and its parent _createTasksFromEvents takes just 53ms.

meanwhile, recordsFromLogs takes 1380ms.

so i think right now this is still pretty cheap and not worth optimizing.

Member:

meanwhile, recordsFromLogs takes 1380ms.

Like network recordsFromLogs?? Even theverge will have max 100s of requests. What's taking 1.3s?

Member:

findNetworkQuietPeriods

(screenshot attached)

Collaborator Author:

While it's true the worst-case runtime here is O(n^2), ~99.999% of the time the reverse loop is O(1), because the common case is that B/E events are in the correct order (think of every time we've run LH to date without it throwing a fatal error; those are all cases where this reverse loop was O(1) on every event). The only reason we're doing this at all is to handle the rare and unexpected case where Chrome put the events in an order that doesn't make sense. When that happens we have to look at every event in the worst case, so there's no way to escape O(n^2).

Also, what Paul said: it's fine right now, so it seems premature to worry about taking up extra memory with a map.

Seems like a good idea to put all this into a comment somewhere :)

EDIT: I forgot about nested events when originally writing this, but because we don't have stack sampling enabled the max depth should still be ~5, so it's still O(1) :D Example: the average number of inner-loop iterations on theverge was 1.1387605983318398.


/*
An artistic rendering of the below trace:
█████████████████████████████TaskA██████████████████████████████████████████████
Member:

I would love some of these renderings in the code as comments explaining what it is doing! 🎨

let matchedEventIndex = -1;
let matchingNestedEventCount = 0;
let matchingNestedEventIndex = i + 1;
for (let j = taskEndEventsReverseQueue.length - 1; j >= 0; j--) {
Member:

if we reversed this array, it's unclear why we're iterating from back to front.

does this improve the perf of the pop()s and splice()s?

Collaborator Author:

yep, pop() <<< shift(). Added a comment to this effect and a better function explanation 👍
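
For anyone skimming the thread, the trick being described is roughly this (placeholder data, not the real queue):

// Sketch of the reversed-queue trick: consuming from the logical front of a
// queue via pop() on a reversed copy is O(1) per element, whereas shift() on
// the original ordering is O(n) because it reindexes the whole array.
const taskEndEvents = [{ts: 1}, {ts: 2}, {ts: 3}]; // placeholder events
const taskEndEventsReverseQueue = taskEndEvents.slice().reverse();

while (taskEndEventsReverseQueue.length) {
  // The earliest remaining end event sits at the back of the reversed copy.
  const nextEndEvent = taskEndEventsReverseQueue.pop();
  console.log(nextEndEvent.ts); // 1, then 2, then 3
}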

for (let j = taskEndEventsReverseQueue.length - 1; j >= 0; j--) {
const endEvent = taskEndEventsReverseQueue[j];
// We are considering an end event, so we'll count how many nested events we saw along the way.
while (matchingNestedEventIndex < taskStartEvents.length &&
Member:

looking at https://cs.chromium.org/chromium/src/third_party/catapult/tracing/tracing/extras/importer/trace_event_importer.html (and slice_group) .. they use something very simple
basically since sync events are assumed to be sync... if we have an End, then it must match with the most recently found Begin event.

also interestingly they don't sort events or preprocess before doing this begin/end matching. perhaps making this assumption from the trace saves some headaches?

Member:

i tried out a basic implementation of this approach but it fails on

  • should handle out-of-order 0 duration tasks
  • should handle child events that extend <1ms beyond parent event

Collaborator Author:

perhaps making this assumption from the trace saves some headaches?

It absolutely does save headaches, and that's how the old version was written, but that assumption is also exactly what fails on several real-world traces. Recovering from that error situation is what this rewrite is about, so if there's a way to make the old approach work and nicely recover from errors when that assumption breaks down, I'm definitely open to someone else running with their vision for it :)
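
For context, the catapult-style matching being discussed is roughly the following (an approximation for illustration, not the actual catapult or old Lighthouse code):

// Rough approximation of stack-based B/E matching: every E event is assumed to
// close the most recently opened B event on the same thread.
function matchWithStack(events) {
  const openTasks = [];
  const completedTasks = [];
  for (const event of events) {
    if (event.ph === 'B') {
      openTasks.push({event, startTime: event.ts, endTime: undefined});
    } else if (event.ph === 'E') {
      const task = openTasks.pop();
      if (!task) continue; // an unmatched E event is silently dropped here
      task.endTime = event.ts;
      completedTasks.push(task);
    }
  }
  return completedTasks;
}
// This is simple and fast, but it breaks down on the out-of-order and
// 0-duration cases mentioned above, which is exactly what the forgiving
// rewrite needs to recover from.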

* @param {PriorTaskData} priorTaskData
* @param {Array<LH.TraceEvent>} reverseEventsQueue
*/
static _assignAllTimerInstallsBetweenTasks(
Member:

are there other test cases we should add for this new method?

Collaborator Author:

I already added

it('should compute attributableURLs correctly across timers', () => {
  const baseTs = 1241250325;
  const url = s => ({args: {data: {url: s}}});
  const stackFrames = f => ({args: {data: {stackTrace: f.map(url => ({url}))}}});
  const timerId = id => ({args: {data: {timerId: id}}});

  /*
  An artistic rendering of the below trace:
  █████████████████████████████TaskA██████████████████████████████████████████████
      ████████████████TaskB███████████████████                █Timer Fire█
          ████EvaluateScript██████                                 █TaskE█
              | <-- Timer Install
  */
  const traceEvents = [
    ...boilerplateTrace,
    {ph: 'X', name: 'TaskA', pid, tid, ts: baseTs, dur: 100e3, ...url('about:blank')},
    {ph: 'B', name: 'TaskB', pid, tid, ts: baseTs + 5e3, ...stackFrames(['urlB.1', 'urlB.2'])},
    {ph: 'X', name: 'EvaluateScript', pid, tid, ts: baseTs + 10e3, dur: 30e3, ...url('urlC')},
    {ph: 'I', name: 'TimerInstall', pid, tid, ts: baseTs + 15e3, ...timerId(1)},
    {ph: 'E', name: 'TaskB', pid, tid, ts: baseTs + 55e3},
    {ph: 'X', name: 'TimerFire', pid, tid, ts: baseTs + 75e3, dur: 10e3, ...timerId(1)},
    {ph: 'X', name: 'TaskE', pid, tid, ts: baseTs + 80e3, dur: 5e3, ...stackFrames(['urlD'])},
  ];

  traceEvents.forEach(evt => {
    evt.cat = 'devtools.timeline';
    evt.args = evt.args || args;
  });

  const tasks = run({traceEvents});
  const taskA = tasks.find(task => task.event.name === 'TaskA');
  const taskB = tasks.find(task => task.event.name === 'TaskB');
  const taskC = tasks.find(task => task.event.name === 'EvaluateScript');
  const taskD = tasks.find(task => task.event.name === 'TimerFire');
  const taskE = tasks.find(task => task.event.name === 'TaskE');

  expect(taskA.attributableURLs).toEqual([]);
  expect(taskB.attributableURLs).toEqual(['urlB.1', 'urlB.2']);
  expect(taskC.attributableURLs).toEqual(['urlB.1', 'urlB.2', 'urlC']);
  expect(taskD.attributableURLs).toEqual(['urlB.1', 'urlB.2', 'urlC']);
  expect(taskE.attributableURLs).toEqual(['urlB.1', 'urlB.2', 'urlC', 'urlD']);
});
which is enough to cover the case I had in mind for this method, but maybe there's another case I'm missing?

@brendankenny (Member) left a comment:

Some general feedback, but this looks great. I like splitting the construction phases a lot.

taskEndEvent = {ph: 'E', ts: traceEndTs};
} else if (matchedEventIndex === taskEndEventsReverseQueue.length - 1) {
// Use .pop() in the common case where the immediately next event is needed.
// It's ~25x faster, https://jsperf.com/pop-vs-splice.
Member:

FWIW for a trace that ends up ~70% popped vs 30% spliced, I see no performance difference eliminating the popped path altogether.


for (let i = 0; i < taskStartEvents.length; i++) {
const taskStartEvent = taskStartEvents[i];
if (taskStartEvent.ph === 'X') {
Member:

it might be clearer to handle this separately altogether instead of combining with the B case?

Collaborator Author:

the solution to this isn't clear to me, so I might need some help :)

What's clear to me now is that every start-task case is handled here, which is comforting compared to making sure I have my bases covered by combining the results of several methods (if that's what the suggestion is). IMO it's maybe a wash, considering the mental overhead here is only 3 lines. I might be missing what "separately altogether" looks like, though 🤷‍♂

Member:

yeah, it wouldn't be much different, just something like if (event.ph === 'X') taskXEvents.push(event); down in _createTasksFromEvents and a separate function _createTasksFromXEvents that's basically just these first three lines.

The main thing for me wasn't having this code here (though splitting would keep it slightly more focused); it was the matchingNestedEventIndex business below, where I was trying to figure out if the name check was sufficient to rule out X events, whether having them in there would mess up the matching order in taskEndEventsReverseQueue, etc. All of that works fine, and there's no performance advantage, but it might be nice to separate them to ditch any mental overhead of having B and X together.

Not a huge deal, it also works as it is.
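
For the record, the suggested split might look something like this (hypothetical helper based on the comment above, not code in this PR); complete 'X' events already carry a duration, so they can become tasks directly and stay out of the B/E matching loop:

// Hypothetical sketch of the suggested _createTasksFromXEvents-style helper.
function createTasksFromXEvents(taskXEvents) {
  return taskXEvents.map(event => ({
    event,
    startTime: event.ts,
    endTime: event.ts + (event.dur || 0),
    children: [],
    parent: undefined,
  }));
}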

const nextTask = sortedTasks[i];

// Update `currentTask` based on the elapsed time.
// The `nextTask` may be after currentTask has ended.
while (
currentTask &&
Number.isFinite(currentTask.endTime) &&
Member:

when is this an issue?

Collaborator Author:

Given the new construction method, it should never be an issue when given a valid trace. Unfortunately, I no longer have much confidence that this code will be given a valid trace these days 😢

Collaborator Author:

(to be fair though, it should be much, much rarer than before, where dur itself is not a finite number)

@brendankenny (Member) left a comment:

LGTM!

@brendankenny (Member):

(with the really important lint fix)

@brendankenny (Member):

haha gg eslint

@paulirish (Member) left a comment:

let's do it.
