Skip to content
Prev Previous commit
Next Next commit
timers: 3x faster setImmediate
Save the setImmediate callback arguments into an array instead of a
closure, and invoke the callback on the arguments from an optimizable
function.

  60% faster setImmediate with 0 args (15% if self-recursive)
  4x faster setImmediate with 1-3 args, 2x with > 3
  seems to be faster with less memory pressure when memory is tight

Changes:
- use L.create() to build faster lists
- use runCallback() from within tryOnImmediate
- create immediate timers with a function instead of new
- just save the arguments and not build closures for the callbacks
  • Loading branch information
Andras committed Apr 28, 2016
commit d005b8193e968f73c6fbb5a3ea8bb79897a3f8da
70 changes: 41 additions & 29 deletions lib/timers.js
Original file line number Diff line number Diff line change
Expand Up @@ -515,9 +515,13 @@ function processImmediate() {
immediate = L.shift(queue);
domain = immediate.domain;

if (!immediate._onImmediate)
continue;

if (domain)
domain.enter();

immediate._callback = immediate._onImmediate;
tryOnImmediate(immediate, queue);

if (domain)
Expand All @@ -538,7 +542,8 @@ function processImmediate() {
function tryOnImmediate(immediate, queue) {
var threw = true;
try {
immediate._onImmediate();
// make the actual call outside the try/catch to allow it to be optimized
runCallback(immediate);
threw = false;
} finally {
if (threw && !L.isEmpty(queue)) {
Expand All @@ -552,56 +557,66 @@ function tryOnImmediate(immediate, queue) {
}
}

function runCallback(timer) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we don't do this optimization anywhere else in the code more generically? It sounds like something that would go in util. If we don't then LGTM.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use this construct in other projects, and found it hard to make generic;
the call semantics don't cover all uses cases well. I have 3 variants,
this version I used here is a 4th.

That being said, I agree, an optimized call invoker is a nice utility to have.

var argv = timer._argv;
var argc = argv ? argv.length : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

switch (argc) {
// fast-path callbacks with 0-3 arguments
case 0:
return timer._callback();
case 1:
return timer._callback(argv[0]);
case 2:
return timer._callback(argv[0], argv[1]);
case 3:
return timer._callback(argv[0], argv[1], argv[2]);
// more then 3 arguments run slower with .apply
default:
return timer._callback.apply(timer, argv);
}
}

function Immediate() { }

Immediate.prototype.domain = undefined;
Immediate.prototype._onImmediate = undefined;
Immediate.prototype._idleNext = undefined;
Immediate.prototype._idlePrev = undefined;

function createImmediate(callback) {
return {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here RE: creating a new instance with a no prototype inheritance

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no difference; see reply above

_idleNext: null,
_idlePrev: null,
_callback: callback,
_argv: null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even better: createImmediate(callback, argv)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried passing two params to the create function, it was slower (but I was
timing passing in the domain). I'll double-check.

Copy link
Author

@andrasq andrasq Apr 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. So passing two args to the create function runs at the same speed, but moving the
create to below the args gathering and leaving the var as const drops the speed.
Creating the immediate as var immediate = createImmediate(callback, args) is 50% faster
than the almost identical const immediate = ... . In that location -- above the switch
const and var run the same. This I don't understand (maybe a v8 scoping optimization?)

Stylistically, I like not having to poke attributes into the object, but creating it up top
and populating it below may be a clearer flow, it's how setTimeout and setInterval work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrasq If you add --trace-opt --trace-deopt to the command line, do you see any aborted optimizations or deopts related to the containing function? I have seen specific uses of const that have caused aborted optimizations which are very noticeable when benchmarking.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex good tip! It's [aborted optimizing 0x21a4311d <JS Function exports.setImmediate (SharedFunctionInfo 0x21a41ce5)> because: Unsupported phi use of const variable]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so it looks like it will need to be kept var then. My guess is that v8 still hasn't optimized all const use cases yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although var should be changed to const wherever v8 doesn't abort optimizations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried passing two params to the create function, it was slower (but I was
timing passing in the domain). I'll double-check.

Could you post some numbers on this, what are we talking about? Microseconds, nanoseconds?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was small, nanoseconds, but I didn't record it and can't reproduce it now.
I re-timed passing two parameters to the create function and compiled into node
the two-argument form is distinctly faster, so I'll change this (my test rig loads
the file under test with require, so can run the same timers.js source under
various engines. I'd been comparing v0.10.42 vs v4.4.0 vs v5.10.1 vs v6.0.0)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed the edits (passing both callback and arguments to createImmediate())

_onImmediate: callback,
domain: process.domain,
};
}

exports.setImmediate = function(callback, arg1, arg2, arg3) {
if (typeof callback !== 'function') {
throw new TypeError('"callback" argument must be a function');
}

var i, args;
var len = arguments.length;
var immediate = new Immediate();
var immediate = createImmediate(callback);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be const too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


L.init(immediate);

switch (len) {
switch (arguments.length) {
// fast cases
case 0:
case 1:
immediate._onImmediate = callback;
break;
case 2:
immediate._onImmediate = function() {
callback.call(immediate, arg1);
};
immediate._argv = [arg1];
break;
case 3:
immediate._onImmediate = function() {
callback.call(immediate, arg1, arg2);
};
immediate._argv = [arg1, arg2];
break;
case 4:
immediate._onImmediate = function() {
callback.call(immediate, arg1, arg2, arg3);
};
immediate._argv = [arg1, arg2, arg3];
break;
// slow case
default:
var len = arguments.length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the 1/2/3/more argument optimization really help with anything here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a good deal faster than iterating over arguments iirc

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, x = [1, 2, 3] runs faster than x = new Array(3), and setting the values is extra.
Some 50+% faster to create pre-populated arrays of fixed length 3. (In fact, [1], [1,2,3] and
[1,2,3,4,5,6,7,8,9] all run equally fast, while explicitly setting values is linear in array length.)

Normally either would be blazing fast, 75 million vs 48 million / second, but when tuning
code that itself runs at 4 million operations / second, the difference amounts to 3% of the total.
(the baseline was 1 million / second, where the difference was under 1%)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but we're not actually invoking anything here, is it still slow to pass arguments around to the same function where you can have this sort of dispatch anyway?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that's a great question, I checked, and the answer turns to be weird -- the "/more" case
(where the arguments are copied out) runs much faster. Like 3x faster. But the special-cased
0/1/2/3 args run 25-30% slower. I guess passing arguments still disables optimization.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjamingr I changed the three lines that copy the callback params out of arguments,
for an added 2x speedup of the "many" (more than 3) arguments handling. (And yes, it's
still faster to special-case the 1/2/3 handling, but 4/5 no longer, so 4 is the crossover.)

Turns out node v6 is much much faster to .apply args in an array that was dynamically
extended than if it was allocated to the exact size. Seems to have changed with v6;
v5 and v4 were faster to .apply arrays that were created the correct size.
The change is significant, now 4x slower than before.

From the commit message:

        // faster to copy args, but v6 very slow to .apply them:
        var argv = new Array(arguments.length - 1);   // exact size
        for (var i = 1; i < arguments.length; i++)
          argv[i - 1] = arguments[i];

        // slower to copy args, but v6 much faster to .apply them:
        var argv = new Array();                       // grow as needed
        for (var i = 1; i < arguments.length; i++)
          argv[i - 1] = arguments[i];

args = new Array(len - 1);
for (i = 1; i < len; i++)
args[i - 1] = arguments[i];

immediate._onImmediate = function() {
callback.apply(immediate, args);
};
immediate._argv = args;
break;
}

Expand All @@ -610,9 +625,6 @@ exports.setImmediate = function(callback, arg1, arg2, arg3) {
process._immediateCallback = processImmediate;
}

if (process.domain)
immediate.domain = process.domain;

L.append(immediateQueue, immediate);

return immediate;
Expand Down