The Problem
Something I've noticed is that jest can spend more time requiring modules than actually running tests. This is a consequence of each test getting a completely fresh copy of its dependency tree, which is definitely a good thing. For generated mocks, however, this leads to wasted time. In what's arguably a worst-case scenario in one of my org's test suites, jest spends over thirty seconds building this exports object due to a heavy dependency tree:
{
  default: jest.fn()
}
This is bad enough, but even worse, it can happen multiple times per run, since mocks are only cached in each test's jest-runtime instance instead of in the worker or the main jest process. Addressing this could provide a serious speed boost to large test suites that depend heavily on automocking (whether via automock: true in config or lots of jest.mock('module') calls spread out over tests).
A Solution
Jest could cache generated mocks across test runs by serializing the mock wherever possible and writing it to disk, just like jest currently caches transform results.
To be clear, user-provided mocks (either via jest.mock('my-module', () => 'implementation') or files added to __mocks__ folders) wouldn't/shouldn't be cached. Only automocks.
There are two immediately obvious problems here, serialization and cache invalidation, and another one that follows as a result of the first. Obviously, if this was easy, jest would already be doing it 😅
1. Serialization
If a generated mock consists of primitives and functions, serializing it is pretty straightforward since all functions are converted to mocks.
For a source file like this:
export const stringValue = 'string';
export function functionValue() { }
Jest would turn it into this by looking at the shape of the mock:
module.exports = {
  stringValue: 'string',
  functionValue: jest.fn(),
}
...then write it to disk.
To deserialize, jest executes the cached mock as a script and uses the return value.
Problems arise when considering instances of user-defined types (or complex built-ins). For these cases, jest could just bail and not try to cache the mock at all since it's an optimization.
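To make the serialize-or-bail idea concrete, here is a minimal sketch. The names serializeAutomock and loadSerializedAutomock are hypothetical and not part of Jest, and the loader uses new Function as a stand-in for however jest-runtime would actually execute the cached file. In this sketch, any function carrying static or prototype members also bails, since classes need the separate treatment discussed next.

```javascript
// Hypothetical helpers, not part of Jest.
const BUILT_IN_FUNCTION_PROPS = new Set(['length', 'name', 'prototype', 'arguments', 'caller']);

// True when a function carries static or prototype members (likely a class).
function hasExtraMembers(fn) {
  const statics = Object.getOwnPropertyNames(fn).filter((n) => !BUILT_IN_FUNCTION_PROPS.has(n));
  const protoNames = fn.prototype ? Object.getOwnPropertyNames(fn.prototype) : [];
  return statics.length > 0 || protoNames.some((n) => n !== 'constructor');
}

// Returns CommonJS source for the mock's shape, or null to signal "do not cache".
function serializeAutomock(exportsShape) {
  const lines = [];
  for (const [key, value] of Object.entries(exportsShape)) {
    let printed;
    if (typeof value === 'function' && !hasExtraMembers(value)) {
      // Automocking replaces every plain function with a mock function.
      printed = 'jest.fn()';
    } else if (
      value === null ||
      typeof value === 'string' ||
      typeof value === 'boolean' ||
      (typeof value === 'number' && Number.isFinite(value))
    ) {
      printed = JSON.stringify(value);
    } else {
      // Class instances, Maps, symbols, NaN, and so on: bail instead of guessing.
      return null;
    }
    lines.push(`  ${JSON.stringify(key)}: ${printed},`);
  }
  return `module.exports = {\n${lines.join('\n')}\n};\n`;
}

// Executes cached source with an injected `jest`-like object and returns its exports.
function loadSerializedAutomock(source, jestLike) {
  const module = { exports: {} };
  new Function('module', 'jest', source)(module, jestLike);
  return module.exports;
}
```

Returning null rather than throwing keeps the cache strictly optional: a caller that gets null simply generates the automock the way it does today.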
When serializing classes rather than instances of classes, we run into the next problem.
1.5 Serializing Classes
When jest mocks a class, it creates a constructor whose instances match the original structure, except that static and prototype function members are converted to mock functions. Since it's common for modules to export classes (especially in React applications), we shouldn't bail here. Some kind of custom serialization format would probably work. We'd look at the internal structure of the mock and write a function call that recreates it. The same rule as above applies: if it runs into some value it can't reliably turn back into source code, it bails.
const createMockWithShape = require('jest-mock/some-new-internal-thing');
module.exports = {
  stringValue: 'string',
  numberValue: 42,
  classValue: createMockWithShape({
    staticMethod: jest.fn(),
    staticString: '1234',
  }, { //for nested prototypes, either nest here or accept as rest args
    prototypeMethod: jest.fn(),
    prototypeStringValue: '5678'
  }),
}
Since this could get a little too interesting to build with strings, babel-types and babel-generator could probably be used to create an AST and print it instead.
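For illustration only, here is one way the hypothetical createMockWithShape helper could behave. Nothing like it exists in jest-mock today. fakeMockFn is a stand-in for jest.fn() so the sketch runs outside Jest, and the constructor here is a plain function rather than a mock function, which a real version would presumably need.

```javascript
// Stand-in for jest.fn(); records calls so the sketch runs without Jest.
function fakeMockFn() {
  const mock = (...args) => {
    mock.calls.push(args);
  };
  mock.calls = [];
  return mock;
}

// Hypothetical: rebuilds a mocked class from its static and prototype members.
function createMockWithShape(staticMembers, prototypeMembers = {}) {
  function MockClass() {}
  // Static members live on the constructor itself.
  Object.assign(MockClass, staticMembers);
  // Prototype members are shared by every instance.
  Object.assign(MockClass.prototype, prototypeMembers);
  return MockClass;
}

const classValue = createMockWithShape(
  { staticMethod: fakeMockFn(), staticString: '1234' },
  { prototypeMethod: fakeMockFn(), prototypeStringValue: '5678' },
);
```

Instances created with new classValue() then see prototypeMethod and prototypeStringValue through the prototype chain, as they would with the original class.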
2. Cache invalidation
The second big problem is transitive dependencies. When a module re-exports or otherwise depends on the result of others, any cached automock of that module should be invalidated whenever its dependencies change. I can't think of any way to solve this besides tracking dependencies with hashes of file content and encoding that into the serialized mock. For example:
module.exports = {
  dependencies: {
    'some-npm-module': '912ec803b2ce49e4a541068d495ab570', //hashes should take the hashes of each module’s dependencies into account, and so on down the tree
    '../path/to/other-module': '962012d09b8170d912f0669f6d7d9d07',
  },
  exports: {
    stringValue: 'string',
    numberValue: 42,
  },
}
In order to save on disk IO and CPU time, workers could probably share filename:hash pairs with the other workers as they compute them by coordinating with the main jest process. Ideally the transformer code could leverage the same hashes as well.
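A rough sketch of the hashing scheme, under several assumptions: the module graph is given as an in-memory object (path mapped to content and direct dependencies) instead of being read from disk, the graph is acyclic, and SHA-256 is an arbitrary choice of stable hash. hashModule and isCachedMockFresh are hypothetical names.

```javascript
const { createHash } = require('crypto');

// Hash of a module's content combined with the hashes of everything it depends on.
// Assumes `graph` looks like { [path]: { content: string, deps: string[] } }.
function hashModule(path, graph, cache = new Map(), visiting = new Set()) {
  if (cache.has(path)) return cache.get(path);
  if (visiting.has(path)) {
    // This sketch assumes an acyclic graph; circular imports need real handling.
    throw new Error(`circular dependency at ${path}`);
  }
  visiting.add(path);
  const { content, deps } = graph[path];
  const hash = createHash('sha256').update(content);
  // Sort so the result does not depend on the order dependencies were listed in.
  for (const dep of [...deps].sort()) {
    hash.update(dep).update(hashModule(dep, graph, cache, visiting));
  }
  visiting.delete(path);
  const digest = hash.digest('hex');
  cache.set(path, digest); // memoized so shared subtrees are hashed once
  return digest;
}

// A cached mock is reusable only if every recorded dependency hash still matches.
function isCachedMockFresh(cachedMock, graph) {
  const cache = new Map();
  return Object.entries(cachedMock.dependencies).every(
    ([dep, recorded]) => hashModule(dep, graph, cache) === recorded,
  );
}
```

The memoizing cache map plays the role of the filename:hash pairs that workers could share, so each file's subtree is only hashed once per run.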
No matter which way this is sliced the implementation is going to be pretty complex. Is Jest interested in a feature like this?