[POC] Improve pnp loader speed and memory #6671
I like the idea (I experimented myself with using raw JS instead of wasm in #5738, although it was for a different section of the code), however I'd have issues with duplicating the […] Also, did you check actual performance? I think https://github.com/andrewrk/poop allows you to check both memory usage and performance in the same pass.
|
I will check performance with this tool tomorrow. As for avoiding zipfs duplication: it's possible to abstract libzip and the native JS zip reader behind a common interface, with small or no changes to the original zipfs interface.
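To make the idea of a shared interface concrete, here is a minimal sketch. All names in it (ZipReader, InMemoryZipReader, totalSize) are hypothetical and are not taken from zipfs or from this PR; the in-memory class only stands in for a real backend such as libzip or a native JS reader.

```typescript
// Hypothetical interface: callers depend on this instead of a concrete backend.
interface ZipReader {
  /** Names of all entries stored in the archive. */
  listEntries(): string[];
  /** Raw bytes of one entry; throws if the entry does not exist. */
  readEntry(name: string): Uint8Array;
}

// Stand-in backend used for illustration only. A libzip-based reader and a
// native JS reader would each implement ZipReader in the same way.
class InMemoryZipReader implements ZipReader {
  private readonly entries: Record<string, Uint8Array>;

  constructor(entries: Record<string, Uint8Array>) {
    this.entries = entries;
  }

  listEntries(): string[] {
    return Object.keys(this.entries);
  }

  readEntry(name: string): Uint8Array {
    if (!Object.prototype.hasOwnProperty.call(this.entries, name)) {
      throw new Error(`No such entry: ${name}`);
    }
    return this.entries[name];
  }
}

// Consumer code only sees the interface, so the backend can be swapped freely.
function totalSize(reader: ZipReader): number {
  let total = 0;
  for (const name of reader.listEntries()) {
    total += reader.readEntry(name).byteLength;
  }
  return total;
}
```

With this shape, code that currently talks to libzip directly would only need to accept a ZipReader, which is one way the existing zipfs interface could stay mostly unchanged.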
|
I currently suspect that the file lookup logic is what makes it slower than fs.
|
@arcanis how can we move forward from this point?
|
|
That sounds promising. I think the best approach for now is to first land your implementation in its current state (i.e. without more optimizations), as it's already a significant endeavour that we'll want to test in prod for some time. To that end, can you look at unifying the interface and adding a setting to select which version to use? I think for this first iteration we can use a simple environment variable checked from within the loader; something like […]
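A rough sketch of what such an environment-variable switch might look like. The variable name EXAMPLE_ZIP_IMPL and the values "libzip" and "js" are placeholders invented for this illustration; they are not the name proposed in the comment, and the function takes the environment as a parameter so the sketch stays self-contained.

```typescript
// Hypothetical identifiers for the two backends.
type ZipImpl = "libzip" | "js";

// EXAMPLE_ZIP_IMPL is a placeholder name, not an actual Yarn setting.
// Anything other than an explicit "js" falls back to the existing backend,
// so the new reader stays opt-in while it is being tested.
function selectZipImpl(env: Record<string, string | undefined>): ZipImpl {
  return env["EXAMPLE_ZIP_IMPL"] === "js" ? "js" : "libzip";
}
```

Inside a loader this would presumably be called once with process.env, and the result used to decide which reader to construct.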
|
Closed in favour of #6688.
## What's the problem this PR addresses?

Rework of #6671.

TODO:
- [ ] check security?
- [x] tests
- [ ] elaborate mixed compression logic
- [ ] benchmarks?
- [ ] hide exports and interfaces for external consumers?

...

## How did you fix it?

...

## Checklist

- [x] I have read the [Contributing Guide](https://yarnpkg.com/advanced/contributing).
- [ ] I have set the packages that need to be released for my changes to be effective.
- [ ] I will check that all automated PR checks pass before the PR gets reviewed.

Co-authored-by: Maël Nison


What's the problem this PR addresses?
First of all, thanks for moving the industry forward with pnp.
I tried to migrate a big project to pnp and found that it is slower (even with compression=0) and uses much more memory.
So much memory that the OOM killer terminates jest workers in our CI.
Which is a bummer, because pnp has the potential to be faster than native fs. Probably.
To address this, I've implemented a read-only zip reader in native JS.
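For readers unfamiliar with the format, the first step of any read-only zip reader is locating the End of Central Directory (EOCD) record at the tail of the file. The sketch below illustrates that step only. It is not the code from this PR, the function and type names are made up for the example, and it assumes a plain (non-ZIP64, single-disk) archive.

```typescript
// Fields of the EOCD record that a reader needs to find the entry table.
interface EndOfCentralDirectory {
  entryCount: number;
  centralDirectorySize: number;
  centralDirectoryOffset: number;
}

const EOCD_SIGNATURE = 0x06054b50; // bytes "PK\x05\x06", read as little-endian
const EOCD_MIN_SIZE = 22; // fixed part of the record, without the comment
const MAX_COMMENT_SIZE = 0xffff; // comment length is a 16-bit field

// Scans backwards from the end of the buffer, because the record is followed
// only by an optional archive comment of at most 65535 bytes.
function findEndOfCentralDirectory(bytes: Uint8Array): EndOfCentralDirectory {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const lowest = Math.max(0, bytes.byteLength - EOCD_MIN_SIZE - MAX_COMMENT_SIZE);

  for (let pos = bytes.byteLength - EOCD_MIN_SIZE; pos >= lowest; pos--) {
    if (view.getUint32(pos, true) === EOCD_SIGNATURE) {
      return {
        entryCount: view.getUint16(pos + 10, true),
        centralDirectorySize: view.getUint32(pos + 12, true),
        centralDirectoryOffset: view.getUint32(pos + 16, true),
      };
    }
  }
  throw new Error("End of central directory record not found");
}
```

From the returned offset and size, a reader can then walk the central directory entries without touching libzip; a production version would also have to handle ZIP64 and validate the offsets against the file size.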
Speed
It's pretty fast. There is a bench/run.ts script which I run against my whole yarn cache; the results for me are:
Now, it's possible to make it even faster: if I benchmark zip parsing only (readZipSync), it takes 1500ms on a warm fs cache.
This means that the registerListing and registerEntry implementation is not optimal.
Memory
As I understand it, there is an issue with memory management in libzip compiled to asm.js.
More speed improvements
Even with the new zip implementation, pnp is still slower than fs, although pnp has the potential to be faster.
pnp:
Native fs
1357.4191660000001...
Caveats
The security of this solution should be checked.
Possible attack vectors: path traversal and file mode (setuid, setgid, exec flag, permissions).
I should check what libzip does to mitigate these, but maybe it's safe because the reader is read-only and no extraction is done. I'm not sure.
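As an illustration of the two vectors named above, here is a minimal sketch of the kind of checks a reader could apply to each entry. The function names are hypothetical, this is not a description of what libzip does, and the rules (for example, rejecting backslashes outright) are deliberately conservative assumptions.

```typescript
// Returns false for entry names that could escape the archive root.
function isSafeEntryName(name: string): boolean {
  if (name.length === 0 || name.indexOf("\0") !== -1) return false;
  // Absolute paths, Windows separators and drive letters are rejected outright.
  if (name.charAt(0) === "/" || name.indexOf("\\") !== -1) return false;
  if (/^[A-Za-z]:/.test(name)) return false;
  // Any ".." path segment is treated as a traversal attempt.
  return name.split("/").indexOf("..") === -1;
}

// Keeps only the rwx permission bits, dropping the file type bits as well as
// setuid (0o4000), setgid (0o2000) and sticky (0o1000).
function sanitizeMode(mode: number): number {
  return mode & 0o777;
}
```

Whether the exec bit should also be cleared depends on how the mode is used later, so that decision is left out of the sketch.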
If you are interested, I'll finish this PR and investigate further performance improvements.