Skip to content

Conversation

@Dunqing
Copy link
Member

@Dunqing Dunqing commented May 8, 2025

Related:

Finished first step(32 -> 24) of #9706.

Change len and cap fields from usize to u32, so that the Vec size can reduced size to 24. Also, to avoid some unnecessary unit conversion. I changed many RawVec methods to take u32 rather than usize.

The performance looks great now! And we reduced 1%-3% memory usage in the parser.

Copy link
Member Author

Dunqing commented May 8, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions github-actions bot added the C-enhancement Category - New feature or request label May 8, 2025
@Dunqing Dunqing changed the base branch from 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec to graphite-base/10884 May 8, 2025 08:25
@Dunqing Dunqing force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch from 2cdf9ec to faf29b6 Compare May 8, 2025 08:37
@Dunqing Dunqing force-pushed the graphite-base/10884 branch from 882a193 to b762280 Compare May 8, 2025 08:37
@Dunqing Dunqing changed the base branch from graphite-base/10884 to 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec May 8, 2025 08:37
@codspeed-hq
Copy link

codspeed-hq bot commented May 8, 2025

CodSpeed Instrumentation Performance Report

Merging #10884 will not alter performance

Comparing 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 (c60382d) with main (7d54577)

Summary

✅ 36 untouched benchmarks

@Boshen
Copy link
Member

Boshen commented May 8, 2025

@Dunqing
Copy link
Member Author

Dunqing commented May 8, 2025

Can you report memory change https://github.com/oxc-project/bench-javascript-parser-written-in-rust/blob/main/memory.sh

Not too much, 1.5% in cal.com.tsx and 3% in typescript.js.

Before:

./files/cal.com.tsx
oxc 11.8 mb
swc 15.0 mb
biome 0.9 MB

./files/typescript.js
oxc 59.1 mb
swc 84.0 mb
biome 0.9 MB

After:

./files/cal.com.tsx
oxc 11.6 mb
swc 15.0 mb
biome 0.9 mb

./files/typescript.js
oxc 57.3 mb
swc 84.0 mb
biome 0.9 MB

@overlookmotel
Copy link
Member

overlookmotel commented May 8, 2025

So 2% memory reduction in cal.com.tsx, 3% in typescript.js.

In terms of speed, I think it's hard to judge the implications at present.

  1. Current implementation is probably not as optimal as it could be.
  2. If we go forwards with this, next step is to reduce the size further to 16 (Reduce size of Vec to 24 #9706). Presumably we'd get another perf gain from that.
  3. If we go forwards with this, it lays the groundwork for other optimizations. The gain will likely grow further.

I've made a suggestion on #10883 for a faster way to get a more optimal implementation without tons of code changes. Maybe we should consider trying that and then assess perf again?

@Dunqing Dunqing changed the base branch from 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec to graphite-base/10884 May 12, 2025 05:35
@Dunqing Dunqing force-pushed the graphite-base/10884 branch from b762280 to b0b67b2 Compare May 12, 2025 07:21
@Dunqing Dunqing force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch from faf29b6 to 4192132 Compare May 12, 2025 07:21
@Dunqing Dunqing changed the base branch from graphite-base/10884 to 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec May 12, 2025 07:21
@Dunqing Dunqing force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch 2 times, most recently from 06f6eed to 5d77cbb Compare May 12, 2025 08:47
@Dunqing
Copy link
Member Author

Dunqing commented May 12, 2025

So 2% memory reduction in cal.com.tsx, 3% in typescript.js.

In terms of speed, I think it's hard to judge the implications at present.

  1. Current implementation is probably not as optimal as it could be.
  2. If we go forwards with this, next step is to reduce the size further to 16 (Reduce size of Vec #9706). Presumably we'd get another perf gain from that.
  3. If we go forwards with this, it lays the groundwork for other optimizations. The gain will likely grow further.

I've made a suggestion on #10883 for a faster way to get a more optimal implementation without tons of code changes. Maybe we should consider trying that and then assess perf again?

Thank you for reviewing this! I've taken your suggestion to re-implement this. Unfortunately, the current implementation has a 2% ~ 3% performance regression in the transformer. Probably caused by a subtle change somewhere, I will investigate a little bit on this. Anyway, your way is good! It is simpler than I thought.

@Dunqing Dunqing force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch 4 times, most recently from e26ab46 to b24ad62 Compare May 12, 2025 15:57
@Dunqing
Copy link
Member Author

Dunqing commented May 12, 2025

I may have found a little clue about the performance regression. When only changing cap to u32 hurts, exceeding the 4% performance regression in the transformer.

@overlookmotel
Copy link
Member

overlookmotel commented May 12, 2025

A couple of thoughts:

  1. I think some of the u32-to-usize / usize-to-u32 conversions are the wrong way around or unnecessary.

e.g. in push you have this:

if self.buf.len == usize_to_u32(self.buf.cap()) {

The cap field is already u32, so converting it from u32 -> usize -> u32 is unnecessary. This may make no difference, but I think it's worth trying. You could add a RawVec::cap_u32 method and use that to avoid conversions.

push, extend and Index access are probably the most heavily used APIs, so they're probably the ones to investigate first.

  1. Try reversing order of len and cap fields.

Again, this may make no difference, but I think worth a giving a go. len is the more often accessed field, so it may be preferable to have it aligned on 8. Getting a 32-bit uint from lower bits of a 64-bit uint is very cheap. Getting the higher bits costs an extra shift operation. Given that these methods are on such a hot path, even a tiny change like that might make a difference.


By the way, don't worry about the raw transfer test fails. It's just because the layout of Vec is changing and the codegen doesn't know that. That'll be easy to fix once we have something we're happy with.

@Dunqing Dunqing force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch 2 times, most recently from 5e1cb8a to 2b94bc7 Compare May 13, 2025 14:28
Copy link
Member

@overlookmotel overlookmotel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge merge merge merge!

I removed the benchmarks image from PR description, because it's outdated.

@graphite-app graphite-app bot changed the base branch from 05-08-refactor_allocator_vec2_move_rawvec_fields_to_vec to graphite-base/10884 May 15, 2025 21:49
@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label May 15, 2025
Copy link
Member

overlookmotel commented May 15, 2025

Merge activity

  • May 15, 5:50 PM EDT: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 15, 5:58 PM EDT: Graphite disabled "merge when ready" on this PR due to: a merge conflict with the target branch; resolve the conflict and try again..
  • May 15, 5:58 PM EDT: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 15, 5:59 PM EDT: Graphite disabled "merge when ready" on this PR due to: a merge conflict with the target branch; resolve the conflict and try again..
  • May 15, 5:59 PM EDT: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 15, 6:16 PM EDT: overlookmotel added this pull request to the Graphite merge queue.
  • May 15, 6:20 PM EDT: Merged by the Graphite merge queue.

@overlookmotel overlookmotel force-pushed the graphite-base/10884 branch from 91cad19 to 7d54577 Compare May 15, 2025 22:06
@overlookmotel overlookmotel force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch from 7c5d8af to 3621d93 Compare May 15, 2025 22:06
@overlookmotel overlookmotel changed the base branch from graphite-base/10884 to main May 15, 2025 22:06
…u32` (#10884)

Related:
* #6488
* #9706

Finished first step(32 -> 24) of #9706.

Change `len` and `cap` fields from `usize` to `u32`, so that the `Vec` size can reduced size to `24`. Also, to avoid some unnecessary unit conversion. I changed many `RawVec` methods to take `u32` rather than `usize`.

The performance looks great now! And we reduced 1%-3% memory usage in the parser.
@graphite-app graphite-app bot force-pushed the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch from 3621d93 to c60382d Compare May 15, 2025 22:14
@graphite-app graphite-app bot merged commit c60382d into main May 15, 2025
26 checks passed
@graphite-app graphite-app bot deleted the 05-08-feat_allocator_vec2_change_len_from_usize_to_u32 branch May 15, 2025 22:20
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label May 15, 2025
@overlookmotel
Copy link
Member

@Dunqing Just to make sure you saw my comments above. I had to mark them as "resolved" just to be able to merge the PR. But they're not resolved...

@Dunqing
Copy link
Member Author

Dunqing commented May 15, 2025

@Dunqing Just to make sure you saw my comments above. I had to mark them as "resolved" just to be able to merge the PR. But they're not resolved...

Thanks! I saw! I am going to solve the #10884 (comment) problem.

graphite-app bot pushed a commit that referenced this pull request May 17, 2025
Follow-on after #10884. Clarify comment as to why `as u32` is valid here.
graphite-app bot pushed a commit that referenced this pull request May 17, 2025
…ents` (#11072)

Follow-on after #10884.

The original safety conditions aren't required any more, because of the change made to `Vec::reserve` in #10884 - `reserve` will panic if `other.len` or `self.len + other.len` exceeds `u32::MAX`. So nothing can go wrong if these conditions aren't satisfied.

Which is lucky, because `append` doesn't do anything to make sure `self.len + other.len <= u32::MAX`!

`append_elements` originally had no safety docs at all (naughty bumpalo!). So add some.
graphite-app bot pushed a commit that referenced this pull request May 17, 2025
…s_in` + clarify safety docs (#11073)

Follow-on after #10884.

We don't need `alloc_guard` here. `RawVec::from_raw_parts_in` is an unsafe method, and caller guarantees that `len` and `capacity` are in legal range. We don't need to check the caller fulfilled that guarantee - that responsibility is on them.

Also clarify the safety docs for this method.
graphite-app bot pushed a commit that referenced this pull request May 17, 2025
…le_truncation)]` (#11074)

Follow-on after #10884.

Pure refactor. Move the `#[expect(clippy::cast_possible_truncation)]` attributes so they only apply to a single statement, rather than the whole function. That way we can see clearly where there's a dangerous conversion happening.

Annoyingly Rust doesn't allow putting attributes on expressions, so this requires statements that can then be attributed-up. But in my view, it's worth it.
@Boshen Boshen mentioned this pull request May 18, 2025
graphite-app bot pushed a commit that referenced this pull request May 20, 2025
…tters (#11081)

Follow-on after #10884.

Pure refactor.

To make it explicit when we're converting `u32` to/from `usize`:

* Make `RawVec`'s `len` field private.
* Make all access to `len` and `cap` fields go via getter/setter methods.
* Name those methods `*_u32` and `*_usize` so it's clear when a conversion is happening.

I also renamed `cap_*` methods to `capacity_*` for consistency with `Vec::capacity`.

I haven't used these methods in `RawVec` because there's some subtlety around when you want the "raw" `cap` value, and when you want the value `capacity` method returns, which is different when `T` is a ZST. Hopefully we can remove support for ZSTs, and remove this ambiguity.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-ast Area - AST A-ast-tools Area - AST tools C-enhancement Category - New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants