Skip to content

Conversation

@Dunqing
Copy link
Member

@Dunqing Dunqing commented Mar 10, 2025

Just replace allocator_ap2's Vec with Vec2, and make some changes to make it compile successfully.

Copy link
Member Author

Dunqing commented Mar 10, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions github-actions bot added A-isolated-declarations Isolated Declarations C-enhancement Category - New feature or request labels Mar 10, 2025
@codspeed-hq
Copy link

codspeed-hq bot commented Mar 10, 2025

CodSpeed Performance Report

Merging #9656 will improve performances by 5.27%

Comparing 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 (5cc614a) with main (faca7a8)

Summary

⚡ 3 improvements
✅ 30 untouched benchmarks

Benchmarks breakdown

Benchmark BASE HEAD Change
transformer[antd.js] 53.9 ms 51.8 ms +3.94%
transformer[checker.ts] 24.4 ms 23.2 ms +5.27%
transformer[pdf.mjs] 10.7 ms 10.3 ms +4.05%

@Dunqing Dunqing force-pushed the 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 branch from e7aa3de to c20bc2e Compare March 11, 2025 10:26
@Dunqing Dunqing force-pushed the 03-11-feat_allocator_add_vec2_retain_mut_method branch from 450c4d3 to b3bbb9e Compare March 11, 2025 10:26
@Dunqing Dunqing marked this pull request as ready for review March 11, 2025 10:48
@Dunqing Dunqing force-pushed the 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 branch from c20bc2e to 5351a48 Compare March 11, 2025 11:49
@overlookmotel overlookmotel force-pushed the 03-11-feat_allocator_add_vec2_retain_mut_method branch from b3bbb9e to 14c989a Compare March 12, 2025 08:58
@overlookmotel overlookmotel force-pushed the 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 branch 2 times, most recently from d7ed2b3 to c0e0d41 Compare March 13, 2025 01:42
@overlookmotel overlookmotel force-pushed the 03-11-feat_allocator_add_vec2_retain_mut_method branch from 14c989a to 8bdcb8c Compare March 13, 2025 01:42
@overlookmotel overlookmotel force-pushed the 03-11-feat_allocator_add_vec2_retain_mut_method branch 2 times, most recently from 247b928 to fcb5728 Compare March 13, 2025 03:03
@overlookmotel overlookmotel force-pushed the 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 branch from c0e0d41 to cf32c65 Compare March 13, 2025 03:03
graphite-app bot pushed a commit that referenced this pull request Mar 13, 2025
…9733)

`stmts` here is an `ArenaVec<&Statement>` (`&Statement` not `Statement`). This is a temporary collection, and doesn't end up in the AST, so we shouldn't store it in the arena.

Use a `std::vec::Vec` instead of `oxc_allocator::Vec`.

This should also remove some of the lifetime problems we have in #9656.

This change may cause a small perf hit, but in my view it's still a good change. If we want to get a speed boost, a better solution would be to have a temporary "scratch space" arena which we can allocate *all* temporary values into. This would likely give us a sizeable speed boost across many parts of Oxc (oxc-project/backlog#121).
@overlookmotel
Copy link
Member

overlookmotel commented Mar 13, 2025

Here's a minimal reproduction of the lifetimes problem:

use bumpalo::{Bump, collections::Vec};

struct Test<'b> {
    _dummy: &'b (),
}

impl<'b> Test<'b> {
    pub fn test_vec(input: &Vec<'b, u64>, bump: &'b Bump) -> u64 {
        let vec = Vec::from_iter_in(input.iter(), bump);
        //                          ^^^^^^^^^^^^
        // Error: explicit lifetime required in the type of `input`. lifetime `'b` required.

        Self::sum(&vec)
    }

    fn sum(vec: &Vec<'b, &u64>) -> u64 {
        vec.iter().copied().sum()
    }
}

This fixes it:

- fn sum(vec: &Vec<'b, &u64>) -> u64 {
+ fn sum(vec: &Vec<'_, &u64>) -> u64 {

But I don't think that should be necessary. I think it should compile fine as is.

Some discussion (but no complete conclusion) here: fitzgen/bumpalo#171

@Dunqing Dunqing force-pushed the 03-11-feat_allocator_add_vec2_retain_mut_method branch from fcb5728 to ec2f8da Compare March 13, 2025 03:51
@Dunqing
Copy link
Member Author

Dunqing commented Mar 13, 2025

After #9733, the changes to oxc_isolated_declarations should no longer be required (I hope!)

Yes! You are right!

@Dunqing
Copy link
Member Author

Dunqing commented Mar 13, 2025

Here's a minimal reproduction of the lifetimes problem:

use bumpalo::{Bump, collections::Vec};

struct Test<'b> {
    _dummy: &'b (),
}

impl<'b> Test<'b> {
    pub fn test_vec(input: &Vec<'b, u64>, bump: &'b Bump) -> u64 {
        let vec = Vec::from_iter_in(input.iter(), bump);
        //                          ^^^^^^^^^^^^
        // Error: explicit lifetime required in the type of `input`. lifetime `'b` required.
        Self::sum(&vec)
    }

    fn sum(vec: &Vec<'b, &u64>) -> u64 {
        vec.iter().copied().sum()
    }
}

This fixes it:

- fn sum(vec: &Vec<'b, &u64>) -> u64 {
+ fn sum(vec: &Vec<'_, &u64>) -> u64 {

But I don't think that should be necessary. I think it should compile fine as is.

Some discussion (but no complete conclusion) here: fitzgen/bumpalo#171

Does that mean we should align the implementation with std does? Accepting allocator rather than a bump lifetime?

Copy link
Member

@overlookmotel overlookmotel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remain unsure whether the T: 'bump lifetime bound is required on vec2::Vec (and on its various impls).

But now the only place this impacts our code is requiring a C: 'new_alloc bound on CloneIn for Vec, which is unproblematic.

So I think we can merge this now, and resolve the lifetime question later on.

@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label Mar 14, 2025
Copy link
Member

overlookmotel commented Mar 14, 2025

Merge activity

  • Mar 14, 1:47 AM EDT: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Mar 14, 2:12 AM EDT: A user added this pull request to the Graphite merge queue.
  • Mar 14, 2:21 AM EDT: A user merged this pull request with the Graphite merge queue.

@overlookmotel
Copy link
Member

overlookmotel commented Mar 14, 2025

Note that as well as the positive perf impact on transformer, this PR also has a small negative effect on parser benchmarks.

bench

I imagine this may be for same reason as saw that same performance effect in #9301.

We can take a look at parser code and see if we can improve that by reserving initial capacity in Vecs where we know they're not going to remain empty.

@Dunqing
Copy link
Member Author

Dunqing commented Mar 14, 2025

We can take a look at parser code and see if we can improve that by reserving initial capacity in Vecs where we know they're not going to remain empty.

Let me take a look at this

Just replace allocator_ap2's Vec with Vec2, and make some changes to make it compile successfully.
@graphite-app graphite-app bot force-pushed the 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 branch from 328e5d0 to 5cc614a Compare March 14, 2025 06:13
@graphite-app graphite-app bot merged commit 5cc614a into main Mar 14, 2025
26 checks passed
@graphite-app graphite-app bot deleted the 03-11-feat_allocator_replace_allocator_ap2_s_vec_with_vec2 branch March 14, 2025 06:21
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label Mar 14, 2025
@overlookmotel
Copy link
Member

Let me take a look at this

We may want to look at implementing the optimization in #9301 first. (or that may not even be an optimization any more!)

graphite-app bot pushed a commit that referenced this pull request Mar 14, 2025
…` for `Vec` (#9771)

Follow-on after #9656. We think probably this change to lifetime bound of `CloneIn` for `Vec` is unnecessary, but that question is unresolved. Make a comment about it.
graphite-app bot pushed a commit that referenced this pull request Mar 14, 2025
…9772)

Follow-on after #9656. Add a defence against potential double-free bug in `String::from_utf8_unchecked`.

As noted in the comment, this is probably unnecessary, but it doesn't hurt to add this defensive code while new `Vec` implementation is under development. We can remove it again later when we're satisfied we have covered all bases.
graphite-app bot pushed a commit that referenced this pull request Mar 17, 2025
resolve: #9656 (comment)

#9656 brought a small performance improvement for the transformer but also led the parser to 1% performance hits. This PR returns performance by splitting `reserve_internal` to `reserve_exact_internal` and `reserve_amortized_internal`  respectively the internal implementation of `reserve_exact` and `reserve`.

Why the change can improve performance?

The original `reserve_internal` implementation has a check for reserve strategy, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L664-L668 which is can be avoided because the caller of `reserve_internal` already knows the reserve strategy. After the change, the `reserve_exact` and `reserve` can call the corresponding internal implementation directly, which can avoid unnecessary checks.

Likewise, the `Fallibility` check can also be avoided, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L681-L683

because we know where the errors should be handled.

~~Due to this change, I also replaced Bumpalo's `CollecitonAllocErr` with allocator-api2's `TryReserveError` because `CollecitonAllocErr::AllocErr` cannot pass in a `Layout`.~~ I ended up reverting 937c61a as it caused transformer performance 1%-2% regression (See [codspeed](https://codspeed.io/oxc-project/oxc/branches/03-15-pref_allocator_vec2_optimize_reserving_memory) and switch to "replace CollectionAllocErr with TryReserveError" commit), and replaced by 84edacd I've tried various way to save the performance but it not work. I suspect the cause is that `TryReserveError` is 16 bytes whereas `CollecitonAllocErr` is only 1 byte.

So, after both checks are removed, the performance returns to the original. The whole change is according to standard `RawVec`'s implementation. See https://doc.rust-lang.org/src/alloc/raw_vec.rs.html

<img width="608" alt="image" src="https://github.com/user-attachments/assets/53066d8e-26f0-4eb1-8f33-4ca9e517e75b" />
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-isolated-declarations Isolated Declarations C-enhancement Category - New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants