perf(allocator/vec2): optimize reserving memory #9792

Dunqing · 2025-03-15T06:07:55Z

#9656 brought a small performance improvement for the transformer but also led the parser to 1% performance hits. This PR returns performance by splitting reserve_internal to reserve_exact_internal and reserve_amortized_internal respectively the internal implementation of reserve_exact and reserve.

Why the change can improve performance?

The original reserve_internal implementation has a check for reserve strategy,

oxc/crates/oxc_allocator/src/vec2/raw_vec.rs

Lines 664 to 668 in fef680a

    
           // Nothing we can really do about these checks :( 
        
           let new_cap = match strategy { 
        
               Exact => used_cap.checked_add(needed_extra_cap).ok_or(CapacityOverflow)?, 
        
               Amortized => self.amortized_new_size(used_cap, needed_extra_cap)?, 
        
           };

which is can be avoided because the caller of reserve_internal already knows the reserve strategy. After the change, the reserve_exact and reserve can call the corresponding internal implementation directly, which can avoid unnecessary checks.

Likewise, the Fallibility check can also be avoided,

oxc/crates/oxc_allocator/src/vec2/raw_vec.rs

Lines 681 to 683 in fef680a

    
           if let (Err(AllocError), Infallible) = (&res, fallibility) { 
        
               handle_alloc_error(new_layout); 
        
           }

because we know where the errors should be handled.

~~Due to this change, I also replaced Bumpalo's CollecitonAllocErr with allocator-api2's TryReserveError because CollecitonAllocErr::AllocErr cannot pass in a Layout.~~ I ended up reverting 937c61a as it caused transformer performance 1%-2% regression (See codspeed and switch to "replace CollectionAllocErr with TryReserveError" commit), and replaced by 84edacd I've tried various way to save the performance but it not work. I suspect the cause is that TryReserveError is 16 bytes whereas CollecitonAllocErr is only 1 byte.

So, after both checks are removed, the performance returns to the original. The whole change is according to standard RawVec's implementation. See https://doc.rust-lang.org/src/alloc/raw_vec.rs.html

Dunqing · 2025-03-15T06:08:09Z

perf(allocator/vec2): optimize reserving memory #9792 👈 (View in Graphite)
main

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

0-merge - adds this PR to the back of the merge queue
hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

_{An organization admin has enabled the Graphite Merge Queue in this repository.} _{Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.}

This stack of pull requests is managed by Graphite. Learn more about stacking.

codspeed-hq · 2025-03-15T06:14:57Z

CodSpeed Performance Report

Merging #9792 will not alter performance

_{Comparing 03-15-pref_allocator_vec2_optimize_reserving_memory (17a9320) with main (e924c2b)}

Summary

✅ 33 untouched benchmarks

overlookmotel

Excellent!

overlookmotel · 2025-03-17T12:25:59Z

Merge activity

Mar 17, 8:25 AM EDT: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
Mar 17, 8:26 AM EDT: A user added this pull request to the Graphite merge queue.
Mar 17, 8:32 AM EDT: A user merged this pull request with the Graphite merge queue.

resolve: #9656 (comment) #9656 brought a small performance improvement for the transformer but also led the parser to 1% performance hits. This PR returns performance by splitting `reserve_internal` to `reserve_exact_internal` and `reserve_amortized_internal` respectively the internal implementation of `reserve_exact` and `reserve`. Why the change can improve performance? The original `reserve_internal` implementation has a check for reserve strategy, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L664-L668 which is can be avoided because the caller of `reserve_internal` already knows the reserve strategy. After the change, the `reserve_exact` and `reserve` can call the corresponding internal implementation directly, which can avoid unnecessary checks. Likewise, the `Fallibility` check can also be avoided, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L681-L683 because we know where the errors should be handled. ~~Due to this change, I also replaced Bumpalo's `CollecitonAllocErr` with allocator-api2's `TryReserveError` because `CollecitonAllocErr::AllocErr` cannot pass in a `Layout`.~~ I ended up reverting 937c61a as it caused transformer performance 1%-2% regression (See [codspeed](https://codspeed.io/oxc-project/oxc/branches/03-15-pref_allocator_vec2_optimize_reserving_memory) and switch to "replace CollectionAllocErr with TryReserveError" commit), and replaced by 84edacd I've tried various way to save the performance but it not work. I suspect the cause is that `TryReserveError` is 16 bytes whereas `CollecitonAllocErr` is only 1 byte. So, after both checks are removed, the performance returns to the original. The whole change is according to standard `RawVec`'s implementation. See https://doc.rust-lang.org/src/alloc/raw_vec.rs.html <img width="608" alt="image" src="https://github.com/user-attachments/assets/53066d8e-26f0-4eb1-8f33-4ca9e517e75b" />

Dunqing changed the title ~~pref(allocator/vec2): optimize reserving memory~~ perf(allocator/vec2): optimize reserving memory Mar 15, 2025

github-actions bot added the C-performance Category - Solution not expected to change functional behavior, only performance label Mar 15, 2025

Dunqing force-pushed the 03-15-pref_allocator_vec2_optimize_reserving_memory branch 5 times, most recently from e04b012 to e55b350 Compare March 16, 2025 10:01

Dunqing marked this pull request as ready for review March 16, 2025 10:01

Dunqing requested a review from overlookmotel March 16, 2025 10:01

Dunqing force-pushed the 03-15-pref_allocator_vec2_optimize_reserving_memory branch 9 times, most recently from 6634ef5 to 02413cd Compare March 16, 2025 14:47

Dunqing marked this pull request as draft March 16, 2025 14:47

Dunqing force-pushed the 03-15-pref_allocator_vec2_optimize_reserving_memory branch 9 times, most recently from 20a2e94 to 01c3780 Compare March 17, 2025 01:51

Dunqing marked this pull request as ready for review March 17, 2025 02:16

overlookmotel approved these changes Mar 17, 2025

View reviewed changes

overlookmotel added the 0-merge Merge with Graphite Merge Queue label Mar 17, 2025

graphite-app bot force-pushed the 03-15-pref_allocator_vec2_optimize_reserving_memory branch from 84edacd to 17a9320 Compare March 17, 2025 12:26

graphite-app bot merged commit 17a9320 into main Mar 17, 2025
26 checks passed

graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label Mar 17, 2025

graphite-app bot deleted the 03-15-pref_allocator_vec2_optimize_reserving_memory branch March 17, 2025 12:32

oxc-bot mentioned this pull request Mar 18, 2025

release(crates): v0.59.0 #9850

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

perf(allocator/vec2): optimize reserving memory #9792

perf(allocator/vec2): optimize reserving memory #9792

Uh oh!

Dunqing commented Mar 15, 2025 •

edited

Loading

Uh oh!

Dunqing commented Mar 15, 2025

Uh oh!

codspeed-hq bot commented Mar 15, 2025 •

edited

Loading

Uh oh!

overlookmotel left a comment

Uh oh!

overlookmotel commented Mar 17, 2025 •

edited by graphite-app bot

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	// Nothing we can really do about these checks :(
	let new_cap = match strategy {
	Exact => used_cap.checked_add(needed_extra_cap).ok_or(CapacityOverflow)?,
	Amortized => self.amortized_new_size(used_cap, needed_extra_cap)?,
	};

	if let (Err(AllocError), Infallible) = (&res, fallibility) {
	handle_alloc_error(new_layout);
	}

Uh oh!

perf(allocator/vec2): optimize reserving memory #9792

perf(allocator/vec2): optimize reserving memory #9792

Uh oh!

Conversation

Dunqing commented Mar 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Dunqing commented Mar 15, 2025

How to use the Graphite Merge Queue

Uh oh!

codspeed-hq bot commented Mar 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #9792 will not alter performance

Summary

Uh oh!

overlookmotel left a comment

Choose a reason for hiding this comment

Uh oh!

overlookmotel commented Mar 17, 2025 • edited by graphite-app bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merge activity

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Dunqing commented Mar 15, 2025 •

edited

Loading

codspeed-hq bot commented Mar 15, 2025 •

edited

Loading

overlookmotel commented Mar 17, 2025 •

edited by graphite-app bot

Loading