
Conversation

@HaoranYi commented Oct 16, 2025

Problem

The staked_nodes() method in VoteAccounts was using itertools' into_grouping_map().aggregate() pattern, which creates intermediate allocations and adds unnecessary overhead for a hot path in validator operations.
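For reference, a hedged sketch of the kind of itertools pipeline being replaced; the map shape, the `(stake, node_pubkey)` tuple layout, and the function name are assumptions for illustration, not the actual VoteAccounts code:

```rust
use std::collections::HashMap;

use itertools::Itertools;
use solana_pubkey::Pubkey;

// Sketch of the old pattern: emit (node_pubkey, stake) pairs, group
// them, then aggregate each group. into_grouping_map() builds
// intermediate grouping state that a direct entry-API loop avoids.
fn staked_nodes_old_sketch(
    vote_accounts: &HashMap<Pubkey, (u64, Pubkey)>, // vote pubkey -> (stake, node pubkey)
) -> HashMap<Pubkey, u64> {
    vote_accounts
        .values()
        .filter(|(stake, _)| *stake != 0)
        .map(|(stake, node_pubkey)| (*node_pubkey, *stake))
        .into_grouping_map()
        .aggregate(|acc, _node_pubkey, stake| Some(acc.unwrap_or(0) + stake))
}
```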

Summary of Changes

Replace the itertools grouping_map implementation with direct HashMap construction for better performance:

  • Remove unused itertools::Itertools import
  • Optimize stake aggregation using entry().and_modify().or_insert() (see the sketch after this list)
  • Pre-allocate HashMap with estimated capacity to reduce reallocations
  • Add benchmark to measure and track performance
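A minimal sketch of the new pattern, assuming a simplified shape for the vote-accounts map (the real VoteAccounts type wraps more state); the function name and tuple layout are illustrative:

```rust
use std::collections::HashMap;

use solana_pubkey::Pubkey;

// Sketch: aggregate stake per node pubkey in a single pass, skipping
// the intermediate allocations of into_grouping_map().aggregate().
fn staked_nodes_sketch(
    vote_accounts: &HashMap<Pubkey, (u64, Pubkey)>, // vote pubkey -> (stake, node pubkey)
) -> HashMap<Pubkey, u64> {
    // Pre-allocate with an estimated capacity so the map rarely rehashes.
    let mut staked_nodes =
        HashMap::with_capacity(vote_accounts.len().saturating_div(2));
    for (stake, node_pubkey) in vote_accounts.values() {
        if *stake != 0 {
            // entry() finds or creates the slot in one lookup;
            // and_modify/or_insert then aggregate in place.
            staked_nodes
                .entry(*node_pubkey)
                .and_modify(|total| *total += *stake)
                .or_insert(*stake);
        }
    }
    staked_nodes
}
```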

Performance Impact

Benchmark with realistic scenario (400 validator nodes):

  • Before: 27,206 ns/iter
  • After: 12,586 ns/iter
  • Improvement: 2.16x speedup (53.7% reduction in time per iteration)

Running on mainnet shows a 3-4x speedup:

Log:
[2025-10-16T16:01:25.813898562Z INFO  solana_metrics::metrics] datapoint: staked_nodes_timing num_vote_accounts=6755i old_impl_us=1057i new_impl_us=313i speedup_ratio=3.376996805111821
[2025-10-16T16:01:25.813904720Z INFO  solana_metrics::metrics] datapoint: staked_nodes_timing num_vote_accounts=6750i old_impl_us=864i new_impl_us=245i speedup_ratio=3.526530612244898
[2025-10-16T16:01:29.378669009Z INFO  solana_metrics::metrics] datapoint: staked_nodes_timing num_vote_accounts=6750i old_impl_us=662i new_impl_us=152i speedup_ratio=4.355263157894737

@HaoranYi force-pushed the optimize-staked-nodes-hashmap branch from f8e5f71 to 89499d9 on October 16, 2025 15:32
@HaoranYi changed the title from "optimize staked nodes hashmap" to "Optimize staked_nodes() for 2.16x performance improvement" on Oct 16, 2025
@HaoranYi changed the title from "Optimize staked_nodes() for 2.16x performance improvement" to "Optimize staked_nodes() for 3-4x performance improvement" on Oct 16, 2025
@codecov-commenter commented Oct 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.2%. Comparing base (56d328c) to head (8c566a0).
⚠️ Report is 51 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #8516   +/-   ##
=======================================
  Coverage    83.1%    83.2%           
=======================================
  Files         846      846           
  Lines      368573   368576    +3     
=======================================
+ Hits       306652   306709   +57     
+ Misses      61921    61867   -54     

@vadorovsky (Member) left a comment

Great stuff! I think there is a chance to use the pubkey hasher.

// Pre-allocate HashMap with estimated capacity to reduce reallocations
let mut staked_nodes =
    HashMap::with_capacity(self.vote_accounts.len().saturating_div(2));
@vadorovsky (Member) commented:

Given that this is a hash map of validators, I think we could use PubkeyHasherBuilder instead of the default hasher:

Suggested change:
- HashMap::with_capacity(self.vote_accounts.len().saturating_div(2));
+ HashMap::with_capacity_and_hasher(self.vote_accounts.len().saturating_div(2), PubkeyHasherBuilder::default());

That should speed it up even more. 🙂 You'll need to add the PubkeyHasherBuilder generic to the return type in the function signature.
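A hedged sketch of what adopting the suggestion could look like, assuming PubkeyHasherBuilder is exported by solana-pubkey (behind its "rand" feature, per the discussion below) and reusing the same illustrative map shape as earlier:

```rust
use std::collections::HashMap;

use solana_pubkey::{Pubkey, PubkeyHasherBuilder};

// Sketch: same aggregation, but the result map hashes keys with the
// pubkey-specialized hasher instead of the default SipHash; note the
// extra hasher generic in the return type.
fn staked_nodes_pubkey_hasher(
    vote_accounts: &HashMap<Pubkey, (u64, Pubkey)>,
) -> HashMap<Pubkey, u64, PubkeyHasherBuilder> {
    let mut staked_nodes = HashMap::with_capacity_and_hasher(
        vote_accounts.len().saturating_div(2),
        PubkeyHasherBuilder::default(),
    );
    for (stake, node_pubkey) in vote_accounts.values() {
        if *stake != 0 {
            staked_nodes
                .entry(*node_pubkey)
                .and_modify(|total| *total += *stake)
                .or_insert(*stake);
        }
    }
    staked_nodes
}
```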

@HaoranYi (Author) replied:

PubkeyHasherBuilder is behind the "rand" feature, and the vote package doesn't enable it.
What's your experience with enabling the rand feature? Would it be a broader change that requires more testing to ensure no regression?
How about doing it as a separate optimization in a follow-up PR?

@vadorovsky (Member) replied:

Sounds good, we can do it separately.

Regarding adding the rand feature: in PR #7307 I added the rand feature to solana-pubkey in solana-accounts-db. The only randomness that comes with it is the randomization of the 8-byte subslice of the pubkey used as a hash map key:

https://github.com/anza-xyz/solana-sdk/blob/a7e12b1d4af8fba2a43d447af31aebdc3dbe8a1d/address/src/lib.rs#L13-L14
https://github.com/anza-xyz/solana-sdk/blob/a7e12b1d4af8fba2a43d447af31aebdc3dbe8a1d/address/src/hasher.rs#L57-L81

No other module is pulled in by this feature, so I don't think there should be any unexpected impact on the validator.
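If the follow-up goes ahead, the dependency change would presumably be a one-line Cargo.toml edit in the vote package; the spec below is illustrative only (agave manages these versions through the workspace):

```toml
[dependencies]
# Illustrative only: enable the "rand" feature so PubkeyHasherBuilder
# becomes available; the real manifest may differ.
solana-pubkey = { workspace = true, features = ["rand"] }
```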

@vadorovsky previously approved these changes Oct 20, 2025
Replace the itertools grouping_map().aggregate() pattern with direct
HashMap construction using entry().and_modify().or_insert() for better
performance and reduced allocations.

Performance improvement:
- Before: 27,206 ns/iter
- After:  12,586 ns/iter
- Speedup: 2.16x (53.7% faster)

The benchmark simulates a realistic scenario with 100 validator nodes,
each having 3-5 vote accounts, measuring stake aggregation by node
pubkey.

Changes:
- Remove unused itertools::Itertools import
- Pre-allocate HashMap with estimated capacity
- Use direct entry API for stake aggregation
- Add benchmark for staked_nodes computation
@brooksprumo self-requested a review on October 22, 2025 12:36
Replace saturating_div(2) with exact count of non-zero stake accounts
for optimal memory allocation.

Based on mainnet data showing ~14% of vote accounts have stake:
- Old: capacity = vote_accounts.len() / 2 (~3350 for 6700 accounts)
- New: capacity = non_zero_count (~970 for 6700 accounts)

Trade-offs:
- Pro: Exact capacity, zero reallocation, saves ~2380 pre-allocated entries
- Pro: Better memory efficiency (86% of vote accounts have zero stake)
- Con: Adds ~800ns overhead from counting iteration (~6% slower)

Benchmark results:
- Before: 12,516 ns/iter
- After:  13,353 ns/iter
- Trade-off: Slightly slower but uses exact memory
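A sketch of the exact-count variant described in this commit message, under the same illustrative map shape as earlier (the `non_zero_count` name mirrors the commit; this is not the literal agave code):

```rust
use std::collections::HashMap;

use solana_pubkey::Pubkey;

// Sketch: one extra counting pass buys an exact capacity, so the map
// never reallocates and never over-allocates for zero-stake accounts.
fn staked_nodes_exact_capacity(
    vote_accounts: &HashMap<Pubkey, (u64, Pubkey)>,
) -> HashMap<Pubkey, u64> {
    // Per the mainnet data above, ~86% of vote accounts carry zero
    // stake, so counting first shrinks the allocation from len()/2
    // down to the true staked count.
    let non_zero_count = vote_accounts
        .values()
        .filter(|(stake, _)| *stake != 0)
        .count();
    let mut staked_nodes = HashMap::with_capacity(non_zero_count);
    for (stake, node_pubkey) in vote_accounts.values() {
        if *stake != 0 {
            staked_nodes
                .entry(*node_pubkey)
                .and_modify(|total| *total += *stake)
                .or_insert(*stake);
        }
    }
    staked_nodes
}
```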
@brooksprumo previously approved these changes Oct 23, 2025
@brooksprumo left a comment

:shipit:

@HaoranYi (Author) commented:

I updated the benchmark with realistic mainnet data (6700 vote accounts, 970 with stake).

Benchmark results:

| Approach | Performance | Memory | Decision |
| --- | --- | --- | --- |
| Default (no capacity) | 76,157 ns | Minimal initially, grows | Too slow (2.3x) |
| Pre-scan exact count | 36,861 ns | Optimal (970 entries) | 11% slower |
| div(2) estimate | 33,249 ns | 3,350 entries | Fastest |

Pre-scan is 10% slower than the div(2) estimate, but it prevents us from wasting more memory as the number of zero-stake vote accounts grows.
Pre-scan is the better long-term solution.

@HaoranYi force-pushed the optimize-staked-nodes-hashmap branch from 6e9e91b to 8c566a0 on October 23, 2025 14:22
@HaoranYi enabled auto-merge October 23, 2025 15:25
@HaoranYi requested a review from brooksprumo October 23, 2025 15:25
@HaoranYi (Author) commented Oct 23, 2025

@brooksprumo I updated the benchmark, and that dismissed your approval. Can you re-approve it? There is no actual prod code change.

@brooksprumo commented:

I'm not sure the new benchmarks should be added. They compare different implementations, which is valuable for this PR and for choosing one, but beyond this PR I don't see the value. I would think we only want benches in the repo for our current code.

We could put the benchmarks for comparing impls as text/source in this PR, which would make it useful if we want to revisit in the future.

@HaoranYi force-pushed the optimize-staked-nodes-hashmap branch from 8c566a0 to 29ecdc7 on October 23, 2025 16:40
@HaoranYi (Author) commented Oct 23, 2025

> I'm not sure the new benchmarks should be added. They are comparing different implementations, which is valuable for this PR and choosing one, but beyond this PR I don't see the value. I would think we only want benches in the repo for our current code.
>
> We could put the benchmarks for comparing impls as text/source in this PR, which would make it useful if we want to revisit in the future.

OK, reverted the bench commit. The benchmark results are already in the PR comments; if we want to revisit, we can look at this PR.
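In that spirit, a minimal sketch of how the comparison could be reproduced outside the repo; this is not the reverted bench, just an assumed criterion harness with synthetic data sized to the mainnet figures quoted above (Pubkey::new_unique is used to build test keys):

```rust
use std::collections::HashMap;

use criterion::{black_box, criterion_group, criterion_main, Criterion};
use solana_pubkey::Pubkey;

// Build a synthetic vote-accounts map: 6700 accounts, 970 staked,
// matching the mainnet distribution discussed in this PR.
fn synthetic_vote_accounts() -> HashMap<Pubkey, (u64, Pubkey)> {
    (0..6700u64)
        .map(|i| {
            let stake: u64 = if i < 970 { 1_000_000 } else { 0 };
            (Pubkey::new_unique(), (stake, Pubkey::new_unique()))
        })
        .collect()
}

// The entry-API aggregation under test (illustrative shape, as above).
fn staked_nodes_entry_api(
    vote_accounts: &HashMap<Pubkey, (u64, Pubkey)>,
) -> HashMap<Pubkey, u64> {
    let mut staked_nodes =
        HashMap::with_capacity(vote_accounts.len().saturating_div(2));
    for (stake, node_pubkey) in vote_accounts.values() {
        if *stake != 0 {
            *staked_nodes.entry(*node_pubkey).or_insert(0) += *stake;
        }
    }
    staked_nodes
}

fn bench_staked_nodes(c: &mut Criterion) {
    let vote_accounts = synthetic_vote_accounts();
    c.bench_function("staked_nodes_entry_api", |b| {
        b.iter(|| staked_nodes_entry_api(black_box(&vote_accounts)))
    });
}

criterion_group!(benches, bench_staked_nodes);
criterion_main!(benches);
```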

@brooksprumo left a comment

:shipit:

If we need to make future changes, I think we should rename the non_zero_ stuff to staked_.

@HaoranYi added this pull request to the merge queue Oct 23, 2025
Merged via the queue into anza-xyz:master with commit c3a829f Oct 23, 2025
55 checks passed
@HaoranYi deleted the optimize-staked-nodes-hashmap branch October 23, 2025 17:42
rustopian pushed a commit to rustopian/agave that referenced this pull request Nov 20, 2025

* Optimize staked_nodes() by replacing itertools with manual HashMap
* pr feedback
* Simplify stake aggregation logic per PR feedback
* Use exact non-zero count for HashMap capacity