Skip to content
This repository was archived by the owner on Jan 22, 2025. It is now read-only.

Conversation

@sakridge
Copy link
Contributor

@sakridge sakridge commented Jun 7, 2021

Problem

Validator collects votes all the time even if it has no use for them.

Summary of Changes

Only collect votes when the node is the leader.

Fixes #17790

@sakridge sakridge force-pushed the cluster-votes-listener branch from c007bbe to 966cf5d Compare June 7, 2021 17:41
@sakridge sakridge force-pushed the cluster-votes-listener branch from 966cf5d to 7a51778 Compare June 8, 2021 08:46
@sakridge sakridge requested review from behzadnouri and carllin June 8, 2021 12:55
@codecov
Copy link

codecov bot commented Jun 8, 2021

Codecov Report

Merging #17803 (7a51778) into master (708bbcb) will increase coverage by 0.0%.
The diff coverage is 81.8%.

@@           Coverage Diff           @@
##           master   #17803   +/-   ##
=======================================
  Coverage    82.7%    82.7%           
=======================================
  Files         431      431           
  Lines      120590   120596    +6     
=======================================
+ Hits        99839    99844    +5     
- Misses      20751    20752    +1     

@mvines mvines added the v1.7 label Jun 8, 2021
@carllin
Copy link
Contributor

carllin commented Jun 9, 2021

So the major change here is that leaders will now be more reliant on forwarding to get votes that don't land within 20 slots?

Edit: It's also of note since we don't forward vote transactions: https://github.com/solana-labs/solana/blob/master/core/src/banking_stage.rs#L299, then if there's long delays in gossip vote propagation or nodes are drifting and have a bad estimate of when they will next be leader, i.e. forking/delays throw off the poh_recorder.would_be_leader(20 * DEFAULT_TICKS_PER_SLOT); check.

I think the concerns above are ok though because:

  1. Validators resubmit their latest votes now that have not landed
  2. Validators will be voting with newer votes anyways and if the later votes land first, this will make the earlier votes that haven't landed unprocessable. This may slightly exacerbate this issue of lockouts being inaccurate based on votes landing out of order: Generating vote from bank diff doesn't accurately recreate lockouts #16768, but this is an existing issue.

@sakridge
Copy link
Contributor Author

sakridge commented Jun 9, 2021

So the major change here is that leaders will now be more reliant on forwarding to get votes that don't land within 20 slots?

Edit: It's also of note since we don't forward vote transactions: https://github.com/solana-labs/solana/blob/master/core/src/banking_stage.rs#L299, then if there's long delays in gossip vote propagation or nodes are drifting and have a bad estimate of when they will next be leader, i.e. forking/delays throw off the poh_recorder.would_be_leader(20 * DEFAULT_TICKS_PER_SLOT); check.

I think the concerns above are ok though because:

  1. Validators resubmit their latest votes now that have not landed
  2. Validators will be voting with newer votes anyways and if the later votes land first, this will make the earlier votes that haven't landed unprocessable. This may slightly exacerbate this issue of lockouts being inaccurate based on votes landing out of order: Generating vote from bank diff doesn't accurately recreate lockouts #16768, but this is an existing issue.

Yea. Ideally the constant of 20 is set such that 99.9% of the time there would be no difference in which votes it sees vs. always keeping them. I'm not sure if that's 20 or something higher or lower, but it seemed like a reasonable starting point.

@sakridge sakridge merged commit 0feac57 into solana-labs:master Jun 11, 2021
@sakridge sakridge deleted the cluster-votes-listener branch June 11, 2021 16:29
mergify bot pushed a commit that referenced this pull request Jun 11, 2021
mergify bot added a commit that referenced this pull request Jun 16, 2021
@brooksprumo brooksprumo mentioned this pull request Aug 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cluster_info_vote_listener uses too much memory

3 participants