-
Notifications
You must be signed in to change notification settings - Fork 244
Hybrid Consensus Node #1876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hybrid Consensus Node #1876
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An interesting approach, you've created the consensus engine abstraction on top of the existing crates! I have several questions about it:
- Do you have the whole roadmap of the consensus change? It would be easier to know what part of the code is missing and what we expect in the next parts.
- How do you test the consensus change? It seems it's not possible until the next part is merged.
- Am I correct that if I propagate the custom babe block I can switch the consensus from a random node?
- What do you think about the specific block number as an explicit consensus engine boundary? This way we'll get rid of the pending verification, string error as a marker, etc.
loop { | ||
// Check if the runtime is Babe once per block. | ||
if let Ok(c) = sc_consensus_babe::configuration(&*client) { | ||
if !c.authorities.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we hardcode the exact block number to switch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to reduce "magic numbers" that have the possibility to be misconfigured where possible. However if there is benefit to switching on block number instead of this I'm open to it
Thank you @shamil-gadelshin for your detailed review!
Yes, you can view the full changes (including runtime upgrade + migration) here: #1708 tldr; next changes after merging this are
You can test them on the If you're interested in it let me know and I'll send you exact commands I used to set it up :)
This is an interesting edge case I have not considered! I think you are correct, I would need to do additional checks on the Babe block rather than switching straight to Babe consensus...
Yeah this is something I considered. Initially I tried to keep away from needing to hardcode any block numbers, keeping this as close as I could to being a "normal" runtime upgrade. However you raise good points about verification. I'll have a think about it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes and the roadmap. It's easier to understand the general direction now.
My current concern is the moment of the consensus change. We don't need flexibility for this event because it seems like "one time unidirectional change". If the hardcoded magic const makes the code simpler, let's use it. If it introduces more issues than it solves, let's use your current method (presence of babe authorities) or a similar external marker.
@gztensor saw this error after the runtime upgrade. It seems that "babe_switch" worker asks BabeApi and gets "not implemented" Do we have a fix for this one? |
@shamil-gadelshin thanks for giving it another try! It will panic if you try to use the Babe node with an Aura runtime. This branch does not include the change to the new Babe NPoS runtime, which I suspect is why you see that panic. Please try following my demo here https://discord.com/channels/1120750674595024897/1387662637843742760/1399902628309106822 where I test it on the Let me know how you go! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runtime upgrade works without bugs. Warp sync works as expected.
reverted until we can test on mainnet clone and a few other things |
Phase 1 in #1887.
I added the full NPoS dependencies in this PR primarily to reduce future merge conflicts, which recently been a bit time consuming to resolve.
Hybrid Node
The Hybrid Node handles producing, syncing and validating both Aura and Babe based blocks. It contains logic to dynamically switch between Aura and Babe based processing, depending on the specific block it needs to process.
Implementation
Node Service
The core changes are in the node service logic. Consensus specific service logic was migrated to a new
ConsensusMechanism
trait, which I have written both Aura and Babe implementation for.Node RPC
Node rpc logic was made generic across consensus mechanisms by allowing the caller to pass the Frontier Consensus Data Provider to use, and custom RPC modules to instantiate.
CLI / Command
A new cli parameter
--initial-consensus
was added, which determines whichConsensusMechanism
to first start the node in. The node will dynamically switch betweenConsensusMechanism
s at runtime as needed to continue block production / validation / import.TODO
runtime-benchmarks
feature flagConditionalEVMBlockImport
is up to date