Multiple Validator Sets #241

@kayabaNerve

Multiple validator sets are a long-term design goal. To prevent every validator from needing to run every single network, which would require TBs of storage, each validator would only have to run a single network. Each network would have its own validator set and, with it, its own bond/set of managed coins.

This has documentation here: https://github.com/serai-dex/serai/blob/e979883f2d02dc38c6ee012aed5865fa0fa8eb79/docs/protocol/Validator%20Sets.md#participation-in-the-bft-process

The issue is primarily how we use inherents to publish data. Because of that, BTC events would only be included when a BTC validator produces the block. For a sufficiently large overall validator set, networks with few validators will experience massive latency. We also have the issue that, since all validators participate in Tendermint, we have a scalability limit there.
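
To put a rough number on that latency, here is a minimal sketch. It assumes each block's proposer is drawn uniformly at random from all validators, which is a simplification (Tendermint's actual proposer selection is a weighted rotation). Under that assumption, the wait until a validator of a given network proposes is geometric with mean `total / network_validators` blocks. The function name and the validator counts are hypothetical.

```rust
/// Expected number of blocks until a validator of the given network proposes,
/// ASSUMING a uniformly random proposer per block (a simplification).
/// Returns None when the inputs make no sense.
fn expected_blocks_until_inclusion(total_validators: u32, network_validators: u32) -> Option<f64> {
    if network_validators == 0 || network_validators > total_validators {
        return None;
    }
    // Geometric distribution with success probability p = k / n has mean 1 / p = n / k.
    Some(f64::from(total_validators) / f64::from(network_validators))
}

fn main() {
    // Hypothetical figures: 100 validators overall, 4 of which run the BTC network.
    match expected_blocks_until_inclusion(100, 4) {
        Some(blocks) => println!("expected wait: {blocks} blocks"),
        None => println!("no validator for this network"),
    }
}
```

The expected wait scales inversely with a network's share of the validator set, which is why small networks suffer most.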

This is something I planned to tackle months down the road, yet due to recent events making me spend a few mom

We need a way for coin X validators to submit their events to Serai validators, which is independently verifiable.

The solution is for coin X validators to sign each batch, via a threshold signature or optionally via half-aggregation, and send the signed batches to validators, who then include them as inherent transactions. There's no requirement for all validators to have prior awareness, though, as the bundle is independently verifiable. This resolves paritytech/substrate#6.
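
The "independently verifiable" property can be sketched as follows. This is a toy, not Serai's implementation. It assumes the threshold signature is Schnorr-style, so it verifies like an ordinary Schnorr signature under the network's group key, and a single hypothetical secret stands in for the threshold signing process. The group parameters are tiny and insecure, and every type and function name here is made up for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy group parameters: NOT secure, chosen only so the sketch runs with std alone.
const P: u64 = 2_147_483_647; // the Mersenne prime 2^31 - 1
const G: u64 = 7; // a primitive root modulo P
const ORDER: u64 = P - 1; // exponents are reduced modulo the group order

/// Square-and-multiply exponentiation modulo P (products stay below 2^62).
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Hypothetical batch of events reported by one external network's validators.
#[derive(Clone, Debug, Hash)]
struct Batch {
    network: String,
    id: u32,
    events: Vec<String>,
}

#[derive(Clone, Debug)]
struct SignedBatch {
    batch: Batch,
    r: u64, // nonce commitment G^k
    s: u64, // response k + c * x
}

/// Challenge binding the commitment, the group key, and the batch contents.
fn challenge(r: u64, key: u64, batch: &Batch) -> u64 {
    let mut hasher = DefaultHasher::new();
    (r, key, batch).hash(&mut hasher);
    hasher.finish() % ORDER
}

/// Stand-in for the threshold signing protocol: one secret signs for the set.
fn sign(secret: u64, nonce: u64, batch: Batch) -> SignedBatch {
    let r = pow_mod(G, nonce);
    let c = challenge(r, pow_mod(G, secret), &batch);
    let s = (nonce % ORDER + c * (secret % ORDER) % ORDER) % ORDER;
    SignedBatch { batch, r, s }
}

/// What any validator can run knowing only each network's group key:
/// checks G^s == r * key^c (mod P) without running the external network.
fn verify(keys: &HashMap<String, u64>, signed: &SignedBatch) -> bool {
    let key = match keys.get(&signed.batch.network) {
        Some(key) => *key,
        None => return false,
    };
    if signed.r == 0 || signed.r >= P {
        return false;
    }
    let c = challenge(signed.r, key, &signed.batch);
    pow_mod(G, signed.s) == signed.r * pow_mod(key, c) % P
}

fn main() {
    let secret = 65_537; // hypothetical group secret
    let mut keys = HashMap::new();
    keys.insert("bitcoin".to_string(), pow_mod(G, secret));

    let batch = Batch {
        network: "bitcoin".to_string(),
        id: 0,
        events: vec!["deposit:alice:1000".to_string()],
    };
    let signed = sign(secret, 123_456_789, batch);
    println!("valid batch accepted: {}", verify(&keys, &signed));

    let mut tampered = signed.clone();
    tampered.batch.events.push("deposit:mallory:9999".to_string());
    println!("tampered batch accepted: {}", verify(&keys, &tampered));
}
```

The point of the design is that `verify` needs only the per-network group key, so a block producer from any validator set can include a batch it did not observe itself.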

It also means we have no explicit reason to use Tendermint in Substrate. Removing it resolves paritytech/substrate#134 paritytech/substrate#137 paritytech/substrate#159 paritytech/substrate#171 paritytech/substrate#183.

paritytech/substrate#157 also partially benefits. It shifts the issue from Tendermint to concerns about the coordinator.

Tendermint took two months, and this would destroy about half of that work. The only reason it doesn't destroy all of it is that the machine is still useful for paritytech/substrate#163, which is likely for after mainnet. We can also severely de-prioritize paritytech/substrate#169 and paritytech/substrate#170, as they'd no longer be relevant to the main chain.

While I'm unhappy with this realization, it's where we're at. We could keep using Tendermint, which offers greater control over consensus and avoids GRANDPA failing if offline for multiple days, yet it likely isn't worth the scope when a threshold signature achieves almost the same efficiency without putting consensus at risk. While I wished to handle MVS later, the changes it'll require must be at least partially implemented now to be sane.

This does increase coordinator complexity.

The plan should be to work towards a functional e2e flow, with Tendermint, for protonet and then make this refactor.
