-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Fix key collision for child trie #4162
Conversation
child_specific (unique id) + u32 type.
sr-io, genesis json broken if there is child trie in existing encoding genesis.
if things changed, otherwhise the root does not change).
client/src/light/backend.rs
Outdated
|
|
||
| // this is only called when genesis block is imported => shouldn't be performance bottleneck | ||
| let mut storage: HashMap<Option<Vec<u8>>, StorageOverlay> = HashMap::new(); | ||
| let mut storage: HashMap<Option<(Vec<u8>, OwnedChildInfo)>, StorageOverlay> = HashMap::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usage of 'OwnedChildInfo' in the key is rather questionable here, yet it seem to work well with current implementation.
| Vec<(Vec<u8>, Option<Vec<u8>>)>, | ||
| )> { | ||
| fn consolidate(&mut self, mut other: Self) { | ||
| self.append(&mut other); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will produce multiple definition with the same key (child trie storage key and ownedinfo), a btreemap could be use instead, but this is closer to existing code (it already factor some data redundancy).
| child_info: ChildInfo, | ||
| key: &[u8], | ||
| ) -> Result<Option<Vec<u8>>, Self::Error> { | ||
| Ok(self.inner.get(&Some((storage_key.to_vec(), child_info.to_owned()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Memory costy instantiation for key. Generally In memory could change to some { top : , children: } with child info into value of children.
Not sure if it should switch in this pr.
Edit : can probably be factor with basic.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I wonder, should we actually keeps this childInfo ? in the real backend those information are not stored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need it to calculate the trie delta (when calling full_storage_root I need to rebuild the roots so I need to know where and how children are stored (to query existing trie state)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I misread the question, in the real backend I think child info is stored with the delta (in value of the map).
|
I did start thinking on how things can evolve if the api switch to using hierarchical child trie (it start to seem usefull for case such as palet module in its own child trie which itself use some child trie). The point is that changing child api at this point is done under the asumption that child trie are not being use in any runtime, but future changes would need to keep the existing sr-io interface in place. First idea would be to use multiple storage_key as parameter, but it would also require multiple (child_info, child_type) for each step. Another idea would be that hierarchical trie would use their own child trie_type and child_info encoding. At this point open questions should be:
|
|
would this break kusama? (i don't think so as we don't use child tries, but want to check) |
|
As long as there is no wasm runtime containing 'ext_child...' function it should not. |
|
Putting status 'inprogress' while I switch code to using struct instead of tuple. |
|
I did switch to struct where there is tuple, this is very likely to conflict with other PR but looks better (change to genesis conf seems to work directly with polkadot (json parser is quiet flexible)). |
|
I'm generally fine with the way it is now. |
|
@cheme needs resolving |
switch hashmap to btreemap, switch next_child interface to use child_info.
This PR superseed #3804, in favor of a simple implementation. It reverts to the initial approach of #2209.
CC\ @arkpar @thiolliere
Polkadot required change: paritytech/polkadot@f724806
To avoid key collision for child trie, a unique id will be prepend to its rocksdb storage key values.
This prefix need to be crypto strong to avoid collision with other keys (we use the same column as the top trie).
Definition of this prefix is the direct responsibility of palet module using child tries.
If a module does not use a collision resistant unique id for the child trie it can break the whole db, so this is a bit less robust than previous implementation.
The palet module will probably need to store this unique id on chain but could also use static unique id in case of a bounded number of child trie.
An example of unbounded child trie usage is in contract module (we use the hash part of trie_id).
Breaking change
This PR keep 'deprecated_host_interface' child trie function.
I think I should delete them because using them will only result in problem (only usable for archive mode). At this point it just act as previously by using childinfo with an empty uniqueid and no root.
Storage host child trie function were all change: this means that a chain linking to this function will be broken.
Checking latest kusama runtime, those are not use, and I do not know of any case where they are use, therefore I choose to change them this can be a breaking change but I think it is not.
If it ends up being a breaking change, new ext function will be needed.
The new prototype is meant to be extensible, therefore we add a 'child info' encoded byte and a u32 type.
Enum
ChildInfois created from them.A new parameter is use: the
ChildInfoenum. At this point it only allow a strong crypto id with possibly a root to avoid a useless query.For 'srml support', the api do not use
ChildInfobutsimply an optional unique id with option None reserved for id given by systemdirectly a unique id. This is very questionable design as it forbid usage of child root directly. On the other hand it is a 'simple' api. So should we align to use childinfo?genesis json
the genesis encoding does change to include the child information, this means that existing json chainspec containing some child trie will need to be updated.
I do not know any chainspec using child trie, so this should not be an issue.
rpc
We need to break rpc (unique id being set/defined/generated by palet module we have no way to only use storage key as before).
I firstly implemented it by added 'unique id' after 'storage key' parameter, but finally find it more extensible to use the same api as for externalities.
That way we can already use rpc without fetching child root (child info starting with 32 byte of root, then unique id and child type 2).
This can be use to get some partial proof (only the child query part of the proof).
Next steps
change contract module to memoïze child root (either by direct runtime-io usage or change support child api) and use it in child info, this can be included in this pr.
putting child trie with a sequential uniqueid stored in state that is not crypto resistant should be easy, but it requires writing all child trie into their own rocksdb column. This would be equivalent to final design of avoid key collision on child trie and proof on child trie #2209. This can be implemented in a follow up pr by using another child type key in rpc and ext interface and using None in general interface.
In fact api already take unique id as optional for this next PR.
Using child trie with a sequential defined id that is not stored in state (stored on kv storage) will be for later (putting the kv storage to store this information in place being a substantial task that may require additional design).