-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Implement change trie for child trie. #3122
Conversation
some useless computation (but there is a pending pr for that). Next are tests.
svyatonik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re choosing the implementation:
- I'm still not sure - why separate tries are better than one trie for both top + all children tries. Do you have an opinion on this?
- if there are no strong requirements for using separate tries, I'd prefer to do some benches before merging this PR. Like - given the same set of changes how faster/slower is maintaining multiple tries vs big single trie.
Re implementation - everything looks OK, except for some small issues I've found. Will do a final review once PR will be ready. Thanks for doing this!
Re removing block number from trie keys - they were added exactly for that (I mean simplified pruning). If there's a faster way for doing the same, it would be great! Could you, please, file an issue?
|
|
||
| trie_storage.for_key_values_with_prefix(&child_prefix, |key, value| { | ||
| if let Some(InputKey::ChildIndex::<Number>(trie_key)) = Decode::decode(&mut &key[..]) { | ||
| if let Some(value) = <Vec<u8>>::decode(&mut &value[..]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use H::Out instead of Vec<u8>?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(to avoid panic in case if lengths are different)
| block: block.clone(), | ||
| storage_key, | ||
| }; | ||
| child_map.insert(child_index, map.into_iter().map(|(_, (k, v))| InputPair::DigestIndex(k, v))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will replace existing value in the child_map with new one, right? Like if digest is being built for blocks [1; 8] and child storage with key child1 has been updated in block#2 && block#6, then:
- when processing block#2 we'll insert (
child1,vec![2]) intochild_map; - when processing block#6 we'll replace this value (with this
.insert()call) with (child1,vec![6]), thus losing changes for block#2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will need some intermediate map (test cases were wrong to).
| for (key, _) in committed_map.iter() { | ||
| map_entry.1.insert(key.clone(), None); | ||
| if !map_entry.contains_key(key) { | ||
| map_entry.insert(key.clone(), OverlayedValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If value has been changed in previous extrinsics, then this .insert() call will replace existing entry, thus forgetting previous extrinsics indices.
Flattening child trie in the parent trie seems doable (putting optional child storage key in the input keys). This will lead to trie containing encoded child storage key as child trie prefix and instead of having a child trie root in a value node pointing to another child trie, we will get directly the child trie top. For what it is worth there may be some possible quicker access for multiple child trie content (the parent trie query/proof can be shared). But this can also be achieved with flattened child trie by basing the triedb on an middle branch node corresponding to the common encoded key bytes of all child trie nodes (plus possible partial for key). In my opinion, what could justify the use of child trie is only wether we want to split change trie or not. If we want to be able to revert some child trie to a previous state, we could use the former child trie to, but I do not think this would really make sense (pushing some special info in change trie seems more relevant). This only works cleanly if child trie get some prefix (similar to #2209) to fetch their key in the collection (currently no prefix are use on child trie).
Hypothetical, the usage of child trie in digest may lead to skip some trie access (but the common child trie path would be cached in case of a big single trie). I don't have any idea if it can make for the added indirection level on creation. |
|
OK - you must be talking mostly about #2832, right (I mean reverting changes trie to previous state)? So the only advantage is potential boost of a revert-to-block performance. But imo it doesn't make sense. Like if you are going to revert to block#500 when you're at the block#1000, then changes trie for block#500 isn't the changes trie you want to duplicate for block#1000. Changes trie contains state difference between current and previous blocks. Example:
Or have I misunderstood something? So imo we must build changes trie from the scratch as we normally do in the case of revert-to-block operation. I'd also summon @gavofyork here - maybe he has a strong opinion on whether we need one-changes-trie-per-child-trie, or not. |
|
About reverting, I ended up with the same conclusion as you (does not make sense for change trie). |
|
@svyatonik , if you wish to bench a bit, I did a quick implementation of flattened child trie here: |
|
Ah, I've also started that :) Nvm - will use your version, thanks! :) |
|
Okay, I've got some bench results. And actually this implementation works faster than implementation from |
|
That is not really what I expected, it comes probably from the trie reading being split. I would say |
|
@cheme Is this still "A3-inprogress"? :) |
|
Oh, I forgot to switch the tag, thanks. |
svyatonik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good (aside of comment removal). I'll take a final look tomorrow - mostly worrying abut CT build, since it is consensus-critical part.
| // You should have received a copy of the GNU General Public License | ||
| // along with Substrate. If not, see <http://www.gnu.org/licenses/>. | ||
|
|
||
| //! Changes trie related structures and functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like these docs should stay?
svyatonik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last issue, otherwise looks good.
| changes.storage(k) | ||
| }; | ||
| if !existing.map(|v| v.is_some()).unwrap_or_default() { | ||
| if !backend.exists_storage(k).map_err(|e| format!("{}", e))? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If storage_key.is_some(), there should be a call to backend.exists_child_storage()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pretty bad one. I remember when looking at this part of code that this query should be expensive, I guess there may be something doable to have the info in the transaction without querying the backend (would probably require additional query of backend during execution in some case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have had this discussion in #2865 - it was an (unsuccessful) attempt to extend this check to non-temporary values. IMO that's what state cache actually handles - if value has been read from trie during execution, then it'll be read from in-memory cache for the second time. And if it wasn't, then there's no other way, than to perform this read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also guess that running this backend check during block execution, in case of invalid block, is also an unnecessary operation.
Maybe if overlay was aware of accessed data, it could conditionally store this in the transaction, but that would be similar as putting some access cache in the overlayed_change (which may be interesting to avoid some map access (we already need to check change for get_storage) but would be some substantial design change).
This PR is an attempt to fix #2622.
It makes changes trie handle child trie content.
It makes a few choice, so it is only a proposal and things can be change:
An alternate way of doing things would be to directly put child changes content into the top change trie with another new node (index build over encoding of block, encoded child trie key, child trie content key).
It also will also need some changes after #2209, main question being do we refer to child trie keyspace or parent address, in this case parent address seems to be right addressing.
I also wonder if removing block number info from change trie keys could be a good idea?
cc/ @svyatonik
It would requires prefixing memorydb key with encoded block number, and pruning could be done by directly removing all keys starting with this block number prefix (no need for trie parsing).
Similarly change trie child content could be prefixed by a unique id such as its storage path to be able to isolate its key values without trie parsing (but I got no direct use case except for being able to export change child trie without trie parsing).