This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

@gui1117
Contributor

@gui1117 gui1117 commented Nov 22, 2019

DONE:

  • a new trait StoragePrefixedMap is introduced and implemented for map, linked_map and double_map. It allows enumeration and removal of all keys under the map's prefix.
  • ChangeSetOverlay is now using BTreeMap instead of HashMap.
  • InMemory backend is now using BTreeMap instead of HashMap.
  • sr-io has a new function next_key(key) -> Option<key> returning the next key in storage.
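For illustration, the semantics of next_key can be modeled on a plain BTreeMap (a sketch, not the actual sr-io implementation): it returns the first key strictly greater than the one given, which is what makes repeated calls enumerate the storage.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

// Sketch of `next_key` semantics over an in-memory BTreeMap:
// return the first key strictly greater than `key`, if any.
fn next_key(storage: &BTreeMap<Vec<u8>, Vec<u8>>, key: &[u8]) -> Option<Vec<u8>> {
    storage
        .range::<[u8], _>((Excluded(key), Unbounded))
        .next()
        .map(|(k, _)| k.clone())
}

fn main() {
    let mut storage = BTreeMap::new();
    storage.insert(b"balance:alice".to_vec(), vec![1u8]);
    storage.insert(b"balance:bob".to_vec(), vec![2u8]);

    // Enumerate every key by repeatedly asking for the next one.
    assert_eq!(next_key(&storage, b"balance:"), Some(b"balance:alice".to_vec()));
    assert_eq!(next_key(&storage, b"balance:alice"), Some(b"balance:bob".to_vec()));
    assert_eq!(next_key(&storage, b"balance:bob"), None);
    println!("ok");
}
```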

TODO:

  • cache the backend next_key in Ext:
    every time next_key is called we fetch the next key in the overlay and the next key in the backend, but the backend doesn't implement any cache for key iteration. A quick cache could be implemented in the Ext implementation of Externalities, but if a user iterates over two structures at the same time this cache is useless. A proper cache should be implemented in a lower layer.

  • the sr-io API could include the suffix, like next_key_with_suffix(key, suffix) -> Option<key>, so that if we have to implement a cache we could limit the number of keys cached in advance.

    note: in the current architecture, with BTreeMap for the changeset, this is not needed.
    note: if we have to change this overlay in the future, we can still provide a new call with this suffix as an optimized variant.

  • benchmark to see if BTreeMap is OK; otherwise we can try another solution with some cache on the HashMap. Some benchmarks between BTreeMap and HashMap are in #4185 (comment)

Fix #3883

@rphmeier rphmeier added the A3-in_progress Pull request is in progress. No review needed at this stage. label Nov 22, 2019
@gui1117
Contributor Author

gui1117 commented Nov 22, 2019

Iteration on overlay by changing HashMap to BTreeMap:

Early benchmark between BTreeMap and HashMap:
each map is filled with random keys of 32 bytes; we then remove 100 existing keys, insert 100 random keys, or get 100 existing keys. The results show:

if the maps contain fewer than 10_000 key/values, performance is about the same.
if the maps contain 100_000 key/values, BTreeMap is almost 2 times slower than HashMap.
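A minimal version of such a comparison can be sketched as follows (hypothetical code, not the attached benchmark.txt; the key generator is a deterministic stand-in for random 32-byte keys, and actual timings depend on the machine, so none are asserted):

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::Instant;

// Deterministic stand-in for the random 32-byte keys described above.
fn key(i: u64) -> Vec<u8> {
    let mut k = vec![0u8; 32];
    k[..8].copy_from_slice(&i.wrapping_mul(0x9E37_79B9_7F4A_7C15).to_be_bytes());
    k
}

fn main() {
    const N: u64 = 100_000;
    let btree: BTreeMap<Vec<u8>, u64> = (0..N).map(|i| (key(i), i)).collect();
    let hash: HashMap<Vec<u8>, u64> = (0..N).map(|i| (key(i), i)).collect();

    // Time 100 gets of existing keys in each map.
    let t = Instant::now();
    for i in 0..100 {
        assert!(btree.get(&key(i)).is_some());
    }
    println!("BTreeMap, 100 gets: {:?}", t.elapsed());

    let t = Instant::now();
    for i in 0..100 {
        assert!(hash.get(&key(i)).is_some());
    }
    println!("HashMap, 100 gets: {:?}", t.elapsed());
}
```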

benchmark.txt
result.txt

Iteration on overlay with cache.

If ext_call specifies the range on which it will iterate

for example get_in_range(from_key, until_key, number_of_keys_to_return) -> Vec<Key>,
then we can cache all the keys from the last returned one (excluded) up to until_key. Then, if the cached range is touched (e.g. a key is inserted or removed), we can either reduce the cache to the clean part (but this cache-reducing operation then happens on every write), or simply remove the cache entirely when it is touched.

If ext_call doesn't specify the range on which it will iterate

then the only way to make a useful cache is to reduce the cache to the non-dirty part whenever the storage gets dirty.
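One way to picture this "keep only the clean part" policy (purely hypothetical types, not code from this PR): hold a queue of prefetched keys in ascending order and, on any write, drop every cached key at or after the written key, since only those can have become stale.

```rust
use std::collections::VecDeque;

// Hypothetical next-key cache: holds prefetched keys in ascending order.
struct NextKeyCache {
    keys: VecDeque<Vec<u8>>,
}

impl NextKeyCache {
    fn new(prefetched: Vec<Vec<u8>>) -> Self {
        Self { keys: prefetched.into() }
    }

    // Serve the next cached key, if any.
    fn next(&mut self) -> Option<Vec<u8>> {
        self.keys.pop_front()
    }

    // A key was inserted or removed: keep only cached keys strictly
    // before it, since anything at or after it may now be wrong.
    fn on_write(&mut self, written: &[u8]) {
        while matches!(self.keys.back(), Some(k) if k.as_slice() >= written) {
            self.keys.pop_back();
        }
    }
}

fn main() {
    let mut cache = NextKeyCache::new(vec![b"a".to_vec(), b"c".to_vec(), b"e".to_vec()]);
    cache.on_write(b"d"); // invalidates "e", keeps "a" and "c"
    assert_eq!(cache.next(), Some(b"a".to_vec()));
    assert_eq!(cache.next(), Some(b"c".to_vec()));
    assert_eq!(cache.next(), None);
    println!("ok");
}
```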

@gui1117
Contributor Author

gui1117 commented Nov 26, 2019

implementation is almost complete; still to do:

  • optimisation of next_key in the overlay.
  • storage cache in the client.
  • implement next_key for child storage (easy).

@xlc
Contributor

xlc commented Nov 27, 2019

How likely is this to be merged within the next two weeks? Asking because I am writing materials for my Substrate course, and if this lands before the course starts, I can drop the linked_map part and teach prefixed maps instead.

@gui1117 gui1117 changed the title Introduce prefixed storage without enumeration Introduce prefixed storage with enumeration Nov 27, 2019
@gui1117 gui1117 marked this pull request as ready for review November 29, 2019 17:22
@gui1117
Contributor Author

gui1117 commented Nov 29, 2019

How likely is this to be merged within the next two weeks? Asking because I am writing materials for my Substrate course, and if this lands before the course starts, I can drop the linked_map part and teach prefixed maps instead.

The PRs waiting for this are no longer waiting, but it still seems a very useful feature. About the deadline: it mostly depends on whether performance is acceptable when switching from HashMap to BTreeMap, or whether we need a better, different cache while still using HashMap.

@gavofyork
Member

it's acceptable, though we might want to revert to hashmap and add a cache at a later date if real-world usage dictates. as long as the API is fixed and we can change things as desired later, then we're good.

@gui1117
Contributor Author

gui1117 commented Dec 3, 2019

yes, understood.

hmm, then maybe it's worth making the sr-io API next_key_with_prefix(next_key: &[u8], prefix: &[u8]), so that if we have to implement a cache the runtime can provide the prefix it wants to iterate on, avoiding a big cache when it isn't needed.
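The intent can be sketched over a plain BTreeMap (the name next_key_with_prefix and its exact signature are the proposal above, not a merged API): the prefix bounds the iteration, so any cache would only ever need to hold keys under that prefix.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

// Sketch: next key strictly after `key` that still starts with `prefix`.
fn next_key_with_prefix(
    storage: &BTreeMap<Vec<u8>, Vec<u8>>,
    key: &[u8],
    prefix: &[u8],
) -> Option<Vec<u8>> {
    storage
        .range::<[u8], _>((Excluded(key), Unbounded))
        .next()
        .map(|(k, _)| k.clone())
        .filter(|k| k.starts_with(prefix))
}

fn main() {
    let mut storage = BTreeMap::new();
    storage.insert(b"map1:a".to_vec(), vec![1u8]);
    storage.insert(b"map1:b".to_vec(), vec![2u8]);
    storage.insert(b"map2:a".to_vec(), vec![3u8]);

    // Iterating under "map1:" stops before reaching "map2:a".
    assert_eq!(next_key_with_prefix(&storage, b"map1:a", b"map1:"), Some(b"map1:b".to_vec()));
    assert_eq!(next_key_with_prefix(&storage, b"map1:b", b"map1:"), None);
    println!("ok");
}
```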

EDIT: top description updated including this thought

@gui1117 gui1117 added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Dec 4, 2019
@gavofyork gavofyork added A6-seemsok and removed A0-please_review Pull request needs code review. labels Dec 5, 2019
@gavofyork
Member

Shortly after this goes in, we will want to consider adding to storage maps the possibility of prefixing the unhashed key to the value, thereby allowing enumeration over all keys from both the runtime and the UI. Right now, to do that, we only really have the option of using the twox(key)++key hasher, which isn't especially secure against unbalanced-trie griefing.

@gavofyork
Member

anyone else want to sign this off before merge?

@bkchr
Member

bkchr commented Dec 6, 2019

I will review it now.

// The key just after the one given in input, basically `key++0`.
// Note: We are sure this is the next key if:
// * size of key has no limit (i.e. we can always add 0 to the path),
// * and no keys can be inserted between `key` and `key++0` (this is ensured by sr-io).
Member


What does this mean and how is that ensured?

Contributor Author

@gui1117 gui1117 Dec 9, 2019


We don't want to iterate from the key included but from the key excluded, and the trie crate only provides a seek function.

So we compute the next potential key, by doing next_potential_key = key++0.

But this is only true if:

  • we can indeed append 0 to the key,
  • and there is no key in the trie between key and key++0.

The first condition requires that key size has no limit, or that the runtime is not supposed to write anything near this limit; the second is ensured because sr-io doesn't provide any way to write between key and key++0, even in a 16-trie.
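The computation under discussion is small enough to write out (a sketch mirroring the comment above, not new API): key++0 is the smallest byte string strictly greater than key, provided nothing can be stored between them.

```rust
// `key ++ 0`: the smallest byte string strictly greater than `key`,
// assuming keys are byte-addressed and key length is unbounded.
fn potential_next_key(key: &[u8]) -> Vec<u8> {
    let mut next = Vec::with_capacity(key.len() + 1);
    next.extend_from_slice(key);
    next.push(0);
    next
}

fn main() {
    let key = b"abc";
    let next = potential_next_key(key);
    assert_eq!(next, b"abc\0".to_vec());
    // Lexicographically, `key` < `key ++ 0`, and no byte-addressed key fits between.
    assert!(&key[..] < &next[..]);
    println!("ok");
}
```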

Maybe @cheme you can confirm this ?

But maybe it is better not to make those assumptions: seek from the key, get the next, and if the next is the current one, get the one after; or implement some new features in trie.

Contributor Author


that would result in something like this:

diff --git a/primitives/state-machine/src/trie_backend_essence.rs b/primitives/state-machine/src/trie_backend_essence.rs
index aea0193dd..1c8bafb09 100644
--- a/primitives/state-machine/src/trie_backend_essence.rs
+++ b/primitives/state-machine/src/trie_backend_essence.rs
@@ -110,23 +110,29 @@ impl<S: TrieBackendStorage<H>, H: Hasher> TrieBackendEssence<S, H> where H::Out:
                let mut iter = trie.iter()
                        .map_err(|e| format!("TrieDB iteration error: {}", e))?;
 
-               // The key just after the one given in input, basically `key++0`.
-               // Note: We are sure this is the next key if:
-               // * size of key has no limit (i.e. we can always add 0 to the path),
-               // * and no keys can be inserted between `key` and `key++0` (this is ensured by sr-io).
-               let mut potential_next_key = Vec::with_capacity(key.len() + 1);
-               potential_next_key.extend_from_slice(key);
-               potential_next_key.push(0);
-
-               iter.seek(&potential_next_key)
+               iter.seek(&key)
                        .map_err(|e| format!("TrieDB iterator seek error: {}", e))?;
 
                let next_element = iter.next();
 
                let next_key = if let Some(next_element) = next_element {
-                       let (next_key, _) = next_element
-                               .map_err(|e| format!("TrieDB iterator next error: {}", e))?;
-                       Some(next_key)
+                       let next_key = next_element
+                               .map_err(|e| format!("TrieDB iterator next error: {}", e))?.0;
+
+                       if next_key != key {
+                               Some(next_key)
+                       // If next key is same as key iter to second next element.
+                       } else {
+                               let next_element = iter.next();
+                               if let Some(next_element) = next_element {
+                                       let next_key = next_element
+                                               .map_err(|e| format!("TrieDB iterator next error: {}", e))?.0;
+
+                                       Some(next_key)
+                               } else {
+                                       None
+                               }
+                       }
                } else {
                        None
                };

Contributor


Yes, the fact that the api only allows writing key/values at byte addresses is important here (otherwise we would need to work with nibbles (half bytes, for a 16-trie) and append a 0 nibble).
I am not sure skipping the first iteration is very important here (we need to query the node anyway during the seek call).
The relevant perf gain would come from reusing the same iterator between calls (to skip the 'seek' call on every 'next'), but that is not an easy todo.

Contributor Author


Should we not rely on "the fact that the api only allows writing key/values at byte addresses", this property could still be leveraged in the future.

@shawntabrizi
Member

shawntabrizi commented Dec 9, 2019

Is it correct that after this PR is merged, linked_map could be entirely replaced with map?

@gui1117
Contributor Author

gui1117 commented Dec 9, 2019

Is it correct that after this PR is merged, linked_map is obsolete?

To make it "really" obsolete, we need to provide an easy way to make maps store their key alongside the value, as Gav says here: #4185 (comment)

@bkchr
Member

bkchr commented Dec 9, 2019

Is it correct that after this PR is merged, linked_map could be entirely replaced with map?

Also, maps are not ordered the way linked maps are.

@bkchr bkchr merged commit 6b70f12 into master Dec 9, 2019
@bkchr bkchr deleted the gui-prefixed-map-2 branch December 9, 2019 19:55
@gui1117 gui1117 mentioned this pull request Jan 2, 2020
cheme added a commit to cheme/substrate that referenced this pull request Oct 9, 2020


Development

Successfully merging this pull request may close these issues.

Extend Externalities with for_keys_with_prefix instead of just clear_prefix
