
Conversation

arkpar (Member) commented Feb 20, 2019

For some applications, such as Substrate, it is desirable to have each node be unique in the trie, so that the same node can't be inserted into two separate branches of the same trie. This simplifies the implementation of node storage quite a lot, since it removes the need for reference counting.

In Ethereum this uniqueness is already guaranteed by the fact that the keys are hashes, and each value ends up in a leaf node with a long, random partial key. In Substrate this is not the case, as the keys are plain.

This PR introduces an additional parameter for the HashDB functions that takes the encoded partial node key. This allows colliding nodes to be separated at the database level, and enables more efficient database implementations: e.g. nodes that are close to each other in the trie may be grouped together on disk.
Trie root calculation is not affected.
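To make the shape of the change concrete, here is a minimal sketch of the prefixed HashDB surface. The `get` signature matches the diff excerpt quoted in the review below; the other method names and their exact prefix placement are assumptions for illustration, and the crate's existing `Hasher` trait is assumed to be in scope.

// Sketch only: `get` is taken from this PR's diff; `contains`, `insert`,
// and `remove` are assumed to take the same `prefix: &[u8]` parameter.
pub trait HashDB<H: Hasher, T>: Send + Sync {
    /// Look up a node by its hash and the encoded partial key leading to it.
    fn get(&self, key: &H::Out, prefix: &[u8]) -> Option<T>;
    /// Check whether a node exists under the given hash and prefix.
    fn contains(&self, key: &H::Out, prefix: &[u8]) -> bool;
    /// Insert a node, keyed by its hash, recording the prefix alongside it.
    fn insert(&mut self, prefix: &[u8], value: &[u8]) -> H::Out;
    /// Remove a node stored under the given hash and prefix.
    fn remove(&mut self, key: &H::Out, prefix: &[u8]);
}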

cheme (Contributor) commented Feb 21, 2019

Looks interesting. Out of curiosity, why is this needed? (TrieStream of trie-root probably also needs an update.)
Edit: no need to reply, did not see the referenced issue :)

arkpar (Member, Author) commented Feb 21, 2019

Updated the description.

xlc commented Mar 10, 2019

@cheme Is anything preventing this from being merged? I really want paritytech/substrate#1733 to get fixed.

cheme (Contributor) commented Mar 10, 2019

@xlc, the code in itself seems functionally fine and could be merged, but it is currently unclear if this will be the actual fix for paritytech/substrate#1733, and it would also require trie_root changes for full compatibility (probably requiring modifying the 'TrieStream' trait).
Using this alternate scheme changes the content of all tries (every hash ref in every node, so also every root): as I see it, for Substrate it means either implementing a way to run two versions of the trie depending on the block (quite difficult), or rebooting the chain.
I believe @arkpar is working on an alternate fix to avoid such a breaking change; I do not have the details (for my part I can only imagine using the key as the immediate backing db key for storage, which would require a db migration; it could improve performance a bit through fewer operations on value updates, but I did not think too much about this).
So I totally agree with you that 1733 is a very high priority issue, but spending time on alternative possible solutions could really be worth it.

xlc commented Mar 10, 2019

@cheme Thanks for the detailed explanation. Good to know this is not getting stale.

arkpar removed the inprogress label Mar 20, 2019
arkpar changed the title from "Allow node hashing with key" to "Node key prefixes in the database." Mar 20, 2019
arkpar changed the title from "Node key prefixes in the database." to "Node key prefixes in the database" Mar 20, 2019
cheme (Contributor) left a comment


This looks great. I did not run tests yet (I could probably fuzz it a bit later, but I need to update my trie_root alternative algo from #11); still, I have put a few first comments.
I am also starting to wonder: why not have a unique trie id and calculate full_key from the concatenation of unique_id + prefix only? This would allow direct access to a value without going through the whole trie. (The unique_id may not even be needed in the case of a column containing only the trie.)
Ok, I am stupid, this does not keep the history; it is only possible if managing history or using one unique trie id per block. Still, allowing a custom full_key scheme would be interesting (at least for parity-eth compatibility).
Last observation: this PR may reduce the possibilities of the trie_root crate (some form of Stream could be used to build a trie by including a db; this would now require some modification of the Stream trait if I am correct).



/// Make database key from hash and prefix.
pub fn full_key<H: KeyHasher>(key: &H::Out, prefix: &[u8]) -> Key {
cheme (Contributor) commented:

Here I would certainly see something like:
pub fn full_key<H: KeyHasher>(key: &H::Out, prefix: &[u8], key_dest: &mut [u8]) {
and reuse a full-key buffer in MemoryDB.
It would also require a function returning the size.
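A hypothetical sketch of that buffer-reuse variant (the helper names, and the split into a size function plus a fill function, are assumptions for illustration, not code from this PR):

// Hypothetical helpers, assuming the same `KeyHasher` bound as `full_key` above.
// The caller sizes a reusable buffer with `full_key_len` and fills it in place.
pub fn full_key_len<H: KeyHasher>(key: &H::Out, prefix: &[u8]) -> usize {
    prefix.len() + key.as_ref().len()
}

pub fn full_key_into<H: KeyHasher>(key: &H::Out, prefix: &[u8], key_dest: &mut [u8]) {
    let split = prefix.len();
    key_dest[..split].copy_from_slice(prefix);
    key_dest[split..split + key.as_ref().len()].copy_from_slice(key.as_ref());
}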

arkpar (Member, Author) replied:

Reusing the buffer would only make sense in read-only methods, and that would make MemoryDB not thread-safe for reading, or would require additional synchronization.

pub fn full_key<H: KeyHasher>(key: &H::Out, prefix: &[u8]) -> Key {
    let mut full_key = Vec::with_capacity(key.as_ref().len() + prefix.len());
    full_key.extend_from_slice(prefix);
    full_key.extend_from_slice(key.as_ref());
cheme (Contributor) commented:

This is not mixing the key with the prefix anymore, but it assumes a variable length for full_key: the parity_common kvdb crate would probably benefit from being able to choose the key type (currently ElasticArray32; in our case ElasticArray64 or an intermediate value would fit better).
Opening an issue so we do not forget that may be an idea.

arkpar (Member, Author) replied:

Profiling shows that allocations here and calculating partial keys are insignificant compared to IO and other code, so I left it for later.

/// Look up a given hash into the bytes that hash to it, returning None if the
/// hash is not known.
fn get(&self, key: &H::Out) -> Option<T>;
fn get(&self, key: &H::Out, prefix: &[u8]) -> Option<T>;
cheme (Contributor) commented:

PlainDB could probably use that too (from what I understand, PlainDB is for variable-length keys); @sorpaas would know better than I.

// this loop iterates through non-inline nodes.
for depth in 0.. {
    let node_data = match self.db.get(&hash) {
    let node_data = match self.db.get(&hash, &key.encoded_leftmost(key_nibbles, false)) {
cheme (Contributor) commented:

A small slowdown for Parity Ethereum; I do not really see a way of avoiding it.

cheme (Contributor) commented Mar 24, 2019

I did start updating another trie PR on Friday using this PR's changes: things seem to work fine :) (after solving a few indexing issues of my own, I could fuzz my trie builder against this prefixed implementation). Manipulating this new API makes me wonder about two points:

fn key(&self, hash: &H::Out, prefix: &[u8]) -> Self::Key; 

I could do keyspace indexing in the KeyFunction implementation.
This implies that TrieDB and TrieDBMut will be built with a third parameter (the KeyFunction struct as an inner field), but it won't really add verbosity (we already need to set the KeyFunction type when building the trie and trie-mut).
This could also be used to add additional contextual info in the key for other trie usages (see the hypothetical sketch after the second point below).

But honestly, this is still doable by overloading the db if we consider this not to be the role of the KeyFunction trait.

  • I am not really comfortable with using the nibble trie encoding for the db key prefix (it does not allow db-sorted iteration over a common key prefix in the db due to the header infos).
    Ideally the way we encode the prefix in the db could be defined in memorydb, i.e. in the KeyFunction implementation. It would require passing the key plus a nibble index (or an equivalent of NibbleSlice) to the key function instead of the encoded prefix.
    It would also avoid running the nibble encoding in the parity-eth case (that shouldn't be the reason to do it this way, though).
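For illustration, a hypothetical stateful key function along the lines of the `key(&self, ...)` signature above, which folds a per-trie keyspace into the backing-db key (all names here are invented for the sketch; this is not code from the PR or from memory-db):

// Hypothetical: a key function carrying a per-trie keyspace as state, so the
// backing-db key becomes keyspace ++ prefix ++ hash. Assumes the same
// `KeyHasher` bound used by `full_key` above.
pub struct KeyspaceKeyFunction {
    keyspace: Vec<u8>,
}

impl KeyspaceKeyFunction {
    pub fn key<H: KeyHasher>(&self, hash: &H::Out, prefix: &[u8]) -> Vec<u8> {
        let mut full = Vec::with_capacity(self.keyspace.len() + prefix.len() + hash.as_ref().len());
        full.extend_from_slice(&self.keyspace);
        full.extend_from_slice(prefix);
        full.extend_from_slice(hash.as_ref());
        full
    }
}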

arkpar (Member, Author) commented Mar 27, 2019

Regarding paritytech/substrate#2035, this could be done purely in the HashDb implementation. KeyFunction is an implementation detail of MemoryDb; I don't think it should be part of TrieDb or TrieDbMut. The trie itself is not concerned with duplicate nodes, reference counting, or storage optimizations. It just provides enough information for the backend to resolve all these issues.

> I am not really comfortable with using the nibble trie encoding for the db key prefix (it does not allow db-sorted iteration over a common key prefix in the db due to the header infos).

Not sure I understand this. Trie iteration is surely still possible. Node backend iteration or seek depends on the HashDb implementation and is out of the scope of this PR. E.g. with the Substrate key function, iteration is still possible, but seeking by node hash is not.

cheme (Contributor) commented Mar 27, 2019

About paritytech/substrate#2035, the HashDB way of doing things is fine (still, I find it more elegant with KeyFunction; I am currently realizing that I need to handle the empty node value case of HashDB, which is native with KeyFunction). But it is really a matter of design and does not have to make it into this PR.
About iteration, it is mostly speculation about db optimization (RocksDB or others with ordered tree iteration): having a common prefix looks better to me, but as with the previous point it does not have to make it into this PR.
