
Conversation

@cheme (Contributor) commented Feb 19, 2019

This PR implements the new codec spec from polkadot-re-spec.

The main change is the removal of the trie extension node. This is toggled by switching an associated constant in TrieLayout.
Other changes are:

  • nibbleslice changes
    I reworked nibbleslice a bit, replacing encode with left and right.
    Basically, encode was copying and aligning bytes; the new split mostly follows
    the spec description.
  • The left of a nibble offset is called Prefix and is encoded with padding at the end when
    the number of nibbles does not fill a whole byte.
    This is used as the key prefix in the db. It is a breaking change, but it has the nice property of allowing iteration over keys in the db (for contract child tries, for instance, where we only have the contract's key, or in case kvdb evolves to allow collection specifics).
    Also note that the prefix now also includes the number of padding nibbles (needed to iterate correctly).
  • The right of a nibble offset is called Partial; this is similar to the existing variable naming and
    is what is encoded into trie nodes: the padding is at the start.
    This allows a leaf to avoid shifting bytes for its partial. That is not the case for a branch.
    I believe a leaf is more likely to use a long partial (especially if the key results from a hash).
  • To avoid using a vec, the types use a slice and an optional last byte (when there is padding).
    Actually, the optional last byte is a (u8, u8) to allow other layouts.
    (A sketch of this left/right split follows this list.)
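
For illustration, here is a minimal sketch of the left/right split described above. All names and the exact byte layout are assumptions made for this example, not the actual nibbleslice API of this PR:

const NIBBLE_PER_BYTE: usize = 2;

/// Hypothetical split of a key at a nibble offset.
/// Left ("Prefix"): whole bytes plus an optional last byte padded at the end,
/// carried as (padded_byte, number_of_padding_nibbles) so db key iteration stays correct.
/// Right ("Partial"): optional first byte padded at the start, then an unshifted
/// byte slice, so a leaf can encode its partial without copying.
fn split_at_nibble(key: &[u8], nibble_offset: usize)
	-> ((&[u8], Option<(u8, u8)>), (Option<u8>, &[u8]))
{
	let byte = nibble_offset / NIBBLE_PER_BYTE;
	if nibble_offset % NIBBLE_PER_BYTE == 0 {
		// Aligned offset: no shared byte, so no padding on either side.
		((&key[..byte], None), (None, &key[byte..]))
	} else {
		// Odd offset: the byte at the boundary is shared between both sides.
		let prefix_last = key[byte] & 0xf0; // low nibble zeroed: padding at the end
		let partial_first = key[byte] & 0x0f; // high nibble zeroed: padding at the start
		((&key[..byte], Some((prefix_last, 1))), (Some(partial_first), &key[byte + 1..]))
	}
}

For example, splitting [0x12, 0x34] at nibble offset 3 would give a prefix of [0x12] plus the padded byte 0x30 with one padding nibble, and a partial whose padded first byte is 0x04 with no remaining bytes.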

Note that some of the code could revert to slower usage of the at function to make it easier to review.

It also solves the existing TODO about removing the header from the encoding (see the code in the
ethereum branch, where a code impact is visible; in substrate it is all fine).

  • TrieLayout

An envelope trait over the trie layout that keeps all implementation details in one place.
It is also used by an experimental (lightly tested) alternate trie layout (a different radix, only for aligned keys and in a single byte,
which limits it to 4 variants).
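
As a rough illustration of the idea (the names here are assumed for the example; the real trait also carries things like the hasher and node codec types):

/// Sketch of an envelope trait keeping the layout choices in one place.
pub trait TrieLayout {
	/// `false` selects the new layout without extension nodes.
	const USE_EXTENSION: bool;
}

/// Example layouts toggling the associated constant.
pub struct ExtensionLayout;
impl TrieLayout for ExtensionLayout {
	const USE_EXTENSION: bool = true;
}

pub struct NoExtensionLayout;
impl TrieLayout for NoExtensionLayout {
	const USE_EXTENSION: bool = false;
}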

@cheme added the question (Further information is requested) label on Feb 19, 2019
@cheme (Contributor, Author) commented Feb 26, 2019

Note about the NodeHeader scheme: compacting could look like:

// branch = value - 1 in extension size -> val 85, branch 84
const EMPTY_TRIE: u8 = 0;
const LEAF_NODE_OFFSET: u8 = 1; // (1 to 86) = max nibble length 85
const BRANCH_NODE_NODE_OFFSET: u8 = 87; // (87 to 171) max 84
const BRANCH_NODE_WITH_VALUE_OFFSET: u8 = 172; // (172 to 255) max 84

Problem: some schemes in substrate (double storage map) require a length > 32 bytes (from the current code there is an 8-bit prefix + 128-bit twox hash + 256-bit blake, so a nibble length of 96, more than the 85 allowed above).
So the one-byte length encoding does not fit; it feels like we should definitely allow encoding the length on two bytes for big values.
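
A minimal sketch of one possible two-byte escape, reusing the offsets above (an assumption about how it could work, not necessarily the encoding this PR ends up with):

const LEAF_NODE_OFFSET: u8 = 1; // leaf range 1 to 86, i.e. partial length 0 to 85

/// Encode a leaf header, spilling large partial lengths into a second byte.
fn encode_leaf_header(nibble_len: usize, out: &mut Vec<u8>) {
	if nibble_len < 85 {
		// Small lengths fit in the single header byte.
		out.push(LEAF_NODE_OFFSET + nibble_len as u8);
	} else {
		// Use the top of the leaf range as an escape marker and put the
		// remainder in a following byte (caps the length at 85 + 255 nibbles).
		out.push(LEAF_NODE_OFFSET + 85);
		out.push((nibble_len - 85) as u8);
	}
}

With such an escape, the 96-nibble double storage map key mentioned above would encode its header as the two bytes [86, 11].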

cheme added 24 commits March 21, 2019 21:42
are currently broken; expect a lot of non-working things too.
And an additional malleability safeguard.
Tests involving `Lookup` are currently broken.
specific constant that will be used for merging the NoExt code with the
previous one.
Furthermore, it could be used in a multi-trie implementation (no constant used,
but a fn).
Currently only done for Mut (non-mut will wait to see if merging goes
well).
@gavofyork (Member) commented:

@cheme when you're certain that both the reported issues and the typical issues (bad naming, missing docs, low-quality docs) are fixed, please ping @arkpar and @svyatonik to get a final review.

@cheme (Contributor, Author) commented Jul 29, 2019

@arkpar, @svyatonik I went over the bad naming and tried to go over the docs again this morning.

There are still things I can do to reduce the code review size:

  • remove fuzzing and benches.
  • remove iter_build (extensively used by fuzzing but not much elsewhere; I may even put it in its own crate (some codec things may need to go public)).
  • remove the other radix variant; this can simplify the nibble logic a bit (the main idea was to make things clearer by generalizing, but reading back the code I am not sure it is).
  • remove one of the two encoding variants from test-support.

@gavofyork (Member) commented:

@cheme please remove as much as you can to keep the review size down

@gavofyork (Member) commented:

@arkpar @svyatonik would be great if we could get this reviewed again

And replacing ReferenceError with CodecError.
@arkpar (Member) commented Jul 31, 2019

Fuzzing and benching may be left as is.
Generic nibble size/tree arity sounds nice, but in practice it also introduces additional overhead, e.g. the need to store an extra byte per key prefix. I'd remove it for now.

Using 'GenericNoExtensionLayout' for 'NoExtensionLayout'.
@gavofyork (Member) commented:

@cheme couple of questions there...

@arkpar @svyatonik are you guys ok with the PR otherwise?

pub fn biggest_depth(v1: &[u8], v2: &[u8]) -> usize {
	// sorted assertion preventing out of bound
	for a in 0..v1.len() {
		if v1[a] == v2[a] {
Contributor commented:

Is it always correct that v1.len() <= v2.len() (otherwise it'll panic)? I have found the only usage of biggest_depth() to be from trie_visit(). I assume that the input there is ordered in some manner? If it is sorted using a BTreeMap, then given that e.g. vec![5, 5] < vec![20], this could panic, no?

Contributor (Author) commented:

In fact there is no verification of the ordering; the iter_build code does not support unordered input.
I can prevent this from failing, but I will need to propagate the error; that seems pretty quick to do.

Contributor commented:

Sorry - I mean that even if the input is ordered (lexicographically, I suppose), this could still panic, because lexicographically a vec of larger len can come before a vec of smaller len. No need to propagate an error imo.

Contributor (Author) commented:

yes (I just realized; I just need to min the loop bound).

Contributor commented:

(I think you also need to update this line: return v1.len() * NIBBLE_PER_BYTE; below to use cmp::min() instead of v1.len())
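
Putting the two remarks together, a panic-free version could look like the sketch below (the behaviour past the loop-bound fix, returning the common nibble-prefix length, is an assumption and may differ from the actual iter_build code):

const NIBBLE_PER_BYTE: usize = 2;

/// Common nibble-prefix length of two keys.
/// Both the loop bound and the final return use the shorter length,
/// so differing key lengths can no longer index out of bounds.
pub fn biggest_depth(v1: &[u8], v2: &[u8]) -> usize {
	let upper_bound = std::cmp::min(v1.len(), v2.len());
	for a in 0..upper_bound {
		if v1[a] != v2[a] {
			// Bytes differ: the high nibble may still match.
			return if v1[a] >> 4 == v2[a] >> 4 {
				a * NIBBLE_PER_BYTE + 1
			} else {
				a * NIBBLE_PER_BYTE
			};
		}
	}
	upper_bound * NIBBLE_PER_BYTE
}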

@arkpar (Member) commented Aug 1, 2019

I've verified that the implementation matches the spec and there's sufficient test coverage. So "looksgood" from me.
