State cache and other performance optimizations #1345
Conversation
gnunicorn left a comment
Fine overall, a few TODOs, wraps and smaller style remarks.
// finish instantiation by running 'start' function (if any).
let instance = intermediate_instance.run_start(&mut fec)?;
let low = memory.lowest_used();
let used_mem = memory.used_size();
Assuming that `used_size` returns the highest written address so far, this will be set to the address of the last written byte (shouldn't it be +1?), and then in `Heap` we assume that we can allocate memory starting from this address.
This completely ignores the fact that the compiler might reserve some space without actually writing to it (a.k.a. BSS).
I think this would work for now, since Rust doesn't use the start function at all at the moment and we only have Rust runtimes. But I'm not sure this is a universal approach: Rust's behavior can change, and somebody may want to use a language other than Rust.
Of course, we could impose constraints on wasm modules to be compatible with Substrate, but I think that might not be the best decision and could backfire even with Rust.
UPD: withdrawn the concern about the start function run.
> Assuming that `used_size` returns the highest written address so far, this will be set to the address of the last written byte (shouldn't it be +1?), and then in `Heap` we assume that we can allocate memory starting from this address.

+1 is not required. The buffer should be shrunk to exactly this size.

> This completely ignores the fact that the compiler might reserve some space without actually writing to it (a.k.a. BSS).

Not exactly. This relies on wasmi always initializing all data segments, as it does now. Since this code already relies on some wasmi internals, I don't think this is much of a stretch.
The bottom line is simple: if the interpreter can do 200bps as demonstrated here, it should provide a convenient interface so it can be used efficiently. I'm sure compatibility can be achieved without sacrificing performance.
re: the +1 remark, wasmi documentation doesn't appear to be consistent:
/// Returns current used memory size in bytes.
/// This is the highest memory address that had been written to.
pub fn used_size(&self) -> Bytes {
Bytes(self.buffer.borrow().len())
}
I would presume that `buffer.len()` actually gives a size rather than an index (i.e. writing to index 0 implies the buffer has size 1).
So the remark "This is the highest memory address that had been written to" is wrong: it should read "This is one more than the highest memory address that had been written to".
@pepyakin could you confirm?
That function was added by me and the documentation is indeed incorrect.
* State caching
* Better code caching
* Execution optimization
* More optimizations
* Updated wasmi
* Caching test
* Style
* Style
* Reverted some minor changes
* Style and typos
* Style and typos
* Removed panics on missing memory
Update Rust to latest nightly version and apply new clippy suggestions
Closes #257
The major change introduced in this PR is the canonical state cache. This works in the same way as in parity-ethereum. The cache caches a single state `C` that is considered "canonical" or "best". Any state `S` that is derived from `C` can use the cached values as long as the key is not overwritten in any of the blocks between `C` and `S`.

A few other minor changes:

`changes` in `CallResult` and the storage root calculation are not really required for each call, so the whole struct is dropped.