-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Ensure clean wasm instances via synchronous clone. #2938
Conversation
|
IIRC deep cloning a module instance is not that simple. Internally it uses reference counted objects, such as memory. So simple |
|
@arkpar @bkchr Yes, you're right. I tried to implement deep clone functions for So the approach which I just pushed emerged after talking to @pepyakin: simply preserving the initial memory by reading from the exports and restoring it for each runtime (all synchronously). Initial benchmarks look good (lmk if you want other benchmarks):
This PR: I mentioned this before, the benchmark might get worse if we decide for more complicated logic (like delayed cache eviction). |
pepyakin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave a brief review
| match runtime_preproc { | ||
| RuntimePreproc::InvalidCode => { | ||
| let code = ext.original_storage(well_known_keys::CODE).unwrap_or(vec![]); | ||
| Err(Error::InvalidCode(code)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a little bit offtopic, but I wonder what is the point of returning the code here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error enum is defined here and requires the code, I just extracted this part from native_executor. Not sure about the reasoning, but since the original code (and mine as well) later outputs the code in a trace! this could have been a reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about usability of this and I am going to remove this.
|
Looking at the impl, it doesn't account for possible updates coming from |
…-wasm-instances-v2
|
I just pushed the fixed version (although, still not ready for the review). I benchmarked it like this: (It would be cool to generate the chain of blocks and then export them and then import but this is blocked by #2977). Note that I choose the different params, because apparently transaction-factory doesn't handle the case when the master runs out of funds. I also used As the baseline, I took the latest master commit e63598b. master: 29.2467s So there is a gap, but it is not terribly big: 1ms per block. There are still areas for improvement. We could decrease the amount of work by avoiding copying the entire linear memory space, but asking runtimes for special global values which specify where does the heap start so we can only copy everything before that and then FWIW, I also still have hopes that pooling of wasm instances can give noticeble improvement for import times. |
|
This approach has some deficiencies which if solved cause massive slow-downs. That made me to research another approach. Will create a PR shortly. |
Addresses #2051.
The runtime cache which substrate currently uses does reuse one runtime instance for every call.
In order to enable this the instance is cleaned up for every call. This leads to the problems detailed in
#2051.
In the process of creating #2931 to address the issue it turned out that cloning a
wasmi::ModuleInstancesynchronously is not that expensive. This PR implements a basic proof of concept (no tests, still a global variable, no delayed cache eviction) just to see how it performs.This PR creates a template instance on first fetch from the cache and clones it synchronously for succeeding fetches. Since
prepare_moduleis already called before every call into wasm this works.I benchmarked it against
masterusingperf statand our transaction factory (lmk if you have a better idea for creating a benchmark):master:This PR:
The speedup might be because the implementation is currently very simple, whereas the runtime cache from
masterdoes a bit more (operations on aHashMap, …). So there is probably some slow-down when we implement e.g. delayed cache eviction. Still I think it looks very promising and we wouldn't have to use threadsafe types for wasmi.If there are no complaints I suggest to implement this approach. @pepyakin @bkchr @arkpar, wdyt?
UPD(pepyakin):
Closes #2967