# [benchmark-cli] Wrong proof_size calculation leading to exaggerated weights #13765

## Description
This was found while investigating a 5x inflated `proof_size` weight for one of the `pallet_contracts` API functions.
The bug was introduced in #11637.
## Context

Benchmark results are processed by `writer::process_storage_results()` in the following way.
It loops through all the storage keys of all results:
`substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs`, lines 549 to 550 in `9c92e49`:

```rust
for result in results.iter().rev() {
	for (key, reads, writes, whitelisted) in &result.keys {
```
and duplicates each benchmark result for each key, adjusting each copy's `proof_size` per key, since the adjustment depends on:
- the PoV estimation mode
- the single-read PoV overhead
- the number of reads
`substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs`, lines 628 to 632 in `9c92e49`:

```rust
// Add the additional trie layer overhead for every new prefix.
if *reads > 0 {
	prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
}
storage_per_prefix.entry(prefix.clone()).or_default().push(prefix_result);
```
The result is the `storage_per_prefix` map, which was originally used only for generating comments listing the storage keys touched during each benchmark.
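For orientation, the shape of this map is roughly the following. This is a simplified sketch with made-up type names; the real `BenchmarkResult` lives in `frame-benchmarking` and carries more fields:

```rust
use std::collections::HashMap;

// Simplified sketch of the relevant fields only; the real BenchmarkResult
// also carries extrinsic time, reads, writes, etc.
struct BenchmarkResult {
    components: Vec<(String, u32)>, // (component name, value) for this run
    proof_size: u32,                // PoV size measured for ALL keys of the run
}

// Key: storage prefix (pallet + item). Value: one adjusted *copy* of every
// benchmark result, so each run's all-keys proof_size appears once per
// prefix it touched.
type StoragePerPrefix = HashMap<Vec<u8>, Vec<BenchmarkResult>>;
```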
## The Bug

In #11637 this data started being used to compute the resulting `proof_size` formula written into `weights.rs`, but in a wrong way:
**Step 1 (almost right).** For each prefix we find the base `proof_size` value and the slope for each component by running a regression analysis over the benchmark results collected for that prefix in `storage_per_prefix` (see above).
`substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs`, lines 299 to 303 in `9c92e49`:

```rust
let proof_size_per_components = storage_per_prefix
	.iter()
	.map(|(prefix, results)| {
		let proof_size = analysis_function(results, BenchmarkSelector::ProofSize)
			.expect("analysis function should return proof sizes for valid inputs");
```
(This is only almost right, because the values being regressed are not per-key benchmark results: they originate from benchmark results covering all of a run's keys and were then adjusted for a single key; see below for details.)
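This is not the actual `analysis_function`, but a minimal sketch of what a per-prefix linear fit computes for a single component: an intercept (base) and a slope from `(component value, proof_size)` pairs. The function name and single-component restriction are mine; the real analysis handles multiple components and selectable strategies:

```rust
/// Ordinary least-squares fit of `proof_size = base + slope * x` for one
/// component. Sketch only, not the benchmarking-cli analysis code.
fn fit_base_and_slope(points: &[(f64, f64)]) -> (f64, f64) {
    let n = points.len() as f64;
    let mean_x = points.iter().map(|(x, _)| x).sum::<f64>() / n;
    let mean_y = points.iter().map(|(_, y)| y).sum::<f64>() / n;
    let cov: f64 = points.iter().map(|(x, y)| (x - mean_x) * (y - mean_y)).sum();
    let var: f64 = points.iter().map(|(x, _)| (x - mean_x).powi(2)).sum();
    let slope = if var == 0.0 { 0.0 } else { cov / var };
    let base = mean_y - slope * mean_x;
    (base, slope)
}
```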
**Step 2 (wrong).** The resulting `base_calculated_proof_size` and `component_calculated_proof_size` values are computed as a plain sum of the per-prefix regression results across all prefixes.
`substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs`, lines 317 to 323 in `9c92e49`:

```rust
for (_, slope, base) in proof_size_per_components.iter() {
	base_calculated_proof_size += base;
	for component in slope.iter() {
		let mut found = false;
		for used_component in used_calculated_proof_size.iter_mut() {
			if used_component.name == component.name {
				used_component.slope += component.slope;
```
But wait a minute. To recap, the path of these proof_sizes is (see the toy example after this list):
- They were first measured for all keys in each benchmark run.
- Then we duplicated them for each key, adjusting each copy by that key's read overhead (which is odd, because we add the PoV overhead of a single key to a `proof_size` that already covers all keys).
- Then we summed up those duplicated values.
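A worked toy example of the inflation, with hypothetical numbers and the per-key trie-layer overhead ignored for simplicity:

```rust
fn main() {
    // One benchmark run touched 3 storage prefixes; the measured PoV for
    // the whole run (all keys together) was 3000 bytes.
    let all_keys_proof_size = 3000u32;
    let touched_prefixes = 3u32;

    // process_storage_results() pushes a copy of this result into
    // storage_per_prefix for each prefix, so each prefix's regression
    // base comes out at ~3000 bytes.
    let per_prefix_base = all_keys_proof_size;

    // Step 2 then sums the per-prefix bases:
    let base_calculated_proof_size: u32 =
        (0..touched_prefixes).map(|_| per_prefix_base).sum();

    assert_eq!(base_calculated_proof_size, 9000); // 3x the real measurement
    println!("reported base proof_size: {base_calculated_proof_size}");
}
```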
And this is wrong: a benchmark that touches N storage prefixes has its total `proof_size` counted roughly N times, which leads to exaggerated weights when put into `weights.rs`:
`substrate/.maintain/frame-weight-template.hbs`, lines 73 to 75 in `9c92e49`:

```handlebars
{{#each benchmark.component_calculated_proof_size as |cp|}}
.saturating_add(Weight::from_parts(0, {{cp.slope}}).saturating_mul({{cp.name}}.into()))
{{/each}}
```
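For a component `n`, the template expands to something like the following in the generated `weights.rs`. The function name and all constants here are made up for illustration; only the shape matches the template:

```rust
use frame_support::weights::Weight;

// Illustrative expansion of the template above (hypothetical constants):
// with 3 touched prefixes, both the base proof_size and the slope end up
// roughly 3x what a single measurement justifies.
fn benchmarked_call(n: u32) -> Weight {
    Weight::from_parts(10_000_000, 9_000)
        .saturating_add(Weight::from_parts(0, 2_469).saturating_mul(n.into()))
}
```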
## Suggested Fix

Instead of a plain sum, `base_calculated_proof_size` and `component_calculated_proof_size` should be calculated as averages or medians across all keys.
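A minimal sketch of that direction, not a patch against `writer.rs` (names simplified): combine the per-prefix bases by averaging instead of summing.

```rust
/// Combine per-prefix regression bases into one value by averaging
/// instead of summing. Sketch of the suggested fix; the per-component
/// slopes in component_calculated_proof_size would be averaged (or a
/// median taken) the same way.
fn combined_base(per_prefix_bases: &[u32]) -> u32 {
    if per_prefix_bases.is_empty() {
        return 0;
    }
    let sum: u64 = per_prefix_bases.iter().map(|&b| b as u64).sum();
    (sum / per_prefix_bases.len() as u64) as u32
}
```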