This repository was archived by the owner on Nov 15, 2023. It is now read-only.

[benchmark-cli] Wrong proof_size calculation leading to exaggerated weights #13765

@agryaznov

Description

This was found while investigating a 5x-inflated proof_size weight for one of the pallet_contracts API functions.
This bug was introduced in #11637.

Context

Benchmark results are processed by writer::process_storage_results() in the following way.

It loops through all the storage keys of all results:

for result in results.iter().rev() {
    for (key, reads, writes, whitelisted) in &result.keys {

and duplicates each benchmark result for every key, adjusting each copy's proof_size per key, since it depends on:

  • the PoV estimation mode
  • the single-read PoV overhead
  • the number of reads

// Add the additional trie layer overhead for every new prefix.
if *reads > 0 {
    prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
}
storage_per_prefix.entry(prefix.clone()).or_default().push(prefix_result);
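
Putting the two snippets together, a minimal self-contained sketch of this grouping (with simplified, hypothetical types, not the actual writer.rs code) looks like this; note how the whole-run proof_size is cloned once per touched prefix, so storage_per_prefix ends up holding a near-identical copy of it under every prefix:

use std::collections::HashMap;

#[derive(Clone)]
struct BenchmarkResult {
    proof_size: u32,
    // (prefix, reads, writes, whitelisted), mirroring `result.keys` above
    keys: Vec<(Vec<u8>, u32, u32, bool)>,
}

// Group the whole-run results per storage prefix, as process_storage_results() does.
fn group_per_prefix(
    results: &[BenchmarkResult],
    additional_trie_layers: u8,
) -> HashMap<Vec<u8>, Vec<BenchmarkResult>> {
    let mut storage_per_prefix: HashMap<Vec<u8>, Vec<BenchmarkResult>> = HashMap::new();
    for result in results.iter().rev() {
        for (prefix, reads, _writes, _whitelisted) in &result.keys {
            // The proof_size measured for the *entire run* is cloned for this prefix...
            let mut prefix_result = result.clone();
            // ...and then bumped by the single-key trie-layer overhead.
            if *reads > 0 {
                prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
            }
            storage_per_prefix
                .entry(prefix.clone())
                .or_default()
                .push(prefix_result);
        }
    }
    storage_per_prefix
}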

The result is the storage_per_prefix data, which was originally used only to generate comments listing the storage keys touched during each benchmark.

The Bug

In PR #11637 this data started to be used to calculate the resulting proof_size formula written into weights.rs, but in a wrong way:

Step 1 (almost right). We find an average base proof_size value and a slope for each component by running a regression analysis over the benchmark results collected for that prefix in storage_per_prefix (see above).

let proof_size_per_components = storage_per_prefix
    .iter()
    .map(|(prefix, results)| {
        let proof_size = analysis_function(results, BenchmarkSelector::ProofSize)
            .expect("analysis function should return proof sizes for valid inputs");

(This is only almost right, because the values we run the regression on are not per-key benchmark results: they originate from benchmark results covering all the keys and were then adjusted for a single key; see below for details.)
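
For reference, the regression here conceptually fits proof_size ≈ base + slope * component over the results collected for one prefix. A standalone ordinary-least-squares sketch (hypothetical, not the benchmarking analysis module):

// Fit proof_size ≈ base + slope * component_value over the
// (component_value, proof_size) pairs collected for one prefix.
fn fit_line(points: &[(f64, f64)]) -> (f64, f64) {
    let n = points.len() as f64;
    let mean_x = points.iter().map(|(x, _)| x).sum::<f64>() / n;
    let mean_y = points.iter().map(|(_, y)| y).sum::<f64>() / n;
    let cov: f64 = points.iter().map(|(x, y)| (x - mean_x) * (y - mean_y)).sum();
    let var: f64 = points.iter().map(|(x, _)| (x - mean_x).powi(2)).sum();
    let slope = if var == 0.0 { 0.0 } else { cov / var };
    let base = mean_y - slope * mean_x;
    // Returns (base, slope) for this prefix.
    (base, slope)
}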

Step 2 (wrong). The resulting base_calculated_proof_size and component_calculated_proof_size values are calculated as a simple sum of these per-prefix values over all prefixes.

for (_, slope, base) in proof_size_per_components.iter() {
    base_calculated_proof_size += base;
    for component in slope.iter() {
        let mut found = false;
        for used_component in used_calculated_proof_size.iter_mut() {
            if used_component.name == component.name {
                used_component.slope += component.slope;

But wait a minute. To recap, the path of these proof_size values is:

  1. They were first calculated over all keys in each benchmark run.
  2. Then we duplicated them for each key, adjusting each copy by the read overhead of that key
    (which is odd, because we add the PoV overhead of a single key to the proof_size measured for all the keys).
  3. Then we summed up those duplicated values.

And this is wrong: it leads to exaggerated weights once they are written into weights.rs:

{{#each benchmark.component_calculated_proof_size as |cp|}}
.saturating_add(Weight::from_parts(0, {{cp.slope}}).saturating_mul({{cp.name}}.into()))
{{/each}}
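
To put made-up numbers on it: suppose a benchmark run touches 3 storage prefixes and measures a total proof_size of about 1000 bytes. Each prefix's regression base then comes out at roughly 1000 (plus the per-key trie-layer overhead), and summing the three per-prefix bases reports about 3000 bytes, roughly 3x the real value:

fn main() {
    // Made-up numbers, only to illustrate the over-counting:
    let per_run_proof_size: u64 = 1_000; // proof_size measured for the whole run
    let touched_prefixes: u64 = 3;       // each prefix's regression base is ~per_run_proof_size
    let summed_base = per_run_proof_size * touched_prefixes;
    assert_eq!(summed_base, 3_000);      // ~3x what a single run actually needs
}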

Suggested Fix

Instead of a simple sum, base_calculated_proof_size and component_calculated_proof_size should be calculated as an average or a median across all keys.
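
A minimal sketch of that direction (hypothetical helper, not a patch to the actual writer.rs):

// Combine the per-prefix regression bases with a mean or a median instead of a
// plain sum, so the reported base reflects one run rather than one run
// multiplied by the number of touched prefixes.
fn combine_bases(mut per_prefix_bases: Vec<u32>) -> (u32, u32) {
    assert!(!per_prefix_bases.is_empty(), "need at least one prefix");
    let mean = (per_prefix_bases.iter().map(|b| *b as u64).sum::<u64>()
        / per_prefix_bases.len() as u64) as u32;
    per_prefix_bases.sort_unstable();
    let median = per_prefix_bases[per_prefix_bases.len() / 2];
    (mean, median)
}

A median would be more robust against a single prefix with an unusually large overhead; a mean is simpler and matches the "average base proof_size" wording from Step 1. The same treatment would apply to the per-component slopes.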
