Fix and update benchmarks #1494
Conversation
This reverts commit 0379d24.
serban300 left a comment
Even though I did some research over the last few days, I'm still not very familiar with benchmarking. I left some comments/questions; I hope they make sense.
let p in VALIDATOR_SET_SIZE_RANGE_BEGIN..VALIDATOR_SET_SIZE_RANGE_END;
let v in MAX_VOTE_ANCESTRIES_RANGE_BEGIN..MAX_VOTE_ANCESTRIES_RANGE_END;
Since we limited these ranges, should we also use fewer steps? Since the ranges are more or less 50..100, 50 steps would mean testing each value in the range individually. Would it make sense to use, for example, 25 steps now in order to generate the benchmarks even faster?
50 is kind of standard (don't ask me why :) ) across all repos/pallets: https://github.com/paritytech/polkadot/blob/750fcd92288a8f9637f3757b1c3dc874bd397ad5/scripts/ci/run_benches_for_runtime.sh and https://github.com/paritytech/cumulus/blob/master/scripts/benchmarks-ci.sh. We may change it in our repo, but the goal is to make benchmarks run faster when non-test runtime weights are updated, and that happens in other repos, where we can't change the step count without a good reason.
Ok, I see. I'm fine with leaving it at 50 as well if that's the standard.
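For context on the steps discussion, here is a minimal sketch of how a component range is sampled for a given number of steps. It is my own approximation of the idea, not the actual Substrate benchmarking machinery, and the 51..102 bounds are hypothetical values matching the "more or less 50..100" ranges mentioned above.

```rust
// Approximate sampling of a benchmark component range: with ~50 values in the
// range and 50 steps, essentially every value gets benchmarked; with 25 steps,
// only roughly every other value would be.
fn sample_points(begin: u32, end: u32, steps: u32) -> Vec<u32> {
    let span = end - begin;
    (0..=steps).map(|step| begin + span * step / steps).collect()
}

fn main() {
    // Hypothetical bounds, roughly matching the ranges discussed above.
    println!("50 steps: {:?}", sample_points(51, 102, 50));
    println!("25 steps: {:?}", sample_points(51, 102, 25));
}
```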
// `1..MAX_VALIDATOR_SET_SIZE` and `1..MAX_VOTE_ANCESTRIES` are too large && benchmarks are
// running for almost 40m (steps=50, repeat=20) on a decent laptop, which is too much. Since
// we're building linear function here, let's just select some limited subrange for benchmarking.
const VALIDATOR_SET_SIZE_RANGE_BEGIN: u32 = MAX_VALIDATOR_SET_SIZE / 20;
const VALIDATOR_SET_SIZE_RANGE_END: u32 =
	VALIDATOR_SET_SIZE_RANGE_BEGIN + VALIDATOR_SET_SIZE_RANGE_BEGIN;
const MAX_VOTE_ANCESTRIES_RANGE_BEGIN: u32 = MAX_VOTE_ANCESTRIES / 20;
const MAX_VOTE_ANCESTRIES_RANGE_END: u32 =
	MAX_VOTE_ANCESTRIES_RANGE_BEGIN + MAX_VOTE_ANCESTRIES_RANGE_BEGIN;
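To make the subrange choice concrete, here is a self-contained sketch with made-up maxima; the real `MAX_VALIDATOR_SET_SIZE` and `MAX_VOTE_ANCESTRIES` are defined in the runtime code, and the 1024 values below are placeholders only.

```rust
// Placeholders only; the real constants live in the runtime configuration.
const MAX_VALIDATOR_SET_SIZE: u32 = 1024;
const MAX_VOTE_ANCESTRIES: u32 = 1024;

const VALIDATOR_SET_SIZE_RANGE_BEGIN: u32 = MAX_VALIDATOR_SET_SIZE / 20;
const VALIDATOR_SET_SIZE_RANGE_END: u32 =
    VALIDATOR_SET_SIZE_RANGE_BEGIN + VALIDATOR_SET_SIZE_RANGE_BEGIN;
const MAX_VOTE_ANCESTRIES_RANGE_BEGIN: u32 = MAX_VOTE_ANCESTRIES / 20;
const MAX_VOTE_ANCESTRIES_RANGE_END: u32 =
    MAX_VOTE_ANCESTRIES_RANGE_BEGIN + MAX_VOTE_ANCESTRIES_RANGE_BEGIN;

fn main() {
    // With 1024 as the maximum, both components are benchmarked over 51..102,
    // i.e. roughly the 5%..10% slice of the full 1..1024 range. Because the
    // fitted weight formula is linear in these components, the slope measured
    // on that slice extrapolates to the rest of the range.
    println!(
        "validator set size benchmarked over {}..{}",
        VALIDATOR_SET_SIZE_RANGE_BEGIN, VALIDATOR_SET_SIZE_RANGE_END
    );
    println!(
        "vote ancestries benchmarked over {}..{}",
        MAX_VOTE_ANCESTRIES_RANGE_BEGIN, MAX_VOTE_ANCESTRIES_RANGE_END
    );
}
```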
Would it make sense to use an interval that's closer to the worst-case scenario? Or would it not make a difference?
While working on this PR I've run benchmarks for the whole range, for a closer-to-worst-case interval, and for this interval. The results were almost the same everywhere, so imo it isn't that important.
Actually, I've made a comment regarding that in the description :)
verify {
	assert_eq!(T::account_balance(&sender), 0.into());
	assert_eq!(
		OutboundMessages::<T, I>::get(MessageKey { lane_id: T::bench_lane_id(), nonce }).unwrap().fee,
Nit: I think we could use `lane_id` directly here: `OutboundMessages::<T, I>::get(MessageKey { lane_id, nonce }).unwrap().fee`. Just saying, but I'm ok with leaving it as it is.
Thanks! :) I've submitted #1514 - will fix a bit later
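As a side note on the nit: it relies on Rust's struct field init shorthand plus an existing `lane_id` binding. A self-contained sketch with simplified stand-in types (the real `MessageKey` is defined in the messages primitives and has different field types):

```rust
// Simplified stand-in for the real `MessageKey`, for illustration only.
#[derive(Debug, PartialEq)]
struct MessageKey {
    lane_id: [u8; 4],
    nonce: u64,
}

fn main() {
    let lane_id = *b"test"; // imagine this value came from `T::bench_lane_id()`
    let nonce = 1u64;
    // Field init shorthand, as suggested in the nit above...
    let short = MessageKey { lane_id, nonce };
    // ...is equivalent to spelling the fields out explicitly.
    let explicit = MessageKey { lane_id: lane_id, nonce: nonce };
    assert_eq!(short, explicit);
}
```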
(115_651_000 as Weight)
	.saturating_add((61_465_000 as Weight).saturating_mul(p as Weight))
	.saturating_add((3_438_000 as Weight).saturating_mul(v as Weight))
(55_070_000 as Weight)
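For reference, a minimal standalone sketch of how these generated weight formulas are evaluated. The coefficients are the old ones quoted above, and the `Weight = u64` alias plus the function name are stand-ins for illustration, not the regenerated file.

```rust
// Stand-in for Substrate's `Weight` type in this illustration.
type Weight = u64;

// Same shape as the generated formula above: a constant base plus one linear
// term per benchmark component, combined with saturating arithmetic so the
// result cannot overflow.
fn old_weight(p: Weight, v: Weight) -> Weight {
    (115_651_000 as Weight)
        .saturating_add((61_465_000 as Weight).saturating_mul(p))
        .saturating_add((3_438_000 as Weight).saturating_mul(v))
}

fn main() {
    // Hypothetical component values for `p` and `v`.
    println!("{}", old_weight(100, 5));
}
```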
The difference between the old value and the new one seems quite big, here and in other places as well. Is this normal/expected?
Yeah. I've also made a comment regarding that in the PR description. In short: normally (for non-test runtimes) we run benchmarks (using the benchmarking bot) on the dedicated reference machine. But here (for testnets) we don't need such precision, so I've been using my own laptop for that, and I've recently upgraded it :) Additionally, the messages pallet configuration has changed significantly, so it is normal that we see a drop there.
Commits:
* decrease parameters range in grandpa benchmarks
* fix messages benchmarks
* update all weights
* dealing with failed test (WiP)
* Revert "dealing with failed test (WiP)" (reverts commit 0379d24)
* proper tests fix
closes #1476
All weights are decreased a bit. There are two main reasons: (1) the messages pallet configuration has changed significantly and (2) my new laptop is a bit faster than my old one, on which the previous results were gathered (yes, we need to run benchmarks on the reference machine, but it is fine to use test weights on our testnets).
Apart from the weights update, this PR also decreases the parameter ranges in the GRANDPA pallet benchmarks and fixes the messages pallet benchmarks (see the commit list above).