-
Notifications
You must be signed in to change notification settings - Fork 33
Calculate and compare CRC when writing and reading ledger snapshots #1319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
41891a0 to
e449818
Compare
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Util/CBOR.hs
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
b001c90 to
a599027
Compare
ouroboros-consensus/changelog.d/20241128_084625_georgy.lukyanov_892_checksum_snaphot_file.md
Outdated
Show resolved
Hide resolved
ouroboros-consensus/changelog.d/20241128_084625_georgy.lukyanov_892_checksum_snaphot_file.md
Outdated
Show resolved
Hide resolved
...s-consensus-cardano/changelog.d/20241127_163624_georgy.lukyanov_892_checksum_snaphot_file.md
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Show resolved
Hide resolved
ouroboros-consensus/changelog.d/20241128_084625_georgy.lukyanov_892_checksum_snaphot_file.md
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
31b892a to
5d39721
Compare
|
I've compared the performance of reading and then writing a snapshot using I would like to spend some more time looking at the microbenchmarks that we currently have and maybe adding one for reading-writing ledger state snapshots. Logs of `db-analyser` with timingsFeature 101.325841sMain 99.685827s |
e9ae82c to
4286fdd
Compare
b524f64 to
c8c082e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it looks good in general, but we should propagate this configuration option to the node and expose it on the JSON/YAML config file.
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Util.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Init.hs
Outdated
Show resolved
Hide resolved
...s-consensus-cardano/changelog.d/20241129_122139_georgy.lukyanov_892_checksum_snaphot_file.md
Outdated
Show resolved
Hide resolved
c8c082e to
584d4d4
Compare
f27ca6d to
15ec8ee
Compare
1ec6cdc to
dcf27db
Compare
ouroboros-consensus-cardano/src/unstable-cardano-tools/Cardano/Tools/DBAnalyser/Analysis.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/LgrDB.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/DiskPolicy.hs
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Init.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/Snapshots.hs
Outdated
Show resolved
Hide resolved
5df819b to
4200b4a
Compare
- Allow skipping snapshot checksum check - Generalise `Test/Ouroboros/Storage/LedgerDB/OnDisk.hs` - Restrict `Ord` instance for `DiskSnapshot` to `dsNumber` - Use the `Ord` instance in `listSnapshots` - Connect snapshot checksum with the node interface: - Add `Flag "DoDiskSnapshotChecksum"` to `DiskPolicyArgs` - Expose `(No)DoDiskSnapshotChecksum` in Ouroboros.Consensus.Node - Re-export `Flag` from `DiskPolicy` - `db-analyser` changes: - use checksums by default as that's what `cardano-node` will do - Allow independently disabling checksum on reading and writing
4200b4a to
2eef543
Compare
Fixes #892
Integration into
cardano-node: IntersectMBO/cardano-node#6047. This uses the branch geo2a/issue-892-checksum-snaphot-file-release-ouroboros-consensus-0.21.0.0-backport which is the backport of this PR onto the most resent release of theouroboros-consensuspackage.In this PR, we change the reading and writing disk snapshots of ledger state. When a snapshot is taken and written to disk, an additional file with the
.checksumextension is written alongside it. The checksum file contains a string that represent the CRC32 checksum of the snapshot.The checksum is calculated incrementally, alongside writing the snapshot to disk. When a snapshot is read from dist, the checksum is again calculated and compared to the tracked one. If the checksum is different,
readSnaphotreturns theReadSnapshotDataCorruptionerror, indicating data corruption.The checksum is calculated incrementally, alongside reading a writing the data. On write, we use the
hPutAllCRCfunction fromfs-sim, and on read we modify the readIncremental function to compute the checksum as data is read.To enable seamless integration into
cardano-node, we make the check optional. When initialising the ledger state from a snapshot ininitLedgerDB, we issue a warning in case the checksum file is missing for a snapshot, but do not fail as in case of invalid snapshots.The
db-analysertool ignores the checksum files by default when reading the snapshots. We add--disk-snapshot-checksumflag to enabled the check. When writing a snapshot to disk, e.g. as a result of the--store-ledgeranalysis,db-analyserwill always writecalculate the checksum and write it into the snapshot's
.checksumfile.Tests
There state machine test in
Test.Ouroboros.Storage.LedgerDB.OnDiskis relevant to this feature, and has caught a number of silly mistakes in the process of its implementation, for example forgetting to delete a checksum file when the snapshot is deleted.The model in the test does not track checksums, and I do not think it can (or should) be augmented to do that. Howerver, the
SnapandRestoreevents are now parameterised by the checksum flag, and the values for the flag are randomised when generating these events. This leads to testing the following properties:Restoreevents will always lead to restoring from a snapshot, even ifSnapevents do not write checksum files (i.e. their flag isNoDoDiskSnapshotChecksum~=False).DoDiskSnapshotChecksumflag changes in the code base and becomes strict, i.e. hard fail if the checksum file is missing, this test will discover that.Effects on Performance:
Running
db-analyserto read a ledger snapshot and store the snapshot of the state at the next slot shows a difference of 2 seconds on my machine. See a comment below for the logs.To precisely evaluate the effects, we need a micro-benchmark of the reading and writing of snapshots with and without the checksum calculation.