Skip to content
This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

@sandreim
Copy link
Contributor

This is a followup of #5594 . This aims to be a "catch" all migration for any column option changes such that no manual fixing for ParityDB is required. Should also work if client is rolled back.

Signed-off-by: Andrei Sandu <[email protected]>
@sandreim sandreim added A3-in_progress Pull request is in progress. No review needed at this stage. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Jul 20, 2022
sandreim added 2 commits July 20, 2022 16:28
Signed-off-by: Andrei Sandu <[email protected]>
Signed-off-by: Andrei Sandu <[email protected]>
@sandreim
Copy link
Contributor Author

Tested this locally, but a Versi burn in is needed to see how much time migration takes and if there are any issues afterwards

@sandreim sandreim requested review from arkpar and eskimor July 20, 2022 16:34
@vstakhov
Copy link
Contributor

I was about to suggest the same 👍 It is wa-ay better to provide a clear migration path than to crash on upgrade with no cleaning up.

@vstakhov
Copy link
Contributor

The only issue was that paritydb has stored unordered columns as hashes only with no values, so @eskimor told that there is no way to migrate with data preservation. Have you tried that such a migration does really work?

@sandreim
Copy link
Contributor Author

The only issue was that paritydb has stored unordered columns as hashes only with no values, so @eskimor told that there is no way to migrate with data preservation. Have you tried that such a migration does really work?

Not sure what happens to the data right now, it shouldn’t be preserved. Its a soft wipeout and hopefully there is no dangling stuff. The good partt is that there is no startup error now and no need to manually ‘fix’ the db.

@arkpar
Copy link
Member

arkpar commented Jul 20, 2022

I don't think migrate works correctly for moving data from hash column to btree.
Unless the previous version set the uniform flag for the colum, there's no way to recover the original key, as the database does not store it.
@cheme Please correct me if I'm wrong here.

Why not just delete the parachains database instead? As far as I know it should be re-populated in a matter of a few blocks.

@cheme
Copy link
Contributor

cheme commented Jul 20, 2022

Can only confirm, only 'uniform' column can be migrated.

@cheme
Copy link
Contributor

cheme commented Jul 20, 2022

Migrate function here should work when changing configuration of hash indexed column (hash indexed to hash indexed), and also with the same parameters (can gain reclaim some space sometime).

When migrating from hash indexed to btree, what may currently happen is that it migrate the index and values tables (the btree_indexed option parameter is currently ignored by the migration). After that, probably opening the db with btree_index act as if the column was deleted (no entry may have been written where the btree root usually is) and things may run fine.

For btree indexed column the migrate function is unimplemented, but should be easy to implement (just run the iterator on the whole db and inject all key values to the target column) and would allow to switch from btree indexed to hash indexed.

The fact that we cannot recover key of the content stored in column may be a concern in some case, idea should be to attach key to value then or implement a variant that keep original key along the value (or in an attached value entry). But that is a different question.

Also depending on context if a single column needs to be removed, deleting the metadata file and all index_X_* and table_X_* with X the column number should work (almost untested).

@vstakhov
Copy link
Contributor

Why not just delete the parachains database instead?

It would be enough to delete dispute coordinator column I suppose. The goal here is to provide more or less smooth migration. Even a shell script that performs rm -fr is better than crashing on an incorrect database.

@sandreim
Copy link
Contributor Author

I was hoping for a more sane solution especially on the user experience. Automated db removal sounds a bit scary to me.

@cheme
Copy link
Contributor

cheme commented Jul 21, 2022

+1 on the fact that automating a column removal is scary, maybe if we know some column are allowed to be removed we can hardcode the info and only run this on such column.
But more generally, I think a warning should be issued before (maybe even a confirmation): synching the data can take time and users should know the node may be a bit non responsive.

sandreim added 2 commits July 21, 2022 16:51
Signed-off-by: Andrei Sandu <[email protected]>
@sandreim
Copy link
Contributor Author

+1 on the fact that automating a column removal is scary, maybe if we know some column are allowed to be removed we can hardcode the info and only run this on such column. But more generally, I think a warning should be issued before (maybe even a confirmation): synching the data can take time and users should know the node may be a bit non responsive.

I've implemented it to work only for 1 column (dispute coordinator data) and will spit out warnings. There shouldn't be any issues with sync and responsiveness AFAIK.

@sandreim sandreim marked this pull request as ready for review July 21, 2022 16:58
@github-actions github-actions bot added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Jul 21, 2022
@sandreim sandreim changed the title Parachains db column migration Parachains db column "migration" Jul 21, 2022
@arkpar
Copy link
Member

arkpar commented Jul 21, 2022

Database implementation details w.r.t. file names may change in the future and this will break. There should be a function for clearing a column provided by the database. I've added one here:
paritytech/parity-db#84

@sandreim
Copy link
Contributor Author

sandreim commented Jul 21, 2022

Thanks, you are right, this relies on impl details. I’d expand on that - move most of this code in paritydb and add an additional column option to delete it when no migration is available. Would this work?

@arkpar
Copy link
Member

arkpar commented Jul 22, 2022

Not sure this is a good idea. Mixing actual configuration options with some flags that specify behaviour for some edge cases. I'd rather have it simple: try to open the database in the user code. If there's a particular error, the user code may handle it by dropping the column. I've added an explicit error type IncompatibleColumnConfig to make it possible to detect it better.

Signed-off-by: Andrei Sandu <[email protected]>
sandreim added 2 commits July 22, 2022 11:56
Signed-off-by: Andrei Sandu <[email protected]>
Signed-off-by: Andrei Sandu <[email protected]>
Copy link

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO we should try to clear a column only if opening the db fails with some specific error. Other than that, lgtm.

@sandreim
Copy link
Contributor Author

IMHO we should try to clear a column only if opening the db fails with some specific error. Other than that, lgtm.

I think this is currently more efficient and gives more control on how the issue is resolved, because we can detect the condition ourselves, and doesn't require to have a loop in which to open and delete on error. If you have a strong opinion about this I would change it.

@ordian
Copy link

ordian commented Jul 25, 2022

IMHO we should try to clear a column only if opening the db fails with some specific error. Other than that, lgtm.

I think this is currently more efficient and gives more control on how the issue is resolved, because we can detect the condition ourselves, and doesn't require to have a loop in which to open and delete on error. If you have a strong opinion about this I would change it.

It doesn't need to be a loop, just opening at most 2 times (regular open, and in case of this specific error, try clearing a column and open again (w/o a retry), otherwise, propagate the error). My main worry is that generally, we don't want to wipe the parachains db (or part of it) on all nodes, esp once paritydb replaces rocksdb and we do this on more than 1/3 of validators. This could lead to security and/or consensus issues.

@sandreim
Copy link
Contributor Author

IMHO we should try to clear a column only if opening the db fails with some specific error. Other than that, lgtm.

I think this is currently more efficient and gives more control on how the issue is resolved, because we can detect the condition ourselves, and doesn't require to have a loop in which to open and delete on error. If you have a strong opinion about this I would change it.

It doesn't need to be a loop, just opening at most 2 times (regular open, and in case of this specific error, try clearing a column and open again (w/o a retry), otherwise, propagate the error). My main worry is that generally, we don't want to wipe the parachains db (or part of it) on all nodes, esp once paritydb replaces rocksdb and we do this on more than 1/3 of validators. This could lead to security and/or consensus issues.

Assuming one column needs clearing, yeah 2 db opens would work, but if we don;t really know the number of cols which need fixing we need to loop and err on each col index and then fix it. If we use the version file bumper, then we would need manually hardcode the migration explicitly like it works for rocks db.

sandreim added 3 commits July 26, 2022 09:57
Signed-off-by: Andrei Sandu <[email protected]>
…itytech/polkadot into sandreim/paritydb_migrate_parachains
Signed-off-by: Andrei Sandu <[email protected]>
@sandreim
Copy link
Contributor Author

I've implemented the versioning for ParityDB which was missing and added the column clear code as a parityDB migration from anything in the past to current version.

@sandreim
Copy link
Contributor Author

@bkchr @ordian PTAL

Copy link

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test this, but looks reasonable.

Signed-off-by: Andrei Sandu <[email protected]>
@sandreim
Copy link
Contributor Author

bot merge

@paritytech-processbot paritytech-processbot bot merged commit e3ea24b into master Jul 27, 2022
@paritytech-processbot paritytech-processbot bot deleted the sandreim/paritydb_migrate_parachains branch July 27, 2022 09:34
ordian pushed a commit that referenced this pull request Jul 28, 2022
* master: (37 commits)
  Backport crate version bumps to 0.9.27 (#5826)
  Fix GHA (#5825)
  [ci] Add timeout to benchmark jobs (#5822)
  Parachains db column "migration" (#5797)
  Companion for #11831 (#5784)
  [ci] Return production image (#5818)
  add migration for staking v10 (#5817)
  Prepare for rust 1.62.1 (#5811)
  Bump strum to 0.24.1 (#5816)
  Bump substrate (#5814)
  Add missing trigger wildcards for some CI workflows (#5812)
  malus: add `finality_delay` cli flag (#5770)
  [ci] publish parachain-implementers-guide (#5806)
  westend xcm: collectives parachain is trusted teleporter (#5798)
  Cleanup light client leftovers (#5794)
  Fix benchmarking tests (#5791)
  allow re-use and avoid compiling kusama parachain code (#5792)
  Introduce async runtime calling trait for runtime-api subsystem (#5782)
  add `Extrinsic Ordering` check that runs against a local reference node (#5790)
  Co #11456: Expose `benchmark extrinsic` command (#5620)
  ...
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants