provider: keystore #1096
I think the storage strategy could be improved. Currently a whole list of multihashes is stored under a prefix key. This means that if a CID is added or removed, the whole buffer containing the list of multihashes must be loaded, modified, and then saved again. Also, if the number of prefix bits changes, all the data must be remapped to new keys.
Consider storing each multihash as an individual record, with its key being the bits in the multihash, or at least all the ones we might ever want to use as a prefix. Then prefix querying can be used to retrieve any set of multihashes for any bit-length prefix. Also, removing an individual value can be done by specifying all the bits that make its key unique.
I have not thought through the best way to represent the key, but I think there is probably a way to store multihashes that allows an arbitrary-length prefix to look up all multihashes that have those prefix bits.
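One possible key representation, sketched here only to illustrate the suggestion: each key is rendered as a string of `0` and `1` characters, so that any bit-length prefix is also a string prefix. The names `bitString` and `sortedKeys` are hypothetical, and the sorted slice merely stands in for a datastore that supports ordered prefix queries; none of this is taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// bitString renders the first nBits bits of b as a string of '0' and '1'.
// Hypothetical helper: one character per bit keeps prefix matching trivial.
func bitString(b []byte, nBits int) string {
	var sb strings.Builder
	for i := 0; i < nBits && i < len(b)*8; i++ {
		if b[i/8]&(byte(0x80)>>uint(i%8)) != 0 {
			sb.WriteByte('1')
		} else {
			sb.WriteByte('0')
		}
	}
	return sb.String()
}

// sortedKeys is an in-memory stand-in for a datastore with prefix queries.
// Each element is one individually stored record key.
type sortedKeys []string

// put inserts k at its sorted position; duplicates are ignored.
func (s *sortedKeys) put(k string) {
	i := sort.SearchStrings(*s, k)
	if i < len(*s) && (*s)[i] == k {
		return
	}
	*s = append(*s, "")
	copy((*s)[i+1:], (*s)[i:])
	(*s)[i] = k
}

// remove deletes exactly one record without touching its neighbours.
func (s *sortedKeys) remove(k string) {
	i := sort.SearchStrings(*s, k)
	if i < len(*s) && (*s)[i] == k {
		*s = append((*s)[:i], (*s)[i+1:]...)
	}
}

// query returns every key that starts with prefix, for any prefix length.
func (s sortedKeys) query(prefix string) []string {
	var out []string
	for i := sort.SearchStrings(s, prefix); i < len(s) && strings.HasPrefix(s[i], prefix); i++ {
		out = append(out, s[i])
	}
	return out
}

func main() {
	var store sortedKeys
	for _, id := range [][]byte{{0xA5, 0x01}, {0xA7, 0x02}, {0x25, 0x03}} {
		store.put(bitString(id, 16))
	}
	fmt.Println(store.query("101"))     // 3-bit prefix
	fmt.Println(store.query("1010010")) // 7-bit prefix, no remapping needed
}
```

With this layout, changing the number of prefix bits only changes the query string, and adding or removing a key touches a single record. The cost is one character per bit in the key, which a real implementation would likely want to compress.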
I also share these concerns. @gammazero If you have an idea on how to improve it, feel free to take over the PR. The multihashes stored in the
In practice, we only perform prefix queries, so when reading, it makes sense to have multihashes grouped by prefix. However, as you mentioned, grouping like in the current implementation is bad for writing.
We may want to store a timestamp to remember when the multihash was last reprovided (as suggested in this comment). We don't need to do it now, but it would be great if it is easy to add later on.
Opened #1112 to track this improvement opportunity. Merging this PR to the
* keystore
* renamed prefixLen to prefixBits
* remove long lived context from KeyStore
Part of #1095
Description
The KeyStore is a datastore-backed store that keeps track of the keys (multihashes) that should be periodically reprovided to the DHT swarm.
The keys are stored by DHT keyspace locality. They are grouped by (constant size) prefix and indexed by this prefix in the datastore. Note that the key by which they are addressed is their Kademlia identifier, and not the hash from the multihash.
Operations
Readers (`Get()`) only make prefix requests, since the caller wants to retrieve all keys that share a keyspace prefix with a specific set of peers in the network.
Writers (`Put()`) are expected to write random keys (not necessarily close in the keyspace).
Keys can be removed using `Delete()`.
Since it is tricky for kubo to determine whether a key should stop being provided, a garbage collection (reset) function is available to remove keys that shouldn't be reprovided anymore. The caller can provide a GC function that returns the list of all keys the KeyStore is expected to hold. If the GC function is set, the KeyStore will periodically purge all its keys and repopulate itself with the keys supplied by the GC/reset function.
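The operations above can be illustrated with a small in-memory sketch. It is not the PR's implementation: the type `memKeyStore`, its method signatures, and the use of "0"/"1" strings as keys are assumptions made for the example, and the reset is triggered by hand here rather than periodically.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// memKeyStore is a hypothetical in-memory stand-in for the KeyStore.
// Keys are bit strings ("0"/"1") grouped by their first prefixBits bits.
type memKeyStore struct {
	mu         sync.Mutex
	prefixBits int
	groups     map[string]map[string]struct{}
}

func newMemKeyStore(prefixBits int) *memKeyStore {
	return &memKeyStore{prefixBits: prefixBits, groups: map[string]map[string]struct{}{}}
}

// putLocked adds one key to its group; the caller must hold s.mu.
func (s *memKeyStore) putLocked(k string) {
	if len(k) < s.prefixBits {
		return // too short to be assigned to a group
	}
	p := k[:s.prefixBits]
	if s.groups[p] == nil {
		s.groups[p] = map[string]struct{}{}
	}
	s.groups[p][k] = struct{}{}
}

// Put stores keys that may land anywhere in the keyspace.
func (s *memKeyStore) Put(keys ...string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, k := range keys {
		s.putLocked(k)
	}
}

// Get returns, in sorted order, all keys that start with prefix.
func (s *memKeyStore) Get(prefix string) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	var out []string
	for p, keys := range s.groups {
		// Skip groups that cannot contain a match.
		if !strings.HasPrefix(p, prefix) && !strings.HasPrefix(prefix, p) {
			continue
		}
		for k := range keys {
			if strings.HasPrefix(k, prefix) {
				out = append(out, k)
			}
		}
	}
	sort.Strings(out)
	return out
}

// Delete removes individual keys and drops groups that become empty.
func (s *memKeyStore) Delete(keys ...string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, k := range keys {
		if len(k) < s.prefixBits {
			continue
		}
		p := k[:s.prefixBits]
		delete(s.groups[p], k)
		if len(s.groups[p]) == 0 {
			delete(s.groups, p)
		}
	}
}

// Reset purges every key and repopulates the store from gc's result.
func (s *memKeyStore) Reset(gc func() []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.groups = map[string]map[string]struct{}{}
	for _, k := range gc() {
		s.putLocked(k)
	}
}

func main() {
	ks := newMemKeyStore(4)
	ks.Put("10100101", "10100111", "00100101")
	fmt.Println(ks.Get("1010"))
	ks.Delete("10100111")
	ks.Reset(func() []string { return []string{"00100101"} })
	fmt.Println(ks.Get("1010"), ks.Get("0"))
}
```

In this sketch a periodic reset would amount to calling `Reset` from a timer loop with the caller's GC function. Holding the lock for the whole reset keeps readers from observing a half-populated store, at the price of blocking them while the GC function runs.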