-
Notifications
You must be signed in to change notification settings - Fork 126
[DATA-4707] Add CLI for Custom Indexes #5384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DATA-4707] Add CLI for Custom Indexes #5384
Conversation
cli/app.go
Outdated
| }, | ||
| { | ||
| Name: "index", | ||
| Usage: "manage indexes for hot data stores and pipeline sinks", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to hear opinions on better usage messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this one let's do manage indexes for hot data and pipeline sink collections
cli/customindex.go
Outdated
| unspecifiedCollectionType = pb.IndexableCollection_INDEXABLE_COLLECTION_UNSPECIFIED | ||
|
|
||
| hotStoreCollectionTypeStr = "hot_store" | ||
| pipelineSinkCollectionTypeStr = "pipeline_sink" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pipeline" might be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we refer to pipelines in customer facing docs/features? Are they familiar with pipeline_sinks? Or just the general concept of pipelines and collections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think pipeline sink makes sense, esp since we reference "pipeline sink" rather than "pipeline" as a data source type users can query from. also nit but i wonder if it should be "hot storage" instead of "hot store"? i know the api says hot store but for tabular data source type, we reference it as hot storage
@RobertXu we've talked about this, but at least for customer facing features these could be aligned? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, pipeline sink and hot storage both sound good to me to keep things consistent for customers!
| printf(c.App.Writer, "- Name: %s\n", index.IndexName) | ||
| printf(c.App.Writer, " Spec: %s\n", index.IndexSpec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we wanna print more info here like what the collection type / pipeline name for each index is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, name and spec should be fine for now. The collection type / pipeline name will be the same for each index, and the customer will have just entered those in the request, so those fields would make the response longer without adding new info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True
cli/customindex.go
Outdated
| } | ||
|
|
||
| // DeleteCustomIndexConfirmation prompts the user for confirmation before deleting a custom index. | ||
| func DeleteCustomIndexConfirmation(c *cli.Context, args deleteCustomIndexArgs) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be okay to skip a confirmation step here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I didn't know we had this pattern.
I think in this case, it's ok to skip the confirmation step. I don't think people will accidentally delete an index since the action requires quite specific params.
There are 2 risks with deleting an index
- worse query performance- we can easily add back the index if this proves to be a problem
- they delete a unique index which allows duplicate data to appear
Case 1 is pretty manageable, Case 2 is not great, however, we don't perform any validation when they create a unique index, so it makes sense to not do the same when deleting (Note: I don't love the lack of validation, but this is what came up during the doc review process).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah sounds good seems not to important to validate this
cli/customindex.go
Outdated
| } | ||
|
|
||
| result := make([][]byte, len(rawMessages)) | ||
| for i, raw := range rawMessages { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we'll need to specifically parse out the key and index properties from the JSON file.
Here's an example index spec JSON from the scope document:
{
"key": {
"resource_name": 1,
"method_name": 1
},
"options": {
"sparse": true
}
}
We have to do this weird conversion b/c people used to writing MongoDB index specs use JSON objects, but MongoDB's index creation API requires BSON.D slices to preserve key ordering for compound indexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah you're right I did this and then looked at the SDK bug and now realizing I did the same thing lol
cli/customindex.go
Outdated
| unspecifiedCollectionType = pb.IndexableCollection_INDEXABLE_COLLECTION_UNSPECIFIED | ||
|
|
||
| hotStoreCollectionTypeStr = "hot_store" | ||
| pipelineSinkCollectionTypeStr = "pipeline_sink" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think pipeline sink makes sense, esp since we reference "pipeline sink" rather than "pipeline" as a data source type users can query from. also nit but i wonder if it should be "hot storage" instead of "hot store"? i know the api says hot store but for tabular data source type, we reference it as hot storage
@RobertXu we've talked about this, but at least for customer facing features these could be aligned? wdyt?
cli/customindex.go
Outdated
| pipelineSinkCollectionType = pb.IndexableCollection_INDEXABLE_COLLECTION_PIPELINE_SINK | ||
| unspecifiedCollectionType = pb.IndexableCollection_INDEXABLE_COLLECTION_UNSPECIFIED | ||
|
|
||
| hotStoreCollectionTypeStr = "hot_store" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q: i dont remember what the cli convention is for strings we accept but it seems like the flags we have are normally hyphenated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that seems right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!! LGTM 🐐
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only skimmed the data side of things but looks good from an SDK perspective!
Description
DATA-4707 Add CLI Support for Custom Index Operations
viam data index create --org-id="11111111-2222-3333-4444-555555555555" --collection-type=<"hot_store" || pipeline_sink"> [OPTIONAL] --pipeline_name="pipeline1" --index-path=index_spec.jsonfor creating custom indexesviam data index delete --org-id="11111111-2222-3333-4444-555555555555" --collection-type=<"hot_store" || pipeline_sink"> [OPTIONAL] --pipeline_name="pipeline1" --index-name="index_name"for creating deleting indexesviam data index list --org-id="11111111-2222-3333-4444-555555555555" --collection-type=<"hot_store" || pipeline_sink"> [OPTIONAL] --pipeline_name="pipeline1"for listing all custom indexesviam data create index <args>for example would require creating adatasubcommandcreatewith a single subcommandindexwhich doesn't structurally make sense.--collection-typeis one of"hot_storeorpipeline_sinkand ifpipeline_sinkthat arg--pipeline_nameis not""Testing
I ran my local version of the CLI to confirm usage behaves how I expect. Examples for each command:
viam dataviam data indexviam data index createviam data index deleteConfirmation step:
viam data index listSee code for flow / errors returned when providing malformed commands.
I also created a small test suite for helper functions for validating args and reading index specs from JSON files.