Closed
create_if_not_exists, README
simorenoh committed Apr 2, 2024
commit 58000fdf17f4a0ec7174e64e8fee2b7b08e0e8cb
65 changes: 65 additions & 0 deletions sdk/cosmos/azure-cosmos/README.md
@@ -629,6 +629,71 @@ as well as containing the list of failed responses for the failed request.

For more information on Transactional Batch, see [Azure Cosmos DB Transactional Batch][cosmos_transactional_batch].

### Private Preview - Vector Embeddings and Vector Indexes
We have added support for vector embeddings and vector indexes so that users can run vector search through
the Cosmos SDK. Both are container-level configurations, and they must be enabled at the account level
before you can use them.

Each vector embedding should specify the path to the relevant vector field in your stored items, a supported data type
(float32, int8, uint8), the vector's dimensions, and the distance function used for that embedding.
A sample vector embedding policy looks like this:
```python
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/vector1",
            "dataType": "float32",
            "dimensions": 1000,
            "distanceFunction": "euclidean"
        },
        {
            "path": "/vector2",
            "dataType": "int8",
            "dimensions": 200,
            "distanceFunction": "dotproduct"
        },
        {
            "path": "/vector3",
            "dataType": "uint8",
            "dimensions": 400,
            "distanceFunction": "cosine"
        }
    ]
}
```
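The fields described above can be sanity-checked locally before a policy is sent to the service. The following is a minimal sketch, not part of the SDK: `check_vector_embedding_policy` is a hypothetical helper, the data types come from the list above, and the set of distance functions is an assumption based only on the values that appear in the sample policy.

```python
from typing import Any, Dict, List

# Data types listed in the README text above.
SUPPORTED_DATA_TYPES = {"float32", "int8", "uint8"}
# Assumption: only the distance functions that appear in the sample policy.
KNOWN_DISTANCE_FUNCTIONS = {"euclidean", "dotproduct", "cosine"}
REQUIRED_KEYS = ("path", "dataType", "dimensions", "distanceFunction")


def check_vector_embedding_policy(policy: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems found in the policy.

    An empty list means no problems were found by these local checks.
    """
    problems: List[str] = []
    for i, embedding in enumerate(policy.get("vectorEmbeddings", [])):
        missing = [key for key in REQUIRED_KEYS if key not in embedding]
        if missing:
            problems.append(f"embedding {i}: missing {', '.join(missing)}")
            continue
        if embedding["dataType"] not in SUPPORTED_DATA_TYPES:
            problems.append(f"embedding {i}: unsupported dataType {embedding['dataType']!r}")
        dims = embedding["dimensions"]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(dims, bool) or not isinstance(dims, int) or dims <= 0:
            problems.append(f"embedding {i}: dimensions must be a positive integer")
        if embedding["distanceFunction"] not in KNOWN_DISTANCE_FUNCTIONS:
            problems.append(f"embedding {i}: unknown distanceFunction {embedding['distanceFunction']!r}")
    return problems


example_policy = {
    "vectorEmbeddings": [
        {"path": "/vector1", "dataType": "float32", "dimensions": 1000, "distanceFunction": "euclidean"},
    ]
}
print(check_vector_embedding_policy(example_policy))  # prints []
```

A check like this only catches shape errors early; the service remains the authority on what it accepts.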

Separately, vector indexes have been added to the existing indexing_policy and require only two fields per index:
the path to the relevant field, and the index type (flat, quantizedFlat, or diskANN).
A sample indexing policy with vector indexes looks like this:
```python
indexing_policy = {
    "automatic": True,
    "indexingMode": "consistent",
    "compositeIndexes": [
        [
            {"path": "/numberField", "order": "ascending"},
            {"path": "/stringField", "order": "descending"}
        ]
    ],
    "spatialIndexes": [
        {"path": "/location/*", "types": [
            "Point",
            "Polygon"]}
    ],
    "vectorIndexes": [
        {"path": "/vector1", "type": "flat"},
        {"path": "/vector2", "type": "quantizedFlat"},
        {"path": "/vector3", "type": "diskANN"}
    ]
}
```
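In the samples above, every vector index path also appears as a vector embedding path. A rough sketch of a local consistency check follows. `check_vector_indexes` is a hypothetical helper, not an SDK function, and the rule that each indexed path should have a matching embedding is an assumption drawn from the samples rather than something this README states.

```python
from typing import Any, Dict, List

# Index types listed in the README text above.
VECTOR_INDEX_TYPES = {"flat", "quantizedFlat", "diskANN"}


def check_vector_indexes(
    indexing_policy: Dict[str, Any],
    vector_embedding_policy: Dict[str, Any],
) -> List[str]:
    """Hypothetical helper: report vector indexes that look inconsistent."""
    embedding_paths = {
        embedding.get("path")
        for embedding in vector_embedding_policy.get("vectorEmbeddings", [])
    }
    problems: List[str] = []
    for index in indexing_policy.get("vectorIndexes", []):
        path = index.get("path")
        if index.get("type") not in VECTOR_INDEX_TYPES:
            problems.append(f"{path}: unknown index type {index.get('type')!r}")
        # Assumption: an indexed path should also be declared as an embedding.
        if path not in embedding_paths:
            problems.append(f"{path}: no matching vector embedding")
    return problems


embeddings = {"vectorEmbeddings": [{"path": "/vector1"}, {"path": "/vector2"}]}
indexes = {
    "vectorIndexes": [
        {"path": "/vector1", "type": "flat"},
        {"path": "/vector2", "type": "quantizedFlat"},
    ]
}
print(check_vector_indexes(indexes, embeddings))  # prints []
```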
Then pass the relevant policies to your container creation method so that the container is created with these configurations:
```python
database.create_container(id=container_id, partition_key=PartitionKey(path="/id"),
                          indexing_policy=indexing_policy, vector_embedding_policy=vector_embedding_policy)
```
***Note: vector embeddings and vector indexes CANNOT be edited by container replace operations. They can only be set when the container is created.***
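
This commit also adds the `vector_embedding_policy` keyword to `create_container_if_not_exists`. A minimal sketch of how that call might be wrapped is shown here. `ensure_vector_container` is a hypothetical helper name; `database` is assumed to be a `DatabaseProxy` and `partition_key` a `PartitionKey`, both typed as `Any` so the sketch does not import the SDK itself.

```python
from typing import Any, Dict


def ensure_vector_container(
    database: Any,
    container_id: str,
    partition_key: Any,
    indexing_policy: Dict[str, Any],
    vector_embedding_policy: Dict[str, Any],
) -> Any:
    """Hypothetical helper: create the container with both policies if it is missing.

    Assumes `database` exposes `create_container_if_not_exists` with the
    `vector_embedding_policy` keyword added in this commit.
    """
    return database.create_container_if_not_exists(
        id=container_id,
        partition_key=partition_key,
        indexing_policy=indexing_policy,
        vector_embedding_policy=vector_embedding_policy,
    )
```

Because these policies cannot be changed by replace operations, a container that already exists would presumably keep whatever policies it was created with, so the arguments matter only on first creation.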

## Troubleshooting

### General
5 changes: 5 additions & 0 deletions sdk/cosmos/azure-cosmos/azure/cosmos/aio/_database.py
@@ -284,6 +284,7 @@ async def create_container_if_not_exists(
        etag: Optional[str] = None,
        match_condition: Optional[MatchConditions] = None,
        analytical_storage_ttl: Optional[int] = None,
        vector_embedding_policy: Optional[Dict[str, Any]] = None,
        **kwargs: Any
    ) -> ContainerProxy:
        """Create a container if it does not exist already.
@@ -316,6 +317,9 @@ async def create_container_if_not_exists(
        :keyword int analytical_storage_ttl: Analytical store time to live (TTL) for items in the container. A value of
            None leaves analytical storage off and a value of -1 turns analytical storage on with no TTL. Please
            note that analytical storage can only be enabled on Synapse Link enabled accounts.
        :keyword Dict[str, Any] vector_embedding_policy: The vector embedding policy for the container. Each vector
            embedding possesses a predetermined number of dimensions, is associated with an underlying data type, and
            is generated for a particular distance function.
        :raises ~azure.cosmos.exceptions.CosmosHttpResponseError: The container creation failed.
        :returns: A `ContainerProxy` instance representing the new container.
        :rtype: ~azure.cosmos.aio.ContainerProxy
@@ -344,6 +348,7 @@ async def create_container_if_not_exists(
                match_condition=match_condition,
                session_token=session_token,
                initial_headers=initial_headers,
                vector_embedding_policy=vector_embedding_policy,
                **kwargs
            )

5 changes: 5 additions & 0 deletions sdk/cosmos/azure-cosmos/azure/cosmos/database.py
@@ -287,6 +287,7 @@ def create_container_if_not_exists( # pylint:disable=docstring-missing-param
        etag: Optional[str] = None,
        match_condition: Optional[MatchConditions] = None,
        analytical_storage_ttl: Optional[int] = None,
        vector_embedding_policy: Optional[Dict[str, Any]] = None,
        **kwargs: Any
    ) -> ContainerProxy:
        """Create a container if it does not exist already.
@@ -315,6 +316,9 @@ def create_container_if_not_exists( # pylint:disable=docstring-missing-param
        :keyword List[Dict[str, str]] computed_properties: **provisional** Sets The computed properties for this
            container in the Azure Cosmos DB Service. For more Information on how to use computed properties visit
            `here: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/query/computed-properties?tabs=dotnet`
        :keyword Dict[str, Any] vector_embedding_policy: The vector embedding policy for the container. Each vector
            embedding possesses a predetermined number of dimensions, is associated with an underlying data type, and
            is generated for a particular distance function.
        :returns: A `ContainerProxy` instance representing the container.
        :raises ~azure.cosmos.exceptions.CosmosHttpResponseError: The container read or creation failed.
        :rtype: ~azure.cosmos.ContainerProxy
@@ -345,6 +349,7 @@ def create_container_if_not_exists( # pylint:disable=docstring-missing-param
                match_condition=match_condition,
                session_token=session_token,
                initial_headers=initial_headers,
                vector_embedding_policy=vector_embedding_policy,
                **kwargs
            )
