Merged
Changes from 1 commit
126 commits
18b5021
auto-gen files
dargilco Mar 26, 2024
b8ea0cd
first pass at writing tests
dargilco Mar 27, 2024
ee78c9c
Re-emit client library with fixed variable name
dargilco Mar 27, 2024
25b5c6b
Fix test
dargilco Mar 27, 2024
5072f00
First test working!
dargilco Mar 28, 2024
62c476a
Re-emit after adding tools. Add async test
dargilco Mar 29, 2024
2bdb446
Add basic samples
dargilco Mar 29, 2024
ad9992c
Ignore spelling errors
dargilco Mar 29, 2024
59bf4f3
fix `tox run -e sphinx` errors
dargilco Mar 29, 2024
1613b37
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco Mar 29, 2024
8d3948c
Update SDK to support embeddings. Add sample. Add root README.md
dargilco Mar 30, 2024
fff460e
Fix typo
dargilco Apr 2, 2024
9d171a7
After re-emit using flat input arguments
dargilco Apr 2, 2024
7efa800
Update README.md code snippets
dargilco Apr 2, 2024
8008505
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco Apr 4, 2024
61b62ac
Re-emit
dargilco Apr 4, 2024
675ba6d
Samples for image generation
dargilco Apr 4, 2024
c5ea2fc
Add dictionary of extra parameters
dargilco Apr 4, 2024
45c7ca1
Re-emit
dargilco Apr 4, 2024
8ee88aa
Example of setting extra parameters
dargilco Apr 4, 2024
bad14c9
Fix README.md title
dargilco Apr 5, 2024
b49acb6
Placeholder patch for streaming chat method
dargilco Apr 5, 2024
708e4c4
Re-emit, to get two new streaming 'Delta' classes
dargilco Apr 5, 2024
da8a678
first go at streaming
dargilco Apr 10, 2024
ed2b227
Latest re-emit, removing 'extra_parameters'
dargilco Apr 10, 2024
0428d95
async streaming support
dargilco Apr 11, 2024
9d98958
A few quality gates fixes
dargilco Apr 11, 2024
37c3599
Use aclose() for async iterator
dargilco Apr 11, 2024
58b3669
Update env-variable names
dargilco Apr 11, 2024
0f79e70
First set of updates following SDK review meeting
dargilco Apr 12, 2024
e70da4b
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco Apr 12, 2024
8e274b6
New client names. Other minor model name changes
dargilco Apr 12, 2024
159d82f
Minor fixes to root README.md
dargilco Apr 12, 2024
5326bba
Update tests
dargilco Apr 15, 2024
27b555f
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco Apr 15, 2024
1e42647
Minor test updates
dargilco Apr 16, 2024
65b2bf1
Add assets.json
dargilco Apr 16, 2024
32368f2
Fix test name
dargilco Apr 16, 2024
18d6baa
First round of Pylint fixes
dargilco Apr 16, 2024
1243048
Fix pyright errors
dargilco Apr 16, 2024
3c84d22
Fix all pyright errors
dargilco Apr 16, 2024
c951405
Fix more quality gates
dargilco Apr 17, 2024
e18d79c
Add streaming tests
dargilco Apr 17, 2024
51f30d3
Fix streaming to work with small HTTP buffers (tested down to 64 bytes)
dargilco Apr 18, 2024
2578c8d
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco Apr 18, 2024
3090bae
Update ci.yml
dargilco Apr 18, 2024
84ad0b5
Add samples for chat history, JSON input, IO[bytes] input
dargilco Apr 19, 2024
6b87862
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 6, 2024
6aaa1c4
Draft sample for chat completion with tools
dargilco May 8, 2024
11a1c78
Grab latest TypeSpec changes
dargilco May 8, 2024
6a6cbc5
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 8, 2024
5864275
Re-emit SDK to pick up TypeSpec tools changes. Fix result.id check to…
dargilco May 8, 2024
9e18146
New test for tool, new recordings
dargilco May 8, 2024
2c97356
use logger for detailed SSE streaming debug spew
dargilco May 9, 2024
37d1e35
Don't build azure-ai-generative and azure-ai-resources packages, as t…
dargilco May 9, 2024
e79c4b5
Split streaming response class to two, one for sync, one for async
dargilco May 9, 2024
06e11f7
Update test timeout. Rename /templates/stages/platform-matrix-ai.json…
dargilco May 9, 2024
610b4eb
Mark azure-ai-generative and azure-ai-resources as in-active
dargilco May 9, 2024
9def3bb
Fix mypy and pylint errors
dargilco May 10, 2024
fb7612b
Sample for getting model info
dargilco May 10, 2024
cd30a48
Remove image generation
dargilco May 10, 2024
a6ae407
Re-emit from TypeSpec without Image Generation
dargilco May 10, 2024
78331f7
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 10, 2024
943faa9
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 13, 2024
a1ebd6b
Update auth
dargilco May 13, 2024
f0b773c
Add get_model_info tests
dargilco May 14, 2024
e4f3089
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 14, 2024
deb3d16
Add sample for ClientGenerator
dargilco May 14, 2024
c6b3bc2
Remove /v1
dargilco May 14, 2024
042c06a
Use new test recording assets without /v1
dargilco May 14, 2024
56450c9
Pick up TypeSpec with /v1 removed from route
dargilco May 14, 2024
a36fbee
Support Image Embeddings and version 2024-05
dargilco May 14, 2024
82168b5
Update test recordings
dargilco May 14, 2024
e788873
Fix some quality gates
dargilco May 15, 2024
2c18488
Fix broken link ('link verification check')
dargilco May 15, 2024
2ac893f
Use 'response' instead of 'result' in samples, as this is what I see …
dargilco May 15, 2024
b56ca82
Implement load_client and load_async_client
dargilco May 16, 2024
aafcc37
Make three BaseStreamingChatCompletions variables (constants) private
dargilco May 16, 2024
b5484ee
sync and async versions of load_client
dargilco May 17, 2024
0a82fe2
Remove wait loop in async samples. Simplify tool sample. Other minor …
dargilco May 17, 2024
3dbca9e
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 17, 2024
bf93525
Re-emit with new operator names
dargilco May 17, 2024
ca2f4ff
Minor change to sample
dargilco May 20, 2024
9db8bce
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 20, 2024
673f27b
Add support for hyper_parameters
dargilco May 21, 2024
0ed9d7b
Update root README.md
dargilco May 21, 2024
7c5424f
Save work - unknown_params header, hyper_params input, cached model_info
dargilco May 22, 2024
1faf123
Some test changes
dargilco May 22, 2024
71bd71d
Many changes
dargilco May 23, 2024
712dfcc
New test recordings
dargilco May 23, 2024
d6012b7
Minor samples and README.md changes
dargilco May 23, 2024
3f52d6f
Update names of streaming response classes
dargilco May 23, 2024
62c5d8a
Update Entra ID sample, document Entra ID in README, use ttps://ml.az…
dargilco May 23, 2024
8a3a167
use model_extras instead of hyper_params. Update client __str__ to no…
dargilco May 24, 2024
092f91b
Fix all pylint errors. Minor updates to root README.md
dargilco May 28, 2024
8e610c2
Example of JSON messages in the root README.md
dargilco May 28, 2024
6d1e7ce
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 28, 2024
a738ff5
Some MyPy fixes. Also fix wrong package name in root README.md
dargilco May 28, 2024
44e94d0
Save work before re-emitting SDK
dargilco May 29, 2024
27ac7b0
Merge remote-tracking branch 'origin/main' into dargilco/azure-ai-inf…
dargilco May 30, 2024
a825db4
Use `embed` instead of `embedding`. Update Entra ID and AOAI samples
dargilco May 30, 2024
c4814d8
Fix some mypy errors. Use different terms for MaaS/MaaP
dargilco May 31, 2024
2df88dc
Re-emit, now with pyproject.toml
dargilco May 31, 2024
89d92d3
Fix/suppress mypy & pyright errors
dargilco May 31, 2024
e20d003
Fix missing Entra ID auth in load_client
dargilco May 31, 2024
d7ef3b7
Use vanity link for samples folder
dargilco Jun 3, 2024
63fefdf
Revert "Use vanity link for samples folder"
dargilco Jun 3, 2024
ab6973f
Fix mypy and pyright errors
dargilco Jun 3, 2024
efbcf5c
Update root README.md. Update operator ref doc comments
dargilco Jun 3, 2024
6f69018
Re-emit
dargilco Jun 3, 2024
aed30bf
Fix mypy error in sample
dargilco Jun 3, 2024
c327403
Fix missing ranges in ref-doc comments
dargilco Jun 3, 2024
27f2ed6
Remove unneeded cast
dargilco Jun 3, 2024
2c92e9d
Fix pylint error
dargilco Jun 3, 2024
98cd4fd
Fix typos & method names. Thanks Jarno!
dargilco Jun 4, 2024
a44df4e
Address Johan's code review comments. Thanks Johan!
dargilco Jun 5, 2024
c0958e3
Fix mypy errors
dargilco Jun 5, 2024
2cbb1cf
Minor update to root README.md
dargilco Jun 5, 2024
73c836e
Remove capacity_type
dargilco Jun 6, 2024
0883400
Fix public patched methods not showing up in intellisense, when using…
dargilco Jun 6, 2024
d0930a8
Import 'Self' from Typing package starting from Python 3.11
dargilco Jun 6, 2024
13e8ed6
Fix pylint error, line too long
dargilco Jun 6, 2024
223238c
More AOAI samples. Update package README with regards to AOAI support
dargilco Jun 6, 2024
0d03ad2
Add overloads with `stream: Literal[..]` to fix mypy and pyright erro…
dargilco Jun 6, 2024
8647964
Override all client __init__ methods so you can define and initialize…
dargilco Jun 6, 2024
3715766
Cleanup: delete now unused platform-matrix-ai.json.old
dargilco Jun 7, 2024
Update env-variable names
dargilco committed Apr 11, 2024
commit 58b3669d9db6fb1412cd3db6de33f5684f271eb4
112 changes: 43 additions & 69 deletions sdk/ai/azure-ai-inference/README.md
@@ -1,15 +1,16 @@
# Azure model client library for Python

-The Azure AI Model Client Library allows you to do inference against any of the AI models you deployed to Azure. It supports both "model as a service" and "models with hosted managed infrastructure". For more information see [Overview: Deploy models, flows, and web apps with Azure AI Studio](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview).
+The ModelClient library allows you to do inference using AI models you deployed to Azure. It supports both serverless endpoints (aka "model as a service" (MaaS) or "pay as you go") and self-hosted endpoints (aka "model as a platform" (MaaP) or "real-time endpoints"). The ModelClient library makes service calls using REST API version `2024-04-01-preview` specified here (TODO: insert link). For more information see [Overview: Deploy models, flows, and web apps with Azure AI Studio](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview).

-Use the model client library to:
+Use the ModelClient library to:

* Authenticate against the service
* Get information about the model
* Get chat completions
* Get embeddings
* Generate an image from a text prompt

-Note that for inference of OpenAI models hosted on Azure you should be using the [OpenAI Python client library](https://github.com/openai/openai-python) instead of this client.
+Note that for inference using OpenAI models hosted on Azure you should be using the [OpenAI Python client library](https://github.com/openai/openai-python) instead of this client.

[Product documentation](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview)
| [Samples](https://aka.ms/azsdk/model-client/samples/python)
@@ -23,93 +24,71 @@ Note that for inference of OpenAI models hosted on azure you should be using the

* [Python 3.8](https://www.python.org/) or later installed, including [pip](https://pip.pypa.io/en/stable/).
* An [Azure subscription](https://azure.microsoft.com/free).
-* A [TBD resource](https://azure.microsoft.com/) in your Azure subscription. You will need the key and endpoint from this resource to authenticate against the service.
+* An [AI Model from the catalog](https://ai.azure.com/explore/models) deployed through Azure AI Studio. To construct the `ModelClient`, you will need to pass in the endpoint URL and key associated with your deployed AI model.
+
+* The endpoint URL has the form `https://your-deployment-name.your-azure-region.inference.ai.azure.com`, where `your-deployment-name` is your unique model deployment name and `your-azure-region` is the Azure region where the model is deployed (e.g. `eastus2`).
+
+* The key is a 32-character string.

### Install the Model Client package

```bash
pip install azure-ai-inference
```

-### Set environment variables
-
-To authenticate the `ModelClient`, you will need the endpoint and key from your TBD resource in the [Azure Portal](https://portal.azure.com). The code snippet below assumes these values are stored in environment variables:
-
-* Set the environment variable `MODEL_ENDPOINT` to the endpoint URL. It has the form `https://your-model-deployment-name.your-azure-region.inference.ai.azure.com`, where `your-model-deployment-name` is your unique TBD resource name.
-
-* Set the environment variable `MODEL_KEY` to the key. The key is a 32-character string.
-
-Note that the client library does not directly read these environment variables at run time. The endpoint and key must be provided to the constructor of `ModelClient` in your code. The code snippet below reads environment variables to promote the practice of not hard-coding secrets in your source code.

### Create and authenticate the client

-Once you define the environment variables, this Python code will create and authenticate a synchronous `ModelClient`:
+Assuming `endpoint` and `key` are strings holding your endpoint URL and key, this Python code will create and authenticate a synchronous `ModelClient`:

<!-- SNIPPET:sample_chat_completions.create_client -->

```python
-import os
from azure.ai.inference import ModelClient
from azure.ai.inference.models import ChatRequestSystemMessage, ChatRequestUserMessage
from azure.core.credentials import AzureKeyCredential

-# [START logging]
-import sys
-import logging
-
-# Acquire the logger for this client library. Use 'azure' to affect both
-# 'azure.core' and 'azure.ai.vision.imageanalysis' libraries.
-logger = logging.getLogger("azure")
-
-# Set the desired logging level. logging.INFO or logging.DEBUG are good options.
-logger.setLevel(logging.DEBUG)
-
-# Direct logging output to stdout (the default):
-handler = logging.StreamHandler(stream=sys.stdout)
-# Or direct logging output to a file:
-# handler = logging.FileHandler(filename = 'sample.log')
-logger.addHandler(handler)
-
-# Optional: change the default logging format. Here we add a timestamp.
-formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
-handler.setFormatter(formatter)
+# Create Model Client for synchronous operations
+client = ModelClient(
+    endpoint=endpoint,
+    credential=AzureKeyCredential(key)
+)
```

<!-- END SNIPPET -->

-A synchronous client supports synchronous inference methods, meaning they will block until the service responds with inference results. The code snippets below all use synchronous methods because it's easier for a getting-started guide. The SDK offers equivalent asynchronous APIs which are often preferred. To create an asynchronous client, do the following:
+A synchronous client supports synchronous inference methods, meaning they will block until the service responds with inference results. For simplicity the code snippets below all use synchronous methods. The client offers equivalent asynchronous methods which are more commonly used in production.

-* Update the above code to import `ModelClient` from the `aio` namespace:
+To create an asynchronous client, install the additional package [aiohttp](https://pypi.org/project/aiohttp/):

-```python
-from azure.ai.inference.aio import ModelClient
-```
+```bash
+pip install aiohttp
+```

-* Install the additional package [aiohttp](https://pypi.org/project/aiohttp/):
+and update the code above to import `ModelClient` from the `aio` namespace:

-```bash
-pip install aiohttp
-```
+```python
+import asyncio
+from azure.ai.inference.aio import ModelClient
+```
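
For reference, a minimal end-to-end sketch of the asynchronous pattern. It assumes the async `ModelClient` exposes `get_chat_completions` as a coroutine mirroring the synchronous client, supports `async with`, and that the `CHAT_COMPLETIONS_ENDPOINT` and `CHAT_COMPLETIONS_KEY` environment variables used by the samples are set:

```python
import asyncio
import os

from azure.ai.inference.aio import ModelClient
from azure.ai.inference.models import ChatRequestSystemMessage, ChatRequestUserMessage
from azure.core.credentials import AzureKeyCredential


async def main():
    # Endpoint URL and key are read from environment variables, as in the samples
    endpoint = os.environ["CHAT_COMPLETIONS_ENDPOINT"]
    key = os.environ["CHAT_COMPLETIONS_KEY"]

    # `async with` is assumed to work here, as it does for other
    # azure-sdk-for-python async clients; it closes the client on exit
    async with ModelClient(endpoint=endpoint, credential=AzureKeyCredential(key)) as client:
        result = await client.get_chat_completions(
            messages=[
                ChatRequestSystemMessage(content="You are an AI assistant that helps people find information."),
                ChatRequestUserMessage(content="How many feet are in a mile?"),
            ]
        )
        print(result.choices[0].message.content)


asyncio.run(main())
```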

## Key concepts

### Chat Completions

-TBD
+TODO: Add overview and link to explain chat completions.

-Target the `/v1/chat/completions` route
+Chat completion operations target the URL route `/v1/chat/completions` on the provided endpoint.

### Embeddings

-TBD
+TODO: Add overview and link to explain embeddings.

-Target the `/v1/embeddings` route
+Embeddings operations target the URL route `/v1/embeddings` on the provided endpoint.

### Image Generation

-TBD
+TODO: Add overview and link to explain image generation.

-Target the `/images/generations` route
+Image generation operations target the URL route `/images/generations` on the provided endpoint.
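
To make the routes concrete, here is a rough sketch of the underlying HTTP call these operations map to, written with `requests`. The `api-version` query parameter and bearer-style authorization header are assumptions for illustration (based on the REST API version `2024-04-01-preview` mentioned above), not a documented contract:

```python
import os

import requests

endpoint = os.environ["CHAT_COMPLETIONS_ENDPOINT"]
key = os.environ["CHAT_COMPLETIONS_KEY"]

# Chat completion requests are POSTed to the /v1/chat/completions route
response = requests.post(
    f"{endpoint}/v1/chat/completions",
    params={"api-version": "2024-04-01-preview"},  # assumed query parameter
    headers={"Authorization": f"Bearer {key}"},  # assumed auth scheme
    json={
        "messages": [
            {"role": "system", "content": "You are an AI assistant that helps people find information."},
            {"role": "user", "content": "How many feet are in a mile?"},
        ]
    },
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```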

## Examples

@@ -125,7 +104,7 @@ See the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai

### Chat completions example

-This example demonstrates how to generate chat completions.
+This example demonstrates how to generate a single chat completion.

<!-- SNIPPET:sample_chat_completions.chat_completions -->

@@ -142,11 +121,10 @@ result = client.get_chat_completions(

# Print results to the console
print("Chat Completions:")
-for index, choice in enumerate(result.choices):
-    print(f"choices[{index}].message.content: {choice.message.content}")
-    print(f"choices[{index}].message.role: {choice.message.role}")
-    print(f"choices[{index}].finish_reason: {choice.finish_reason}")
-    print(f"choices[{index}].index: {choice.index}")
+print(f"choices[0].message.content: {result.choices[0].message.content}")
+print(f"choices[0].message.role: {result.choices[0].message.role}")
+print(f"choices[0].finish_reason: {result.choices[0].finish_reason}")
+print(f"choices[0].index: {result.choices[0].index}")
print(f"id: {result.id}")
print(f"created: {result.created}")
print(f"model: {result.model}")
@@ -159,7 +137,7 @@ print(f"usage.total_tokens: {result.usage.total_tokens}")

<!-- END SNIPPET -->

-To generate completions for additional messages, simply call `get_chat_completions` multiple times using the same `ModelClient`.
+To generate completions for additional messages, simply call `get_chat_completions` multiple times using the same `client`.
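
As an illustration of that multi-turn pattern, here is a sketch that feeds the chat history back into the next call, continuing from the snippet above. `ChatRequestAssistantMessage` is an assumed class name, chosen by analogy with the request message classes imported earlier:

```python
from azure.ai.inference.models import ChatRequestAssistantMessage  # assumed class name

messages = [
    ChatRequestSystemMessage(content="You are an AI assistant that helps people find information."),
    ChatRequestUserMessage(content="How many feet are in a mile?"),
]
result = client.get_chat_completions(messages=messages)

# Append the assistant's reply and the next user turn, then call again
# with the same `client` to continue the conversation
messages.append(ChatRequestAssistantMessage(content=result.choices[0].message.content))
messages.append(ChatRequestUserMessage(content="And how many yards?"))
result = client.get_chat_completions(messages=messages)
print(result.choices[0].message.content)
```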

### Embeddings example

@@ -169,21 +147,17 @@ This example demonstrates how to get embeddings.

```python
# Do a single embeddings operation. This will be a synchronous (blocking) call.
-result = client.get_embeddings(input=["first sentence", "second sentence", "third sentence"])
+result = client.get_embeddings(input=["first phrase", "second phrase", "third phrase"])

# Print results to the console
print("Embeddings result:")
-for index, item in enumerate(result.data):
-    len = item.embedding.__len__()
-    print(f"data[{index}].index: {item.index}")
-    print(f"data[{index}].embedding[0]: {item.embedding[0]}")
-    print(f"data[{index}].embedding[1]: {item.embedding[1]}")
-    print("...")
-    print(f"data[{index}].embedding[{len-2}]: {item.embedding[len-2]}")
-    print(f"data[{index}].embedding[{len-1}]: {item.embedding[len-1]}")
+for item in result.data:
+    length = len(item.embedding)
+    print(f"data[{item.index}]: length={length}, [{item.embedding[0]}, {item.embedding[1]}, ..., {item.embedding[length-2]}, {item.embedding[length-1]}]")
print(f"id: {result.id}")
print(f"model: {result.model}")
print(f"object: {result.object}")
-print(f"usage.input_tokens: {result.usage.input_tokens}")
+print(f"usage.prompt_tokens: {result.usage.prompt_tokens}")
print(f"usage.total_tokens: {result.usage.total_tokens}")
```
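
Each returned embedding is a plain list of floats, so standard vector math applies directly. As a sketch, continuing with the same `client` as above, two phrases can be compared by cosine similarity:

```python
import math

result = client.get_embeddings(input=["first phrase", "second phrase"])
a = result.data[0].embedding
b = result.data[1].embedding

# Cosine similarity: dot(a, b) / (||a|| * ||b||)
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))
print(f"cosine similarity: {dot / (norm_a * norm_b):.4f}")
```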
@@ -289,7 +263,7 @@ Non-redacted logs are generated for log level `logging.DEBUG` only. Be sure to

## Next steps

-* Have a look at the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder, containing fully runnable Python code for Image Analysis (all visual features, synchronous and asynchronous clients, from image file or URL).
+* Have a look at the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder, containing fully runnable Python code for doing inference using synchronous and asynchronous clients.

## Contributing

34 changes: 25 additions & 9 deletions sdk/ai/azure-ai-inference/samples/README.md
@@ -51,19 +51,34 @@ See [Prerequisites](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk

## Set environment variables

-See [Set environment variables](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/README.md#set-environment-variables) here.
+To construct the `ModelClient`, you will need to pass in the endpoint URL and key associated with your deployed AI model.
+
+* The endpoint URL has the form `https://your-deployment-name.your-azure-region.inference.ai.azure.com`, where `your-deployment-name` is your unique model deployment name and `your-azure-region` is the Azure region where the model is deployed (e.g. `eastus2`).
+
+* The key is a 32-character string.
+
+For convenience, and to promote the practice of not hard-coding secrets in your source code, all samples here assume the endpoint URL and key are stored in environment variables. You will need to set these environment variables before running the samples as-is. These are the environment variables used:
+
+| Sample type | Endpoint environment variable name | Key environment variable name |
+|----------|----------|----------|
+| Chat completions | `CHAT_COMPLETIONS_ENDPOINT` | `CHAT_COMPLETIONS_KEY` |
+| Embeddings | `EMBEDDINGS_ENDPOINT` | `EMBEDDINGS_KEY` |
+| Image generation | `IMAGE_GENERATION_ENDPOINT` | `IMAGE_GENERATION_KEY` |
+
+Note that the client library does not directly read these environment variables at run time. The sample code reads the environment variables and constructs the `ModelClient` with the values read.


## Running the samples

To run the first sample, type:
```bash
-python sample_chat_completion_async.py
+python sample_chat_completions.py
```
similarly for the other samples.

## Example console output

-The sample `sample_chat_completion_async.py` sends the following system and user messages in a single call:
+The sample `sample_chat_completions.py` sends the following system and user messages in a single call:

- System: "You are an AI assistant that helps people find information."
- User: "How many feet are in a mile?"
@@ -72,17 +87,18 @@ And prints out the service response. It should look similar to the following:

```text
Chat Completions:
-choices[0].message.content: There are 5,280 feet in a mile.
+choices[0].message.content: Hello! I'd be happy to help you find the answer to your question. There are 5,280 feet in a mile.
choices[0].message.role: assistant
choices[0].finish_reason: stop
choices[0].index: 0
-id: 93f5bea2-11ec-4b31-af73-cb663196ebd5
-created: 1970-01-14 01:11:54+00:00
-model: Llama-2-70b-chat
+id: 77f08d7e-8127-431d-bed5-a814b78ddd80
+created: 1970-01-08 23:28:48+00:00
+model: Llama-2-13b-chat
object: chat.completion
usage.capacity_type: None
usage.prompt_tokens: 41
-usage.completion_tokens: 15
-usage.total_tokens: 56
+usage.completion_tokens: 32
+usage.total_tokens: 73
```

## Troubleshooting
@@ -10,10 +10,11 @@
    python sample_chat_completion_async.py

    Set these two environment variables before running the sample:
-    1) MODEL_ENDPOINT - Your endpoint URL, in the form https://<deployment-name>.<azure-region>.inference.ai.azure.com
-       where `deployment-name` is your unique AI Model deployment name, and
-       `azure-region` is the Azure region where your model is deployed.
-    2) MODEL_KEY - Your model key (a 32-character string). Keep it secret.
+    1) CHAT_COMPLETIONS_ENDPOINT - Your endpoint URL, in the form
+       https://<your-deployment-name>.<your-azure-region>.inference.ai.azure.com
+       where `your-deployment-name` is your unique AI Model deployment name, and
+       `your-azure-region` is the Azure region where your model is deployed.
+    2) CHAT_COMPLETIONS_KEY - Your model key (a 32-character string). Keep it secret.
"""
import asyncio

@@ -25,10 +26,10 @@ async def sample_chat_completions_async():

    # Read the values of your model endpoint and key from environment variables
    try:
-        endpoint = os.environ["MODEL_ENDPOINT"]
-        key = os.environ["MODEL_KEY"]
+        endpoint = os.environ["CHAT_COMPLETIONS_ENDPOINT"]
+        key = os.environ["CHAT_COMPLETIONS_KEY"]
    except KeyError:
-        print("Missing environment variable 'MODEL_ENDPOINT' or 'MODEL_KEY'")
+        print("Missing environment variable 'CHAT_COMPLETIONS_ENDPOINT' or 'CHAT_COMPLETIONS_KEY'")
        print("Set them before running this sample.")
        exit()

@@ -56,20 +57,19 @@

    # Print results to the console
    print("Chat Completions:")
-    for index, choice in enumerate(result.choices):
-        print(f"choices[{index}].message.content: {choice.message.content}")
-        print(f"choices[{index}].message.role: {choice.message.role}")
-        print(f"choices[{index}].finish_reason: {choice.finish_reason}")
-        print(f"choices[{index}].index: {choice.index}")
+    print(f"choices[0].message.content: {result.choices[0].message.content}")
+    print(f"choices[0].message.role: {result.choices[0].message.role}")
+    print(f"choices[0].finish_reason: {result.choices[0].finish_reason}")
+    print(f"choices[0].index: {result.choices[0].index}")
    print(f"id: {result.id}")
    print(f"created: {result.created}")
    print(f"model: {result.model}")
    print(f"object: {result.object}")
    print(f"usage.capacity_type: {result.usage.capacity_type}")
    print(f"usage.prompt_tokens: {result.usage.prompt_tokens}")
    print(f"usage.completion_tokens: {result.usage.completion_tokens}")
    print(f"usage.total_tokens: {result.usage.total_tokens}")


async def main():
    await sample_chat_completions_async()

@@ -10,10 +10,11 @@
    python sample_embeddings_async.py

    Set these two environment variables before running the sample:
-    1) MODEL_ENDPOINT - Your endpoint URL, in the form https://<deployment-name>.<azure-region>.inference.ai.azure.com
-       where `deployment-name` is your unique AI Model deployment name, and
-       `azure-region` is the Azure region where your model is deployed.
-    2) MODEL_KEY - Your model key (a 32-character string). Keep it secret.
+    1) EMBEDDINGS_ENDPOINT - Your endpoint URL, in the form
+       https://<your-deployment-name>.<your-azure-region>.inference.ai.azure.com
+       where `your-deployment-name` is your unique AI Model deployment name, and
+       `your-azure-region` is the Azure region where your model is deployed.
+    2) EMBEDDINGS_KEY - Your model key (a 32-character string). Keep it secret.
"""
import asyncio

@@ -24,18 +25,18 @@ async def sample_embeddings_async():

    # Read the values of your model endpoint and key from environment variables
    try:
-        endpoint = os.environ["MODEL_ENDPOINT"]
-        key = os.environ["MODEL_KEY"]
+        endpoint = os.environ["EMBEDDINGS_ENDPOINT"]
+        key = os.environ["EMBEDDINGS_KEY"]
    except KeyError:
-        print("Missing environment variable 'MODEL_ENDPOINT' or 'MODEL_KEY'")
+        print("Missing environment variable 'EMBEDDINGS_ENDPOINT' or 'EMBEDDINGS_KEY'")
        print("Set them before running this sample.")
        exit()

    # Create a Model Client for asynchronous operations
    client = ModelClient(endpoint=endpoint, credential=AzureKeyCredential(key))

    # Do a single embeddings operation. Start the operation and get a Future object.
-    future = asyncio.ensure_future(client.get_embeddings(input=["first sentence", "second sentence", "third sentence"]))
+    future = asyncio.ensure_future(client.get_embeddings(input=["first phrase", "second phrase", "third phrase"]))

    # Loop until the operation is done
    while not future.done():
@@ -48,17 +49,13 @@

    # Print results to the console
    print("Embeddings result:")
-    for index, item in enumerate(result.data):
-        len = item.embedding.__len__()
-        print(f"data[{index}].index: {item.index}")
-        print(f"data[{index}].embedding[0]: {item.embedding[0]}")
-        print(f"data[{index}].embedding[1]: {item.embedding[1]}")
-        print("...")
-        print(f"data[{index}].embedding[{len-2}]: {item.embedding[len-2]}")
-        print(f"data[{index}].embedding[{len-1}]: {item.embedding[len-1]}")
+    for item in result.data:
+        length = len(item.embedding)
+        print(f"data[{item.index}]: length={length}, [{item.embedding[0]}, {item.embedding[1]}, ..., {item.embedding[length-2]}, {item.embedding[length-1]}]")
    print(f"id: {result.id}")
    print(f"model: {result.model}")
    print(f"object: {result.object}")
-    print(f"usage.input_tokens: {result.usage.input_tokens}")
+    print(f"usage.prompt_tokens: {result.usage.prompt_tokens}")
    print(f"usage.total_tokens: {result.usage.total_tokens}")
