Merged
Changes from 1 commit
Commits
77 commits
d0fcdff
Update changelog (#35929)
xiangyan99 Jun 5, 2024
8c581a2
Autoinstrumentation rework (#35890)
jeremydvoss Jun 5, 2024
72919a9
switch to majority entra auth for tests (#35581)
kristapratico Jun 6, 2024
428ccf2
add a new parameter allow_roleassignment_on_rg to allow/disallow role…
wenjie1070116 Jun 6, 2024
61138a7
Increment package version after release of azure-core (#35950)
azure-sdk Jun 6, 2024
e6f98bc
[Event Hubs] Update URI used for consumer auth to include consumer gr…
swathipil Jun 6, 2024
697a9bf
Allow configuration of metric Views in distro (#35932)
lzchen Jun 6, 2024
bf4ee7f
[EventHub] Update README for enable logging section (#35955)
swathipil Jun 6, 2024
215bb40
[Storage] [STG 94] Merge STG 94 into `main` branch (#35888)
weirongw23-msft Jun 6, 2024
6e6648a
Sync eng/common directory with azure-sdk-tools for PR 8377 (#35915)
azure-sdk Jun 6, 2024
a73ca09
Distro release 1.6.0 (#35935)
jeremydvoss Jun 6, 2024
ae48eea
Python client for Model-as-a-Service (MaaS) / Model-as-a-Platform (Ma…
dargilco Jun 7, 2024
610da5d
[AutoRelease] t2-datafactory-2024-06-03-75602(can only be merged by S…
azure-sdk Jun 7, 2024
5b30781
Use DOTNET_ROLL_FORWARD: 'Major' for test-proxy (#35956)
azure-sdk Jun 7, 2024
acd606f
Pin pester version to 5.5.0 (#35967)
azure-sdk Jun 7, 2024
6bb9e47
Some minor updates to package & samples README.md files (#35971)
dargilco Jun 7, 2024
0b99ee1
add aoai assistants streaming/v2 tests (#35443)
kristapratico Jun 7, 2024
9b98575
[Identity] Allow use of client assertion in OBO cred (#35812)
pvaneck Jun 7, 2024
984542f
[Identity] Disable live service principal tests (#35958)
pvaneck Jun 7, 2024
a7cb46a
set storage account access to identity-based for feature store creati…
runhli Jun 7, 2024
9b6427c
Change Workspace related PR reviewer (#35921)
debuggerXi Jun 10, 2024
8abbc26
[EG] GA Namespaces (#35831)
l0lawrence Jun 10, 2024
f24b567
Increment package version after release of azure-monitor-opentelemetr…
azure-sdk Jun 10, 2024
c9b1e27
update test for new structure of custom blocklist (#36001)
kristapratico Jun 10, 2024
ec8190c
Update github-event-processor to 1.0.0-dev.20240610.2 (#36000)
azure-sdk Jun 10, 2024
81de947
[Identity] Minor doc updates (#35974)
pvaneck Jun 10, 2024
2ac0060
allow for futher embedded snippets (#36004)
l0lawrence Jun 10, 2024
40cf085
[Monitor Query + Ingestions] Update changelogs (#35942)
pvaneck Jun 10, 2024
cb065ac
[Identity] Managed identity bug fix (#36010)
pvaneck Jun 10, 2024
fe0e014
Added release dates (#36006)
vincenttran-msft Jun 10, 2024
5b55203
Remove MayankKumar91 (#35911)
lmazuel Jun 10, 2024
adbac73
Increment package version after release of azure-identity (#36015)
azure-sdk Jun 10, 2024
10c3c79
Always run analyze weekly (#35968)
kristapratico Jun 11, 2024
433b99a
move samples (#35966)
l0lawrence Jun 11, 2024
4356326
[Key Vault] Change location for weekly China cloud tests (#36018)
mccoyp Jun 11, 2024
5fd14fe
Fix Sphinx on azure-storage-blob-changefeed (#35975)
Jun 11, 2024
dff6744
update release date (#36028)
l0lawrence Jun 11, 2024
b052da8
azure-mgmt-core shouldn't use mgmt docs build (#35936)
kristapratico Jun 11, 2024
d97ff44
Identity credential unavailable error non json imds (#36016)
xiangyan99 Jun 11, 2024
e08b3b0
Update azure-ai-inference client library to support sending images as…
dargilco Jun 11, 2024
cf49b4e
Export InputTypes from constants (#35848)
emepetres Jun 11, 2024
a79c5ab
[EG] Eventgrid Release (#36030)
l0lawrence Jun 11, 2024
47fdf5d
Fix Sphinx on azure-storage-blob (#36014)
Jun 11, 2024
1552259
[Identity] Update AzurePipelinesCredential (#35858)
pvaneck Jun 11, 2024
01fa69c
upgrade autorest.python to `6.13.19` (#36024)
msyyc Jun 12, 2024
215eb63
Increment version for monitor releases (#36036)
azure-sdk Jun 12, 2024
bfd541b
compatible with new date format (#36049)
msyyc Jun 12, 2024
5b61bd4
[AutoRelease] t2-cdn-2024-06-12-45722(can only be merged by SDK owner…
azure-sdk Jun 12, 2024
614a928
[EG] link + patch update (#36045)
l0lawrence Jun 12, 2024
80ecdfb
async with (#36060)
l0lawrence Jun 12, 2024
2aba54e
Incremental (#36040)
vincenttran-msft Jun 12, 2024
379cfd3
typo (#36062)
l0lawrence Jun 12, 2024
c5e1659
Fix prepare-pipelines line wrapping (#36061)
azure-sdk Jun 12, 2024
fda24bd
[bct] Initial refactoring breaking changes tool (#36005)
catalinaperalta Jun 12, 2024
a642e74
Update swagger_to_sdk_config_dpg.json (#36068)
msyyc Jun 13, 2024
3ce8196
Increment package version after release of azure-eventgrid (#36063)
azure-sdk Jun 13, 2024
66d5de4
Sync eng/common directory with azure-sdk-tools for PR 8388 (#35970)
azure-sdk Jun 13, 2024
15bcb99
report number of breaking changes (#36067)
catalinaperalta Jun 13, 2024
0f27374
update codeowner (#36074)
xiangyan99 Jun 13, 2024
d7bfdb0
update strict-sphinx to v7 (#36075)
kristapratico Jun 13, 2024
df9c8c7
Update spelling dependencies (#36084)
azure-sdk Jun 14, 2024
3e7dff6
[DevCenter] Update release date (#36083)
drielenr Jun 14, 2024
147746b
[Identity] Add TSG section for AzurePipelinesCredential (#36048)
pvaneck Jun 14, 2024
c19f701
Support sending image data as part of a user message, using a new Ima…
dargilco Jun 14, 2024
ee65563
update (#36051)
msyyc Jun 14, 2024
892881a
Increment package version after release of azure-ai-inference (#36091)
azure-sdk Jun 14, 2024
c6383aa
address API review comments (#36058)
Adarsh-Ramanathan Jun 14, 2024
811dc0e
Update CodeownersLinter version to 1.0.0-dev.20240614.4 (#36093)
azure-sdk Jun 14, 2024
fe435b7
[AutoRelease] t2-mobilenetwork-2024-06-05-65505(can only be merged by…
azure-sdk Jun 17, 2024
a566320
[AutoRelease] t2-storagemover-2024-06-11-87054(can only be merged by …
azure-sdk Jun 17, 2024
698cd95
code and test (#35959)
azure-sdk Jun 17, 2024
40a2625
[AutoRelease] t2-web-2024-06-07-57417(can only be merged by SDK owner…
azure-sdk Jun 17, 2024
3c833e1
Update breaking_changes_allowlist.py (#36104)
msyyc Jun 17, 2024
323fdc7
appconfig mi test (#35842)
xiangyan99 Jun 17, 2024
c51ac91
Bugfix: None was being appended to output path for batch-endpoint inv…
nagkumar91 Jun 17, 2024
cd1725e
Sync eng/common directory with azure-sdk-tools for PR 8457 (#36113)
azure-sdk Jun 17, 2024
d791bc6
Merge branch 'main' into 1.17.0-core-main-merge
MilesHolland Jun 17, 2024
Some minor updates to package & samples README.md files (#35971)
dargilco authored Jun 7, 2024
commit 6bb9e477e87d72540aa75b788ffad717f7e285f9
36 changes: 22 additions & 14 deletions sdk/ai/azure-ai-inference/README.md
@@ -1,6 +1,6 @@
# Azure AI Inference client library for Python

The client Library (in preview) does inference, including chat completions, for AI models deployed by [Azure AI Studio](https://ai.azure.com) and [Azure Machine Learning Studio](https://ml.azure.com/). It supports Serverless API endpoints and Managed Compute Endpoints (formerly known as Managed Online Endpoints). The client library makes services calls using REST API version `2024-05-01-preview`, as documented in [Azure AI Model Inference API](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-api). For more information see [Overview: Deploy models, flows, and web apps with Azure AI Studio](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview).
The client library (in preview) performs inference, including chat completions, for AI models deployed by [Azure AI Studio](https://ai.azure.com) and [Azure Machine Learning Studio](https://ml.azure.com/). It supports Serverless API endpoints and Managed Compute endpoints (formerly known as Managed Online Endpoints). The client library makes service calls using REST API version `2024-05-01-preview`, as documented in [Azure AI Model Inference API](https://learn.microsoft.com/azure/ai-studio/reference/reference-model-inference-api). For more information, see [Overview: Deploy models, flows, and web apps with Azure AI Studio](https://learn.microsoft.com/azure/ai-studio/concepts/deployments-overview).

Use the model inference client library to:

@@ -181,7 +181,7 @@ In the following sections you will find simple examples of:
* [Text Embeddings](#text-embeddings-example)
<!-- * [Image Embeddings](#image-embeddings-example) -->

The examples create a synchronous client as mentioned in [Create and authenticate clients](#create-and-authenticate-clients). Only mandatory input settings are shown for simplicity.
The examples create a synchronous client as mentioned in [Create and authenticate a client directly, using key](#create-and-authenticate-a-client-directly-using-key). Only mandatory input settings are shown for simplicity.

See the [Samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder for full working samples for synchronous and asynchronous clients.

@@ -388,7 +388,7 @@ To generate embeddings for additional phrases, simply call `client.embed` multip

### Exceptions

The `complete`, `embed` and `get_model_info` methods on the clients raise an [HttpResponseError](https://learn.microsoft.com/python/api/azure-core/azure.core.exceptions.httpresponseerror) exception for a non-success HTTP status code response from the service. The exception's `status_code` will be the HTTP response status code. The exception's `error.message` contains a detailed message that will allow you to diagnose the issue:
The `complete`, `embed` and `get_model_info` methods on the clients raise an [HttpResponseError](https://learn.microsoft.com/python/api/azure-core/azure.core.exceptions.httpresponseerror) exception for a non-success HTTP status code response from the service. The exception's `status_code` will hold the HTTP response status code (with `reason` showing the friendly name). The exception's `error.message` contains a detailed message that may be helpful in diagnosing the issue:

```python
from azure.core.exceptions import HttpResponseError
@@ -399,6 +399,7 @@ try:
result = client.complete( ... )
except HttpResponseError as e:
print(f"Status code: {e.status_code} ({e.reason})")
print(f"{e.message}")
```

For example, when you provide a wrong authentication key:
@@ -408,7 +409,7 @@ Status code: 401 (Unauthorized)
Operation returned an invalid status 'Unauthorized'
```

Or for example when you created an `EmbeddingsClient` and called `embed` on the client, but the endpoint does not
Or when you create an `EmbeddingsClient` and call `embed` on the client, but the endpoint does not
support the `/embeddings` route:

```text
@@ -442,18 +443,25 @@ formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
handler.setFormatter(formatter)
```

By default logs redact the values of URL query strings, the values of some HTTP request and response headers (including `Authorization` which holds the key or token), and the request and response payloads. To create logs without redaction, set the method argument `logging_enable = True` when you construct the client library, or when you call any of the client's `create` methods.
By default, logs redact the values of URL query strings, the values of some HTTP request and response headers (including `Authorization`, which holds the key or token), and the request and response payloads. To create logs without redaction, do these two things:

```python
# Create a chat completions client with non redacted logs
client = ChatCompletionsClient(
endpoint=endpoint,
credential=AzureKeyCredential(key),
logging_enable=True
)
```
1. Set the method argument `logging_enable = True` when you construct the client library, or when you call the client's `complete` or `embed` methods.
```python
client = ChatCompletionsClient(
endpoint=endpoint,
credential=AzureKeyCredential(key),
logging_enable=True
)
```
1. Set the log level to `logging.DEBUG`. Logs are redacted at any other log level (see the sketch below).
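For illustration, here is a minimal sketch of step 2, assuming the `azure` logger namespace that the Azure SDK libraries log under (handler and formatter setup as shown earlier in this section):

```python
import logging
import sys

# Non-redacted logs are emitted only when the Azure SDK logger is set to DEBUG.
logger = logging.getLogger("azure")
logger.setLevel(logging.DEBUG)

# Send log output to stdout (any handler works; see the handler/formatter setup above).
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)
```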

Be sure to protect non-redacted logs to avoid compromising security.

For more information, see [Configure logging in the Azure libraries for Python](https://aka.ms/azsdk/python/logging)

### Reporting issues

None redacted logs are generated for log level `logging.DEBUG` only. Be sure to protect non redacted logs to avoid compromising security. For more information see [Configure logging in the Azure libraries for Python](https://aka.ms/azsdk/python/logging)
To report issues with the client library, or request additional features, please open a GitHub issue [here](https://github.com/Azure/azure-sdk-for-python/issues)

## Next steps

26 changes: 16 additions & 10 deletions sdk/ai/azure-ai-inference/samples/README.md
@@ -10,7 +10,11 @@ urlFragment: model-inference-samples

# Samples for Azure AI Inference client library for Python

These are runnable console Python scripts that show how to do chat completion and text embeddings using the clients in this package. Samples in this folder use the a synchronous clients. Samples in the subfolder `async_samples` use the asynchronous clients. The concepts are similar, you can easily modify any of the synchronous samples to asynchronous.
These are runnable console Python scripts that show how to do chat completion and text embeddings against Serverless API endpoints and Managed Compute endpoints.

Samples with `azure_openai` in their name show how to do chat completions and text embeddings against Azure OpenAI endpoints.

Samples in this folder use the synchronous clients. Samples in the subfolder `async_samples` use the asynchronous clients. The concepts are similar; you can easily modify any of the synchronous samples to be asynchronous.
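As a rough sketch of that conversion (the environment variable names `CHAT_COMPLETIONS_ENDPOINT` and `CHAT_COMPLETIONS_KEY` are assumptions here; the tables later in this README list the names each sample actually uses):

```python
import asyncio
import os

from azure.ai.inference.aio import ChatCompletionsClient  # async client lives in the .aio subpackage
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential


async def main():
    # Endpoint and key come from environment variables, as in the synchronous samples.
    endpoint = os.environ["CHAT_COMPLETIONS_ENDPOINT"]  # assumed variable name
    key = os.environ["CHAT_COMPLETIONS_KEY"]  # assumed variable name

    # The asynchronous client supports `async with`, and `complete` is awaited.
    # Requires the `aiohttp` package for the default async transport.
    async with ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key)) as client:
        result = await client.complete(messages=[UserMessage(content="How many feet are in a mile?")])
        print(result.choices[0].message.content)


asyncio.run(main())
```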

## Prerequisites

@@ -37,15 +41,17 @@ See [Prerequisites](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/…

To construct any of the clients, you will need to pass in the endpoint URL. If you are using key authentication, you also need to pass in the key associated with your deployed AI model.

* The endpoint URL has the form `https://your-deployment-name.your-azure-region.inference.ai.azure.com`, where `your-deployment-name` is your unique model deployment name and `your-azure-region` is the Azure region where the model is deployed (e.g. `eastus2`).
* For Serverless API and Managed Compute endpoints, the endpoint URL has the form `https://your-unique-resource-name.your-azure-region.inference.ai.azure.com`, where `your-unique-resource-name` is your globally unique Azure resource name and `your-azure-region` is the Azure region where the model is deployed (e.g. `eastus2`).

* For Azure OpenAI endpoints, the endpoint URL has the form `https://your-unique-resource-name.openai.azure.com/openai/deployments/your-deployment-name`, where `your-unique-resource-name` is your globally unique Azure OpenAI resource name, and `your-deployment-name` is your AI model deployment name.

* The key is a 32-character string.

For convenience, and to promote the practice of not hard-coding secrets in your source code, all samples here assume the endpoint URL and key are stored in environment variables. You will need to set these environment variables before running the samples as-is. The environment variables are mentioned in the tables below.

Note that the client library does not directly read these environment variables at run time. The sample code reads the environment variables and constructs the relevant client with these values.
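For illustration, a minimal sketch of that pattern for the synchronous chat completions client (the variable names `CHAT_COMPLETIONS_ENDPOINT` and `CHAT_COMPLETIONS_KEY` are assumptions; the tables below list the names each sample actually uses):

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# The sample code (not the client library) reads the environment variables...
endpoint = os.environ["CHAT_COMPLETIONS_ENDPOINT"]  # assumed variable name
key = os.environ["CHAT_COMPLETIONS_KEY"]  # assumed variable name

# ...and passes the values to the client constructor.
client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
```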

## Serverless API and Managed Compute Endpoints
## Serverless API and Managed Compute endpoints

| Sample type | Endpoint environment variable name | Key environment variable name |
|----------|----------|----------|
@@ -57,7 +63,7 @@ Note that the client library does not directly read these environment variable a…

To run against a Managed Compute endpoint, some samples also have an optional environment variable `CHAT_COMPLETIONS_DEPLOYMENT_NAME`. Its value is used to set the HTTP request header `azureml-model-deployment` when constructing the client.
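As a sketch (again with assumed environment variable names), the deployment name might be passed as an extra header when constructing the client:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

headers = {}
deployment_name = os.environ.get("CHAT_COMPLETIONS_DEPLOYMENT_NAME")  # optional
if deployment_name:
    # Routes requests to a specific deployment on the Managed Compute endpoint.
    headers["azureml-model-deployment"] = deployment_name

client = ChatCompletionsClient(
    endpoint=os.environ["CHAT_COMPLETIONS_ENDPOINT"],  # assumed variable name
    credential=AzureKeyCredential(os.environ["CHAT_COMPLETIONS_KEY"]),  # assumed variable name
    headers=headers,  # extra headers sent with every request
)
```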

## Azure OpenAI Endpoints
## Azure OpenAI endpoints

| Sample type | Endpoint environment variable name | Key environment variable name |
|----------|----------|----------|
@@ -84,11 +90,11 @@ similarly for the other samples.
|**File Name**|**Description**|
|----------------|-------------|
|[sample_chat_completions_streaming.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming.py) | One chat completion operation using a synchronous client and streaming response. |
|[sample_chat_completions_streaming_with_entra_id_auth.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_entra_id_auth.py) | One chat completion operation using a synchronous client and streaming response, using Entra ID authentication. This sample also shows setting the `azureml-model-deployment` HTTP request header, which may be required for Selfhosted Endpoints. |
|[sample_chat_completions_streaming_with_entra_id_auth.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_entra_id_auth.py) | One chat completion operation using a synchronous client and streaming response, using Entra ID authentication. This sample also shows setting the `azureml-model-deployment` HTTP request header, which may be required for some Managed Compute endpoints. |
|[sample_chat_completions.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions.py) | One chat completion operation using a synchronous client. |
|[sample_chat_completions_with_history.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_history.py) | Two chat completion operations using a synchronous client, which the second completion using chat history from the first. |
|[sample_chat_completions_with_history.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_history.py) | Two chat completion operations using a synchronous client, with the second completion using chat history from the first. |
|[sample_chat_completions_from_input_bytes.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_bytes.py) | One chat completion operation using a synchronous client, with input messages provided as `IO[bytes]`. |
|[sample_chat_completions_from_input_json.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py) | One chat completion operation using a synchronous client, with input messages provided as `MutableMapping[str, Any]` |
|[sample_chat_completions_from_input_json.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py) | One chat completion operation using a synchronous client, with input messages provided as a dictionary (type `MutableMapping[str, Any]`) |
|[sample_chat_completions_with_tools.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py) | Shows how to use a tool (function) in chat completions, for an AI model that supports tools |
|[sample_load_client.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_load_client.py) | Shows how to use the function `load_client` to create the appropriate synchronous client based on the provided endpoint URL. In this example, it creates a synchronous `ChatCompletionsClient`. |
|[sample_get_model_info.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_get_model_info.py) | Get AI model information using the chat completions client. The same can be done with all other clients. |
@@ -118,9 +124,9 @@ similarly for the other samples.
|----------------|-------------|
|[sample_chat_completions_streaming_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_streaming_async.py) | One chat completion operation using an asynchronous client and streaming response. |
|[sample_chat_completions_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_async.py) | One chat completion operation using an asynchronous client. |
|[sample_load_client_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_load_client_async.py) | Shows how do use the function `load_async_client` to create the appropriate asynchronous client based on the provided endpoint URL. In this example, it creates an asynchronous `ChatCompletionsClient`. |
|[sample_chat_completions_from_input_bytes_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_bytes_async.py) | One chat completion operation using a synchronous client, with input messages provided as `IO[bytes]`. |
|[sample_chat_completions_from_input_json_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py) | One chat completion operation using a synchronous client, with input messages provided as `MutableMapping[str, Any]` |
|[sample_load_client_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_load_client_async.py) | Shows how to use the function `load_client` to create the appropriate asynchronous client based on the provided endpoint URL. In this example, it creates an asynchronous `ChatCompletionsClient`. |
|[sample_chat_completions_from_input_bytes_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_bytes_async.py) | One chat completion operation using an asynchronous client, with input messages provided as `IO[bytes]`. |
|[sample_chat_completions_from_input_json_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py) | One chat completion operation using an asynchronous client, with input messages provided as a dictionary (type `MutableMapping[str, Any]`) |
|[sample_chat_completions_streaming_azure_openai_async.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_streaming_azure_openai_async.py) | One chat completion operation using an asynchronous client and streaming response against an Azure OpenAI endpoint |

### Text embeddings