6 changes: 6 additions & 0 deletions sdk/ai/azure-ai-projects/CHANGELOG.md
@@ -23,6 +23,12 @@ See [Agents package document and samples](https://github.com/Azure/azure-sdk-for
* The method `.inference.get_azure_openai_client()` now supports returning an authenticated `AzureOpenAI` client to be used with
AI models deployed to the Project's AI Services. This is in addition to the existing option to get an `AzureOpenAI` client for one of the connected Azure OpenAI services.
* Import `PromptTemplate` from `azure.ai.projects` instead of `azure.ai.projects.prompts`.
* The class `ConnectionProperties` was renamed to `Connection`, and its properties have changed.
* The method `.to_evaluator_model_config` on `ConnectionProperties` is no longer required and has no equivalent method on `Connection`. When constructing the `EvaluatorConfiguration` class, the `init_params` element now requires `deployment_name` instead of `model_config`.
* The method `upload_file` on `AIProjectClient` has been removed; use `datasets.upload_file` instead.
* Evaluator IDs are available via the enum `EvaluatorIds` and no longer require the `azure-ai-evaluation` package to be installed.
* Property `scope` on `AIProjectClient` was removed; use the AI Foundry Project endpoint instead.
* Property `id` on `Evaluation` was replaced with `name`. A minimal sketch of the updated calls is shown after this list.
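
A minimal sketch of the updated calls, assuming an `AIProjectClient` named `project_client` has already been created from the project endpoint; the dataset name, file path, and deployment name below are illustrative placeholders:

```python
from azure.ai.projects.models import Evaluation, InputDataset, EvaluatorConfiguration, EvaluatorIds

# Upload data with the dataset operations (replaces the removed `AIProjectClient.upload_file`).
dataset = project_client.datasets.upload_file(
    name="my_dataset",         # illustrative name
    version="1",
    file_path="./data.jsonl",  # illustrative path
)

# Evaluator IDs now come from the `EvaluatorIds` enum; `init_params` takes `deployment_name` instead of `model_config`.
evaluation = Evaluation(
    display_name="Migration sketch",
    description="Sketch of the updated evaluation calls",
    data=InputDataset(id=dataset.id),
    evaluators={
        "relevance": EvaluatorConfiguration(
            id=EvaluatorIds.RELEVANCE.value,
            init_params={"deployment_name": "gpt-4o-mini"},  # illustrative deployment name
        ),
    },
)

evaluation = project_client.evaluations.create(evaluation)
evaluation = project_client.evaluations.get(evaluation.name)  # `name` replaces the old `id`
```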

### Sample updates

114 changes: 23 additions & 91 deletions sdk/ai/azure-ai-projects/README.md
@@ -351,118 +351,50 @@ project_client.indexes.delete(name=index_name, version=index_version)
<!-- END SNIPPET -->

### Evaluation
Evaluation in the Azure AI Project client library provides quantitative, AI-assisted quality and safety metrics to assess the performance of LLM models, generative AI applications, and agents in the cloud. Metrics are defined as evaluators, and built-in or custom evaluators can provide comprehensive insights into an application's capabilities and limitations.

The code below shows some evaluation operations. The full list of samples can be found under the "evaluation" folder in the [package samples][samples].

#### Evaluator

Evaluators are custom or prebuilt classes or functions that are designed to measure the quality of the outputs from language models or generative AI applications.

Evaluators are made available via the [azure-ai-evaluation][azure_ai_evaluation] SDK for a local experience, and via the [Evaluator Library][evaluator_library] in Azure AI Foundry for use in the cloud.

More details on built-in and custom evaluators can be found [here][evaluators].

#### Run Evaluation in the cloud

To run an evaluation in the cloud, the following are needed:

* Evaluators
* Data to be evaluated
* [Optional] Azure OpenAI model

##### Evaluators

To run an evaluator in the cloud, its evaluator `ID` is needed. To get it via code, use the [azure-ai-evaluation][azure_ai_evaluation] package:

```python
# pip install azure-ai-evaluation

from azure.ai.evaluation import RelevanceEvaluator

evaluator_id = RelevanceEvaluator.id
```

##### Data to be evaluated

Evaluation in the cloud supports data in the form of a `jsonl` file. Data can be uploaded via the helper method `upload_file` on the project client:

```python
# Upload data for evaluation and get dataset id
data_id, _ = project_client.upload_file("<data_file.jsonl>")
```

##### [Optional] Azure OpenAI Model

An Azure AI Foundry project comes with a default Azure OpenAI endpoint, which can be accessed using the following code. This gives you the endpoint details for your Azure OpenAI endpoint. Note that some evaluators need a model that supports chat completion.

```python
default_connection = project_client.connections.get_default(connection_type=ConnectionType.AZURE_OPEN_AI)
```

##### Example Remote Evaluation

<!-- SNIPPET:sample_evaluations.evaluations_sample -->

```python
import os
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from azure.ai.projects.models import Evaluation, InputDataset, EvaluatorConfiguration, EvaluatorIds

project_endpoint = os.environ["PROJECT_ENDPOINT"]

# Create the project client
project_client = AIProjectClient(
    credential=DefaultAzureCredential(),
    endpoint=project_endpoint,
)

# Upload data for evaluation and get the dataset id
# Sample Dataset Id : azureai://accounts/<account_name>/projects/<project_name>/data/<dataset_name>/versions/<version>
dataset = project_client.datasets.upload_file(
    name="<dataset_name>",
    version="1",
    file_path="<data_file.jsonl>",
)
dataset_id = dataset.id

print(f"Create an evaluation using dataset with id {dataset_id} and project {project_endpoint}")

# Create an evaluation
evaluation = Evaluation(
    display_name="Sample Evaluation Test",
    description="Sample evaluation for testing",
    data=InputDataset(id=dataset_id),
    evaluators={
        "violence": EvaluatorConfiguration(
            id=EvaluatorIds.VIOLENCE.value,
            init_params={
                "azure_ai_project": project_endpoint,
            },
        ),
        "bleu_score": EvaluatorConfiguration(
            id=EvaluatorIds.BLEU_SCORE.value,
        ),
    },
)

evaluation_response = project_client.evaluations.create(evaluation)
print(evaluation_response)

# Get the evaluation by name
print(f"Get an evaluation with name {evaluation_response.name}")
get_evaluation_response = project_client.evaluations.get(evaluation_response.name)
print(get_evaluation_response)

# List all evaluations in the project
print("List all evaluations")
for evaluation in project_client.evaluations.list():
    print(evaluation)
```

NOTE: For running evaluators locally, refer to [Evaluate with the Azure AI Evaluation SDK][evaluators].
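
As a rough orientation only, the sketch below shows what a local run with the separate [azure-ai-evaluation][azure_ai_evaluation] package can look like; it is an illustration under assumptions, not this README's own sample, and the endpoint, key, and deployment values are placeholders:

```python
# pip install azure-ai-evaluation
from azure.ai.evaluation import evaluate, RelevanceEvaluator

# Placeholder configuration for an Azure OpenAI chat-completion deployment.
model_config = {
    "azure_endpoint": "<your-azure-openai-endpoint>",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-deployment-name>",
}

# Run the evaluator locally over the same style of jsonl data used for cloud evaluation.
results = evaluate(
    data="<data_file.jsonl>",
    evaluators={"relevance": RelevanceEvaluator(model_config=model_config)},
)
print(results)
```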

## Troubleshooting

### Exceptions
39 changes: 23 additions & 16 deletions sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations.py
@@ -19,7 +19,10 @@
Set these environment variables with your own values:
1) PROJECT_ENDPOINT - Required. The Azure AI Project endpoint, as found in the overview page of your
Azure AI Foundry project.
2) CONNECTION_NAME - Required. The name of the Azure Storage Account connection to use for uploading files.
3) DATASET_NAME - Optional. The name of the Dataset to create and use in this sample.
4) DATASET_VERSION - Optional. The version of the Dataset to create and use in this sample.
5) DATA_FOLDER - Optional. The folder path where the data files for upload are located.
"""

import os
@@ -32,38 +35,42 @@
InputDataset,
EvaluatorConfiguration,
EvaluatorIds,
DatasetVersion,
)

endpoint = os.environ["PROJECT_ENDPOINT"]  # Sample : https://<account_name>.services.ai.azure.com/api/projects/<project_name>
model_endpoint = os.environ["MODEL_ENDPOINT"] # Sample : https://<account_name>.services.ai.azure.com
model_api_key = os.environ["MODEL_API_KEY"]
model_deployment_name = os.environ["MODEL_DEPLOYMENT_NAME"] # Sample : gpt-4o-mini
dataset_name = os.environ.get("DATASET_NAME", "dataset-test")
dataset_version = os.environ.get("DATASET_VERSION", "1.0")

# Construct the paths to the data folder and data file used in this sample
script_dir = os.path.dirname(os.path.abspath(__file__))
data_folder = os.environ.get("DATA_FOLDER", os.path.join(script_dir, "data_folder"))
data_file = os.path.join(data_folder, "sample_data_evaluation.jsonl")

with DefaultAzureCredential(exclude_interactive_browser_credential=False) as credential:

with AIProjectClient(endpoint=endpoint, credential=credential) as project_client:

# [START evaluations_sample]
print("Upload a single file and create a new Dataset to reference the file.")
dataset: DatasetVersion = project_client.datasets.upload_file(
name=dataset_name,
version=dataset_version,
file_path=data_file,
)
print(dataset)

print("Create an evaluation")
evaluation: Evaluation = Evaluation(
display_name="Sample Evaluation Test",
description="Sample evaluation for testing",
# Sample Dataset Id : azureai://accounts/<account_name>/projects/<project_name>/data/<dataset_name>/versions/<version>
data=InputDataset(id=dataset.id),  # pyright: ignore
evaluators={
"relevance": EvaluatorConfiguration(
id=EvaluatorIds.RELEVANCE.value,
@@ -19,7 +19,10 @@
Set these environment variables with your own values:
1) PROJECT_ENDPOINT - Required. The Azure AI Project endpoint, as found in the overview page of your
Azure AI Foundry project.
2) CONNECTION_NAME - Required. The name of the Azure Storage Account connection to use for uploading files.
3) DATASET_NAME - Optional. The name of the Dataset to create and use in this sample.
4) DATASET_VERSION - Optional. The version of the Dataset to create and use in this sample.
5) DATA_FOLDER - Optional. The folder path where the data files for upload are located.
"""
import asyncio
import os
@@ -31,40 +34,43 @@
InputDataset,
EvaluatorConfiguration,
EvaluatorIds,
DatasetVersion,
)

# Construct the paths to the data folder and data file used in this sample
script_dir = os.path.dirname(os.path.abspath(__file__))
data_folder = os.environ.get("DATA_FOLDER", os.path.join(script_dir, "data_folder"))
data_file = os.path.join(data_folder, "sample_data_evaluation.jsonl")

async def main() -> None:
endpoint = os.environ["PROJECT_ENDPOINT"]  # Sample : https://<account_name>.services.ai.azure.com/api/projects/<project_name>
model_endpoint = os.environ["MODEL_ENDPOINT"] # Sample : https://<account_name>.services.ai.azure.com
model_api_key = os.environ["MODEL_API_KEY"]
model_deployment_name = os.environ["MODEL_DEPLOYMENT_NAME"] # Sample : gpt-4o-mini
dataset_name = os.environ.get("DATASET_NAME", "dataset-test")
dataset_version = os.environ.get("DATASET_VERSION", "1.0")

async with DefaultAzureCredential() as credential:

async with AIProjectClient(endpoint=endpoint, credential=credential) as project_client:

# [START evaluations_sample]
print("Upload a single file and create a new Dataset to reference the file.")
dataset: DatasetVersion = await project_client.datasets.upload_file(
name=dataset_name,
version=dataset_version,
file_path=data_file,
)
print(dataset)

print("Create an evaluation")
evaluation: Evaluation = Evaluation(
display_name="Sample Evaluation Async",
description="Sample evaluation for testing",
# Sample Dataset Id : azureai://accounts/<account_name>/projects/<project_name>/data/<dataset_name>/versions/<version>
data=InputDataset(id=dataset.id),  # pyright: ignore
evaluators={
"relevance": EvaluatorConfiguration(
id=EvaluatorIds.RELEVANCE.value,