diff --git a/sdk/ai/azure-ai-projects/CHANGELOG.md b/sdk/ai/azure-ai-projects/CHANGELOG.md
index 9ccbe0150138..804fd7baa0b1 100644
--- a/sdk/ai/azure-ai-projects/CHANGELOG.md
+++ b/sdk/ai/azure-ai-projects/CHANGELOG.md
@@ -23,6 +23,12 @@ See [Agents package document and samples](https://github.com/Azure/azure-sdk-for
 * The method `.inference.get_azure_openai_client()` now supports returning an authenticated `AzureOpenAI` client to be used with AI models deployed to the Project's AI Services. This is in addition to the existing option to get an `AzureOpenAI` client for one of the connected Azure OpenAI services.
 * Import `PromptTemplate` from `azure.ai.projects` instead of `azure.ai.projects.prompts`.
+* The class `ConnectionProperties` was renamed to `Connection`, and its properties have changed.
+* The method `.to_evaluator_model_config` on `ConnectionProperties` is no longer required and has no equivalent on `Connection`. When constructing the `EvaluatorConfiguration` class, the `init_params` element now requires `deployment_name` instead of `model_config`.
+* The method `upload_file` on `AIProjectClient` has been removed; use `datasets.upload_file` instead.
+* Evaluator IDs are available via the `EvaluatorIds` enum and no longer require the `azure-ai-evaluation` package to be installed.
+* The property `scope` on `AIProjectClient` has been removed; use the AI Foundry project endpoint instead.
+* The property `id` on `Evaluation` is replaced with `name`.

 ### Sample updates

diff --git a/sdk/ai/azure-ai-projects/README.md b/sdk/ai/azure-ai-projects/README.md
index c36ae82c0696..70c549343e15 100644
--- a/sdk/ai/azure-ai-projects/README.md
+++ b/sdk/ai/azure-ai-projects/README.md
@@ -351,118 +351,50 @@ project_client.indexes.delete(name=index_name, version=index_version)

 ### Evaluation

+Evaluation in the Azure AI Project client library provides quantitative, AI-assisted quality and safety metrics for assessing the performance of LLM models, generative AI applications, and agents. Metrics are defined as evaluators. Built-in or custom evaluators can provide comprehensive evaluation insights.
-Evaluation in Azure AI Project client library is designed to assess the performance of generative AI applications in the cloud. The output of Generative AI application is quantitively measured with mathematical based metrics, AI-assisted quality and safety metrics. Metrics are defined as evaluators. Built-in or custom evaluators can provide comprehensive insights into the application's capabilities and limitations.
+The code below shows some evaluation operations. A full list of samples can be found under the "evaluation" folder in the [package samples][samples].

-#### Evaluator
+

-Evaluators are custom or prebuilt classes or functions that are designed to measure the quality of the outputs from language models or generative AI applications.
-
-Evaluators are made available via [azure-ai-evaluation][azure_ai_evaluation] SDK for local experience and also in [Evaluator Library][evaluator_library] in Azure AI Foundry for using them in the cloud.
-
-More details on built-in and custom evaluators can be found [here][evaluators].
-
-#### Run Evaluation in the cloud
-
-To run evaluation in the cloud the following are needed:
-
-* Evaluators
-* Data to be evaluated
-* [Optional] Azure Open AI model.
-
-##### Evaluators
-
-For running evaluator in the cloud, evaluator `ID` is needed. To get it via code you use [azure-ai-evaluation][azure_ai_evaluation]
-
-```python
-# pip install azure-ai-evaluation
-
-from azure.ai.evaluation import RelevanceEvaluator
-
-evaluator_id = RelevanceEvaluator.id
-```
-
-##### Data to be evaluated
-
-Evaluation in the cloud supports data in form of `jsonl` file. Data can be uploaded via the helper method `upload_file` on the project client.
-
-```python
-# Upload data for evaluation and get dataset id
-data_id, _ = project_client.upload_file("")
-```
-
-##### [Optional] Azure OpenAI Model
-
-Azure AI Foundry project comes with a default Azure Open AI endpoint which can be easily accessed using following code. This gives you the endpoint details for you Azure OpenAI endpoint. Some of the evaluators need model that supports chat completion.

 ```python
-default_connection = project_client.connections.get_default(connection_type=ConnectionType.AZURE_OPEN_AI)
-```
-
-##### Example Remote Evaluation
-
-```python
-import os
-from azure.ai.projects import AIProjectClient
-from azure.identity import DefaultAzureCredential
-from azure.ai.projects.models import Evaluation, Dataset, EvaluatorConfiguration, ConnectionType
-from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, HateUnfairnessEvaluator
-
-
-# Create project client
-project_client = AIProjectClient(
-    credential=DefaultAzureCredential(),
-    endpoint=os.environ["PROJECT_ENDPOINT"],
+print(
+    f"Create Evaluation using dataset with id {dataset_id} and project {project_endpoint}"
 )
-# Upload data for evaluation and get dataset id
-data_id, _ = project_client.upload_file("")
-
-deployment_name = ""
-api_version = ""
+from azure.ai.projects.models import Evaluation, InputDataset, EvaluatorConfiguration, EvaluatorIds

-# Create an evaluation
 evaluation = Evaluation(
-    display_name="Remote Evaluation",
-    description="Evaluation of dataset",
-    data=Dataset(id=data_id),
+    display_name="Sample Evaluation Test",
+    description="Sample evaluation for testing",
+    # Sample Dataset Id : azureai://accounts//projects//data//versions/
+    data=InputDataset(id=dataset_id),
     evaluators={
-        "f1_score": EvaluatorConfiguration(
-            id=F1ScoreEvaluator.id,
-        ),
-        "relevance": EvaluatorConfiguration(
-            id=RelevanceEvaluator.id,
+        "violence": EvaluatorConfiguration(
+            id=EvaluatorIds.VIOLENCE.value,
             init_params={
-                "model_config": default_connection.to_evaluator_model_config(
-                    deployment_name=deployment_name, api_version=api_version
-                )
+                "azure_ai_project": project_endpoint,
             },
         ),
-        "violence": EvaluatorConfiguration(
-            id=ViolenceEvaluator.id,
-            init_params={"azure_ai_project": project_client.scope},
+        "bleu_score": EvaluatorConfiguration(
+            id=EvaluatorIds.BLEU_SCORE.value,
         ),
     },
 )
+project_client.evaluations.create(evaluation)
+print(evaluation)

-evaluation_response = project_client.evaluations.create(
-    evaluation=evaluation,
-)
-
-# Get evaluation
-get_evaluation_response = project_client.evaluations.get(evaluation_response.id)
+print(f"Get an evaluation with name {evaluation.name}")
+evaluation = project_client.evaluations.get(evaluation.name)
+print(evaluation)

-print("----------------------------------------------------------------")
-print("Created evaluation, evaluation ID: ", get_evaluation_response.id)
-print("Evaluation status: ", get_evaluation_response.status)
-if isinstance(get_evaluation_response.properties, dict):
-    print("AI Foundry URI: ", get_evaluation_response.properties["AiStudioEvaluationUri"])
-print("----------------------------------------------------------------")
+print("List all evaluations")
+for evaluation in project_client.evaluations.list():
+    print(evaluation)
 ```
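+
+To wait for an evaluation run to finish, a minimal polling sketch could look like the following (this assumes the returned `Evaluation` exposes a `status` field, as in earlier versions of the library, and that the terminal status names shown below match your service version):
+
+```python
+import time
+
+# Poll an evaluation by name until it reaches an assumed terminal state.
+name = evaluation.name  # or the name of any existing evaluation
+evaluation = project_client.evaluations.get(name)
+while evaluation.status not in ("Completed", "Failed", "Canceled"):  # assumed status values
+    time.sleep(30)
+    evaluation = project_client.evaluations.get(name)
+print(f"Evaluation finished with status: {evaluation.status}")
+```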

-NOTE: For running evaluators locally refer to [Evaluate with the Azure AI Evaluation SDK][evaluators].
-
 ## Troubleshooting

 ### Exceptions
diff --git a/sdk/ai/azure-ai-projects/samples/evaluation/samples_folder/sample_data_evaluation.jsonl b/sdk/ai/azure-ai-projects/samples/evaluation/data_folder/sample_data_evaluation.jsonl
similarity index 100%
rename from sdk/ai/azure-ai-projects/samples/evaluation/samples_folder/sample_data_evaluation.jsonl
rename to sdk/ai/azure-ai-projects/samples/evaluation/data_folder/sample_data_evaluation.jsonl
diff --git a/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations.py b/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations.py
index ecb0a8ad8721..7ca4549d58a6 100644
--- a/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations.py
+++ b/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations.py
@@ -19,7 +19,10 @@
     Set these environment variables with your own values:
     1) PROJECT_ENDPOINT - Required. The Azure AI Project endpoint, as found in the overview page of your
        Azure AI Foundry project.
-    2) DATASET_NAME - Required. The name of the Dataset to create and use in this sample.
+    2) CONNECTION_NAME - Required. The name of the Azure Storage Account connection to use for uploading files.
+    3) DATASET_NAME - Optional. The name of the Dataset to create and use in this sample.
+    4) DATASET_VERSION - Optional. The version of the Dataset to create and use in this sample.
+    5) DATA_FOLDER - Optional. The folder path where the data files for upload are located.
 """

 import os
@@ -32,38 +35,42 @@
     InputDataset,
     EvaluatorConfiguration,
     EvaluatorIds,
-    # DatasetVersion,
+    DatasetVersion,
 )

-endpoint = os.environ[
-    "PROJECT_ENDPOINT"
-]  # Sample : https://.services.ai.azure.com/api/projects/
+endpoint = os.environ["PROJECT_ENDPOINT"]  # Sample : https://.services.ai.azure.com/api/projects/
 model_endpoint = os.environ["MODEL_ENDPOINT"]  # Sample : https://.services.ai.azure.com
 model_api_key = os.environ["MODEL_API_KEY"]
 model_deployment_name = os.environ["MODEL_DEPLOYMENT_NAME"]  # Sample : gpt-4o-mini
+dataset_name = os.environ.get("DATASET_NAME", "dataset-test")
+dataset_version = os.environ.get("DATASET_VERSION", "1.0")
+
+# Construct the paths to the data folder and data file used in this sample
+script_dir = os.path.dirname(os.path.abspath(__file__))
+data_folder = os.environ.get("DATA_FOLDER", os.path.join(script_dir, "data_folder"))
+data_file = os.path.join(data_folder, "sample_data_evaluation.jsonl")

 with DefaultAzureCredential(exclude_interactive_browser_credential=False) as credential:
     with AIProjectClient(endpoint=endpoint, credential=credential) as project_client:

         # [START evaluations_sample]
-        # TODO : Uncomment the following lines once dataset creation works
-        # print(
-        #     "Upload a single file and create a new Dataset to reference the file. Here we explicitly specify the dataset version."
-        # )
-        # dataset: DatasetVersion = project_client.datasets.upload_file(
-        #     name=dataset_name,
-        #     version="1",
-        #     file="./samples_folder/sample_data_evaluation.jsonl",
-        # )
-        # print(dataset)
+        print(
+            "Upload a single file and create a new Dataset to reference the file."
+        )
+        dataset: DatasetVersion = project_client.datasets.upload_file(
+            name=dataset_name,
+            version=dataset_version,
+            file_path=data_file,
+        )
+        print(dataset)

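+        # The returned DatasetVersion carries the dataset `id` that the evaluation below references via InputDataset.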
         print("Create an evaluation")
         evaluation: Evaluation = Evaluation(
             display_name="Sample Evaluation Test",
             description="Sample evaluation for testing",
             # Sample Dataset Id : azureai://accounts//projects//data//versions/
-            data=InputDataset(id="<>"),
+            data=InputDataset(id=dataset.id),  # pyright: ignore
             evaluators={
                 "relevance": EvaluatorConfiguration(
                     id=EvaluatorIds.RELEVANCE.value,
diff --git a/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations_async.py b/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations_async.py
index e43576e57497..332c384e7690 100644
--- a/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations_async.py
+++ b/sdk/ai/azure-ai-projects/samples/evaluation/sample_evaluations_async.py
@@ -19,7 +19,10 @@
     Set these environment variables with your own values:
     1) PROJECT_ENDPOINT - Required. The Azure AI Project endpoint, as found in the overview page of your
        Azure AI Foundry project.
-    2) DATASET_NAME - Required. The name of the Dataset to create and use in this sample.
+    2) CONNECTION_NAME - Required. The name of the Azure Storage Account connection to use for uploading files.
+    3) DATASET_NAME - Optional. The name of the Dataset to create and use in this sample.
+    4) DATASET_VERSION - Optional. The version of the Dataset to create and use in this sample.
+    5) DATA_FOLDER - Optional. The folder path where the data files for upload are located.
 """
 import asyncio
 import os
@@ -31,40 +34,43 @@
     InputDataset,
     EvaluatorConfiguration,
     EvaluatorIds,
-    # DatasetVersion,
+    DatasetVersion,
 )

+# Construct the paths to the data folder and data file used in this sample
+script_dir = os.path.dirname(os.path.abspath(__file__))
+data_folder = os.environ.get("DATA_FOLDER", os.path.join(script_dir, "data_folder"))
+data_file = os.path.join(data_folder, "sample_data_evaluation.jsonl")

 async def main() -> None:
-    endpoint = os.environ[
-        "PROJECT_ENDPOINT"
-    ]  # Sample : https://.services.ai.azure.com/api/projects/
+    endpoint = os.environ["PROJECT_ENDPOINT"]  # Sample : https://.services.ai.azure.com/api/projects/
     model_endpoint = os.environ["MODEL_ENDPOINT"]  # Sample : https://.services.ai.azure.com
     model_api_key = os.environ["MODEL_API_KEY"]
     model_deployment_name = os.environ["MODEL_DEPLOYMENT_NAME"]  # Sample : gpt-4o-mini
+    dataset_name = os.environ.get("DATASET_NAME", "dataset-test")
+    dataset_version = os.environ.get("DATASET_VERSION", "1.0")

     async with DefaultAzureCredential() as credential:
         async with AIProjectClient(endpoint=endpoint, credential=credential) as project_client:

             # [START evaluations_sample]
-            # TODO : Uncomment the following lines once dataset creation works
-            # print(
-            #     "Upload a single file and create a new Dataset to reference the file. Here we explicitly specify the dataset version."
-            # )
-            # dataset: DatasetVersion = await project_client.datasets.upload_file(
-            #     name=dataset_name,
-            #     version="1",
-            #     file="./samples_folder/sample_data_evaluation.jsonl",
-            # )
-            # print(dataset)
+            print(
+                "Upload a single file and create a new Dataset to reference the file."
+            )
+            dataset: DatasetVersion = await project_client.datasets.upload_file(
+                name=dataset_name,
+                version=dataset_version,
+                file_path=data_file,
+            )
+            print(dataset)

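+            # The awaited upload returns a DatasetVersion; its `id` is used for the evaluation's input dataset below.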
             print("Create an evaluation")
             evaluation: Evaluation = Evaluation(
                 display_name="Sample Evaluation Async",
                 description="Sample evaluation for testing",
                 # Sample Dataset Id : azureai://accounts//projects//data//versions/
-                data=InputDataset(id="<>"),
+                data=InputDataset(id=dataset.id),  # pyright: ignore
                 evaluators={
                     "relevance": EvaluatorConfiguration(
                         id=EvaluatorIds.RELEVANCE.value,