Merged
107 commits
368bdf4
Initial-Commit-multimodal
w-javed Oct 3, 2024
920c46c
Fix
w-javed Oct 4, 2024
17c7dac
Sync eng/common directory with azure-sdk-tools for PR 9092 (#37713)
azure-sdk Oct 3, 2024
5d8ca40
Removing private parameter from __call__ of AdversarialSimulator (#37…
nagkumar91 Oct 3, 2024
6e5bd48
Enabling option to disable response payload on writes (#37365)
FabianMeiswinkel Oct 3, 2024
10f9ac7
deprecate azure_germany (#37654)
xiangyan99 Oct 4, 2024
db68d9d
Add default impl to handle token challenges (#37652)
xiangyan99 Oct 4, 2024
793c3fc
Make Credentials Required for Content Safety and Protected Materials …
needuv Oct 4, 2024
4d4e5bc
addFeedRangesAndUseFeedRangeInQueryChangeFeed (#37687)
xinlian12 Oct 4, 2024
22f081c
Update release date for core (#37723)
xiangyan99 Oct 4, 2024
71e44d4
Improvements to mindependency dev_requirement conflict resolution (#3…
scbedd Oct 4, 2024
ee45fa1
Need to add environment to subscription configuration (#37726)
azure-sdk Oct 4, 2024
2e2366b
Enable samples for formrecognizer (#37676)
xiangyan99 Oct 4, 2024
0faf959
Merge branch 'main' into multi-moodal-sdk-support
w-javed Oct 5, 2024
088ed3b
multi-modal-changes
w-javed Oct 14, 2024
3ff2c2a
Merge-conflicts
w-javed Oct 14, 2024
5ff2668
fixes
w-javed Oct 14, 2024
5c270cd
Fix with latest
w-javed Oct 16, 2024
c473df7
merge-conflicts
w-javed Oct 16, 2024
99d0cf0
dict-fix
w-javed Oct 16, 2024
d570130
adding-protected-material
w-javed Oct 17, 2024
4bc8a34
Merge branch 'main' into multi-moodal-sdk-support
w-javed Oct 17, 2024
6ad0164
adding-protected-material
w-javed Oct 17, 2024
7655b9e
adding-protected-material
w-javed Oct 17, 2024
3c4b816
bumping-version
w-javed Oct 17, 2024
255add0
adding assets
w-javed Oct 17, 2024
49a8ad8
Added image in simulator
w-javed Oct 17, 2024
f64d4d3
Merge branch 'main' into multi-moodal-sdk-support
w-javed Oct 17, 2024
acad134
Added image in simulator
w-javed Oct 17, 2024
60eae73
bumping-version
w-javed Oct 18, 2024
e14c6d7
push-asset
w-javed Oct 18, 2024
b12ef57
merge-conflict-fix
w-javed Oct 18, 2024
e070237
assets
w-javed Oct 18, 2024
66548f5
pushing asset
w-javed Oct 18, 2024
f05726c
merge-conflicts
w-javed Oct 19, 2024
1fe065b
remove-containt-on-key
w-javed Oct 19, 2024
82fd655
asset
w-javed Oct 19, 2024
6dddb96
asset2
w-javed Oct 19, 2024
c5fa8cf
asset3
w-javed Oct 19, 2024
74b3582
asset4
w-javed Oct 19, 2024
2e11e9d
adding conftest
w-javed Oct 20, 2024
651fc00
conftest
w-javed Oct 20, 2024
3ed59e8
cred fix
w-javed Oct 20, 2024
4031d46
asset-new
w-javed Oct 20, 2024
b5fc1c5
fix
w-javed Oct 20, 2024
24a52aa
asset
w-javed Oct 20, 2024
73e62c6
adding multi-modal-without-tests
w-javed Oct 20, 2024
c89b341
asset-from-main
w-javed Oct 20, 2024
b63910d
asset-from-main
w-javed Oct 20, 2024
ca4c3e6
fix
w-javed Oct 20, 2024
8b28458
adding one test only
w-javed Oct 20, 2024
1eb9304
new asset
w-javed Oct 20, 2024
dd53e67
tests,fix: Sanitizer should replace with enum value not enum name
kdestin Oct 21, 2024
9fcd7f8
test-asset
w-javed Oct 21, 2024
d64704d
[AutoRelease] t2-containerservicefleet-2024-09-24-42036(can only be m…
azure-sdk Oct 21, 2024
78b11c9
[AutoRelease] t2-dns-2024-09-25-81486(can only be merged by SDK owner…
azure-sdk Oct 21, 2024
6962ca2
[AutoRelease] t2-appconfiguration-2024-10-09-68726(can only be merged…
azure-sdk Oct 21, 2024
aa2bb08
code and test (#37855)
azure-sdk Oct 21, 2024
52f3784
[AutoRelease] t2-servicefabricmanagedclusters-2024-10-08-57405(can on…
azure-sdk Oct 21, 2024
09724d8
[AutoRelease] t2-containerinstance-2024-10-21-66631(can only be merge…
azure-sdk Oct 21, 2024
9f5f7f9
[sdk generation pipeline] bump typespec-python 0.36.1 (#38008)
msyyc Oct 21, 2024
45e049c
[AutoRelease] t2-dnsresolver-2024-10-12-16936(can only be merged by S…
azure-sdk Oct 21, 2024
617f8aa
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 21, 2024
706eea3
new asset after fix in conftest
w-javed Oct 21, 2024
bd08adf
asset
w-javed Oct 21, 2024
3c71da9
chore: Update assets.json
kdestin Oct 21, 2024
6d48318
Move perf pipelines to TME subscription (#38020)
azure-sdk Oct 21, 2024
14d4675
fix
w-javed Oct 21, 2024
d4b8272
after-comments
w-javed Oct 21, 2024
33e3075
fix
w-javed Oct 21, 2024
237443b
fix
w-javed Oct 21, 2024
525379e
asset
w-javed Oct 22, 2024
9603f21
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 22, 2024
af143e8
new asset with 1 test recording only
w-javed Oct 22, 2024
511e4b5
chore: Update assets.json
kdestin Oct 22, 2024
ac94148
conftest fix
w-javed Oct 23, 2024
4f445f6
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 23, 2024
dc7cb7d
assets change
w-javed Oct 23, 2024
48bec25
new test
w-javed Oct 23, 2024
307b4e4
few changes
w-javed Oct 23, 2024
20093f8
removing proxy start
w-javed Oct 23, 2024
e819a80
added all tests
w-javed Oct 24, 2024
6f1595e
merge with asset
w-javed Oct 24, 2024
60a823f
asset
w-javed Oct 24, 2024
b6334eb
fixes
w-javed Oct 25, 2024
9092831
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 25, 2024
93ba2f0
fixes with asset
w-javed Oct 25, 2024
1be5ef1
asset-after-tax
w-javed Oct 25, 2024
78f8ec2
enabling 2 more tests
w-javed Oct 25, 2024
84d4eac
unit test fix
w-javed Oct 25, 2024
7a9eb2b
asset
w-javed Oct 25, 2024
30eba68
new asset
w-javed Oct 25, 2024
ca7cdfe
fixes per comments
w-javed Oct 25, 2024
9afeb5f
changes by black
w-javed Oct 25, 2024
dd67a01
merge fix
w-javed Oct 25, 2024
c5fb4f1
merge fix
w-javed Oct 25, 2024
1701076
pylint fix
w-javed Oct 25, 2024
9397aaf
merge conflict
w-javed Oct 25, 2024
a1d9be9
pylint fix
w-javed Oct 26, 2024
47ff5fd
ground test fix
w-javed Oct 26, 2024
a7689a2
fixes - pylint, black, mypy
w-javed Oct 26, 2024
9c09880
more tests
w-javed Oct 26, 2024
ebd21d3
docstring fixes
w-javed Oct 26, 2024
c9db879
merge conflict
w-javed Oct 26, 2024
9fdef18
doc string fix
w-javed Oct 26, 2024
058c37a
asset
w-javed Oct 26, 2024
1e15809
few updates after Nagkumar review
w-javed Oct 28, 2024
fixes
w-javed committed Oct 25, 2024
commit b6334ebbb316802ce10546235afb7a6c2e051f20
Original file line number Diff line number Diff line change
@@ -13,12 +13,10 @@
import jwt
import json

from azure.ai.inference._model_base import SdkJSONEncoder
from azure.ai.inference.models import ChatRequestMessage, SystemMessage, AssistantMessage

from promptflow.core._errors import MissingRequiredPackage
from azure.ai.evaluation._exceptions import ErrorBlame, ErrorCategory, ErrorTarget, EvaluationException
from azure.ai.evaluation._http_utils import AsyncHttpPipeline, get_async_http_client
from azure.ai.evaluation._model_configurations import AzureAIProject
from azure.ai.evaluation._model_configurations import AzureAIProject, Message
from azure.core.credentials import TokenCredential
from azure.core.pipeline.policies import AsyncRetryPolicy

@@ -499,19 +497,20 @@ async def submit_multimodal_request(messages, metric: str, rai_svc_url: str, tok
:rtype: str
"""
## handle json payload and payload from inference sdk strongly type messages
if len(messages) > 0 and isinstance(messages[0], ChatRequestMessage):
filtered_messages = [message for message in messages if not isinstance(message, SystemMessage)]
assistant_messages = [message for message in messages if isinstance(message, AssistantMessage)]
content_type = retrieve_content_type(assistant_messages, metric)
json_text = generate_payload_multimodal(content_type, filtered_messages, metric)
messages_text = json.dumps(json_text, cls=SdkJSONEncoder, exclude_readonly=True)
payload = json.loads(messages_text)

else:
filtered_messages = [message for message in messages if message["role"] != "system"]
assistant_messages = [message for message in messages if message["role"] == "assistant"]
content_type = retrieve_content_type(assistant_messages, metric)
payload = generate_payload_multimodal(content_type, filtered_messages, metric)
if len(messages) > 0 and not isinstance(messages[0], Dict):
try:
from azure.ai.inference.models import ChatRequestMessage
except ImportError:
error_message = "Please install 'azure-ai-inference' package to use SystemMessage, UserMessage, AssistantMessage"
raise MissingRequiredPackage(message=error_message)
else:
if len(messages) > 0 and isinstance(messages[0], ChatRequestMessage):
messages = [message.as_dict() for message in messages]

filtered_messages = [message for message in messages if message["role"] != "system"]
assistant_messages = [message for message in messages if message["role"] == "assistant"]
content_type = retrieve_content_type(assistant_messages, metric)
payload = generate_payload_multimodal(content_type, filtered_messages, metric)

## calling rai service for annotation
url = rai_svc_url + "/submitannotation"
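The hunk above swaps the module-level `azure.ai.inference` import for a lazy one and converts typed messages to plain dictionaries before filtering by role. A minimal sketch of that normalization, assuming hypothetical helper names (`normalize_messages`, `split_for_annotation`) and relying only on the `as_dict()` call that the diff itself uses; any object exposing `as_dict()` stands in for a typed message here:

```python
from typing import Any, Dict, List, Tuple


def normalize_messages(messages: List[Any]) -> List[Dict[str, Any]]:
    """Return messages as plain dicts (hypothetical helper, not part of the SDK)."""
    normalized = []
    for message in messages:
        if isinstance(message, dict):
            normalized.append(message)
        elif hasattr(message, "as_dict"):
            # Typed message objects are assumed to expose as_dict(), as in the diff.
            normalized.append(message.as_dict())
        else:
            raise TypeError(f"Unsupported message type: {type(message).__name__}")
    return normalized


def split_for_annotation(
    messages: List[Any],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Drop system messages and collect assistant messages, mirroring the diff's filters."""
    plain = normalize_messages(messages)
    filtered = [m for m in plain if m["role"] != "system"]
    assistant = [m for m in plain if m["role"] == "assistant"]
    return filtered, assistant
```

Converting once up front means the role filters only ever have to deal with dictionaries, which is what lets the new code drop the separate typed-message branch.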
Original file line number Diff line number Diff line change
@@ -88,14 +88,14 @@ def _store_multimodal_content(messages, tmpdir: str):
os.makedirs(images_folder_path, exist_ok=True)

# traverse all messages and replace base64 image data with new file name.
for item in messages:
if "content" in item:
for content in item["content"]:
for message in messages:
if "content" in message:
for content in message["content"]:
if content.get("type") == "image_url":
image_url = content.get("image_url")
if image_url and 'url' in image_url and image_url['url'].startswith("data:image/jpeg;base64,"):
if image_url and 'url' in image_url and image_url['url'].startswith("data:image/jpg;base64,"):
# Extract the base64 string
base64image = image_url['url'].replace("data:image/jpeg;base64,", "")
base64image = image_url['url'].replace("data:image/jpg;base64,", "")

# Generate a unique filename
image_file_name = f"{str(uuid.uuid4())}.jpg"
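The hunk above changes the data-URL prefix that `_store_multimodal_content` looks for before it extracts the base64 payload. `image/jpeg` is the registered media type, so a check for `data:image/jpg;base64,` alone would skip standard JPEG data URLs. A minimal sketch of the extraction step, assuming a hypothetical helper `save_inline_jpeg` that accepts both spellings (the diff itself checks only one):

```python
import base64
import os
import uuid
from typing import Optional

# Accepting both spellings is an assumption of this sketch, not the diff's behavior.
_PREFIXES = ("data:image/jpeg;base64,", "data:image/jpg;base64,")


def save_inline_jpeg(url: str, folder: str) -> Optional[str]:
    """Decode a base64 JPEG data URL into `folder`; return the file name, or None."""
    for prefix in _PREFIXES:
        if url.startswith(prefix):
            # Slice off the prefix rather than using str.replace on the whole URL.
            image_bytes = base64.b64decode(url[len(prefix):])
            file_name = f"{uuid.uuid4()}.jpg"
            os.makedirs(folder, exist_ok=True)
            with open(os.path.join(folder, file_name), "wb") as handle:
                handle.write(image_bytes)
            return file_name
    # Plain http(s) URLs and other schemes are left untouched.
    return None
```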
@@ -139,10 +139,12 @@ def _log_metrics_and_instance_results(

with tempfile.TemporaryDirectory() as tmpdir:
# storing multi_modal images if exists
col_name = "inputs.messages"
col_name = "inputs.conversation"
if col_name in instance_results.columns:
instance_results[col_name].apply(lambda messages: _store_multimodal_content(messages, tmpdir))

for key, item in instance_results[col_name].items():
if "messages" in item:
_store_multimodal_content(item["messages"], tmpdir)

# storing artifact result
tmp_path = os.path.join(tmpdir, artifact_name)

Original file line number Diff line number Diff line change
@@ -5,14 +5,14 @@
import math
from concurrent.futures import as_completed
from typing import Callable, Dict, List, Union
from azure.ai.inference.models import ChatRequestMessage, UserMessage, AssistantMessage, SystemMessage, ToolMessage, ContentItem, ImageContentItem

from promptflow.tracing import ThreadPoolExecutorWithContext as ThreadPoolExecutor
from promptflow.core._errors import MissingRequiredPackage
from azure.ai.evaluation._common._experimental import experimental
from azure.ai.evaluation._common.constants import HarmSeverityLevel
from azure.ai.evaluation._common.math import list_mean_nan_safe
from azure.ai.evaluation._exceptions import ErrorBlame, ErrorCategory, ErrorTarget, EvaluationException

from azure.ai.evaluation._model_configurations import Conversation
from ._hate_unfairness import HateUnfairnessMultimodalEvaluator
from ._self_harm import SelfHarmMultimodalEvaluator
from ._sexual import SexualMultimodalEvaluator
@@ -47,27 +47,29 @@ class ContentSafetyMultimodalEvaluator:
}
eval_fn = ContentSafetyMultimodalEvaluator(azure_ai_project)
result = eval_fn(
messages= [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What's in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "<image url or base64 encoded image>"
{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What's in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "<image url or base64 encoded image>"
}
}
}
]
},
{
"role": "assistant",
"content": "This picture shows an astronaut standing in desert."
}
]
]
},
{
"role": "assistant",
"content": "This picture shows an astronaut standing in desert."
}
]
}
)

**Output format**
@@ -105,57 +107,58 @@ def __init__(
def __call__(
self,
*,
messages,
conversation,
**kwargs):
"""
Evaluates content-safety metrics for list of messages.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
:paramtype messages: ~azure.ai.evaluation.Conversation
:keyword conversation: The conversation containing the list of messages to be evaluated. Each message should have "role" and "content" keys.
:paramtype conversation: ~azure.ai.evaluation.Conversation
:return: The scores for messages.
:rtype: Dict
"""
self._validate_messages(messages)
self._validate_conversation(conversation)

results: Dict[str, Union[str, float]] = {}
if self._parallel:
with ThreadPoolExecutor() as executor:
futures = {
executor.submit(evaluator, messages=messages, **kwargs): evaluator
executor.submit(evaluator, conversation=conversation, **kwargs): evaluator
for evaluator in self._evaluators
}

for future in as_completed(futures):
results.update(future.result())
else:
for evaluator in self._evaluators:
result = evaluator(messages=messages, **kwargs)
result = evaluator(conversation=conversation, **kwargs)
results.update(result)

return results
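The `__call__` body above fans the same `conversation` out to every sub-evaluator and merges the returned dictionaries. A minimal sketch of that fan-out, assuming a hypothetical function `run_evaluators` and using the standard-library `ThreadPoolExecutor` in place of the promptflow `ThreadPoolExecutorWithContext` that the diff imports:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, List


def run_evaluators(
    evaluators: List[Callable[..., Dict[str, Any]]],
    conversation: Dict[str, Any],
    parallel: bool = True,
) -> Dict[str, Any]:
    """Call every evaluator with the same conversation and merge their result dicts."""
    results: Dict[str, Any] = {}
    if parallel:
        with ThreadPoolExecutor() as executor:
            futures = [
                executor.submit(evaluator, conversation=conversation)
                for evaluator in evaluators
            ]
            for future in as_completed(futures):
                # result() re-raises any exception raised inside the worker thread.
                results.update(future.result())
    else:
        for evaluator in evaluators:
            results.update(evaluator(conversation=conversation))
    return results
```

Because the results are merged with `dict.update`, this only behaves predictably when each evaluator returns distinct keys; with overlapping keys the last future to complete would win.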

def _validate_messages(self, messages):
def _validate_conversation(self, conversation):
if conversation is None or "messages" not in conversation:
msg = "Attribute messages is missing in the request"
raise EvaluationException(
message=msg,
internal_message=msg,
target=ErrorTarget.CONTENT_SAFETY_CHAT_EVALUATOR,
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)
messages = conversation["messages"]
if messages is None or not isinstance(messages, list):
msg = "messages parameter must be a list of JSON representation of chat messages or strong typed child class of ChatRequestMessage"
msg = "messages parameter must be a list of JSON representation of chat messages"
raise EvaluationException(
message=msg,
internal_message=msg,
target=ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)
expected_roles = [ "user", "assistant", "system", "tool" ]
expected_roles = [ "user", "assistant", "system"]
image_found = False
for num, message in enumerate(messages):
msg_num = num + 1
if not isinstance(message, dict) and not isinstance(message, ChatRequestMessage):
msg = f"Messsage in array must be a dictionary or class of ChatRequestMessage [UserMessage, SystemMessage, AssistantMessage, ToolMessage]. Message number: {msg_num}"
raise EvaluationException(
message=msg,
internal_message=msg,
target=ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)
if isinstance(message, dict):
if "role" in message or "content" in message:
if message["role"] not in expected_roles:
@@ -192,22 +195,29 @@ def _validate_messages(self, messages):
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)
if isinstance(message, ChatRequestMessage):
if not isinstance(message, UserMessage) and not isinstance(message, AssistantMessage) and not isinstance(message, SystemMessage) and not isinstance(message, ToolMessage):
msg = f"Messsage in array must be a strongly typed class of ChatRequestMessage [UserMessage, SystemMessage, AssistantMessage, ToolMessage]. Message number: {msg_num}"
raise EvaluationException(
message=msg,
internal_message=msg,
target=ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)
if message.content and isinstance(message.content, list):
image_items = [item for item in message.content if isinstance(item, ImageContentItem)]
if len(image_items) > 0:
image_found = True
else:
try:
from azure.ai.inference.models import ChatRequestMessage, UserMessage, AssistantMessage, SystemMessage, ImageContentItem
except ImportError:
error_message = "Please install 'azure-ai-inference' package to use SystemMessage, AssistantMessage"
raise MissingRequiredPackage(message=error_message)
else:
if isinstance(messages[0], ChatRequestMessage):
if not isinstance(message, UserMessage) and not isinstance(message, AssistantMessage) and not isinstance(message, SystemMessage):
msg = f"Message in array must be a strongly typed class of [UserMessage, SystemMessage, AssistantMessage]. Message number: {msg_num}"
raise EvaluationException(
message=msg,
internal_message=msg,
target=ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)
if message.content and isinstance(message.content, list):
image_items = [item for item in message.content if isinstance(item, ImageContentItem)]
if len(image_items) > 0:
image_found = True
if image_found is False:
msg = f"Message needs to have multimodal input like images"
msg = f"Message needs to have multimodal input like images."
raise EvaluationException(
message=msg,
internal_message=msg,
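The validation logic in this file checks, in order, that the conversation carries a `messages` list, that each message has a known role, and that at least one message contains an image. A condensed sketch of those checks for dictionary messages only, assuming a hypothetical function `validate_conversation` and raising `ValueError` in place of the SDK's `EvaluationException`; it requires both `role` and `content` before indexing either key:

```python
from typing import Any

EXPECTED_ROLES = ("user", "assistant", "system")


def validate_conversation(conversation: Any) -> None:
    """Raise ValueError unless the conversation holds valid messages with an image."""
    if not isinstance(conversation, dict) or "messages" not in conversation:
        raise ValueError("Attribute messages is missing in the request")
    messages = conversation["messages"]
    if not isinstance(messages, list):
        raise ValueError("messages must be a list of chat message dictionaries")

    image_found = False
    for number, message in enumerate(messages, start=1):
        if not isinstance(message, dict):
            raise ValueError(f"Message {number} must be a dictionary")
        # Both keys must be present before either one is read.
        if "role" not in message or "content" not in message:
            raise ValueError(f"Message {number} needs 'role' and 'content' keys")
        if message["role"] not in EXPECTED_ROLES:
            raise ValueError(f"Message {number} has an unexpected role")
        content = message["content"]
        if isinstance(content, list) and any(
            isinstance(item, dict) and item.get("type") == "image_url"
            for item in content
        ):
            image_found = True

    if not image_found:
        raise ValueError("Message needs to have multimodal input like images.")
```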
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# ---------------------------------------------------------
from abc import ABC
from typing import Dict, List, Union
from typing import Union
from azure.ai.evaluation._common.constants import EvaluationMetrics
from azure.ai.evaluation._common.rai_service import evaluate_with_rai_service_multimodal
from azure.ai.evaluation._common.constants import EvaluationMetrics, _InternalEvaluationMetrics
@@ -35,16 +35,17 @@ def __init__(
async def __call__(
self,
*,
messages,
conversation,
**kwargs
):
"""
Evaluates content according to this evaluator's metric.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
:paramtype messages: ~azure.ai.evaluation.Conversation
:keyword conversation: The conversation containing the list of messages to be evaluated. Each message should have "role" and "content" keys.
:paramtype conversation: ~azure.ai.evaluation.Conversation
:return: The evaluation score computation based on the Content Safety metric (self.metric).
:rtype: Any
"""
messages = conversation["messages"]
# Run score computation based on supplied metric.
result = await evaluate_with_rai_service_multimodal(
messages=messages,
Original file line number Diff line number Diff line change
@@ -4,7 +4,6 @@
from promptflow._utils.async_utils import async_run_allowing_running_loop
from azure.ai.evaluation._common._experimental import experimental
from azure.ai.evaluation._common.constants import EvaluationMetrics
from typing import List, Dict
from ._content_safety_multimodal_base import ContentSafetyMultimodalEvaluatorBase

@experimental
@@ -26,27 +25,29 @@ class HateUnfairnessMultimodalEvaluator:
}
eval_fn = HateUnfairnessMultimodalEvaluator(azure_ai_project)
result = eval_fn(
messages= [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What's in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "<image url or base64 encoded image>"
{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What's in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "<image url or base64 encoded image>"
}
}
}
]
},
{
"role": "assistant",
"content": "This picture shows an astronaut standing in desert."
}
]
]
},
{
"role": "assistant",
"content": "This picture shows an astronaut standing in desert."
}
]
}
)

**Output format**
@@ -68,17 +69,18 @@ def __init__(
def __call__(
self,
*,
messages,
conversation,
**kwargs
):
"""
Evaluates hate unfairness content.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
:paramtype messages: ~azure.ai.evaluation.Conversation
:keyword conversation: The conversation containing the list of messages to be evaluated. Each message should have "role" and "content" keys.
:paramtype conversation: ~azure.ai.evaluation.Conversation
:return: The hate unfairness score.
:rtype: Dict
"""
return async_run_allowing_running_loop(self._async_evaluator, messages=messages, **kwargs)
self._validate_conversation(conversation)
return async_run_allowing_running_loop(self._async_evaluator, conversation=conversation, **kwargs)

def _to_async(self):
return self._async_evaluator
@@ -99,7 +101,7 @@ def __init__(
async def __call__(
self,
*,
messages,
conversation,
**kwargs
):
return await super().__call__(messages=messages, **kwargs)
return await super().__call__(conversation=conversation, **kwargs)
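The evaluator in this file is a synchronous wrapper that forwards `conversation` to an async implementation, which in turn unpacks `conversation["messages"]`. A minimal sketch of that pairing, assuming hypothetical class names and using `asyncio.run` as a simplified stand-in for promptflow's `async_run_allowing_running_loop`; unlike that helper, `asyncio.run` cannot be called from inside an already running event loop:

```python
import asyncio
from typing import Any, Dict


class AsyncEvaluatorSketch:
    """Hypothetical stand-in for the async base evaluator."""

    async def __call__(self, *, conversation: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
        # Unpack the messages from the conversation, as the base class in the diff does.
        messages = conversation["messages"]
        await asyncio.sleep(0)  # placeholder for the real service request
        return {"message_count": len(messages)}


class SyncEvaluatorSketch:
    """Hypothetical synchronous facade over the async evaluator."""

    def __init__(self) -> None:
        self._async_evaluator = AsyncEvaluatorSketch()

    def __call__(self, *, conversation: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
        # Drive the coroutine to completion and return its result.
        return asyncio.run(self._async_evaluator(conversation=conversation, **kwargs))

    def _to_async(self) -> AsyncEvaluatorSketch:
        return self._async_evaluator
```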