Merged · 107 commits
368bdf4
Initial-Commit-multimodal
w-javed Oct 3, 2024
920c46c
Fix
w-javed Oct 4, 2024
17c7dac
Sync eng/common directory with azure-sdk-tools for PR 9092 (#37713)
azure-sdk Oct 3, 2024
5d8ca40
Removing private parameter from __call__ of AdversarialSimulator (#37…
nagkumar91 Oct 3, 2024
6e5bd48
Enabling option to disable response payload on writes (#37365)
FabianMeiswinkel Oct 3, 2024
10f9ac7
deprecate azure_germany (#37654)
xiangyan99 Oct 4, 2024
db68d9d
Add default impl to handle token challenges (#37652)
xiangyan99 Oct 4, 2024
793c3fc
Make Credentials Required for Content Safety and Protected Materials …
needuv Oct 4, 2024
4d4e5bc
addFeedRangesAndUseFeedRangeInQueryChangeFeed (#37687)
xinlian12 Oct 4, 2024
22f081c
Update release date for core (#37723)
xiangyan99 Oct 4, 2024
71e44d4
Improvements to mindependency dev_requirement conflict resolution (#3…
scbedd Oct 4, 2024
ee45fa1
Need to add environment to subscription configuration (#37726)
azure-sdk Oct 4, 2024
2e2366b
Enable samples for formrecognizer (#37676)
xiangyan99 Oct 4, 2024
0faf959
Merge branch 'main' into multi-moodal-sdk-support
w-javed Oct 5, 2024
088ed3b
multi-modal-changes
w-javed Oct 14, 2024
3ff2c2a
Merge-conflicts
w-javed Oct 14, 2024
5ff2668
fixes
w-javed Oct 14, 2024
5c270cd
Fix with latest
w-javed Oct 16, 2024
c473df7
merge-conflicts
w-javed Oct 16, 2024
99d0cf0
dict-fix
w-javed Oct 16, 2024
d570130
adding-protected-material
w-javed Oct 17, 2024
4bc8a34
Merge branch 'main' into multi-moodal-sdk-support
w-javed Oct 17, 2024
6ad0164
adding-protected-material
w-javed Oct 17, 2024
7655b9e
adding-protected-material
w-javed Oct 17, 2024
3c4b816
bumping-version
w-javed Oct 17, 2024
255add0
adding assets
w-javed Oct 17, 2024
49a8ad8
Added image in simulator
w-javed Oct 17, 2024
f64d4d3
Merge branch 'main' into multi-moodal-sdk-support
w-javed Oct 17, 2024
acad134
Added image in simulator
w-javed Oct 17, 2024
60eae73
bumping-version
w-javed Oct 18, 2024
e14c6d7
push-asset
w-javed Oct 18, 2024
b12ef57
merge-conflict-fix
w-javed Oct 18, 2024
e070237
assets
w-javed Oct 18, 2024
66548f5
pushing asset
w-javed Oct 18, 2024
f05726c
merge-conflicts
w-javed Oct 19, 2024
1fe065b
remove-containt-on-key
w-javed Oct 19, 2024
82fd655
asset
w-javed Oct 19, 2024
6dddb96
asset2
w-javed Oct 19, 2024
c5fa8cf
asset3
w-javed Oct 19, 2024
74b3582
asset4
w-javed Oct 19, 2024
2e11e9d
adding conftest
w-javed Oct 20, 2024
651fc00
conftest
w-javed Oct 20, 2024
3ed59e8
cred fix
w-javed Oct 20, 2024
4031d46
asset-new
w-javed Oct 20, 2024
b5fc1c5
fix
w-javed Oct 20, 2024
24a52aa
asset
w-javed Oct 20, 2024
73e62c6
adding multi-modal-without-tests
w-javed Oct 20, 2024
c89b341
asset-from-main
w-javed Oct 20, 2024
b63910d
asset-from-main
w-javed Oct 20, 2024
ca4c3e6
fix
w-javed Oct 20, 2024
8b28458
adding one test only
w-javed Oct 20, 2024
1eb9304
new asset
w-javed Oct 20, 2024
dd53e67
tests,fix: Sanitizer should replace with enum value not enum name
kdestin Oct 21, 2024
9fcd7f8
test-asset
w-javed Oct 21, 2024
d64704d
[AutoRelease] t2-containerservicefleet-2024-09-24-42036(can only be m…
azure-sdk Oct 21, 2024
78b11c9
[AutoRelease] t2-dns-2024-09-25-81486(can only be merged by SDK owner…
azure-sdk Oct 21, 2024
6962ca2
[AutoRelease] t2-appconfiguration-2024-10-09-68726(can only be merged…
azure-sdk Oct 21, 2024
aa2bb08
code and test (#37855)
azure-sdk Oct 21, 2024
52f3784
[AutoRelease] t2-servicefabricmanagedclusters-2024-10-08-57405(can on…
azure-sdk Oct 21, 2024
09724d8
[AutoRelease] t2-containerinstance-2024-10-21-66631(can only be merge…
azure-sdk Oct 21, 2024
9f5f7f9
[sdk generation pipeline] bump typespec-python 0.36.1 (#38008)
msyyc Oct 21, 2024
45e049c
[AutoRelease] t2-dnsresolver-2024-10-12-16936(can only be merged by S…
azure-sdk Oct 21, 2024
617f8aa
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 21, 2024
706eea3
new asset after fix in conftest
w-javed Oct 21, 2024
bd08adf
asset
w-javed Oct 21, 2024
3c71da9
chore: Update assets.json
kdestin Oct 21, 2024
6d48318
Move perf pipelines to TME subscription (#38020)
azure-sdk Oct 21, 2024
14d4675
fix
w-javed Oct 21, 2024
d4b8272
after-comments
w-javed Oct 21, 2024
33e3075
fix
w-javed Oct 21, 2024
237443b
fix
w-javed Oct 21, 2024
525379e
asset
w-javed Oct 22, 2024
9603f21
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 22, 2024
af143e8
new asset with 1 test recording only
w-javed Oct 22, 2024
511e4b5
chore: Update assets.json
kdestin Oct 22, 2024
ac94148
conftest fix
w-javed Oct 23, 2024
4f445f6
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 23, 2024
dc7cb7d
assets change
w-javed Oct 23, 2024
48bec25
new test
w-javed Oct 23, 2024
307b4e4
few changes
w-javed Oct 23, 2024
20093f8
removing proxy start
w-javed Oct 23, 2024
e819a80
added all tests
w-javed Oct 24, 2024
6f1595e
merge with asset
w-javed Oct 24, 2024
60a823f
asset
w-javed Oct 24, 2024
b6334eb
fixes
w-javed Oct 25, 2024
9092831
Merge branch 'main' into multi-moodal-sdk-support-one-test
w-javed Oct 25, 2024
93ba2f0
fixes with asset
w-javed Oct 25, 2024
1be5ef1
asset-after-tax
w-javed Oct 25, 2024
78f8ec2
enabling 2 more tests
w-javed Oct 25, 2024
84d4eac
unit test fix
w-javed Oct 25, 2024
7a9eb2b
asset
w-javed Oct 25, 2024
30eba68
new asset
w-javed Oct 25, 2024
ca7cdfe
fixes per comments
w-javed Oct 25, 2024
9afeb5f
changes by black
w-javed Oct 25, 2024
dd67a01
merge fix
w-javed Oct 25, 2024
c5fb4f1
merge fix
w-javed Oct 25, 2024
1701076
pylint fix
w-javed Oct 25, 2024
9397aaf
merge conflict
w-javed Oct 25, 2024
a1d9be9
pylint fix
w-javed Oct 26, 2024
47ff5fd
ground test fix
w-javed Oct 26, 2024
a7689a2
fixes - pylint, black, mypy
w-javed Oct 26, 2024
9c09880
more tests
w-javed Oct 26, 2024
ebd21d3
docstring fixes
w-javed Oct 26, 2024
c9db879
merge conflict
w-javed Oct 26, 2024
9fdef18
doc string fix
w-javed Oct 26, 2024
058c37a
asset
w-javed Oct 26, 2024
1e15809
few updates after Nagkumar review
w-javed Oct 28, 2024
added all tests
w-javed committed Oct 24, 2024
commit e819a802d6931c9d1980cc993f6feb0c998ec59e
@@ -270,8 +270,8 @@ def _parse_content_harm_response(batch_response: List[Dict], metric_name: str) -

result: Dict[str, Union[str, float]] = {
(key.value if hasattr(key, 'value') else key): math.nan,
- f"{key}_score": math.nan,
- f"{key}_reason": ""
+ f"{(key.value if hasattr(key, 'value') else key)}_score": math.nan,
+ f"{(key.value if hasattr(key, 'value') else key)}_reason": math.nan
}

response = batch_response[0]
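The change above matters because formatting an Enum member with an f-string yields the member's qualified name, not its value, so the old code produced result keys like `EvaluationMetrics.VIOLENCE_score`. A minimal sketch of the behavior — the `Metric` enum here is a hypothetical stand-in for the SDK's metric enum:

```python
from enum import Enum

class Metric(Enum):  # hypothetical stand-in for EvaluationMetrics
    VIOLENCE = "violence"

key = Metric.VIOLENCE
# f"{key}" formats the member as "Metric.VIOLENCE", not "violence":
assert f"{key}_score" == "Metric.VIOLENCE_score"
# Unwrapping .value (guarded so plain-string keys still work) gives the intended key:
unwrapped = key.value if hasattr(key, "value") else key
assert f"{unwrapped}_score" == "violence_score"
```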
@@ -99,10 +99,10 @@ def __init__(
self._eval_last_turn = eval_last_turn
self._parallel = parallel
self._evaluators: List[Callable[..., Dict[str, Union[str, float]]]] = [
- ViolenceEvaluator(azure_ai_project, credential),
- SexualEvaluator(azure_ai_project, credential),
- SelfHarmEvaluator(azure_ai_project, credential),
- HateUnfairnessEvaluator(azure_ai_project, credential),
+ ViolenceEvaluator(credential, azure_ai_project),
+ SexualEvaluator(credential, azure_ai_project),
+ SelfHarmEvaluator(credential, azure_ai_project),
+ HateUnfairnessEvaluator(credential, azure_ai_project),
]
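The swap from `(azure_ai_project, credential)` to `(credential, azure_ai_project)` is purely positional, which is exactly the kind of breaking change that keyword arguments absorb. A sketch with a hypothetical evaluator class showing why keyword call sites survive such reorderings:

```python
class Evaluator:  # hypothetical stand-in for ViolenceEvaluator etc.
    def __init__(self, credential, azure_ai_project):
        self.credential = credential
        self.azure_ai_project = azure_ai_project

cred, project = "token-credential", {"subscription_id": "..."}
# Keyword arguments bind by name, so they are unaffected by a signature reorder:
ev = Evaluator(credential=cred, azure_ai_project=project)
assert ev.credential == cred and ev.azure_ai_project == project
```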

def __call__(self, *, conversation: list, **kwargs):
@@ -24,18 +24,21 @@
class ContentSafetyMultimodalEvaluator:
"""
Initialize a content safety multimodal evaluator configured to evaluate content safety metrics for multimodal scenario.

+ :param credential: The credential for connecting to Azure AI project. Required
+ :type credential: ~azure.core.credentials.TokenCredential
:param azure_ai_project: The scope of the Azure AI project.
It contains subscription id, resource group, and project name.
:type azure_ai_project: ~azure.ai.evaluation.AzureAIProject
:param parallel: If True, use parallel execution for evaluators. Else, use sequential execution.
Default is True.
:type parallel: bool
- :param credential: The credential for connecting to Azure AI project.
- :type credential: ~azure.core.credentials.TokenCredential

:return: A function that evaluates multimodal chat messages and generates metrics.
:rtype: Callable

**Usage**

.. code-block:: python
azure_ai_project = {
"subscription_id": "<subscription_id>",
@@ -85,13 +88,18 @@ class ContentSafetyMultimodalEvaluator:
}
"""

- def __init__(self, credential, azure_ai_project: dict, parallel: bool = False):
+ def __init__(
+     self,
+     credential,
+     azure_ai_project,
+     parallel: bool = False
+ ):
self._parallel = parallel
self._evaluators: List[Callable[..., Dict[str, Union[str, float]]]] = [
- ViolenceMultimodalEvaluator(azure_ai_project, credential),
- SexualMultimodalEvaluator(azure_ai_project, credential),
- SelfHarmMultimodalEvaluator(azure_ai_project, credential),
- HateUnfairnessMultimodalEvaluator(azure_ai_project, credential),
+ ViolenceMultimodalEvaluator(credential, azure_ai_project),
+ SexualMultimodalEvaluator(credential, azure_ai_project),
+ SelfHarmMultimodalEvaluator(credential, azure_ai_project),
+ HateUnfairnessMultimodalEvaluator(credential, azure_ai_project),
]

def __call__(
@@ -102,7 +110,7 @@ def __call__(
"""
Evaluates content-safety metrics for list of messages.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: Dict
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: The scores for messages.
:rtype: Dict
"""
@@ -208,7 +216,6 @@ def _validate_messages(self, messages):
blame=ErrorBlame.USER_ERROR,
)


def _get_harm_severity_level(self, harm_score: float) -> Union[HarmSeverityLevel, float]:
HARM_SEVERITY_LEVEL_MAPPING = {
HarmSeverityLevel.VeryLow: (0, 1),
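The mapping above is truncated by the collapsed diff; only the `VeryLow: (0, 1)` band is visible. A sketch of how such score-to-severity banding typically works — the bands beyond VeryLow are assumptions on a 0–7 scale, and `harm_severity_level` is a hypothetical reconstruction, not the SDK's method:

```python
import math

# Assumed inclusive bands on a 0-7 harm scale; only (0, 1) is confirmed above.
SEVERITY_BANDS = {"VeryLow": (0, 1), "Low": (2, 3), "Medium": (4, 5), "High": (6, 7)}

def harm_severity_level(harm_score: float):
    # Return the first band whose inclusive range contains the score.
    for level, (low, high) in SEVERITY_BANDS.items():
        if low <= harm_score <= high:
            return level
    return math.nan  # out-of-range scores fall through, mirroring the Union return type

assert harm_severity_level(0) == "VeryLow"
assert harm_severity_level(5) == "Medium"
assert math.isnan(harm_severity_level(-1))
```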
@@ -12,30 +12,36 @@ class ContentSafetyMultimodalEvaluatorBase(ABC):
"""
Initialize a evaluator for a specified Evaluation Metric. Base class that is not
meant to be instantiated by users.

:param metric: The metric to be evaluated.
:type metric: ~azure.ai.evaluation._evaluators._content_safety.flow.constants.EvaluationMetrics
+ :param credential: The credential for connecting to Azure AI project. Required
+ :type credential: ~azure.core.credentials.TokenCredential
:param azure_ai_project: The scope of the Azure AI project.
It contains subscription id, resource group, and project name.
:type azure_ai_project: ~azure.ai.evaluation.AzureAIProject
- :param credential: The credential for connecting to Azure AI project.
- :type credential: ~azure.core.credentials.TokenCredential
"""

def __init__(
self,
metric: Union[EvaluationMetrics, _InternalEvaluationMetrics],
- azure_ai_project: Dict,
- credential: TokenCredential
+ credential: TokenCredential,
+ azure_ai_project,
):
self._metric = metric
self._azure_ai_project = azure_ai_project
self._credential = credential

- async def __call__(self, *, messages, **kwargs):
+ async def __call__(
+     self,
+     *,
+     messages,
+     **kwargs
+ ):
"""
Evaluates content according to this evaluator's metric.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: List[Dict]
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: The evaluation score computation based on the Content Safety metric (self.metric).
:rtype: Any
"""
@@ -58,14 +58,23 @@ class HateUnfairnessMultimodalEvaluator:
}
"""

- def __init__(self, azure_ai_project: dict, credential=None) -> None:
-     self._async_evaluator = _AsyncHateUnfairnessMultimodalEvaluator(azure_ai_project, credential)
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
+     self._async_evaluator = _AsyncHateUnfairnessMultimodalEvaluator(credential, azure_ai_project)

- def __call__(self, *, messages, **kwargs):
+ def __call__(
+     self,
+     *,
+     messages,
+     **kwargs
+ ):
"""
Evaluates hate unfairness content.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: List[Dict]
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: The hate unfairness score.
:rtype: Dict
"""
@@ -76,12 +85,21 @@ def _to_async(self):


class _AsyncHateUnfairnessMultimodalEvaluator(ContentSafetyMultimodalEvaluatorBase):
- def __init__(self, azure_ai_project: dict, credential=None):
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
super().__init__(
metric=EvaluationMetrics.HATE_FAIRNESS,
- azure_ai_project=azure_ai_project,
credential=credential,
+ azure_ai_project=azure_ai_project,
)
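Note that reordering the `super().__init__` arguments above is cosmetic: they are passed by keyword, and keyword order never affects binding. A tiny illustration (the `init` function is hypothetical):

```python
def init(metric, credential, azure_ai_project):
    # Returns its bindings so the two call orders can be compared directly.
    return (metric, credential, azure_ai_project)

# Keyword-argument order is irrelevant at the call site; both calls bind identically:
assert init(metric="m", credential="c", azure_ai_project="p") == \
       init(metric="m", azure_ai_project="p", credential="c")
```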

- async def __call__(self, *, messages, **kwargs):
+ async def __call__(
+     self,
+     *,
+     messages,
+     **kwargs
+ ):
return await super().__call__(messages=messages, **kwargs)
@@ -18,6 +18,7 @@ class ProtectedMaterialMultimodalEvaluator:
:param azure_ai_project: The scope of the Azure AI project.
It contains subscription id, resource group, and project name.
:type azure_ai_project: ~azure.ai.evaluation.AzureAIProject

:return: Whether or not protected material was found in the response, with AI-generated reasoning.
:rtype: Dict[str, str]

@@ -61,15 +62,24 @@ class ProtectedMaterialMultimodalEvaluator:
}
"""

- def __init__(self, azure_ai_project: dict, credential=None):
-     self._async_evaluator = _AsyncProtectedMaterialMultimodalEvaluator(azure_ai_project, credential)
+ def __init__(
+     self,
+     credential,
+     azure_ai_project,
+ ):
+     self._async_evaluator = _AsyncProtectedMaterialMultimodalEvaluator(credential, azure_ai_project)

- def __call__(self, *, messages, **kwargs):
+ def __call__(
+     self,
+     *,
+     messages,
+     **kwargs
+ ):
"""
Evaluates protected materials content.

:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: List[Dict]
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: A dictionary containing a boolean label and reasoning.
:rtype: dict
"""
@@ -79,15 +89,19 @@ def _to_async(self):
return self._async_evaluator

class _AsyncProtectedMaterialMultimodalEvaluator:
- def __init__(self, azure_ai_project: dict, credential: None):
- self._azure_ai_project = azure_ai_project
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
self._credential = credential
+ self._azure_ai_project = azure_ai_project

async def __call__(self, *, messages, **kwargs):
"""
Evaluates content according to this evaluator's metric.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: List[Dict]
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: The evaluation score computation based on the Content Safety metric (self.metric).
:rtype: Any
"""
@@ -97,7 +111,7 @@ async def __call__(self, *, messages, **kwargs):
result = await evaluate_with_rai_service_multimodal(
messages=messages,
metric_name=EvaluationMetrics.PROTECTED_MATERIAL,
- project_scope=self._azure_ai_project,
credential=self._credential,
+ project_scope=self._azure_ai_project,
)
return result
@@ -11,12 +11,15 @@
class SelfHarmMultimodalEvaluator:
"""
Initialize a self harm multimodal evaluator for self harm score.

+ :param credential: The credential for connecting to Azure AI project. Required
+ :type credential: ~azure.core.credentials.TokenCredential
:param azure_ai_project: The scope of the Azure AI project.
It contains subscription id, resource group, and project name.
:type azure_ai_project: ~azure.ai.evaluation.AzureAIProject
- :param credential: The credential for connecting to Azure AI project.
- :type credential: ~azure.core.credentials.TokenCredential

**Usage**

.. code-block:: python
azure_ai_project = {
"subscription_id": "<subscription_id>",
@@ -56,14 +59,23 @@ class SelfHarmMultimodalEvaluator:
}
"""

- def __init__(self, azure_ai_project: dict, credential=None):
-     self._async_evaluator = _AsyncSelfHarmMultimodalEvaluator(azure_ai_project, credential)
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
+     self._async_evaluator = _AsyncSelfHarmMultimodalEvaluator(credential, azure_ai_project)

- def __call__(self, *, messages, **kwargs):
+ def __call__(
+     self,
+     *,
+     messages,
+     **kwargs
+ ):
"""
Evaluates self harm content.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: List[Dict]
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: The self harm score.
:rtype: Dict
"""
@@ -73,11 +85,15 @@ def _to_async(self):
return self._async_evaluator

class _AsyncSelfHarmMultimodalEvaluator(ContentSafetyMultimodalEvaluatorBase):
- def __init__(self, azure_ai_project: dict, credential=None):
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
super().__init__(
metric=EvaluationMetrics.SELF_HARM,
- azure_ai_project=azure_ai_project,
credential=credential,
+ azure_ai_project=azure_ai_project,
)
async def __call__(self, *, messages, **kwargs):
return await super().__call__(messages=messages, **kwargs)
@@ -12,13 +12,15 @@
class SexualMultimodalEvaluator:
"""
Initialize a sexual multimodal evaluator for sexual score.

+ :param credential: The credential for connecting to Azure AI project. Required
+ :type credential: ~azure.core.credentials.TokenCredential
:param azure_ai_project: The scope of the Azure AI project.
It contains subscription id, resource group, and project name.
:type azure_ai_project: ~azure.ai.evaluation.AzureAIProject
- :param credential: The credential for connecting to Azure AI project.
- :type credential: ~azure.core.credentials.TokenCredential

**Usage**

.. code-block:: python
azure_ai_project = {
"subscription_id": "<subscription_id>",
@@ -58,14 +60,23 @@ class SexualMultimodalEvaluator:
}
"""

- def __init__(self, azure_ai_project: dict, credential=None):
-     self._async_evaluator = _AsyncSexualMultimodalEvaluator(azure_ai_project, credential)
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
+     self._async_evaluator = _AsyncSexualMultimodalEvaluator(credential, azure_ai_project)

- def __call__(self, *, messages, **kwargs):
+ def __call__(
+     self,
+     *,
+     messages,
+     **kwargs
+ ):
"""
Evaluates sexual content.
:keyword messages: The messages to be evaluated. Each message should have "role" and "content" keys.
- :paramtype messages: List[Dict]
+ :paramtype messages: ~azure.ai.evaluation.Conversation
:return: The sexual score.
:rtype: Dict
"""
@@ -75,11 +86,15 @@ def _to_async(self):
return self._async_evaluator

class _AsyncSexualMultimodalEvaluator(ContentSafetyMultimodalEvaluatorBase):
- def __init__(self, azure_ai_project: dict, credential=None):
+ def __init__(
+     self,
+     credential,
+     azure_ai_project
+ ):
super().__init__(
metric=EvaluationMetrics.SEXUAL,
- azure_ai_project=azure_ai_project,
credential=credential,
+ azure_ai_project=azure_ai_project,
)
async def __call__(self, *, messages, **kwargs):
return await super().__call__(messages=messages, **kwargs)