Smolagents: support ImageContent and AudioContent #61

HSGamer · 2025-06-25T17:18:19Z

Smolagents already has a mechanism to dynamically handle the type of a tool result (https://github.com/huggingface/smolagents/blob/fc73322658a2c261cf59d817660c6c88d510431b/src/smolagents/agent_types.py#L262-L280). It supports text as str, images as PIL.Image, and audio as Tensor.

This PR converts the tool result content so that it matches those supported types.

This is useful for ToolCallingAgent since it supports all types of content. It's not useful for CodeAgent at the moment since it only supports Text, but this can be a preparation for when CodeAgent is upgraded to support all types.
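To make the mapping described above concrete, here is a minimal sketch of how a tool result could be turned into str, PIL.Image, or Tensor. It is not the code from this PR: SketchContent and convert_content are hypothetical names, the dataclass only imitates the shape of an MCP content item (a type tag plus text or a base64 payload), and it assumes torchaudio.load accepts a file-like object with the installed audio backend.

```python
import base64
from dataclasses import dataclass
from io import BytesIO
from typing import Any


@dataclass
class SketchContent:
    """Hypothetical stand-in for one MCP content item (not the real mcp.types classes)."""

    type: str  # "text", "image" or "audio"
    text: str = ""  # set when type == "text"
    data: str = ""  # base64-encoded payload when type is "image" or "audio"


def convert_content(content: SketchContent) -> Any:
    """Map one content item to a Python object an agent can consume."""
    if content.type == "text":
        return content.text

    if content.type == "image":
        # Pillow is optional, so it is imported only when an image actually arrives.
        from PIL import Image

        return Image.open(BytesIO(base64.b64decode(content.data)))

    if content.type == "audio":
        # torchaudio is optional too. Assumption: load() can read a file-like
        # object with the installed backend; it returns (waveform, sample_rate).
        import torchaudio

        waveform, _sample_rate = torchaudio.load(BytesIO(base64.b64decode(content.data)))
        return waveform

    raise ValueError(f"Unsupported content type: {content.type!r}")
```

Importing the optional packages inside each branch means a text-only setup never needs Pillow or torchaudio installed.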

grll · 2025-06-26T14:13:47Z

Hey @HSGamer, thanks a lot for the contribution. It's a very interesting feature that we wanted to add. I will have a look as soon as possible!

grll · 2025-07-06T20:40:30Z

src/mcpadapt/smolagents_adapter.py

                self.skip_forward_signature_validation = True

-            def forward(self, *args, **kwargs) -> str:
+            def forward(self, *args, **kwargs):


Maybe we could annotate the return type here as image, audio, or text.

Since Pillow and torchaudio are optional, I'm not sure that this won't throw an error if the packages are not installed.
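One common way to address this concern is to import the optional packages only for the type checker and to write the return annotation as a string, so that nothing is imported at runtime. The sketch below illustrates that pattern under stated assumptions: SketchTool is a hypothetical stand-in for the adapter's tool class, and the annotation shown is not necessarily the one that was finally merged.

```python
from typing import TYPE_CHECKING, Any, Union

if TYPE_CHECKING:
    # Evaluated only by static type checkers such as mypy, never at runtime,
    # so a missing Pillow or torch installation cannot raise ImportError here.
    import torch
    from PIL.Image import Image as PILImage


class SketchTool:
    """Hypothetical stand-in for the adapter's tool class."""

    def forward(self, *args: Any, **kwargs: Any) -> "Union[str, PILImage, torch.Tensor]":
        # The quoted annotation is stored as a plain string and is not evaluated
        # unless something explicitly resolves it (e.g. typing.get_type_hints).
        return "text result"
```

Code that resolves annotations at runtime would still need the packages installed, so this only protects the plain import and call path.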

grll

Overall I think it's a very good change. I think it's great that it uses PIL and torchaudio, as Smolagents does. Sorry for taking so long to review. Could you please add tests and better documentation around the extras to install?

grll · 2025-07-13T16:58:03Z

@HSGamer we are almost there. Lint and tests are failing, though. Could we also simplify the tests so that they don't create the audio or the image every time, but instead create it once and commit it as a file in tests/data, for example? I think this could cut the test code by a significant amount. Thanks a lot for this!

HSGamer · 2025-07-13T18:36:25Z

@grll I think I fixed the failing tests, at least make lint and make tests work on my computer. About the files, I created the sample files and added a module to load them. Can you check again?
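As an illustration of the "commit the sample once and load it" approach discussed here, a small loader could look like the sketch below. The function name load_sample_base64 and the file layout are hypothetical; the actual module added in this PR is not shown in the thread.

```python
import base64
from pathlib import Path


def load_sample_base64(data_dir: Path, filename: str) -> str:
    """Read a committed sample file and return its bytes base64-encoded as ASCII text."""
    return base64.b64encode((data_dir / filename).read_bytes()).decode("ascii")
```

In a pytest test, the data_dir argument could come from a fixture such as shared_datadir from pytest-datadir, which the later commits mention using for the sample files.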

grll · 2025-07-13T20:57:26Z

Thanks for the changes! For some reason the tests still fail in CI. I will have a look tomorrow.

grll

Fixed the typing and the failing tests. Many thanks for bringing this feature into mcpadapt, @HSGamer.

* Smolagents: support ImageContent and AudioContent
* update uv lock
* add audio to test
* change the command in audio
* add a note about audio package in smolagents docs
* add test_image
* add test_audio
* add sample files
* use pytest-datadir to get the sample files
* assert to make sure the right image size
* Any result-type
* add an audio backend via soundfile package to make tests work
* fix missing pytest fixture for the datadir
* fix mypy warning no stubs
* fix format
* improve typing of the forward function
* fix type definition

---------

Co-authored-by: Guillaume Raille <[email protected]>

Smolagents: support ImageContent and AudioContent

bab3350

HSGamer added 2 commits July 5, 2025 16:17

Merge remote-tracking branch 'upstream/main'

d48f553

# Conflicts: # uv.lock

update uv lock

c1fbdba

grll reviewed Jul 6, 2025

grll requested changes Jul 6, 2025

HSGamer added 5 commits July 8, 2025 10:24

add audio to test

1ea11ac

change the command in audio

c4f1619

add a note about audio package in smolagents docs

91c7d01

add test_image

d6855a3

add test_audio

be83e12

HSGamer requested a review from grll July 9, 2025 02:43

HSGamer added 2 commits July 14, 2025 01:16

add sample files

8ee0e32

use pytest-datadir to get the sample files

4090e44

HSGamer added 2 commits July 14, 2025 01:38

assert to make sure the right image size

06a42a0

Any result-type

39e0d3e

grll added 6 commits July 14, 2025 08:23

add an audio backend via soundfile package to make tests work

21670f7

fix missing pytest fixture for the datadir

bf3d63c

fix mypy warning no stubs

53fe4b2

fix format

ef3559f

improve typing of the forward function

cbbfd3b

fix type definition

626b39a

grll approved these changes Jul 14, 2025

grll merged commit a75387f into grll:main Jul 14, 2025
3 checks passed

grll mentioned this pull request Jul 14, 2025

MCPAdapt added support for ImageContent and AudioContent huggingface/smolagents#1554

Closed
