Merged
Changes from 5 commits
20 commits
d4ff91e
🔧 (image.py): refactor create_image_content_dict function to return a…
Cristhianzl Sep 8, 2025
f2f708a
✅ (test_image_utils.py): update image content dict format for better …
Cristhianzl Sep 8, 2025
6f641dc
[autofix.ci] apply automated fixes
autofix-ci[bot] Sep 8, 2025
289c5bd
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
carlosrcoelho Sep 8, 2025
574f1ad
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
carlosrcoelho Sep 9, 2025
890f7ec
✨ (test_image_providers.py): add integration tests for image content …
Cristhianzl Sep 9, 2025
eda3ee2
Merge branch 'cz/fix-image-send-llms' of github.com:langflow-ai/langf…
Cristhianzl Sep 9, 2025
5c95f0d
[autofix.ci] apply automated fixes
autofix-ci[bot] Sep 9, 2025
eea5a6f
📝 (general-bugs-agent-images-playground.spec.ts): update file path fo…
Cristhianzl Sep 9, 2025
f175450
Merge branch 'cz/fix-image-send-llms' of github.com:langflow-ai/langf…
Cristhianzl Sep 9, 2025
f6a5070
🐛 (general-bugs-agent-images-playground.spec.ts): Fix bug in test cas…
Cristhianzl Sep 9, 2025
93b7d8e
🐛 (general-bugs-agent-images-playground.spec.ts): fix file path for i…
Cristhianzl Sep 9, 2025
625c842
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
Cristhianzl Sep 9, 2025
57a4e53
[autofix.ci] apply automated fixes
autofix-ci[bot] Sep 9, 2025
0029c54
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
Cristhianzl Sep 9, 2025
1cb099a
✅ (test_image_utils.py): update test_image_content_dict_format_compat…
Cristhianzl Sep 9, 2025
af73684
Merge branch 'cz/fix-image-send-llms' of github.com:langflow-ai/langf…
Cristhianzl Sep 9, 2025
b07b0dc
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
Cristhianzl Sep 9, 2025
1aaded2
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
jordanrfrazier Sep 11, 2025
74dde8b
Merge branch 'release-1.6.0' into cz/fix-image-send-llms
lucaseduoli Sep 11, 2025
2 changes: 1 addition & 1 deletion src/backend/base/langflow/utils/image.py
@@ -99,4 +99,4 @@ def create_image_content_dict(image_path: str | Path, mime_type: str | None = No
         msg = f"Failed to create image content dict: {e}"
         raise type(e)(msg) from e

-    return {"type": "image", "source_type": "url", "url": f"data:{mime_type};base64,{base64_data}"}
+    return {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{base64_data}"}}
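For context, here is a minimal sketch of a helper that produces the new return shape. It is not the Langflow source: the name `image_content_dict_sketch`, the use of the standard-library `mimetypes` module, and the exact wording of the error message after "Could not determine MIME type" are assumptions made for illustration. Only the returned dict mirrors this diff.

```python
from __future__ import annotations

import base64
import mimetypes
from pathlib import Path


def image_content_dict_sketch(image_path: str | Path, mime_type: str | None = None) -> dict:
    """Hypothetical stand-in for create_image_content_dict (not the real code)."""
    path = Path(image_path)
    if mime_type is None:
        # Assumption: the MIME type is guessed from the file extension.
        mime_type, _ = mimetypes.guess_type(path.name)
        if mime_type is None:
            msg = f"Could not determine MIME type for image: {path}"
            raise ValueError(msg)
    base64_data = base64.b64encode(path.read_bytes()).decode("utf-8")
    # New shape from this diff: an OpenAI-style "image_url" content block.
    return {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{base64_data}"}}
```

The old shape kept `url` at the top level next to `source_type`. The new one nests it under `image_url`, so callers that read `content_dict["url"]` would need to read `content_dict["image_url"]["url"]` instead.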
Collaborator

@Cristhianzl Doesn't this format vary by provider?

@coderabbitai review the format with respect to providers such as OpenAI, Anthropic, and Google Gen AI.

Contributor

for base64 images.

Assessment

Your change is correct for OpenAI compatibility, but it creates a provider-specific format issue. The new format works perfectly for OpenAI APIs but won't work for Anthropic or Google Gemini without additional transformation.

Recommendation

Since this appears to be a multimodal content function that might be used across different providers, consider:

  1. Provider-agnostic approach: Keep the function generic and add a separate transformation layer
  2. Provider-specific functions: Create separate functions for each provider's format
  3. Configuration-based: Add a provider parameter to return the appropriate format

The current change is correct if this function is specifically intended for OpenAI integration, but problematic if it needs to work with other providers.

Would you like me to help implement a more flexible solution that supports multiple providers?

🤔 This format difference across providers is quite interesting - each has their own approach to handling multimodal content!


✏️ Learnings added
Learnt from: edwinjosechittilappilly
PR: langflow-ai/langflow#9745
File: src/backend/base/langflow/utils/image.py:102-102
Timestamp: 2025-09-08T21:06:27.132Z
Learning: OpenAI API uses image content format: `{"type": "image_url", "image_url": {"url": "data:mime_type;base64,..."}}` for vision-enabled models like GPT-4, GPT-4o, and GPT-4.1.

Learnt from: edwinjosechittilappilly
PR: langflow-ai/langflow#9745
File: src/backend/base/langflow/utils/image.py:102-102
Timestamp: 2025-09-08T21:06:27.132Z
Learning: Anthropic Claude API uses image content format: `{"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": "..."}}` for base64 images and `{"type": "image", "source": {"type": "url", "url": "..."}}` for URL-based images.

Learnt from: edwinjosechittilappilly
PR: langflow-ai/langflow#9745
File: src/backend/base/langflow/utils/image.py:102-102
Timestamp: 2025-09-08T21:06:27.132Z
Learning: Google Gemini API uses image content format: `{"inline_data": {"mime_type": "image/jpeg", "data": "base64_data"}}` for base64-encoded images.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
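
The three formats recorded in the learnings above could be bridged by a small transformation layer, along the lines of the first recommendation. The sketch below is only an illustration: `split_data_url` and `to_provider_format` are hypothetical names, the provider keys are arbitrary strings, and the target shapes are copied from the learnings rather than checked against each provider's SDK. It would matter only where a block is handed to a provider API directly rather than through an integration layer that already converts it.

```python
from __future__ import annotations


def split_data_url(url: str) -> tuple[str, str]:
    """Split 'data:<mime>;base64,<payload>' into (mime, payload)."""
    header, sep, payload = url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        msg = "Expected a base64 data URL"
        raise ValueError(msg)
    return header[len("data:"):-len(";base64")], payload


def to_provider_format(block: dict, provider: str) -> dict:
    """Convert an OpenAI-style image_url block to another shape (hypothetical helper)."""
    mime_type, data = split_data_url(block["image_url"]["url"])
    if provider == "openai":
        # Already in the OpenAI shape; return it unchanged.
        return block
    if provider == "anthropic":
        return {"type": "image", "source": {"type": "base64", "media_type": mime_type, "data": data}}
    if provider == "gemini":
        return {"inline_data": {"mime_type": mime_type, "data": data}}
    msg = f"Unknown provider: {provider}"
    raise ValueError(msg)
```

Keeping `create_image_content_dict` on a single canonical shape and converting at the edge leaves the existing tests valid, at the cost of one extra parsing step per image.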

Member Author

@Cristhianzl Cristhianzl Sep 9, 2025

@edwinjosechittilappilly I don't think so.
I tested all the providers with Agent, LLM, and chat image.
The video shows the manual test with the providers switched: https://www.loom.com/share/37eda4d9939743ba8986eefb9e3f2cd8

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

157 changes: 148 additions & 9 deletions src/backend/tests/unit/utils/test_image_utils.py
@@ -75,23 +75,23 @@
 def test_create_image_content_dict_success(sample_image):
     """Test successful creation of image content dict."""
     content_dict = create_image_content_dict(sample_image)
-    assert content_dict["type"] == "image"
-    assert content_dict["source_type"] == "url"
-    assert "url" in content_dict
-    assert content_dict["url"].startswith("data:image/png;base64,")
+    assert content_dict["type"] == "image_url"
+    assert "image_url" in content_dict
+    assert "url" in content_dict["image_url"]
+    assert content_dict["image_url"]["url"].startswith("data:image/png;base64,")
     # Verify the base64 part is valid
-    base64_part = content_dict["url"].split(",")[1]
+    base64_part = content_dict["image_url"]["url"].split(",")[1]
     assert base64.b64decode(base64_part)


 def test_create_image_content_dict_with_custom_mime(sample_image):
     """Test creation of image content dict with custom MIME type."""
     custom_mime = "image/custom"
     content_dict = create_image_content_dict(sample_image, mime_type=custom_mime)
-    assert content_dict["type"] == "image"
-    assert content_dict["source_type"] == "url"
-    assert "url" in content_dict
-    assert content_dict["url"].startswith(f"data:{custom_mime};base64,")
+    assert content_dict["type"] == "image_url"
+    assert "image_url" in content_dict
+    assert "url" in content_dict["image_url"]
+    assert content_dict["image_url"]["url"].startswith(f"data:{custom_mime};base64,")


 def test_create_image_content_dict_invalid_file():
@@ -106,3 +106,142 @@
     invalid_file.touch()
     with pytest.raises(ValueError, match="Could not determine MIME type"):
         create_image_content_dict(invalid_file)


+def test_create_image_content_dict_format_compatibility(sample_image):
+    """Test that the image content dict format is compatible with different LLM providers."""
+    content_dict = create_image_content_dict(sample_image)
+
+    # Test the new format structure that should work with Google/Gemini
+    assert content_dict["type"] == "image_url"
+    assert "image_url" in content_dict
+    assert isinstance(content_dict["image_url"], dict)
+    assert "url" in content_dict["image_url"]
+
+    # Test that the URL is a valid data URL
+    url = content_dict["image_url"]["url"]
+    assert url.startswith("data:")
+    assert ";base64," in url
+
+    # Verify the structure matches OpenAI's expected format
+    # OpenAI expects: {"type": "image_url", "image_url": {"url": "data:..."}}
+    assert all(key in ["type", "image_url"] for key in content_dict.keys())

Check failure on line 128 in src/backend/tests/unit/utils/test_image_utils.py
GitHub Actions / Ruff Style Check (3.13)
Ruff (SIM118)
src/backend/tests/unit/utils/test_image_utils.py:128:49: SIM118 Use `key in dict` instead of `key in dict.keys()`
+    assert all(key in ["url"] for key in content_dict["image_url"].keys())

Check failure on line 129 in src/backend/tests/unit/utils/test_image_utils.py
GitHub Actions / Ruff Style Check (3.13)
Ruff (SIM118)
src/backend/tests/unit/utils/test_image_utils.py:129:35: SIM118 Use `key in dict` instead of `key in dict.keys()`


+def test_image_content_dict_google_gemini_compatibility(sample_image):
+    """Test that the format resolves the original Gemini error."""
+    content_dict = create_image_content_dict(sample_image)
+
+    # The original error was: "Unrecognized message part type: image"
+    # This should now be "image_url" which Gemini supports
+    assert content_dict["type"] == "image_url"
+
+    # Gemini should accept this format without the "source_type" field
+    # that was causing issues in the old format
+    assert "source_type" not in content_dict
+
+    # The nested structure should match what Gemini expects
+    assert "image_url" in content_dict
+    assert "url" in content_dict["image_url"]


+def test_image_content_dict_openai_compatibility(sample_image):
+    """Test compatibility with OpenAI's expected image format."""
+    content_dict = create_image_content_dict(sample_image)
+
+    # OpenAI Vision API expects exactly this structure:
+    # {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
+    expected_keys = {"type", "image_url"}
+    assert set(content_dict.keys()) == expected_keys
+
+    assert content_dict["type"] == "image_url"
+    assert isinstance(content_dict["image_url"], dict)
+    assert "url" in content_dict["image_url"]
+
+    # OpenAI accepts data URLs with base64 encoding
+    url = content_dict["image_url"]["url"]
+    assert url.startswith("data:image/")
+    assert ";base64," in url


+def test_image_content_dict_anthropic_compatibility(sample_image):
+    """Test compatibility with Anthropic's expected image format."""
+    content_dict = create_image_content_dict(sample_image)
+
+    # Anthropic Claude also uses the image_url format for vision
+    # This format should be compatible
+    assert content_dict["type"] == "image_url"
+    assert "image_url" in content_dict
+
+    # Anthropic accepts base64 data URLs
+    url = content_dict["image_url"]["url"]
+    assert url.startswith("data:")
+    assert "base64" in url


+def test_image_content_dict_langchain_message_compatibility(sample_image):
+    """Test that the format integrates well with LangChain message structures."""
+    content_dict = create_image_content_dict(sample_image)
+
+    # Simulate how this would be used in a LangChain message
+    message_content = [{"type": "text", "text": "What do you see in this image?"}, content_dict]
+
+    # Verify the message structure is valid
+    text_part = message_content[0]
+    image_part = message_content[1]
+
+    assert text_part["type"] == "text"
+    assert image_part["type"] == "image_url"
+    assert "image_url" in image_part
+    assert "url" in image_part["image_url"]


+def test_image_content_dict_no_legacy_fields(sample_image):
+    """Test that legacy fields that caused issues are not present."""
+    content_dict = create_image_content_dict(sample_image)
+
+    # These fields from the old format should not be present
+    # as they caused compatibility issues with some providers
+    legacy_fields = ["source_type", "source", "media_type"]
+
+    for field in legacy_fields:
+        assert field not in content_dict, f"Legacy field '{field}' should not be present"
+        assert field not in content_dict.get("image_url", {}), f"Legacy field '{field}' should not be in image_url"


+def test_image_content_dict_multiple_formats(sample_image, tmp_path):

Check failure on line 213 in src/backend/tests/unit/utils/test_image_utils.py
GitHub Actions / Ruff Style Check (3.13)
Ruff (ARG001)
src/backend/tests/unit/utils/test_image_utils.py:213:46: ARG001 Unused function argument: `sample_image`

+    """Test that the format works consistently across different image types."""
+    # Test with different image formats
+    formats_to_test = [
+        ("test.png", "image/png"),
+        ("test.jpg", "image/jpeg"),
+        ("test.gif", "image/gif"),
+        ("test.webp", "image/webp"),
+    ]
+
+    # Use the same image content for all formats (the test PNG data)
+    image_content = base64.b64decode(
+        "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAACklEQVR4nGMAAQAABQABDQottAAAAABJRU5ErkJggg=="
+    )
+
+    for filename, expected_mime in formats_to_test:
+        image_path = tmp_path / filename
+        image_path.write_bytes(image_content)
+
+        try:
+            content_dict = create_image_content_dict(image_path)
+
+            # All formats should produce the same structure
+            assert content_dict["type"] == "image_url"
+            assert "image_url" in content_dict
+            assert "url" in content_dict["image_url"]
+
+            # The MIME type should be detected correctly
+            url = content_dict["image_url"]["url"]
+            assert url.startswith(f"data:{expected_mime};base64,")
+
+        except ValueError as e:
+            # Some formats might not be supported, which is fine
+            if "Could not determine MIME type" not in str(e):
+                raise
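
The loop above writes the same PNG bytes under four file names and expects a different MIME type for each, which can only hold if detection is driven by the file extension rather than the file content. A minimal sketch of that behavior follows, assuming the standard-library `mimetypes` module. The PR does not show how `create_image_content_dict` detects the type, and `guess_image_mime` is a hypothetical name.

```python
from __future__ import annotations

import mimetypes


def guess_image_mime(filename: str) -> str | None:
    """Guess a MIME type from the file name alone (hypothetical helper)."""
    # guess_type returns (type, encoding); only the type is needed here.
    mime_type, _ = mimetypes.guess_type(filename)
    return mime_type


for name in ["test.png", "test.jpg", "test.gif", "test.webp", "no_extension"]:
    print(name, "->", guess_image_mime(name))
```

Whether `.webp` is recognized can depend on the Python version and the platform's MIME tables, which may be why the test tolerates a `ValueError` for formats that cannot be detected.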