Merged
50 commits
c572e27
Draft ollama test
Vasilije1990 Feb 19, 2025
2de364b
Ollama test end to end
Vasilije1990 Feb 20, 2025
8ebd9a7
Ollama test end to end
Vasilije1990 Feb 20, 2025
9d0d96e
Fix ollama
Vasilije1990 Feb 21, 2025
b4088be
Fix ollama
Vasilije1990 Feb 21, 2025
b670697
Fix ollama
Vasilije1990 Feb 21, 2025
bfe039d
Fix ollama
Vasilije1990 Feb 21, 2025
6bc4f6a
Fix ollama
Vasilije1990 Feb 21, 2025
96adcfb
Fix ollama
Vasilije1990 Feb 21, 2025
c06c28d
Fix ollama
Vasilije1990 Feb 21, 2025
edd681f
Fix ollama
Vasilije1990 Feb 21, 2025
02b0109
Fix ollama
Vasilije1990 Feb 21, 2025
a91e83e
Fix ollama
Vasilije1990 Feb 21, 2025
326c418
Fix ollama
Vasilije1990 Feb 21, 2025
92602aa
Fix ollama
Vasilije1990 Feb 22, 2025
f2d0909
Fix ollama
Vasilije1990 Feb 22, 2025
97465f1
Fix ollama
Vasilije1990 Feb 22, 2025
73662b8
Fix ollama
Vasilije1990 Feb 22, 2025
90d96aa
Fix ollama
Vasilije1990 Feb 22, 2025
3a88b94
Fix ollama
Vasilije1990 Feb 22, 2025
11442df
Fix ollama
Vasilije1990 Feb 22, 2025
1dfb0dd
Fix ollama
Vasilije1990 Feb 22, 2025
4c4723b
Fix ollama
Vasilije1990 Feb 22, 2025
846c45e
Fix ollama
Vasilije1990 Feb 22, 2025
2c0bfc8
Fix ollama
Vasilije1990 Feb 22, 2025
91512cd
Merge branch 'dev' into COG-1368
Vasilije1990 Feb 22, 2025
5c7b4a5
Ruff it.
soobrosa Feb 25, 2025
7a85e71
Merge branch 'dev' into COG-1368
soobrosa Feb 25, 2025
0bba1f8
Response model fun.
soobrosa Feb 25, 2025
061fbbd
OpenAI mode.
soobrosa Feb 25, 2025
0ed6aa6
Typo.
soobrosa Feb 25, 2025
3090333
Add a call, homogenous localhost.
soobrosa Feb 25, 2025
80ccf55
Should conform more.
soobrosa Feb 25, 2025
ce8c2da
Unset, my friend, unset.
soobrosa Feb 25, 2025
65927b3
Update test_ollama.yml
Vasilije1990 Feb 25, 2025
6463c2e
Update test_ollama.yml
Vasilije1990 Feb 25, 2025
70f9b5f
Update test_ollama.yml
Vasilije1990 Feb 25, 2025
468268c
Docker Composish way.
soobrosa Feb 26, 2025
c72b12d
Let's be Pydantic.
soobrosa Feb 26, 2025
c224556
Launch Docker manually.
soobrosa Feb 26, 2025
ec9bbca
Cosmetics.
soobrosa Feb 26, 2025
cabbfd6
Maybe we could fly without the Hugger.
soobrosa Feb 26, 2025
c329cef
OHMY.
soobrosa Feb 26, 2025
44f02df
Async it.
soobrosa Feb 26, 2025
6b49078
Response model.
soobrosa Feb 26, 2025
fe7da60
Will graph fly.
soobrosa Feb 26, 2025
a77655a
Oops, putting back create transcript.
soobrosa Feb 26, 2025
cfc93e3
Clean up adapter.
soobrosa Feb 26, 2025
647d872
Phi4 can respond reasonably.
soobrosa Feb 27, 2025
01bb8cb
Beefy runner.
soobrosa Feb 28, 2025
33 changes: 33 additions & 0 deletions .github/workflows/test_gemini.yml
@@ -0,0 +1,33 @@
name: test | gemini

on:
  workflow_dispatch:
  pull_request:
    types: [labeled, synchronize]


concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  run_simple_example_test:
    uses: ./.github/workflows/reusable_python_example.yml
    with:
      example-location: ./examples/python/simple_example.py
    secrets:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
Member

Indentation is not consistent.

      GRAPHISTRY_USERNAME: ${{ secrets.GRAPHISTRY_USERNAME }}
      GRAPHISTRY_PASSWORD: ${{ secrets.GRAPHISTRY_PASSWORD }}
      EMBEDDING_PROVIDER: "gemini"
      EMBEDDING_API_KEY: ${{ secrets.GEMINI_API_KEY }}
      EMBEDDING_MODEL: "gemini/text-embedding-004"
      EMBEDDING_ENDPOINT: "https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004"
      EMBEDDING_API_VERSION: "v1beta"
      EMBEDDING_DIMENSIONS: 768
      EMBEDDING_MAX_TOKENS: 8076
      LLM_PROVIDER: "gemini"
      LLM_API_KEY: ${{ secrets.GEMINI_API_KEY }}
      LLM_MODEL: "gemini/gemini-1.5-flash"
      LLM_ENDPOINT: "https://generativelanguage.googleapis.com/"
      LLM_API_VERSION: "v1beta"
Comment on lines +18 to +33
Contributor

⚠️ Potential issue

Secrets Mismatch with Reusable Workflow

The job passes several secrets that are not defined in the reusable workflow (reusable_python_example.yml). According to the static analysis hints, only OPENAI_API_KEY, GRAPHISTRY_USERNAME, GRAPHISTRY_PASSWORD, and LLM_API_KEY are expected, yet the configuration includes additional secrets such as:

  • EMBEDDING_PROVIDER
  • EMBEDDING_API_KEY
  • EMBEDDING_MODEL
  • EMBEDDING_ENDPOINT
  • EMBEDDING_API_VERSION
  • EMBEDDING_DIMENSIONS
  • EMBEDDING_MAX_TOKENS
  • LLM_PROVIDER
  • LLM_MODEL
  • LLM_ENDPOINT
  • LLM_API_VERSION

Please either update the reusable workflow file to accept these additional secrets (if they are necessary for the workflow’s operation) or remove them from here to avoid potential configuration issues.
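
For reference, a minimal sketch of how reusable_python_example.yml could declare these values under its workflow_call trigger (the names mirror the ones passed above; the reusable workflow's actual contents are not part of this diff, so whether it needs them is an assumption):

# .github/workflows/reusable_python_example.yml -- sketch only, not the actual file
on:
  workflow_call:
    inputs:
      example-location:
        required: true
        type: string
    secrets:
      OPENAI_API_KEY:
        required: true
      LLM_API_KEY:
        required: false
      EMBEDDING_API_KEY:
        required: false
      # ...the remaining EMBEDDING_* / LLM_* entries would be declared the same way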

🧰 Tools
🪛 actionlint (1.7.4)

22-22: secret "EMBEDDING_PROVIDER" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


23-23: secret "EMBEDDING_API_KEY" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


24-24: secret "EMBEDDING_MODEL" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


25-25: secret "EMBEDDING_ENDPOINT" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


26-26: secret "EMBEDDING_API_VERSION" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


27-27: secret "EMBEDDING_DIMENSIONS" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


28-28: secret "EMBEDDING_MAX_TOKENS" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


29-29: secret "LLM_PROVIDER" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


31-31: secret "LLM_MODEL" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


32-32: secret "LLM_ENDPOINT" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)


33-33: secret "LLM_API_VERSION" is not defined in "./.github/workflows/reusable_python_example.yml" reusable workflow. defined secrets are "GRAPHISTRY_PASSWORD", "GRAPHISTRY_USERNAME", "LLM_API_KEY", "OPENAI_API_KEY"

(workflow-call)

116 changes: 116 additions & 0 deletions .github/workflows/test_ollama.yml
@@ -0,0 +1,116 @@
name: test | ollama

on:
  workflow_dispatch:
  pull_request:
    types: [ labeled, synchronize ]

jobs:

  run_simple_example_test:

    # needs 16 Gb RAM for phi4
    runs-on: buildjet-4vcpu-ubuntu-2204
    # services:
    #   ollama:
    #     image: ollama/ollama
    #     ports:
    #       - 11434:11434

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12.x'

      - name: Install Poetry
        uses: snok/[email protected]
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true

      - name: Install dependencies
        run: |
          poetry install --no-interaction --all-extras
          poetry add torch

      # - name: Install ollama
      #   run: curl -fsSL https://ollama.com/install.sh | sh
      # - name: Run ollama
      #   run: |
      #     ollama serve --openai &
      #     ollama pull llama3.2 &
      #     ollama pull avr/sfr-embedding-mistral:latest

      - name: Start Ollama container
        run: |
          docker run -d --name ollama -p 11434:11434 ollama/ollama
          sleep 5
          docker exec -d ollama bash -c "ollama serve --openai"

      - name: Check Ollama logs
        run: docker logs ollama

      - name: Wait for Ollama to be ready
        run: |
          for i in {1..30}; do
            if curl -s http://localhost:11434/v1/models > /dev/null; then
              echo "Ollama is ready"
              exit 0
            fi
            echo "Waiting for Ollama... attempt $i"
            sleep 2
          done
          echo "Ollama failed to start"
          exit 1

      - name: Pull required Ollama models
        run: |
          curl -X POST http://localhost:11434/api/pull -d '{"name": "phi4"}'
          curl -X POST http://localhost:11434/api/pull -d '{"name": "avr/sfr-embedding-mistral:latest"}'

      - name: Call ollama API
        run: |
          curl -X POST http://localhost:11434/v1/chat/completions \
            -H "Content-Type: application/json" \
            -d '{
              "model": "phi4",
              "stream": false,
              "messages": [
                { "role": "system", "content": "You are a helpful assistant." },
                { "role": "user", "content": "Whatever I say, answer with Yes." }
              ]
            }'
          curl -X POST http://127.0.0.1:11434/v1/embeddings \
            -H "Content-Type: application/json" \
            -d '{
              "model": "avr/sfr-embedding-mistral:latest",
              "input": "This is a test sentence to generate an embedding."
            }'

      - name: Dump Docker logs
        run: |
          docker ps
          docker logs $(docker ps --filter "ancestor=ollama/ollama" --format "{{.ID}}")


      - name: Run example test
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GRAPHISTRY_USERNAME: ${{ secrets.GRAPHISTRY_USERNAME }}
          GRAPHISTRY_PASSWORD: ${{ secrets.GRAPHISTRY_PASSWORD }}
          PYTHONFAULTHANDLER: 1
          LLM_PROVIDER: "ollama"
          LLM_API_KEY: "ollama"
          LLM_ENDPOINT: "http://localhost:11434/v1/"
          LLM_MODEL: "phi4"
          EMBEDDING_PROVIDER: "ollama"
          EMBEDDING_MODEL: "avr/sfr-embedding-mistral:latest"
          EMBEDDING_ENDPOINT: "http://localhost:11434/v1/"
          EMBEDDING_DIMENSIONS: "4096"
          HUGGINGFACE_TOKENIZER: "Salesforce/SFR-Embedding-Mistral"
        run: poetry run python ./examples/python/simple_example.py
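
For local debugging, roughly the same environment can be reproduced with a shell sketch like the one below (the docker and curl commands mirror the workflow steps above; exporting the variables before running the example is an assumption about how simple_example.py reads its configuration):

# sketch: run the same test locally (assumes Docker and Poetry are installed)
docker run -d --name ollama -p 11434:11434 ollama/ollama
curl -X POST http://localhost:11434/api/pull -d '{"name": "phi4"}'
curl -X POST http://localhost:11434/api/pull -d '{"name": "avr/sfr-embedding-mistral:latest"}'

export LLM_PROVIDER="ollama" LLM_API_KEY="ollama" LLM_MODEL="phi4"
export LLM_ENDPOINT="http://localhost:11434/v1/"
export EMBEDDING_PROVIDER="ollama" EMBEDDING_MODEL="avr/sfr-embedding-mistral:latest"
export EMBEDDING_ENDPOINT="http://localhost:11434/v1/" EMBEDDING_DIMENSIONS="4096"
export HUGGINGFACE_TOKENIZER="Salesforce/SFR-Embedding-Mistral"

poetry run python ./examples/python/simple_example.py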
23 changes: 22 additions & 1 deletion .github/workflows/upgrade_deps.yml
@@ -2,8 +2,29 @@ name: Update Poetry Dependencies

on:
  schedule:
    - cron: '0 3 * * 0'
    - cron: '0 3 * * 0' # Runs at 3 AM every Sunday
  push:
    paths:
      - 'poetry.lock'
      - 'pyproject.toml'
    branches:
      - main
      - dev
  pull_request:
    paths:
      - 'poetry.lock'
      - 'pyproject.toml'
    types: [opened, synchronize, reopened]
    branches:
      - main
      - dev
  workflow_dispatch:
    inputs:
      debug_enabled:
        type: boolean
        description: 'Run the update with debug logging'
        required: false
        default: false

jobs:
  update-dependencies:
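
The new workflow_dispatch input can be exercised from the command line, for example with the GitHub CLI (a hedged example; how debug_enabled is consumed inside the job is not shown in this diff):

gh workflow run upgrade_deps.yml -f debug_enabled=true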
53 changes: 53 additions & 0 deletions cognee/infrastructure/llm/ollama/adapter.py
@@ -4,6 +4,8 @@
from cognee.infrastructure.llm.llm_interface import LLMInterface
from cognee.infrastructure.llm.config import get_llm_config
from openai import OpenAI
import base64
import os


class OllamaAPIAdapter(LLMInterface):
@@ -42,3 +44,54 @@ async def acreate_structured_output(
        )

        return response

    def create_transcript(self, input_file: str) -> str:
        """Generate an audio transcript from a user query."""

        if not os.path.isfile(input_file):
            raise FileNotFoundError(f"The file {input_file} does not exist.")

        with open(input_file, "rb") as audio_file:
            transcription = self.aclient.audio.transcriptions.create(
                model="whisper-1",  # Ensure the correct model for transcription
                file=audio_file,
                language="en",
            )

        # Ensure the response contains a valid transcript
        if not hasattr(transcription, "text"):
            raise ValueError("Transcription failed. No text returned.")

        return transcription.text
Comment on lines +48 to +65
Contributor

💡 Verification agent

⚠️ Potential issue

🧩 Analysis chain

Fix issues in the audio transcription implementation.

This implementation has multiple critical issues:

  1. It uses self.aclient instead of self.client
  2. Based on previous reviews, Ollama may not support OpenAI's audio transcription API
  3. There's no error handling for the API call
 def create_transcript(self, input_file: str) -> str:
     """Generate an audio transcript from a user query."""

     if not os.path.isfile(input_file):
         raise FileNotFoundError(f"The file {input_file} does not exist.")

-    with open(input_file, "rb") as audio_file:
-        transcription = self.aclient.audio.transcriptions.create(
-            model="whisper-1",  # Ensure the correct model for transcription
-            file=audio_file,
-            language="en",
-        )
-
-    # Ensure the response contains a valid transcript
-    if not hasattr(transcription, "text"):
-        raise ValueError("Transcription failed. No text returned.")
-
-    return transcription.text
+    try:
+        with open(input_file, "rb") as audio_file:
+            transcription = self.client.audio.transcriptions.create(
+                model="whisper-1",  # Ensure the correct model for transcription
+                file=audio_file,
+                language="en",
+                max_retries=self.MAX_RETRIES,
+            )
+
+        # Ensure the response contains a valid transcript
+        if not hasattr(transcription, "text"):
+            raise ValueError("Transcription failed. No text returned.")
+
+        return transcription.text
+    except Exception as e:
+        raise Exception(f"Error transcribing audio: {str(e)}")

🌐 Web query:

Does Ollama support OpenAI-compatible audio transcription API?

💡 Result:

Ollama does not natively support OpenAI-compatible audio transcription APIs like Whisper. However, developers can combine Ollama with OpenAI's Whisper API or third-party tools to achieve this functionality[1][3][7]. Here's how it works:

  1. Audio Processing Workflow
    Typical implementations use OpenAI's Whisper for speech-to-text:

    // Transcribe audio using OpenAI's API
    const transcription = await openai.audio.transcriptions.create({
      model: 'whisper-1',
      file: audioFile,
    });

    Then pass the text to Ollama for LLM processing[1][4].

  2. Ollama's OpenAI Compatibility
    While Ollama supports OpenAI-style chat completions API for text generation[2][5][8]:

    // Using Ollama with OpenAI-compatible chat API
    const response = await generateText({
      model: ollama('llama3.1'),
      prompt: transcribedText,
    });

    This compatibility doesn't extend to audio endpoints[2][5].

  3. Alternative Implementations
    Some projects combine offline tools with Ollama:

    • WhisperLive for real-time transcription[7]
    • Ollama-Voice stack (Whisper + Ollama + pyttsx3)[10]
    • Custom integrations using Nvidia GPUs[3]

For direct audio API support, developers currently need to use OpenAI's services or third-party solutions alongside Ollama's text capabilities[1][3][7]. The native Ollama API focuses on text generation through its OpenAI-compatible chat endpoints[2][5][8].

Citations:


Critical: Audio Transcription Implementation Requires Revision

The recent changes correctly update the client reference (from self.aclient to self.client) and add error handling with a retry mechanism. However, the core issue still remains: Ollama does not natively support OpenAI’s audio transcription API (e.g., the Whisper endpoint). To ensure reliable functionality, please consider one of the following improvements:

  • Integrate a Supported Transcription Service: If audio transcription is required, use a dedicated service (such as OpenAI’s Whisper API or another third-party solution) rather than relying on Ollama’s endpoints.
  • Refactor the Workflow: Separate the transcription step from the LLM workflow. This will help avoid confusion and prevent the use of unsupported API calls in the Ollama adapter.

Affected Code Location:

  • cognee/infrastructure/llm/ollama/adapter.py (lines 52-69)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    def create_transcript(self, input_file: str) -> str:
        """Generate an audio transcript from a user query."""
        if not os.path.isfile(input_file):
            raise FileNotFoundError(f"The file {input_file} does not exist.")
        with open(input_file, "rb") as audio_file:
            transcription = self.aclient.audio.transcriptions.create(
                model="whisper-1",  # Ensure the correct model for transcription
                file=audio_file,
                language="en",
            )
        # Ensure the response contains a valid transcript
        if not hasattr(transcription, "text"):
            raise ValueError("Transcription failed. No text returned.")
        return transcription.text
    def create_transcript(self, input_file: str) -> str:
        """Generate an audio transcript from a user query."""
        if not os.path.isfile(input_file):
            raise FileNotFoundError(f"The file {input_file} does not exist.")
        try:
            with open(input_file, "rb") as audio_file:
                transcription = self.client.audio.transcriptions.create(
                    model="whisper-1",  # Ensure the correct model for transcription
                    file=audio_file,
                    language="en",
                    max_retries=self.MAX_RETRIES,
                )
            # Ensure the response contains a valid transcript
            if not hasattr(transcription, "text"):
                raise ValueError("Transcription failed. No text returned.")
            return transcription.text
        except Exception as e:
            raise Exception(f"Error transcribing audio: {str(e)}")


    def transcribe_image(self, input_file: str) -> str:
        """Transcribe content from an image using base64 encoding."""

        if not os.path.isfile(input_file):
            raise FileNotFoundError(f"The file {input_file} does not exist.")

        with open(input_file, "rb") as image_file:
            encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

        response = self.aclient.chat.completions.create(
Contributor

⚠️ Potential issue

Fix undefined client reference.

The code uses self.aclient, but only self.client is defined in the constructor. This will cause a runtime error.

-response = self.aclient.chat.completions.create(
+response = self.client.chat.completions.create(
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
        response = self.aclient.chat.completions.create(
        response = self.client.chat.completions.create(

            model=self.model,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": "What’s in this image?"},
                        {
                            "type": "image_url",
                            "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
                        },
                    ],
                }
Comment on lines +81 to +88
Contributor

💡 Verification agent

🧩 Analysis chain

Verify Ollama supports multimodal content in this format.

The implementation assumes Ollama supports OpenAI's multimodal message format. Verify that Ollama's API accepts this structure for image analysis.


🌐 Web query:

Does Ollama support OpenAI-compatible image analysis with base64 encoded images?

💡 Result:

Ollama supports image analysis using base64-encoded images through its vision models like LLaVA and Llama 3.2 Vision, though implementation details differ slightly from OpenAI's approach. Here's how it works:

Base64 Image Handling

  1. Encoding requirement:
    Images must be converted to base64 strings without the data:image/...;base64, prefix[6][8]. For example:

    def encode_image_to_base64(image_path):
        with Image.open(image_path) as img:
            buffered = io.BytesIO()
            img.save(buffered, format="PNG")
            return base64.b64encode(buffered.getvalue()).decode('utf-8')

    This matches OpenAI's requirement for raw base64 data[6][8].

  2. API integration:
    Base64 strings are passed via the images parameter in chat requests:

    response = ollama.chat(
        model="llava:13b",
        messages=[{
            "role": "user",
            "content": "Describe this image:",
            "images": [base64_image_string]
        }]
    )

    This structure mirrors OpenAI's vision API format[7][9].

Key Differences from OpenAI

| Feature | Ollama | OpenAI |
| --- | --- | --- |
| Model options | LLaVA, Llama 3.2 Vision | GPT-4 Vision |
| Local execution | ✅ Runs entirely offline | ❌ Cloud-only |
| Base64 prefix handling | Requires raw string | Requires prefix stripping |
| Model licensing | Apache 2.0/LLaMA license | Proprietary |

Supported Use Cases

  • Document analysis: Receipts, charts, and diagrams[3][7]
  • Object detection: Local security camera integration[5]
  • OCR: Book cover text extraction[2]
  • Multimodal RAG systems: Combined text/image retrieval[3]

To implement this with Llama 3.2 Vision:

ollama run llama3.2-vision

Then use either file paths or base64 strings in your API calls[9][10]. The system supports both CLI and programmatic access through Python/JavaScript libraries[7][9].

Citations:


Action: Update Image Handling to Meet Ollama API Requirements

We've confirmed that while Ollama does support image analysis with base64-encoded images, it requires some important adjustments compared to OpenAI’s multimodal message format:

  • Strip the Prefix: Ollama expects a raw base64 string without the data:image/jpeg;base64, prefix.
  • Parameter Structure: Instead of using the image_url field, the API typically accepts images via an images parameter in the chat message.

Please update the code in cognee/infrastructure/llm/ollama/adapter.py (lines 82–89) accordingly to comply with these requirements.
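
For comparison, a minimal sketch of the Ollama-native request shape described above, using the ollama Python client (using this client in cognee is an assumption; the adapter currently targets the OpenAI-compatible endpoint instead):

import base64
import ollama  # assumption: the official ollama Python package is installed

with open("photo.jpg", "rb") as image_file:
    # Raw base64 string, without the "data:image/jpeg;base64," prefix
    encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

response = ollama.chat(
    model="llava:13b",
    messages=[
        {
            "role": "user",
            "content": "What's in this image?",
            "images": [encoded_image],  # images field instead of image_url content parts
        }
    ],
)
print(response["message"]["content"])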

            ],
            max_tokens=300,
        )

        # Ensure response is valid before accessing .choices[0].message.content
        if not hasattr(response, "choices") or not response.choices:
            raise ValueError("Image transcription failed. No response received.")

        return response.choices[0].message.content
Comment on lines +93 to +97
Contributor

🛠️ Refactor suggestion

Add proper error handling for the entire API call.

While you've added validation for the response, you should wrap the entire API call in a try-except block to handle API exceptions properly.

+    except Exception as e:
+        raise Exception(f"Error transcribing image: {str(e)}")
+
     return response.choices[0].message.content

Committable suggestion skipped: line range outside the PR's diff.
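
Since the committable suggestion was skipped, a sketch of the method with the suggested error handling applied (it mirrors the adapter code shown above plus the proposed try/except; the switch to self.client follows the earlier comment and is otherwise an assumption about the adapter's attributes):

    def transcribe_image(self, input_file: str) -> str:
        """Sketch of transcribe_image with error handling around the API call."""
        if not os.path.isfile(input_file):
            raise FileNotFoundError(f"The file {input_file} does not exist.")

        with open(input_file, "rb") as image_file:
            encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=[
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": "What's in this image?"},
                            {
                                "type": "image_url",
                                "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
                            },
                        ],
                    }
                ],
                max_tokens=300,
            )
        except Exception as e:
            raise Exception(f"Error transcribing image: {str(e)}")

        # Ensure response is valid before accessing .choices[0].message.content
        if not hasattr(response, "choices") or not response.choices:
            raise ValueError("Image transcription failed. No response received.")

        return response.choices[0].message.content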