Add support for transcribe image and audio transcription for gemini, anthropic, mistral and ollama. #1828
Conversation
Please make sure all the checkboxes are checked:
Walkthrough: Removed the synchronous LLMGateway.create_structured_output, refactored multiple LLM adapters to inherit from a new GenericAPIAdapter, added observability decorators, introduced multimodal transcription/image-transcribe methods and a TranscriptionReturnType, and added an LLMProvider enum for provider selection.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Hello @rajeevrajeshuni, thank you for submitting a PR! We will respond as soon as possible.
@rajeevrajeshuni The PR looks good; reusing the GenericLLMAdapter is a good choice. I will run the test suite on Monday and make sure it's all in working order. If it works with all the adapters we have, we should be good to go.
@dexters1 Thanks for the review. Can you also help me understand how to run the test suite myself?
@dexters1 Also, please review whether we should standardize the return types for audio transcription.
Hey @rajeevrajeshuni, I've tried out the PR. It works fine with OpenAI and Gemini LLM calls, but the output of the transcription when using Gemini breaks Cognee. The request is made properly, but the response errors out downstream. Can you try running it with Gemini? The fix should just be a differently packaged return value.
Hey @dexters1, I'm aware the return type is different and have already highlighted it both in the PR description and an earlier comment. Since we currently return a BaseModel everywhere, we should either define a new TranscriptionReturnType or confirm which properties are actually required: just text, or something more.
@dexters1 Also, please test this with MistralAI as well. I would have done it myself, but I'm currently facing issues with running the automated tests.
Hi @dexters1, I need your input on this. I have resolved some merge conflicts with recent changes in dev, so could you kindly test it again or help me with running the automated tests?
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (1)
194-197: Hardcoded MIME type limits image format support.

The MIME type is hardcoded to "image/jpeg", which will incorrectly identify PNG, GIF, WebP, and other image formats. Use mimetypes.guess_type(input_file) to detect the correct type, consistent with GenericAPIAdapter.transcribe_image.

```diff
+import mimetypes
 ...
 async with open_data_file(input_file, mode="rb") as image_file:
     encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
+mime_type, _ = mimetypes.guess_type(input_file)
+if not mime_type or not mime_type.startswith("image/"):
+    raise ValueError(f"Could not determine MIME type for image file: {input_file}")
 ...
-            "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
+            "image_url": {"url": f"data:{mime_type};base64,{encoded_image}"},
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
95-97: Remove unused import.

GenericAPIAdapter is imported but not used in this branch; OllamaAPIAdapter is instantiated instead.

```diff
-    from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.generic_llm_api.adapter import (
-        GenericAPIAdapter,
-    )
     return OllamaAPIAdapter(
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
64-68: Use self.api_key instead of fetching from config.

Same issue as in AnthropicAdapter: the constructor receives api_key, but line 67 fetches from config, causing potential inconsistency.

```diff
 self.aclient = instructor.from_litellm(
     litellm.acompletion,
     mode=instructor.Mode(self.instructor_mode),
-    api_key=get_llm_config().llm_api_key,
+    api_key=self.api_key,
 )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
34-58: Update docstrings: remove non-existent method, fix typo.

- The docstring mentions create_structured_output (sync) at line 41, but this method doesn't exist in the class.
- Line 58 has a typo: "GPT=4" should be "GPT-4".
- The duplicate docstring at line 58 is redundant.

```diff
 class OpenAIAdapter(GenericAPIAdapter):
     """
     Adapter for OpenAI's GPT-3, GPT-4 API.

     Public methods:
     - acreate_structured_output
-    - create_structured_output
     - create_transcript
     - transcribe_image
     - show_prompt

     Instance variables:
     - name
     - model
     - api_key
     - api_version
     - MAX_RETRIES
     """

     default_instructor_mode = "json_schema_mode"
     MAX_RETRIES = 5
-
-    """Adapter for OpenAI's GPT-3, GPT=4 API"""
```
🧹 Nitpick comments (9)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (2)
40-40: Add type annotation for input parameter.

The input parameter lacks a type annotation. Based on the implementations in other adapters (e.g., OllamaAPIAdapter.create_transcript(input_file: str)), this should be typed as str.

```diff
-    async def create_transcript(self, input) -> Optional[BaseModel]:
+    async def create_transcript(self, input: str) -> Optional[BaseModel]:
```

58-58: Add type annotation for input parameter.

Same as above: this parameter should be typed as str for consistency with concrete implementations.

```diff
-    async def transcribe_image(self, input) -> Optional[BaseModel]:
+    async def transcribe_image(self, input: str) -> Optional[BaseModel]:
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
112-117: Consider using keyword arguments for maintainability.

While the positional arguments match the constructor signature, using keyword arguments (like the OpenAIAdapter and GeminiAdapter calls do) improves readability and reduces the risk of future mismatches.

```diff
 return AnthropicAdapter(
-    llm_config.llm_api_key,
-    llm_config.llm_model,
-    max_completion_tokens,
+    api_key=llm_config.llm_api_key,
+    model=llm_config.llm_model,
+    max_completion_tokens=max_completion_tokens,
     instructor_mode=llm_config.llm_instructor_mode.lower(),
 )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
291-291: Hardcoded max_completion_tokens=300 differs from audio transcription.

The image transcription uses a hardcoded value of 300 tokens while audio transcription uses self.max_completion_tokens. Consider making this configurable or using the instance attribute for consistency.

```diff
-    max_completion_tokens=300,
+    max_completion_tokens=self.max_completion_tokens,
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (2)
63-79: Missing instructor_mode in super().__init__() causes redundant client initialization.

The instructor_mode is not passed to the base class, causing GenericAPIAdapter.__init__ to create self.aclient with the default mode. Then lines 75-78 overwrite it. Pass instructor_mode to avoid the redundant initialization.

Also, image_transcribe_model is not forwarded; verify whether Gemini should use a different model for image transcription.

```diff
 super().__init__(
     api_key=api_key,
     model=model,
     max_completion_tokens=max_completion_tokens,
     name="Gemini",
     endpoint=endpoint,
     api_version=api_version,
     transcription_model=transcription_model,
+    image_transcribe_model=transcription_model,  # or a separate parameter if needed
+    instructor_mode=instructor_mode if instructor_mode else self.default_instructor_mode,
     fallback_model=fallback_model,
     fallback_api_key=fallback_api_key,
     fallback_endpoint=fallback_endpoint,
 )
-self.instructor_mode = instructor_mode if instructor_mode else self.default_instructor_mode
-
-self.aclient = instructor.from_litellm(
-    litellm.acompletion, mode=instructor.Mode(self.instructor_mode)
-)
```
81-183: Consider removing the duplicated acreate_structured_output override.

This method is nearly identical to GenericAPIAdapter.acreate_structured_output. The only difference appears to be passing api_version to the completion call. If the base class needs api_version support, add it there rather than duplicating ~100 lines of code.

If Gemini requires special handling not present in the base class, document why this override is necessary.
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (3)
74-85: Missing image_transcribe_model in super().__init__().

The image_transcribe_model parameter is not forwarded to the base class. If OpenAI should use a specific model for image transcription, add a parameter and pass it. Otherwise, it will default to model.

```diff
 super().__init__(
     api_key=api_key,
     model=model,
     max_completion_tokens=max_completion_tokens,
     name="OpenAI",
     endpoint=endpoint,
     api_version=api_version,
     transcription_model=transcription_model,
+    image_transcribe_model=None,  # Uses default model; add parameter if needed
     fallback_model=fallback_model,
     fallback_api_key=fallback_api_key,
     fallback_endpoint=fallback_endpoint,
 )
```
176-176: Remove or restore commented-out code.

The commented line # api_base=self.fallback_endpoint should either be restored if fallback_endpoint should be used, or removed entirely. Commented-out code reduces readability.

```diff
     api_key=self.fallback_api_key,
-    # api_base=self.fallback_endpoint,
+    api_base=self.fallback_endpoint,
     response_model=response_model,
```
203-232: Add a return type annotation to create_transcript.

The method lacks a return type annotation. Based on litellm.transcription, the return type should be TranscriptionResponse.

```diff
-    async def create_transcript(self, input):
+    async def create_transcript(self, file_path: str) -> "litellm.TranscriptionResponse":
```

Also, consider renaming input to file_path to avoid shadowing the built-in, consistent with the suggestion for GenericAPIAdapter.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- cognee/infrastructure/llm/LLMGateway.py (0 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (7 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (5 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (5 hunks)
💤 Files with no reviewable changes (1)
- cognee/infrastructure/llm/LLMGateway.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
⚙️ CodeRabbit configuration file
**/*.py: When reviewing Python code for this project:
- Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
- As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
- As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
- As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
- As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use shared logging utilities from cognee.shared.logging_utils in Python code
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/{modules,infrastructure,tasks}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Co-locate feature-specific helpers under their respective package (modules/, infrastructure/, or tasks/)
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/**/*.py : Use shared logging utilities from cognee.shared.logging_utils in Python code
Applied to files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
📚 Learning: 2024-07-27T16:15:21.508Z
Learnt from: borisarzentar
Repo: topoteretes/cognee PR: 90
File: cognee/api/v1/cognify/cognify.py:86-0
Timestamp: 2024-07-27T16:15:21.508Z
Learning: The file handling in `cognee/api/v1/cognify/cognify.py` uses a context manager for opening files, ensuring proper closure after operations.
Applied to files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
🧬 Code graph analysis (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (5)
cognee/infrastructure/llm/LLMGateway.py (3)
- LLMGateway (6-53)
- create_transcript (38-44)
- transcribe_image (47-53)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
- create_transcript (194-236)
- transcribe_image (246-293)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
- create_transcript (138-167)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (2)
- create_transcript (125-155)
- transcribe_image (164-208)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
- create_transcript (203-232)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (5)
cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (34-293)

cognee/shared/rate_limiting.py (1)
- llm_rate_limiter_context_manager (15-21)

cognee/infrastructure/llm/config.py (1)
- get_llm_config (244-258)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- create_transcript (194-236)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
- create_transcript (203-232)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (34-293)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (4)
cognee/infrastructure/llm/exceptions.py (1)
- ContentPolicyFilterError (4-5)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (34-293)

cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)
🔇 Additional comments (10)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
51-54: Return type inconsistency across implementations.

The interface declares Optional[BaseModel], but implementations vary significantly: OllamaAPIAdapter returns str, OpenAIAdapter returns the raw transcription object, and MistralAdapter returns transcription_response. This matches the PR discussion about needing a standardized TranscriptionResult type.

Consider defining a common return type (e.g., TranscriptionResult with a text property) to ensure consistent downstream handling.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (2)
70-76: Inconsistent retry backoff with GenericAPIAdapter.

The wait_exponential_jitter(2, 128) here differs from GenericAPIAdapter.acreate_structured_output, which uses wait_exponential_jitter(8, 128). If the intent is to align retry behavior across adapters, consider using consistent values. If the shorter initial wait is intentional for Ollama (e.g., local deployment), this is fine but worth documenting.
144-149: Missing await and hardcoded model.

Two concerns:

- self.aclient.audio.transcriptions.create(...) appears to be called without await, yet the docstring says "synchronous method" despite the async def signature. Verify whether this call is truly synchronous or needs awaiting.
- The model is hardcoded to "whisper-1". Consider using self.transcription_model (if added to the constructor) or a configurable attribute for flexibility with different Ollama transcription backends.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
127-136: LGTM!

The GenericAPIAdapter instantiation correctly passes positional arguments matching the constructor signature.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2)
27-27: Verify inherited transcription methods work with Anthropic.

By inheriting from GenericAPIAdapter, this adapter gains create_transcript and transcribe_image methods that use litellm.completion with base64-encoded content. Verify these work correctly with Anthropic's API, or override them if Anthropic requires a different approach.
51-58: LGTM!

The @observe(as_type="generation") decorator addition aligns with the observability pattern used across other adapters.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
130-137: LGTM!

The @observe(as_type="transcription") decorator and retry configuration are consistent with other adapters.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (3)
3-7: LGTM on new imports.

The imports for base64, mimetypes, and Optional are appropriate for the new transcription functionality.
47-47: Good addition of a class constant.

Extracting MAX_RETRIES as a class constant improves maintainability and consistency across retry configurations.
50-81: Extended __init__ signature looks good.

The constructor properly initializes new transcription-related attributes with sensible defaults (transcription_model or model, image_transcribe_model or model). The delegation pattern is clean.
Let's define a new TranscriptionReturnType that will have a property 'text', in which we'll just store the text output, and a property 'payload', in which the full response from the LLM will be stored. This way we can reference text as the standardized return value while still keeping additional info from the response if necessary.
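The proposed shape can be sketched in a few lines. This is a minimal illustration only, using a stdlib dataclass with a loosely typed `payload` in place of the pydantic `BaseModel` the PR actually uses; the sample response dict is hypothetical:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TranscriptionReturnType:
    """Standardized transcription result: `text` holds the transcript,
    `payload` keeps the full provider response for callers that need more."""

    text: str
    payload: Any  # in the PR this is a pydantic BaseModel


# Hypothetical provider response standing in for a real LLM payload.
raw_response = {"text": "hello world", "language": "en"}
result = TranscriptionReturnType(text=raw_response["text"], payload=raw_response)
print(result.text)  # → hello world
```

Downstream code can then rely on `.text` uniformly while provider-specific details stay reachable through `.payload`.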
Conflict resolution seems to have been done fine; we've only added an async context call to handle rate limiting for LLM calls. I will test it when the return type changes get added. Thank you for your work and support! Looks like a very nice improvement and PR.
@dexters1 Added the return type as discussed. Please have a look. |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)
78-91: Hardcoded max_tokens=4096 ignores the constructor parameter.

The max_completion_tokens is passed to the constructor and stored via the base class, but line 81 uses a hardcoded value of 4096. This inconsistency could lead to unexpected behavior.

```diff
 return await self.aclient(
     model=self.model,
-    max_tokens=4096,
+    max_tokens=self.max_completion_tokens,
     max_retries=2,
     messages=[
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
63-67: Inconsistent API key usage: uses config instead of self.api_key.

Line 66 uses get_llm_config().llm_api_key while line 68 correctly uses self.api_key. This inconsistency could cause issues if different keys are passed to the constructor vs stored in config.

```diff
 self.aclient = instructor.from_litellm(
     litellm.acompletion,
     mode=instructor.Mode(self.instructor_mode),
-    api_key=get_llm_config().llm_api_key,
+    api_key=self.api_key,
 )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
223-231: Use litellm.atranscription instead of the blocking litellm.transcription in an async method.

The create_transcript method is declared async (line 204) but calls the synchronous litellm.transcription() at line 224, which blocks the event loop. LiteLLM provides an async alternative, litellm.atranscription(), following the same pattern as other async endpoints (acompletion, aembedding). Replace the blocking call with transcription = await litellm.atranscription(...) to maintain async semantics.
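When a provider SDK only exposes a synchronous call and no async variant exists, a common fallback is to push the blocking work off the event loop with `asyncio.to_thread`, as the review suggests for the Mistral client. A minimal, self-contained sketch of the pattern; `slow_transcribe` is a hypothetical stand-in for a real blocking SDK call:

```python
import asyncio
import time


def slow_transcribe(file_path: str) -> str:
    """Stand-in for a blocking SDK call such as a sync transcription client."""
    time.sleep(0.1)  # simulate network / CPU-bound work
    return f"transcript of {file_path}"


async def create_transcript(file_path: str) -> str:
    # Offload the blocking call to a worker thread so the event loop
    # stays free to serve other coroutines meanwhile.
    return await asyncio.to_thread(slow_transcribe, file_path)


print(asyncio.run(create_transcript("audio.mp3")))  # → transcript of audio.mp3
```

This keeps the coroutine's signature and callers unchanged while removing the event-loop stall; a native async client, when available, is still preferable since it avoids the extra thread.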
♻️ Duplicate comments (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
99-106: Positional arguments mismatch for the OllamaAPIAdapter constructor.

This issue was previously flagged. The positional arguments don't match the OllamaAPIAdapter.__init__ signature. Use keyword arguments to ensure correct assignment.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
153-167: Platform-unsafe path handling and a potential sync call in an async method.

- Line 156: input.split("/")[-1] fails on Windows paths. Use os.path.basename(input) or pathlib.Path(input).name.
- Lines 158-164: self.mistral_client.audio.transcriptions.complete() appears to be synchronous, which will block the event loop when called from this async method.

```diff
+import os
 ...
-    file_name = input.split("/")[-1]
+    file_name = os.path.basename(input)
     async with open_data_file(input, mode="rb") as f:
-        transcription_response = self.mistral_client.audio.transcriptions.complete(
+        transcription_response = await asyncio.to_thread(
+            self.mistral_client.audio.transcriptions.complete,
             model=transcription_model,
             file={
                 "content": f,
                 "file_name": file_name,
             },
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
196-242: Parameter naming, sync call in an async method, and return type issues.

This method has several issues that were previously flagged:

- input shadows the built-in; use file_path: str instead
- litellm.completion (line 219) is synchronous and blocks the event loop in this async method; use litellm.acompletion with await
- The return type Optional[TranscriptionReturnType] is correct, but the docstring says "The generated transcription" without specifying the type

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
         """
         ...
         Parameters:
         -----------
-        - input: The path to the audio file that needs to be transcribed.
+        - file_path: The path to the audio file that needs to be transcribed.

         Returns:
         --------
-        The generated transcription of the audio file.
+        - Optional[TranscriptionReturnType]: The transcription result with text and payload,
+          or None if transcription failed.
         """
-        async with open_data_file(input, mode="rb") as audio_file:
+        async with open_data_file(file_path, mode="rb") as audio_file:
             encoded_string = base64.b64encode(audio_file.read()).decode("utf-8")
-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         if not mime_type or not mime_type.startswith("audio/"):
             raise ValueError(
-                f"Could not determine MIME type for audio file: {input}. Is the extension correct?"
+                f"Could not determine MIME type for audio file: {file_path}. Is the extension correct?"
             )
-        response = litellm.completion(
+        response = await litellm.acompletion(
```
253-300: Same issues in transcribe_image: parameter naming and a sync call.

This method has the same issues as create_transcript:

- input shadows the built-in
- litellm.completion (line 276) is synchronous in an async context
- The return type Optional[BaseModel] is inaccurate; litellm.completion returns ModelResponse

```diff
-    async def transcribe_image(self, input) -> Optional[BaseModel]:
+    async def transcribe_image(self, file_path: str) -> "litellm.ModelResponse":
         ...
-        async with open_data_file(input, mode="rb") as image_file:
+        async with open_data_file(file_path, mode="rb") as image_file:
             encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         if not mime_type or not mime_type.startswith("image/"):
             raise ValueError(
-                f"Could not determine MIME type for image file: {input}. Is the extension correct?"
+                f"Could not determine MIME type for image file: {file_path}. Is the extension correct?"
             )
-        return litellm.completion(
+        return await litellm.acompletion(
```
🧹 Nitpick comments (10)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
3-9: Consider using @dataclass or extending BaseModel for proper type safety.

The class defines type annotations at the class level (lines 4-5), but without a @dataclass decorator or inheriting from BaseModel, these are just annotations without runtime enforcement. The manual __init__ duplicates what a dataclass would provide automatically.

Additionally, a module-level docstring and class docstring would improve clarity per PEP 257.

```diff
+"""Types for transcription return values."""
+
+from dataclasses import dataclass
 from pydantic import BaseModel

+
+@dataclass
 class TranscriptionReturnType:
+    """
+    Container for transcription results.
+
+    Attributes:
+        text: The transcribed text content.
+        payload: The full response model from the LLM API.
+    """
+
     text: str
     payload: BaseModel
-
-    def __init__(self, text: str, payload: BaseModel):
-        self.text = text
-        self.payload = payload
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3)
95-97: Remove the unused import of GenericAPIAdapter.

GenericAPIAdapter is imported inside the Ollama block but never used; OllamaAPIAdapter is returned instead.

```diff
 elif provider == LLMProvider.OLLAMA:
     if llm_config.llm_api_key is None and raise_api_key_error:
         raise LLMAPIKeyNotSetError()

-    from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.generic_llm_api.adapter import (
-        GenericAPIAdapter,
-    )
-
     return OllamaAPIAdapter(
```
113-118: Use keyword arguments for `AnthropicAdapter` for consistency and clarity.

Other adapters (OpenAI, Gemini, Mistral) use keyword arguments, but `AnthropicAdapter` uses positional arguments. This inconsistency makes the code harder to maintain and more error-prone if the constructor signature changes.

```diff
         return AnthropicAdapter(
-            llm_config.llm_api_key,
-            llm_config.llm_model,
-            max_completion_tokens,
+            api_key=llm_config.llm_api_key,
+            model=llm_config.llm_model,
+            max_completion_tokens=max_completion_tokens,
             instructor_mode=llm_config.llm_instructor_mode.lower(),
         )
```
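Why this matters can be shown with a toy constructor (the class names here are illustrative, not cognee code): when a positional call site outlives a signature reordering, values bind silently to the wrong parameters, while keyword calls keep working.

```python
class AdapterBefore:
    def __init__(self, api_key, model, max_completion_tokens):
        self.api_key, self.model = api_key, model
        self.max_completion_tokens = max_completion_tokens


class AdapterAfter:
    # Same fields, but a refactor swapped the first two parameters.
    def __init__(self, model, api_key, max_completion_tokens):
        self.api_key, self.model = api_key, model
        self.max_completion_tokens = max_completion_tokens


# A positional call written against AdapterBefore now mis-binds silently:
broken = AdapterAfter("sk-secret", "claude-x", 4096)  # model <- "sk-secret"

# A keyword call survives the reordering unchanged:
ok = AdapterAfter(api_key="sk-secret", model="claude-x", max_completion_tokens=4096)
```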
128-137: Use keyword arguments for `GenericAPIAdapter` (Custom provider) for consistency.

Similar to the Anthropic case, positional arguments are used here while other providers use keyword arguments.

```diff
         return GenericAPIAdapter(
-            llm_config.llm_api_key,
-            llm_config.llm_model,
-            max_completion_tokens,
-            "Custom",
+            api_key=llm_config.llm_api_key,
+            model=llm_config.llm_model,
+            max_completion_tokens=max_completion_tokens,
+            name="Custom",
             instructor_mode=llm_config.llm_instructor_mode.lower(),
             fallback_api_key=llm_config.fallback_api_key,
             fallback_endpoint=llm_config.fallback_endpoint,
             fallback_model=llm_config.fallback_model,
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
138-138: Rename parameter `input` to avoid shadowing built-in.

The parameter name `input` shadows Python's built-in function. Use a more descriptive name like `file_path` or `audio_path`.

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
44-47: Update class docstring to reflect new public methods.

The docstring only mentions `acreate_structured_output`, but the class now also provides `create_transcript` and `transcribe_image` methods.

```diff
     """
     Adapter for Generic API LLM provider API.

     This class initializes the API adapter with necessary credentials and configurations for
     interacting with a language model. It provides methods for creating structured outputs
-    based on user input and system prompts.
+    based on user input and system prompts, as well as audio transcription and image description.

     Public methods:

     - acreate_structured_output(text_input: str, system_prompt: str, response_model:
-    Type[BaseModel]) -> BaseModel
+      Type[BaseModel]) -> BaseModel
+    - create_transcript(file_path: str) -> Optional[TranscriptionReturnType]
+    - transcribe_image(file_path: str) -> litellm.ModelResponse
     """
```
239-240: Minor: Extra space before `litellm.completion` on line 219.

There's an extra space in the assignment: `response =  litellm.completion(`.

```diff
-        response =  litellm.completion(
+        response = litellm.completion(
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (3)
36-59: Docstring inconsistencies and duplicate fragment.
- The docstring lists `create_structured_output` as a public method (line 42), but the synchronous version has been removed according to the refactoring.
- Line 59 contains a duplicate docstring fragment with a typo (`GPT=4` instead of `GPT-4`).

```diff
     """
     Adapter for OpenAI's GPT-3, GPT-4 API.

     Public methods:

     - acreate_structured_output
-    - create_structured_output
     - create_transcript
     - transcribe_image
     - show_prompt

     Instance variables:

     - name
     - model
     - api_key
     - api_version
     - MAX_RETRIES
     """

     default_instructor_mode = "json_schema_mode"
     MAX_RETRIES = 5

-    """Adapter for OpenAI's GPT-3, GPT=4 API"""
```
66-68: Add explicit `Optional` type hints for parameters with `None` defaults.

Parameters that default to `None` should have `Optional[str]` type hints for clarity and static analysis compliance per PEP 484.

```diff
-        endpoint: str = None,
-        api_version: str = None,
-        transcription_model: str = None,
+        endpoint: Optional[str] = None,
+        api_version: Optional[str] = None,
+        transcription_model: Optional[str] = None,
```
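For context, the implicit-`Optional` style (`endpoint: str = None`) is what PEP 484's later guidance deprecated: a `None` default does not widen the declared type, so checkers flag it. A small sketch (the function name is illustrative) showing that the explicit form is also visible at runtime:

```python
from typing import Optional, get_type_hints


def build_endpoint(endpoint: Optional[str] = None, api_version: Optional[str] = None) -> str:
    """Join an endpoint and API version, falling back to defaults when None."""
    return f"{endpoint or 'https://api.example.invalid'}/{api_version or 'v1'}"
```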
175-179: Clarify or remove commented-out fallback endpoint.

The commented-out `api_base=self.fallback_endpoint` at line 177 is unclear. If the fallback endpoint should be used, restore it; otherwise, remove the dead code to avoid confusion.

```diff
                 api_key=self.fallback_api_key,
-                # api_base=self.fallback_endpoint,
+                api_base=self.fallback_endpoint,
                 response_model=response_model,
```

Or remove the line entirely if the fallback endpoint is intentionally not used.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (7 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (5 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (6 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
⚙️ CodeRabbit configuration file
**/*.py: When reviewing Python code for this project:
- Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
- As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
- As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
- As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
- As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use shared logging utilities from cognee.shared.logging_utils in Python code
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/{modules,infrastructure,tasks}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Co-locate feature-specific helpers under their respective package (modules/, infrastructure/, or tasks/)
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
🧠 Learnings (1)
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/**/*.py : Use shared logging utilities from cognee.shared.logging_utils in Python code
Applied to files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
🧬 Code graph analysis (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (4)
cognee/infrastructure/files/utils/open_data_file.py (1)
- open_data_file (11-54)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
- GenericAPIAdapter (36-300)
- create_transcript (196-242)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
- TranscriptionReturnType (3-9)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (5)
cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (36-300)

cognee/shared/rate_limiting.py (1)
- llm_rate_limiter_context_manager (15-21)

cognee/infrastructure/llm/config.py (1)
- get_llm_config (244-258)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
- GenericAPIAdapter (36-300)
- create_transcript (196-242)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
- TranscriptionReturnType (3-9)

cognee/infrastructure/llm/LLMGateway.py (1)
- create_transcript (38-44)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (6)
cognee/infrastructure/files/utils/open_data_file.py (1)
- open_data_file (11-54)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
- TranscriptionReturnType (3-9)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
- create_transcript (204-235)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (1)
- create_transcript (125-155)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (4)
cognee/infrastructure/llm/exceptions.py (1)
- ContentPolicyFilterError (4-5)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (36-300)

cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)
🔇 Additional comments (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)
46-49: Good fix: Now uses `self.api_key` instead of fetching from config.

The previous issue has been addressed; the adapter now correctly uses the `api_key` passed to the constructor via `self.api_key`.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (2)
49-73: Missing `image_transcribe_model` parameter in constructor.

The `GenericAPIAdapter` base class accepts an `image_transcribe_model` parameter, but `GeminiAdapter.__init__` doesn't expose it. If image transcription is needed for Gemini, this parameter should be added.

```diff
     def __init__(
         self,
         api_key: str,
         model: str,
         max_completion_tokens: int,
         endpoint: str = None,
         api_version: str = None,
         transcription_model: str = None,
+        image_transcribe_model: str = None,
         instructor_mode: str = None,
         fallback_model: str = None,
         fallback_api_key: str = None,
         fallback_endpoint: str = None,
     ):
         super().__init__(
             api_key=api_key,
             model=model,
             max_completion_tokens=max_completion_tokens,
             name="Gemini",
             endpoint=endpoint,
             api_version=api_version,
             transcription_model=transcription_model,
+            image_transcribe_model=image_transcribe_model,
             fallback_model=fallback_model,
             fallback_api_key=fallback_api_key,
             fallback_endpoint=fallback_endpoint,
         )
```
33-45: Good refactor: Adapter now extends `GenericAPIAdapter` and includes observability.

The class properly inherits from `GenericAPIAdapter`, uses the `@observe` decorator for generation tracking, and documents the inherited multimodal capabilities. This aligns well with the broader architectural changes in this PR.
1-33: LGTM!

Imports are well-organized and appropriate for the refactoring. The use of `get_logger()` from `cognee.shared.logging_utils` follows the project's coding guidelines.
237-237: LGTM!

The comment correctly documents that `transcribe_image` is inherited from `GenericAPIAdapter`, addressing the previous review feedback.
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 9573981 | Triggered | Generic Password | 13c034e | .github/workflows/examples_tests.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
63-68: Use `self.api_key` instead of fetching from config.

Line 66 uses `get_llm_config().llm_api_key` instead of `self.api_key`. This is inconsistent with the constructor pattern where `api_key` is passed explicitly and stored via `super().__init__()`.

```diff
         self.aclient = instructor.from_litellm(
             litellm.acompletion,
             mode=instructor.Mode(self.instructor_mode),
-            api_key=get_llm_config().llm_api_key,
+            api_key=self.api_key,
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)
78-91: Hardcoded `max_tokens=4096` ignores constructor parameter.

The constructor accepts `max_completion_tokens`, which is stored via `super().__init__()`, but line 81 uses a hardcoded value of `4096`. Use `self.max_completion_tokens` for consistency with the configuration.

```diff
         async with llm_rate_limiter_context_manager():
             return await self.aclient(
                 model=self.model,
-                max_tokens=4096,
+                max_tokens=self.max_completion_tokens,
                 max_retries=2,
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
225-238: Use `litellm.atranscription()` and rename the `input` parameter to avoid blocking the event loop and shadowing a built-in.

The `litellm.atranscription()` async variant is available for transcription calls. The current implementation uses synchronous `litellm.transcription()` inside an async method, which blocks the event loop. Additionally, the parameter name `input` shadows the built-in function; reusing a builtin name for function arguments makes code harder to read and maintain and can cause non-obvious errors.

```diff
-    async def create_transcript(self, input, **kwargs) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str, **kwargs) -> Optional[TranscriptionReturnType]:
         ...
-        async with open_data_file(input, mode="rb") as audio_file:
-            transcription = litellm.transcription(
+        async with open_data_file(file_path, mode="rb") as audio_file:
+            transcription = await litellm.atranscription(
                 model=self.transcription_model,
                 file=audio_file,
```
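To see why the synchronous call is a problem, here is a self-contained demonstration (using `time.sleep` as a stand-in for the blocking HTTP request; no litellm involved): a coroutine that calls the blocking function directly starves a concurrent heartbeat task, while offloading with `asyncio.to_thread` lets it keep running.

```python
import asyncio
import time


def blocking_call() -> str:
    time.sleep(0.2)  # stand-in for a synchronous HTTP request
    return "done"


async def heartbeat(ticks: list) -> None:
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.05)


async def demo() -> tuple:
    # Direct call: the event loop is held, so the heartbeat cannot run meanwhile.
    blocked_ticks: list = []
    task = asyncio.ensure_future(heartbeat(blocked_ticks))
    blocking_call()
    ticks_while_blocked = len(blocked_ticks)
    await task

    # Offloaded call: the loop stays free and the heartbeat proceeds concurrently.
    free_ticks: list = []
    task = asyncio.ensure_future(heartbeat(free_ticks))
    await asyncio.to_thread(blocking_call)
    ticks_while_offloaded = len(free_ticks)
    await task
    return ticks_while_blocked, ticks_while_offloaded
```

`asyncio.to_thread` (Python 3.9+) is the general escape hatch; a native async client call such as `litellm.atranscription` avoids even the worker thread.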
♻️ Duplicate comments (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
196-242: Synchronous `litellm.completion` blocks the event loop in async method.

`litellm.completion` is synchronous, but this method is `async`. This will block the event loop during API calls. Use `litellm.acompletion` instead, or wrap the sync call with `asyncio.to_thread`.

Additionally, the parameter `input` shadows the Python built-in. Consider renaming to `file_path` or `audio_path`.

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
         """
         ...
         Parameters:
         -----------
-        - input: The path to the audio file that needs to be transcribed.
+        - file_path: The path to the audio file that needs to be transcribed.
         ...
         """
-        async with open_data_file(input, mode="rb") as audio_file:
+        async with open_data_file(file_path, mode="rb") as audio_file:
             encoded_string = base64.b64encode(audio_file.read()).decode("utf-8")

-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         if not mime_type or not mime_type.startswith("audio/"):
             raise ValueError(
-                f"Could not determine MIME type for audio file: {input}. Is the extension correct?"
+                f"Could not determine MIME type for audio file: {file_path}. Is the extension correct?"
             )

-        response = litellm.completion(
+        response = await litellm.acompletion(
             model=self.transcription_model,
```
253-300: Same issues: sync call in async context; consider consistent return type.

This method has the same blocking I/O issue with synchronous `litellm.completion`. Additionally:

- Return type `Optional[BaseModel]` is inaccurate; `litellm.completion` returns `ModelResponse`.
- Unlike `create_transcript`, this method doesn't wrap the response in `TranscriptionReturnType`. Per PR discussion, consider returning a consistent type for both transcription methods.

```diff
-    async def transcribe_image(self, input) -> Optional[BaseModel]:
+    async def transcribe_image(self, file_path: str) -> Optional[TranscriptionReturnType]:
         ...
-        async with open_data_file(input, mode="rb") as image_file:
+        async with open_data_file(file_path, mode="rb") as image_file:
             encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         ...
-        return litellm.completion(
+        response = await litellm.acompletion(
             model=self.image_transcribe_model,
             ...
         )
+        if response and response.choices and len(response.choices) > 0:
+            return TranscriptionReturnType(response.choices[0].message.content, response)
+        return None
```
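The wrap-and-guard step sketched above can live in one small helper shared by both methods. A standalone sketch with stub objects (the real payload is litellm's `ModelResponse`, so the attribute shape below is an assumption):

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TranscriptionResult:
    """Illustrative stand-in for TranscriptionReturnType: extracted text plus raw payload."""

    text: str
    payload: Any


def wrap_first_choice(response: Any) -> Optional[TranscriptionResult]:
    """Return the first choice's message content, or None when there are no choices."""
    choices = getattr(response, "choices", None)
    if response and choices:
        return TranscriptionResult(choices[0].message.content, response)
    return None
```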
156-167: Platform-unsafe path handling and potential sync call in async method.
- `input.split("/")[-1]` fails on Windows paths. Use `os.path.basename()` or `pathlib.Path().name`.
- `self.mistral_client.audio.transcriptions.complete()` appears to be a synchronous SDK call inside an async method. This blocks the event loop. Verify if the Mistral SDK offers an async variant, or wrap with `asyncio.to_thread()`.

```diff
+import os
 ...
-        file_name = input.split("/")[-1]
+        file_name = os.path.basename(input)
         async with open_data_file(input, mode="rb") as f:
-            transcription_response = self.mistral_client.audio.transcriptions.complete(
+            # Consider wrapping in asyncio.to_thread if no async API available
+            transcription_response = await asyncio.to_thread(
+                self.mistral_client.audio.transcriptions.complete,
                 model=transcription_model,
                 file={
                     "content": f,
                     "file_name": file_name,
                 },
             )
```
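The portability point is easy to verify from any host: `str.split("/")` knows nothing about backslashes, while `os.path.basename` follows the host platform's rules, and the `pathlib.PurePath` family lets you apply either platform's rules explicitly.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

posix_path = "/data/audio/interview.mp3"
windows_path = r"C:\data\audio\interview.mp3"

# split("/") returns the whole Windows path unchanged -- the bug flagged above:
naive = windows_path.split("/")[-1]

# Pure* classes apply each platform's separator rules on any host:
posix_name = PurePosixPath(posix_path).name
windows_name = PureWindowsPath(windows_path).name

# os.path.basename handles the host platform's own paths correctly:
host_name = os.path.basename(posix_path)
```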
🧹 Nitpick comments (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (2)
97-99: Remove unused import inside OLLAMA branch.

`GenericAPIAdapter` is imported but never used in this branch; `OllamaAPIAdapter` is instantiated instead.

```diff
     elif provider == LLMProvider.OLLAMA:
         if llm_config.llm_api_key is None and raise_api_key_error:
             raise LLMAPIKeyNotSetError()

-        from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.generic_llm_api.adapter import (
-            GenericAPIAdapter,
-        )
-
         return OllamaAPIAdapter(
```
115-120: Consider using keyword arguments for clarity and maintainability.

While the positional argument order matches the current `AnthropicAdapter` signature, using keyword arguments would make the code more resilient to future signature changes and improve readability.

```diff
         return AnthropicAdapter(
-            llm_config.llm_api_key,
-            llm_config.llm_model,
-            max_completion_tokens,
+            api_key=llm_config.llm_api_key,
+            model=llm_config.llm_model,
+            max_completion_tokens=max_completion_tokens,
             instructor_mode=llm_config.llm_instructor_mode.lower(),
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
138-152: Parameter `input` shadows built-in; add type annotation.

For consistency with the suggested changes in `GenericAPIAdapter`, rename `input` to `file_path` and add the type annotation.

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
         """
         ...
         Parameters:
         -----------
-        - input: The path to the audio file that needs to be transcribed.
+        - file_path: The path to the audio file that needs to be transcribed.
         ...
         """
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
59-59: Remove duplicate docstring with typo.

Line 59 contains a duplicate docstring `"Adapter for OpenAI's GPT-3, GPT=4 API"` (note the `=` instead of `-`). The class already has a proper docstring at lines 36-54.

```diff
     MAX_RETRIES = 5

-    """Adapter for OpenAI's GPT-3, GPT=4 API"""
-
     def __init__(
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (1)
74-78: Verify if `aclient` re-initialization is necessary.

Lines 74-78 re-initialize `self.instructor_mode` and `self.aclient` after calling `super().__init__()`. The parent `GenericAPIAdapter` already initializes these in its `__init__` method (see lines 58-61 of the base class).

Please confirm:
- Is this re-initialization intentional for Gemini-specific instructor configuration?
- If not, consider removing this code block and passing `instructor_mode` to the parent instead.

If Gemini requires special handling, please document why in a comment. If the re-initialization is not needed, consider this refactor:

```diff
         super().__init__(
             api_key=api_key,
             model=model,
             max_completion_tokens=max_completion_tokens,
             name="Gemini",
             endpoint=endpoint,
             api_version=api_version,
             transcription_model=transcription_model,
+            instructor_mode=instructor_mode,
             fallback_model=fallback_model,
             fallback_api_key=fallback_api_key,
             fallback_endpoint=fallback_endpoint,
         )
-        self.instructor_mode = instructor_mode if instructor_mode else self.default_instructor_mode
-
-        self.aclient = instructor.from_litellm(
-            litellm.acompletion, mode=instructor.Mode(self.instructor_mode)
-        )
```
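The shape of that refactor (the base class owns client construction; the subclass only feeds it configuration) can be sketched with plain classes (names are illustrative; no instructor or litellm involved):

```python
from typing import Optional


class BaseAdapter:
    """Base class resolves the instructor mode once and builds the client from it."""

    default_instructor_mode = "json_mode"

    def __init__(self, api_key: str, instructor_mode: Optional[str] = None):
        self.api_key = api_key
        # Resolution uses the *instance's* class attribute, so subclasses
        # that override default_instructor_mode are honored automatically.
        self.instructor_mode = instructor_mode or self.default_instructor_mode
        self.aclient = ("client", self.instructor_mode)  # stand-in for instructor.from_litellm


class GeminiLikeAdapter(BaseAdapter):
    """Subclass overrides the default and forwards the mode instead of rebuilding aclient."""

    default_instructor_mode = "gemini_json_mode"

    def __init__(self, api_key: str, instructor_mode: Optional[str] = None):
        super().__init__(api_key=api_key, instructor_mode=instructor_mode)
```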
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- cognee/infrastructure/llm/LLMGateway.py (0 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (7 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (5 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (0 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (6 hunks)
💤 Files with no reviewable changes (2)
- cognee/infrastructure/llm/LLMGateway.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
🧰 Additional context used
🧬 Code graph analysis (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (3)
cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (36-300)

cognee/shared/rate_limiting.py (1)
- llm_rate_limiter_context_manager (15-21)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (5)
cognee/infrastructure/llm/exceptions.py (1)
ContentPolicyFilterError(4-5)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
GenericAPIAdapter(36-300)cognee/shared/logging_utils.py (1)
get_logger(212-224)cognee/modules/observability/get_observe.py (1)
get_observe(5-25)cognee/infrastructure/files/storage/FileBufferedReader.py (1)
name(11-12)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
GenericAPIAdapter(36-300)create_transcript(196-242)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
TranscriptionReturnType(3-9)cognee/infrastructure/llm/LLMGateway.py (1)
create_transcript(41-47)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
create_transcript(40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (5)
cognee/infrastructure/files/utils/open_data_file.py (1)
open_data_file(11-54)cognee/modules/observability/get_observe.py (1)
get_observe(5-25)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
TranscriptionReturnType(3-9)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
create_transcript(206-238)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
create_transcript(40-55)
🔇 Additional comments (7)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)

130-139: LGTM with an optional suggestion for keyword arguments. The positional argument order matches the `GenericAPIAdapter` signature. For consistency with other provider instantiations (like `OpenAIAdapter` and `GeminiAdapter`, which use keyword arguments), consider using keyword arguments here as well.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)

49-84: LGTM: Initialization and class setup. The `MAX_RETRIES` constant, extended `__init__` parameters, and sensible defaults for `transcription_model` and `image_transcribe_model` are well designed. The `@observe` decorator integration follows the established pattern.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)

35-49: LGTM: Clean refactoring to extend `GenericAPIAdapter`. The constructor properly delegates to `super().__init__()` with keyword arguments, and correctly uses `self.api_key` for the Anthropic client (addressing the past review concern). The `@observe` decorator integration follows the established pattern.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)

75-101: LGTM: Clean delegation to `GenericAPIAdapter`. The constructor properly delegates to `super().__init__()` with keyword arguments. The conditional logic for `gpt-5` models and maintaining both sync and async clients is reasonable for backward compatibility.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3)

1-30: LGTM: Import organization looks good. The duplicate `get_logger` import flagged in the previous review has been resolved. The new imports for `ContentPolicyFilterError`, `GenericAPIAdapter`, `get_logger`, and `get_observe` are all necessary for the refactoring to inherit from `GenericAPIAdapter` and add observability support.

80-182: LGTM: Override justified by the `api_version` parameter. The `acreate_structured_output` method override is nearly identical to the parent `GenericAPIAdapter` implementation, but it correctly includes the `api_version` parameter in both the primary and fallback completion calls (lines 131 and 166). This Gemini-specific requirement justifies the override. The implementation correctly:

- applies the `@observe` decorator for observability
- uses rate limiting via `llm_rate_limiter_context_manager()`
- handles content policy violations with fallback logic
- maintains proper error handling and retry mechanisms

33-45: The docstring accurately lists `create_transcript` as a public method. This method is inherited from `GenericAPIAdapter` and works correctly for Gemini through litellm's abstraction layer; no Gemini-specific override is needed. The base implementation properly handles the `transcription_model` parameter and is compatible with Gemini's audio transcription capabilities via litellm. Likely an incorrect or invalid review comment.
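The retry path described in the 80-182 comment above (rate limiting plus a content-policy fallback) can be sketched roughly as follows. This is a simplified, self-contained illustration: the exception and context-manager names mirror the ones referenced in the review, but the completion call is a stub rather than the real litellm invocation.

```python
import asyncio
from contextlib import asynccontextmanager


class ContentPolicyFilterError(Exception):
    """Raised when the provider rejects a prompt on content-policy grounds."""


@asynccontextmanager
async def llm_rate_limiter_context_manager():
    # Placeholder for the real rate limiter; the genuine one would
    # block or delay when the request budget is exhausted.
    yield


async def acreate_structured_output(prompt, call_model, fallback_model=None):
    """Try the primary model; on a content-policy rejection, retry once
    against the configured fallback model (if any)."""
    async with llm_rate_limiter_context_manager():
        try:
            return await call_model(prompt, model="primary")
        except ContentPolicyFilterError:
            if fallback_model is None:
                raise
            return await call_model(prompt, model=fallback_model)


async def demo():
    # Stubbed model call: the primary model always refuses,
    # forcing the fallback branch.
    async def flaky_call(prompt, model):
        if model == "primary":
            raise ContentPolicyFilterError("blocked")
        return f"{model}: ok"

    return await acreate_structured_output("hi", flaky_call, fallback_model="fallback")


print(asyncio.run(demo()))  # fallback: ok
```

The same shape applies to the Gemini override, with the extra `api_version` argument threaded into both the primary and fallback calls.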
```python
class AnthropicAdapter(GenericAPIAdapter):
    """
    Adapter for interfacing with the Anthropic API, enabling structured output generation
    and prompt display.
    """
```
Override `create_transcript` with explicit error handling for the Anthropic audio limitation.
`AnthropicAdapter` inherits `create_transcript` from `GenericAPIAdapter`, but Anthropic's API does not support audio transcription. Override this method to raise `NotImplementedError` with a clear message directing users to external transcription services. For `transcribe_image`, verify the inherited implementation works correctly with Anthropic's vision API by testing with base64-encoded images.
```
def __init__(
    self,
    endpoint,
    api_key: str,
    model: str,
    api_version: str,
    max_completion_tokens: int,
    endpoint: str = None,
    api_version: str = None,
    transcription_model: str = None,
    instructor_mode: str = None,
    fallback_model: str = None,
    fallback_api_key: str = None,
    fallback_endpoint: str = None,
):
    self.model = model
    self.api_key = api_key
    self.endpoint = endpoint
    self.api_version = api_version
    self.max_completion_tokens = max_completion_tokens
    self.fallback_model = fallback_model
    self.fallback_api_key = fallback_api_key
    self.fallback_endpoint = fallback_endpoint
    super().__init__(
        api_key=api_key,
        model=model,
        max_completion_tokens=max_completion_tokens,
        name="Gemini",
        endpoint=endpoint,
        api_version=api_version,
        transcription_model=transcription_model,
        fallback_model=fallback_model,
        fallback_api_key=fallback_api_key,
        fallback_endpoint=fallback_endpoint,
    )
```
🛠️ Refactor suggestion | 🟠 Major
Add missing `image_transcribe_model` parameter and docstring.
Two issues:

1. Missing parameter: `GenericAPIAdapter.__init__` accepts an `image_transcribe_model` parameter (see line 57 in the base class), but `GeminiAdapter.__init__` does not accept or pass this parameter. This means image transcription will default to using the main `model` instead of a potentially specialized image model.

2. Missing docstring: per the coding guidelines, undocumented function definitions are assumed incomplete. Please add a docstring documenting the parameters and their purposes.
Apply this diff to add the missing parameter:
```diff
 def __init__(
     self,
     api_key: str,
     model: str,
     max_completion_tokens: int,
     endpoint: str = None,
     api_version: str = None,
     transcription_model: str = None,
+    image_transcribe_model: str = None,
     instructor_mode: str = None,
     fallback_model: str = None,
     fallback_api_key: str = None,
     fallback_endpoint: str = None,
 ):
+    """
+    Initialize the Gemini adapter with API credentials and model configurations.
+
+    Parameters:
+    -----------
+    api_key: Gemini API key
+    model: Primary model identifier for structured output generation
+    max_completion_tokens: Maximum tokens for completion responses
+    endpoint: Optional custom API endpoint
+    api_version: Optional API version
+    transcription_model: Optional model for audio transcription (defaults to main model)
+    image_transcribe_model: Optional model for image transcription (defaults to main model)
+    instructor_mode: Mode for instructor client (defaults to "json_mode")
+    fallback_model: Fallback model for content policy violations
+    fallback_api_key: API key for fallback model
+    fallback_endpoint: Endpoint for fallback model
+    """
     super().__init__(
         api_key=api_key,
         model=model,
         max_completion_tokens=max_completion_tokens,
         name="Gemini",
         endpoint=endpoint,
         api_version=api_version,
         transcription_model=transcription_model,
+        image_transcribe_model=image_transcribe_model,
         fallback_model=fallback_model,
         fallback_api_key=fallback_api_key,
         fallback_endpoint=fallback_endpoint,
     )
```

As per coding guidelines, undocumented function definitions are considered incomplete.
I've merged your PR to this branch (test-audio-image-transcription) and made a new PR, so we can run our CI/CD on it and so I can make some minor adjustments. If CI/CD passes there we can merge it (your contributions will be noted in the commit history of course).

@dexters1 Thank you!

Closing this PR as the feature will be merged as part of the test-audio-image-transcription branch PR to dev and will be part of the next Cognee release.
Description
Add support for transcribe image and audio transcription for gemini, anthropic, mistral and ollama.
Type of Change
Pre-submission Checklist
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.
Implementation Questions for Review
I need guidance on the following architectural decisions:
Video Transcription: For video transcription, what is the intended approach?
Standardizing Return Types for Auto Transcription: The implementation currently involves model-specific methods (e.g., a distinct method for OpenAI and different approaches for Mistral). How should we establish a consistent, unified return type (e.g., a common `TranscriptionResult` object) across all models to ensure a standardized user interface?

Future Scope
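On the return-type question above: one possible shape for a unified result is a small dataclass that each adapter populates from its provider-specific response. This is a hedged sketch, not the project's actual `TranscriptionReturnType`, whose field set may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranscriptionResult:
    """Provider-agnostic transcription payload (illustrative field set)."""

    text: str                       # the transcript itself
    provider: str                   # e.g. "openai", "gemini", "mistral"
    model: str                      # model that produced the transcript
    language: Optional[str] = None  # detected language, if the provider reports one
    raw: dict = field(default_factory=dict)  # untouched provider response for debugging


def from_openai_like(response: dict, model: str) -> TranscriptionResult:
    # Each adapter would carry one small mapping function like this,
    # translating its native response into the shared shape.
    return TranscriptionResult(
        text=response.get("text", ""),
        provider="openai",
        model=model,
        language=response.get("language"),
        raw=response,
    )


result = from_openai_like({"text": "hello world", "language": "en"}, "whisper-1")
print(result.text)  # hello world
```

Callers then branch on nothing provider-specific; only the per-adapter mapping functions know the native response layouts.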
Testing Notes
I have tested core components of the new features using a separate validation script.
Running `python cognee/tests/test_library.py` failed with `LLMAPIKeyNotSetError: LLM API key is not set. (Status code: 422)`. I need help on the correct environment variable or configuration file location for setting the LLM API key to resolve this and run the full test suite.
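On the API-key question: cognee reads LLM settings from environment variables (optionally via a `.env` file). The variable names below are inferred from the error class name and common cognee examples, so verify them against the configuration docs before relying on them.

```shell
# Hypothetical variable names -- confirm against cognee's configuration docs.
export LLM_API_KEY="sk-your-key-here"   # key the adapters authenticate with
export LLM_PROVIDER="openai"            # e.g. openai, gemini, anthropic, mistral, ollama
export LLM_MODEL="gpt-4o-mini"          # model identifier for structured output
```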
New Features
Bug Fixes
Refactor