Add support for transcribe image and audio transcription for gemini, anthropic, mistral and ollama. #1828
Conversation
Please make sure all the checkboxes are checked:
Walkthrough: Removed the synchronous LLMGateway.create_structured_output, refactored multiple LLM adapters to inherit from a new GenericAPIAdapter, added observability decorators, introduced multimodal transcription/image-transcribe methods and a TranscriptionReturnType, and added an LLMProvider enum for provider selection.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Hello @rajeevrajeshuni, thank you for submitting a PR! We will respond as soon as possible.
@rajeevrajeshuni The PR looks good; reusing the GenericLLMAdapter is a good choice. I will run the test suite on Monday and make sure it's all in working order. If it works with all the adapters we have, we should be good to go.
@dexters1 Thanks for the review. Can you also help me understand how to run the test suite myself?
@dexters1 Also, please review whether we should standardize the return types for audio transcription.
Hey @rajeevrajeshuni, I've tried out the PR. It works fine with OpenAI and Gemini LLM calls, but the output of the transcription when using Gemini breaks Cognee. The request is made properly, but the response errors out downstream. Can you try running it with Gemini? The fix should just be a differently packaged return value.
Hey @dexters1, I'm aware the return type is different and have already highlighted it both in the PR description and an earlier comment. Since we currently return a BaseModel everywhere, we should either define a new TranscriptionReturnType or confirm which properties are actually required: just text, or something more.
@dexters1 Also, please test this with MistralAI as well. I would have done it myself, but I'm currently facing issues with running the automated tests.
Hi @dexters1, I need your input on this. I have resolved some merge conflicts with recent changes in dev, so could you kindly test it again or help me with running the automated tests?
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (1)
194-197: Hardcoded MIME type limits image format support.

The MIME type is hardcoded to "image/jpeg", which will incorrectly identify PNG, GIF, WebP, and other image formats. Use mimetypes.guess_type(input_file) to detect the correct type, consistent with GenericAPIAdapter.transcribe_image.

```diff
+import mimetypes
 ...
 async with open_data_file(input_file, mode="rb") as image_file:
     encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
+mime_type, _ = mimetypes.guess_type(input_file)
+if not mime_type or not mime_type.startswith("image/"):
+    raise ValueError(f"Could not determine MIME type for image file: {input_file}")
 ...
-            "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
+            "image_url": {"url": f"data:{mime_type};base64,{encoded_image}"},
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
95-97: Remove unused import.

GenericAPIAdapter is imported but not used in this branch; OllamaAPIAdapter is instantiated instead.

```diff
-    from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.generic_llm_api.adapter import (
-        GenericAPIAdapter,
-    )
     return OllamaAPIAdapter(
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
64-68: Use self.api_key instead of fetching from config.

Same issue as in AnthropicAdapter: the constructor receives api_key, but line 67 fetches from config, causing potential inconsistency.

```diff
 self.aclient = instructor.from_litellm(
     litellm.acompletion,
     mode=instructor.Mode(self.instructor_mode),
-    api_key=get_llm_config().llm_api_key,
+    api_key=self.api_key,
 )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
34-58: Update docstrings: remove non-existent method, fix typo.

- The docstring mentions create_structured_output (sync) at line 41, but this method doesn't exist in the class.
- Line 58 has a typo: "GPT=4" should be "GPT-4".
- The duplicate docstring at line 58 is redundant.

```diff
 class OpenAIAdapter(GenericAPIAdapter):
     """
     Adapter for OpenAI's GPT-3, GPT-4 API.

     Public methods:
     - acreate_structured_output
-    - create_structured_output
     - create_transcript
     - transcribe_image
     - show_prompt

     Instance variables:
     - name
     - model
     - api_key
     - api_version
     - MAX_RETRIES
     """

     default_instructor_mode = "json_schema_mode"
     MAX_RETRIES = 5
-
-    """Adapter for OpenAI's GPT-3, GPT=4 API"""
```
🧹 Nitpick comments (9)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (2)
40-40: Add type annotation for input parameter.

The input parameter lacks a type annotation. Based on the implementations in other adapters (e.g., OllamaAPIAdapter.create_transcript(input_file: str)), this should be typed as str.

```diff
-    async def create_transcript(self, input) -> Optional[BaseModel]:
+    async def create_transcript(self, input: str) -> Optional[BaseModel]:
```

58-58: Add type annotation for input parameter.

Same as above: this parameter should be typed as str for consistency with concrete implementations.

```diff
-    async def transcribe_image(self, input) -> Optional[BaseModel]:
+    async def transcribe_image(self, input: str) -> Optional[BaseModel]:
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
112-117: Consider using keyword arguments for maintainability.

While the positional arguments match the constructor signature, using keyword arguments (like the OpenAIAdapter and GeminiAdapter calls do) improves readability and reduces the risk of future mismatches.

```diff
 return AnthropicAdapter(
-    llm_config.llm_api_key,
-    llm_config.llm_model,
-    max_completion_tokens,
+    api_key=llm_config.llm_api_key,
+    model=llm_config.llm_model,
+    max_completion_tokens=max_completion_tokens,
     instructor_mode=llm_config.llm_instructor_mode.lower(),
 )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
291-291: Hardcoded max_completion_tokens=300 differs from audio transcription.

The image transcription uses a hardcoded value of 300 tokens while audio transcription uses self.max_completion_tokens. Consider making this configurable or using the instance attribute for consistency.

```diff
-    max_completion_tokens=300,
+    max_completion_tokens=self.max_completion_tokens,
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (2)
63-79: Missing instructor_mode in super().__init__() causes redundant client initialization.

The instructor_mode is not passed to the base class, causing GenericAPIAdapter.__init__ to create self.aclient with the default mode. Then lines 75-78 overwrite it. Pass instructor_mode to avoid the redundant initialization.

Also, image_transcribe_model is not forwarded; verify whether Gemini should use a different model for image transcription.

```diff
 super().__init__(
     api_key=api_key,
     model=model,
     max_completion_tokens=max_completion_tokens,
     name="Gemini",
     endpoint=endpoint,
     api_version=api_version,
     transcription_model=transcription_model,
+    image_transcribe_model=transcription_model,  # or a separate parameter if needed
+    instructor_mode=instructor_mode if instructor_mode else self.default_instructor_mode,
     fallback_model=fallback_model,
     fallback_api_key=fallback_api_key,
     fallback_endpoint=fallback_endpoint,
 )
-self.instructor_mode = instructor_mode if instructor_mode else self.default_instructor_mode
-
-self.aclient = instructor.from_litellm(
-    litellm.acompletion, mode=instructor.Mode(self.instructor_mode)
-)
```
81-183: Consider removing the duplicated acreate_structured_output override.

This method is nearly identical to GenericAPIAdapter.acreate_structured_output. The only difference appears to be passing api_version to the completion call. If the base class needs api_version support, add it there rather than duplicating ~100 lines of code.

If Gemini requires special handling not present in the base class, document why this override is necessary.
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (3)
74-85: Missing image_transcribe_model in super().__init__().

The image_transcribe_model parameter is not forwarded to the base class. If OpenAI should use a specific model for image transcription, add a parameter and pass it. Otherwise, it will default to model.

```diff
 super().__init__(
     api_key=api_key,
     model=model,
     max_completion_tokens=max_completion_tokens,
     name="OpenAI",
     endpoint=endpoint,
     api_version=api_version,
     transcription_model=transcription_model,
+    image_transcribe_model=None,  # Uses default model; add parameter if needed
     fallback_model=fallback_model,
     fallback_api_key=fallback_api_key,
     fallback_endpoint=fallback_endpoint,
 )
```
176-176: Remove or restore commented-out code.

The commented line # api_base=self.fallback_endpoint should either be restored if fallback_endpoint should be used, or removed entirely. Commented-out code reduces readability.

```diff
     api_key=self.fallback_api_key,
-    # api_base=self.fallback_endpoint,
+    api_base=self.fallback_endpoint,
     response_model=response_model,
```
203-232: Add a return type annotation to create_transcript.

The method lacks a return type annotation. Based on litellm.transcription, the return type should be TranscriptionResponse.

```diff
-    async def create_transcript(self, input):
+    async def create_transcript(self, file_path: str) -> "litellm.TranscriptionResponse":
```

Also, consider renaming input to file_path to avoid shadowing the built-in, consistent with the suggestion for GenericAPIAdapter.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- cognee/infrastructure/llm/LLMGateway.py (0 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (7 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (5 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (5 hunks)
💤 Files with no reviewable changes (1)
- cognee/infrastructure/llm/LLMGateway.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
⚙️ CodeRabbit configuration file
**/*.py: When reviewing Python code for this project:
- Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
- As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
- As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
- As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
- As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use shared logging utilities from cognee.shared.logging_utils in Python code
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/{modules,infrastructure,tasks}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Co-locate feature-specific helpers under their respective package (modules/, infrastructure/, or tasks/)
Files:
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/**/*.py : Use shared logging utilities from cognee.shared.logging_utils in Python code
Applied to files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
📚 Learning: 2024-07-27T16:15:21.508Z
Learnt from: borisarzentar
Repo: topoteretes/cognee PR: 90
File: cognee/api/v1/cognify/cognify.py:86-0
Timestamp: 2024-07-27T16:15:21.508Z
Learning: The file handling in `cognee/api/v1/cognify/cognify.py` uses a context manager for opening files, ensuring proper closure after operations.
Applied to files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
🧬 Code graph analysis (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (5)
cognee/infrastructure/llm/LLMGateway.py (3)
- LLMGateway (6-53)
- create_transcript (38-44)
- transcribe_image (47-53)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
- create_transcript (194-236)
- transcribe_image (246-293)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
- create_transcript (138-167)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (2)
- create_transcript (125-155)
- transcribe_image (164-208)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
- create_transcript (203-232)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (5)
cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (34-293)

cognee/shared/rate_limiting.py (1)
- llm_rate_limiter_context_manager (15-21)

cognee/infrastructure/llm/config.py (1)
- get_llm_config (244-258)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- create_transcript (194-236)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
- create_transcript (203-232)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (34-293)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (4)
cognee/infrastructure/llm/exceptions.py (1)
- ContentPolicyFilterError (4-5)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (34-293)

cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)
🔇 Additional comments (10)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
51-54: Return type inconsistency across implementations.

The interface declares Optional[BaseModel], but implementations vary significantly: OllamaAPIAdapter returns str, OpenAIAdapter returns the raw transcription object, and MistralAdapter returns transcription_response. This matches the PR discussion about needing a standardized TranscriptionResult type.

Consider defining a common return type (e.g., TranscriptionResult with a text property) to ensure consistent downstream handling.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (2)
70-76: Inconsistent retry backoff with GenericAPIAdapter.

The wait_exponential_jitter(2, 128) here differs from GenericAPIAdapter.acreate_structured_output, which uses wait_exponential_jitter(8, 128). If the intent is to align retry behavior across adapters, consider using consistent values. If the shorter initial wait is intentional for Ollama (e.g., local deployment), this is fine but worth documenting.
144-149: Missing await and hardcoded model.

Two concerns:

- self.aclient.audio.transcriptions.create(...) appears to be called without await, yet the docstring says "synchronous method" despite the async def signature. Verify whether this call is truly synchronous or needs awaiting.
- The model is hardcoded to "whisper-1". Consider using self.transcription_model (if added to the constructor) or a configurable attribute for flexibility with different Ollama transcription backends.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
127-136: LGTM!

The GenericAPIAdapter instantiation correctly passes positional arguments matching the constructor signature.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2)
27-27: Verify inherited transcription methods work with Anthropic.

By inheriting from GenericAPIAdapter, this adapter gains create_transcript and transcribe_image methods that use litellm.completion with base64-encoded content. Verify these work correctly with Anthropic's API, or override them if Anthropic requires a different approach.
51-58: LGTM!

The @observe(as_type="generation") decorator addition aligns with the observability pattern used across other adapters.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
130-137: LGTM!

The @observe(as_type="transcription") decorator and retry configuration are consistent with other adapters.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (3)
3-7: LGTM on new imports.

The imports for base64, mimetypes, and Optional are appropriate for the new transcription functionality.
47-47: Good addition of a class constant.

Extracting MAX_RETRIES as a class constant improves maintainability and consistency across retry configurations.
50-81: Extended __init__ signature looks good.

The constructor properly initializes new transcription-related attributes with sensible defaults (transcription_model or model, image_transcribe_model or model). The delegation pattern is clean.
Let's define a new TranscriptionReturnType that will have a property 'text', in which we'll just store the text output, and a property 'payload', in which the full response from the LLM will be stored. This way we can reference text as the standardized return value while still keeping additional info from the response if necessary.
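The proposed shape can be sketched in a few lines. This is a minimal illustration only, using a stdlib dataclass with a loosely typed `payload` in place of the pydantic `BaseModel` the PR actually uses; the sample response dict is hypothetical:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TranscriptionReturnType:
    """Standardized transcription result: `text` holds the transcript,
    `payload` keeps the full provider response for callers that need more."""

    text: str
    payload: Any  # in the PR this is a pydantic BaseModel


# Hypothetical provider response standing in for a real LLM payload.
raw_response = {"text": "hello world", "language": "en"}
result = TranscriptionReturnType(text=raw_response["text"], payload=raw_response)
print(result.text)  # → hello world
```

Downstream code can then rely on `.text` uniformly while provider-specific details stay reachable through `.payload`.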
Conflict resolution seems to have been done fine; we've only added an async context call to handle rate limiting for LLM calls. I will test it when the return type changes get added. Thank you for your work and support! Looks like a very nice improvement and PR.
@dexters1 Added the return type as discussed. Please have a look. |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)
78-91: Hardcoded max_tokens=4096 ignores the constructor parameter.

The max_completion_tokens is passed to the constructor and stored via the base class, but line 81 uses a hardcoded value of 4096. This inconsistency could lead to unexpected behavior.

```diff
 return await self.aclient(
     model=self.model,
-    max_tokens=4096,
+    max_tokens=self.max_completion_tokens,
     max_retries=2,
     messages=[
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
63-67: Inconsistent API key usage: uses config instead of self.api_key.

Line 66 uses get_llm_config().llm_api_key while line 68 correctly uses self.api_key. This inconsistency could cause issues if different keys are passed to the constructor vs stored in config.

```diff
 self.aclient = instructor.from_litellm(
     litellm.acompletion,
     mode=instructor.Mode(self.instructor_mode),
-    api_key=get_llm_config().llm_api_key,
+    api_key=self.api_key,
 )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
223-231: Use litellm.atranscription instead of the blocking litellm.transcription in an async method.

The create_transcript method is declared async (line 204) but calls the synchronous litellm.transcription() at line 224, which blocks the event loop. LiteLLM provides an async alternative, litellm.atranscription(), following the same pattern as other async endpoints (acompletion, aembedding). Replace the blocking call with transcription = await litellm.atranscription(...) to maintain async semantics.
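When a provider SDK only exposes a synchronous call and no async variant exists, a common fallback is to push the blocking work off the event loop with `asyncio.to_thread`, as the review suggests for the Mistral client. A minimal, self-contained sketch of the pattern; `slow_transcribe` is a hypothetical stand-in for a real blocking SDK call:

```python
import asyncio
import time


def slow_transcribe(file_path: str) -> str:
    """Stand-in for a blocking SDK call such as a sync transcription client."""
    time.sleep(0.1)  # simulate network / CPU-bound work
    return f"transcript of {file_path}"


async def create_transcript(file_path: str) -> str:
    # Offload the blocking call to a worker thread so the event loop
    # stays free to serve other coroutines meanwhile.
    return await asyncio.to_thread(slow_transcribe, file_path)


print(asyncio.run(create_transcript("audio.mp3")))  # → transcript of audio.mp3
```

This keeps the coroutine's signature and callers unchanged while removing the event-loop stall; a native async client, when available, is still preferable since it avoids the extra thread.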
♻️ Duplicate comments (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)
99-106: Positional arguments mismatch for the OllamaAPIAdapter constructor.

This issue was previously flagged. The positional arguments don't match the OllamaAPIAdapter.__init__ signature. Use keyword arguments to ensure correct assignment.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
153-167: Platform-unsafe path handling and a potential sync call in an async method.

- Line 156: input.split("/")[-1] fails on Windows paths. Use os.path.basename(input) or pathlib.Path(input).name.
- Lines 158-164: self.mistral_client.audio.transcriptions.complete() appears to be synchronous, which will block the event loop when called from this async method.

```diff
+import os
 ...
-    file_name = input.split("/")[-1]
+    file_name = os.path.basename(input)
     async with open_data_file(input, mode="rb") as f:
-        transcription_response = self.mistral_client.audio.transcriptions.complete(
+        transcription_response = await asyncio.to_thread(
+            self.mistral_client.audio.transcriptions.complete,
             model=transcription_model,
             file={
                 "content": f,
                 "file_name": file_name,
             },
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
196-242: Parameter naming, sync call in an async method, and return type issues.

This method has several issues that were previously flagged:

- input shadows the built-in; use file_path: str instead
- litellm.completion (line 219) is synchronous and blocks the event loop in this async method; use litellm.acompletion with await
- The return type Optional[TranscriptionReturnType] is correct, but the docstring says "The generated transcription" without specifying the type

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
         """
         ...
         Parameters:
         -----------
-        - input: The path to the audio file that needs to be transcribed.
+        - file_path: The path to the audio file that needs to be transcribed.

         Returns:
         --------
-        The generated transcription of the audio file.
+        - Optional[TranscriptionReturnType]: The transcription result with text and payload,
+          or None if transcription failed.
         """
-        async with open_data_file(input, mode="rb") as audio_file:
+        async with open_data_file(file_path, mode="rb") as audio_file:
             encoded_string = base64.b64encode(audio_file.read()).decode("utf-8")
-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         if not mime_type or not mime_type.startswith("audio/"):
             raise ValueError(
-                f"Could not determine MIME type for audio file: {input}. Is the extension correct?"
+                f"Could not determine MIME type for audio file: {file_path}. Is the extension correct?"
             )
-        response = litellm.completion(
+        response = await litellm.acompletion(
```
253-300: Same issues in transcribe_image: parameter naming and a sync call.

This method has the same issues as create_transcript:

- input shadows the built-in
- litellm.completion (line 276) is synchronous in an async context
- The return type Optional[BaseModel] is inaccurate; litellm.completion returns ModelResponse

```diff
-    async def transcribe_image(self, input) -> Optional[BaseModel]:
+    async def transcribe_image(self, file_path: str) -> "litellm.ModelResponse":
         ...
-        async with open_data_file(input, mode="rb") as image_file:
+        async with open_data_file(file_path, mode="rb") as image_file:
             encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         if not mime_type or not mime_type.startswith("image/"):
             raise ValueError(
-                f"Could not determine MIME type for image file: {input}. Is the extension correct?"
+                f"Could not determine MIME type for image file: {file_path}. Is the extension correct?"
             )
-        return litellm.completion(
+        return await litellm.acompletion(
```
🧹 Nitpick comments (10)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
3-9: Consider using @dataclass or extending BaseModel for proper type safety.

The class defines type annotations at the class level (lines 4-5), but without a @dataclass decorator or inheriting from BaseModel, these are just annotations without runtime enforcement. The manual __init__ duplicates what a dataclass would provide automatically.

Additionally, a module-level docstring and class docstring would improve clarity per PEP 257.

```diff
+"""Types for transcription return values."""
+
+from dataclasses import dataclass
 from pydantic import BaseModel

+
+@dataclass
 class TranscriptionReturnType:
+    """
+    Container for transcription results.
+
+    Attributes:
+        text: The transcribed text content.
+        payload: The full response model from the LLM API.
+    """
+
     text: str
     payload: BaseModel
-
-    def __init__(self, text: str, payload: BaseModel):
-        self.text = text
-        self.payload = payload
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3)
95-97: Remove the unused import of GenericAPIAdapter.

GenericAPIAdapter is imported inside the Ollama block but never used; OllamaAPIAdapter is returned instead.

```diff
 elif provider == LLMProvider.OLLAMA:
     if llm_config.llm_api_key is None and raise_api_key_error:
         raise LLMAPIKeyNotSetError()

-    from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.generic_llm_api.adapter import (
-        GenericAPIAdapter,
-    )
-
     return OllamaAPIAdapter(
```
113-118: Use keyword arguments for `AnthropicAdapter` for consistency and clarity.

Other adapters (OpenAI, Gemini, Mistral) use keyword arguments, but `AnthropicAdapter` uses positional arguments. This inconsistency makes the code harder to maintain and more error-prone if the constructor signature changes.

```diff
         return AnthropicAdapter(
-            llm_config.llm_api_key,
-            llm_config.llm_model,
-            max_completion_tokens,
+            api_key=llm_config.llm_api_key,
+            model=llm_config.llm_model,
+            max_completion_tokens=max_completion_tokens,
             instructor_mode=llm_config.llm_instructor_mode.lower(),
         )
```
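Why this matters can be shown with a toy constructor (the class names here are illustrative, not cognee code): when a positional call site outlives a signature reordering, values bind silently to the wrong parameters, while keyword calls keep working.

```python
class AdapterBefore:
    def __init__(self, api_key, model, max_completion_tokens):
        self.api_key, self.model = api_key, model
        self.max_completion_tokens = max_completion_tokens


class AdapterAfter:
    # Same fields, but a refactor swapped the first two parameters.
    def __init__(self, model, api_key, max_completion_tokens):
        self.api_key, self.model = api_key, model
        self.max_completion_tokens = max_completion_tokens


# A positional call written against AdapterBefore now mis-binds silently:
broken = AdapterAfter("sk-secret", "claude-x", 4096)  # model <- "sk-secret"

# A keyword call survives the reordering unchanged:
ok = AdapterAfter(api_key="sk-secret", model="claude-x", max_completion_tokens=4096)
```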
128-137: Use keyword arguments for `GenericAPIAdapter` (Custom provider) for consistency.

Similar to the Anthropic case, positional arguments are used here while other providers use keyword arguments.

```diff
         return GenericAPIAdapter(
-            llm_config.llm_api_key,
-            llm_config.llm_model,
-            max_completion_tokens,
-            "Custom",
+            api_key=llm_config.llm_api_key,
+            model=llm_config.llm_model,
+            max_completion_tokens=max_completion_tokens,
+            name="Custom",
             instructor_mode=llm_config.llm_instructor_mode.lower(),
             fallback_api_key=llm_config.fallback_api_key,
             fallback_endpoint=llm_config.fallback_endpoint,
             fallback_model=llm_config.fallback_model,
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
138-138: Rename parameter `input` to avoid shadowing built-in.

The parameter name `input` shadows Python's built-in function. Use a more descriptive name like `file_path` or `audio_path`.

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
44-47: Update class docstring to reflect new public methods.

The docstring only mentions `acreate_structured_output`, but the class now also provides `create_transcript` and `transcribe_image` methods.

```diff
     """
     Adapter for Generic API LLM provider API.

     This class initializes the API adapter with necessary credentials and configurations for
     interacting with a language model. It provides methods for creating structured outputs
-    based on user input and system prompts.
+    based on user input and system prompts, as well as audio transcription and image description.

     Public methods:

     - acreate_structured_output(text_input: str, system_prompt: str, response_model:
-    Type[BaseModel]) -> BaseModel
+      Type[BaseModel]) -> BaseModel
+    - create_transcript(file_path: str) -> Optional[TranscriptionReturnType]
+    - transcribe_image(file_path: str) -> litellm.ModelResponse
     """
```
239-240: Minor: Extra space before `litellm.completion` on line 219.

There's an extra space in the assignment: `response =  litellm.completion(`.

```diff
-        response =  litellm.completion(
+        response = litellm.completion(
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (3)
36-59: Docstring inconsistencies and duplicate fragment.
- The docstring lists `create_structured_output` as a public method (line 42), but the synchronous version has been removed according to the refactoring.
- Line 59 contains a duplicate docstring fragment with a typo (`GPT=4` instead of `GPT-4`).

```diff
     """
     Adapter for OpenAI's GPT-3, GPT-4 API.

     Public methods:

     - acreate_structured_output
-    - create_structured_output
     - create_transcript
     - transcribe_image
     - show_prompt

     Instance variables:

     - name
     - model
     - api_key
     - api_version
     - MAX_RETRIES
     """

     default_instructor_mode = "json_schema_mode"
     MAX_RETRIES = 5

-    """Adapter for OpenAI's GPT-3, GPT=4 API"""
```
66-68: Add explicit `Optional` type hints for parameters with `None` defaults.

Parameters that default to `None` should have `Optional[str]` type hints for clarity and static analysis compliance per PEP 484.

```diff
-        endpoint: str = None,
-        api_version: str = None,
-        transcription_model: str = None,
+        endpoint: Optional[str] = None,
+        api_version: Optional[str] = None,
+        transcription_model: Optional[str] = None,
```
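For context, the implicit-`Optional` style (`endpoint: str = None`) is what PEP 484's later guidance deprecated: a `None` default does not widen the declared type, so checkers flag it. A small sketch (the function name is illustrative) showing that the explicit form is also visible at runtime:

```python
from typing import Optional, get_type_hints


def build_endpoint(endpoint: Optional[str] = None, api_version: Optional[str] = None) -> str:
    """Join an endpoint and API version, falling back to defaults when None."""
    return f"{endpoint or 'https://api.example.invalid'}/{api_version or 'v1'}"
```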
175-179: Clarify or remove commented-out fallback endpoint.

The commented-out `api_base=self.fallback_endpoint` at line 177 is unclear. If the fallback endpoint should be used, restore it; otherwise, remove the dead code to avoid confusion.

```diff
                 api_key=self.fallback_api_key,
-                # api_base=self.fallback_endpoint,
+                api_base=self.fallback_endpoint,
                 response_model=response_model,
```

Or remove the line entirely if the fallback endpoint is intentionally not used.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (7 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (5 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (6 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
⚙️ CodeRabbit configuration file
**/*.py: When reviewing Python code for this project:
- Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
- As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
- As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
- As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
- As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use shared logging utilities from cognee.shared.logging_utils in Python code
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/{modules,infrastructure,tasks}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Co-locate feature-specific helpers under their respective package (modules/, infrastructure/, or tasks/)
Files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
🧠 Learnings (1)
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/**/*.py : Use shared logging utilities from cognee.shared.logging_utils in Python code
Applied to files:
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
🧬 Code graph analysis (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (4)
cognee/infrastructure/files/utils/open_data_file.py (1)
- open_data_file (11-54)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
- GenericAPIAdapter (36-300)
- create_transcript (196-242)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
- TranscriptionReturnType (3-9)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (5)
cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (36-300)

cognee/shared/rate_limiting.py (1)
- llm_rate_limiter_context_manager (15-21)

cognee/infrastructure/llm/config.py (1)
- get_llm_config (244-258)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
- GenericAPIAdapter (36-300)
- create_transcript (196-242)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
- TranscriptionReturnType (3-9)

cognee/infrastructure/llm/LLMGateway.py (1)
- create_transcript (38-44)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (6)
cognee/infrastructure/files/utils/open_data_file.py (1)
- open_data_file (11-54)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
- TranscriptionReturnType (3-9)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
- create_transcript (204-235)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (1)
- create_transcript (125-155)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
- create_transcript (40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (4)
cognee/infrastructure/llm/exceptions.py (1)
- ContentPolicyFilterError (4-5)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (36-300)

cognee/shared/logging_utils.py (1)
- get_logger (212-224)

cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)
🔇 Additional comments (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)
46-49: Good fix: Now uses `self.api_key` instead of fetching from config.

The previous issue has been addressed; the adapter now correctly uses the `api_key` passed to the constructor via `self.api_key`.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (2)
49-73: Missing `image_transcribe_model` parameter in constructor.

The `GenericAPIAdapter` base class accepts an `image_transcribe_model` parameter, but `GeminiAdapter.__init__` doesn't expose it. If image transcription is needed for Gemini, this parameter should be added.

```diff
     def __init__(
         self,
         api_key: str,
         model: str,
         max_completion_tokens: int,
         endpoint: str = None,
         api_version: str = None,
         transcription_model: str = None,
+        image_transcribe_model: str = None,
         instructor_mode: str = None,
         fallback_model: str = None,
         fallback_api_key: str = None,
         fallback_endpoint: str = None,
     ):
         super().__init__(
             api_key=api_key,
             model=model,
             max_completion_tokens=max_completion_tokens,
             name="Gemini",
             endpoint=endpoint,
             api_version=api_version,
             transcription_model=transcription_model,
+            image_transcribe_model=image_transcribe_model,
             fallback_model=fallback_model,
             fallback_api_key=fallback_api_key,
             fallback_endpoint=fallback_endpoint,
         )
```
33-45: Good refactor: Adapter now extends `GenericAPIAdapter` and includes observability.

The class properly inherits from `GenericAPIAdapter`, uses the `@observe` decorator for generation tracking, and documents the inherited multimodal capabilities. This aligns well with the broader architectural changes in this PR.
1-33: LGTM!

Imports are well-organized and appropriate for the refactoring. The use of `get_logger()` from `cognee.shared.logging_utils` follows the project's coding guidelines.
237-237: LGTM!

The comment correctly documents that `transcribe_image` is inherited from `GenericAPIAdapter`, addressing the previous review feedback.
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 9573981 | Triggered | Generic Password | 13c034e | .github/workflows/examples_tests.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
63-68: Use `self.api_key` instead of fetching from config.

Line 66 uses `get_llm_config().llm_api_key` instead of `self.api_key`. This is inconsistent with the constructor pattern where `api_key` is passed explicitly and stored via `super().__init__()`.

```diff
         self.aclient = instructor.from_litellm(
             litellm.acompletion,
             mode=instructor.Mode(self.instructor_mode),
-            api_key=get_llm_config().llm_api_key,
+            api_key=self.api_key,
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)
78-91: Hardcoded `max_tokens=4096` ignores constructor parameter.

The constructor accepts `max_completion_tokens`, which is stored via `super().__init__()`, but line 81 uses a hardcoded value of `4096`. Use `self.max_completion_tokens` for consistency with the configuration.

```diff
         async with llm_rate_limiter_context_manager():
             return await self.aclient(
                 model=self.model,
-                max_tokens=4096,
+                max_tokens=self.max_completion_tokens,
                 max_retries=2,
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
225-238: Use `litellm.atranscription()` and rename the `input` parameter to avoid blocking the event loop and shadowing a built-in.

The `litellm.atranscription()` async variant is available for transcription calls. The current implementation uses synchronous `litellm.transcription()` inside an async method, which blocks the event loop. Additionally, the parameter name `input` shadows the built-in function; reusing a builtin name for function arguments makes code harder to read and maintain and can cause non-obvious errors.

```diff
-    async def create_transcript(self, input, **kwargs) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str, **kwargs) -> Optional[TranscriptionReturnType]:
         ...
-        async with open_data_file(input, mode="rb") as audio_file:
-            transcription = litellm.transcription(
+        async with open_data_file(file_path, mode="rb") as audio_file:
+            transcription = await litellm.atranscription(
                 model=self.transcription_model,
                 file=audio_file,
```
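To see why the synchronous call is a problem, here is a self-contained demonstration (using `time.sleep` as a stand-in for the blocking HTTP request; no litellm involved): a coroutine that calls the blocking function directly starves a concurrent heartbeat task, while offloading with `asyncio.to_thread` lets it keep running.

```python
import asyncio
import time


def blocking_call() -> str:
    time.sleep(0.2)  # stand-in for a synchronous HTTP request
    return "done"


async def heartbeat(ticks: list) -> None:
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.05)


async def demo() -> tuple:
    # Direct call: the event loop is held, so the heartbeat cannot run meanwhile.
    blocked_ticks: list = []
    task = asyncio.ensure_future(heartbeat(blocked_ticks))
    blocking_call()
    ticks_while_blocked = len(blocked_ticks)
    await task

    # Offloaded call: the loop stays free and the heartbeat proceeds concurrently.
    free_ticks: list = []
    task = asyncio.ensure_future(heartbeat(free_ticks))
    await asyncio.to_thread(blocking_call)
    ticks_while_offloaded = len(free_ticks)
    await task
    return ticks_while_blocked, ticks_while_offloaded
```

`asyncio.to_thread` (Python 3.9+) is the general escape hatch; a native async client call such as `litellm.atranscription` avoids even the worker thread.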
♻️ Duplicate comments (3)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
196-242: Synchronous `litellm.completion` blocks the event loop in async method.

`litellm.completion` is synchronous, but this method is `async`. This will block the event loop during API calls. Use `litellm.acompletion` instead, or wrap the sync call with `asyncio.to_thread`.

Additionally, the parameter `input` shadows the Python built-in. Consider renaming to `file_path` or `audio_path`.

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
         """
         ...
         Parameters:
         -----------
-        - input: The path to the audio file that needs to be transcribed.
+        - file_path: The path to the audio file that needs to be transcribed.
         ...
         """
-        async with open_data_file(input, mode="rb") as audio_file:
+        async with open_data_file(file_path, mode="rb") as audio_file:
             encoded_string = base64.b64encode(audio_file.read()).decode("utf-8")

-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         if not mime_type or not mime_type.startswith("audio/"):
             raise ValueError(
-                f"Could not determine MIME type for audio file: {input}. Is the extension correct?"
+                f"Could not determine MIME type for audio file: {file_path}. Is the extension correct?"
             )

-        response = litellm.completion(
+        response = await litellm.acompletion(
             model=self.transcription_model,
```
253-300: Same issues: sync call in async context; consider consistent return type.

This method has the same blocking I/O issue with synchronous `litellm.completion`. Additionally:

- Return type `Optional[BaseModel]` is inaccurate; `litellm.completion` returns `ModelResponse`.
- Unlike `create_transcript`, this method doesn't wrap the response in `TranscriptionReturnType`. Per PR discussion, consider returning a consistent type for both transcription methods.

```diff
-    async def transcribe_image(self, input) -> Optional[BaseModel]:
+    async def transcribe_image(self, file_path: str) -> Optional[TranscriptionReturnType]:
         ...
-        async with open_data_file(input, mode="rb") as image_file:
+        async with open_data_file(file_path, mode="rb") as image_file:
             encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

-        mime_type, _ = mimetypes.guess_type(input)
+        mime_type, _ = mimetypes.guess_type(file_path)
         ...
-        return litellm.completion(
+        response = await litellm.acompletion(
             model=self.image_transcribe_model,
             ...
         )
+        if response and response.choices and len(response.choices) > 0:
+            return TranscriptionReturnType(response.choices[0].message.content, response)
+        return None
```
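The wrap-and-guard step sketched above can live in one small helper shared by both methods. A standalone sketch with stub objects (the real payload is litellm's `ModelResponse`, so the attribute shape below is an assumption):

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TranscriptionResult:
    """Illustrative stand-in for TranscriptionReturnType: extracted text plus raw payload."""

    text: str
    payload: Any


def wrap_first_choice(response: Any) -> Optional[TranscriptionResult]:
    """Return the first choice's message content, or None when there are no choices."""
    choices = getattr(response, "choices", None)
    if response and choices:
        return TranscriptionResult(choices[0].message.content, response)
    return None
```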
156-167: Platform-unsafe path handling and potential sync call in async method.
- `input.split("/")[-1]` fails on Windows paths. Use `os.path.basename()` or `pathlib.Path().name`.
- `self.mistral_client.audio.transcriptions.complete()` appears to be a synchronous SDK call inside an async method. This blocks the event loop. Verify if the Mistral SDK offers an async variant, or wrap with `asyncio.to_thread()`.

```diff
+import os
 ...
-        file_name = input.split("/")[-1]
+        file_name = os.path.basename(input)
         async with open_data_file(input, mode="rb") as f:
-            transcription_response = self.mistral_client.audio.transcriptions.complete(
+            # Consider wrapping in asyncio.to_thread if no async API available
+            transcription_response = await asyncio.to_thread(
+                self.mistral_client.audio.transcriptions.complete,
                 model=transcription_model,
                 file={
                     "content": f,
                     "file_name": file_name,
                 },
             )
```
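The portability point is easy to verify from any host: `str.split("/")` knows nothing about backslashes, while `os.path.basename` follows the host platform's rules, and the `pathlib.PurePath` family lets you apply either platform's rules explicitly.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

posix_path = "/data/audio/interview.mp3"
windows_path = r"C:\data\audio\interview.mp3"

# split("/") returns the whole Windows path unchanged -- the bug flagged above:
naive = windows_path.split("/")[-1]

# Pure* classes apply each platform's separator rules on any host:
posix_name = PurePosixPath(posix_path).name
windows_name = PureWindowsPath(windows_path).name

# os.path.basename handles the host platform's own paths correctly:
host_name = os.path.basename(posix_path)
```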
🧹 Nitpick comments (5)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (2)
97-99: Remove unused import inside OLLAMA branch.

`GenericAPIAdapter` is imported but never used in this branch; `OllamaAPIAdapter` is instantiated instead.

```diff
     elif provider == LLMProvider.OLLAMA:
         if llm_config.llm_api_key is None and raise_api_key_error:
             raise LLMAPIKeyNotSetError()

-        from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.generic_llm_api.adapter import (
-            GenericAPIAdapter,
-        )
-
         return OllamaAPIAdapter(
```
115-120: Consider using keyword arguments for clarity and maintainability.

While the positional argument order matches the current `AnthropicAdapter` signature, using keyword arguments would make the code more resilient to future signature changes and improve readability.

```diff
         return AnthropicAdapter(
-            llm_config.llm_api_key,
-            llm_config.llm_model,
-            max_completion_tokens,
+            api_key=llm_config.llm_api_key,
+            model=llm_config.llm_model,
+            max_completion_tokens=max_completion_tokens,
             instructor_mode=llm_config.llm_instructor_mode.lower(),
         )
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (1)
138-152: Parameter `input` shadows built-in; add type annotation.

For consistency with the suggested changes in `GenericAPIAdapter`, rename `input` to `file_path` and add the type annotation.

```diff
-    async def create_transcript(self, input) -> Optional[TranscriptionReturnType]:
+    async def create_transcript(self, file_path: str) -> Optional[TranscriptionReturnType]:
         """
         ...
         Parameters:
         -----------
-        - input: The path to the audio file that needs to be transcribed.
+        - file_path: The path to the audio file that needs to be transcribed.
         ...
         """
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
59-59: Remove duplicate docstring with typo.

Line 59 contains a duplicate docstring `"Adapter for OpenAI's GPT-3, GPT=4 API"` (note the `=` instead of `-`). The class already has a proper docstring at lines 36-54.

```diff
     MAX_RETRIES = 5

-    """Adapter for OpenAI's GPT-3, GPT=4 API"""
-
     def __init__(
```

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (1)
74-78: Verify if `aclient` re-initialization is necessary.

Lines 74-78 re-initialize `self.instructor_mode` and `self.aclient` after calling `super().__init__()`. The parent `GenericAPIAdapter` already initializes these in its `__init__` method (see lines 58-61 of the base class).

Please confirm:
- Is this re-initialization intentional for Gemini-specific instructor configuration?
- If not, consider removing this code block and passing `instructor_mode` to the parent instead.

If Gemini requires special handling, please document why in a comment. If the re-initialization is not needed, consider this refactor:

```diff
         super().__init__(
             api_key=api_key,
             model=model,
             max_completion_tokens=max_completion_tokens,
             name="Gemini",
             endpoint=endpoint,
             api_version=api_version,
             transcription_model=transcription_model,
+            instructor_mode=instructor_mode,
             fallback_model=fallback_model,
             fallback_api_key=fallback_api_key,
             fallback_endpoint=fallback_endpoint,
         )
-        self.instructor_mode = instructor_mode if instructor_mode else self.default_instructor_mode
-
-        self.aclient = instructor.from_litellm(
-            litellm.acompletion, mode=instructor.Mode(self.instructor_mode)
-        )
```
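The shape of that refactor (the base class owns client construction; the subclass only feeds it configuration) can be sketched with plain classes (names are illustrative; no instructor or litellm involved):

```python
from typing import Optional


class BaseAdapter:
    """Base class resolves the instructor mode once and builds the client from it."""

    default_instructor_mode = "json_mode"

    def __init__(self, api_key: str, instructor_mode: Optional[str] = None):
        self.api_key = api_key
        # Resolution uses the *instance's* class attribute, so subclasses
        # that override default_instructor_mode are honored automatically.
        self.instructor_mode = instructor_mode or self.default_instructor_mode
        self.aclient = ("client", self.instructor_mode)  # stand-in for instructor.from_litellm


class GeminiLikeAdapter(BaseAdapter):
    """Subclass overrides the default and forwards the mode instead of rebuilding aclient."""

    default_instructor_mode = "gemini_json_mode"

    def __init__(self, api_key: str, instructor_mode: Optional[str] = None):
        super().__init__(api_key=api_key, instructor_mode=instructor_mode)
```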
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- cognee/infrastructure/llm/LLMGateway.py (0 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (2 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (7 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (3 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py (5 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py (0 hunks)
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (6 hunks)
💤 Files with no reviewable changes (2)
- cognee/infrastructure/llm/LLMGateway.py
- cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py
🧰 Additional context used
🧬 Code graph analysis (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (3)
cognee/modules/observability/get_observe.py (1)
- get_observe (5-25)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
- GenericAPIAdapter (36-300)

cognee/shared/rate_limiting.py (1)
- llm_rate_limiter_context_manager (15-21)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (5)
cognee/infrastructure/llm/exceptions.py (1)
ContentPolicyFilterError(4-5)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)
GenericAPIAdapter(36-300)cognee/shared/logging_utils.py (1)
get_logger(212-224)cognee/modules/observability/get_observe.py (1)
get_observe(5-25)cognee/infrastructure/files/storage/FileBufferedReader.py (1)
name(11-12)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (4)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (2)
GenericAPIAdapter(36-300)create_transcript(196-242)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
TranscriptionReturnType(3-9)cognee/infrastructure/llm/LLMGateway.py (1)
create_transcript(41-47)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
create_transcript(40-55)
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (5)
cognee/infrastructure/files/utils/open_data_file.py (1)
open_data_file(11-54)cognee/modules/observability/get_observe.py (1)
get_observe(5-25)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/types.py (1)
TranscriptionReturnType(3-9)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)
create_transcript(206-238)cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py (1)
create_transcript(40-55)
🔇 Additional comments (7)

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/get_llm_client.py (1)

130-139: LGTM with an optional suggestion for keyword arguments. The positional argument order matches the `GenericAPIAdapter` signature. For consistency with other provider instantiations (like `OpenAIAdapter` and `GeminiAdapter`, which use keyword arguments), consider using keyword arguments here as well.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py (1)

49-84: LGTM: Initialization and class setup. The `MAX_RETRIES` constant, extended `__init__` parameters, and sensible defaults for `transcription_model` and `image_transcribe_model` are well designed. The `@observe` decorator integration follows the established pattern.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py (1)

35-49: LGTM: Clean refactoring to extend `GenericAPIAdapter`. The constructor properly delegates to `super().__init__()` with keyword arguments, and correctly uses `self.api_key` for the Anthropic client (addressing the past review concern). The `@observe` decorator integration follows the established pattern.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py (1)

75-101: LGTM: Clean delegation to `GenericAPIAdapter`. The constructor properly delegates to `super().__init__()` with keyword arguments. The conditional logic for `gpt-5` models and maintaining both sync and async clients is reasonable for backward compatibility.

cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py (3)

1-30: LGTM: Import organization looks good. The duplicate `get_logger` import flagged in the previous review has been resolved. The new imports for `ContentPolicyFilterError`, `GenericAPIAdapter`, `get_logger`, and `get_observe` are all necessary for the refactoring to inherit from `GenericAPIAdapter` and add observability support.

80-182: LGTM: Override justified by the `api_version` parameter. The `acreate_structured_output` method override is nearly identical to the parent `GenericAPIAdapter` implementation, but it correctly includes the `api_version` parameter in both the primary and fallback completion calls (lines 131 and 166). This Gemini-specific requirement justifies the override. The implementation correctly:

- applies the `@observe` decorator for observability
- uses rate limiting via `llm_rate_limiter_context_manager()`
- handles content policy violations with fallback logic
- maintains proper error handling and retry mechanisms

33-45: The docstring accurately lists `create_transcript` as a public method. This method is inherited from `GenericAPIAdapter` and works correctly for Gemini through litellm's abstraction layer; no Gemini-specific override is needed. The base implementation properly handles the `transcription_model` parameter and is compatible with Gemini's audio transcription capabilities via litellm. Likely an incorrect or invalid review comment.
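The retry path described in the 80-182 comment above (rate limiting plus a content-policy fallback) can be sketched roughly as follows. This is a simplified, self-contained illustration: the exception and context-manager names mirror the ones referenced in the review, but the completion call is a stub rather than the real litellm invocation.

```python
import asyncio
from contextlib import asynccontextmanager


class ContentPolicyFilterError(Exception):
    """Raised when the provider rejects a prompt on content-policy grounds."""


@asynccontextmanager
async def llm_rate_limiter_context_manager():
    # Placeholder for the real rate limiter; the genuine one would
    # block or delay when the request budget is exhausted.
    yield


async def acreate_structured_output(prompt, call_model, fallback_model=None):
    """Try the primary model; on a content-policy rejection, retry once
    against the configured fallback model (if any)."""
    async with llm_rate_limiter_context_manager():
        try:
            return await call_model(prompt, model="primary")
        except ContentPolicyFilterError:
            if fallback_model is None:
                raise
            return await call_model(prompt, model=fallback_model)


async def demo():
    # Stubbed model call: the primary model always refuses,
    # forcing the fallback branch.
    async def flaky_call(prompt, model):
        if model == "primary":
            raise ContentPolicyFilterError("blocked")
        return f"{model}: ok"

    return await acreate_structured_output("hi", flaky_call, fallback_model="fallback")


print(asyncio.run(demo()))  # fallback: ok
```

The same shape applies to the Gemini override, with the extra `api_version` argument threaded into both the primary and fallback calls.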
```python
class AnthropicAdapter(GenericAPIAdapter):
    """
    Adapter for interfacing with the Anthropic API, enabling structured output generation
    and prompt display.
    """
```
Override `create_transcript` with explicit error handling for the Anthropic audio limitation.
`AnthropicAdapter` inherits `create_transcript` from `GenericAPIAdapter`, but Anthropic's API does not support audio transcription. Override this method to raise `NotImplementedError` with a clear message directing users to external transcription services. For `transcribe_image`, verify the inherited implementation works correctly with Anthropic's vision API by testing with base64-encoded images.
```
def __init__(
    self,
    endpoint,
    api_key: str,
    model: str,
    api_version: str,
    max_completion_tokens: int,
    endpoint: str = None,
    api_version: str = None,
    transcription_model: str = None,
    instructor_mode: str = None,
    fallback_model: str = None,
    fallback_api_key: str = None,
    fallback_endpoint: str = None,
):
    self.model = model
    self.api_key = api_key
    self.endpoint = endpoint
    self.api_version = api_version
    self.max_completion_tokens = max_completion_tokens
    self.fallback_model = fallback_model
    self.fallback_api_key = fallback_api_key
    self.fallback_endpoint = fallback_endpoint
    super().__init__(
        api_key=api_key,
        model=model,
        max_completion_tokens=max_completion_tokens,
        name="Gemini",
        endpoint=endpoint,
        api_version=api_version,
        transcription_model=transcription_model,
        fallback_model=fallback_model,
        fallback_api_key=fallback_api_key,
        fallback_endpoint=fallback_endpoint,
    )
```
🛠️ Refactor suggestion | 🟠 Major
Add missing `image_transcribe_model` parameter and docstring.
Two issues:

1. Missing parameter: `GenericAPIAdapter.__init__` accepts an `image_transcribe_model` parameter (see line 57 in the base class), but `GeminiAdapter.__init__` does not accept or pass this parameter. This means image transcription will default to using the main `model` instead of a potentially specialized image model.

2. Missing docstring: per the coding guidelines, undocumented function definitions are assumed incomplete. Please add a docstring documenting the parameters and their purposes.
Apply this diff to add the missing parameter:
```diff
 def __init__(
     self,
     api_key: str,
     model: str,
     max_completion_tokens: int,
     endpoint: str = None,
     api_version: str = None,
     transcription_model: str = None,
+    image_transcribe_model: str = None,
     instructor_mode: str = None,
     fallback_model: str = None,
     fallback_api_key: str = None,
     fallback_endpoint: str = None,
 ):
+    """
+    Initialize the Gemini adapter with API credentials and model configurations.
+
+    Parameters:
+    -----------
+    api_key: Gemini API key
+    model: Primary model identifier for structured output generation
+    max_completion_tokens: Maximum tokens for completion responses
+    endpoint: Optional custom API endpoint
+    api_version: Optional API version
+    transcription_model: Optional model for audio transcription (defaults to main model)
+    image_transcribe_model: Optional model for image transcription (defaults to main model)
+    instructor_mode: Mode for instructor client (defaults to "json_mode")
+    fallback_model: Fallback model for content policy violations
+    fallback_api_key: API key for fallback model
+    fallback_endpoint: Endpoint for fallback model
+    """
     super().__init__(
         api_key=api_key,
         model=model,
         max_completion_tokens=max_completion_tokens,
         name="Gemini",
         endpoint=endpoint,
         api_version=api_version,
         transcription_model=transcription_model,
+        image_transcribe_model=image_transcribe_model,
         fallback_model=fallback_model,
         fallback_api_key=fallback_api_key,
         fallback_endpoint=fallback_endpoint,
     )
```

As per coding guidelines, undocumented function definitions are considered incomplete.
I've merged your PR to this branch (test-audio-image-transcription) and made a new PR, so we can run our CI/CD on it and so I can make some minor adjustments. If CI/CD passes there we can merge it (your contributions will be noted in the commit history of course).

@dexters1 Thank you!

Closing this PR as the feature will be merged as part of the test-audio-image-transcription branch PR to dev and will be part of the next Cognee release.
Description
Add support for transcribe image and audio transcription for gemini, anthropic, mistral and ollama.
Type of Change
Pre-submission Checklist
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.
Implementation Questions for Review
I need guidance on the following architectural decisions:
Video Transcription: For video transcription, what is the intended approach?
Standardizing Return Types for Auto Transcription: The implementation currently involves model-specific methods (e.g., a distinct method for OpenAI and different approaches for Mistral). How should we establish a consistent, unified return type (e.g., a common `TranscriptionResult` object) across all models to ensure a standardized user interface?

Future Scope
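On the return-type question above: one possible shape for a unified result is a small dataclass that each adapter populates from its provider-specific response. This is a hedged sketch, not the project's actual `TranscriptionReturnType`, whose field set may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranscriptionResult:
    """Provider-agnostic transcription payload (illustrative field set)."""

    text: str                       # the transcript itself
    provider: str                   # e.g. "openai", "gemini", "mistral"
    model: str                      # model that produced the transcript
    language: Optional[str] = None  # detected language, if the provider reports one
    raw: dict = field(default_factory=dict)  # untouched provider response for debugging


def from_openai_like(response: dict, model: str) -> TranscriptionResult:
    # Each adapter would carry one small mapping function like this,
    # translating its native response into the shared shape.
    return TranscriptionResult(
        text=response.get("text", ""),
        provider="openai",
        model=model,
        language=response.get("language"),
        raw=response,
    )


result = from_openai_like({"text": "hello world", "language": "en"}, "whisper-1")
print(result.text)  # hello world
```

Callers then branch on nothing provider-specific; only the per-adapter mapping functions know the native response layouts.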
Testing Notes
I have tested core components of the new features using a separate validation script.
Running `python cognee/tests/test_library.py` failed with `LLMAPIKeyNotSetError: LLM API key is not set. (Status code: 422)`. I need help on the correct environment variable or configuration file location for setting the LLM API key to resolve this and run the full test suite.
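On the API-key question: cognee reads LLM settings from environment variables (optionally via a `.env` file). The variable names below are inferred from the error class name and common cognee examples, so verify them against the configuration docs before relying on them.

```shell
# Hypothetical variable names -- confirm against cognee's configuration docs.
export LLM_API_KEY="sk-your-key-here"   # key the adapters authenticate with
export LLM_PROVIDER="openai"            # e.g. openai, gemini, anthropic, mistral, ollama
export LLM_MODEL="gpt-4o-mini"          # model identifier for structured output
```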
New Features
Bug Fixes
Refactor