Update OllamaEmbeddingEngine.py #1330
Conversation
Compatible with private embedding models (OpenAI format)
Please make sure all the checkboxes are checked:
Walkthrough

Adds an "input" field mirroring the prompt to the JSON payload sent by OllamaEmbeddingEngine._get_embedding. No other logic, control flow, or public interfaces were changed.
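For context, here is a minimal sketch of the request body after this change; the model name and prompt are placeholders, and the per-key comments reflect common Ollama/OpenAI conventions rather than anything asserted in the diff:

```python
# Hedged sketch of the payload after this PR (placeholder values, not from the diff).
model = "nomic-embed-text"
prompt = "cognee builds knowledge graphs"
payload = {
    "model": model,
    "prompt": prompt,  # key conventionally read by Ollama's /api/embeddings
    "input": prompt,   # key conventionally read by OpenAI-style /v1/embeddings
}
```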
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
Hello @weikao, thank you for submitting a PR! We will respond as soon as possible.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py (2)
107-112: Harden HTTP handling and support both Ollama and OpenAI response formats.

Right now non-2xx responses and OpenAI-style responses will raise. Add a status check and a dual-path parse.
Apply:
```diff
         async with aiohttp.ClientSession() as session:
             async with session.post(
                 self.endpoint, json=payload, headers=headers, timeout=60.0
             ) as response:
-                data = await response.json()
-                return data["embedding"]
+                response.raise_for_status()
+                data = await response.json()
+                # Ollama format: {"embedding": [...]}
+                if "embedding" in data:
+                    return data["embedding"]
+                # OpenAI format: {"data": [{"embedding": [...], "index": 0}], ...}
+                if isinstance(data.get("data"), list) and data["data"]:
+                    item = data["data"][0]
+                    if isinstance(item, dict) and "embedding" in item:
+                        return item["embedding"]
+                raise KeyError("No embedding found in response payload")
```
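To illustrate the dual-path parse suggested above, a self-contained sketch (sample vectors are made up; the helper name `extract_embedding` is hypothetical):

```python
def extract_embedding(data: dict) -> list:
    """Pull an embedding out of either an Ollama- or OpenAI-shaped response."""
    if "embedding" in data:  # Ollama format: {"embedding": [...]}
        return data["embedding"]
    items = data.get("data")
    if isinstance(items, list) and items:  # OpenAI format: {"data": [{"embedding": [...]}]}
        item = items[0]
        if isinstance(item, dict) and "embedding" in item:
            return item["embedding"]
    raise KeyError("No embedding found in response payload")

print(extract_embedding({"embedding": [0.1, 0.2]}))                           # Ollama style
print(extract_embedding({"data": [{"embedding": [0.1, 0.2], "index": 0}]}))   # OpenAI style
```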
47-47: Wire MAX_RETRIES into the retry decorator.

Change the decorator on line 92 to pass the constant, e.g.:

```diff
-    @embedding_sleep_and_retry_async()
+    @embedding_sleep_and_retry_async(max_retries=MAX_RETRIES)
     async def _get_embedding(self, prompt: str) -> List[float]:
```

This ensures the `MAX_RETRIES` constant is actually used.
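For readers unfamiliar with the decorator, here is a hypothetical sketch of its likely shape, showing why `max_retries` must be supplied at decoration time; cognee's actual `embedding_sleep_and_retry_async` may differ:

```python
import asyncio
import functools

def embedding_sleep_and_retry_async(max_retries: int = 5, base_delay: float = 1.0):
    """Assumed shape of the retry decorator; not cognee's real implementation."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return await fn(*args, **kwargs)
                except Exception:
                    if attempt == max_retries - 1:
                        raise  # out of retries: surface the last error
                    await asyncio.sleep(base_delay * 2**attempt)  # exponential backoff
        return wrapper
    return decorator
```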
🧹 Nitpick comments (3)
cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py (3)
97-101: Use only "input" for embeddings; drop "prompt" to avoid ambiguity.

Ollama and OpenAI embeddings expect "input". Keeping both "prompt" and "input" is redundant and may confuse proxy services or schema validators.
Apply:
```diff
-        payload = {
-            "model": self.model,
-            "prompt": prompt,
-            "input": prompt
-        }
+        payload = {
+            "model": self.model,
+            "input": prompt
+        }
```
89-90: Reuse a single aiohttp session per batch to cut connection overhead.

Each prompt opens a new session; reuse one session across the gather to reduce latency and socket churn.
Apply:
```diff
-        embeddings = await asyncio.gather(*[self._get_embedding(prompt) for prompt in text])
+        async with aiohttp.ClientSession() as session:
+            embeddings = await asyncio.gather(
+                *[self._get_embedding(prompt, session=session) for prompt in text]
+            )
         return embeddings
@@
-    async def _get_embedding(self, prompt: str) -> List[float]:
+    async def _get_embedding(self, prompt: str, session: Optional[aiohttp.ClientSession] = None) -> List[float]:
@@
-        async with aiohttp.ClientSession() as session:
-            async with session.post(
+        _session = session or aiohttp.ClientSession()
+        try:
+            async with _session.post(
                 self.endpoint, json=payload, headers=headers, timeout=60.0
             ) as response:
                 ...
+        finally:
+            if session is None:
+                await _session.close()
```

Also applies to: 92-93, 107-108
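A standalone sketch of the session-reuse pattern, assuming an Ollama-style server at http://localhost:11434 (endpoint and model name are assumptions, not from the PR):

```python
import asyncio
import aiohttp

async def fetch_embedding(session: aiohttp.ClientSession, url: str, prompt: str) -> list:
    # Send both keys for Ollama/OpenAI compatibility, as this PR does.
    payload = {"model": "nomic-embed-text", "prompt": prompt, "input": prompt}
    async with session.post(url, json=payload) as resp:
        resp.raise_for_status()
        return (await resp.json())["embedding"]  # assumes an Ollama-shaped response

async def main() -> None:
    url = "http://localhost:11434/api/embeddings"  # hypothetical local endpoint
    prompts = ["alpha", "beta", "gamma"]
    async with aiohttp.ClientSession() as session:  # one session shared across the batch
        embeddings = await asyncio.gather(*(fetch_embedding(session, url, p) for p in prompts))
    print([len(e) for e in embeddings])

if __name__ == "__main__":
    asyncio.run(main())
```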
64-68: Simplify env parsing; remove dead isinstance check.

`os.getenv` returns str; the isinstance branch never runs.

Apply:

```diff
-        enable_mocking = os.getenv("MOCK_EMBEDDING", "false")
-        if isinstance(enable_mocking, bool):
-            enable_mocking = str(enable_mocking).lower()
-        self.mock = enable_mocking in ("true", "1", "yes")
+        enable_mocking = os.getenv("MOCK_EMBEDDING", "false").lower()
+        self.mock = enable_mocking in ("true", "1", "yes")
```
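A quick check confirming the reasoning, runnable as-is:

```python
import os

os.environ["MOCK_EMBEDDING"] = "TRUE"
value = os.getenv("MOCK_EMBEDDING", "false")
print(type(value))                            # <class 'str'> — never bool
print(value.lower() in ("true", "1", "yes"))  # True
```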
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py (1 hunks)
@weikao what is the purpose of this PR and what problem does it solve?
Compatible with private embedding models (OpenAI format)
Description
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.