
Conversation

@dm1tryG (Contributor) commented Mar 8, 2025

Description

Add Support for ChromaDB

Summary

This PR adds support for ChromaDB as a vector database option in the Cognee application. ChromaDB is a modern, open-source embedding database designed for AI applications.

Changes

  • Created a new ChromaDBAdapter implementation for vector database operations
  • Added comprehensive test suite for ChromaDB functionality
  • Updated docker-compose.yml to include ChromaDB service
  • Modified environment configuration to support ChromaDB settings
  • Updated vector engine creation logic to support ChromaDB as an option

Technical Details

  • Implemented ChromaDBAdapter.py (347 lines) with full CRUD operations for vector data
  • Created test suite (test_chromadb.py) with 171 lines of test coverage
  • Updated vector engine creation process to dynamically select ChromaDB when configured
  • Modified settings router to accommodate new database option
  • Updated environment template with ChromaDB configuration options (illustrative snippet below)
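As a rough illustration, the new settings could look like the following in a local .env. This is a hedged sketch: the variable names follow the VECTOR_DB_* pattern that create_vector_engine consumes and should be checked against the updated .env.template, and the port matches the docker-compose mapping discussed later in this thread.

```bash
# Hypothetical .env snippet; names and values are assumptions, not the canonical template.
VECTOR_DB_PROVIDER="chromadb"
VECTOR_DB_URL="http://localhost:3002"
VECTOR_DB_KEY="your-chromadb-auth-token"
```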

Docker Changes

  • Added ChromaDB service to docker-compose.yml with appropriate configuration

This PR enhances Cognee's flexibility by providing an alternative vector database option, allowing users to choose the most appropriate database for their specific use case.

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin

Tested with the UI and the automated test suite.

Summary by CodeRabbit

  • New Features
    • Expanded vector database integration by adding support for ChromaDB, enabling enhanced data management and search functionality.
  • Tests
    • Added automated tests to validate the ChromaDB integration and related operations.
  • Chores
    • Updated configuration guidance and dependency management to include ChromaDB.
    • Provided an optional container deployment template for ChromaDB.
    • Added a new entry to ignore the .chromadb_data/ directory in version control.
    • Introduced a new GitHub Actions workflow for testing the ChromaDB integration.

coderabbitai bot (Contributor) commented Mar 8, 2025

Walkthrough

This pull request adds support for the new vector database provider "chromadb." It updates configuration files and provider options in the settings, introduces a new adapter implementation for interacting with ChromaDB, and adjusts the engine creation logic to conditionally instantiate the new adapter. Additionally, tests for ChromaDB functionality are provided along with a Docker service configuration and a new dependency declaration.

Changes

| File(s) | Change Summary |
| --- | --- |
| .env.template, .gitignore, cognee/.../get_settings_router.py | Updated configuration: expanded the provider comment to include "chromadb", revised type hints by replacing "qdrant" with "chromadb", and added an entry to ignore the .chromadb_data/ directory. |
| cognee/.../ChromaDBAdapter.py, cognee/.../create_vector_engine.py | Introduced "chromadb" support by adding a new adapter (ChromaDBAdapter) that implements the VectorDBInterface, and updated the vector engine creation function to import and instantiate this adapter when the provider is set to "chromadb". |
| cognee/.../test_chromadb.py, docker-compose.yml, pyproject.toml | Added tests for ChromaDB functionality, included a Docker service configuration for ChromaDB, and declared a new dependency (chromadb = "^0.6.0") in the project configuration. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant C as Client
    participant E as create_vector_engine
    participant A as ChromaDBAdapter

    C->>E: Request vector engine (provider="chromadb")
    E->>E: Validate provider and import "chromadb" module
    E-->>C: Raise ImportError (if module missing) or proceed
    E->>A: Instantiate ChromaDBAdapter with config parameters
    A-->>E: Return adapter instance
    E-->>C: Return ChromaDBAdapter instance
```


Suggested reviewers

  • borisarzentar
  • Vasilije1990

Poem

I’m a nimble rabbit with a coding flair,
Hopping through changes with a joyful air.
Chromadb joins the garden so bright,
New adapters and tests, a delightful sight.
In configs and code, my carrots align—
CodeRabbit cheers as our changes shine!
🐇🥕


@dm1tryG changed the title from "Support chroma" to "Add Support ChromaDB" on Mar 8, 2025
@dm1tryG changed the title from "Add Support ChromaDB" to "feat: Add Support ChromaDB" on Mar 8, 2025
@dm1tryG force-pushed the feature/support-chromadb branch from 9e6c021 to 45d3356 on March 9, 2025 at 19:50
gitguardian bot commented Mar 9, 2025

✅ There are no secrets present in this pull request anymore.

If these secrets were true positives and are still valid, we highly recommend revoking them. While these secrets were previously flagged, we no longer have a reference to the specific commits where they were detected. Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately. More information about the risks can be found here.



@dm1tryG force-pushed the feature/support-chromadb branch from 45d3356 to 0318a0f on March 9, 2025 at 19:53
@dm1tryG force-pushed the feature/support-chromadb branch from 90b23be to 8925442 on March 9, 2025 at 19:59
@dm1tryG marked this pull request as ready for review on March 9, 2025 at 20:00
@dm1tryG marked this pull request as draft on March 9, 2025 at 20:01
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (9)
pyproject.toml (1)

60-60: Consider adding ChromaDB to extras section.

The dependency is correctly added, but unlike other vector database providers (qdrant, weaviate, etc.), chromadb hasn't been added to the [tool.poetry.extras] section. This would allow users to selectively install this dependency.

```diff
 [tool.poetry.extras]
 filesystem = ["s3fs", "botocore"]
 weaviate = ["weaviate-client"]
 qdrant = ["qdrant-client"]
 neo4j = ["neo4j"]
 postgres = ["psycopg2", "pgvector", "asyncpg"]
 notebook = ["notebook", "ipykernel", "overrides", "ipywidgets", "jupyterlab", "jupyterlab_widgets", "jupyterlab-server", "jupyterlab-git"]
 langchain = ["langsmith", "langchain_text_splitters"]
 llama-index = ["llama-index-core"]
 gemini = ["google-generativeai"]
 huggingface = ["transformers"]
 ollama = ["transformers"]
 mistral = ["mistral-common"]
 deepeval = ["deepeval"]
 posthog = ["posthog"]
 falkordb = ["falkordb"]
 groq = ["groq"]
 milvus = ["pymilvus"]
 docs = ["unstructured"]
 codegraph = ["fastembed", "transformers", "tree-sitter", "tree-sitter-python"]
 evals = ["plotly", "gdown"]
 gui = ["pyside6", "qasync"]
 graphiti = ["graphiti-core"]
+chromadb = ["chromadb"]
```
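With the extra in place, users could then opt in at install time; for example (assuming the package and extra names above):

```bash
pip install "cognee[chromadb]"
# or, when working inside the repository with Poetry:
poetry install -E chromadb
```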
cognee/infrastructure/databases/vector/create_vector_engine.py (3)

84-99: ChromaDB implementation looks good but has extra whitespace.

The ChromaDB integration follows the established pattern of other vector database providers and includes proper error handling. However, there's an extra blank line on line 84 that should be removed for consistency with the rest of the file.

    elif vector_db_provider == "falkordb":
        if not (vector_db_url and vector_db_port):
            raise EnvironmentError("Missing requred FalkorDB credentials!")

        from ..hybrid.falkordb.FalkorDBAdapter import FalkorDBAdapter

        return FalkorDBAdapter(
            database_url=vector_db_url,
            database_port=vector_db_port,
            embedding_engine=embedding_engine,
        )
-        
    elif vector_db_provider == "chromadb":
        try:
            import chromadb
        except ImportError:
            raise ImportError(
                "ChromaDB is not installed. Please install it with 'pip install chromadb'"
            )
            
        from .chromadb.ChromaDBAdapter import ChromaDBAdapter

        return ChromaDBAdapter(
            url=vector_db_url,
            api_key=vector_db_key,
            embedding_engine=embedding_engine,
        )

85-91: Consider adding credential validation.

Unlike other database providers like Weaviate and Qdrant, the ChromaDB implementation doesn't validate if the required credentials are provided. Consider adding credential validation for consistency.

    elif vector_db_provider == "chromadb":
+       if not vector_db_url:
+           raise EnvironmentError("Missing required ChromaDB credentials!")
+
        try:
            import chromadb
        except ImportError:
            raise ImportError(
                "ChromaDB is not installed. Please install it with 'pip install chromadb'"
            )

89-91: Typo in existing error messages.

Other error messages in this file misspell "required" as "requred" (see the FalkorDB message above). The new ChromaDB error message is fine as written; rather than matching the existing misspelling for consistency, it would be better to fix the pre-existing typos, ideally in a separate PR.

docker-compose.yml (3)

63-82: Commented-out chromadb service configuration looks fine.

However, note that if you plan to enable this service in the future, ensure the environment variables align with production settings and credentials (e.g., tokens, secrets).

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 82-82: trailing spaces

(trailing-spaces)


82-82: Remove trailing spaces to comply with linting.

Below is a proposed fix:

```diff
-  #      - "3002:8000"  
+  #      - "3002:8000"
```
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 82-82: trailing spaces

(trailing-spaces)


103-103: Remove trailing spaces to comply with linting.

Below is a proposed fix:

```diff
-# UNCOMMENT IF USING CHROMADB         
+# UNCOMMENT IF USING CHROMADB
```
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 103-103: trailing spaces

(trailing-spaces)

cognee/tests/test_chromadb.py (1)

153-154: Use a constant for the expected history length.

Relying on a magic number 8 can make the test brittle if the search history length changes in the future. Consider referencing a descriptive constant or deriving the expected value.

cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1)

236-238: Honor the normalized parameter or remove it.

Currently, normalization is always performed. If a user passes normalized=False, it is disregarded. You could wrap the normalization logic in a conditional or remove the parameter to avoid confusion.

```diff
-        normalized: bool = True,
 ...
-        normalized_values = normalize_distances(vector_list)
+        if normalized:
+            normalized_values = normalize_distances(vector_list)
+        else:
+            normalized_values = [result["_distance"] for result in vector_list]
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56427f2 and 8925442.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • .env.template (1 hunks)
  • .gitignore (1 hunks)
  • cognee/api/v1/settings/routers/get_settings_router.py (1 hunks)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
  • cognee/infrastructure/databases/vector/create_vector_engine.py (1 hunks)
  • cognee/tests/test_chromadb.py (1 hunks)
  • docker-compose.yml (2 hunks)
  • pyproject.toml (1 hunks)
🧰 Additional context used
🪛 YAMLlint (1.35.1)
docker-compose.yml

[error] 82-82: trailing spaces

(trailing-spaces)


[error] 103-103: trailing spaces

(trailing-spaces)

🔇 Additional comments (3)
.gitignore (1)

189-191: LGTM! Appropriate data directory exclusion.

The addition of .chromadb_data/ to .gitignore is appropriate and follows the project's pattern of excluding database data directories from version control.

.env.template (1)

33-33: LGTM! Clear environment variable documentation.

The updated comment correctly lists all supported vector database providers including the newly added "chromadb" option.

cognee/api/v1/settings/routers/get_settings_router.py (1)

30-30: Looks great!

Adding "chromadb" as a valid provider is consistent with the new adapter implementation.

@dm1tryG dm1tryG marked this pull request as ready for review March 9, 2025 20:07
@hajdul88 hajdul88 self-requested a review March 11, 2025 07:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🧹 Nitpick comments (11)
README.md (2)

11-12: Demo Link Positioning & Extra Punctuation
The repositioned demo link now appears before the "Learn more" and "Join Discord" links, which improves its visibility. However, there is an isolated period on line 12 that feels out of place. Consider removing it or incorporating it more naturally with the surrounding content.


92-92: Clarify Default Environment Variables Description
This line clearly explains that if no specific database is specified, the defaults (SQLite, LanceDB, and NetworkX) will be used. For enhanced readability, you might consider splitting this long sentence into two or formatting the options as a bulleted list, particularly as new providers like ChromaDB are now supported elsewhere in the project.

CONTRIBUTING.md (5)

1-6: Invitation Tone and Clarity Improvements
The revised welcome message and introductory paragraph set an inviting tone. One minor suggestion: in line 4, consider adding an article (e.g., “the cognee community”) for grammatical clarity.


32-36: Community Engagement Guidelines
The "Community Channels" section effectively emphasizes joining the Discord community and participating in events. A minor stylistic note: you might consider adding punctuation at the end of each bullet for consistency.


37-41: Direct Contact Information
The "Direct Contact" section provides clear methods for getting in touch. For improved usability, consider turning the email address into a clickable mailto link (e.g., <mailto:[email protected]>).


42-42: Response Time Clarity
The guideline on response times is informative. However, a small punctuation adjustment is recommended: inserting a comma after “Discord channel” (i.e., “...using our Discord channel, where the whole community can help!”) can enhance readability.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~42-~42: Possible missing comma found.
Context: ...r responses, consider using our Discord channel where the whole community can help! ##...

(AI_HYDRA_LEO_MISSING_COMMA)


128-128: Appreciative Sign-off Tone
The closing line effectively conveys gratitude and enthusiasm. Consider moderating the number of exclamation marks to maintain a professional tone if consistency is desired across the document.

🧰 Tools
🪛 LanguageTool

[style] ~128-~128: Using many exclamation marks might seem excessive (in this case: 5 exclamation marks for a text that’s 3216 characters long)
Context: ...Thank you for contributing to cognee! 🌟

(EN_EXCESSIVE_EXCLAMATION)

cognee/eval_framework/metrics_dashboard.py (1)

80-92: Improved HTML generation with dynamic table structure

The HTML generation is now more flexible as it dynamically creates table headers and rows based on the available data. This makes the code more maintainable and adaptable to changes in the metrics structure.

One potential improvement would be to add HTML escaping for user-provided content to prevent potential XSS vulnerabilities.

Consider adding HTML escaping for user-provided values:

```diff
-{"".join(f"<td>{value}</td>" for value in item.values())}
+{"".join(f"<td>{html.escape(str(value))}</td>" for value in item.values())}
```

And add the import at the top:

```diff
+import html
```
cognee/modules/retrieval/completion_retriever.py (1)

29-29: Potential limitation with multiple retrieval results

While the top_k parameter allows retrieving multiple document chunks, the get_context method still only returns the first chunk's text. Consider enhancing this method to return a concatenated or otherwise processed collection of all retrieved chunks when top_k > 1.

```diff
-        return found_chunks[0].payload["text"]
+        if self.top_k == 1:
+            return found_chunks[0].payload["text"]
+        else:
+            # Join multiple chunks with separators for better context
+            return "\n\n---\n\n".join([chunk.payload["text"] for chunk in found_chunks])
```
cognee/eval_framework/answer_generation/run_question_answering_module.py (1)

35-35: Add docstring to elaborate usage of top_k.
Consider explaining the effect of passing None vs a positive integer for top_k, so that maintainers can easily understand how to configure the function.

cognee/eval_framework/evaluation/deep_eval_adapter.py (1)

37-37: Consider using an empty list instead of None
To maintain consistency, you may want to store an empty list if golden_context is absent, so that the metric is always dealing with list inputs.

🛑 Comments failed to post (1)
cognee/eval_framework/evaluation/metrics/context_coverage.py (1)

8-33: 💡 Verification agent

🧩 Analysis chain

Handle potential empty context or retrieval_context
If test_case.context or test_case.retrieval_context is empty, indexing [0] will raise an error. Please ensure upstream code always populates these lists, or consider adding a defensive check.


🏁 Script executed:

```bash
#!/bin/bash
# Attempt to locate usage of ContextCoverageMetric and see if context is guaranteed to be non-empty
rg -A 5 "ContextCoverageMetric"
```

Length of output: 1634


Attention: Defensive Check Required for Empty Context Lists

The current implementation in ContextCoverageMetric.measure() assumes that both test_case.context and test_case.retrieval_context contain at least one element. However, if either list is empty, accessing [0] will raise an IndexError. Please either:

  • Ensure upstream guarantees non-empty lists: Confirm that all callers of this metric always provide non-empty context and retrieval_context, or

  • Add a defensive guard in the metric: Introduce a validation check at the beginning of the measure method to handle cases when these lists are empty, for example:

```python
if not test_case.context or not test_case.retrieval_context:
    raise ValueError("Test case context or retrieval_context cannot be empty.")
```

Review the surrounding code to decide which approach best fits your error handling strategy.

@hajdul88 (Collaborator) left a comment

First of all, thanks for the contribution, it's a solid implementation. I've opened a new PR to run our tests; you can follow it here:

#629

I'm fixing the Ruff issues there and will probably add your test to GA. I also added some minor comments—nothing crucial, just small things.

```yaml
# chromadb:
#   image: chromadb/chroma:0.6.3
#   container_name: chromadb
#   profiles:
```

Member left a comment:

You don't have to comment this out. The profiles config ensures the chromadb service starts only when the chromadb profile is enabled at compose time: docker compose --profile chromadb up
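For illustration, a minimal sketch of the service using profiles (image, container name, and port taken from this PR's configuration; everything else is assumed):

```yaml
services:
  chromadb:
    image: chromadb/chroma:0.6.3
    container_name: chromadb
    profiles:
      - chromadb   # started only via: docker compose --profile chromadb up
    ports:
      - "3002:8000"
```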

langdetect = "1.0.9"
posthog = {version = "^3.5.0", optional = true}
lancedb = "0.16.0"
chromadb = "^0.6.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add chromadb as an optional dependency that is installed via a chromadb extra.
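A sketch of what that could look like in pyproject.toml, mirroring the posthog declaration quoted above (the version pin is the one from this PR):

```toml
chromadb = {version = "^0.6.0", optional = true}

[tool.poetry.extras]
chromadb = ["chromadb"]
```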

```python
history = await cognee.get_search_history()
assert len(history) == 8, "Search history is not correct."

results = await brute_force_triplet_search("What is a quantum computer?")
```

Member left a comment:
You probably don't need this one; it is already exercised by the other searches.

```python
results = await brute_force_triplet_search("What is a quantum computer?")
assert len(results) > 0

await test_local_file_deletion(text, explanation_file_path)
```

Member left a comment:
This is probably also not needed; it is covered by other tests.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (4)

31-47: Consider validated serialization or sanitized storage.

Storing lists and dictionaries as JSON strings is a practical approach, but be mindful of security implications (e.g., malicious JSON content placed in the dictionary or list). It may be prudent to validate or sanitize data before serialization, to mitigate downstream parsing issues.
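For reference, a minimal sketch of the suffix-based round-trip under discussion (simplified from the adapter; the __dict/__list key scheme and the 6-character suffix length are taken from this review, and any validation or sanitization hook would still need to be added):

```python
import json

def process_data_for_chroma(data: dict) -> dict:
    """Flatten nested values: Chroma metadata holds scalars, so dicts and
    lists are serialized to JSON strings under a suffixed key."""
    processed = {}
    for key, value in data.items():
        if isinstance(value, dict):
            processed[f"{key}__dict"] = json.dumps(value)
        elif isinstance(value, list):
            processed[f"{key}__list"] = json.dumps(value)
        else:
            processed[key] = value
    return processed

def restore_data_from_chroma(metadata: dict) -> dict:
    """Invert the flattening; both suffixes are exactly 6 characters long."""
    restored = {}
    for key, value in metadata.items():
        if key.endswith(("__dict", "__list")):
            restored[key[:-6]] = json.loads(value)  # strip the 6-char suffix
        else:
            restored[key] = value
    return restored
```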


200-200: Offer option for more comprehensive search.

The code caps the results at 100 (n_results=100). Depending on the dataset size and use case, this could omit valuable global matches. Consider making this value configurable or removing the cap entirely if your dataset and performance profile permit it.


316-317: Restore metadata for consistent data structure.

This code returns raw metadata instead of running it through restore_data_from_chroma. Consider aligning it with the approach used in other methods for uniform data structures across the interface.

```diff
-    payload=metadata,
+    payload=restore_data_from_chroma(metadata),
```

344-347: Rename method for clarity.

Based on previous review comments, you may rename get_table_names to get_collection_names to align with ChromaDB’s domain model and improve clarity.

```diff
-    async def get_table_names(self):
+    async def get_collection_names(self):
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3021f1a and b28624c.

📒 Files selected for processing (1)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
🔇 Additional comments (1)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1)

66-77: Confirmed: Removing 6 characters is correct.

The suffix __dict or __list is exactly 6 characters. The slicing operation key[:-6] is appropriate and aligns with the previous feedback. No further changes needed.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (2)

183-231: Potential performance concern with large collections.

When computing distances for all elements, the code retrieves the entire collection (n_results=collection_count). For very large collections, this could cause memory or performance issues. Consider setting a reasonable limit or batching the retrieval if you're working with potentially large datasets.

You might implement a paginated approach or a maximum threshold to limit the total number of items processed at once.
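One possible shape for such a cap, sketched under the assumption that collection and query_vector are as in the adapter's get_distance_from_collection_elements (the constant name and its value are illustrative, not part of this PR):

```python
MAX_SCAN_RESULTS = 1_000  # assumed ceiling; tune to the deployment

async def query_with_cap(collection, query_vector):
    # Never ask Chroma for more rows than the cap, even on huge collections.
    collection_count = await collection.count()
    return await collection.query(
        query_embeddings=[query_vector],
        include=["metadatas", "distances"],
        n_results=min(collection_count, MAX_SCAN_RESULTS),
    )
```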


346-349: Rename method to reflect collections rather than tables.

The function is named get_table_names but returns collection names. The rest of the code also refers to “collections,” so this can cause confusion.

```diff
-    async def get_table_names(self):
+    async def get_collection_names(self):
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b28624c and a299bd6.

📒 Files selected for processing (1)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
🔇 Additional comments (2)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (2)

159-166: Handle potential missing 'index_fields' key in metadata.

Here, the code uses data_point.metadata["index_fields"] directly. If the "index_fields" key is missing or null in the metadata, this will cause a KeyError. Consider adding an existence check or a default value before accessing "index_fields".

Can you confirm that every DataPoint in your application includes "index_fields" in the metadata attribute? If not, please ensure to handle or validate this field to avoid exceptions.


314-314: Unify distance normalization approach.

Here, distance is converted to similarity using similarity = 1.0 - min(distance, 2.0) / 2.0, which differs from the normalize_distances(...) approach used elsewhere (e.g., in search). For consistency and clarity, consider using a single normalization method across all queries.
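A sketch of one way to unify this, assuming the 1 - d/2 mapping (sound for cosine distance, which ranges over [0, 2]) is the intended semantics for both code paths:

```python
def cosine_distance_to_score(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a similarity score in [0, 1].

    Calling one shared helper from both search and batch_search keeps
    scores comparable across the adapter's query paths.
    """
    return 1.0 - min(distance, 2.0) / 2.0
```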

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
docker-compose.yml (1)

81-81: Remove trailing whitespace.

There is a trailing whitespace on this line that should be removed to fix the YAMLlint warning.

```diff
-  
+
```
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 81-81: trailing spaces

(trailing-spaces)

cognee/tests/test_chromadb.py (1)

69-162: Consider restructuring the main function for better test organization.

The main function contains numerous assertions and test scenarios mixed together. Consider refactoring this into separate, focused test functions for each aspect being tested:

  1. Data addition
  2. Document retrieval
  3. Different search types (INSIGHTS, CHUNKS, etc.)
  4. Search history
  5. Data cleanup

This would improve test readability, maintainability, and make failures easier to diagnose.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a299bd6 and eae2f76.

📒 Files selected for processing (2)
  • cognee/tests/test_chromadb.py (1 hunks)
  • docker-compose.yml (2 hunks)
🧰 Additional context used
🪛 YAMLlint (1.35.1)
docker-compose.yml

[error] 81-81: trailing spaces

(trailing-spaces)

🔇 Additional comments (9)
docker-compose.yml (2)

64-80: Good implementation of the ChromaDB service.

The service configuration includes all necessary components:

  • Specific versioning (0.6.3)
  • Appropriate authentication setup with environment variables
  • Proper network configuration
  • Profile configuration for optional startup
  • Volume mapping for data persistence
  • Port exposure (3002:8000)

103-103: Good addition of the ChromaDB volume.

The volume declaration for chromadb_data has been properly added to support the new ChromaDB service.

cognee/tests/test_chromadb.py (7)

122-124: Remove print statements from test.

These print statements should be removed as they're not necessary for automated tests and were noted in a previous review.

```diff
-    print("\n\nExtracted sentences are:\n")
-    for result in search_results:
-        print(f"{result}\n")
```

130-132: Remove print statements from test.

Print statements should be removed from automated tests.

```diff
-    print("\n\nExtracted chunks are:\n")
-    for result in search_results:
-        print(f"{result}\n")
```

140-141: Remove print statements from test.

Print statements should be removed from automated tests.

```diff
-    print("Completion result is:")
-    print(graph_completion)
```

147-149: Remove print statements from test.

Print statements should be removed from automated tests.

```diff
-    print("\n\nExtracted summaries are:\n")
-    for result in search_results:
-        print(f"{result}\n")
```

154-154: Consider removing redundant test call.

As noted in a previous review, this test call might be redundant as it's likely covered by other tests.

```diff
-    await test_local_file_deletion(text, explanation_file_path)
```

14-49: Good implementation of file deletion tests.

The test function thoroughly validates both cases:

  • Files created by cognee are properly deleted
  • Files not created by cognee are preserved

This ensures the ChromaDB adapter correctly handles file system operations during data deletion.


51-66: Good implementation of document retrieval tests.

The test function properly verifies document retrieval functionality:

  • With specific dataset filtering
  • Without dataset filtering (retrieving all documents)

This ensures the ChromaDB adapter correctly integrates with user permissions.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (3)

31-83: Ensure robust handling of nested or complex data structures.

The functions process_data_for_chroma and restore_data_from_chroma perform JSON serialization for dicts and lists. Deeply nested structures will still be serialized to strings without further recursion checks. If this is intended, make a note. Otherwise, consider recursively processing nested objects for more precise control.


183-231: Be mindful of performance with large collections.

get_distance_from_collection_elements sets n_results to the entire collection size (collection_count). This loads all items, which may be expensive for large collections. Consider pagination or specifying a practical limit to prevent potential performance bottlenecks.


359-362: Rename method to align with ChromaDB naming.

The method name get_table_names might be confusing in the context of ChromaDB, as collections are the relevant concept. Consider renaming it to get_collection_names for clarity, unless it must conform to an interface requirement.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eae2f76 and dc73728.

📒 Files selected for processing (1)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
🔇 Additional comments (2)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (2)

21-29: Consider validating the text and metadata fields.

While IndexSchema explicitly expects a text field and sets a default metadata value with index_fields, there's no check to ensure the field in index_fields actually exists or is non-empty. If a data point lacks the required fields, it could trigger runtime errors. Consider adding validation or fallback behavior.


153-167: Guard against missing index_fields in metadata.

In index_data_points, you call
text=getattr(data_point, data_point.metadata["index_fields"][0]).
If metadata["index_fields"] is absent or has fewer elements, this code will raise KeyError or IndexError. Consider safely checking for the key or defaulting to a known attribute.
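A defensive variant of that lookup might look like the following sketch (the helper name and the choice to raise rather than fall back are assumptions for the maintainers to settle):

```python
def extract_index_text(data_point) -> str:
    # Guard against a missing or empty "index_fields" entry before indexing.
    index_fields = (data_point.metadata or {}).get("index_fields") or []
    if not index_fields:
        raise ValueError(f"DataPoint {data_point.id} defines no index_fields")
    return getattr(data_point, index_fields[0])
```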

@dm1tryG (Contributor, Author) commented Mar 12, 2025

@borisarzentar @hajdul88 I have addressed your comments, please take a look again.

@hajdul88 (Collaborator) commented:
OK from my side; I didn't check @borisarzentar's comments.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
cognee/tests/test_chromadb.py (1)

70-70: 🛠️ Refactor suggestion

Remove debug print statement

Remove the print statement that outputs the configuration.

```diff
-    print(cognee.config)
```
🧹 Nitpick comments (7)
docker-compose.yml (2)

82-82: Remove trailing whitespace

There is trailing whitespace on this line that should be removed.

```diff
-  
+
```
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 82-82: trailing spaces

(trailing-spaces)


103-105: Remove trailing whitespace and uncomment volumes section

There is trailing whitespace on line 103. Also, consistent with the service configuration, you should uncomment the volumes section since it will only be used when the ChromaDB profile is activated.

```diff
-# UNCOMMENT IF USING CHROMADB         
-# volumes:
-#  chromadb_data:
+# UNCOMMENT IF USING CHROMADB
+volumes:
+  chromadb_data:
```
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 103-103: trailing spaces

(trailing-spaces)

cognee/tests/test_chromadb.py (4)

128-130: Remove or disable debug print statements

These print statements are useful for debugging but should be removed or disabled in the test file. Consider using logging at the debug level instead.

Also applies to: 136-138, 146-147, 153-155


47-48: Fix assertion error message

The error message states "Data location doesn't exists" when it should be "Data location still exists".

```diff
-        assert os.path.exists(data.raw_data_location), (
-            f"Data location doesn't exists: {data.raw_data_location}"
+        assert os.path.exists(data.raw_data_location), (
+            f"Data location still exists: {data.raw_data_location}"
```

14-49: Test logic looks good, but consider breaking down into smaller test functions

The test_local_file_deletion function performs two distinct tests:

  1. Deleting data created by cognee (and verifying files are removed)
  2. Deleting data not created by cognee (and verifying files remain)

Consider separating these into two distinct test functions for better clarity and to follow the single responsibility principle.


69-171: Main test function is comprehensive but consider restructuring

The main function serves as a comprehensive integration test but mixes test setup, execution, assertions, and cleanup. Consider refactoring this to follow a more standard test structure:

  1. Setup (configure database, create test data)
  2. Test execution (various search operations)
  3. Assertions (verify results)
  4. Cleanup (prune data)

This would make the test more maintainable and easier to understand.

cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1)

225-233: Ensure the normalized parameter is honored.

Currently, the search method always normalizes distances without conditionally checking the normalized parameter. Consider introducing logic to skip normalization if normalized is False, ensuring users retain explicit control over how distances are handled.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc73728 and 3362e6d.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • .github/workflows/test_chromadb.yml (1 hunks)
  • cognee/api/v1/settings/routers/get_settings_router.py (1 hunks)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
  • cognee/infrastructure/databases/vector/create_vector_engine.py (1 hunks)
  • cognee/tests/test_chromadb.py (1 hunks)
  • docker-compose.yml (2 hunks)
  • pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pyproject.toml
  • cognee/api/v1/settings/routers/get_settings_router.py
🧰 Additional context used
🪛 YAMLlint (1.35.1)
docker-compose.yml

[error] 82-82: trailing spaces

(trailing-spaces)


[error] 103-103: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: run_simple_example_test
🔇 Additional comments (9)
cognee/infrastructure/databases/vector/create_vector_engine.py (1)

85-100: Implementation for ChromaDB looks good!

The implementation follows the same pattern used for other vector database providers, which makes it consistent and maintainable. The error message if ChromaDB is not installed is helpful and provides clear instructions for installation.

.github/workflows/test_chromadb.yml (1)

1-64: Well-structured CI workflow for ChromaDB testing

The workflow is properly configured with:

  • Appropriate triggers (manual and pull request events)
  • Service container for ChromaDB with the correct image version
  • Required environment variables for testing
  • Proper dependency setup with Poetry

This will provide good test coverage for the new ChromaDB integration.

docker-compose.yml (1)

64-81: Uncomment the ChromaDB service instead of using a comment directive

As mentioned in previous review comments, you don't need to comment out this ChromaDB service configuration. The profiles configuration ensures that it will only start when explicitly requested with docker compose --profile chromadb up.

cognee/tests/test_chromadb.py (3)

160-161: Remove redundant test

This test for brute_force_triplet_search is likely redundant as it's used in other search tests.


163-163: Consider relocating test function call

This test is likely redundant as it's tested elsewhere. If it's necessary to include it here, consider adding a comment explaining why it's being tested again in this context.


51-67: Test logic for document retrieval looks good

The test_getting_of_documents function properly tests document retrieval both with and without a dataset filter.

cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (3)

67-67: Likely off-by-one error when removing '__dict' / '__list' suffixes.

These lines remove 7 characters instead of 6, whereas the suffix (__dict or __list) is only 6 characters long. This may cause a truncated character and potentially break the restoration of original keys. Note that a similar comment has already been raised in past reviews.

Also applies to: 76-76


287-330: Unify distance normalization between batch_search and search.

batch_search transforms the raw distance into a similarity measure (1.0 - min(distance, 2.0) / 2.0), whereas search relies on normalize_distances(...). This disparity in approaches may lead to inconsistent scoring across methods. A refactoring suggestion to unify or deduplicate distance handling logic has been raised previously.


347-347: Use a more descriptive method name.

As suggested in prior comments, consider renaming get_table_names to get_collection_names for clarity and consistency with ChromaDB terminology.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (2)

67-76: ⚠️ Potential issue

Suffix removal off by one.
For keys ending with __dict or __list, you should remove only the last 6 characters (the substring __dict or __list). Removing 7 characters will truncate the preceding character in the key, leading to incorrect key restoration.

Apply this fix:

```diff
-        original_key = key[:-7]  # Remove '__dict' suffix
+        original_key = key[:-6]  # Remove '__dict' suffix

-        original_key = key[:-7]  # Remove '__list' suffix
+        original_key = key[:-6]  # Remove '__list' suffix
```

319-319: 🛠️ Refactor suggestion

Restore metadata consistently.
Unlike in search and get_distance_from_collection_elements, this line directly assigns metadata to payload without calling restore_data_from_chroma(). This causes inconsistent payload formats across methods.

Apply this fix:

```diff
-                    payload=metadata,
+                    payload=restore_data_from_chroma(metadata),
```
🧹 Nitpick comments (1)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1)

225-286: Maintain consistent distance normalization.
This method uses normalize_distances, but elsewhere (e.g., in batch_search) you calculate similarity differently with a fixed formula. Either unify the approach with a single normalization function or document the rationale for having two methods.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3362e6d and 9954a09.

📒 Files selected for processing (2)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
  • cognee/tests/test_chromadb.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cognee/tests/test_chromadb.py
🔇 Additional comments (3)
cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (3)

1-19: Solid foundational setup and imports.
The overall structure, including logging configuration and import statements, provides a clear starting point for this adapter. No immediate issues observed in this snippet.


178-224: Confirm 100 as the default result limit.
Here, n_results=100 may be a safe default for typical searches, but large collections or pipeline usage could need more flexibility. Consider making this value configurable or ensuring it meets all use cases.


315-315: Use a shared normalization function for clarity.
Here, similarity = 1.0 - min(distance, 2.0)/2.0 diverges from the main normalization strategy in search. Adopting a consistent approach for all distance calculations simplifies maintenance and reduces potential confusion about how scores are derived.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

🧹 Nitpick comments (4)
cognee/tests/test_chromadb.py (4)

14-49: Fix grammatical issue in assertion message for file existence.
In the assertion at line 47, consider updating the message for clarity. The correct grammar would be "Data location doesn't exist" instead of "doesn't exists."

- f"Data location doesn't exists: {data.raw_data_location}"
+ f"Data location doesn't exist: {data.raw_data_location}"

51-67: Consider additional negative test scenarios.
Currently, the test only checks for the presence of documents (1 or 2). It might be beneficial to add negative tests where no documents exist or invalid dataset names are provided. This would help ensure robust error handling and guard against unexpected data states.


69-164: Refine the large integration test for maintainability.
The main function is quite extensive, testing multiple functionalities end-to-end. Consider splitting it into smaller, more focused test functions or employing parameterized tests to make debugging and maintenance easier.


166-169: Rely on your test runner instead of a manual entry point.
Including a main() entry point for an async test may not be necessary if you are already using an automated test runner like Pytest or unittest. Removing it helps standardize how tests are executed and reported across the project.

🧰 Tools
🪛 GitHub Actions: test | chromadb

[error] 169-169: KeyError: 'Could not automatically map to a tokeniser. Please use tiktoken.get_encoding to explicitly get the tokeniser you expect.'
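A minimal way to hand the coroutine to the test runner instead, assuming the pytest-asyncio plugin is available and main is this module's existing async entry point:

```python
import pytest

from cognee.tests.test_chromadb import main  # the existing async test body

@pytest.mark.asyncio
async def test_chromadb_end_to_end():
    await main()
```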

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9954a09 and 990a04c.

📒 Files selected for processing (3)
  • cognee/infrastructure/databases/vector/chromadb/ChromaDBAdapter.py (1 hunks)
  • cognee/tests/test_chromadb.py (1 hunks)
  • docker-compose.yml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docker-compose.yml
🧰 Additional context used
🪛 GitHub Actions: test | chromadb
cognee/tests/test_chromadb.py

[error] 169-169: KeyError: 'Could not automatically map to a tokeniser. Please use tiktoken.get_encoding to explicitly get the tokeniser you expect.'


[warning] 1-1: Ontology file 'None' not found. Using fallback ontology at http://example.org/empty_ontology


[warning] 1-1: File /home/runner/work/cognee/cognee/cognee/tests/.cognee_system/test_chromadb/databases/cognee.graph not found. Initializing an empty graph.

⏰ Context from checks skipped due to timeout of 90000ms (17)
  • GitHub Check: test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on macos-13
  • GitHub Check: Test on ubuntu-22.04
  • GitHub Check: run_eval_framework_test / test
  • GitHub Check: Test on macos-15
  • GitHub Check: run_notebook_test / test
  • GitHub Check: Test on macos-13
  • GitHub Check: windows-latest
  • GitHub Check: run_notebook_test / test
  • GitHub Check: test
  • GitHub Check: docker-compose-test
  • GitHub Check: run_notebook_test / test
  • GitHub Check: run_simple_example_test / test
  • GitHub Check: Test on macos-15
  • GitHub Check: Test on macos-13
  • GitHub Check: run_simple_example_test

Comment on lines +192 to +199
```python
collection_count = await collection.count()

results = await collection.query(
    query_embeddings=[query_vector],
    include=["metadatas", "distances"],
    n_results=collection_count,
)
```

coderabbitai bot (Contributor) commented:

🛠️ Refactor suggestion

Investigate potential performance impact of retrieving all elements.
Using collection.count() to set n_results=collection_count can lead to very large queries for large collections. Consider adding a limit or a user-configurable parameter to avoid performance bottlenecks when only partial data is needed.

Comment on lines +236 to +289
```python
    normalized: bool = True,
):
    """Search for similar items in a collection using text or vector query."""
    if query_text is None and query_vector is None:
        raise InvalidValueError(message="One of query_text or query_vector must be provided!")

    if query_text and not query_vector:
        query_vector = (await self.embedding_engine.embed_text([query_text]))[0]

    try:
        client = await self.get_connection()
        collection = await client.get_collection(collection_name)

        results = await collection.query(
            query_embeddings=[query_vector],
            include=["metadatas", "distances", "embeddings"]
            if with_vector
            else ["metadatas", "distances"],
            n_results=limit,
        )

        vector_list = []
        for i, (id, metadata, distance) in enumerate(
            zip(results["ids"][0], results["metadatas"][0], results["distances"][0])
        ):
            item = {
                "id": parse_id(id),
                "payload": restore_data_from_chroma(metadata),
                "_distance": distance,
            }

            if with_vector and "embeddings" in results:
                item["vector"] = results["embeddings"][0][i]

            vector_list.append(item)

        # Normalize vector distance
        normalized_values = normalize_distances(vector_list)
        for i in range(len(normalized_values)):
            vector_list[i]["score"] = normalized_values[i]

        # Create and return ScoredResult objects
        return [
            ScoredResult(
                id=row["id"],
                payload=row["payload"],
                score=row["score"],
                vector=row.get("vector") if with_vector else None,
            )
            for row in vector_list
        ]
    except Exception as e:
        logger.error(f"Error in search: {str(e)}")
        return []
```
coderabbitai bot (Contributor) commented:

🛠️ Refactor suggestion

Honor the normalized parameter conditionally.
Although the function signature includes a normalized flag, the code always normalizes distances (lines 272–275) regardless of the flag’s value. If you intend to allow unnormalized distances, wrap the normalization step in a conditional check.

```python
if normalized:
    normalized_values = normalize_distances(vector_list)
    for i in range(len(normalized_values)):
        vector_list[i]["score"] = normalized_values[i]
else:
    for i in range(len(vector_list)):
        # Use raw distance or compute an alternative measure
        vector_list[i]["score"] = 1.0 - vector_list[i]["_distance"]  # Example
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

```python
    normalized: bool = True,
):
    """Search for similar items in a collection using text or vector query."""
    if query_text is None and query_vector is None:
        raise InvalidValueError(message="One of query_text or query_vector must be provided!")

    if query_text and not query_vector:
        query_vector = (await self.embedding_engine.embed_text([query_text]))[0]

    try:
        client = await self.get_connection()
        collection = await client.get_collection(collection_name)

        results = await collection.query(
            query_embeddings=[query_vector],
            include=["metadatas", "distances", "embeddings"]
            if with_vector
            else ["metadatas", "distances"],
            n_results=limit,
        )

        vector_list = []
        for i, (id, metadata, distance) in enumerate(
            zip(results["ids"][0], results["metadatas"][0], results["distances"][0])
        ):
            item = {
                "id": parse_id(id),
                "payload": restore_data_from_chroma(metadata),
                "_distance": distance,
            }

            if with_vector and "embeddings" in results:
                item["vector"] = results["embeddings"][0][i]

            vector_list.append(item)

        # Normalize vector distance
        if normalized:
            normalized_values = normalize_distances(vector_list)
            for i in range(len(normalized_values)):
                vector_list[i]["score"] = normalized_values[i]
        else:
            for i in range(len(vector_list)):
                # Use raw distance or compute an alternative measure
                vector_list[i]["score"] = 1.0 - vector_list[i]["_distance"]  # Example

        # Create and return ScoredResult objects
        return [
            ScoredResult(
                id=row["id"],
                payload=row["payload"],
                score=row["score"],
                vector=row.get("vector") if with_vector else None,
            )
            for row in vector_list
        ]
    except Exception as e:
        logger.error(f"Error in search: {str(e)}")
        return []
```

@borisarzentar merged commit e147fa5 into topoteretes:dev on Mar 13, 2025 (12 of 69 checks passed).