@dexters1 dexters1 commented Dec 1, 2025

Description

Merge main changes into dev branch

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify):

Screenshots/Videos (if applicable)

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

Summary by CodeRabbit

  • Documentation

    • Overhauled README: renamed tagline, clarified product positioning, reorganized Get Started into Open Source and Cloud paths, streamlined Quickstart, refreshed demos, navigation, and citation/contributing sections.
  • New Features

    • Added MCP tools for developer rules management, interaction logging, and enhanced search modes (SUMMARIES, CYPHER, FEELING_LUCKY).
  • Bug Fixes

    • Improved embedding handling to support alternate response formats.
  • Chores

    • Updated test/dev dependency versions.

davidmyriel and others added 30 commits October 30, 2025 16:21
- fix grammar, spelling and semantics
- update headings
- add new copy to mirror website copy
- update product names
<!-- .github/pull_request_template.md -->
Ollama tests were failing due to an incorrect embeddings API endpoint

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
The OllamaEmbeddingEngine is compatible with OpenAI
- Add missing tools: cognify_add_developer_rules, get_developer_rules, save_interaction
- Enhance search tool description with all supported search types
- Documentation now accurately reflects all 10 tools implemented in server.py
<!-- .github/pull_request_template.md -->

## Description
Resolves an issue with Cypher search by encoding the return value of the
Cypher query as JSON, using FastAPI's JSON encoder.


## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
Example result from a Cypher search with the query
`MATCH (src)-[rel]->(nbr) RETURN src, rel` on the Simple example:
```
{
   "search_result":[
      [
         [
            {
               "_id":{
                  "offset":0,
                  "table":0
               },
               "_label":"Node",
               "id":"87372381-a9fe-5b82-9c92-3f5dbab1bc35",
               "name":"",
               "type":"DocumentChunk",
               "created_at":"2025-11-05T14:12:46.707597",
               "updated_at":"2025-11-05T14:12:54.801747",
               "properties":"{\"created_at\": 1762351945009, \"updated_at\": 1762351945009, \"ontology_valid\": false, \"version\": 1, \"topological_rank\": 0, \"metadata\": {\"index_fields\": [\"text\"]}, \"belongs_to_set\": null, \"text\": \"\\n    Natural language processing (NLP) is an interdisciplinary\\n    subfield of computer science and information retrieval.\\n    \", \"chunk_size\": 48, \"chunk_index\": 0, \"cut_type\": \"paragraph_end\"}"
            },
            {
               "_src":{
                  "offset":0,
                  "table":0
               },
               "_dst":{
                  "offset":1,
                  "table":0
               },
               "_label":"EDGE",
               "_id":{
                  "offset":0,
                  "table":1
               },
               "relationship_name":"contains",
               "created_at":"2025-11-05T14:12:47.217590",
               "updated_at":"2025-11-05T14:12:55.193003",
               "properties":"{\"source_node_id\": \"87372381-a9fe-5b82-9c92-3f5dbab1bc35\", \"target_node_id\": \"bc338a39-64d6-549a-acec-da60846dd90d\", \"relationship_name\": \"contains\", \"updated_at\": \"2025-11-05 14:12:54\", \"relationship_type\": \"contains\", \"edge_text\": \"relationship_name: contains; entity_name: natural language processing (nlp); entity_description: An interdisciplinary subfield of computer science and information retrieval concerned with interactions between computers and human (natural) languages.\"}"
            }
         ]
      ]
   ],
   "dataset_id":"UUID(""af4b1c1c-90fc-59b7-952c-1da9bbde370c"")",
   "dataset_name":"main_dataset",
   "graphs":"None"
}
```
Relates to #1725
Issue: #1723
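
As a rough illustration of the fix described above, the sketch below converts a query result containing values that `json` cannot serialize directly (UUIDs, datetimes) into JSON-safe data. The PR itself uses FastAPI's JSON encoder; this stand-in uses only the standard library, and `to_jsonable` and the sample `row` are hypothetical names, not cognee code.

```python
import json
from datetime import datetime
from uuid import UUID


def to_jsonable(value):
    """Recursively convert a query result into JSON-serializable data."""
    if isinstance(value, dict):
        return {str(key): to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_jsonable(item) for item in value]
    if isinstance(value, UUID):
        return str(value)
    if isinstance(value, datetime):
        return value.isoformat()
    return value


# Hypothetical row shaped like the node in the example output above.
row = {
    "id": UUID("87372381-a9fe-5b82-9c92-3f5dbab1bc35"),
    "type": "DocumentChunk",
    "created_at": datetime(2025, 11, 5, 14, 12, 46, 707597),
}

encoded = json.dumps({"search_result": [[to_jsonable(row)]]})
```

Without a conversion step like this, `json.dumps` raises `TypeError` on the raw `UUID` and `datetime` values.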

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

- Update the tool name from cognify_add_developer_rules to cognee_add_developer_rules.
<!-- .github/pull_request_template.md -->

## Description
- Add missing tools: cognify_add_developer_rules, get_developer_rules,
save_interaction
- Enhance search tool description with all supported search types
- Documentation now accurately reflects all 10 tools implemented in
server.py

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [x] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
The OllamaEmbeddingEngine is compatible with OpenAI

<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- .github/pull_request_template.md -->

## Description
Adds a [Release drafter](https://github.com/release-drafter/release-drafter)
template configuration. It will generate and update a release draft on each
PR merged to dev.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [x] Other (please specify): CI

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- .github/pull_request_template.md -->

## Description
Use case: you have merged (or are about to merge) a PR to `dev` and it
should also land in `main` as soon as possible, for example a test fix.
Add the `backport_main` label to the PR (even if it is already merged),
and an equivalent PR targeting `main` will be created.

This is Mergify's
[Backport](https://docs.mergify.com/workflow/actions/backport/) action. If the
backport has conflicts, it will carry a `conflicts` label, so review it
carefully before merging.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
pazone and others added 6 commits November 21, 2025 17:59
This change has been made by @pazone from the Mergify config editor.
…ue with 0.2.0 lance-namespace version) + crawler integration test url fix (#1842)

<!-- .github/pull_request_template.md -->

## Description
Implements a quick fix for the lancedb issue introduced by the
lance-namespace 0.0.21 to 0.2.0 release. For now, lance-namespace is pinned
to the previous version; this should be revisited once the issue is fixed
upstream.


**If LanceDB fixes the issue on their side, this can be closed.**


Additionally, cherry-picks the crawler integration test fixes from dev.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
@dexters1 dexters1 requested a review from siillee December 1, 2025 10:17
@dexters1 dexters1 self-assigned this Dec 1, 2025

pull-checklist bot commented Dec 1, 2025

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.


coderabbitai bot commented Dec 1, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Updates documentation (README and cognee-mcp README) with reorganized content and Quickstart; adds MCP tools and search modes to cognee-mcp docs; updates OllamaEmbeddingEngine._get_embedding to accept an alternate embedding response shape; and adds a dev dependency in pyproject.toml.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Main README updates: README.md | Rewrote tagline and intro; reorganized Get Started into Open Source vs Cloud paths; added streamlined Quickstart (prereqs, install, configure LLM, run pipeline); updated navigation links, badges, demos, examples, CLI usage, contributing/citation sections, and sample code snippets. |
| MCP README additions: cognee-mcp/README.md | Documented new MCP tools (cognee_add_developer_rules, delete, get_developer_rules, save_interaction) and expanded search modes to include SUMMARIES, CYPHER, and FEELING_LUCKY; updated API surface docs. |
| Ollama embedding handling: cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py | _get_embedding now handles two response shapes: returns data["embeddings"][0] when present, otherwise falls back to data["data"][0]["embedding"]. |
| Dev dependency update: pyproject.toml | Added pytest-timeout>=2.4.0 to [project.dev] (dev dependencies). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Files warranting extra attention:
    • cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py — verify both response shapes and error handling for missing keys.
    • cognee-mcp/README.md — confirm documented MCP tool names/parameters match implemented API.
    • README.md — spot-check example commands, code snippets, and demo links for accuracy.
    • pyproject.toml — ensure CI/test tooling compatibility with the added dev dependency.

Possibly related PRs

Suggested reviewers

  • Vasilije1990
  • borisarzentar
  • siillee

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description states 'Merge main changes into dev branch' but provides no details about what changes are included, their purpose, or impact; pre-submission checklist is entirely unchecked. | Provide a detailed human-written description of the key changes, check applicable 'Type of Change' boxes, and verify pre-submission checklist items before merging. |
| Title check | ❓ Inconclusive | The title 'Main merge vol4' is vague and generic, providing no meaningful information about what specific changes are being merged or their purpose. | Provide a descriptive title that summarizes the main changes, such as 'Update README, add MCP tools, and enhance embedding handling' or similar. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

siillee
siillee previously approved these changes Dec 1, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py (1)

126-130: Make embedding shape handling more defensive and explicit

The added branch for handling both "embeddings" and "data"[0]["embedding"] shapes is a good step, but this is still quite brittle:

  • If "embeddings" exists but is empty or not a list, data["embeddings"][0] will raise.
  • If neither "embeddings" nor "data" is present (or data["data"] is empty / not a list / missing "embedding"), you’ll get a KeyError/IndexError/TypeError, which will be retried by tenacity as if it were a transient error.

Consider validating the response shape and failing fast with a clear error when the format is unsupported, for example:

-                data = await response.json()
-                if "embeddings" in data:
-                    return data["embeddings"][0]
-                else:
-                    return data["data"][0]["embedding"]
+                data = await response.json()
+
+                embeddings = data.get("embeddings")
+                if isinstance(embeddings, list) and embeddings:
+                    return embeddings[0]
+
+                data_items = data.get("data")
+                if (
+                    isinstance(data_items, list)
+                    and data_items
+                    and isinstance(data_items[0], dict)
+                    and "embedding" in data_items[0]
+                ):
+                    return data_items[0]["embedding"]
+
+                raise ValueError(f"Unexpected embedding response format: {data}")

This keeps the intended flexibility but avoids silent shape drift and long retry loops on permanent format mismatches.

If this is meant to specifically track Ollama vs. OpenAI-style responses, please double-check against the latest API docs for each provider to ensure the key names (embeddings vs embedding, data layout, etc.) match their current contracts.
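
To make the two response shapes concrete, here is a minimal standalone sketch of the validation suggested above, assuming the response has already been parsed into a dict. `extract_embedding` is a hypothetical helper name for illustration, not a function in the cognee codebase, and the sample payloads are invented.

```python
from typing import Any


def extract_embedding(data: dict[str, Any]) -> list[float]:
    """Return the first embedding vector from either supported response shape."""
    # Shape 1: {"embeddings": [[...], ...]}
    embeddings = data.get("embeddings")
    if isinstance(embeddings, list) and embeddings:
        return embeddings[0]

    # Shape 2: {"data": [{"embedding": [...]}, ...]}
    items = data.get("data")
    if isinstance(items, list) and items and isinstance(items[0], dict):
        if "embedding" in items[0]:
            return items[0]["embedding"]

    # Fail with a descriptive error instead of a bare KeyError/IndexError.
    raise ValueError(f"Unexpected embedding response keys: {sorted(data)}")


embeddings_shape = {"embeddings": [[0.1, 0.2, 0.3]]}
data_shape = {"data": [{"embedding": [0.4, 0.5, 0.6]}]}

first = extract_embedding(embeddings_shape)
second = extract_embedding(data_shape)
```

Whether the `ValueError` is retried still depends on how the surrounding tenacity decorator is configured, so that configuration would need to be checked separately.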

README.md (1)

160-177: CLI usage section is helpful; verify cognee-cli -ui matches the actual interface

The CLI quickstart (add / cognify / search / delete) is concise and matches the earlier Python example conceptually.

One detail to confirm: the command to open the local UI is shown as:

cognee-cli -ui

Depending on how the CLI is implemented, many users might expect --ui or a subcommand (cognee-cli ui). It’s worth double‑checking that -ui is the correct, supported form and updating here if the actual parser uses a different flag.

If needed, you can quickly confirm this via your CLI help (cognee-cli --help) and align the README.
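For reference, whether -ui works depends entirely on how the option is registered. The sketch below assumes an argparse-based parser purely for illustration; it is not cognee's actual CLI definition, and `build_parser` is a hypothetical name. It shows that argparse accepts a single-dash multi-character flag and that both spellings can be registered together.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration; not cognee's actual CLI definition.
    parser = argparse.ArgumentParser(prog="cognee-cli")
    # Registering both spellings lets "-ui" and "--ui" set the same attribute.
    parser.add_argument("-ui", "--ui", action="store_true", help="Open the local UI")
    return parser


parser = build_parser()
print(parser.parse_args(["-ui"]).ui)   # True
print(parser.parse_args(["--ui"]).ui)  # True
print(parser.parse_args([]).ui)        # False
```

If the real parser registers only one spelling, the other is rejected as an unrecognized argument, which is the mismatch the README should avoid.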

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 508165e and 0bb4ece.

⛔ Files ignored due to path filters (4)
  • .github/release-drafter.yml is excluded by !**/*.yml
  • .mergify.yml is excluded by !**/*.yml
  • poetry.lock is excluded by !**/*.lock, !**/*.lock
  • uv.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (3)
  • README.md (5 hunks)
  • cognee-mcp/README.md (1 hunks)
  • cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code

Files:

  • cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py
cognee/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use shared logging utilities from cognee.shared.logging_utils in Python code

Files:

  • cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py
cognee/{modules,infrastructure,tasks}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Co-locate feature-specific helpers under their respective package (modules/, infrastructure/, or tasks/)

Files:

  • cognee/infrastructure/databases/vector/embeddings/OllamaEmbeddingEngine.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to {cognee-mcp,cognee-frontend}/**/*.{js,ts,tsx} : Follow local README.md and ESLint/TypeScript configuration in cognee-frontend/ for MCP server and Frontend code
📚 Learning: 2025-10-11T04:18:24.594Z
Learnt from: Vattikuti-Manideep-Sitaram
Repo: topoteretes/cognee PR: 1529
File: cognee/api/v1/cognify/ontology_graph_pipeline.py:69-74
Timestamp: 2025-10-11T04:18:24.594Z
Learning: The code_graph_pipeline.py and ontology_graph_pipeline.py both follow an established pattern of calling cognee.prune.prune_data() and cognee.prune.prune_system(metadata=True) at the start of pipeline execution. This appears to be intentional behavior for pipeline operations in the cognee codebase.

Applied to files:

  • cognee-mcp/README.md
🪛 LanguageTool
README.md

[grammar] ~66-~66: Ensure spelling is correct
Context: ... width="50%" /> ## About Cognee Cognee is an open-source tool and platfo...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~191-~191: Ensure spelling is correct
Context: ...86b2e-305a-42b0-9c2d-9f4473f15df8) ### Cognee with Ollama [Watch Demo](https://githu...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~199-~199: Using many exclamation marks might seem excessive (in this case: 7 exclamation marks for a text that’s 4270 characters long)
Context: ...welcome contributions from the community! Your input helps make Cognee better for...

(EN_EXCESSIVE_EXCLAMATION)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: End-to-End Tests / Run Telemetry Test
🔇 Additional comments (6)
cognee-mcp/README.md (1)

444-463: New MCP tools section looks good; keep names/semantics synced with implementation

The expanded tools list (cognee_add_developer_rules, delete, get_developer_rules, save_interaction, and the richer search modes) reads clearly and aligns with the API surface described elsewhere in the file.

Just make sure the tool identifiers and arguments here stay exactly in sync with the actual MCP server implementation and any client examples, so users don’t hit subtle mismatches (especially for the new delete modes and developer‑rules tools).

It’s worth doing a quick pass over the MCP server code (tool registration) to confirm the tool names and parameter signatures match these docs.

README.md (5)

8-45: Header, tagline, and navigation updates are clear

The new tagline, docs link, and community plugins link at the top read well and make the entry points (Docs, Learn More, Community plugins) much clearer for new users. No issues from a technical-docs perspective.


66-88: About/OSS/Cloud split is helpful; double-check compliance wording

The “About Cognee” plus separate “Cognee Open Source (self-hosted)” and “Cognee Cloud (managed)” sections give a much clearer mental model of how to use the project in different environments. The feature bullets are concrete and consistent with the rest of the README.

One small thing: since you explicitly call out “GDPR compliant, enterprise-grade security”, make sure this claim is aligned with your actual legal/compliance posture and any public policies, so the README doesn’t drift ahead of what’s formally guaranteed.


89-120: Quickstart and LLM configuration look coherent; confirm env-var naming across docs/code

The new Quickstart flow (install via uv pip install cognee, set LLM_API_KEY, then run the minimal async pipeline) is straightforward and feels appropriate for the main README.

A couple of small checks to keep in mind:

  • Ensure LLM_API_KEY is the canonical env var used throughout the codebase and other docs (including cognee-mcp/README and .env.template) so users don’t see conflicting names.
  • In the Python snippet, using os.environ["LLM_API_KEY"] inline is fine for a minimal example; just keep the “use a .env from the template” note in sync with any future changes to configuration keys.
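One way to keep the variable name consistent is to read it through a single helper that fails with a clear message. This is a minimal sketch: `require_env` is a hypothetical helper, not part of cognee, and LLM_API_KEY is the only name taken from the README.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return a non-empty environment variable or raise a descriptive error."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return value
```

Calling `require_env("LLM_API_KEY")` at startup surfaces a missing or misnamed key immediately rather than at the first LLM call.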

120-155: End‑to‑end async example aligns with the described pipeline

The example that uses cognee.add, cognee.cognify, cognee.memify, and cognee.search in a single async main() function is a nice, minimal demonstration of the ECL-style flow. The control flow and asyncio usage are correct, and the printed result matches the earlier narrative.

Just ensure these function names and their awaited usage remain stable in the public API; otherwise, this snippet will be one of the first places users feel API drift.

Consider adding this exact snippet (or a test derived from it) to your docs/examples test suite so changes to the public API are caught quickly.
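A lightweight version of such a check can be sketched with the standard library alone: parse the snippet, collect every `cognee.<name>` it references, and compare against the names the package actually exposes. The helper names below are hypothetical, and the check covers attribute names only, not signatures or behavior.

```python
import ast
from typing import Iterable, Set


def referenced_attributes(source: str, module_name: str = "cognee") -> Set[str]:
    """Collect attribute names accessed directly on `module_name` in the source."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == module_name
        ):
            names.add(node.attr)
    return names


def missing_attributes(
    source: str, available: Iterable[str], module_name: str = "cognee"
) -> Set[str]:
    """Return names the snippet uses that are absent from `available`."""
    return referenced_attributes(source, module_name) - set(available)
```

In a real test the second argument would be `dir(cognee)`, with the snippet text read from README.md; a non-empty result means the README references something the package no longer exports.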


179-205: Demos, community, and research sections are well organized

The demos section, community links (Contributing, Code of Conduct), and the research citation provide good context without overloading the main README. The BibTeX entry is syntactically valid and should be easy for users to drop into citation managers.

No changes needed here from a code or structure perspective.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
pyproject.toml (2)

28-28: Minor: Fix formatting inconsistency in numpy version constraint.

Line 28 has an inconsistent space after the comma: "numpy>=1.26.4, <=4.0.0". This should be "numpy>=1.26.4,<=4.0.0" to match the formatting of other version constraints in the file.

-    "numpy>=1.26.4, <=4.0.0",
+    "numpy>=1.26.4,<=4.0.0",

134-134: Add upper bound constraint to pytest-timeout for consistency.

Line 134 specifies pytest-timeout>=2.4.0 without an upper bound, while all other dev dependencies define an upper bound (e.g., pytest>=7.4.0,<8, pytest-asyncio>=0.21.1,<0.22). Add an upper bound constraint to align with the established pattern.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bb4ece and 8e67471.

⛔ Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • pyproject.toml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml

📄 CodeRabbit inference engine (AGENTS.md)

Python version requirement: >= 3.10 and < 3.14

Files:

  • pyproject.toml
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to {cognee-mcp,cognee-frontend}/**/*.{js,ts,tsx} : Follow local README.md and ESLint/TypeScript configuration in cognee-frontend/ for MCP server and Frontend code
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to pyproject.toml : Python version requirement: >= 3.10 and < 3.14

Applied to files:

  • pyproject.toml
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to **/*.py : Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml

Applied to files:

  • pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: End-to-End Tests / Conversation sessions test (FS)
  • GitHub Check: CLI Tests / CLI Unit Tests
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: End-to-End Tests / Concurrent Subprocess access test
  • GitHub Check: End-to-End Tests / Test multi tenancy with different situations in Cognee
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Run Telemetry Test
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Test Feedback Enrichment
  • GitHub Check: End-to-End Tests / Test Entity Extraction
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Test using different async databases in parallel in Cognee
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: Basic Tests / Run Basic Graph Tests
  • GitHub Check: End-to-End Tests / Conversation sessions test (Redis)
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: Basic Tests / Run Simple Examples
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: Basic Tests / Run Integration Tests

siillee previously approved these changes Dec 2, 2025
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
pyproject.toml (2)

134-134: Consider adding an upper bound version constraint for consistency.

The newly added pytest-timeout>=2.4.0 uses an open-ended version constraint, which is inconsistent with the strict versioning pattern used for other dev dependencies in this file (e.g., pytest>=7.4.0,<8). For reproducibility and to prevent unexpected breaking changes from future releases, consider specifying an upper bound.

Example refactor:

-    "pytest-timeout>=2.4.0",
+    "pytest-timeout>=2.4.0,<3.0.0",

The pytest-timeout package supports Python 3.7 and higher, including Python 3.13 and 3.14, so it is compatible with your project's Python requirement.
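To catch open-ended constraints like this automatically, a rough check over the dependency strings could look as follows. It is a sketch under assumptions: `lacks_upper_bound` is a hypothetical helper, it uses a plain regex rather than a full PEP 440 parser, and it treats `<`, `<=`, `==`, and `~=` as the operators that cap a version.

```python
import re

# Operators that place a ceiling on the version: "<", "<=", "==", "~=".
_CAPPING_OPERATOR = re.compile(r"<|==|~=")


def lacks_upper_bound(requirement: str) -> bool:
    """Return True if the requirement string has no capping operator."""
    # Ignore environment markers such as '; python_version < "3.12"'.
    specifier = requirement.split(";", 1)[0]
    return _CAPPING_OPERATOR.search(specifier) is None


dev_dependencies = [
    "pytest>=7.4.0,<8",
    "pytest-asyncio>=0.21.1,<0.22",
    "pytest-timeout>=2.4.0",
]
print([dep for dep in dev_dependencies if lacks_upper_bound(dep)])
# ['pytest-timeout>=2.4.0']
```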


134-134: Add an upper bound version constraint for consistency with project conventions.

The newly added pytest-timeout>=2.4.0 uses an open-ended version constraint, which is inconsistent with the strict versioning pattern used for other dev dependencies in this file (e.g., pytest>=7.4.0,<8). Consider specifying an upper bound to maintain reproducibility and prevent unexpected breaking changes from future releases. The package is compatible with the project's Python requirement (>=3.10,<3.14).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e67471 and c7810e9.

⛔ Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • pyproject.toml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml

📄 CodeRabbit inference engine (AGENTS.md)

Python version requirement: >= 3.10 and < 3.14

Files:

  • pyproject.toml
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to {cognee-mcp,cognee-frontend}/**/*.{js,ts,tsx} : Follow local README.md and ESLint/TypeScript configuration in cognee-frontend/ for MCP server and Frontend code
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to pyproject.toml : Python version requirement: >= 3.10 and < 3.14

Applied to files:

  • pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: End-to-End Tests / Conversation sessions test (FS)
  • GitHub Check: End-to-End Tests / Conversation sessions test (Redis)
  • GitHub Check: End-to-End Tests / Test Feedback Enrichment
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: End-to-End Tests / Test multi tenancy with different situations in Cognee
  • GitHub Check: End-to-End Tests / Test Entity Extraction
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: End-to-End Tests / Concurrent Subprocess access test
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: Basic Tests / Run Formatting Check
  • GitHub Check: Basic Tests / Run Integration Tests
  • GitHub Check: Basic Tests / Run Linting
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: Basic Tests / Run Simple Examples

@dexters1 dexters1 merged commit 4afde91 into dev Dec 4, 2025
204 of 207 checks passed
@dexters1 dexters1 deleted the main-merge-vol4 branch December 4, 2025 16:48