@daukadolt
Contributor

@daukadolt daukadolt commented Oct 31, 2025

Description

server.py has everything in one file, which is hard to maintain.

This PR:

  1. Extracts each tool from server.py into its own file in a new tools module
  2. Deprecates the coding assistant, following "fix: removed coding assistance" (#1657)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify):

Screenshots/Videos (if applicable)

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

@pull-checklist

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.

@coderabbitai
Contributor

coderabbitai bot commented Oct 31, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request restructures the MCP server architecture by extracting tool implementations from server.py into a modular tools package, introducing a shared context system for managing the CogneeClient globally, and reorganizing client exports. Additionally, the coding_rule_associations module is removed entirely, and the server initialization flow is updated to support context-based client management alongside new HTTP/SSE transport endpoints with CORS support.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Removed implementations: cognee-mcp/src/client.py, cognee-mcp/src/codingagents/coding_rule_associations.py | Removed old async client entrypoint and entire rule-associations module with LLM-driven rule generation and graph persistence logic. |
| Shared context infrastructure: cognee-mcp/src/shared/... | Added new shared package with context module that introduces a global cognee_client variable and setter function for centralized client management across components. |
| Client exports: cognee-mcp/src/clients/__init__.py, cognee-mcp/src/clients/cognee_client.py | Exposed CogneeClient as a public export; added TODO comment in cognee_client.py regarding OpenAPI exploration. |
| New tools package structure: cognee-mcp/src/tools/__init__.py | Created tools package initializer that re-exports public tool functions (cognify, search, list_data, delete, prune, cognify_status). |
| Individual tool implementations: cognee-mcp/src/tools/cognify.py, cognee-mcp/src/tools/search.py, cognee-mcp/src/tools/cognify_status.py, cognee-mcp/src/tools/delete.py, cognee-mcp/src/tools/list_data.py, cognee-mcp/src/tools/prune.py | Implemented six modular async tool functions with background task execution, mode-aware behavior (API vs. direct), and structured error handling. |
| Tool utilities: cognee-mcp/src/tools/utils.py | Added helper functions for node/edge rendering (node_to_string, retrieved_edges_to_string) and dynamic model loading (load_class). |
| Server refactor: cognee-mcp/src/server.py | Replaced per-function MCP tool decorators with explicit mcp.tool() registrations; introduced CogneeClient initialization and context binding; added new HTTP/SSE transport runners with CORS support; added /health endpoint; restructured imports to use public APIs. |

Sequence Diagram(s)

sequenceDiagram
    participant user as User/MCP Client
    participant mcp as MCP Server
    participant ctx as Shared Context
    participant client as CogneeClient
    participant tools as Tool Module
    
    mcp->>client: instantiate CogneeClient on startup
    mcp->>ctx: context.set_cognee_client(client)
    note over ctx: global cognee_client = client
    
    user->>mcp: invoke tool (e.g., cognify)
    mcp->>tools: call cognify(data, ...)
    tools->>ctx: retrieve cognee_client from context
    tools->>client: client.add(data)
    tools->>client: client.cognify(...) [background task]
    tools-->>user: return status message immediately
    note over tools: async task continues in background
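
The "return a status message immediately, keep working in the background" flow in the diagram above can be sketched with asyncio. This is a minimal illustration rather than the PR's code: `FakeClient`, `cognify_tool`, and the status text are hypothetical stand-ins, and only the call order (`add`, then `cognify` as a background task) is taken from the diagram.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for CogneeClient; records calls instead of doing work."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def add(self, data: str) -> None:
        self.calls.append(f"add:{data}")

    async def cognify(self) -> None:
        await asyncio.sleep(0.01)  # simulate slow processing
        self.calls.append("cognify")


# Hold strong references so pending tasks are not garbage-collected mid-flight.
_background_tasks: set[asyncio.Task] = set()


async def cognify_tool(client: FakeClient, data: str) -> str:
    """Add the data, schedule cognify in the background, and return at once."""
    await client.add(data)
    task = asyncio.create_task(client.cognify())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return "Background process launched."  # hypothetical status text


async def main() -> tuple[str, list[str], list[str]]:
    client = FakeClient()
    status = await cognify_tool(client, "some text")
    calls_at_return = list(client.calls)  # cognify has not run yet
    await asyncio.gather(*_background_tasks)  # demo only: wait for completion
    return status, calls_at_return, list(client.calls)


status, calls_at_return, calls_after = asyncio.run(main())
```

Keeping the task in a module-level set follows the asyncio documentation's advice to hold a reference to tasks created with `create_task`; whether the PR does the same is not shown in this conversation.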
sequenceDiagram
    participant tool as Tool (search)
    participant ctx as Shared Context
    participant client as CogneeClient
    
    tool->>ctx: access context.cognee_client
    tool->>client: check use_api flag
    
    alt API Mode
        tool->>client: client.search(query)
        client-->>tool: results (string or list)
        tool->>tool: format based on search_type
    else Direct Mode
        tool->>client: client.search(query)
        client-->>tool: results (objects/dicts)
        tool->>tool: convert to string/JSON per type<br/>(CODE, CHUNKS, INSIGHTS, etc.)
    end
    
    tool-->>user: TextContent with formatted results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • server.py refactor: Major architectural change replacing decorator-based tool registration with explicit mcp.tool() calls, context integration, and new transport endpoints. Verify all previous tool functionality maps correctly to the new structure and that removal of legacy paths doesn't break existing workflows.
  • search.py: Complex branching logic across multiple search types (GRAPH_COMPLETION, RAG_COMPLETION, CODE, CHUNKS, SUMMARIES, CYPHER, FEELING_LUCKY, INSIGHTS) and modes (API vs. direct). Ensure all code paths produce correctly formatted output.
  • list_data.py: Mode-aware behavior with different output for API and direct modes; UUID validation and dataset lookups. Verify error paths and formatting for all scenarios.
  • context.py & shared module: New global pattern for client management. Confirm this pattern is thread-safe and properly integrated across all tools.
  • Removed coding_rule_associations.py: Verify that rule association functionality is not relied upon elsewhere or that its removal is intentional and complete.

Possibly related PRs

Suggested reviewers

  • borisarzentar
  • Vasilije1990

Poem

🐰 Tools hop into tidy rooms,
Context blooms where cognee_client looms,
Server startup gleams with fresh design,
SSE and health checks align—
Old patterns fade; new structure shines! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The PR title "Refactor: break down server.py, extract tools" directly corresponds to the main changes in the changeset. The raw summary shows that the primary modifications involve extracting individual tool functions (cognify, search, list_data, delete, prune, cognify_status) from server.py into a dedicated tools module, while refactoring server.py to use a centralized context and public clients. The title is concise, specific, and communicates the core refactoring objective clearly enough for a reviewer scanning the commit history. |
| Description Check | ✅ Passed | The PR description provides a clear, human-generated explanation of the changes with proper context: the author notes that server.py is hard to maintain as a monolithic file and outlines two specific actions taken (extracting tools and deprecating the coding assistant). However, the description is incomplete relative to the template: the "Type of Change" section has no option selected (this should indicate "Code refactoring"), and all pre-submission checklist items are left unchecked. While the core description section adequately explains what was changed and why, these omissions represent gaps in the expected PR structure. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.67% which is sufficient. The required threshold is 80.00%. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@daukadolt daukadolt marked this pull request as ready for review October 31, 2025 17:16
@daukadolt daukadolt force-pushed the refactor/refactor-cognee-mcp branch from a52af5b to ba9377f Compare October 31, 2025 17:17
@daukadolt
Contributor Author

rebase and resolve conflicts

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Oct 31, 2025

@daukadolt, thank you for rebasing and resolving the conflicts! I'll review the changes now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee-mcp/src/server.py (1)

41-46: Allow POST/OPTIONS in CORS middleware

Restricting CORS to ["GET"] blocks POST calls and OPTIONS preflight checks. The MCP HTTP transport sends POST requests, and browsers will issue OPTIONS preflights, so the new transports will immediately fail with 405/CORS errors. Please allow POST and OPTIONS (and keep GET) in both middleware blocks.

-        allow_methods=["GET"],
+        allow_methods=["GET", "POST", "OPTIONS"],

Also applies to: 62-67

🧹 Nitpick comments (1)
cognee-mcp/src/tools/utils.py (1)

9-39: Add type hints for exported utilities.
These helpers now underpin the public tooling surface; leaving them unannotated breaks our type-hint guideline and makes downstream usage harder to reason about. Please add explicit parameter/return annotations.

 import os
 import importlib.util
+from typing import Any, Iterable, Mapping, Type
 
 
-def node_to_string(node):
+def node_to_string(node: Mapping[str, Any]) -> str:
     """Convert a node dictionary to a string representation."""
     node_data = ", ".join(
         [f'{key}: "{value}"' for key, value in node.items() if key in ["id", "name"]]
     )
     return f"Node({node_data})"
 
 
-def retrieved_edges_to_string(search_results):
+def retrieved_edges_to_string(
+    search_results: Iterable[
+        tuple[Mapping[str, Any], Mapping[str, Any], Mapping[str, Any]]
+    ],
+) -> str:
     """Convert graph search results (triplets) to human-readable strings."""
     edge_strings = []
     for triplet in search_results:
         node1, edge, node2 = triplet
         relationship_type = edge["relationship_name"]
@@
 
 
-def load_class(model_file, model_name):
+def load_class(model_file: str, model_name: str) -> Type[Any]:
     """Dynamically load a class from a file."""
     model_file = os.path.abspath(model_file)
     spec = importlib.util.spec_from_file_location("graph_model", model_file)
     if spec is None:
         raise ValueError(f"Could not load specification for module from file: {model_file}")

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e015d6 and ba9377f.

⛔ Files ignored due to path filters (1)
  • cognee-mcp/pyproject.toml is excluded by !**/*.toml
📒 Files selected for processing (15)
  • cognee-mcp/src/client.py (0 hunks)
  • cognee-mcp/src/clients/__init__.py (1 hunks)
  • cognee-mcp/src/clients/cognee_client.py (1 hunks)
  • cognee-mcp/src/codingagents/coding_rule_associations.py (0 hunks)
  • cognee-mcp/src/server.py (2 hunks)
  • cognee-mcp/src/shared/__init__.py (1 hunks)
  • cognee-mcp/src/shared/context.py (1 hunks)
  • cognee-mcp/src/tools/__init__.py (1 hunks)
  • cognee-mcp/src/tools/cognify.py (1 hunks)
  • cognee-mcp/src/tools/cognify_status.py (1 hunks)
  • cognee-mcp/src/tools/delete.py (1 hunks)
  • cognee-mcp/src/tools/list_data.py (1 hunks)
  • cognee-mcp/src/tools/prune.py (1 hunks)
  • cognee-mcp/src/tools/search.py (1 hunks)
  • cognee-mcp/src/tools/utils.py (1 hunks)
💤 Files with no reviewable changes (2)
  • cognee-mcp/src/client.py
  • cognee-mcp/src/codingagents/coding_rule_associations.py
🧰 Additional context used
📓 Path-based instructions (1)
{cognee,cognee-mcp,distributed,examples,alembic}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

{cognee,cognee-mcp,distributed,examples,alembic}/**/*.py: Use 4-space indentation; name modules and functions in snake_case; name classes in PascalCase (Python)
Adhere to ruff rules, including import hygiene and configured line length (100)
Keep Python lines ≤ 100 characters

Files:

  • cognee-mcp/src/clients/__init__.py
  • cognee-mcp/src/clients/cognee_client.py
  • cognee-mcp/src/tools/__init__.py
  • cognee-mcp/src/tools/search.py
  • cognee-mcp/src/tools/prune.py
  • cognee-mcp/src/shared/context.py
  • cognee-mcp/src/tools/cognify.py
  • cognee-mcp/src/tools/delete.py
  • cognee-mcp/src/tools/cognify_status.py
  • cognee-mcp/src/tools/list_data.py
  • cognee-mcp/src/tools/utils.py
  • cognee-mcp/src/shared/__init__.py
  • cognee-mcp/src/server.py
🧠 Learnings (4)
📚 Learning: 2025-10-27T09:21:14.154Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-27T09:21:14.154Z
Learning: Applies to cognee/**/*.py : Public APIs in the core library should be type-annotated where practical

Applied to files:

  • cognee-mcp/src/clients/__init__.py
  • cognee-mcp/src/clients/cognee_client.py
📚 Learning: 2025-10-11T04:18:24.594Z
Learnt from: Vattikuti-Manideep-Sitaram
Repo: topoteretes/cognee PR: 1529
File: cognee/api/v1/cognify/ontology_graph_pipeline.py:69-74
Timestamp: 2025-10-11T04:18:24.594Z
Learning: The code_graph_pipeline.py and ontology_graph_pipeline.py both follow an established pattern of calling cognee.prune.prune_data() and cognee.prune.prune_system(metadata=True) at the start of pipeline execution. This appears to be intentional behavior for pipeline operations in the cognee codebase.

Applied to files:

  • cognee-mcp/src/tools/prune.py
📚 Learning: 2024-07-27T16:15:21.508Z
Learnt from: borisarzentar
Repo: topoteretes/cognee PR: 90
File: cognee/api/v1/cognify/cognify.py:86-0
Timestamp: 2024-07-27T16:15:21.508Z
Learning: The file handling in `cognee/api/v1/cognify/cognify.py` uses a context manager for opening files, ensuring proper closure after operations.

Applied to files:

  • cognee-mcp/src/tools/cognify.py
📚 Learning: 2025-10-27T09:21:14.154Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-27T09:21:14.154Z
Learning: Applies to {cognee,cognee-mcp,distributed,examples,alembic}/**/*.py : Adhere to ruff rules, including import hygiene and configured line length (100)

Applied to files:

  • cognee-mcp/src/server.py
🪛 Pylint (4.0.2)
cognee-mcp/src/tools/search.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)


[refactor] 6-6: Use 'from mcp import types' instead

(R0402)


[refactor] 137-147: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)


[refactor] 150-163: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)


[refactor] 125-125: Too many return statements (9/6)

(R0911)

cognee-mcp/src/tools/prune.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)


[refactor] 5-5: Use 'from mcp import types' instead

(R0402)

cognee-mcp/src/shared/context.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)

cognee-mcp/src/tools/cognify.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)


[refactor] 6-6: Use 'from mcp import types' instead

(R0402)

cognee-mcp/src/tools/delete.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)


[refactor] 7-7: Use 'from mcp import types' instead

(R0402)

cognee-mcp/src/tools/cognify_status.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)


[refactor] 5-5: Use 'from mcp import types' instead

(R0402)

cognee-mcp/src/tools/list_data.py

[error] 1-1: Unrecognized option found: suggestion-mode

(E0015)


[refactor] 6-6: Use 'from mcp import types' instead

(R0402)


[refactor] 14-14: Too many branches (14/12)

(R0912)


[refactor] 14-14: Too many statements (58/50)

(R0915)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: End-to-End Tests / Test Feedback Enrichment
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Test Entity Extraction
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Concurrent Subprocess access test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: End-to-End Tests / Test using different async databases in parallel in Cognee
  • GitHub Check: End-to-End Tests / Conversation sessions test
  • GitHub Check: Basic Tests / Run Simple Examples
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: Basic Tests / Run Integration Tests
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test

Comment on lines +150 to +163
if search_type.upper() == "CODE":
    return json.dumps(search_results, cls=JSONEncoder)
elif (
    search_type.upper() == "GRAPH_COMPLETION"
    or search_type.upper() == "RAG_COMPLETION"
):
    return str(search_results[0])
elif search_type.upper() == "CHUNKS":
    return str(search_results)
elif search_type.upper() == "INSIGHTS":
    results = retrieved_edges_to_string(search_results)
    return results
else:
    return str(search_results)
Contributor


⚠️ Potential issue | 🔴 Critical

Fix direct-mode completion handling.

search_results[0] assumes a non-empty sequence. In direct mode the client commonly returns a plain string, so this truncates the response to its first character; if the client returns an empty list, it raises IndexError. Return the full string and guard the empty-sequence case instead.

Apply this diff to handle the different result shapes safely:

-                # Direct mode processing
-                if search_type.upper() == "CODE":
+                # Direct mode processing
+                result_type = search_type.upper()
+                if result_type == "CODE":
                     return json.dumps(search_results, cls=JSONEncoder)
-                elif (
-                    search_type.upper() == "GRAPH_COMPLETION"
-                    or search_type.upper() == "RAG_COMPLETION"
-                ):
-                    return str(search_results[0])
-                elif search_type.upper() == "CHUNKS":
+                elif result_type in {"GRAPH_COMPLETION", "RAG_COMPLETION"}:
+                    if isinstance(search_results, list):
+                        return str(search_results[0]) if search_results else "[]"
+                    return str(search_results)
+                elif result_type == "CHUNKS":
                     return str(search_results)
-                elif search_type.upper() == "INSIGHTS":
+                elif result_type == "INSIGHTS":
                     results = retrieved_edges_to_string(search_results)
                     return results
                 else:
                     return str(search_results)
🧰 Tools
🪛 Pylint (4.0.2)

[refactor] 150-163: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)

🤖 Prompt for AI Agents
In cognee-mcp/src/tools/search.py around lines 150 to 163, the code assumes
search_results[0] exists for GRAPH_COMPLETION/RAG_COMPLETION which breaks when
search_results is a plain string (returns its first char) or an empty list
(IndexError); change the branch to detect the result shape: if search_results is
a str return it unchanged; if it's a non-empty sequence return
str(search_results[0]); if it's an empty sequence return an empty string (or a
sensible default); apply the same defensive checks where you convert or index
search_results so you never index an empty sequence or truncate a string.
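
The shape check described in the prompt above can be isolated into one helper. The following is a sketch under assumptions, not the PR's code: `format_completion` is a hypothetical name, and returning an empty string for empty results follows the prompt text, whereas the suggested diff returns "[]".

```python
from typing import Any, Sequence, Union


def format_completion(search_results: Union[str, Sequence[Any], None]) -> str:
    """Return a completion result as text without indexing it blindly."""
    if search_results is None:
        return ""
    if isinstance(search_results, str):
        # Indexing a string with [0] would keep only its first character.
        return search_results
    if isinstance(search_results, (list, tuple)):
        # Guard the empty case so [0] cannot raise IndexError.
        return str(search_results[0]) if search_results else ""
    return str(search_results)


# The failure mode the review describes: indexing a plain string truncates it.
truncated = str("hello world"[0])

from_string = format_completion("hello world")
from_list = format_completion(["answer", "extra"])
from_empty = format_completion([])
```

Here `truncated` is `"h"`, while the helper returns `"hello world"`, `"answer"`, and `""` for the three inputs.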

@daukadolt daukadolt requested a review from dexters1 November 5, 2025 13:28
@pazone
Contributor

pazone commented Nov 7, 2025

@daukadolt is the MCP test failure related to this change?

@daukadolt
Contributor Author

daukadolt commented Nov 7, 2025

@pazone yup. I'll need to update the PR, will try to do so by EOD

@pazone pazone requested a review from hajdul88 November 10, 2025 10:05
@Vasilije1990
Contributor

Has not been touched in weeks. Closing.

4 participants