Skip to content

Conversation

@daukadolt
Copy link
Contributor

@daukadolt daukadolt commented Oct 9, 2025

Description

With version 0.3.5 onwards, we start Cognee MCP alongside cognee ui in cognee-cli -ui

Currently, cognee-mcp operates as a standalone cognee instance - with it's own knowledge graph.

This PR

  1. adds support for cognee-mcp to run in non-standalone mode as an interface to an already running cognee backend.
  2. updates cognee-cli -ui MCP startup

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify):

Screenshots/Videos (if applicable)

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

@pull-checklist
Copy link

pull-checklist bot commented Oct 9, 2025

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 9, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds API-mode support across MCP: new CogneeClient enabling direct vs API usage, CLI flags (--api-url/--api-token), entrypoint gating and API_ARGS propagation, Dockerfile OCI label additions, README API-mode docs, UI Docker-run changes returning container name, and container-aware CLI shutdown handling.

Changes

Cohort / File(s) Summary
Docker metadata
cognee-mcp/Dockerfile
Added OCI image label org.opencontainers.image.description="Cognee MCP Server with API mode support" in both build stages (comments preceding each); metadata-only change.
Docs — API mode
cognee-mcp/README.md
Expanded README with API Mode and Docker API Mode sections: usage examples, env vars, host handling, transport notes, limitations, and configuration guidance.
Container entrypoint flow
cognee-mcp/entrypoint.sh
Gate startup on API_URL (skip Alembic in API mode); normalize localhost→host.docker.internal with warnings; build API_ARGS (--api-url/--api-token); propagate args to all exec paths; improve migration error handling.
Client abstraction
cognee-mcp/src/cognee_client.py
New CogneeClient providing direct vs API modes: async HTTP client (5-min timeout), auth headers, conditional imports, methods add, cognify, search, delete, prune_*, get_pipeline_status, list_datasets, close; stdout suppression for direct calls.
Server integration + CLI
cognee-mcp/src/server.py
Added global cognee_client and CLI flags --api-url/--api-token; route tools through cognee_client when API mode active; skip migrations in API mode; add API-mode NotImplemented/NA handling and adapted outputs.
UI Docker orchestration
cognee/api/v1/ui/ui.py
Build docker command list, pre-pull cognee/cognee-mcp:main, generate unique container name, inject API_URL when starting backend, log mode and return (pid, container_name) to PID callback.
CLI signal/container handling
cognee/cli/_cognee.py
Track docker_container state; signal_handler attempts container stop/remove before killing processes; pid_callback accepts (pid, container_name) tuples and updates docker_container.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as MCP CLI
  participant CC as CogneeClient
  participant Cognee as Cognee (direct)
  participant API as Cognee FastAPI

  Note over CLI,CC: Init with or without --api-url
  User->>CLI: start [--api-url, --api-token]
  CLI->>CC: new CogneeClient(api_url, api_token)

  alt API mode
    User->>CLI: action (search/add/cognify)
    CLI->>CC: method(...)
    CC->>API: HTTP request (headers + token)
    API-->>CC: HTTP response
    CC-->>CLI: result
    CLI-->>User: output
  else Direct mode
    User->>CLI: action (search/add/cognify)
    CLI->>CC: method(...)
    CC->>Cognee: direct function call
    Cognee-->>CC: result
    CC-->>CLI: result
    CLI-->>User: output
  end
Loading
sequenceDiagram
  autonumber
  participant Entrypoint as entrypoint.sh
  participant Docker as Docker Runtime
  participant MCP as MCP Process

  Entrypoint->>Entrypoint: detect `API_URL`
  alt `API_URL` set
    Entrypoint->>Entrypoint: normalize localhost -> host.docker.internal (warn)
    Entrypoint->>Entrypoint: build `API_ARGS` (--api-url, --api-token)
    Note right of Entrypoint: skip Alembic migrations
  else no `API_URL`
    Entrypoint->>Entrypoint: run Alembic migrations (handle benign errors)
  end
  Entrypoint->>Docker: exec MCP with transport args + `API_ARGS`
  Docker->>MCP: start process (returns pid, container_name)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • feat: add mcp switch #1039 — Directly related to entrypoint startup command construction and propagation of transport/API args.
  • Release v0.3.5 #1515 — Related UI/CLI Docker startup and graceful shutdown (container-aware PID handling) and MCP startup wiring.
  • chore: update mcp readme #913 — Overlapping README updates around running the MCP and transport usage (doc-level overlap).

Suggested labels

core-team

Suggested reviewers

  • dexters1
  • Vasilije1990
  • borisarzentar

Poem

I hop between two tasty modes today,
Direct greens here, API carrots away.
Flags in paw, I nudge the Docker rim,
Migrations nap while headers swim.
CogneeClient hums — fetch fast, little vim! 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly summarizes the main change by indicating support for non-standalone mode in the MCP component, which aligns with the core functionality added in this PR. Although it includes a “Feat/mcp” prefix that reads like a branch name, it remains fully related and specific to the primary change. There is no misleading or off-topic content, and a teammate reviewing the history would understand the intent from this title.
Description Check ✅ Passed The pull request description follows the repository’s template by including a clear human-written Description section, selecting the appropriate Type of Change, providing a Pre-submission Checklist, and including a DCO Affirmation. Optional sections like Screenshots and remaining checklist items are present or intentionally left blank, which is acceptable when they do not apply. Overall, the structure and required information are complete and coherent.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@daukadolt daukadolt marked this pull request as ready for review October 9, 2025 13:06
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee-mcp/src/server.py (1)

138-157: API mode never uploads the rule files

Here we pass file_path straight into cognee_client.add. In API mode the helper wraps the payload as ("data.txt", str(data), ...), so the backend only receives the literal path string (e.g. /app/.cursorrules) instead of the file contents. The remote Cognee instance can’t open that path, so developer rules ingestion silently breaks in API mode. Please update CogneeClient.add to detect real file paths (Path/str that exists) and stream the bytes (e.g. open(path, "rb")) when use_api is true, while keeping the current direct-mode behavior.

🧹 Nitpick comments (2)
cognee-mcp/src/cognee_client.py (2)

47-52: Reconsider hardcoded Content-Type in _get_headers.

The method always sets Content-Type: application/json, but the add method (lines 87-92) requires multipart/form-data and manually constructs its own Authorization header without using _get_headers(). While this doesn't cause a runtime bug currently, it creates confusion and inconsistency.

Consider one of these approaches:

  1. Add a parameter to control Content-Type:
-    def _get_headers(self) -> Dict[str, str]:
-        """Get headers for API requests."""
-        headers = {"Content-Type": "application/json"}
+    def _get_headers(self, content_type: Optional[str] = "application/json") -> Dict[str, str]:
+        """Get headers for API requests."""
+        headers = {}
+        if content_type:
+            headers["Content-Type"] = content_type
         if self.api_token:
             headers["Authorization"] = f"Bearer {self.api_token}"
         return headers
  1. Or split into separate methods for auth headers and full headers:
def _get_auth_headers(self) -> Dict[str, str]:
    """Get authentication headers only."""
    if self.api_token:
        return {"Authorization": f"Bearer {self.api_token}"}
    return {}

def _get_headers(self) -> Dict[str, str]:
    """Get headers for JSON API requests."""
    headers = {"Content-Type": "application/json"}
    headers.update(self._get_auth_headers())
    return headers

Then update line 91 in the add method to use _get_auth_headers().


74-99: Consider removing unnecessary else blocks (optional style improvement).

Pylint flags unnecessary else blocks after return or raise statements throughout the file. While the current structure is clear and readable—especially for a dual-mode client where the explicit if API-mode else direct-mode pattern aids comprehension—you could flatten the structure if preferred.

Example for one method:

     async def add(self, ...):
         if self.use_api:
             # API mode: Make HTTP request
             ...
             return response.json()
-        else:
-            # Direct mode: Call cognee directly
-            with redirect_stdout(sys.stderr):
-                await self.cognee.add(data, dataset_name=dataset_name, node_set=node_set)
-                return {"status": "success", "message": "Data added successfully"}
+        
+        # Direct mode: Call cognee directly
+        with redirect_stdout(sys.stderr):
+            await self.cognee.add(data, dataset_name=dataset_name, node_set=node_set)
+            return {"status": "success", "message": "Data added successfully"}

This pattern can be applied to all methods flagged by pylint, but it's entirely optional.

Also applies to: 124-147, 178-198, 218-237, 248-256, 272-279, 297-306, 317-334

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 760a9de and f11c01e.

⛔ Files ignored due to path filters (2)
  • cognee-mcp/pyproject.toml is excluded by !**/*.toml
  • cognee-mcp/uv.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (6)
  • cognee-mcp/Dockerfile (1 hunks)
  • cognee-mcp/README.md (4 hunks)
  • cognee-mcp/entrypoint.sh (1 hunks)
  • cognee-mcp/src/cognee_client.py (1 hunks)
  • cognee-mcp/src/server.py (18 hunks)
  • cognee/api/v1/ui/ui.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
cognee/api/v1/ui/ui.py (1)
cognee/shared/logging_utils.py (1)
  • info (175-175)
cognee-mcp/src/cognee_client.py (4)
cognee/shared/logging_utils.py (1)
  • get_logger (182-194)
cognee/api/v1/datasets/datasets.py (1)
  • datasets (7-40)
cognee/modules/search/types/SearchType.py (1)
  • SearchType (4-20)
cognee/modules/users/methods/get_default_user.py (1)
  • get_default_user (13-41)
cognee-mcp/src/server.py (5)
cognee-mcp/src/cognee_client.py (1)
  • CogneeClient (19-339)
cognee/api/v1/cognify/code_graph_pipeline.py (1)
  • run_code_graph_pipeline (32-96)
cognee/modules/users/methods/get_default_user.py (1)
  • get_default_user (13-41)
cognee/modules/data/methods/get_unique_dataset_id.py (1)
  • get_unique_dataset_id (6-9)
cognee/modules/pipelines/operations/get_pipeline_status.py (1)
  • get_pipeline_status (8-35)
cognee-mcp/entrypoint.sh (1)
alembic/versions/482cd6517ce4_add_default_user.py (1)
  • upgrade (25-29)
🪛 Pylint (3.3.9)
cognee-mcp/src/cognee_client.py

[refactor] 74-99: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 124-147: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 149-149: Too many arguments (6/5)

(R0913)


[refactor] 149-149: Too many positional arguments (6/5)

(R0917)


[refactor] 178-198: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 218-237: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 248-256: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 272-279: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 297-306: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 317-334: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

cognee-mcp/src/server.py

[refactor] 13-13: Use 'from mcp import types' instead

(R0402)


[refactor] 606-616: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)


[refactor] 619-632: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Run Telemetry Test
  • GitHub Check: Basic Tests / Run Basic Graph Tests
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: Basic Tests / Run Formatting Check
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: Basic Tests / Run Simple Examples
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: CLI Tests / CLI Functionality Tests

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee/api/v1/ui/ui.py (1)

505-551: MCP image tag mismatch
We pull cognee/cognee-mcp:main but actually run cognee/cognee-mcp:daulet-dev. This will launch a potentially stale image and ignore the freshly pulled one. Make the docker run tag match the pulled tag (or vice‑versa) so we boot the intended build.

-            image = "cognee/cognee-mcp:main"
-            subprocess.run(["docker", "pull", image], check=True)
+            image = "cognee/cognee-mcp:main"
+            subprocess.run(["docker", "pull", image], check=True)
...
-            docker_cmd.append("cognee/cognee-mcp:daulet-dev")
+            docker_cmd.append(image)
♻️ Duplicate comments (1)
cognee-mcp/src/server.py (1)

313-317: Same ingestion issue as above when data is a path.

This also calls cognee_client.add(data) and will send the path string in API mode. Covered by prior comment; fix centrally in CogneeClient.add.

🧹 Nitpick comments (11)
cognee-mcp/src/server.py (6)

151-154: Remove unused import.

KnowledgeGraph is imported but never used.

-                        from cognee.shared.data_models import KnowledgeGraph
-
                         model = load_class(graph_model_file, graph_model_name)

309-312: Remove unused import (duplicate).

Same unused KnowledgeGraph import appears here; safe to remove.

-                    from cognee.shared.data_models import KnowledgeGraph
-
                     graph_model = load_class(graph_model_file, graph_model_name)

599-617: Harden result handling and simplify branching in search().

  • Direct mode: guard against empty results before indexing [0] for COMPLETION types.
  • Optional: flatten elif after return to satisfy linter and readability.
-            if cognee_client.use_api:
+            if cognee_client.use_api:
                 # API mode returns JSON-serialized results
                 if isinstance(search_results, str):
                     return search_results
-                elif isinstance(search_results, list):
+                if isinstance(search_results, list):
                     if (
                         search_type.upper() in ["GRAPH_COMPLETION", "RAG_COMPLETION"]
                         and len(search_results) > 0
                     ):
                         return str(search_results[0])
                     return str(search_results)
-                else:
-                    return json.dumps(search_results, cls=JSONEncoder)
+                return json.dumps(search_results, cls=JSONEncoder)
             else:
-                # Direct mode processing
-                if search_type.upper() == "CODE":
+                # Direct mode processing
+                if search_type.upper() == "CODE":
                     return json.dumps(search_results, cls=JSONEncoder)
-                elif (
-                    search_type.upper() == "GRAPH_COMPLETION"
-                    or search_type.upper() == "RAG_COMPLETION"
-                ):
-                    return str(search_results[0])
-                elif search_type.upper() == "CHUNKS":
+                if search_type.upper() in ("GRAPH_COMPLETION", "RAG_COMPLETION"):
+                    if not search_results:
+                        return "No results"
+                    return str(search_results[0])
+                if search_type.upper() == "CHUNKS":
                     return str(search_results)
-                elif search_type.upper() == "INSIGHTS":
+                if search_type.upper() == "INSIGHTS":
                     results = retrieved_edges_to_string(search_results)
                     return results
-                else:
-                    return str(search_results)
+                return str(search_results)

[Based on static analysis hints]

Also applies to: 618-633


13-14: Prefer from mcp import types import style.

Avoid import mcp.types as types per linter guidance; it also reads cleaner.

-import mcp.types as types
+from mcp import types

[Based on static analysis hints]


64-74: Make CORS origins configurable.

Hard-coding http://localhost:3000 can break UI usage (Docker, different ports). Read origins from env, comma-separated.

-    sse_app.add_middleware(
+    allowed = os.getenv("MCP_CORS_ORIGINS", "http://localhost:3000").split(",")
+    sse_app.add_middleware(
         CORSMiddleware,
-        allow_origins=["http://localhost:3000"],
+        allow_origins=[o.strip() for o in allowed if o.strip()],
         allow_credentials=True,
         allow_methods=["GET"],
         allow_headers=["*"],
     )

Apply the same change to run_http_with_cors().

Also applies to: 66-73


1129-1140: Close HTTP client on shutdown to avoid resource leaks.

Ensure httpx.AsyncClient is closed when the server stops.

-    logger.info(f"Starting MCP server with transport: {args.transport}")
-    if args.transport == "stdio":
-        await mcp.run_stdio_async()
-    elif args.transport == "sse":
-        logger.info(f"Running MCP server with SSE transport on {args.host}:{args.port}")
-        await run_sse_with_cors()
-    elif args.transport == "http":
-        logger.info(
-            f"Running MCP server with Streamable HTTP transport on {args.host}:{args.port}{args.path}"
-        )
-        await run_http_with_cors()
+    logger.info(f"Starting MCP server with transport: {args.transport}")
+    try:
+        if args.transport == "stdio":
+            await mcp.run_stdio_async()
+        elif args.transport == "sse":
+            logger.info(f"Running MCP server with SSE transport on {args.host}:{args.port}")
+            await run_sse_with_cors()
+        elif args.transport == "http":
+            logger.info(
+                f"Running MCP server with Streamable HTTP transport on {args.host}:{args.port}{args.path}"
+            )
+            await run_http_with_cors()
+    finally:
+        if cognee_client and cognee_client.use_api:
+            await cognee_client.close()
cognee-mcp/src/cognee_client.py (4)

9-15: Add imports for file upload handling and JSON-encoding lists.

To support path detection and list serialization in API mode.

-import sys
+import sys
+import os
+import json
+import mimetypes

140-147: Pass datasets through in direct-mode cognify().

You accept datasets but don’t pass them in direct mode.

                 kwargs = {}
                 if custom_prompt:
                     kwargs["custom_prompt"] = custom_prompt
                 if graph_model:
                     kwargs["graph_model"] = graph_model
+                if datasets:
+                    kwargs["datasets"] = datasets

74-99: Flatten “else/elif after return/raise” for readability.

Multiple blocks can be simplified per Pylint R1705/R1720; no behavior change.

[Based on static analysis hints]

Also applies to: 124-147, 178-198, 218-237, 248-256, 272-279, 297-306, 317-334


37-45: Unify header construction.

Use _get_headers() consistently (including add()) rather than inlining Authorization headers.

Also applies to: 87-95

cognee-mcp/Dockerfile (1)

68-70: Consider adding standard OCI labels and a healthcheck.

Add title, source, URL, licenses, version, and an HTTP healthcheck for better ops metadata.

 LABEL org.opencontainers.image.description="Cognee MCP Server with API mode support"
+LABEL org.opencontainers.image.title="cognee-mcp"
+LABEL org.opencontainers.image.url="https://github.com/topoteretes/cognee"
+LABEL org.opencontainers.image.source="https://github.com/topoteretes/cognee/tree/main/cognee-mcp"
+LABEL org.opencontainers.image.licenses="Apache-2.0"
+# Optional: add healthcheck against /health (HTTP/SSE depending on mode)
+# HEALTHCHECK --interval=30s --timeout=5s --retries=3 CMD curl -fsS http://localhost:8000/health || exit 1
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 760a9de and f11c01e.

⛔ Files ignored due to path filters (2)
  • cognee-mcp/pyproject.toml is excluded by !**/*.toml
  • cognee-mcp/uv.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (6)
  • cognee-mcp/Dockerfile (1 hunks)
  • cognee-mcp/README.md (4 hunks)
  • cognee-mcp/entrypoint.sh (1 hunks)
  • cognee-mcp/src/cognee_client.py (1 hunks)
  • cognee-mcp/src/server.py (18 hunks)
  • cognee/api/v1/ui/ui.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
cognee-mcp/src/cognee_client.py (4)
cognee/shared/logging_utils.py (1)
  • get_logger (182-194)
cognee/api/v1/datasets/datasets.py (1)
  • datasets (7-40)
cognee/modules/search/types/SearchType.py (1)
  • SearchType (4-20)
cognee/modules/users/methods/get_default_user.py (1)
  • get_default_user (13-41)
cognee-mcp/src/server.py (1)
cognee-mcp/src/cognee_client.py (1)
  • CogneeClient (19-339)
cognee/api/v1/ui/ui.py (1)
cognee/shared/logging_utils.py (1)
  • info (175-175)
cognee-mcp/entrypoint.sh (1)
alembic/versions/482cd6517ce4_add_default_user.py (1)
  • upgrade (25-29)
🪛 Pylint (3.3.9)
cognee-mcp/src/cognee_client.py

[refactor] 74-99: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 124-147: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 149-149: Too many arguments (6/5)

(R0913)


[refactor] 149-149: Too many positional arguments (6/5)

(R0917)


[refactor] 178-198: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 218-237: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 248-256: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 272-279: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 297-306: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 317-334: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

cognee-mcp/src/server.py

[refactor] 13-13: Use 'from mcp import types' instead

(R0402)


[refactor] 606-616: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)


[refactor] 619-632: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Run Telemetry Test
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: Basic Tests / Run Formatting Check
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: CLI Tests / CLI Functionality Tests
🔇 Additional comments (5)
cognee-mcp/entrypoint.sh (1)

33-42: Graceful migration handling never runs under set -e

With set -e in effect, a failing alembic upgrade head inside command substitution exits the script immediately, so the special-case logic for UserAlreadyExists is never reached. Even if it were, only stdout is captured—Alembic sends its traceback (and the “already exists” message) to stderr, so the pattern match still fails and the container exits. Please temporarily disable errexit around the migration call and capture stderr as well:

-    MIGRATION_OUTPUT=$(alembic upgrade head)
-    MIGRATION_EXIT_CODE=$?
+    set +e
+    MIGRATION_OUTPUT=$(alembic upgrade head 2>&1)
+    MIGRATION_EXIT_CODE=$?
+    set -e

Likely an incorrect or invalid review comment.

cognee-mcp/src/server.py (1)

1035-1044: Global client initialization order – verify no tool runs before init.

Tools reference cognee_client global. Ensure MCP never invokes tools before main() sets it. If there’s any alternative entrypoint, consider a lazy factory to avoid None access.

Also applies to: 1096-1098

cognee-mcp/README.md (1)

317-386: API mode ingestion note (if not implementing file upload soon).

If you don’t implement file-path upload in API mode, call out that cognify(data=...) requires raw text, not a filesystem path. Otherwise, once fixed per code suggestion, you can omit this note.

cognee-mcp/src/cognee_client.py (2)

194-198: Forward optional args to cognee.search if supported
Direct-mode search currently ignores datasets, system_prompt, and top_k. Ensure these optional parameters are forwarded to self.cognee.search when supported by its signature.


32-40: Ensure httpx is pinned to a tested patch and remove deprecated SSL/proxy args
I didn’t find any httpx version in pyproject.toml; verify that you’re pinning httpx (e.g., 0.28.1) in your dependencies and remove any deprecated SSL or proxy parameters elsewhere.

Comment on lines 74 to 95
if self.use_api:
# API mode: Make HTTP request
endpoint = f"{self.api_url}/api/v1/add"

# For API mode, we need to handle file uploads differently
# For now, we'll assume data is text content
files = {"data": ("data.txt", str(data), "text/plain")}
form_data = {
"datasetName": dataset_name,
}
if node_set:
form_data["node_set"] = node_set

response = await self.client.post(
endpoint,
files=files,
data=form_data,
headers={"Authorization": f"Bearer {self.api_token}"} if self.api_token else {},
)
response.raise_for_status()
return response.json()
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix API-mode add(): detect file paths and upload file bytes; serialize node_set; use common headers.

Current code uploads the literal path string. Also, lists in multipart need explicit serialization.

-            # For API mode, we need to handle file uploads differently
-            # For now, we'll assume data is text content
-            files = {"data": ("data.txt", str(data), "text/plain")}
-            form_data = {
-                "datasetName": dataset_name,
-            }
-            if node_set:
-                form_data["node_set"] = node_set
-
-            response = await self.client.post(
-                endpoint,
-                files=files,
-                data=form_data,
-                headers={"Authorization": f"Bearer {self.api_token}"} if self.api_token else {},
-            )
+            # Build multipart form: upload file if `data` is a file path, else send text content
+            form_data = {"datasetName": dataset_name}
+            if node_set is not None:
+                # Serialize list to string for multipart form fields
+                form_data["node_set"] = json.dumps(node_set)
+
+            path = str(data) if isinstance(data, (str, os.PathLike)) else None
+            if path and os.path.isfile(path):
+                filename = os.path.basename(path)
+                content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
+                with open(path, "rb") as fp:
+                    response = await self.client.post(
+                        endpoint, files={"data": (filename, fp, content_type)}, data=form_data, headers=self._get_headers()
+                    )
+            else:
+                response = await self.client.post(
+                    endpoint, files={"data": ("data.txt", str(data), "text/plain")}, data=form_data, headers=self._get_headers()
+                )
             response.raise_for_status()
             return response.json()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if self.use_api:
# API mode: Make HTTP request
endpoint = f"{self.api_url}/api/v1/add"
# For API mode, we need to handle file uploads differently
# For now, we'll assume data is text content
files = {"data": ("data.txt", str(data), "text/plain")}
form_data = {
"datasetName": dataset_name,
}
if node_set:
form_data["node_set"] = node_set
response = await self.client.post(
endpoint,
files=files,
data=form_data,
headers={"Authorization": f"Bearer {self.api_token}"} if self.api_token else {},
)
response.raise_for_status()
return response.json()
else:
if self.use_api:
# API mode: Make HTTP request
endpoint = f"{self.api_url}/api/v1/add"
# Build multipart form: upload file if `data` is a file path, else send text content
form_data = {"datasetName": dataset_name}
if node_set is not None:
# Serialize list to string for multipart form fields
form_data["node_set"] = json.dumps(node_set)
path = str(data) if isinstance(data, (str, os.PathLike)) else None
if path and os.path.isfile(path):
filename = os.path.basename(path)
content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
with open(path, "rb") as fp:
response = await self.client.post(
endpoint, files={"data": (filename, fp, content_type)}, data=form_data, headers=self._get_headers()
)
else:
response = await self.client.post(
endpoint, files={"data": ("data.txt", str(data), "text/plain")}, data=form_data, headers=self._get_headers()
)
response.raise_for_status()
return response.json()
else:
🧰 Tools
🪛 Pylint (3.3.9)

[refactor] 74-99: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

Comment on lines +142 to 156
await cognee_client.add(file_path, node_set=["developer_rules"])

model = None
if graph_model_file and graph_model_name:
model = load_class(graph_model_file, graph_model_name)
await cognee.cognify(graph_model=model)
if cognee_client.use_api:
logger.warning(
"Custom graph models are not supported in API mode, ignoring."
)
else:
from cognee.shared.data_models import KnowledgeGraph

model = load_class(graph_model_file, graph_model_name)

await cognee_client.cognify(graph_model=model)
logger.info(f"Cognify finished for: {file_path}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

API-mode ingestion likely broken for file paths (path string sent as content).

Here cognee_client.add(file_path, ...) passes a filesystem path. In API mode, CogneeClient.add assumes text and sends the string value as the file body. This will ingest the literal path, not file contents.

Fix by making CogneeClient.add detect file paths and upload file bytes (see proposed diff in cognee_client.py). Meanwhile, avoid passing paths in API mode until fixed.

🤖 Prompt for AI Agents
In cognee-mcp/src/server.py around lines 142-156, the call await
cognee_client.add(file_path, node_set=["developer_rules"]) passes a filesystem
path string which in API mode will be sent as the file body (ingesting the
literal path). Change the logic so that when cognee_client.use_api is True you
open the file and pass its contents/bytes (or an appropriate file-like object)
to cognee_client.add instead of the path string; if the client API expects a
different parameter name or wrapper, adapt the call accordingly. Alternatively,
until cognee_client.add is updated to auto-detect paths, gate the call so API
mode does not pass raw paths (e.g., read file and pass content, or raise/skip
with a clear warning).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee/api/v1/ui/ui.py (1)

503-594: MCP may start with invalid API_URL if backend fails.

MCP starts at line 503, backend at line 556. If start_backend is True but the backend fails to start (lines 586-588), MCP remains running with API_URL pointing to a non-existent backend. Consider either (1) starting the backend first and verifying it's healthy before starting MCP, or (2) cleaning up the MCP process if the backend fails.

Apply this diff to start the backend first:

     logger.info("✓ All required ports are available")
     backend_process = None

+    # Start backend server first if requested
+    if start_backend:
+        logger.info("Starting cognee backend API server...")
+        try:
+            import sys
+
+            backend_process = subprocess.Popen(
+                [
+                    sys.executable,
+                    "-m",
+                    "uvicorn",
+                    "cognee.api.client:app",
+                    "--host",
+                    "localhost",
+                    "--port",
+                    str(backend_port),
+                ],
+                stdout=subprocess.PIPE,
+                stderr=subprocess.PIPE,
+                preexec_fn=os.setsid if hasattr(os, "setsid") else None,
+            )
+
+            # Start threads to stream backend output with prefix
+            _stream_process_output(backend_process, "stdout", "[BACKEND]", "\033[32m")  # Green
+            _stream_process_output(backend_process, "stderr", "[BACKEND]", "\033[32m")  # Green
+
+            pid_callback(backend_process.pid)
+
+            # Give the backend a moment to start
+            time.sleep(2)
+
+            if backend_process.poll() is not None:
+                logger.error("Backend server failed to start - process exited early")
+                return None
+
+            logger.info(f"✓ Backend API started at http://localhost:{backend_port}")
+
+        except Exception as e:
+            logger.error(f"Failed to start backend server: {str(e)}")
+            return None
+
     if start_mcp:
         logger.info("Starting Cognee MCP server with Docker...")
         cwd = os.getcwd()
         env_file = os.path.join(cwd, ".env")
         try:
             image = "cognee/cognee-mcp:main"
             subprocess.run(["docker", "pull", image], check=True)

             docker_cmd = [
                 "docker",
                 "run",
                 "-p",
                 f"{mcp_port}:8000",
                 "--rm",
                 "--env-file",
                 env_file,
                 "-e",
                 "TRANSPORT_MODE=sse",
             ]

             if start_backend:
                 docker_cmd.extend(
                     [
                         "-e",
                         f"API_URL=http://localhost:{backend_port}",
                     ]
                 )
                 logger.info(
                     f"Configuring MCP to connect to backend API at http://localhost:{backend_port}"
                 )
                 logger.info("(localhost will be auto-converted to host.docker.internal)")

             docker_cmd.append("cognee/cognee-mcp:main")

             mcp_process = subprocess.Popen(
                 docker_cmd,
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 preexec_fn=os.setsid if hasattr(os, "setsid") else None,
             )

             _stream_process_output(mcp_process, "stdout", "[MCP]", "\033[34m")  # Blue
             _stream_process_output(mcp_process, "stderr", "[MCP]", "\033[34m")  # Blue

             pid_callback(mcp_process.pid)

             mode_info = "API mode" if start_backend else "direct mode"
             logger.info(
                 f"✓ Cognee MCP server starting on http://127.0.0.1:{mcp_port}/sse ({mode_info})"
             )
         except Exception as e:
             logger.error(f"Failed to start MCP server with Docker: {str(e)}")
+            # Clean up backend if MCP fails
+            if backend_process:
+                logger.info("Cleaning up backend process due to MCP failure...")
+                try:
+                    backend_process.terminate()
+                    backend_process.wait(timeout=5)
+                except Exception:
+                    try:
+                        backend_process.kill()
+                    except Exception:
+                        pass
-    # Start backend server if requested
-    if start_backend:
-        logger.info("Starting cognee backend API server...")
-        try:
-            import sys
-
-            backend_process = subprocess.Popen(
-                [
-                    sys.executable,
-                    "-m",
-                    "uvicorn",
-                    "cognee.api.client:app",
-                    "--host",
-                    "localhost",
-                    "--port",
-                    str(backend_port),
-                ],
-                stdout=subprocess.PIPE,
-                stderr=subprocess.PIPE,
-                preexec_fn=os.setsid if hasattr(os, "setsid") else None,
-            )
-
-            # Start threads to stream backend output with prefix
-            _stream_process_output(backend_process, "stdout", "[BACKEND]", "\033[32m")  # Green
-            _stream_process_output(backend_process, "stderr", "[BACKEND]", "\033[32m")  # Green
-
-            pid_callback(backend_process.pid)
-
-            # Give the backend a moment to start
-            time.sleep(2)
-
-            if backend_process.poll() is not None:
-                logger.error("Backend server failed to start - process exited early")
-                return None
-
-            logger.info(f"✓ Backend API started at http://localhost:{backend_port}")
-
-        except Exception as e:
-            logger.error(f"Failed to start backend server: {str(e)}")
-            return None
-
     # Find frontend directory
     frontend_path = find_frontend_path()
🧹 Nitpick comments (1)
cognee/api/v1/ui/ui.py (1)

508-521: Add checks for Docker availability and env file existence.

The code assumes Docker is installed and the .env file exists. If either is missing, the error messages may be unclear.

Add validation before the docker pull:

            # Check Docker availability
            try:
                subprocess.run(
                    ["docker", "--version"],
                    check=True,
                    capture_output=True,
                    timeout=5
                )
            except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
                logger.error("Docker is not installed or not running. MCP server requires Docker.")
                logger.error("Please install Docker from https://www.docker.com/get-started")
                return None
            
            # Check env file existence
            if not os.path.exists(env_file):
                logger.warning(f".env file not found at {env_file}")
                logger.warning("MCP will start without environment file. Some features may not work.")
            
            image = "cognee/cognee-mcp:main"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f11c01e and f227b6d.

📒 Files selected for processing (1)
  • cognee/api/v1/ui/ui.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cognee/api/v1/ui/ui.py (1)
cognee/shared/logging_utils.py (1)
  • info (175-175)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: CLI Tests / CLI Unit Tests
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / Test using different async databases in parallel in Cognee
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: Basic Tests / Run Simple Examples
  • GitHub Check: Basic Tests / Run Formatting Check
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: Basic Tests / Run Basic Graph Tests
  • GitHub Check: End-to-End Tests / Run Telemetry Test
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: Basic Tests / Run Integration Tests
  • GitHub Check: End-to-End Tests / Server Start Test
🔇 Additional comments (3)
cognee/api/v1/ui/ui.py (3)

508-509: Past issue resolved: image tag is now consistent.

The previous mismatch between pulled tag (main) and run tag (daulet-dev) has been fixed. Both pull and run now use the same image variable.

Also applies to: 535-535


549-552: Clear mode indication in logs.

The mode_info variable nicely communicates whether MCP runs in API mode (connected to backend) or direct mode (standalone). This helps users understand the configuration.


523-533: Entrypoint handles localhost conversion
The script in cognee-mcp/entrypoint.sh (line 62) uses sed to rewrite localhost/127.0.0.1 to host.docker.internal, so the log message is accurate.

@daukadolt daukadolt force-pushed the feat/mcp-add-support-for-non-standalone-mode branch from 54d82fd to 079a552 Compare October 9, 2025 15:36
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee/cli/_cognee.py (1)

230-231: signal_handler always exits with code 0; failure paths report success and make return 1 unreachable

On startup failure you call signal_handler() which sys.exit(0). This masks errors and returns success. Also, the subsequent return 1 is unreachable.

Fix by parameterizing the exit code:

-        def signal_handler(signum, frame):
+        def signal_handler(signum, frame, exit_code=0):
             """Handle Ctrl+C and other termination signals"""
-            nonlocal spawned_pids, docker_container
+            nonlocal spawned_pids, docker_container
             fmt.echo("\nShutting down UI server...")
             ...
-            sys.exit(0)
+            sys.exit(exit_code)
@@
-                signal_handler(signal.SIGTERM, None)
+                signal_handler(signal.SIGTERM, None, exit_code=1)
@@
-            fmt.error(f"Error starting UI: {str(ex)}")
-            signal_handler(signal.SIGTERM, None)
+            fmt.error(f"Error starting UI: {str(ex)}")
+            signal_handler(signal.SIGTERM, None, exit_code=1)

Also applies to: 286-288, 291-295

♻️ Duplicate comments (2)
cognee-mcp/src/cognee_client.py (2)

138-147: Direct-mode cognify: pass datasets for parity with API mode

The API path includes datasets; direct mode ignores it. Pass through when provided.

             with redirect_stdout(sys.stderr):
-                kwargs = {}
+                kwargs = {}
+                if datasets:
+                    kwargs["datasets"] = datasets
                 if custom_prompt:
                     kwargs["custom_prompt"] = custom_prompt
                 if graph_model:
                     kwargs["graph_model"] = graph_model
 
                 await self.cognee.cognify(**kwargs)

32-36: Normalize api_url and derive use_api from the cleaned value

If api_url is "/" it becomes "", leaving self.api_url empty while use_api is True. Also whitespace-only values remain truthy.

-    def __init__(self, api_url: Optional[str] = None, api_token: Optional[str] = None):
-        self.api_url = api_url.rstrip("/") if api_url else None
-        self.api_token = api_token
-        self.use_api = bool(api_url)
+    def __init__(self, api_url: Optional[str] = None, api_token: Optional[str] = None):
+        cleaned = (api_url or "").strip().rstrip("/")
+        self.api_url = cleaned or None
+        self.api_token = api_token
+        self.use_api = bool(self.api_url)
🧹 Nitpick comments (5)
cognee/cli/_cognee.py (2)

198-206: Harden Docker shutdown: robust stderr decode and timeout for forced removal

  • Decoding stderr may raise UnicodeDecodeError; add errors="ignore".
  • Force removal can also hang; add a timeout and handle errors.
-                        fmt.warning(
-                            f"Could not stop container {docker_container}: {result.stderr.decode()}"
-                        )
+                        fmt.warning(
+                            f"Could not stop container {docker_container}: {result.stderr.decode(errors='ignore')}"
+                        )
@@
-                    subprocess.run(
-                        ["docker", "rm", "-f", docker_container], capture_output=True, check=False
-                    )
+                    subprocess.run(
+                        ["docker", "rm", "-f", docker_container],
+                        capture_output=True,
+                        check=False,
+                        timeout=10,
+                    )

240-250: pid_callback stores a single container name; clarify multi-container scenarios

If multiple containers can be spawned, docker_container gets overwritten. If only one is expected, add a brief comment; otherwise collect a set/list and stop all on shutdown.

cognee-mcp/src/cognee_client.py (3)

37-45: Optional: reduce repetition with base_url and default headers

You can initialize the AsyncClient with base_url and default Authorization headers to avoid repeating f-strings and header passing.

-            self.client = httpx.AsyncClient(timeout=300.0)  # 5 minute timeout for long operations
+            default_headers = {"Authorization": f"Bearer {self.api_token}"} if self.api_token else None
+            self.client = httpx.AsyncClient(base_url=self.api_url, timeout=300.0, headers=default_headers)

Then use paths like "/api/v1/add" and drop per-call headers. Keep multipart Content-Type implicit.


248-256: Flatten “else after return/raise” for readability (lint R1705/R1720)

Multiple blocks can be de-indented. Example for prune_data:

-        if self.use_api:
-            # Note: The API doesn't expose a prune endpoint, so we'll need to handle this
-            # For now, raise an error
-            raise NotImplementedError("Prune operation is not available via API")
-        else:
-            # Direct mode: Call cognee directly
-            with redirect_stdout(sys.stderr):
-                await self.cognee.prune.prune_data()
-                return {"status": "success", "message": "Data pruned successfully"}
+        if self.use_api:
+            raise NotImplementedError("Prune operation is not available via API")
+        # Direct mode: Call cognee directly
+        with redirect_stdout(sys.stderr):
+            await self.cognee.prune.prune_data()
+            return {"status": "success", "message": "Data pruned successfully"}

Also applies to: 272-279, 297-306, 317-323


336-339: Add async context manager for clean usage

Implement aenter/aexit so callers can async with CogneeClient(...) as c: and ensure client closure.

     async def close(self):
         """Close the HTTP client if in API mode."""
         if self.use_api and hasattr(self, "client"):
             await self.client.aclose()
+
+    async def __aenter__(self):
+        return self
+
+    async def __aexit__(self, exc_type, exc, tb):
+        await self.close()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f227b6d and 079a552.

📒 Files selected for processing (3)
  • cognee-mcp/src/cognee_client.py (1 hunks)
  • cognee/api/v1/ui/ui.py (2 hunks)
  • cognee/cli/_cognee.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cognee/api/v1/ui/ui.py
🧰 Additional context used
🧬 Code graph analysis (2)
cognee-mcp/src/cognee_client.py (4)
cognee/shared/logging_utils.py (1)
  • get_logger (182-194)
cognee/api/v1/datasets/datasets.py (1)
  • datasets (7-40)
cognee/modules/search/types/SearchType.py (1)
  • SearchType (4-20)
cognee/modules/users/methods/get_default_user.py (1)
  • get_default_user (13-41)
cognee/cli/_cognee.py (1)
cognee/cli/echo.py (3)
  • echo (8-10)
  • success (28-30)
  • warning (18-20)
🪛 Pylint (3.3.9)
cognee-mcp/src/cognee_client.py

[refactor] 74-99: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 124-147: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 149-149: Too many arguments (6/5)

(R0913)


[refactor] 149-149: Too many positional arguments (6/5)

(R0917)


[refactor] 178-198: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 218-237: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[refactor] 248-256: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 272-279: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 297-306: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it

(R1720)


[refactor] 317-334: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Analyze (actions)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
cognee/cli/_cognee.py (1)

214-218: Verify start_ui spawns UI in its own process group
Ensure the UI subprocess launched in start_ui uses start_new_session=True or preexec_fn=os.setsid, otherwise os.killpg won’t target it correctly. I didn’t locate its subprocess.Popen call in _cognee.py—please confirm manually.

Comment on lines 9 to 15
import sys
from typing import Optional, Any, List, Dict
from uuid import UUID
from contextlib import redirect_stdout
import httpx
from cognee.shared.logging_utils import get_logger

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

API add(): detect file paths, upload bytes, and serialize node_set; add needed imports

Currently the literal text is sent and lists are not serialized. Support file uploads and stable form encoding.

@@
-import sys
+import sys
+import os
+import json
+import mimetypes
@@
-            # For API mode, we need to handle file uploads differently
-            # For now, we'll assume data is text content
-            files = {"data": ("data.txt", str(data), "text/plain")}
-            form_data = {
-                "datasetName": dataset_name,
-            }
-            if node_set:
-                form_data["node_set"] = node_set
-
-            response = await self.client.post(
-                endpoint,
-                files=files,
-                data=form_data,
-                headers={"Authorization": f"Bearer {self.api_token}"} if self.api_token else {},
-            )
+            form_data = {"datasetName": dataset_name}
+            if node_set is not None:
+                form_data["node_set"] = json.dumps(node_set)
+
+            auth = {"Authorization": f"Bearer {self.api_token}"} if self.api_token else {}
+            path = str(data) if isinstance(data, (str, os.PathLike)) else None
+            if path and os.path.isfile(path):
+                filename = os.path.basename(path)
+                content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
+                with open(path, "rb") as fp:
+                    response = await self.client.post(endpoint, files={"data": (filename, fp, content_type)}, data=form_data, headers=auth)
+            else:
+                response = await self.client.post(endpoint, files={"data": ("data.txt", str(data), "text/plain")}, data=form_data, headers=auth)

Also applies to: 78-93

🤖 Prompt for AI Agents
cognee-mcp/src/cognee_client.py around lines 9-15 (also applies to 78-93): the
add() API currently posts literal text and doesn't handle file uploads or
serialize list/dict node_set values; update imports and request building to
detect file paths/bytes/file-like objects and send them as multipart file parts
while serializing lists/dicts to JSON for stable form encoding. Concretely: add
imports (os, json, pathlib.Path, typing Union/IO), update parameter processing
to (1) if a value is bytes/bytearray/IO/Open file or a string path pointing to
an existing file, open/read and include it in httpx multipart files with
appropriate filename and content-type; (2) if a value is a list or dict (e.g.,
node_set), json.dumps it before placing in form fields; and (3) build the
request using httpx.post(..., files=files_dict, data=form_fields) so binary
parts are sent as files and other fields as form-encoded strings.

Comment on lines 23 to 30
Parameters
----------
api_url : str, optional
Base URL of the Cognee API server (e.g., "http://localhost:8000").
If None, uses direct cognee function calls.
api_token : str, optional
Authentication token for API requests. Required if api_url is provided.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Docstring vs behavior: api_token marked “Required” but not enforced

Either drop “Required” from docs or validate and raise a clear error when api_url is set without a token.

🤖 Prompt for AI Agents
In cognee-mcp/src/cognee_client.py around lines 23 to 30 the docstring states
api_token is "Required" when api_url is provided but the constructor does not
enforce this; add validation in the __init__ (or factory) so that when api_url
is not None and api_token is None or empty you raise a clear ValueError (or
custom exception) with a message like "api_token is required when api_url is
set" and update any related tests; alternatively, if you prefer not to enforce,
remove the word "Required" from the docstring to keep docs consistent with
behavior.

Comment on lines +191 to +198
# Direct mode: Call cognee directly
from cognee.modules.search.types import SearchType

with redirect_stdout(sys.stderr):
results = await self.cognee.search(
query_type=SearchType[query_type.upper()], query_text=query_text
)
return results
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Direct-mode search ignores datasets/system_prompt/top_k

If supported by cognee.search, pass these for behavioral parity.

If supported, apply:

             with redirect_stdout(sys.stderr):
-                results = await self.cognee.search(
-                    query_type=SearchType[query_type.upper()], query_text=query_text
-                )
+                results = await self.cognee.search(
+                    query_type=SearchType[query_type.upper()],
+                    query_text=query_text,
+                    datasets=datasets if datasets else None,
+                    system_prompt=system_prompt if system_prompt else None,
+                    top_k=top_k,
+                )

🏁 Script executed:

#!/bin/bash
# Inspect cognee.search signature and usages to confirm supported params
ast-grep --pattern $'async def search($$$):\n  $$$'
rg -nP -C2 --type=py '\bsearch\s*\('

Length of output: 99666


🏁 Script executed:

#!/bin/bash
sed -n '140,200p' cognee-mcp/src/cognee_client.py

Length of output: 2124


🏁 Script executed:

#!/bin/bash
sed -n '1,140p' cognee-mcp/src/cognee_client.py

Length of output: 4830


🏁 Script executed:

#!/bin/bash
sed -n '1,200p' cognee/modules/search/methods/search.py

Length of output: 7078


🏁 Script executed:

#!/bin/bash
sed -n '1,200p' cognee/__init__.py

Length of output: 1067


🏁 Script executed:

#!/bin/bash
sed -n '1,200p' cognee/api/v1/search/search.py

Length of output: 8989


Include datasets, system_prompt and top_k in direct-mode search
Direct mode ignores datasets, system_prompt, and top_k. Forward them for parity:

 with redirect_stdout(sys.stderr):
-    results = await self.cognee.search(
-        query_type=SearchType[query_type.upper()], query_text=query_text
-    )
+    results = await self.cognee.search(
+        query_text=query_text,
+        query_type=SearchType[query_type.upper()],
+        datasets=datasets,
+        system_prompt=system_prompt,
+        top_k=top_k,
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Direct mode: Call cognee directly
from cognee.modules.search.types import SearchType
with redirect_stdout(sys.stderr):
results = await self.cognee.search(
query_type=SearchType[query_type.upper()], query_text=query_text
)
return results
# Direct mode: Call cognee directly
from cognee.modules.search.types import SearchType
with redirect_stdout(sys.stderr):
results = await self.cognee.search(
query_text=query_text,
query_type=SearchType[query_type.upper()],
datasets=datasets,
system_prompt=system_prompt,
top_k=top_k,
)
return results
🧰 Tools
🪛 Pylint (3.3.9)

[refactor] 178-198: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

🤖 Prompt for AI Agents
In cognee-mcp/src/cognee_client.py around lines 191 to 198, the direct-mode
search call only passes query_type and query_text and therefore ignores
datasets, system_prompt and top_k; update the await self.cognee.search(...)
invocation to forward the existing datasets, system_prompt and top_k variables
(e.g., pass datasets=datasets, system_prompt=system_prompt, top_k=top_k) so
direct mode has parity with other paths and receives those parameters.

@daukadolt daukadolt changed the title Feat/mcp add support for non standalone mode WIP - Feat/mcp add support for non standalone mode Oct 9, 2025
@daukadolt daukadolt changed the title WIP - Feat/mcp add support for non standalone mode Feat/mcp add support for non standalone mode Oct 9, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
cognee/cli/_cognee.py (2)

184-218: Consider refactoring to reduce complexity and improve error handling.

The signal handler has grown complex with 14 branches and nested try-except blocks around echo calls. Additionally, the forced container removal at lines 214-216 doesn't verify success.

Consider extracting container cleanup into a separate function:

+def _stop_docker_container(container_name: str) -> None:
+    """Stop and remove a Docker container"""
+    try:
+        result = subprocess.run(
+            ["docker", "stop", container_name],
+            capture_output=True,
+            timeout=10,
+            check=False,
+        )
+        if result.returncode == 0:
+            try:
+                fmt.success(f"✓ Docker container {container_name} stopped.")
+            except (BrokenPipeError, OSError):
+                pass
+        else:
+            try:
+                fmt.warning(f"Could not stop container {container_name}: {result.stderr.decode()}")
+            except (BrokenPipeError, OSError):
+                pass
+    except subprocess.TimeoutExpired:
+        try:
+            fmt.warning(f"Timeout stopping container {container_name}, forcing removal...")
+        except (BrokenPipeError, OSError):
+            pass
+        result = subprocess.run(
+            ["docker", "rm", "-f", container_name], 
+            capture_output=True, 
+            check=False
+        )
+        # Verify forced removal succeeded
+        if result.returncode != 0:
+            try:
+                fmt.warning(f"Force removal of container {container_name} failed: {result.stderr.decode()}")
+            except (BrokenPipeError, OSError):
+                pass
+    except Exception:
+        pass
+
 def signal_handler(signum, frame):
     """Handle Ctrl+C and other termination signals"""
     nonlocal spawned_pids, docker_container
     
     try:
         fmt.echo("\nShutting down UI server...")
     except (BrokenPipeError, OSError):
         pass
     
     # First, stop Docker container if running
     if docker_container:
-        try:
-            result = subprocess.run(
-                ["docker", "stop", docker_container],
-                capture_output=True,
-                timeout=10,
-                check=False,
-            )
-            try:
-                if result.returncode == 0:
-                    fmt.success(f"✓ Docker container {docker_container} stopped.")
-                else:
-                    fmt.warning(
-                        f"Could not stop container {docker_container}: {result.stderr.decode()}"
-                    )
-            except (BrokenPipeError, OSError):
-                pass
-        except subprocess.TimeoutExpired:
-            try:
-                fmt.warning(
-                    f"Timeout stopping container {docker_container}, forcing removal..."
-                )
-            except (BrokenPipeError, OSError):
-                pass
-            subprocess.run(
-                ["docker", "rm", "-f", docker_container], capture_output=True, check=False
-            )
-        except Exception:
-            pass
+        _stop_docker_container(docker_container)

This refactoring:

  • Reduces cognitive complexity by extracting Docker cleanup logic
  • Verifies that forced removal succeeds
  • Makes the signal handler more maintainable

258-266: Consider adding type hints for clarity.

The callback correctly handles both plain PIDs and (PID, container_name) tuples. Consider adding a type hint to make the expected input explicit.

+from typing import Union, Tuple
+
 # Callback to capture PIDs and Docker container of all spawned processes
-def pid_callback(pid_or_tuple):
+def pid_callback(pid_or_tuple: Union[int, Tuple[int, str]]) -> None:
     nonlocal spawned_pids, docker_container
     # Handle both regular PIDs and (PID, container_name) tuples
     if isinstance(pid_or_tuple, tuple):
         pid, container_name = pid_or_tuple
         spawned_pids.append(pid)
         docker_container = container_name
     else:
         spawned_pids.append(pid_or_tuple)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 079a552 and 2c8d8e8.

📒 Files selected for processing (1)
  • cognee/cli/_cognee.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cognee/cli/_cognee.py (2)
cognee/cli/echo.py (3)
  • echo (8-10)
  • success (28-30)
  • warning (18-20)
cognee/api/v1/ui/ui.py (1)
  • start_ui (426-718)
🪛 Pylint (3.3.9)
cognee/cli/_cognee.py

[refactor] 180-180: Too many branches (14/12)

(R0912)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: End-to-End Tests / Test using different async databases in parallel in Cognee
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: CLI Tests / CLI Integration Tests
  • GitHub Check: CLI Tests / CLI Unit Tests
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: End-to-End Tests / Run Telemetry Test
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: Basic Tests / Run Basic Graph Tests
  • GitHub Check: Basic Tests / Run Integration Tests
  • GitHub Check: Basic Tests / Run Linting
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: Basic Tests / Run Simple Examples
🔇 Additional comments (3)
cognee/cli/_cognee.py (3)

178-178: LGTM!

The docker_container state variable correctly tracks the optional Docker container associated with spawned processes, enabling container-aware shutdown logic.


220-244: LGTM!

The process cleanup logic correctly handles both Unix-like systems (using process groups) and Windows (using taskkill), with appropriate exception handling for broken pipes and process lookup errors.


249-250: LGTM!

The conditional SIGHUP signal handler registration is appropriate for Unix-like systems and follows best practices for cross-platform compatibility.

@daukadolt daukadolt changed the base branch from main to dev October 10, 2025 13:21
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is being reviewed by Cursor Bugbot

Details

You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.


docker_cmd.append(
image
) # TODO: change to "cognee/cognee-mcp:main" right before merging into main
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Docker Image Tagging Error

The MCP Docker image is hardcoded to cognee/cognee-mcp:feature-standalone-mcp instead of cognee/cognee-mcp:main. This temporary development tag, noted by TODO comments, was committed and may cause deployment failures or unexpected behavior in production environments.

Fix in Cursor Fix in Web

from cognee.shared.data_models import KnowledgeGraph

model = load_class(graph_model_file, graph_model_name)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Refactoring Breaks Default Graph Model Assignment

The refactoring to use CogneeClient inadvertently removed the default KnowledgeGraph assignment for cognify operations in direct mode. When no custom graph model is provided, the graph_model now defaults to None, changing the expected behavior.

Additional Locations (1)

Fix in Cursor Fix in Web

@Vasilije1990 Vasilije1990 merged commit c6d12e8 into dev Oct 12, 2025
108 of 113 checks passed
@Vasilije1990 Vasilije1990 deleted the feat/mcp-add-support-for-non-standalone-mode branch October 12, 2025 12:10
@dexters1
Copy link
Collaborator

@daukadolt with this change we now have three different MCP integrations to maintain for Cognee, which makes adding new feature or changes pretty hard, do we need all them? Is it enough just to have MCP start Cognee as a FastAPI server and remove the rest of the ways of working?

@daukadolt
Copy link
Contributor Author

daukadolt commented Oct 14, 2025

@dexters1 There is no difference in sse and http in maintenance sense, they are same thing under the hood, and MCP guidance is to always support stdio.

We can remove explicit SSE and keep HTTP and stdio later on

which makes adding new feature or changes pretty hard

Not sure I agree. It's essentially two transports, and both are needed.

HTTP/SSE is needed as we're planning to support remote MCP

Is it enough just to have MCP start Cognee as a FastAPI server

That would work for standalone setup, but current version doesn't benefit from that, as for standalone setup I just moved previously working MCP implementation. It wouldn't work for non-standalone

This was referenced Oct 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants