Merged

Commits (28, all by Vasilije1990)
daf2d54  Add data visualization for Anthropic (Jan 10, 2025)
b132ff4  Update cognee-mcp/cognee_mcp/server.py (Jan 11, 2025)
7b0bfe9  Update cognee-mcp/cognee_mcp/server.py (Jan 11, 2025)
cf4737b  Update cognee-mcp/cognee_mcp/server.py (Jan 12, 2025)
55e9d64  Add data visualization for Anthropic (Jan 13, 2025)
047948a  Add data visualization for Anthropic (Jan 14, 2025)
3ba98b2  Merge branch 'dev' into COG-975 (Jan 14, 2025)
ad07bae  Add data visualization for Anthropic (Jan 14, 2025)
a0e3686  Update README.md (Jan 14, 2025)
61118dd  Update README.md (Jan 14, 2025)
e71f852  Update README.md (Jan 14, 2025)
933d21a  Update dockerhub pushes (Jan 14, 2025)
aef7822  Merge branch 'dev' into COG-975 (Jan 15, 2025)
be0b486  Update lock files (Jan 16, 2025)
662faeb  Update format (Jan 16, 2025)
4a87df9  Update format (Jan 16, 2025)
4ae8eb9  Update format (Jan 16, 2025)
1af24dc  Update format (Jan 16, 2025)
b2355de  Update format (Jan 16, 2025)
5b31638  Update format (Jan 16, 2025)
f19b58a  Update format (Jan 16, 2025)
5aaf420  Fix for now (Jan 16, 2025)
72b503f  Fix for now (Jan 16, 2025)
7a4a0f4  Fix for now (Jan 16, 2025)
0783625  Fix for now (Jan 16, 2025)
bbd51e8  Fix for now (Jan 16, 2025)
cb7b2d3  Fix for now (Jan 16, 2025)
fe47253  Fix for now (Jan 16, 2025)
9 changes: 9 additions & 0 deletions cognee-mcp/README.md
@@ -83,3 +83,12 @@ npx -y @smithery/cli install cognee --client claude

Define cognify tool in server.py
Restart your Claude desktop.


To use the debugger, run:

```bash
npx @modelcontextprotocol/inspector uv --directory /Users/name/folder run cognee
```

⚠️ Potential issue (Contributor review comment)

Replace hardcoded paths with placeholders. The instructions contain specific user paths that won't work for other users. Apply these changes:

-npx @modelcontextprotocol/inspector uv --directory /Users/name/folder run cognee
+npx @modelcontextprotocol/inspector uv --directory /Users/{username}/path/to/folder run cognee

-npx @modelcontextprotocol/inspector uv --directory /Users/vasilije/cognee/cognee-mcp run cognee
+npx @modelcontextprotocol/inspector uv --directory /Users/{username}/cognee/cognee-mcp run cognee

Also applies to: 105-105

To reset the installation:

```bash
uv sync --dev --all-extras --reinstall
```
56 changes: 56 additions & 0 deletions cognee-mcp/cognee_mcp/server.py
@@ -3,13 +3,17 @@
import asyncio
from contextlib import redirect_stderr, redirect_stdout

from sqlalchemy.testing.plugin.plugin_base import logging

🛠️ Refactor suggestion (Contributor, on lines +6 to +7)

Use the standard Python logging module instead. Importing logging from sqlalchemy's testing plugin is not recommended; use Python's built-in logging module.

-from sqlalchemy.testing.plugin.plugin_base import logging
+import logging
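
A minimal sketch of the standard-library replacement; the logger name and startup message are illustrative assumptions, not taken from the PR:

```python
import logging

# Plain standard-library logging; "cognee-mcp" is a hypothetical logger name.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cognee-mcp")
logger.info("cognee MCP server module loaded")
```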

import cognee
import mcp.server.stdio
import mcp.types as types
from cognee.api.v1.search import SearchType
from cognee.shared.data_models import KnowledgeGraph
from mcp.server import NotificationOptions, Server
from mcp.server.models import InitializationOptions
from PIL import Image


server = Server("cognee-mcp")

@@ -87,9 +91,45 @@ async def handle_list_tools() -> list[types.Tool]:
},
},
),
types.Tool(
name="visualize",
description="Visualize the knowledge graph.",
inputSchema={
"type": "object",
"properties": {
"query": {"type": "string"},
},
},
),
🛠️ Refactor suggestion (coderabbitai bot, Jan 11, 2025, on lines +94 to +103)

Clarify the query parameter requirement in the visualization tool schema. The tool's schema defines a "query" property but doesn't mark it as required. Either:

1. Mark it as required if visualization needs a query parameter.
2. Document that it's optional if visualization can work without a query.

         types.Tool(
             name="visualize",
             description="Visualize the knowledge graph.",
             inputSchema={
                 "type": "object",
                 "properties": {
                     "query": {"type": "string"},
                 },
+                "required": ["query"],  # Add this if query is required
             },
         ),

Contributor: This is a good suggestion from coderabbit

coderabbitai bot: @borisarzentar Thank you for confirming! I'm glad the suggestion was helpful.
]


def get_freshest_png(directory: str) -> Image.Image:
    if not os.path.exists(directory):
        raise FileNotFoundError(f"Directory {directory} does not exist")

    # List all files in 'directory' that end with .png
    files = [f for f in os.listdir(directory) if f.endswith(".png")]
    if not files:
        raise FileNotFoundError("No PNG files found in the given directory.")

    # Sort by integer value of the filename (minus the '.png')
    # Example filename: 1673185134.png -> integer 1673185134
    try:
        files_sorted = sorted(files, key=lambda x: int(x.replace(".png", "")))
    except ValueError as e:
        raise ValueError("Invalid PNG filename format. Expected timestamp format.") from e

    # The "freshest" file has the largest timestamp
    freshest_filename = files_sorted[-1]
    freshest_path = os.path.join(directory, freshest_filename)

    # Open the image with PIL and return the PIL Image object
    try:
        return Image.open(freshest_path)
    except (IOError, OSError) as e:
        raise IOError(f"Failed to open PNG file {freshest_path}") from e
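
A hypothetical usage of the helper above; the directory path is an assumption for illustration:

```python
# Open the most recently rendered graph PNG (directory is hypothetical).
latest = get_freshest_png("/Users/{username}/graph_renders")
print(latest.size)  # a PIL Image exposes (width, height)
```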

@server.call_tool()
async def handle_call_tool(
name: str, arguments: dict | None
@@ -154,6 +194,22 @@ async def handle_call_tool(
text="Pruned",
)
]

    elif name == "visualize":
        with open(os.devnull, "w") as fnull:
            with redirect_stdout(fnull), redirect_stderr(fnull):
                try:
                    result = await cognee.visualize_graph()
                    results = retrieved_edges_to_string(result)

                    return [
                        types.TextContent(
                            type="text",
                            text=results,
                        )
                    ]
                except (FileNotFoundError, IOError, ValueError) as e:
                    raise ValueError(f"Failed to create visualization: {str(e)}")
🛠️ Refactor suggestion (Contributor review comment)

Return visualization output instead of text content. The visualization tool returns text content (types.TextContent) despite being described as a tool to "Visualize the knowledge graph." Consider returning the visualization as an image using types.ImageContent or as an embedded resource using types.EmbeddedResource.

                    return [
-                       types.TextContent(
-                           type="text",
-                           text=results,
-                       )
+                       types.ImageContent(
+                           type="image",
+                           data=await cognee.visualize_graph(),
+                           format="png"
+                       )
                    ]

Committable suggestion skipped: line range outside the PR's diff.
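
One possible wiring, sketched as a replacement for the handler body; it assumes the MCP SDK's ImageContent accepts base64-encoded data with a mimeType field (the bot's format="png" keyword is unverified) and reuses the get_freshest_png helper defined above:

```python
import base64
import io

# Inside the "visualize" branch of handle_call_tool (sketch, untested):
await cognee.visualize_graph()   # render the graph to disk
image = get_freshest_png(".")    # "." is an assumed output directory

# Encode the PIL image as base64 PNG bytes for the MCP response.
buffer = io.BytesIO()
image.save(buffer, format="PNG")
encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")

return [
    types.ImageContent(
        type="image",
        data=encoded,
        mimeType="image/png",  # assumed field name; verify against the SDK
    )
]
```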

else:
raise ValueError(f"Unknown tool: {name}")

5 changes: 3 additions & 2 deletions cognee-mcp/pyproject.toml
@@ -4,6 +4,7 @@ version = "0.1.0"
description = "A MCP server project"
readme = "README.md"
requires-python = ">=3.10"

dependencies = [
"mcp>=1.1.1",
"openai==1.59.4",
@@ -51,7 +52,7 @@ dependencies = [
"pydantic-settings>=2.2.1,<3.0.0",
"anthropic>=0.26.1,<1.0.0",
"sentry-sdk[fastapi]>=2.9.0,<3.0.0",
"fastapi-users[sqlalchemy]", # Optional
"fastapi-users[sqlalchemy]>=14.0.0", # Optional
"alembic>=1.13.3,<2.0.0",
"asyncpg==0.30.0", # Optional
"pgvector>=0.3.5,<0.4.0", # Optional
@@ -91,4 +92,4 @@ dev = [
]

[project.scripts]
cognee = "cognee_mcp:main"
cognee = "cognee_mcp:main"
15 changes: 11 additions & 4 deletions cognee-mcp/uv.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion cognee/__init__.py
@@ -4,7 +4,7 @@
from .api.v1.datasets.datasets import datasets
from .api.v1.prune import prune
from .api.v1.search import SearchType, get_search_history, search
-from .api.v1.visualize import visualize
+from .api.v1.visualize import visualize_graph
from .shared.utils import create_cognee_style_network_with_logo

# Pipelines
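
Given the rename from visualize to visualize_graph, a minimal calling sketch; the async signature is assumed from the await in server.py:

```python
import asyncio

import cognee


async def main() -> None:
    # visualize_graph is awaited in cognee-mcp/cognee_mcp/server.py,
    # so an async API is assumed here.
    await cognee.visualize_graph()


asyncio.run(main())
```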
67 changes: 39 additions & 28 deletions cognee/shared/utils.py
@@ -13,7 +13,7 @@
import tiktoken
import nltk
⚠️ Potential issue (Contributor review comment)

Remove the duplicate nltk import. The nltk module is imported twice; remove the duplicate import on line 26 to fix the F811 error.

 import nltk
 import base64
 import time
 ...
-import nltk
 from cognee.shared.exceptions import IngestionError

Also applies to: 26-26

import base64

import time

import logging
import sys
@@ -30,6 +30,34 @@
proxy_url = "https://test.prometh.ai"



def get_entities(tagged_tokens):
    nltk.download("maxent_ne_chunker", quiet=True)
    from nltk.chunk import ne_chunk

    return ne_chunk(tagged_tokens)
🛠️ Refactor suggestion (Contributor, on lines +31 to +35)

Add a docstring and error handling for get_entities(). The function lacks a docstring explaining its purpose and parameters, and NLTK resource downloads should be handled with error checking.

 def get_entities(tagged_tokens):
+    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
+
+    Args:
+        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
+
+    Returns:
+        A tree containing chunks of named entities
+    """
+    try:
         nltk.download("maxent_ne_chunker", quiet=True)
         from nltk.chunk import ne_chunk
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise

     return ne_chunk(tagged_tokens)


def extract_pos_tags(sentence):
    """Extract Part-of-Speech (POS) tags for words in a sentence."""

    # Ensure that the necessary NLTK resources are downloaded
    nltk.download("words", quiet=True)
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)

    from nltk.tag import pos_tag
    from nltk.tokenize import word_tokenize

    # Tokenize the sentence into words
    tokens = word_tokenize(sentence)

    # Tag each word with its corresponding POS tag
    pos_tags = pos_tag(tokens)

    return pos_tags

🛠️ Refactor suggestion (Contributor, on lines +38 to +56)

Add input validation and improve error handling for extract_pos_tags(). The function should validate input and handle NLTK resource downloads more robustly.

 def extract_pos_tags(sentence):
     """Extract Part-of-Speech (POS) tags for words in a sentence.
+
+    Args:
+        sentence (str): Input sentence to be POS tagged
+
+    Returns:
+        list: A list of tuples containing (word, POS_tag)
+
+    Raises:
+        ValueError: If sentence is not a string or is empty
+        Exception: If NLTK resource download fails
+    """
+    if not isinstance(sentence, str) or not sentence.strip():
+        raise ValueError("Input must be a non-empty string")

+    try:
         # Ensure that the necessary NLTK resources are downloaded
         nltk.download("words", quiet=True)
         nltk.download("punkt", quiet=True)
         nltk.download("averaged_perceptron_tagger", quiet=True)
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise

     from nltk.tag import pos_tag
     from nltk.tokenize import word_tokenize

     # Tokenize the sentence into words
     tokens = word_tokenize(sentence)

     # Tag each word with its corresponding POS tag
     pos_tags = pos_tag(tokens)

     return pos_tags
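
For context, a hypothetical end-to-end use of the two helpers above; the sample sentence and printed output are illustrative only:

```python
# Tokenize and POS-tag a sentence, then chunk named entities from the tags.
tags = extract_pos_tags("Cognee builds knowledge graphs for Anthropic.")
entities = get_entities(tags)
print(entities)  # an nltk Tree; named entities appear as labeled chunks
```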


def get_anonymous_id():
"""Creates or reads a anonymous user id"""
home_dir = str(pathlib.Path(pathlib.Path(__file__).parent.parent.parent.resolve()))
@@ -243,31 +271,9 @@ async def render_graph(
# return df.replace([np.inf, -np.inf, np.nan], None)


def get_entities(tagged_tokens):
    nltk.download("maxent_ne_chunker", quiet=True)
    from nltk.chunk import ne_chunk

    return ne_chunk(tagged_tokens)


def extract_pos_tags(sentence):
    """Extract Part-of-Speech (POS) tags for words in a sentence."""

    # Ensure that the necessary NLTK resources are downloaded
    nltk.download("words", quiet=True)
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)

    from nltk.tag import pos_tag
    from nltk.tokenize import word_tokenize

    # Tokenize the sentence into words
    tokens = word_tokenize(sentence)

    # Tag each word with its corresponding POS tag
    pos_tags = pos_tag(tokens)

    return pos_tags


logging.basicConfig(level=logging.INFO)
@@ -396,6 +402,7 @@ async def create_cognee_style_network_with_logo(

from bokeh.embed import file_html
from bokeh.resources import CDN
from bokeh.io import export_png

logging.info("Converting graph to serializable format...")
G = await convert_to_serializable_graph(G)
@@ -443,15 +450,19 @@ async def create_cognee_style_network_with_logo(
)
p.add_tools(hover_tool)

# Get the latest Unix timestamp as an integer
timestamp = int(time.time())

# Construct your filename
filename = f"{timestamp}.png"



⚠️ Potential issue (Contributor review comment)

Add error handling and use the filename variable for the PNG export. The PNG export functionality lacks error handling and cleanup of old files, and the filename variable is assigned but never used (Ruff F841).

     # Get the latest Unix timestamp as an integer
     timestamp = int(time.time())

     # Construct your filename
     filename = f"{timestamp}.png"

+    try:
+        # Cleanup old PNG files to prevent disk space issues
+        cleanup_old_pngs(directory=".", keep_latest=5)
+
+        # Export the new PNG
+        export_png(p, filename=filename)
+    except Exception as e:
+        logging.error(f"Failed to export PNG: {str(e)}")
+        raise

Consider adding a helper function to cleanup old PNG files:

def cleanup_old_pngs(directory: str, keep_latest: int = 5):
    """Cleanup old PNG files, keeping only the N latest files."""
    png_files = [f for f in os.listdir(directory) if f.endswith('.png')]
    if len(png_files) <= keep_latest:
        return

    # Sort by timestamp in filename
    sorted_files = sorted(png_files, key=lambda x: int(x.replace(".png", "")))

    # Remove older files
    for f in sorted_files[:-keep_latest]:
        try:
            os.remove(os.path.join(directory, f))
        except OSError as e:
            logging.warning(f"Failed to remove old PNG file {f}: {str(e)}")

logging.info(f"Saving visualization to {output_filename}...")
html_content = file_html(p, CDN, title)
with open(output_filename, "w") as f:
f.write(html_content)

logging.info("Visualization complete.")

if bokeh_object:
return p
return html_content


@@ -512,7 +523,7 @@ def setup_logging(log_level=logging.INFO):
G,
output_filename="example_network.html",
title="Example Cognee Network",
node_attribute="group", # Attribute to use for coloring nodes
label="group", # Attribute to use for coloring nodes
layout_func=nx.spring_layout, # Layout function
layout_scale=3.0, # Scale for the layout
logo_alpha=0.2,