feat: Add data visualization for Anthropic #432
Conversation
Walkthrough

The pull request introduces enhanced image processing and visualization capabilities for the Cognee knowledge graph system. The changes span several files, primarily `cognee-mcp/cognee_mcp/server.py` and `cognee/shared/utils.py`.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- cognee-mcp/cognee_mcp/server.py (3 hunks)
- cognee/shared/utils.py (3 hunks)
🧰 Additional context used
🪛 GitHub Actions: ruff format
cognee-mcp/cognee_mcp/server.py
[warning] File requires formatting. Code does not conform to Ruff formatting standards.
🔇 Additional comments (1)
cognee-mcp/cognee_mcp/server.py (1)
92-101: LGTM: Tool definition looks good

The "visualize" tool is properly defined with a clear description and appropriate schema.
cognee-mcp/cognee_mcp/server.py
Outdated
```python
from PIL import Image
from PIL import Image as PILImage
```
🛠️ Refactor suggestion
Remove duplicate PIL Image import
The Image module from PIL is imported twice with different aliases. This is redundant and could cause confusion.
```diff
-from PIL import Image
-from PIL import Image as PILImage
+from PIL import Image
```
cognee/shared/utils.py
Outdated
```python
# Get the latest Unix timestamp as an integer
timestamp = int(time.time())

# Construct your filename
filename = f"{timestamp}.png"

export_png(p, filename=filename)
```
🛠️ Refactor suggestion
Add error handling for PNG export
The PNG export functionality lacks error handling and cleanup of old files.
```diff
 # Get the latest Unix timestamp as an integer
 timestamp = int(time.time())

 # Construct your filename
 filename = f"{timestamp}.png"

-export_png(p, filename=filename)
+try:
+    # Cleanup old PNG files to prevent disk space issues
+    cleanup_old_pngs(directory=".", keep_latest=5)
+
+    # Export the new PNG
+    export_png(p, filename=filename)
+except Exception as e:
+    logging.error(f"Failed to export PNG: {str(e)}")
+    raise
```

Consider adding a helper function to clean up old PNG files:
```python
def cleanup_old_pngs(directory: str, keep_latest: int = 5):
    """Cleanup old PNG files, keeping only the N latest files."""
    png_files = [f for f in os.listdir(directory) if f.endswith('.png')]
    if len(png_files) <= keep_latest:
        return

    # Sort by timestamp in filename
    sorted_files = sorted(png_files, key=lambda x: int(x.replace(".png", "")))

    # Remove older files
    for f in sorted_files[:-keep_latest]:
        try:
            os.remove(os.path.join(directory, f))
        except OSError as e:
            logging.warning(f"Failed to remove old PNG file {f}: {str(e)}")
```
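A quick sanity check of the sorting logic above (the filenames here are made-up examples of the `<unix_timestamp>.png` convention, not real output files):

```python
# Hypothetical filenames following the '<unix_timestamp>.png' convention.
png_files = ["1700000300.png", "1700000100.png", "1700000200.png"]

# Same sort key as cleanup_old_pngs: parse the numeric timestamp from the name.
sorted_files = sorted(png_files, key=lambda x: int(x.replace(".png", "")))

keep_latest = 2
to_delete = sorted_files[:-keep_latest]  # oldest files, removed first
to_keep = sorted_files[-keep_latest:]    # newest files survive

print(to_delete)  # ['1700000100.png']
print(to_keep)    # ['1700000200.png', '1700000300.png']
```

Because the sort is ascending by timestamp, slicing off everything but the tail always preserves the newest files.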
Actionable comments posted: 3
🔭 Outside diff range comments (1)
cognee-mcp/cognee_mcp/server.py (1)
Fix import statements organization
The file has the following formatting issues:
- Remove the duplicate PIL import: `from PIL import Image as PILImage` is redundant since `Image` is already imported.
- Organize imports according to the PEP 8 style guide:
- Standard library imports
- Third-party imports
- Local application imports
🔗 Analysis chain
Line range hint 1-230: Fix formatting issues

The file requires formatting according to Ruff's standards. Please run the formatter:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
ruff format cognee-mcp/cognee_mcp/server.py
```

Length of output: 140
Script:

```bash
#!/bin/bash
# Check for basic formatting patterns
echo "=== Checking import statements ==="
rg "^import|^from.*import" cognee-mcp/cognee_mcp/server.py

echo -e "\n=== Checking blank lines between functions ==="
rg "^def|^async def" -A 1 cognee-mcp/cognee_mcp/server.py

echo -e "\n=== Checking indentation ==="
rg "^[[:space:]]+[^[:space:]]" cognee-mcp/cognee_mcp/server.py
```

Length of output: 8158
🧹 Nitpick comments (1)
cognee-mcp/cognee_mcp/server.py (1)
105-130: Add docstring to document timestamp format requirement

The function assumes PNG files are named with timestamps, but this requirement isn't documented.

```diff
 def get_freshest_png(directory: str) -> Image.Image:
+    """
+    Retrieve the most recently created PNG file from a directory.
+
+    Args:
+        directory (str): Path to directory containing PNG files
+
+    Returns:
+        Image.Image: PIL Image object of the most recent PNG
+
+    Raises:
+        FileNotFoundError: If directory doesn't exist or no PNG files found
+        ValueError: If PNG filenames don't follow timestamp format (e.g., '1673185134.png')
+        IOError: If PNG file can't be opened
+
+    Note:
+        PNG files must be named with Unix timestamps (e.g., '1673185134.png')
+    """
     if not os.path.exists(directory):
```
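For reference, a minimal stdlib-only sketch of the freshest-file selection described in that docstring (the helper name `get_freshest_png_path` is illustrative, not the actual server code; opening the result with `PIL.Image.open` is left to the caller):

```python
import os

def get_freshest_png_path(directory: str) -> str:
    """Return the path of the PNG whose timestamp-style name is largest.

    Assumes files are named '<unix_timestamp>.png', as the review notes.
    """
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"Directory not found: {directory}")

    pngs = [f for f in os.listdir(directory) if f.endswith(".png")]
    if not pngs:
        raise FileNotFoundError(f"No PNG files in {directory}")

    try:
        # Strip the '.png' suffix and compare the numeric timestamps.
        freshest = max(pngs, key=lambda f: int(f[: -len(".png")]))
    except ValueError as exc:
        raise ValueError("PNG filenames must be Unix timestamps") from exc

    return os.path.join(directory, freshest)
```

Raising `ValueError` on non-numeric names matches the documented contract instead of silently mis-sorting.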
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- cognee-mcp/cognee_mcp/server.py (3 hunks)
🧰 Additional context used
🪛 GitHub Actions: ruff format
cognee-mcp/cognee_mcp/server.py
[warning] File requires formatting. Code does not conform to Ruff's formatting standards.
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: docker-compose-test
🔇 Additional comments (1)
cognee-mcp/cognee_mcp/server.py (1)
13-14: Remove duplicate PIL Image import

The `Image` module from PIL is imported twice with different aliases. This is redundant and could cause confusion.

```diff
-from PIL import Image
-from PIL import Image as PILImage
+from PIL import Image
```
cognee-mcp/cognee_mcp/server.py
Outdated
```python
with redirect_stdout(fnull), redirect_stderr(fnull):
    try:
        await cognee.visualize
        img = get_freshest_png(".")
```
🛠️ Refactor suggestion
Use configured output directory instead of current directory
Using "." as the directory path is brittle and depends on the current working directory when the server starts. Consider using a configured output directory.
```diff
-img = get_freshest_png(".")
+img = get_freshest_png(cognee.config.visualization_output_dir)
```

You'll need to ensure the output directory is properly configured in the Cognee configuration.
Committable suggestion skipped: line range outside the PR's diff.
```python
types.Tool(
    name="visualize",
    description="Visualize the knowledge graph.",
    inputSchema={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
        },
    },
),
```
🛠️ Refactor suggestion
Clarify query parameter requirement in visualization tool schema
The tool's schema defines a "query" property but doesn't mark it as required. Either:
- Mark it as required if visualization needs a query parameter
- Document that it's optional if visualization can work without a query
```diff
 types.Tool(
     name="visualize",
     description="Visualize the knowledge graph.",
     inputSchema={
         "type": "object",
         "properties": {
             "query": {"type": "string"},
         },
+        "required": ["query"],  # Add this if query is required
     },
 ),
```
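To see what marking `query` as required buys you, here is a minimal required-field check, a stand-in for the full JSON Schema validation an MCP client would typically apply:

```python
def check_required(schema: dict, payload: dict) -> list:
    """Return the names of required properties missing from the payload."""
    return [key for key in schema.get("required", []) if key not in payload]

# The tool schema from the suggestion above, with "required" added.
schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}

print(check_required(schema, {}))                  # ['query']
print(check_required(schema, {"query": "graph"}))  # []
```

Without the `"required"` key, both calls would return an empty list and a missing `query` would pass validation silently.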
This is a good suggestion from coderabbit
@borisarzentar Thank you for confirming! I'm glad the suggestion was helpful.
cognee-mcp/cognee_mcp/server.py
Outdated
```python
from mcp.server import NotificationOptions, Server
from mcp.server.models import InitializationOptions
from PIL import Image
from PIL import Image as PILImage
```
Double import here
```python
types.Tool(
    name="visualize",
    description="Visualize the knowledge graph.",
    inputSchema={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
        },
    },
),
```
This is a good suggestion from coderabbit
cognee-mcp/cognee_mcp/server.py
Outdated
```python
with open(os.devnull, "w") as fnull:
    with redirect_stdout(fnull), redirect_stderr(fnull):
        try:
            await cognee.visualize()
```
Should `visualize` receive a parameter specifying where to put the image it generates? Then we can pass that path to `get_freshest_png` to retrieve the image.
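One way that suggestion could be wired up, sketched with a stub in place of the real visualizer (the `output_dir` parameter and both helper names are assumptions about a possible API, not the current cognee code):

```python
import asyncio
import os
import tempfile
import time

async def visualize(output_dir: str) -> None:
    """Stand-in for a hypothetical cognee.visualize(output_dir=...):
    writes a timestamp-named PNG into the given directory."""
    path = os.path.join(output_dir, f"{int(time.time())}.png")
    with open(path, "wb") as fh:
        fh.write(b"\x89PNG\r\n\x1a\n")  # placeholder bytes, not a real image

def freshest_png_path(output_dir: str) -> str:
    """Pick the newest timestamp-named PNG from the same directory."""
    pngs = [f for f in os.listdir(output_dir) if f.endswith(".png")]
    return os.path.join(output_dir, max(pngs, key=lambda f: int(f[:-4])))

async def main() -> str:
    out = tempfile.mkdtemp()
    await visualize(out)  # generator and reader share one explicit directory
    return freshest_png_path(out)

print(asyncio.run(main()))
```

The point is that the writer and the reader agree on one explicit directory, instead of both implicitly depending on the server's current working directory.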
Actionable comments posted: 1
🧹 Nitpick comments (4)
cognee-mcp/README.md (2)
88-91: Minor grammar fix needed in documentation

Add "the" before "debugger" for better readability.

```diff
-To use debugger, run:
+To use the debugger, run:
```

🧰 Tools
🪛 LanguageTool
[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...y Restart your Claude desktop. To use debugger, run: ```bash npx @modelcontextprotocol...(AI_EN_LECTOR_MISSING_DETERMINER_THE)
93-94: Add command description for clarity

The reset installation command would benefit from a brief description of when to use it, e.g.:

To reset the installation (useful when dependencies are corrupted):

```bash
uv sync --dev --all-extras --reinstall
```

cognee-mcp/cognee_mcp/server.py (1)
6-7: Remove unused logging import

The `logging` import from `sqlalchemy.testing.plugin.plugin_base` is not used in the code.

```diff
-from sqlalchemy.testing.plugin.plugin_base import logging
```

cognee/shared/utils.py (1)
Line range hint 377-494: Consider removing duplicate logo embedding

The logo is embedded twice in the visualization, at both bottom-right and top-left positions. This might be unnecessary and could affect performance. Consider keeping only one logo instance unless there's a specific requirement for two logos.

```diff
 logging.info("Embedding logo into visualization...")
 embed_logo(p, layout_scale, logo_alpha, "bottom_right")
-embed_logo(p, layout_scale, logo_alpha, "top_left")
```

🧰 Tools
🪛 Ruff (0.8.2)

429-429: Local variable `filename` is assigned to but never used. Remove assignment to unused variable `filename` (F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- cognee-mcp/uv.lock is excluded by !**/*.lock
- poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
- cognee-mcp/README.md (1 hunks)
- cognee-mcp/cognee_mcp/server.py (3 hunks)
- cognee-mcp/pyproject.toml (4 hunks)
- cognee/__init__.py (1 hunks)
- cognee/shared/utils.py (4 hunks)
- pyproject.toml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
cognee-mcp/README.md
[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...y Restart your Claude desktop. To use debugger, run: ```bash npx @modelcontextprotocol...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
🪛 Ruff (0.8.2)
cognee/shared/utils.py
429-429: Local variable filename is assigned to but never used
Remove assignment to unused variable filename
(F841)
🔇 Additional comments (6)
cognee/__init__.py (1)
7-7: LGTM! Verify function usage across codebase

The renamed import to `visualize_graph` is more descriptive of its purpose. Let's verify all usages of the old and new function names:
✅ Verification successful
✅ Function rename verification complete

The `visualize` to `visualize_graph` rename is consistently applied across the codebase. The new function is properly used in `cognee-mcp/cognee_mcp/server.py` and no instances of the old function name remain.

🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
#!/bin/bash
# Search for any remaining usages of old 'visualize' function
rg -l "\.visualize\("

# Search for usages of new 'visualize_graph' function
rg -l "\.visualize_graph\("
```

Length of output: 85
Script:

```bash
#!/bin/bash
# Check the implementation in server.py
rg "visualize_graph" cognee-mcp/cognee_mcp/server.py -A 2 -B 2
```

Length of output: 273
cognee-mcp/pyproject.toml (2)
94-94: LGTM! Script entry point is properly configured

The cognee script entry point is correctly defined.
6-6: Verify compatibility with major version updates

Several dependencies have significant version jumps:
- pandas (2.0.3 → 2.2.3)
- instructor (1.5.2 → 1.7.2)
- litellm (1.49.1 → 1.57.2)
Let's check for any breaking changes in the major updates:
Also applies to: 9-9, 24-25, 28-28, 35-35, 47-48, 54-54, 59-59
✅ Verification successful
Dependencies are compatible with current usage
The codebase uses basic, stable features from all updated libraries:
- pandas: Simple DataFrame creation and iteration
- instructor: Basic litellm integration
- litellm: Standard completion API calls
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
#!/bin/bash
# Search for pandas DataFrame operations that might be affected
rg "pd\.DataFrame" -A 3

# Search for instructor decorators and usage
rg "@instruction" -A 3

# Search for litellm configurations
rg "litellm\.completion" -A 3
```

Length of output: 1927
pyproject.toml (1)
66-66: LGTM! Version alignment with cognee-mcp

The fastapi-users version is properly pinned and aligned with cognee-mcp's requirements.
cognee-mcp/cognee_mcp/server.py (1)
94-103: Clarify query parameter requirement in visualization tool schema

The tool's schema defines a "query" property but doesn't mark it as required. Either mark it as required if visualization needs a query parameter, or document that it's optional.
cognee/shared/utils.py (1)
425-432: 🛠️ Refactor suggestion: Remove unused variable and add PNG export

The `filename` variable is created but never used. Additionally, the PNG export functionality is incomplete.

```diff
 # Get the latest Unix timestamp as an integer
 timestamp = int(time.time())

 # Construct your filename
 filename = f"{timestamp}.png"
+
+# Export the visualization as PNG
+try:
+    export_png(p, filename=filename)
+except Exception as e:
+    logging.error(f"Failed to export PNG: {str(e)}")
+    raise
```

Likely invalid or redundant comment.
```python
elif name == "visualize":
    with open(os.devnull, "w") as fnull:
        with redirect_stdout(fnull), redirect_stderr(fnull):
            try:
                result = await cognee.visualize_graph()
                results = retrieved_edges_to_string(result)

                return [
                    types.TextContent(
                        type="text",
                        text=results,
                    )
                ]
            except (FileNotFoundError, IOError, ValueError) as e:
                raise ValueError(f"Failed to create visualization: {str(e)}")
```
🛠️ Refactor suggestion
Return visualization output instead of text content.
The visualization tool returns text content (types.TextContent) despite being described as a tool to "Visualize the knowledge graph." Consider returning the visualization as an image using types.ImageContent or as an embedded resource using types.EmbeddedResource.
```diff
 return [
-    types.TextContent(
-        type="text",
-        text=results,
-    )
+    types.ImageContent(
+        type="image",
+        data=await cognee.visualize_graph(),
+        format="png"
+    )
 ]
```

Committable suggestion skipped: line range outside the PR's diff.
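Note that the `ImageContent` route implies sending bytes, not a PIL object: image parts are usually transported as base64 text. A minimal sketch of that encoding step (the `data`/`mimeType` field names mirror common MCP usage but are assumptions here, and the bytes below are placeholders, not a real render):

```python
import base64

def png_bytes_to_payload(png_bytes: bytes) -> dict:
    """Package raw PNG bytes the way an image content part
    typically expects them: base64 text plus a MIME type."""
    return {
        "type": "image",
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "mimeType": "image/png",
    }

# Placeholder bytes standing in for a rendered graph PNG.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
payload = png_bytes_to_payload(fake_png)

# The payload round-trips back to the original bytes.
assert base64.b64decode(payload["data"]) == fake_png
```

Whatever field names the actual type uses, the key point is that the server should read the exported file's bytes and encode them, rather than returning the rendered edges as text.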
Actionable comments posted: 2
🔭 Outside diff range comments (1)
cognee/shared/utils.py (1)
Line range hint 1-526: Format code using Ruff

The pipeline indicates that the file requires formatting using the Ruff formatter.
Run the following command to format the code:
```bash
ruff format cognee/shared/utils.py
```
🪛 GitHub Actions: ruff format
[warning] File requires formatting using Ruff formatter
♻️ Duplicate comments (1)
cognee/shared/utils.py (1)
453-460: ⚠️ Potential issue: Add error handling for PNG export
The PNG export functionality lacks error handling and cleanup of old files.
A past review has already suggested adding error handling and cleanup functionality for PNG exports. The same issues still apply here.
🧰 Tools

🪛 Ruff (0.8.2): 457-457: Local variable `filename` is assigned to but never used (F841)

🪛 GitHub Actions: ruff lint: [error] 457-457: Local variable `filename` is assigned to but never used

🪛 GitHub Actions: ruff format: [warning] File requires formatting using Ruff formatter
🧹 Nitpick comments (2)
cognee/shared/utils.py (2)
44-47: Optimize NLTK resource downloads

Downloading NLTK resources on every function call is inefficient. Consider moving these downloads to a module initialization or a dedicated setup function.

```diff
+def setup_nltk_resources():
+    """Download required NLTK resources if not already present."""
+    try:
+        nltk.download("words", quiet=True)
+        nltk.download("punkt", quiet=True)
+        nltk.download("averaged_perceptron_tagger", quiet=True)
+        nltk.download("maxent_ne_chunker", quiet=True)
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
+
 def extract_pos_tags(sentence):
     """Extract Part-of-Speech (POS) tags for words in a sentence."""
-    # Ensure that the necessary NLTK resources are downloaded
-    nltk.download("words", quiet=True)
-    nltk.download("punkt", quiet=True)
-    nltk.download("averaged_perceptron_tagger", quiet=True)
```
Line range hint 405-526: Consider splitting visualization code into smaller components

The `create_cognee_style_network_with_logo` function is handling too many responsibilities, making it hard to maintain and test.

Consider:
- Extracting logo embedding into a separate class
- Creating a dedicated NetworkVisualizer class
- Separating layout and styling logic
- Moving PNG export functionality to a dedicated service
This will improve maintainability, testability, and reusability of the visualization components.
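A rough skeleton of that split, with illustrative names only (bodies elided; this is not the actual cognee code):

```python
class LogoEmbedder:
    """Owns logo placement and transparency."""
    def __init__(self, alpha: float = 0.15):
        self.alpha = alpha

    def embed(self, figure, position: str) -> None:
        ...  # overlay the logo at 'position' with self.alpha

class PngExporter:
    """Owns PNG export, error handling, and cleanup of old files."""
    def export(self, figure, filename: str) -> str:
        ...  # wrap export_png(figure, filename=filename) in try/except
        return filename

class NetworkVisualizer:
    """Coordinates layout, styling, branding, and export."""
    def __init__(self, embedder: LogoEmbedder, exporter: PngExporter):
        self.embedder = embedder
        self.exporter = exporter

    def render(self, graph, filename: str) -> str:
        figure = self._layout(graph)
        self.embedder.embed(figure, "bottom_right")
        return self.exporter.export(figure, filename)

    def _layout(self, graph):
        ...  # node positioning and styling, kept separate from export
```

Each collaborator can then be unit-tested with a fake figure, and the export service can own the timestamp naming and cleanup discussed earlier.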
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- cognee/shared/utils.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
cognee/shared/utils.py
457-457: Local variable filename is assigned to but never used
Remove assignment to unused variable filename
(F841)
🪛 GitHub Actions: ruff lint
cognee/shared/utils.py
[error] 457-457: Local variable filename is assigned to but never used
🪛 GitHub Actions: ruff format
cognee/shared/utils.py
[warning] File requires formatting using Ruff formatter
🪛 GitHub Actions: test | weaviate
cognee/shared/utils.py
[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the required package using 'poetry add nltk'.
🪛 GitHub Actions: test | milvus
cognee/shared/utils.py
[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the required package using 'pip install nltk' or add it to poetry dependencies.
🪛 GitHub Actions: test | neo4j
cognee/shared/utils.py
[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the package using poetry or pip.
🪛 GitHub Actions: test | qdrant
cognee/shared/utils.py
[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the nltk package.
🪛 GitHub Actions: test | deduplication
cognee/shared/utils.py
[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the package using poetry add nltk or add it to pyproject.toml.
🪛 GitHub Actions: test | pgvector
cognee/shared/utils.py
[error] 14-14: Missing required dependency: Module 'nltk' not found. Please install the nltk package.
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: profiler
- GitHub Check: docker-compose-test
```python
def get_entities(tagged_tokens):
    nltk.download("maxent_ne_chunker", quiet=True)
    from nltk.chunk import ne_chunk

    return ne_chunk(tagged_tokens)
```
🛠️ Refactor suggestion
Add docstring and error handling for get_entities()
The function lacks a docstring explaining its purpose and parameters. Also, NLTK resource downloads should be handled with error checking.
```diff
 def get_entities(tagged_tokens):
+    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
+
+    Args:
+        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
+
+    Returns:
+        A tree containing chunks of named entities
+    """
+    try:
+        nltk.download("maxent_ne_chunker", quiet=True)
+        from nltk.chunk import ne_chunk
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
     return ne_chunk(tagged_tokens)
```
Actionable comments posted: 1
🧹 Nitpick comments (3)
cognee-mcp/README.md (3)
40-42: Add language specifications to code blocks

The code blocks should specify their language for proper syntax highlighting. Apply these changes:

```bash
cd ~/Library/Application\ Support/Claude/
```

```bash
nano claude_desktop_config.json
```

Also applies to: 46-48

44-45: Improve configuration file instructions

The instructions about adding paths and the API key could be more specific and include a warning about securing sensitive information. Consider adding a note about keeping the API key secure, examples of correct path formatting, and a warning about not committing the config file with API keys.

101-105: Enhance development workflow documentation

The development workflow steps could be more detailed to prevent confusion. Consider explaining why each step is necessary, adding expected output or success indicators, and mentioning potential troubleshooting steps.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits: Reviewing files that changed from the base of the PR and between ad07bae9a781c909c67658261d579de5834ea3b1 and 61118dda0353158ed949d04acf95867d30011d47.

📒 Files selected for processing (1)

- cognee-mcp/README.md (2 hunks)

🧰 Additional context used

🪛 Markdownlint (0.37.0): cognee-mcp/README.md, lines 40 and 46: Fenced code blocks should have a language specified (MD040, fenced-code-language)
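For reference, a Claude Desktop MCP entry for this server typically takes the shape below. The paths and the environment variable name are placeholders, and real API keys should never be committed with the config:

```json
{
  "mcpServers": {
    "cognee": {
      "command": "uv",
      "args": ["--directory", "/Users/{username}/cognee/cognee-mcp", "run", "cognee"],
      "env": { "LLM_API_KEY": "<your-api-key>" }
    }
  }
}
```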
To use debugger, run:

```bash
npx @modelcontextprotocol/inspector uv --directory /Users/name/folder run cognee
```
Replace hardcoded paths with placeholders.
The instructions contain specific user paths that won't work for other users.
Apply these changes:
```diff
-npx @modelcontextprotocol/inspector uv --directory /Users/name/folder run cognee
+npx @modelcontextprotocol/inspector uv --directory /Users/{username}/path/to/folder run cognee
```

```diff
-npx @modelcontextprotocol/inspector uv --directory /Users/vasilije/cognee/cognee-mcp run cognee
+npx @modelcontextprotocol/inspector uv --directory /Users/{username}/cognee/cognee-mcp run cognee
```

Also applies to: 105-105
Actionable comments posted: 2
🧹 Nitpick comments (2)
.github/workflows/dockerhub.yml (2)
44-45: Consider using dynamic cache tags

The current cache configuration uses a static `buildcache` tag, which could lead to race conditions in parallel builds. Consider using a branch-specific cache tag:

```diff
-cache-from: type=registry,ref=cognee/cognee:buildcache
-cache-to: type=registry,ref=cognee/cognee:buildcache,mode=max
+cache-from: type=registry,ref=cognee/cognee:buildcache-${{ github.ref_name }}
+cache-to: type=registry,ref=cognee/cognee:buildcache-${{ github.ref_name }},mode=max
```
26-35: Enhance image metadata

Consider adding more metadata to improve image discoverability and traceability:

```diff
 - name: Extract metadata
   id: meta
   uses: docker/metadata-action@v5
   with:
     images: cognee/cognee
     tags: |
       type=ref,event=branch
       type=sha,prefix={{branch}}-
       type=raw,value=latest,enable={{is_default_branch}}
+    labels: |
+      org.opencontainers.image.title=Cognee
+      org.opencontainers.image.description=Knowledge graph system with visualization capabilities
+      maintainer=Cognee Team
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- .github/workflows/dockerhub.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/dockerhub.yml
48-48: property "build" is not defined in object type {meta: {conclusion: string; outcome: string; outputs: {annotations: string; bake-file: string; bake-file-annotations: string; bake-file-labels: string; bake-file-tags: string; json: string; labels: string; tags: string; version: string}}}
(expression)
🪛 yamllint (1.35.1)
.github/workflows/dockerhub.yml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: run_simple_example_test / test
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: profiler
- GitHub Check: docker-compose-test
🔇 Additional comments (1)
.github/workflows/dockerhub.yml (1)
1-8: LGTM! Good addition of the dev branch. The workflow trigger configuration is well-structured, allowing for testing in the dev branch before changes reach main.
```yaml
      - name: Image digest
        run: echo ${{ steps.build.outputs.digest }}
```
Fix the image digest step.
There are two issues to address:
- The digest step references an undefined step ID 'build'
- The file is missing a newline at the end
Apply this diff to fix both issues:
```diff
       - name: Image digest
-        run: echo ${{ steps.build.outputs.digest }}
+        run: echo ${{ steps.meta.outputs.json }}
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
       - name: Image digest
-        run: echo ${{ steps.build.outputs.digest }}
+        run: echo ${{ steps.meta.outputs.json }}
```
🧰 Tools
🪛 actionlint (1.7.4)
48-48: property "build" is not defined in object type {meta: {conclusion: string; outcome: string; outputs: {annotations: string; bake-file: string; bake-file-annotations: string; bake-file-labels: string; bake-file-tags: string; json: string; labels: string; tags: string; version: string}}}
(expression)
🪛 yamllint (1.35.1)
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
```yaml
  docker-build-and-push:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
```
🛠️ Refactor suggestion
Add explicit permissions configuration.
Consider adding explicit permissions to follow the principle of least privilege:
```diff
   docker-build-and-push:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
```
📝 Committable suggestion
```yaml
jobs:
  docker-build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
```
Actionable comments posted: 1
🧹 Nitpick comments (2)
.github/workflows/approve_dco.yaml (2)
15-23: Consider making the DCO validation more robust. The current exact text matching might be too strict. Consider:
- Case-insensitive matching
- Normalizing whitespace
- Supporting multiple statement formats
Here's a suggested improvement:
```diff
-      const requiredStatement = "I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin";
-
-      if (!prBody.includes(requiredStatement)) {
+      const requiredStatement = "I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin";
+      const normalizedPrBody = prBody.toLowerCase().replace(/\s+/g, ' ').trim();
+      const normalizedStatement = requiredStatement.toLowerCase().replace(/\s+/g, ' ').trim();
+
+      if (!normalizedPrBody.includes(normalizedStatement)) {
```
🪛 yamllint (1.35.1)
[error] 18-18: trailing spaces
(trailing-spaces)
18-18: Remove trailing whitespace. There is trailing whitespace on line 18 that should be removed.

```diff
-    
+
```
🪛 yamllint (1.35.1)
[error] 18-18: trailing spaces
(trailing-spaces)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- .github/pull_request_template.md (1 hunks)
- .github/workflows/approve_dco.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- .github/pull_request_template.md
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/approve_dco.yaml
12-12: the runner of "actions/github-script@v6" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 yamllint (1.35.1)
.github/workflows/approve_dco.yaml
[error] 18-18: trailing spaces
(trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms (14)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: profiler
- GitHub Check: docker-compose-test
🔇 Additional comments (1)
.github/workflows/approve_dco.yaml (1)
1-5: LGTM! Well-defined workflow triggers. The workflow name is descriptive and the trigger configuration comprehensively covers all relevant pull request events.
```yaml
  check-dco:
    runs-on: ubuntu-latest
    steps:
      - name: Validate Developer Certificate of Origin statement
        uses: actions/github-script@v6
        with:
```
💡 Codebase verification
Update actions/github-script from v6 to v7
The latest version is v7.0.1, which includes improvements and bug fixes. Since this is a major version update, review the changelog for any breaking changes before updating.
🔗 Analysis chain
Update the GitHub Script action version.
The current version of actions/github-script@v6 might be outdated. Consider updating to the latest version to ensure compatibility and security.
Let's verify the latest version and any security advisories:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check the latest version and security advisories for actions/github-script
gh api repos/actions/github-script/releases/latest --jq .tag_name
# Check for any security advisories
gh api graphql -f query='
{
securityVulnerabilities(first: 5, ecosystem: ACTIONS, package: "github-script") {
nodes {
advisory {
summary
severity
publishedAt
}
vulnerableVersionRange
firstPatchedVersion {
identifier
}
}
}
}'
Length of output: 425
🧰 Tools
🪛 actionlint (1.7.4)
12-12: the runner of "actions/github-script@v6" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
Actionable comments posted: 4
♻️ Duplicate comments (1)
.github/workflows/approve_dco.yaml (1)
12-12: ⚠️ Potential issue: Update actions/github-script from v6 to v7.
The current version is outdated and should be updated to the latest v7 for compatibility and security improvements.
```diff
-        uses: actions/github-script@v6
+        uses: actions/github-script@v7
```
🪛 actionlint (1.7.4)
12-12: the runner of "actions/github-script@v6" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🧹 Nitpick comments (1)
.github/workflows/approve_dco.yaml (1)
37-44: Enhance error handling with specific error messages. The error handling could be more informative by providing specific error messages for different error scenarios.
```diff
 } catch (error) {
   // If we get a 404, user is NOT an org member
   if (error.status === 404) {
     console.log(`${prUser} is NOT a member of ${orgName}. Enforcing DCO check.`);
   } else {
     // Some other error—fail the workflow or handle accordingly
-    core.setFailed(`Error checking organization membership: ${error.message}`);
+    core.setFailed(
+      `Failed to verify organization membership: ${error.status === 403 ?
+        'Insufficient permissions. Ensure GITHUB_TOKEN has read:org scope.' :
+        `Unexpected error (${error.status}): ${error.message}`}`
+    );
   }
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
- .github/workflows/approve_dco.yaml (1 hunks)
- cognee/shared/utils.py (5 hunks)
- pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/approve_dco.yaml
12-12: the runner of "actions/github-script@v6" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 Ruff (0.8.2)
cognee/shared/utils.py
26-26: Redefinition of unused nltk from line 14
Remove definition: nltk
(F811)
457-457: Local variable filename is assigned to but never used
Remove assignment to unused variable filename
(F841)
🪛 GitHub Actions: ruff format
cognee/shared/utils.py
[warning] File needs formatting according to Ruff standards
🪛 GitHub Actions: ruff lint
cognee/shared/utils.py
[error] 26-26: Redefinition of unused 'nltk' from line 14
[error] 457-457: Local variable 'filename' is assigned to but never used
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: docker-compose-test
🔇 Additional comments (3)
.github/workflows/approve_dco.yaml (1)
1-6: LGTM! Comprehensive PR trigger events. The workflow configuration is well-structured with appropriate trigger events for PR-related actions.
cognee/shared/utils.py (2)
Line range hint 60-550: LGTM! Well-structured utility functions. The remaining utility functions are well-implemented with proper error handling and documentation.
🧰 Tools
🪛 Ruff (0.8.2)
457-457: Local variable filename is assigned to but never used. Remove assignment to unused variable filename. (F841)
🪛 GitHub Actions: ruff format
[warning] File needs formatting according to Ruff standards
🪛 GitHub Actions: ruff lint
[error] 457-457: Local variable 'filename' is assigned to but never used
34-38: 🛠️ Refactor suggestion: Add error handling and docstring for get_entities()
The function lacks a docstring and proper error handling for NLTK resource downloads.
```diff
 def get_entities(tagged_tokens):
+    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
+
+    Args:
+        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
+
+    Returns:
+        A tree containing chunks of named entities
+
+    Raises:
+        Exception: If NLTK resource download fails
+    """
+    try:
         nltk.download("maxent_ne_chunker", quiet=True)
         from nltk.chunk import ne_chunk
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
     return ne_chunk(tagged_tokens)
```

Likely invalid or redundant comment.
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File needs formatting according to Ruff standards
```yaml
          # In GitHub Enterprise or private orgs, you might need a PAT (personal access token) with read:org scope.
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            const orgName = 'YOUR_ORGANIZATION_NAME'; // Replace with your org
```
Replace placeholder organization name.
The organization name is currently set to a placeholder value 'YOUR_ORGANIZATION_NAME'. This needs to be replaced with the actual organization name 'topoteretes' as indicated in the PR objectives.
```diff
-            const orgName = 'YOUR_ORGANIZATION_NAME'; // Replace with your org
+            const orgName = 'topoteretes';
```
```javascript
const orgName = 'topoteretes';
```
cognee/shared/utils.py
Outdated
```python
import pandas as pd
import matplotlib.pyplot as plt
import tiktoken
import nltk
```
Remove duplicate nltk import
The nltk module is imported twice. Remove the duplicate import on line 26 to fix the F811 error.
```diff
 import nltk
 import base64
 import time
 ...
-import nltk
 from cognee.shared.exceptions import IngestionError
```

Also applies to: 26-26
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File needs formatting according to Ruff standards
cognee/shared/utils.py
Outdated
```python
# Get the latest Unix timestamp as an integer
timestamp = int(time.time())

# Construct your filename
filename = f"{timestamp}.png"
```
Add error handling and use the filename variable for PNG export
The PNG export functionality lacks error handling and cleanup of old files. Also, the filename variable is created but never used.
```diff
 # Get the latest Unix timestamp as an integer
 timestamp = int(time.time())

 # Construct your filename
 filename = f"{timestamp}.png"
+
+try:
+    # Cleanup old PNG files to prevent disk space issues
+    cleanup_old_pngs(directory=".", keep_latest=5)
+
+    # Export the new PNG
+    export_png(p, filename=filename)
+except Exception as e:
+    logging.error(f"Failed to export PNG: {str(e)}")
+    raise
```
Consider adding a helper function to cleanup old PNG files:
```python
import os
import logging


def cleanup_old_pngs(directory: str, keep_latest: int = 5):
    """Cleanup old PNG files, keeping only the N latest files."""
    # Only consider files whose names are bare timestamps, e.g. "1700000000.png",
    # so that int() below cannot fail on unrelated PNGs
    png_files = [
        f for f in os.listdir(directory)
        if f.endswith(".png") and f[:-4].isdigit()
    ]
    if len(png_files) <= keep_latest:
        return
    # Sort by timestamp in filename (oldest first)
    sorted_files = sorted(png_files, key=lambda x: int(x.replace(".png", "")))
    # Remove older files
    for f in sorted_files[:-keep_latest]:
        try:
            os.remove(os.path.join(directory, f))
        except OSError as e:
            logging.warning(f"Failed to remove old PNG file {f}: {str(e)}")
```
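As a quick sanity check of the pruning idea, the helper can be exercised against a temporary directory. The helper is restated inline below so the snippet runs on its own; note that the helper itself is a suggested sketch, not code from this PR:

```python
import os
import tempfile
import time


def cleanup_old_pngs(directory: str, keep_latest: int = 5) -> None:
    """Keep only the `keep_latest` newest timestamp-named PNGs in `directory`."""
    png_files = [
        f for f in os.listdir(directory)
        if f.endswith(".png") and f[:-4].isdigit()
    ]
    if len(png_files) <= keep_latest:
        return
    for old in sorted(png_files, key=lambda name: int(name[:-4]))[:-keep_latest]:
        os.remove(os.path.join(directory, old))


with tempfile.TemporaryDirectory() as d:
    base = int(time.time())
    # Eight fake timestamped exports plus one unrelated file
    for i in range(8):
        open(os.path.join(d, f"{base + i}.png"), "wb").close()
    open(os.path.join(d, "README.md"), "wb").close()

    cleanup_old_pngs(d, keep_latest=5)

    # Only the five newest timestamps survive; the unrelated file is untouched
    pngs = sorted(f for f in os.listdir(d) if f.endswith(".png"))
```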
🪛 Ruff (0.8.2)
457-457: Local variable filename is assigned to but never used
Remove assignment to unused variable filename
(F841)
🪛 GitHub Actions: ruff format
[warning] File needs formatting according to Ruff standards
🪛 GitHub Actions: ruff lint
[error] 457-457: Local variable 'filename' is assigned to but never used
```python
def extract_pos_tags(sentence):
    """Extract Part-of-Speech (POS) tags for words in a sentence."""

    # Ensure that the necessary NLTK resources are downloaded
    nltk.download("words", quiet=True)
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)

    from nltk.tag import pos_tag
    from nltk.tokenize import word_tokenize

    # Tokenize the sentence into words
    tokens = word_tokenize(sentence)

    # Tag each word with its corresponding POS tag
    pos_tags = pos_tag(tokens)

    return pos_tags
```
🛠️ Refactor suggestion
Add input validation and improve error handling for extract_pos_tags()
The function should validate input and handle NLTK resource downloads more robustly.
```diff
 def extract_pos_tags(sentence):
-    """Extract Part-of-Speech (POS) tags for words in a sentence."""
+    """Extract Part-of-Speech (POS) tags for words in a sentence.
+
+    Args:
+        sentence (str): Input sentence to be POS tagged
+
+    Returns:
+        list: A list of tuples containing (word, POS_tag)
+
+    Raises:
+        ValueError: If sentence is not a string or is empty
+        Exception: If NLTK resource download fails
+    """
+    if not isinstance(sentence, str) or not sentence.strip():
+        raise ValueError("Input must be a non-empty string")
+    try:
         # Ensure that the necessary NLTK resources are downloaded
         nltk.download("words", quiet=True)
         nltk.download("punkt", quiet=True)
         nltk.download("averaged_perceptron_tagger", quiet=True)
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise

     from nltk.tag import pos_tag
     from nltk.tokenize import word_tokenize

     # Tokenize the sentence into words
     tokens = word_tokenize(sentence)

     # Tag each word with its corresponding POS tag
     pos_tags = pos_tag(tokens)

     return pos_tags
```
📝 Committable suggestion
```python
def extract_pos_tags(sentence):
    """Extract Part-of-Speech (POS) tags for words in a sentence.

    Args:
        sentence (str): Input sentence to be POS tagged

    Returns:
        list: A list of tuples containing (word, POS_tag)

    Raises:
        ValueError: If sentence is not a string or is empty
        Exception: If NLTK resource download fails
    """
    if not isinstance(sentence, str) or not sentence.strip():
        raise ValueError("Input must be a non-empty string")
    try:
        # Ensure that the necessary NLTK resources are downloaded
        nltk.download("words", quiet=True)
        nltk.download("punkt", quiet=True)
        nltk.download("averaged_perceptron_tagger", quiet=True)
    except Exception as e:
        logging.error(f"Failed to download NLTK resources: {str(e)}")
        raise

    from nltk.tag import pos_tag
    from nltk.tokenize import word_tokenize

    # Tokenize the sentence into words
    tokens = word_tokenize(sentence)

    # Tag each word with its corresponding POS tag
    pos_tags = pos_tag(tokens)

    return pos_tags
```
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File needs formatting according to Ruff standards
Actionable comments posted: 5
♻️ Duplicate comments (2)
.github/workflows/test_python_3_10.yml (1)
45-47: ⚠️ Potential issue: Fix NLTK tokenizer name and add download verification.
The NLTK tokenizer package name is incorrect. It should be 'punkt' instead of 'punkt_tab'. Also, consider adding download verification.
.github/workflows/test_python_3_11.yml (1)
47-49: ⚠️ Potential issue: Fix NLTK tokenizer name and add download verification.
The NLTK tokenizer package name is incorrect. It should be 'punkt' instead of 'punkt_tab'. Also, consider adding download verification.
🧹 Nitpick comments (4)
.github/workflows/test_python_3_10.yml (1)
48-48: Remove extra blank line. Remove the extra blank line after the NLTK download step to maintain consistent spacing between workflow steps.
.github/workflows/test_python_3_11.yml (1)
50-51: Remove extra blank lines. Remove the two extra blank lines after the NLTK download step to maintain consistent spacing between workflow steps.
cognee/tests/unit/processing/chunks/chunk_by_word_test.py (1)
Line range hint 1-22: Consider batch formatting all test files. Given the consistent pattern of assertion formatting changes across multiple test files, consider:
- Running ruff format on all test files in a single batch
- Adding a pre-commit hook to automatically format files
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
cognee/tests/test_deduplication.py (1)
33-35: LGTM! The assertion reformatting improves readability. The reformatting of assertions using parentheses improves code readability while maintaining the same functionality.
Please run the Ruff formatter to fix the formatting issues flagged by the pipeline:
```bash
#!/bin/bash
# Description: Check which files need formatting
ruff format --check .
```
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- .github/workflows/test_python_3_10.yml (1 hunks)
- .github/workflows/test_python_3_11.yml (1 hunks)
- .github/workflows/test_python_3_12.yml (1 hunks)
- cognee-mcp/cognee_mcp/server.py (3 hunks)
- cognee/shared/utils.py (4 hunks)
- cognee/tests/integration/documents/AudioDocument_test.py (1 hunks)
- cognee/tests/integration/documents/ImageDocument_test.py (1 hunks)
- cognee/tests/integration/documents/PdfDocument_test.py (1 hunks)
- cognee/tests/integration/documents/TextDocument_test.py (1 hunks)
- cognee/tests/integration/documents/UnstructuredDocument_test.py (1 hunks)
- cognee/tests/test_deduplication.py (2 hunks)
- cognee/tests/test_falkordb.py (1 hunks)
- cognee/tests/test_library.py (1 hunks)
- cognee/tests/test_pgvector.py (2 hunks)
- cognee/tests/unit/processing/chunks/chunk_by_paragraph_2_test.py (3 hunks)
- cognee/tests/unit/processing/chunks/chunk_by_paragraph_test.py (1 hunks)
- cognee/tests/unit/processing/chunks/chunk_by_sentence_test.py (2 hunks)
- cognee/tests/unit/processing/chunks/chunk_by_word_test.py (1 hunks)
✅ Files skipped from review due to trivial changes (9)
- cognee/tests/test_falkordb.py
- cognee/tests/unit/processing/chunks/chunk_by_sentence_test.py
- cognee/tests/unit/processing/chunks/chunk_by_paragraph_test.py
- cognee/tests/integration/documents/ImageDocument_test.py
- cognee/tests/integration/documents/PdfDocument_test.py
- cognee/tests/test_library.py
- cognee/tests/integration/documents/TextDocument_test.py
- cognee/tests/test_pgvector.py
- cognee/tests/unit/processing/chunks/chunk_by_paragraph_2_test.py
🧰 Additional context used
🪛 GitHub Actions: ruff format
cognee/tests/unit/processing/chunks/chunk_by_word_test.py
[warning] File requires formatting with Ruff formatter
cognee/tests/integration/documents/AudioDocument_test.py
[warning] File requires formatting with Ruff formatter
cognee/tests/test_deduplication.py
[warning] File requires formatting with Ruff formatter
cognee/tests/integration/documents/UnstructuredDocument_test.py
[warning] File requires formatting with Ruff formatter
⏰ Context from checks skipped due to timeout of 90000ms (14)
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: docker-compose-test
🔇 Additional comments (8)
cognee/tests/integration/documents/AudioDocument_test.py (1)
Line range hint 1-47: Verify PR scope and title accuracy. This file contains only test assertion formatting changes, which seems unrelated to the PR's stated purpose of "Add data visualization for Anthropic". Consider either:
- Updating the PR title/description to reflect these formatting changes, or
- Moving these changes to a separate PR focused on test improvements
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
cognee/tests/unit/processing/chunks/chunk_by_word_test.py (1)
Line range hint 1-1: Overall PR Review Summary
Scope Mismatch: The PR title suggests adding data visualization features, but the changes only contain test assertion formatting modifications. Consider either:
- Updating the PR title/description to accurately reflect these changes
- Moving these formatting changes to a separate PR
- Adding the missing data visualization changes
Formatting Issues: All modified files have Ruff formatting warnings. Recommend:
- Running ruff format on all modified files
- Setting up pre-commit hooks to prevent formatting issues
- Consider batch formatting all test files for consistency
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
cognee-mcp/cognee_mcp/server.py (3)
94-103: Mark query parameter as required in visualization tool schema. The tool's schema defines a "query" property but doesn't mark it as required. This should be fixed to ensure proper validation.

```diff
         types.Tool(
             name="visualize",
             description="Visualize the knowledge graph.",
             inputSchema={
                 "type": "object",
                 "properties": {
                     "query": {"type": "string"},
                 },
+                "required": ["query"],
             },
         ),
```
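A related note for this tool: MCP image content carries base64-encoded text rather than raw bytes, so a rendered PNG would need packaging before being returned. A hedged sketch of that step follows; the helper name and the dict field names here are illustrative assumptions, not cognee's or the MCP SDK's API:

```python
import base64


def png_to_image_content(path: str) -> dict:
    """Package a PNG file as an MCP-style image content item.

    The `data` field is base64 text, as the MCP content model expects;
    the exact field names are assumptions for illustration.
    """
    with open(path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return {"type": "image", "data": payload, "mimeType": "image/png"}
```

Decoding `data` on the receiving side should round-trip back to the original file bytes.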
107-132: LGTM! Well-implemented function with proper error handling. The get_freshest_png function is well-implemented with:
- Proper error handling for directory and file operations
- Clear documentation
- Good use of type hints
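The function itself is not quoted in this review, but from the description above (pick the newest timestamped PNG in a directory, with error handling and type hints), a minimal sketch might look like the following. The function name matches the review; everything else is an assumption, not the PR's actual code:

```python
import os


def get_freshest_png(directory: str) -> str:
    """Return the path of the most recent PNG in `directory`.

    Assumes filenames are Unix timestamps like "1700000000.png".
    Raises FileNotFoundError if the directory is missing or holds no such files.
    """
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"Directory not found: {directory}")

    candidates = [
        f for f in os.listdir(directory)
        if f.endswith(".png") and f[:-4].isdigit()
    ]
    if not candidates:
        raise FileNotFoundError(f"No timestamped PNG files in: {directory}")

    # Largest timestamp == newest file
    freshest = max(candidates, key=lambda f: int(f[:-4]))
    return os.path.join(directory, freshest)
```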
199-213: Return visualization output as image content. The visualization tool returns text content despite being described as a tool to "Visualize the knowledge graph."

```diff
         return [
-            types.TextContent(
-                type="text",
-                text=results,
-            )
+            types.ImageContent(
+                type="image",
+                data=await cognee.visualize_graph(),
+                format="png"
+            )
         ]
```

cognee/shared/utils.py (3)
397-397: Add error handling for PNG export. The PNG export functionality lacks error handling and cleanup of old files.

```diff
 # Get the latest Unix timestamp as an integer
 timestamp = int(time.time())

 # Construct your filename
 filename = f"{timestamp}.png"
-export_png(p, filename=filename)
+try:
+    # Cleanup old PNG files to prevent disk space issues
+    cleanup_old_pngs(directory=".", keep_latest=5)
+
+    # Export the new PNG
+    export_png(p, filename=filename)
+except Exception as e:
+    logging.error(f"Failed to export PNG: {str(e)}")
+    raise
```
31-35: 🛠️ Refactor suggestion: Add docstring and error handling for get_entities().
The function lacks a docstring explaining its purpose and parameters. Also, NLTK resource downloads should be handled with error checking.
```diff
 def get_entities(tagged_tokens):
+    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
+
+    Args:
+        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
+
+    Returns:
+        A tree containing chunks of named entities
+
+    Raises:
+        Exception: If NLTK resource download fails
+    """
+    try:
         nltk.download("maxent_ne_chunker", quiet=True)
         from nltk.chunk import ne_chunk
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
     return ne_chunk(tagged_tokens)
```

Likely invalid or redundant comment.
38-56: 🛠️ Refactor suggestion: Add input validation and improve error handling for extract_pos_tags().
The function should validate input and handle NLTK resource downloads more robustly.
```diff
 def extract_pos_tags(sentence):
-    """Extract Part-of-Speech (POS) tags for words in a sentence."""
+    """Extract Part-of-Speech (POS) tags for words in a sentence.
+
+    Args:
+        sentence (str): Input sentence to be POS tagged
+
+    Returns:
+        list: A list of tuples containing (word, POS_tag)
+
+    Raises:
+        ValueError: If sentence is not a string or is empty
+        Exception: If NLTK resource download fails
+    """
+    if not isinstance(sentence, str) or not sentence.strip():
+        raise ValueError("Input must be a non-empty string")
+    try:
         nltk.download("words", quiet=True)
         nltk.download("punkt", quiet=True)
         nltk.download("averaged_perceptron_tagger", quiet=True)
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
     from nltk.tag import pos_tag
     from nltk.tokenize import word_tokenize
     tokens = word_tokenize(sentence)
     pos_tags = pos_tag(tokens)
     return pos_tags
```

Likely invalid or redundant comment.
```yaml
      - name: Download NLTK tokenizer data
        run: |
          poetry run python -m nltk.downloader punkt_tab averaged_perceptron_tagger_eng
```
Fix NLTK tokenizer name and add download verification.
The NLTK tokenizer package name is incorrect. It should be 'punkt' instead of 'punkt_tab'. Also, consider adding download verification.
Apply this diff to fix the issues:
```diff
       - name: Download NLTK tokenizer data
         run: |
-          poetry run python -m nltk.downloader punkt_tab averaged_perceptron_tagger_eng
+          poetry run python -c "
+          import nltk
+          for package in ['punkt', 'averaged_perceptron_tagger']:
+              nltk.download(package)
+              # Verify download
+              try:
+                  if package == 'punkt':
+                      nltk.data.find('tokenizers/punkt')
+                  else:
+                      nltk.data.find('taggers/averaged_perceptron_tagger')
+                  print(f'Successfully downloaded and verified {package}')
+              except LookupError as e:
+                  print(f'Failed to verify {package}')
+                  exit(1)
+          "
```
```yaml
      - name: Download NLTK tokenizer data
        run: |
          poetry run python -c "
          import nltk
          for package in ['punkt', 'averaged_perceptron_tagger']:
              nltk.download(package)
              # Verify download
              try:
                  if package == 'punkt':
                      nltk.data.find('tokenizers/punkt')
                  else:
                      nltk.data.find('taggers/averaged_perceptron_tagger')
                  print(f'Successfully downloaded and verified {package}')
              except LookupError as e:
                  print(f'Failed to verify {package}')
                  exit(1)
          "
```
```python
        assert (
            ground_truth["word_count"] == paragraph_data.word_count
        ), f'{ground_truth["word_count"] = } != {paragraph_data.word_count = }'
        assert ground_truth["len_text"] == len(
            paragraph_data.text
        ), f'{ground_truth["len_text"] = } != {len(paragraph_data.text) = }'
        assert (
            ground_truth["cut_type"] == paragraph_data.cut_type
        ), f'{ground_truth["cut_type"] = } != {paragraph_data.cut_type = }'
```
Fix formatting to comply with Ruff.
The assertion formatting needs to be adjusted according to Ruff's requirements. While the multi-line format improves readability, it should follow the project's automated formatting rules.
Run ruff format on this file to automatically fix the formatting issues.
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
```diff
     assert (
         "sentence_cut" == paragraph_data.cut_type
     ), f" sentence_cut != {paragraph_data.cut_type = }"

 # Test DOCX
 for paragraph_data in docx_document.read(chunk_size=1024, chunker="text_chunker"):
     assert 16 == paragraph_data.word_count, f" 16 != {paragraph_data.word_count = }"
     assert 145 == len(paragraph_data.text), f" 145 != {len(paragraph_data.text) = }"
+    assert "sentence_end" == paragraph_data.cut_type, (
+        f" sentence_end != {paragraph_data.cut_type = }"
+    )
-    assert (
-        "sentence_end" == paragraph_data.cut_type
-    ), f" sentence_end != {paragraph_data.cut_type = }"

 # TEST CSV
 for paragraph_data in csv_document.read(chunk_size=1024, chunker="text_chunker"):
     assert 15 == paragraph_data.word_count, f" 15 != {paragraph_data.word_count = }"
+    assert "A A A A A A A A A,A A A A A A,A A" == paragraph_data.text, (
+        f"Read text doesn't match expected text: {paragraph_data.text}"
+    )
+    assert "sentence_cut" == paragraph_data.cut_type, (
+        f" sentence_cut != {paragraph_data.cut_type = }"
+    )
-    assert (
-        "A A A A A A A A A,A A A A A A,A A" == paragraph_data.text
-    ), f"Read text doesn't match expected text: {paragraph_data.text}"
-    assert (
-        "sentence_cut" == paragraph_data.cut_type
-    ), f" sentence_cut != {paragraph_data.cut_type = }"

 # Test XLSX
 for paragraph_data in xlsx_document.read(chunk_size=1024, chunker="text_chunker"):
     assert 36 == paragraph_data.word_count, f" 36 != {paragraph_data.word_count = }"
     assert 171 == len(paragraph_data.text), f" 171 != {len(paragraph_data.text) = }"
+    assert "sentence_cut" == paragraph_data.cut_type, (
+        f" sentence_cut != {paragraph_data.cut_type = }"
+    )
-    assert (
-        "sentence_cut" == paragraph_data.cut_type
-    ), f" sentence_cut != {paragraph_data.cut_type = }"
```
Fix formatting to comply with Ruff.
Multiple assertion blocks need formatting adjustments to comply with Ruff's requirements.
Run ruff format on this file to automatically fix the formatting issues.
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
```python
assert (
    reconstructed_text == input_text
), f"texts are not identical: {len(input_text) = }, {len(reconstructed_text) = }"
```
Fix formatting to comply with Ruff.
The assertion formatting needs to be adjusted according to Ruff's requirements.
Run ruff format on this file to automatically fix the formatting issues.
🧰 Tools
🪛 GitHub Actions: ruff format
[warning] File requires formatting with Ruff formatter
```python
from sqlalchemy.testing.plugin.plugin_base import logging
```
🛠️ Refactor suggestion

Use the standard Python logging module instead.

Importing `logging` from SQLAlchemy's testing plugin is not recommended. Use Python's built-in `logging` module instead.

```diff
-from sqlalchemy.testing.plugin.plugin_base import logging
+import logging
```

📝 Committable suggestion

```python
import logging
```
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `cognee-mcp/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (6)
- `cognee-mcp/cognee_mcp/server.py` (3 hunks)
- `cognee/api/v1/visualize/visualize.py` (1 hunks)
- `cognee/infrastructure/databases/graph/neo4j_driver/adapter.py` (2 hunks)
- `cognee/infrastructure/databases/hybrid/falkordb/FalkorDBAdapter.py` (3 hunks)
- `cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py` (1 hunks)
- `cognee/shared/utils.py` (5 hunks)
✅ Files skipped from review due to trivial changes (4)
- cognee/infrastructure/databases/relational/sqlalchemy/SqlAlchemyAdapter.py
- cognee/infrastructure/databases/hybrid/falkordb/FalkorDBAdapter.py
- cognee/api/v1/visualize/visualize.py
- cognee/infrastructure/databases/graph/neo4j_driver/adapter.py
🚧 Files skipped from review as they are similar to previous changes (1)
- cognee-mcp/cognee_mcp/server.py
⏰ Context from checks skipped due to timeout of 90000ms (17)
- GitHub Check: run_notebook_test / test
- GitHub Check: test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_notebook_test / test
- GitHub Check: run_dynamic_steps_example_test / test
- GitHub Check: run_simple_example_test / test
- GitHub Check: run_multimedia_example_test / test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: docker-compose-test
🔇 Additional comments (3)
cognee/shared/utils.py (3)
38-56: 🛠️ Refactor suggestion: Add input validation and improve error handling for `extract_pos_tags()`
The function should validate input and handle NLTK resource downloads more robustly.
```diff
 def extract_pos_tags(sentence):
     """Extract Part-of-Speech (POS) tags for words in a sentence."""
+    if not isinstance(sentence, str) or not sentence.strip():
+        raise ValueError("Input must be a non-empty string")
+
+    try:
         # Ensure that the necessary NLTK resources are downloaded
         nltk.download("words", quiet=True)
         nltk.download("punkt", quiet=True)
         nltk.download("averaged_perceptron_tagger", quiet=True)
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise

     from nltk.tag import pos_tag
     from nltk.tokenize import word_tokenize

     # Tokenize the sentence into words
     tokens = word_tokenize(sentence)

     # Tag each word with its corresponding POS tag
     pos_tags = pos_tag(tokens)

     return pos_tags
```

Likely invalid or redundant comment.
31-35: 🛠️ Refactor suggestion: Add docstring and error handling for `get_entities()`
The function lacks a docstring explaining its purpose and parameters. Also, NLTK resource downloads should be handled with error checking.
```diff
 def get_entities(tagged_tokens):
+    """Extract named entities from POS-tagged tokens using NLTK's ne_chunk.
+
+    Args:
+        tagged_tokens: A list of POS-tagged tokens from nltk.pos_tag()
+
+    Returns:
+        A tree containing chunks of named entities
+
+    Raises:
+        Exception: If NLTK resource download fails
+    """
+    try:
         nltk.download("maxent_ne_chunker", quiet=True)
         from nltk.chunk import ne_chunk
+    except Exception as e:
+        logging.error(f"Failed to download NLTK resources: {str(e)}")
+        raise
     return ne_chunk(tagged_tokens)
```

Likely invalid or redundant comment.
397-397: 🛠️ Refactor suggestion: Add error handling for PNG export
The PNG export functionality lacks error handling and cleanup of old files.
```diff
 from bokeh.io import export_png
+import os
+
+def cleanup_old_pngs(directory=".", keep_latest=5):
+    """Cleanup old PNG files, keeping only the N latest files."""
+    png_files = [f for f in os.listdir(directory) if f.endswith('.png')]
+    if len(png_files) <= keep_latest:
+        return
+
+    # Sort by timestamp in filename
+    sorted_files = sorted(png_files, key=lambda x: int(x.replace(".png", "")))
+
+    # Remove older files
+    for f in sorted_files[:-keep_latest]:
+        try:
+            os.remove(os.path.join(directory, f))
+        except OSError as e:
+            logging.warning(f"Failed to remove old PNG file {f}: {str(e)}")
```

Likely invalid or redundant comment.
```python
home_dir = os.path.expanduser("~")

# Construct the final output file path
output_filepath = os.path.join(home_dir, output_filename)
with open(output_filepath, "w") as f:
    f.write(html_content)
```
🛠️ Refactor suggestion

Add error handling for file operations

The file-writing operations lack error handling and directory-existence checks.

```diff
 home_dir = os.path.expanduser("~")
+
+# Ensure output directory exists
+try:
+    os.makedirs(home_dir, exist_ok=True)
+except OSError as e:
+    logging.error(f"Failed to create output directory: {str(e)}")
+    raise

 # Construct the final output file path
 output_filepath = os.path.join(home_dir, output_filename)
+try:
-with open(output_filepath, "w") as f:
-    f.write(html_content)
+    with open(output_filepath, "w") as f:
+        f.write(html_content)
+    logging.info(f"Successfully saved visualization to {output_filepath}")
+except IOError as e:
+    logging.error(f"Failed to write output file: {str(e)}")
+    raise
```

📝 Committable suggestion

```python
home_dir = os.path.expanduser("~")

# Ensure output directory exists
try:
    os.makedirs(home_dir, exist_ok=True)
except OSError as e:
    logging.error(f"Failed to create output directory: {str(e)}")
    raise

# Construct the final output file path
output_filepath = os.path.join(home_dir, output_filename)
try:
    with open(output_filepath, "w") as f:
        f.write(html_content)
    logging.info(f"Successfully saved visualization to {output_filepath}")
except IOError as e:
    logging.error(f"Failed to write output file: {str(e)}")
    raise
```
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin
Summary by CodeRabbit
New Features
Improvements
Dependency Updates
Workflows